Testing Skills

skill-kit ships a test harness that drives skills through their steps without an agent. Import everything from the /test subpath:

import { runSkill, mockModel } from '@contentful/skill-kit/test';

runSkill(skill, opts)

Drives a skill to completion with a model adapter and returns the full execution trace.

const result = await runSkill(mySkill, {
  model: mockModel({
    /* step responses */
  }),
  params: { repoPath: '.' }, // optional — parsed against the skill's params schema
  host: { host: 'claude-code' }, // optional — defaults to generic
});

Options

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| model | ModelAdapter | yes | Provides responses for each step (use mockModel) |
| params | object | no | Skill-wide params, validated against the skill’s params schema |
| host | Handshake | no | Host identity and available tools. Defaults to generic host |

Return value

| Field | Type | Description |
| --- | --- | --- |
| path | string[] | Sequence of step names visited |
| outputs | Record<string, unknown> | Raw model responses keyed by step name |
| output | unknown | Final output (from the terminal step) |
| history | readonly StepResult[] | Validated outputs and action results |
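
Because path is a plain array of step names, loop behavior can be checked by counting visits. The sketch below shows one way to do that. RunResultLike and countVisits are hypothetical names, the interface is only inferred from the table above, and the sample object (including its summarize step) is hand-built rather than produced by runSkill.

```typescript
// Shape assumed from the Return value table; `history` entries are left as
// `unknown` because StepResult is not described in this section.
interface RunResultLike {
  path: string[];
  outputs: Record<string, unknown>;
  output: unknown;
  history: readonly unknown[];
}

// Hypothetical helper: count how many times each step name appears in `path`.
function countVisits(result: Pick<RunResultLike, 'path'>): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const step of result.path) {
    counts[step] = (counts[step] ?? 0) + 1;
  }
  return counts;
}

// Hand-built stand-in for a result; the step names are illustrative only.
const sample: RunResultLike = {
  path: ['ask-hobby', 'ask-hobby', 'summarize'],
  outputs: {},
  output: { hobbies: ['climbing', 'cooking'] },
  history: [],
};

const visits = countVisits(sample);
console.log(visits['ask-hobby'], visits['summarize']); // prints: 2 1
```

In a real test, the value returned by runSkill would be passed to countVisits in place of sample.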

mockModel(map)

Maps step names to canned responses. Three response types:

Static value

Returns the same response every time the step is visited:

mockModel({
  diagnose: { checks: [{ name: 'ci', status: 'fail' }] },
});

Array (for loops)

Returns the entries in order, one per visit to the step. Throws if the step is visited more times than the array has entries:

mockModel({
  'ask-hobby': [
    { hobby: 'climbing', wantsMore: true },
    { hobby: 'cooking', wantsMore: false },
  ],
});
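
For longer loops, the array can be generated instead of written by hand. The following is a minimal sketch, assuming (as in the example above) that the skill keeps looping while wantsMore is true and stops when it is false. HobbyAnswer and hobbyAnswers are hypothetical names, not part of skill-kit.

```typescript
// Response shape taken from the ask-hobby example above.
interface HobbyAnswer {
  hobby: string;
  wantsMore: boolean;
}

// Hypothetical helper: one response per loop iteration. Only the last entry
// sets wantsMore to false, so the loop should end exactly when the array does.
function hobbyAnswers(hobbies: readonly string[]): HobbyAnswer[] {
  return hobbies.map((hobby, index) => ({
    hobby,
    wantsMore: index < hobbies.length - 1,
  }));
}

const answers = hobbyAnswers(['climbing', 'cooking', 'chess']);
console.log(answers.map((a) => a.wantsMore).join(', ')); // prints: true, true, false
```

The resulting array could then be used as the 'ask-hobby' entry in the map passed to mockModel.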

Function

Called with the step’s prompt string. Use for conditional responses:

mockModel({
  report: (prompt) => ({
    summary: prompt.includes('fail') ? 'issues found' : 'all clear',
  }),
});
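
Since a function response is an ordinary function of the prompt string, it can be defined on its own and checked directly before it is handed to mockModel. This sketch assumes a synchronous responder; PromptResponder and reportResponder are hypothetical names introduced for illustration.

```typescript
// Hypothetical type alias: a function response takes the step's prompt string
// and returns the canned output for that step.
type PromptResponder<T> = (prompt: string) => T;

// Same logic as the inline example above, pulled out so it can be reused.
const reportResponder: PromptResponder<{ summary: string }> = (prompt) => ({
  summary: prompt.includes('fail') ? 'issues found' : 'all clear',
});

console.log(reportResponder('ci: fail').summary); // prints: issues found
console.log(reportResponder('ci: pass').summary); // prints: all clear
```

The responder could then be passed as the report entry of the mockModel map in place of the inline arrow function.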

Testing example

A complete test for a two-step skill using node:test and node:assert/strict:

import { describe, it } from 'node:test';
import assert from 'node:assert/strict';
import { runSkill, mockModel } from '@contentful/skill-kit/test';
import greet from './skill.ts';

describe('greet skill', () => {
  it('walks ask → welcome', async () => {
    const result = await runSkill(greet, {
      model: mockModel({
        ask: { name: 'Alice' },
        welcome: { message: 'Hello, Alice!' },
      }),
    });

    assert.deepStrictEqual(result.path, ['ask', 'welcome']);
    assert.deepStrictEqual(result.output, { message: 'Hello, Alice!' });
  });
});

Run with:

node --test --import tsx/esm src/skill.test.ts

Testing host-specific behavior

Pass a host option to simulate different agent hosts. This affects how primitives like askUser and confirm generate their prose:

const result = await runSkill(mySkill, {
  model: mockModel({
    /* ... */
  }),
  host: { host: 'claude-code' },
});

The host value controls which verb mappings are active. For example, ASK_STRUCTURED maps to AskUserQuestion on Claude Code and to prose with an option list on generic hosts. Test with different hosts to verify that your skill works across environments.
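
One way to cover several hosts is to run the same scenario once per host and compare the results. The sketch below is written under assumptions: HostLike and runAcrossHosts are hypothetical, 'generic' is assumed to be an accepted host value, and the runner is a stub standing in for a call to runSkill.

```typescript
// Minimal stand-in for the host option shown above ({ host: 'claude-code' }).
interface HostLike {
  host: string;
}

// Hypothetical helper: run the same scenario once per host, sequentially,
// and collect the results keyed by host name.
async function runAcrossHosts<T>(
  hosts: readonly HostLike[],
  run: (host: HostLike) => Promise<T>,
): Promise<Record<string, T>> {
  const results: Record<string, T> = {};
  for (const host of hosts) {
    results[host.host] = await run(host);
  }
  return results;
}

// 'generic' is an assumed value; only 'claude-code' appears in this guide.
const hosts: HostLike[] = [{ host: 'claude-code' }, { host: 'generic' }];

// Stub runner; a real test would call runSkill with the given host instead.
runAcrossHosts(hosts, async (host) => ({ host: host.host, path: ['ask', 'welcome'] }))
  .then((results) => console.log(Object.keys(results).join(', '))); // prints: claude-code, generic
```

In a real test the callback would return runSkill(mySkill, { model: mockModel({ /* ... */ }), host }). Building the mock inside the callback gives each run its own responses, which matters if array responses track visits per mock instance (an assumption, not something this guide states).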

Testing validation errors

Use a function response in mockModel to simulate the agent returning invalid output. The engine will fire onStepValidationFailed and retry, passing the validation error back in the prompt:

const result = await runSkill(mySkill, {
  model: mockModel({
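    'my-step': (prompt) => {
      // The retry prompt carries the validation error, so this branch runs on the retry
      if (prompt.includes('validation')) {
        return { valid: 'output' };
      }
      return { bad: 'shape' }; // first attempt: fails validation and triggers a retry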
    },
  }),
});
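
The example above depends on the wording of the retry prompt. An alternative is a responder that counts its own calls. This is a minimal sketch, assuming mockModel invokes a function response once per attempt, retries included; failThenSucceed is a hypothetical helper, not part of skill-kit.

```typescript
// Hypothetical helper: returns `bad` for the first `failures` calls and `good`
// afterwards, regardless of what the prompt says.
function failThenSucceed<T>(
  bad: unknown,
  good: T,
  failures = 1,
): (prompt: string) => unknown {
  let calls = 0;
  return (_prompt: string) => {
    calls += 1;
    return calls <= failures ? bad : good;
  };
}

const respond = failThenSucceed({ bad: 'shape' }, { valid: 'output' });
console.log(JSON.stringify(respond('first attempt'))); // prints: {"bad":"shape"}
console.log(JSON.stringify(respond('retry attempt'))); // prints: {"valid":"output"}
```

The returned function could be used as the 'my-step' entry in the mockModel map. Create a new responder for each test so the call counter starts from zero.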

See the Handling validation errors section in the workflow skills guide for more detail.

Testing composite skills

Use runComposite to drive a composite skill through its dispatcher and into sub-skills or topics:

import { runComposite, mockModel } from '@contentful/skill-kit/test';
import skill from './skill.ts';

const result = await runComposite(skill, {
  refs, // ReferenceLoader for topic content
  model: mockModel({
    choose: { choice: 'doctor' },
    'get-space': { spaceId: 'abc123' },
    'doctor/diagnose': { issues: [], healthy: true },
    'doctor/report-clean': { summary: 'All good!' },
  }),
});

// result.path → ['choose', 'get-space', 'doctor/diagnose', 'doctor/report-clean']
// result.redirectedTo → { kind: 'subskill', name: 'doctor' }

Sub-skill steps use prefixed names in the mock model ('doctor/diagnose'). The redirectedTo field tells you which sub-skill or topic was activated.
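
When a sub-skill has many steps, the prefix can be applied programmatically. The sketch below assumes the prefix is always the sub-skill name followed by a slash, as in the example above; prefixSteps is a hypothetical helper.

```typescript
// Hypothetical helper: turn { diagnose: ... } into { 'doctor/diagnose': ... }.
function prefixSteps(
  subskill: string,
  responses: Record<string, unknown>,
): Record<string, unknown> {
  const prefixed: Record<string, unknown> = {};
  for (const [step, response] of Object.entries(responses)) {
    prefixed[`${subskill}/${step}`] = response;
  }
  return prefixed;
}

const doctorSteps = prefixSteps('doctor', {
  diagnose: { issues: [], healthy: true },
  'report-clean': { summary: 'All good!' },
});
console.log(Object.keys(doctorSteps).join(', ')); // prints: doctor/diagnose, doctor/report-clean
```

The result can be spread into the map passed to mockModel next to the dispatcher steps, for example { choose: { choice: 'doctor' }, ...doctorSteps }.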

To test a sub-skill in isolation, bypassing the dispatcher:

const result = await runComposite(skill, {
  directSubskill: 'doctor',
  params: { spaceId: 'test-space' },
  model: mockModel({
    'doctor/diagnose': { issues: [], healthy: true },
    'doctor/report-clean': { summary: 'All good!' },
  }),
});

See the Composite Skills guide for the full API.