What is a Voice Step?

A Voice Step represents a conversational turn:
  1. Assistant speaks — The configured script is spoken via text-to-speech
  2. User responds — The system listens and transcribes the response
  3. AI evaluates — The LLM analyzes the response against defined conditions
  4. Navigation — The conversation moves to the appropriate next step
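The four stages above can be sketched as a single function. This is a minimal illustration, not the platform's implementation: the `Condition` class, `run_turn`, and the `speak`, `listen`, and `evaluate` callables are all hypothetical names standing in for text-to-speech, transcription, and LLM evaluation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Condition:
    description: str  # natural-language rule, e.g. "User agrees or shows interest"
    next_step: str    # hypothetical identifier of the step to move to


def run_turn(
    script: str,
    conditions: List[Condition],
    speak: Callable[[str], None],
    listen: Callable[[], str],
    evaluate: Callable[[str, List[Condition]], Condition],
) -> str:
    """Run one conversational turn and return the name of the next step."""
    speak(script)                               # 1. assistant speaks
    transcript = listen()                       # 2. user responds
    matched = evaluate(transcript, conditions)  # 3. AI evaluates
    return matched.next_step                    # 4. navigation


# Usage with stand-in callables (no real speech or LLM involved).
spoken: List[str] = []
conditions = [
    Condition("User agrees or shows interest", "booking_step"),
    Condition("Otherwise", "objection_step"),
]
next_step = run_turn(
    script="Would you like to schedule a call this week?",
    conditions=conditions,
    speak=spoken.append,
    listen=lambda: "Sure, sounds good",
    evaluate=lambda transcript, conds: conds[0],  # stub: always picks the first condition
)
```

Passing the three stages in as callables keeps the sketch runnable without any speech or model dependency.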

Voice Step Structure

| Property | Description |
| --- | --- |
| script | What the assistant says |
| conditions | Rules for navigation based on response |
| eagerness | Response timing (keen/normal/patient) |
| thinking_effort | LLM depth (fast/normal/deep) |
| silence_tolerance | How to handle user silence |
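To make the shape of a step concrete, the properties above could be modelled roughly as follows. This is a sketch under assumptions: the `VoiceStep` class, the default values, and the dictionary shape of each condition are hypothetical, and the allowed values for `silence_tolerance` are not listed here, so it is left as a free-form string.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Allowed values taken from the property table.
EAGERNESS = {"keen", "normal", "patient"}
THINKING_EFFORT = {"fast", "normal", "deep"}


@dataclass
class VoiceStep:
    script: str
    conditions: List[Dict[str, str]] = field(default_factory=list)
    eagerness: str = "normal"        # default is an assumption
    thinking_effort: str = "normal"  # default is an assumption
    silence_tolerance: str = ""      # allowed values are not documented here

    def __post_init__(self) -> None:
        # Reject values outside the documented options.
        if self.eagerness not in EAGERNESS:
            raise ValueError(f"eagerness must be one of {sorted(EAGERNESS)}")
        if self.thinking_effort not in THINKING_EFFORT:
            raise ValueError(f"thinking_effort must be one of {sorted(THINKING_EFFORT)}")


step = VoiceStep(
    script='Say: "Hi, this is Sarah from Voxworks. Is now a good time to chat?"',
    eagerness="patient",
    thinking_effort="fast",
)
```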

Writing Scripts

The script defines what your assistant communicates. Scripts can be written in two styles:

Verbatim Scripts

Use the word “Say” when you want the assistant to speak exact wording:
Say: "Hi, this is Sarah from Voxworks. Is now a good time to chat?"
Say: "Great! What day works best for you — this week or next?"

Instructional Scripts

Use instructional verbs such as “consider”, “ask about”, or “explain” when the assistant may choose its own wording:
Explain the benefits of our platform and gauge their interest.
Ask about their availability for a follow-up call.
The assistant will interpret instructional scripts and generate natural responses that convey the intended meaning.
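If you keep scripts in your own tooling, a small helper can tell the two styles apart before you review them. The `classify_script` function below is hypothetical and relies on a simple assumption: a script that starts with "Say:" is verbatim, and anything else is instructional. The platform itself may distinguish the styles differently.

```python
from typing import Tuple


def classify_script(script: str) -> Tuple[str, str]:
    """Return (style, text), where style is 'verbatim' or 'instructional'."""
    stripped = script.strip()
    if stripped.lower().startswith("say:"):
        # Drop the "Say:" prefix and any surrounding straight or curly quotes.
        wording = stripped[len("say:"):].strip().strip('"“”')
        return "verbatim", wording
    return "instructional", stripped


verbatim = classify_script(
    'Say: "Hi, this is Sarah from Voxworks. Is now a good time to chat?"'
)
instructional = classify_script("Ask about their availability for a follow-up call.")
```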

Tips for Scripts

  • Keep sentences short and clear
  • Ask one question at a time
  • Include natural transitions

Conditions

Conditions determine where the conversation goes based on the user’s response:
Step: "Would you like to schedule a call this week?"
   Condition: "User agrees or shows interest"
   → Next: booking_step
   → Script: "Perfect! Let me check what times are available."

   Condition: "User wants more information first"
   → Next: info_step
   → Script: "Of course! What would you like to know?"

   Condition: "Otherwise"
   → Next: objection_step
   → Script: "I understand. Can I ask what's holding you back?"
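The same step can be written as plain data, which makes the branching explicit. The key names (`condition`, `next`, `script`) and the `resolve` helper are illustrative assumptions, not the platform's configuration format.

```python
from typing import Dict, List

QUESTION = "Would you like to schedule a call this week?"

# One entry per condition, in the order shown in the example above.
BRANCHES: List[Dict[str, str]] = [
    {
        "condition": "User agrees or shows interest",
        "next": "booking_step",
        "script": "Perfect! Let me check what times are available.",
    },
    {
        "condition": "User wants more information first",
        "next": "info_step",
        "script": "Of course! What would you like to know?",
    },
    {
        "condition": "Otherwise",
        "next": "objection_step",
        "script": "I understand. Can I ask what's holding you back?",
    },
]


def resolve(matched_condition: str, branches: List[Dict[str, str]]) -> Dict[str, str]:
    """Return the branch for the matched condition, or the Otherwise branch."""
    by_condition = {branch["condition"]: branch for branch in branches}
    return by_condition.get(matched_condition, by_condition["Otherwise"])


info_branch = resolve("User wants more information first", BRANCHES)
fallback_branch = resolve("something unexpected", BRANCHES)
```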

How Conditions Are Evaluated

The AI doesn’t do simple keyword matching. Instead, it:
  1. Understands context — Considers the full conversation history
  2. Interprets intent — Determines what the user actually means
  3. Matches conditions — Selects the most appropriate condition
  4. Generates response — Creates a natural response incorporating the next script
This means users can express the same intent in many ways:

| User Says | Matched Condition |
| --- | --- |
| “Yes, let’s do it” | User agrees |
| “Sure, sounds good” | User agrees |
| “I’m interested” | User agrees |
| “Tell me more first” | User wants more info |
| “What does it cost?” | User wants more info |
| “Not right now” | Otherwise |
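To illustrate why intent matching differs from keyword matching, here is one way an evaluator could present the conversation history and the conditions to an LLM and read back its choice. This is purely a sketch: the prompt wording, `build_evaluation_prompt`, and `parse_choice` are assumptions, not the product's actual prompt or API, and no model call is made.

```python
from typing import List, Tuple


def build_evaluation_prompt(history: List[Tuple[str, str]], conditions: List[str]) -> str:
    """Combine the full conversation and the candidate conditions into one prompt."""
    lines = ["Conversation so far:"]
    for speaker, text in history:
        lines.append(f"{speaker}: {text}")
    lines.append("")
    lines.append("Pick the single condition that best matches the user's latest intent:")
    for number, condition in enumerate(conditions, start=1):
        lines.append(f"{number}. {condition}")
    lines.append("")
    lines.append("Answer with the number only.")
    return "\n".join(lines)


def parse_choice(reply: str, conditions: List[str]) -> str:
    """Map a numeric reply to a condition; use the last condition as the fallback."""
    try:
        index = int(reply.strip()) - 1
    except ValueError:
        return conditions[-1]
    if 0 <= index < len(conditions):
        return conditions[index]
    return conditions[-1]


conditions = [
    "User agrees or shows interest",
    "User wants more information first",
    "Otherwise",
]
history = [
    ("Assistant", "Would you like to schedule a call this week?"),
    ("User", "What does it cost?"),
]
prompt = build_evaluation_prompt(history, conditions)
```

The sketch assumes the fallback condition is listed last, so an unusable reply resolves to “Otherwise”.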

Special Conditions

Otherwise

The “otherwise” condition is a fallback when no specific condition matches:
  • Always include an “otherwise” condition
  • Use it to handle unexpected responses gracefully
  • Often loops back to clarify or rephrase
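A quick check like the one below can catch steps that lack a fallback before a flow goes live. It assumes you hold your flow as a mapping from step name to a list of condition descriptions; that structure, the `steps_missing_otherwise` helper, and the "User asks about pricing" condition are hypothetical examples.

```python
from typing import Dict, List


def steps_missing_otherwise(flow: Dict[str, List[str]]) -> List[str]:
    """Return the names of steps that have no 'Otherwise' condition."""
    return [
        name
        for name, conditions in flow.items()
        if not any(c.strip().lower() == "otherwise" for c in conditions)
    ]


flow = {
    "schedule_step": [
        "User agrees or shows interest",
        "User wants more information first",
        "Otherwise",
    ],
    "info_step": ["User asks about pricing"],  # illustrative step with no fallback
}
missing = steps_missing_otherwise(flow)
```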

Question Answering

If the user asks a question that can be answered from the Knowledge Base or context, the assistant can respond without leaving the current step. This is handled automatically — the assistant answers the question and then continues with the current step’s flow.

Interrupts

Voice steps are interruptible: users can speak over the assistant, which then stops speaking and processes what the user said. If the assistant judges that it had not fully conveyed the step’s message when it was cut off, it attempts to repeat the key information before continuing, so that important details aren’t lost.

Note that the assistant can detect when its message has been cut off, but it doesn’t inherently “know” it has been interrupted in the conversational sense. See Assistant Behaviour for details on interrupt handling limitations.

Best Practices

  1. One question per step — Don’t overload the user
  2. Anticipate responses — Think about all ways users might reply
  3. Include fallbacks — Use “Otherwise” to loop back to the current step
  4. Keep transitions natural — Next steps should flow from the response
  5. Test with real language — Users don’t speak in keywords
