Why Thinking Effort Matters

Language models can struggle with certain types of reasoning, particularly:
  • Numbers and calculations — Arithmetic, quantities, totals, percentages
  • Dates and scheduling — Day of week calculations, time differences, availability checks
  • Logic and comparisons — If/then reasoning, comparing options, eligibility checks
  • Multi-step reasoning — Tasks requiring several logical steps to reach a conclusion
These quantitative and logical tasks benefit significantly from deeper thinking. When the model has more time to reason, it makes fewer errors with numbers, dates, and complex logic.

However, deeper thinking comes with a direct tradeoff: increased latency. Every step set to deep thinking adds delay before the assistant responds. In Voxworks, the latency differences between effort levels are small:
  • Fast — Approximately 200ms faster than normal
  • Normal — Baseline latency
  • Deep — Approximately 500ms slower than normal
These differences are small enough that they are hard to notice on any individual step, but they add up when many steps use deep thinking. The key is to use deep thinking strategically, on the steps that will benefit most from accuracy.
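
To make the cumulative effect concrete, here is a minimal Python sketch that totals the approximate latency offsets listed above across a flow. The `LATENCY_OFFSET_MS` table, the function name, and the example flow are illustrative assumptions and not part of any Voxworks API; only the 200 ms and 500 ms figures come from this page.

```python
# Approximate per-step latency offsets relative to normal effort, in ms.
# The figures are the approximations quoted above; the names are hypothetical.
LATENCY_OFFSET_MS = {"fast": -200, "normal": 0, "deep": 500}


def cumulative_offset_ms(step_efforts):
    """Return the total latency offset compared with an all-normal flow."""
    return sum(LATENCY_OFFSET_MS[effort] for effort in step_efforts)


# Hypothetical six-step flow: two deep steps, one fast step, three normal steps.
flow = ["normal", "deep", "normal", "fast", "deep", "normal"]
print(cumulative_offset_ms(flow))  # 800
```

In this assumed flow, two deep steps add roughly one second in total, and the single fast step recovers about 200 ms of it.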

What is Thinking Effort?

When the assistant generates a response, it can use different levels of reasoning:
  • Fast — Quick, direct responses for simple situations
  • Normal — Balanced reasoning for standard interactions
  • Deep — Deep reasoning for complex or important moments

Effort Levels

| Level | Response Speed | Reasoning Depth | Best For |
|---|---|---|---|
| fast | Fastest | Surface-level | Simple acknowledgments, quick replies |
| normal | Balanced | Moderate | Standard conversation, most steps |
| deep | Slower | Deep | Complex questions, important decisions |

When to Use Each Level

Fast

Use for steps where you want quicker responses:
  • Acknowledgments and confirmations
  • Simple follow-up questions
  • Transitions between topics
  • Routine conversation
User: "Yes, that time works."
Assistant: "Great! I'll send you a confirmation." [fast effort sufficient]

Normal (Default)

Use for:
  • Standard questions requiring context
  • Responses that need to incorporate multiple factors
  • Most conversational turns
Assistant: "What time works best for you next week?"
User: "How about Thursday afternoon?"

Deep

Use for:
  • Complex questions or objections
  • Sensitive topics requiring careful handling
  • Important decision points
  • When accuracy is critical

Interaction with Other Settings

| Combined With | Effect |
|---|---|
| Patient eagerness | Wait longer + think deeper = very deliberate |
| Keen eagerness | Fast effort is typical; deep effort adds delay |
| Patient silence tolerance | Deep effort makes sense, since the user is thinking too |
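
The combinations above can be encoded as a small lookup for reviewing a flow. This is only a sketch: the setting identifiers such as `patient_eagerness` and the function `describe_combination` are hypothetical names chosen for illustration, and the notes simply restate the table.

```python
# Hypothetical encoding of the interaction table; keys are
# (other setting, effort level) and values restate the table's guidance.
INTERACTIONS = {
    ("patient_eagerness", "deep"): "very deliberate: waits longer and thinks deeper",
    ("keen_eagerness", "fast"): "typical pairing",
    ("keen_eagerness", "deep"): "adds delay",
    ("patient_silence_tolerance", "deep"): "makes sense: the user is thinking too",
}


def describe_combination(setting, effort):
    """Return the table's note for a combination, or a fallback message."""
    return INTERACTIONS.get((setting, effort), "no specific guidance on this page")


print(describe_combination("keen_eagerness", "deep"))  # adds delay
```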

Best Practices

  1. Default to normal — Start with normal effort and adjust from there
  2. Elevate strategically — Use deep effort for moments that matter
  3. Consider step complexity — If a step has more than 3 conditions or requires quantitative/logical reasoning, consider using deep effort
  4. Test response quality — Verify fast effort responses are still good
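
The rules of thumb above can be sketched as a simple heuristic. The `Step` class and its fields are hypothetical and do not correspond to a Voxworks type; the thresholds follow the practices listed above (more than 3 conditions or quantitative/logical reasoning suggests deep, simple acknowledgments suggest fast, everything else starts at normal). Treat the result as a starting point to test, not a final setting.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """Hypothetical description of a conversation step (not a Voxworks type)."""
    condition_count: int = 0
    needs_quantitative_or_logical_reasoning: bool = False
    is_simple_acknowledgment: bool = False


def suggest_effort(step: Step) -> str:
    """Suggest a starting effort level for a step."""
    # More than 3 conditions, or numbers/dates/logic: consider deep effort.
    if step.condition_count > 3 or step.needs_quantitative_or_logical_reasoning:
        return "deep"
    # Simple acknowledgments and confirmations can usually run at fast effort.
    if step.is_simple_acknowledgment:
        return "fast"
    # Everything else starts at the default.
    return "normal"


print(suggest_effort(Step(condition_count=5)))  # deep
print(suggest_effort(Step(is_simple_acknowledgment=True)))  # fast
print(suggest_effort(Step()))  # normal
```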
