Multi-Model Support

No single model is the best at everything.

Claude's structured reasoning is unmatched. DeepSeek offers the best cost-performance for coding. GPT-4o is the fastest at image understanding. The o1 series leads in deep reasoning. Rather than betting on a single model, let each task use the one that fits best.

Helix supports all major AI models, and you can switch freely mid-conversation.


Model Selection Guide

| Model | Best For | Speed | Cost |
| --- | --- | --- | --- |
| Claude Sonnet 4 | Everyday coding, balanced performance | ⚡⚡⚡ | $$ |
| Claude Opus 4 | Complex reasoning, architecture design | ⚡⚡ | $$$ |
| DeepSeek Coder | Code understanding, completion, high value | ⚡⚡ | $ |
| GPT-4o | Vision tasks, image analysis | ⚡⚡⚡⚡ | $$ |
| GPT-4o-mini | Quick tasks, cost-sensitive | ⚡⚡⚡⚡ | $ |
| Gemini 2.5 Pro | Long context, deep analysis | ⚡⚡ | $$ |
| o1 / o1-mini | Deep reasoning, math, algorithms | | $$$$ |

Task-Based Model Strategy

No need to use one model for everything. Choose flexibly based on the task:

| Task Type | Recommended Model | Rationale |
| --- | --- | --- |
| Quick questions, small edits | GPT-4o-mini | Fast response, low cost |
| Everyday coding, feature work | Claude Sonnet / DeepSeek | Best balance of code quality and cost |
| Architecture review, design decisions | Claude Opus / o1 | Needs deep reasoning and holistic perspective |
| UI / image analysis | GPT-4o | Strongest multimodal capabilities |
| Large file analysis | Gemini 2.5 Pro | Ultra-long context window |

Switch Models Mid-Conversation

This is one of Helix's killer features. You don't need to start a new chat to change models:

You: [Using DeepSeek] "Implement this user authentication module"
→ DeepSeek quickly generates the code

You: [Switch to Claude] "Review the code you just generated, focus on security"
→ Claude performs deep review, identifies potential issues

You: [Switch to GPT-4o] "Look at this UI screenshot, tell me what's wrong with the layout"
→ GPT-4o analyzes the image, gives specific suggestions

When switching, conversation history is fully preserved — the new model can see all previous context.
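The reason history survives a switch can be sketched in a few lines: the session owns the transcript, and the active model is just a parameter of each request. This is a minimal illustration, not Helix's actual code; `complete` stands in for a provider call.

```python
# Minimal sketch (not Helix's real implementation): the transcript lives in
# the session, so changing the model never touches the history.
class ChatSession:
    def __init__(self, model):
        self.model = model
        self.messages = []          # full history, independent of any model

    def switch_model(self, model):
        self.model = model          # history is left untouched

    def send(self, text, complete):
        # `complete` stands in for a provider call: (model, messages) -> reply
        self.messages.append({"role": "user", "content": text})
        reply = complete(self.model, self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

Because every request carries the whole `messages` list, the newly selected model sees everything said so far.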

How the Backend Implements This

Helix's backend model adapter layer unifies four types of Provider interfaces:

  • Anthropic — Claude series, supports extended thinking
  • DeepSeek — passes through reasoning_content
  • OpenAI — standard Chat Completions API
  • OpenAI Responses API — next-generation interface for GPT-5.x
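Whatever the wire format, each adapter can be normalized to a single interface. The shape below is a hypothetical sketch (the names `Provider`, `complete`, and the return fields are illustrative, not Helix's actual types):

```python
from typing import Protocol, runtime_checkable

# Hypothetical sketch of the adapter layer's common interface: every
# provider, whatever its wire format, is reduced to one `complete` call.
@runtime_checkable
class Provider(Protocol):
    def complete(self, model: str, messages: list) -> dict:
        """Return a dict like {"content": ..., "reasoning_content": ...}."""
        ...

class EchoProvider:
    # Toy implementation used only to show the interface in action.
    def complete(self, model, messages):
        return {"content": messages[-1]["content"], "reasoning_content": None}
```

Provider-specific quirks (extended thinking blocks, `reasoning_content`, the Responses API) then live entirely inside each adapter.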

Model routing supports two methods:

  • Exact specification: providerId:modelId (e.g., anthropic:claude-sonnet-4)
  • Prefix inference: pass just the model name, and the system auto-matches the Provider by name prefix
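The two routing methods can be sketched as a small resolution function. This is an assumed implementation for illustration; the real router's data structures may differ.

```python
# Hypothetical sketch of the two routing methods described above.
# `providers` maps a providerId to the model-name prefixes it serves.
def resolve_model(spec, providers):
    if ":" in spec:
        # Exact specification: "anthropic:claude-sonnet-4"
        provider_id, model_id = spec.split(":", 1)
        if provider_id not in providers:
            raise KeyError(f"unknown provider: {provider_id}")
        return provider_id, model_id
    # Prefix inference: match the bare model name by prefix
    for provider_id, prefixes in providers.items():
        if any(spec.startswith(p) for p in prefixes):
            return provider_id, spec
    raise KeyError(f"no provider matches model: {spec}")
```

So `anthropic:claude-sonnet-4` routes directly, while a bare `gpt-4o` falls through to prefix matching.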

Deep Thinking Mode

For complex problems, you need the model to "think it through before speaking," not just output a shallow quick answer.

Anthropic Extended Thinking

Claude models support Extended Thinking mode, which lets the model engage in deep reasoning before producing its formal answer:

  • Default thinking budget: 32K tokens — the model can use up to 32K tokens of "internal thinking" to analyze the problem
  • Thinking process is visible — you can expand and view the model's reasoning steps in the UI
  • Best for: Architecture decisions, algorithm optimization, bug root cause analysis, security vulnerability research, complex refactoring planning
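For reference, a request with extended thinking enabled against Anthropic's Messages API looks roughly like the payload below. The model name and token numbers are illustrative; note that `max_tokens` must be larger than the thinking budget.

```python
# Sketch of an Anthropic Messages API request body with extended thinking
# enabled. No network call is made here; this only builds the payload.
def build_thinking_request(prompt, budget_tokens=32_000):
    return {
        "model": "claude-sonnet-4",          # illustrative model name
        "max_tokens": 64_000,                # must exceed budget_tokens
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }
```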

DeepSeek Reasoning

DeepSeek's thinking mode passes through reasoning_content, showing the model's reasoning process while consuming fewer tokens. A more cost-effective deep thinking option.
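In DeepSeek's OpenAI-compatible responses, the reasoning arrives in a `reasoning_content` field alongside the normal `content`. A minimal pass-through helper, assuming a plain dict-shaped message:

```python
# Separate DeepSeek's chain of thought from the final answer.
# Assumes the message is a plain dict, as in an OpenAI-compatible response.
def split_reasoning(message):
    reasoning = message.get("reasoning_content") or ""
    answer = message.get("content") or ""
    return reasoning, answer
```

A UI can then render the reasoning collapsed and the answer inline.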

When to Enable Deep Thinking

| ✅ Worth Enabling | ❌ Not Needed |
| --- | --- |
| Architecture decisions — weighing multiple approaches | Simple code changes — renaming a variable |
| Algorithm optimization — analyzing time/space complexity | Formatting adjustments — fixing indentation or style |
| Bug root cause analysis — tracing complex call chains | Information lookup — "Which file is this function in?" |
| Security review — considering various attack surfaces | Repetitive tasks — batch-modifying similar code |

Multimodal Support

Helix handles more than just text. Models with multimodal support (like GPT-4o) can understand image inputs:

  • 📸 Architecture diagram analysis — "Explain the data flow in this system diagram"
  • 📊 Chart interpretation — "What problem does this performance monitoring chart indicate?"
  • 🎨 UI feedback — "What could be improved in this design mockup?"
  • 📱 Screenshot debugging — "My app looks like this — why?"

In Helix, simply paste or drag-and-drop images into the chat to send them.


Dual Agent Mode: Two-Model Collaboration

When one model isn't enough, use two.

Dual Agent mode lets two different models (typically Claude + DeepSeek) engage in structured four-phase collaboration on the same problem:

  1. Independent Thinking — Claude and DeepSeek each think about the same problem independently
  2. Cross-Review (Discussion) — each model sees the other's answer and points out strengths and weaknesses, across multiple rounds
  3. Synthesis — Claude combines the best of both approaches into a final solution
  4. Execution — execute according to the final plan (optional)
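The four phases can be sketched as a simple orchestration loop. This is an assumed illustration, not Helix's actual orchestrator: `ask(model, prompt)` stands in for a real provider call, and the optional execution phase is omitted.

```python
# Hypothetical sketch of the Dual Agent flow. `ask(model, prompt)` stands
# in for a real provider call; phase 4 (execution) is omitted.
def dual_agent(problem, ask, rounds=2):
    # Phase 1: independent thinking
    a = ask("claude", problem)
    b = ask("deepseek", problem)
    # Phase 2: cross-review, across multiple rounds
    for _ in range(rounds):
        a = ask("claude", f"Critique the other answer and improve yours:\n{b}")
        b = ask("deepseek", f"Critique the other answer and improve yours:\n{a}")
    # Phase 3: synthesis by Claude
    return ask("claude", f"Combine the best of both answers:\n{a}\n---\n{b}")
```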

Why Better Than a Single Model?

Every model has blind spots. Claude might over-focus on security and miss performance concerns; DeepSeek might produce a quick solution but overlook edge cases. Cross-review lets each model catch the other's blind spots, so the final solution is more comprehensive.

The UI Experience

Helix displays the entire process with clear phase separators:

  • Each phase has a --- Phase Name --- divider
  • Different models' responses carry role labels
  • The final synthesis is marked with 🎯 Final Solution
  • You can watch how the two models inspire each other to arrive at a better answer

Custom Configuration

Custom Providers

Helix supports connecting any OpenAI-compatible model endpoint. Add a Provider in settings:

  • Set the Base URL to point to your endpoint
  • Choose the interface type: OpenAI-compatible / OpenAI Responses API / Anthropic
  • Enter your API Key
  • Add specific models, configuring context window size, max output tokens, temperature, and more
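Put together, a custom Provider entry might look like the fragment below. The field names here are hypothetical, chosen to mirror the settings listed above; adjust them to whatever the actual settings UI or config file uses.

```yaml
# Hypothetical config shape: field names are illustrative, not Helix's
# actual schema. Mirrors the settings listed above.
providers:
  my-gateway:
    base_url: https://llm.example.com/v1
    api_type: openai-compatible     # or: openai-responses / anthropic
    api_key: ${MY_GATEWAY_KEY}      # read from an environment variable
    models:
      - id: my-model
        context_window: 128000
        max_output_tokens: 8192
        temperature: 0.7
```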

Model Configuration in Agent Profiles

Through YAML-format Agent Profiles, you can preset models and parameters for different tasks:

```yaml
profiles:
  code-reviewer:
    model: claude-opus-4
    system_prompt: |
      You are a meticulous code reviewer focused on security,
      performance, and maintainability. Always explain the
      reasoning behind your suggestions.
    thinking_enabled: true

  quick-helper:
    model: gpt-4o-mini
    system_prompt: |
      Answer questions quickly and concisely.
      Prefer giving directly usable code.
    temperature: 0.3
```

Temperature Control

Adjust the balance between creativity and determinism based on task nature:

| Temperature Range | Use Case |
| --- | --- |
| 0.0 – 0.3 | High determinism: test generation, bug fixes, precise code |
| 0.4 – 0.7 | Balanced: everyday coding, refactoring |
| 0.8 – 1.0 | High creativity: brainstorming, documentation, naming suggestions |
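If you script against these ranges, a small lookup keeps the choices consistent. The task names and exact values below are illustrative defaults drawn from the table, not Helix settings.

```python
# Hypothetical mapping of the ranges above onto named task categories.
# Task names and values are illustrative defaults, not Helix settings.
TEMPERATURE_BY_TASK = {
    "test_generation": 0.1,   # high determinism
    "bug_fix": 0.2,
    "everyday_coding": 0.5,   # balanced
    "refactoring": 0.6,
    "brainstorming": 0.9,     # high creativity
}

def temperature_for(task, default=0.5):
    return TEMPERATURE_BY_TASK.get(task, default)
```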