Multi-Model Support

No single model is the best at everything.

Claude's structured reasoning is unmatched. DeepSeek offers the best cost-performance for coding. GPT-4o is the fastest at image understanding. The o1 series leads in deep reasoning. Rather than betting on a single model, let each task use the one that fits best.

Helix supports all major AI models, and you can switch freely mid-conversation.


Model Selection Guide

| Model | Best For | Speed | Cost |
| --- | --- | --- | --- |
| Claude Sonnet 4 | Everyday coding, balanced performance | ⚡⚡⚡ | $$ |
| Claude Opus 4 | Complex reasoning, architecture design | ⚡⚡ | $$$ |
| DeepSeek Coder | Code understanding, completion, high value | ⚡⚡ | $ |
| GPT-4o | Vision tasks, image analysis | ⚡⚡⚡⚡ | $$ |
| GPT-4o-mini | Quick tasks, cost-sensitive | ⚡⚡⚡⚡ | $ |
| Gemini 2.5 Pro | Long context, deep analysis | ⚡⚡ | $$ |
| o1 / o1-mini | Deep reasoning, math, algorithms | | $$$$ |

Task-Based Model Strategy

No need to use one model for everything. Choose flexibly based on the task:

| Task Type | Recommended Model | Rationale |
| --- | --- | --- |
| Quick questions, small edits | GPT-4o-mini | Fast response, low cost |
| Everyday coding, feature work | Claude Sonnet / DeepSeek | Best balance of code quality and cost |
| Architecture review, design decisions | Claude Opus / o1 | Needs deep reasoning and holistic perspective |
| UI / image analysis | GPT-4o | Strongest multimodal capabilities |
| Large file analysis | Gemini 2.5 Pro | Ultra-long context window |

Switch Models Mid-Conversation

This is one of Helix's killer features. You don't need to start a new chat to change models:

You: [Using DeepSeek] "Implement this user authentication module"
→ DeepSeek quickly generates the code

You: [Switch to Claude] "Review the code you just generated, focus on security"
→ Claude performs deep review, identifies potential issues

You: [Switch to GPT-4o] "Look at this UI screenshot, tell me what's wrong with the layout"
→ GPT-4o analyzes the image, gives specific suggestions

When switching, conversation history is fully preserved — the new model can see all previous context.
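The reason history survives a switch can be sketched in a few lines: the session owns the transcript, and the active model is just a parameter of each request. This is a minimal illustration, not Helix's actual code; `complete` stands in for a provider call.

```python
# Minimal sketch (not Helix's real implementation): the transcript lives in
# the session, so changing the model never touches the history.
class ChatSession:
    def __init__(self, model):
        self.model = model
        self.messages = []          # full history, independent of any model

    def switch_model(self, model):
        self.model = model          # history is left untouched

    def send(self, text, complete):
        # `complete` stands in for a provider call: (model, messages) -> reply
        self.messages.append({"role": "user", "content": text})
        reply = complete(self.model, self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

Because every request carries the whole `messages` list, the newly selected model sees everything said so far.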

How the Backend Implements This

Helix's backend model adapter layer unifies four types of Provider interfaces:

  • Anthropic — Claude series, supports extended thinking
  • DeepSeek — passes through reasoning_content
  • OpenAI — standard Chat Completions API
  • OpenAI Responses API — next-generation interface for GPT-5.x
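Whatever the wire format, each adapter can be normalized to a single interface. The shape below is a hypothetical sketch (the names `Provider`, `complete`, and the return fields are illustrative, not Helix's actual types):

```python
from typing import Protocol, runtime_checkable

# Hypothetical sketch of the adapter layer's common interface: every
# provider, whatever its wire format, is reduced to one `complete` call.
@runtime_checkable
class Provider(Protocol):
    def complete(self, model: str, messages: list) -> dict:
        """Return a dict like {"content": ..., "reasoning_content": ...}."""
        ...

class EchoProvider:
    # Toy implementation used only to show the interface in action.
    def complete(self, model, messages):
        return {"content": messages[-1]["content"], "reasoning_content": None}
```

Provider-specific quirks (extended thinking blocks, `reasoning_content`, the Responses API) then live entirely inside each adapter.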

Model routing supports two methods:

  • Exact specification: providerId:modelId (e.g., anthropic:claude-sonnet-4)
  • Prefix inference: pass just the model name, and the system auto-matches the Provider by name prefix
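The two routing methods can be sketched as a small resolution function. This is an assumed implementation for illustration; the real router's data structures may differ.

```python
# Hypothetical sketch of the two routing methods described above.
# `providers` maps a providerId to the model-name prefixes it serves.
def resolve_model(spec, providers):
    if ":" in spec:
        # Exact specification: "anthropic:claude-sonnet-4"
        provider_id, model_id = spec.split(":", 1)
        if provider_id not in providers:
            raise KeyError(f"unknown provider: {provider_id}")
        return provider_id, model_id
    # Prefix inference: match the bare model name by prefix
    for provider_id, prefixes in providers.items():
        if any(spec.startswith(p) for p in prefixes):
            return provider_id, spec
    raise KeyError(f"no provider matches model: {spec}")
```

So `anthropic:claude-sonnet-4` routes directly, while a bare `gpt-4o` falls through to prefix matching.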

Deep Thinking Mode

For complex problems, you need the model to "think it through before speaking," not just output a shallow quick answer.

Anthropic Extended Thinking

Claude models support Extended Thinking mode, which lets the model engage in deep reasoning before producing its formal answer:

  • Default thinking budget: 32K tokens — the model can use up to 32K tokens of "internal thinking" to analyze the problem
  • Thinking process is visible — you can expand and view the model's reasoning steps in the UI
  • Best for: Architecture decisions, algorithm optimization, bug root cause analysis, security vulnerability research, complex refactoring planning
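For reference, a request with extended thinking enabled against Anthropic's Messages API looks roughly like the payload below. The model name and token numbers are illustrative; note that `max_tokens` must be larger than the thinking budget.

```python
# Sketch of an Anthropic Messages API request body with extended thinking
# enabled. No network call is made here; this only builds the payload.
def build_thinking_request(prompt, budget_tokens=32_000):
    return {
        "model": "claude-sonnet-4",          # illustrative model name
        "max_tokens": 64_000,                # must exceed budget_tokens
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }
```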

DeepSeek Reasoning

DeepSeek's thinking mode passes through reasoning_content, showing the model's reasoning process while consuming fewer tokens. A more cost-effective deep thinking option.
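In DeepSeek's OpenAI-compatible responses, the reasoning arrives in a `reasoning_content` field alongside the normal `content`. A minimal pass-through helper, assuming a plain dict-shaped message:

```python
# Separate DeepSeek's chain of thought from the final answer.
# Assumes the message is a plain dict, as in an OpenAI-compatible response.
def split_reasoning(message):
    reasoning = message.get("reasoning_content") or ""
    answer = message.get("content") or ""
    return reasoning, answer
```

A UI can then render the reasoning collapsed and the answer inline.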

When to Enable Deep Thinking

| ✅ Worth Enabling | ❌ Not Needed |
| --- | --- |
| Architecture decisions — weighing multiple approaches | Simple code changes — renaming a variable |
| Algorithm optimization — analyzing time/space complexity | Formatting adjustments — fixing indentation or style |
| Bug root cause analysis — tracing complex call chains | Information lookup — "Which file is this function in?" |
| Security review — considering various attack surfaces | Repetitive tasks — batch-modifying similar code |

Multimodal Support

Helix handles more than just text. Models with multimodal support (like GPT-4o) can understand image inputs:

  • 📸 Architecture diagram analysis — "Explain the data flow in this system diagram"
  • 📊 Chart interpretation — "What problem does this performance monitoring chart indicate?"
  • 🎨 UI feedback — "What could be improved in this design mockup?"
  • 📱 Screenshot debugging — "My app looks like this — why?"

In Helix, simply paste or drag-and-drop images into the chat to send them.


Dual Agent Mode: Two-Model Collaboration

When one model isn't enough, use two.

Dual Agent mode lets two different models (typically Claude + DeepSeek) engage in structured four-phase collaboration on the same problem:

  1. Independent Thinking — Claude and DeepSeek each think about the same problem independently
  2. Cross-Review (Discussion) — each model sees the other's answer and points out strengths and weaknesses, across multiple rounds
  3. Synthesis — Claude combines the best of both approaches into a final solution
  4. Execution — execute according to the final plan (optional)
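The four phases can be sketched as a simple orchestration loop. This is an assumed illustration, not Helix's actual orchestrator: `ask(model, prompt)` stands in for a real provider call, and the optional execution phase is omitted.

```python
# Hypothetical sketch of the Dual Agent flow. `ask(model, prompt)` stands
# in for a real provider call; phase 4 (execution) is omitted.
def dual_agent(problem, ask, rounds=2):
    # Phase 1: independent thinking
    a = ask("claude", problem)
    b = ask("deepseek", problem)
    # Phase 2: cross-review, across multiple rounds
    for _ in range(rounds):
        a = ask("claude", f"Critique the other answer and improve yours:\n{b}")
        b = ask("deepseek", f"Critique the other answer and improve yours:\n{a}")
    # Phase 3: synthesis by Claude
    return ask("claude", f"Combine the best of both answers:\n{a}\n---\n{b}")
```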

Why Better Than a Single Model?

Every model has blind spots. Claude might over-focus on security and miss performance concerns; DeepSeek might produce a quick solution but overlook edge cases. Cross-review lets each model catch the other's blind spots, so the final solution is more comprehensive.

The UI Experience

Helix displays the entire process with clear phase separators:

  • Each phase has a --- Phase Name --- divider
  • Different models' responses carry role labels
  • The final synthesis is marked with 🎯 Final Solution
  • You can watch how the two models inspire each other to arrive at a better answer

Custom Configuration

Custom Providers

Helix supports connecting any OpenAI-compatible model endpoint. Add a Provider in settings:

  • Set the Base URL to point to your endpoint
  • Choose the interface type: OpenAI-compatible / OpenAI Responses API / Anthropic
  • Enter your API Key
  • Add specific models, configuring context window size, max output tokens, temperature, and more
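Put together, a custom Provider entry might look like the fragment below. The field names here are hypothetical, chosen to mirror the settings listed above; adjust them to whatever the actual settings UI or config file uses.

```yaml
# Hypothetical config shape: field names are illustrative, not Helix's
# actual schema. Mirrors the settings listed above.
providers:
  my-gateway:
    base_url: https://llm.example.com/v1
    api_type: openai-compatible     # or: openai-responses / anthropic
    api_key: ${MY_GATEWAY_KEY}      # read from an environment variable
    models:
      - id: my-model
        context_window: 128000
        max_output_tokens: 8192
        temperature: 0.7
```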

Model Configuration in Agent Profiles

Through YAML-format Agent Profiles, you can preset models and parameters for different tasks:

```yaml
profiles:
  code-reviewer:
    model: claude-opus-4
    system_prompt: |
      You are a meticulous code reviewer focused on security,
      performance, and maintainability. Always explain the
      reasoning behind your suggestions.
    thinking_enabled: true

  quick-helper:
    model: gpt-4o-mini
    system_prompt: |
      Answer questions quickly and concisely.
      Prefer giving directly usable code.
    temperature: 0.3
```

Temperature Control

Adjust the balance between creativity and determinism based on task nature:

| Temperature Range | Use Case |
| --- | --- |
| 0.0 – 0.3 | High determinism: test generation, bug fixes, precise code |
| 0.4 – 0.7 | Balanced: everyday coding, refactoring |
| 0.8 – 1.0 | High creativity: brainstorming, documentation, naming suggestions |
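If you script against these ranges, a small lookup keeps the choices consistent. The task names and exact values below are illustrative defaults drawn from the table, not Helix settings.

```python
# Hypothetical mapping of the ranges above onto named task categories.
# Task names and values are illustrative defaults, not Helix settings.
TEMPERATURE_BY_TASK = {
    "test_generation": 0.1,   # high determinism
    "bug_fix": 0.2,
    "everyday_coding": 0.5,   # balanced
    "refactoring": 0.6,
    "brainstorming": 0.9,     # high creativity
}

def temperature_for(task, default=0.5):
    return TEMPERATURE_BY_TASK.get(task, default)
```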