Skip to content

Agent

The agent configuration defines the core settings for your Datus Agent, including the target model selection and all available LLM providers that can be used throughout the system.

Configuration Structure

LLM Configuration (Two-Tier Provider Model)

LLM selection uses a two-tier system. Most users only need the provider-level configuration; custom entries are for self-hosted or private endpoints.

Provider-Level Configuration (Preferred)

Configure credentials under agent.providers.<name>. Available models and metadata are loaded from conf/providers.yml automatically. Use the /model slash command in the CLI to switch between providers and models interactively.

agent:
  providers:
    openai:
      api_key: ${OPENAI_API_KEY}
    deepseek:
      api_key: ${DEEPSEEK_API_KEY}
    claude_subscription:
      auth_type: subscription
    codex:
      auth_type: oauth

Only credentials need to be specified — type, base_url, and model lists are inherited from conf/providers.yml.

Custom / Legacy Models

For self-hosted or private-deployment models not covered by providers.yml, use agent.models:

agent:
  models:
    my-internal:
      type: openai
      base_url: https://internal.example.com/v1
      api_key: ${MY_KEY}
      model: internal-gpt-4

Custom entries appear under the Custom tab in /model and can be activated with /model custom:my-internal.

Target Model Selection

The active model is determined by project-level override (.datus/config.yml) or the base agent.target field. The /model command writes to the project override automatically.

Project-Level Override (.datus/config.yml)

The /model command persists selections to .datus/config.yml in the current working directory:

# Provider-level selection
target:
  provider: openai
  model: gpt-4.1

# Or, custom model selection
target:
  custom: my-internal

Legacy Target (backwards compatible)

The agent.target field in agent.yml is still honored when no project-level override exists:

agent:
  target: openai  # Legacy: key from agent.models section

Resolution order: .datus/config.ymlagent.target.

Response Language

language is an optional field that pins the natural language used by every agentic node for user-facing outputs — replies, summaries, clarifying questions, sub-agent prompts issued via task, and prose written into files. Code, SQL, identifiers, file paths, URLs, and JSON keys stay in their original form regardless of the setting.

When omitted, no directive is injected into the system prompt and the model picks its own response language per turn.

agent:
  # Leave it out entirely to let the model decide.
  # Set a code to pin every agentic node to that language.
  language: zh   # Common codes: en, zh, ja, ko, es, fr, de, pt, ru, it

Built-in code → name mapping (injected into the system prompt): en → English, zh / zh-cn → Chinese, zh-tw → Traditional Chinese, ja → Japanese, ko → Korean, es → Spanish, fr → French, de → German, pt → Portuguese, ru → Russian, it → Italian. Unknown codes are used verbatim.

Chat API requests can override this per task by sending a language field in the request body (see Chat API). CLI usage inherits the yaml default.

Models Configuration (Custom Entries)

The agent.models section is used for self-hosted or private-deployment LLM endpoints. For standard providers (OpenAI, DeepSeek, etc.), use agent.providers instead.

Required Parameters per custom entry:

  • Entry key (models.<key>) — Logical identifier, referenced by target: {custom: <key>} or node model fields
  • type — Interface type (openai, claude, deepseek, kimi, gemini, minimax, glm, codex)
  • base_url — API endpoint URL
  • api_key — API key (supports ${ENV_VAR} substitution)
  • model — Model name / SKU
agent:
  models:
    my-internal:
      type: openai
      base_url: https://internal.example.com/v1
      api_key: ${MY_KEY}
      model: internal-gpt-4

Environment Variables

Use environment variables to securely store API keys and other sensitive information:

# Recommended: Using environment variables
api_key: ${YOUR_API_KEY}

# Not recommended for production
api_key: "sk-your-actual-key-here"

Supported LLM Providers

Providers are defined in conf/providers.yml and activated by adding credentials under agent.providers. Use the /model command to configure and switch providers interactively.

  • Provider credentials live under agent.providers.<name> in agent.yml
  • The active provider/model is stored in .datus/config.yml (written by /model)
  • Node-level model overrides can still reference entries in agent.models for custom endpoints

General-purpose providers

Provider Typical models Interface Type Auth
openai gpt-5.2, gpt-4.1, o3 openai API key
deepseek deepseek-chat, deepseek-reasoner deepseek API key
claude claude-sonnet-4-5, claude-opus-4-5 claude API key
kimi kimi-k2.5, kimi-k2-thinking kimi API key
qwen qwen3-max, qwen3-coder-plus openai API key
gemini gemini-2.5-flash, gemini-2.5-pro gemini API key
minimax MiniMax-M2.7, MiniMax-M2.5 minimax API key
glm glm-5, glm-4.7 glm API key

Special-auth providers

Provider Interface Type Auth Notes
claude_subscription claude Claude subscription token The wizard first tries to auto-detect a local subscription credential and otherwise prompts for sk-ant-oat01-...
codex codex OAuth Uses locally available Codex OAuth credentials and verifies connectivity

Coding Plan providers

These providers target coding/planning-oriented endpoints. Even though their names include coding, they are configured exactly like any other model entry and can be used as agent.target or referenced from node-level model fields.

Provider Default model Interface Type Notes
alibaba_coding qwen3-coder-plus claude DashScope Anthropic-compatible coding endpoint
bigmodel_coding glm-5.1 claude BigModel Anthropic-compatible coding endpoint
zai_coding glm-5.1 claude Z.AI Anthropic-compatible coding endpoint
minimax_coding MiniMax-M2.7 claude MiniMax Anthropic-compatible coding endpoint
kimi_coding kimi-for-coding claude Kimi coding endpoint

When to choose a coding plan provider

If your priority is general chat, SQL generation, or cost efficiency, start with a regular provider.

If you want a default model that is better aligned with planning, code generation, and structured task decomposition, or if you use Plan Mode frequently, add one of the *_coding providers and route specific nodes to it.

Environment variables and model overrides

All providers support environment-variable references in api_key, for example:

api_key: ${OPENAI_API_KEY}

For OpenAI, DeepSeek, Claude, Kimi, Qwen, and Gemini, the configuration wizard can prompt with provider-specific environment variable hints. For minimax, glm, and the *_coding providers, you can still enter values such as ${MINIMAX_API_KEY}, ${GLM_API_KEY}, ${KIMI_API_KEY}, or ${DASHSCOPE_API_KEY} directly.

The current implementation also auto-applies fixed parameter overrides for a few models:

  • kimi-k2.5: temperature: 1.0, top_p: 0.95
  • qwen3-coder-plus: temperature: 1.0, top_p: 0.95

Provider Configuration Examples

With the new provider-level configuration, you only need to set credentials. All other fields (type, base_url, models list) are inherited from conf/providers.yml:

agent:
  providers:
    openai:
      api_key: ${OPENAI_API_KEY}
    deepseek:
      api_key: ${DEEPSEEK_API_KEY}
    claude:
      api_key: ${ANTHROPIC_API_KEY}
    gemini:
      api_key: ${GEMINI_API_KEY}
    kimi:
      api_key: ${KIMI_API_KEY}
    qwen:
      api_key: ${DASHSCOPE_API_KEY}
agent:
  providers:
    claude_subscription:
      auth_type: subscription
      # Token is auto-detected or entered via /model
agent:
  providers:
    codex:
      auth_type: oauth
      # OAuth flow is handled via /model
agent:
  models:
    my-internal:
      type: openai
      base_url: https://internal.example.com/v1
      api_key: ${MY_KEY}
      model: internal-gpt-4

Legacy Format

The previous format with full type, base_url, api_key, and model under agent.models is still supported for backward compatibility. Existing configurations continue to work without changes.

Agentic Nodes

agent.agentic_nodes is where Datus configures chat and subagent behavior.

This section is used for:

  • built-in agentic nodes such as chat, explore, gen_sql, gen_report, gen_dashboard, and scheduler
  • custom subagents created with the unified agent TUI (/agent or /subagent)
  • advanced manual aliases that point a custom name at a built-in node class

Common Fields

The runtime currently reads these commonly used fields from agentic_nodes entries:

  • model: provider key from agent.models (custom entries only; omit to inherit the active provider/model)
  • system_prompt: subagent name / prompt template base name
  • node_class: node implementation to use, such as gen_sql, gen_report, explore, gen_table, gen_skill, gen_dashboard, or scheduler
  • prompt_version, prompt_language
  • agent_description
  • tools, mcp, skills
  • rules
  • max_turns
  • workspace_root
  • scoped_context
  • subagents
  • semantic_adapter for semantic-model and metrics agents
  • bi_platform for dashboard agents
  • scheduler_service for scheduler agents

scoped_kb_path is deprecated. New configs use shared global storage with query-time filters instead of per-subagent scoped KB directories.

subagents Delegation Control

subagents controls whether the node exposes the task() tool and which subagent types it may delegate to.

  • subagents: "*": allow all discoverable subagents except self
  • subagents: explore, gen_sql: allow only a named subset
  • blank or omitted:
  • chat defaults to *
  • most other agentic nodes default to explore
  • explicitly setting an empty value disables delegation

scoped_context

scoped_context limits what the subagent should see from shared metadata and knowledge:

scoped_context:
  datasource: finance
  tables: mart.finance_daily, mart.finance_budget
  metrics: finance.revenue.daily_revenue
  sqls: finance.revenue.region_rollup

When writing YAML manually, set datasource explicitly. The unified agent TUI's Custom-tab wizard fills it from the current database automatically.

Example

agent:
  agentic_nodes:
    chat:
      model: claude
      max_turns: 50
      subagents: "*"

    finance_report:
      node_class: gen_report
      model: claude
      system_prompt: finance_report
      prompt_version: "1.0"
      prompt_language: en
      agent_description: "Finance reporting assistant"
      tools: semantic_tools.*, db_tools.*, context_search_tools.list_subject_tree
      subagents: explore, gen_sql
      max_turns: 30
      scoped_context:
        datasource: finance
        tables: mart.finance_daily
        metrics: finance.revenue.daily_revenue
        sqls: finance.revenue.region_rollup

    sales_dashboard:
      node_class: gen_dashboard
      model: claude
      bi_platform: superset
      max_turns: 30

    semantic_metrics:
      node_class: gen_metrics
      model: claude
      semantic_adapter: metricflow
      max_turns: 30

    etl_scheduler:
      node_class: scheduler
      model: claude
      scheduler_service: airflow_prod
      max_turns: 30

Complete Configuration Example

Here's a comprehensive agent configuration example with the provider-level format:

agent.yml
agent:
  # Provider credentials (models and metadata from conf/providers.yml)
  providers:
    openai:
      api_key: ${OPENAI_API_KEY}
    deepseek:
      api_key: ${DEEPSEEK_API_KEY}
    claude:
      api_key: ${ANTHROPIC_API_KEY}
    gemini:
      api_key: ${GEMINI_API_KEY}
    claude_subscription:
      auth_type: subscription
    codex:
      auth_type: oauth

  # Custom models for self-hosted endpoints (optional)
  models:
    my-internal:
      type: openai
      base_url: https://internal.example.com/v1
      api_key: ${MY_KEY}
      model: internal-gpt-4

And the corresponding project-level override:

.datus/config.yml
target:
  provider: openai
  model: gpt-4.1
default_datasource: my_duckdb