Introduction
Datus is an open-source data engineering agent that builds evolvable context for your data systems. Unlike traditional tools that merely move data, Datus captures, learns, and evolves the knowledge surrounding your data. It turns metadata, reference SQL, semantic models, and metrics into a living knowledge base that grounds AI queries and reduces hallucinations.
With Datus, data engineers shift from writing repetitive SQL to building reusable, AI-ready context. Every query, correction, and domain rule becomes long-term memory—enabling specialized subagents that deliver accurate, domain-aware analytics to your entire organization.

Three Entry Points for Different Users
- Datus-CLI: An AI-powered command-line interface for data engineers—think "Claude Code for data engineers." Write SQL, build subagents, and construct context interactively.
- Datus-Chat: A web chatbot providing multi-turn conversations with built-in feedback mechanisms (upvotes, issue reports, success stories) for data analysts.
- Datus-API: RESTful APIs for other agents or applications that need stable, accurate data services.
Two Execution Modes
- Agentic Mode: Ideal for ad-hoc development and exploratory workflows. Flexible, conversational, and context-aware through specialized subagents.
- Workflow Mode: Optimized for production scenarios requiring high stability and orchestration. Workflows can use subagents as nodes for complex pipelines.
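To make the idea of "subagents as nodes" concrete, here is a minimal sketch of a fixed pipeline whose steps are subagents. Every name in it (`Workflow`, `schema_subagent`, `sql_subagent`, the shared `state` dictionary) is hypothetical and chosen for illustration. It is not the Datus API, and the subagents are plain stand-in functions rather than LLM-backed agents.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration only; not Datus's actual API.
Subagent = Callable[[Dict[str, str]], Dict[str, str]]


@dataclass
class Workflow:
    """A fixed, ordered pipeline whose nodes are subagents."""
    nodes: List[Subagent] = field(default_factory=list)

    def run(self, state: Dict[str, str]) -> Dict[str, str]:
        # Each node reads the shared state and returns updates to merge in.
        for node in self.nodes:
            state = {**state, **node(state)}
        return state


def schema_subagent(state: Dict[str, str]) -> Dict[str, str]:
    # Stand-in for a subagent that selects the relevant tables.
    return {"tables": "orders"}


def sql_subagent(state: Dict[str, str]) -> Dict[str, str]:
    # Stand-in for a subagent that drafts SQL from the selected tables.
    return {"sql": f"SELECT COUNT(*) FROM {state['tables']}"}


workflow = Workflow(nodes=[schema_subagent, sql_subagent])
result = workflow.run({"question": "How many orders are there?"})
print(result["sql"])
```

The fixed node order is what distinguishes this from an agentic loop: the same steps run in the same sequence every time, which is the stability property that production pipelines usually need.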
Context Engine at the Core
The heart of Datus is its Context Engine, which combines human expertise with AI capabilities:
- Automatically captures metadata, metrics, reference SQL, documents, and success stories
- Supports human-in-the-loop curation and refinement
- Powers both subagents and workflows with rich, domain-specific context
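The capture, curate, and serve cycle described above can be sketched as a small in-memory store. This is an illustrative model under stated assumptions, not how Datus implements its Context Engine: `ContextItem`, `ContextStore`, and the methods `capture`, `approve`, and `context_for` are all hypothetical names, and keyword matching stands in for whatever retrieval a real system would use.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical data model for illustration; not Datus's actual API.
@dataclass
class ContextItem:
    kind: str               # e.g. "metric", "reference_sql", "document"
    text: str
    approved: bool = False  # flipped by a human reviewer


class ContextStore:
    def __init__(self) -> None:
        self._items: List[ContextItem] = []

    def capture(self, kind: str, text: str) -> ContextItem:
        # Automatic capture: new items start out unreviewed.
        item = ContextItem(kind, text)
        self._items.append(item)
        return item

    def approve(self, item: ContextItem) -> None:
        # Human-in-the-loop curation step.
        item.approved = True

    def context_for(self, keyword: str) -> List[str]:
        # Only approved items are handed to subagents and workflows.
        kw = keyword.lower()
        return [i.text for i in self._items
                if i.approved and kw in i.text.lower()]


store = ContextStore()
metric = store.capture("metric", "revenue = SUM(orders.amount)")
store.capture("reference_sql", "SELECT SUM(amount) FROM orders -- revenue draft")
store.approve(metric)
print(store.context_for("revenue"))
```

Only the approved metric definition is returned. The unreviewed draft SQL stays in the store but is not served as context until someone approves it.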
Flexible Integration Layer
Datus integrates seamlessly with your existing stack:
- LLMs: OpenAI, Claude, DeepSeek, Qwen, Kimi, and more (see Configuration)
- Data Warehouses: StarRocks, Snowflake, DuckDB, SQLite, PostgreSQL, and others (see Namespace Setup)
- Semantic Layers: MetricFlow support for metric definitions and queries
- Extensibility: Add custom integrations via MCP (Model Context Protocol)
Getting Started
Get your Datus Agent up and running in minutes.
Start Here
Discover how Datus applies contextual data engineering to your data assets so that it learns and improves continuously.
Learn Key Concepts
Important Topics
- Datus CLI: Command-line interface for local development and real-time preview of your data workflows.
- Knowledge Base: Centralized repository for organizing and managing your data assets and documentation.
- Subagent System: Extend Datus with specialized subagents for different data engineering tasks and workflows.
- Workflow Management: Design and orchestrate complex data pipelines with a configurable workflow builder.