Quickstart

Get started with Datus Agent in just a few minutes. This guide will walk you through installation, setup, and your first interactions with Datus.

Step 1: Installation & Setup

Install Python 3.12

Datus requires a Python 3.12 environment. Choose your preferred method:

Using conda:

conda create -n datus python=3.12
conda activate datus

Using virtualenv:

virtualenv datus --python=python3.12
source datus/bin/activate

Using uv:

uv venv datus --python 3.12
source datus/bin/activate
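
After activating the environment, a quick check can confirm that the interpreter matches the version this guide calls for. The sketch below is illustrative only: the helper name `is_supported` is made up for this example, and the exact-match rule (3.12 only) is an assumption drawn from the requirement stated above, not a documented Datus constraint.

```python
import sys

# Version stated in this guide; treated here as an exact major.minor match (assumption).
REQUIRED = (3, 12)


def is_supported(version_info=sys.version_info, required=REQUIRED):
    """Return True when the interpreter's major.minor equals `required`."""
    return tuple(version_info[:2]) == required


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if is_supported() else "unexpected"
    print(f"Python {found}: {status} (this guide expects 3.12)")
```

Run it with the `python` from the activated environment; if it reports an unexpected version, the environment was probably created with a different interpreter.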

Install Datus Agent

From PyPI:

pip install datus-agent

From TestPyPI:

pip install --no-deps -i https://test.pypi.org/simple/ datus-agent

Initialize Configuration

Run the initialization command:

datus-agent init

The setup will guide you through:

1. LLM Configuration - Configure your preferred LLM provider (OpenAI, DeepSeek, Claude, Kimi, Qwen)

2. Namespace Setup - Connect to your database. For a quick start, use the demo database:

Demo Database

Datus provides a pre-configured demo DuckDB database for testing.

Connection string: ~/.datus/sample/duckdb-demo.duckdb

3. Workspace Configuration - Set up your SQL files directory (default: ~/.datus/workspace)

4. Knowledge Base (Optional) - Initialize vector DB for metadata and reference SQL

After setup completes, you're ready to launch Datus!
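
Before launching, it can help to confirm where the demo database path from the setup step points on your machine. The following is a minimal sketch that uses only the Python standard library; the function name `resolve_demo_db` is hypothetical, and the script only inspects the path quoted in this guide without assuming how or when Datus creates the file.

```python
from pathlib import Path

# Path quoted in this guide; "~" stands for the current user's home directory.
DEMO_DB = "~/.datus/sample/duckdb-demo.duckdb"


def resolve_demo_db(raw: str = DEMO_DB) -> Path:
    """Expand a leading "~" so the path can be checked on disk."""
    return Path(raw).expanduser()


if __name__ == "__main__":
    path = resolve_demo_db()
    if path.is_file():
        print(f"Found demo database: {path} ({path.stat().st_size} bytes)")
    else:
        print(f"No file at {path}; check the path entered during setup.")
```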

Step 2: Launch Datus CLI

Start the Datus CLI with your configured namespace:

Configuration

You can connect to different databases by adding namespaces in agent.yml. See our Configuration guide for details.

Terminal
datus-cli --namespace duckdb-demo
Initializing AI capabilities in background...

Datus - AI-powered SQL command-line interface
Type '.help' for a list of commands or '.exit' to quit.

Namespace duckdb selected
Connected to duckdb using database duckdb
Context: Current: database: duckdb
Type SQL statements or use ! @ . commands to interact.
Datus>

Step 3: Start Using Datus

Tip

You can execute SQL in Datus just like in a SQL editor.

List all tables:

Terminal
Datus> .tables
Tables in Database duckdb-demo
+---------------------+
| Table Name          |
+=====================+
| bank_failures       |
| boxplot             |
| calendar            |
| candle              |
| christmas_cost      |
| companies           |
| country_stats_scatter|
| gold_vs_bitcoin     |
| japan_births_deaths |
| japan_population    |
| metrics             |
| niger_population    |
| quotes              |
| radar               |
| sankey              |
| search_trends       |
| tree                |
+---------------------+

The demo database includes a table named gold_vs_bitcoin. View its schema:

Terminal
Datus> desc gold_vs_bitcoin
+------------------+------------------+------------------+------------------+------------------+------------------+
| column_name      | column_type      | null             | key              | default          | extra            |
+==================+==================+==================+==================+==================+==================+
| time             | TIMESTAMP        | YES              | None             | None             | None             |
| gold             | DOUBLE           | YES              | None             | None             | None             |
| bitcoin          | DOUBLE           | YES              | None             | None             | None             |
+------------------+------------------+------------------+------------------+------------------+------------------+
Returned 3 rows in 0.04 seconds

Tip

To chat, start a line with / and continue in natural language.

Suppose we want to understand the correlation between gold and Bitcoin. We can ask the Datus agent directly:

Terminal
Datus> /Detailed analysis of gold–Bitcoin correlation.

Datus will display its thinking process to show how it approaches the task, then provide a detailed report.

Detailed Analysis Output
Creating new chat session...
Processing chat request...
+---------------------------------------------------------------------------------------------------------------------------------------+
|                                                    Action Stream                                                                      |
+=======================================================================================================================================+
| 🔧 Tool call - search_files('{"path": "/Users/yt/.datus/workspace", "pattern": ...') - ✓                                            |
|     No matches found (0.5s)                                                                                                          |
| 💬 Thinking: Let me check the directory structure to see what's available:                                                           |
| 🔧 Tool call - list_directory('{"path": "/Users/yt/.datus/workspace"}') - ✓ (0.5s)                                                   |
| 💬 Thinking: Now let me analyze the gold-Bitcoin correlation by examining the data and performing a comprehensive correlation        |
| analysis:                                                                                                                            |
| 🔧 Tool call - read_query('{"sql": "SELECT \n    COUNT(*) as total_records,\n...') - ✓                                               |
|     ✓ 1 rows (0.5s)                                                                                                                  |
| 💬 Thinking: Now let me calculate the correlation coefficient and perform a detailed analysis:                                       |
| 🔧 Tool call - read_query('{"sql": "SELECT \n    -- Correlation coefficient\n...') - ✓                                               |
|     ✓ 1 rows (0.5s)                                                                                                                  |
| 💬 Thinking: Let me get more detailed statistics and time-based analysis:                                                            |
| 🔧 Tool call - read_query('{"sql": "WITH daily_aggregates AS (\n    SELECT \n...') - ✓                                               |
|     ✓ 1 rows (0.5s)                                                                                                                  |
| 💬 Thinking: Let me check the distribution of the data and look for any patterns:                                                    |
| 🔧 Tool call - read_query('{"sql": "SELECT \n    -- Price movement analysis\n...') - ✓                                               |
|     ✗ Failed:(error_code=500006, e...) (0.5s)                                                                                        |
| 💬 Thinking: Let me fix the query to properly analyze price movement patterns:                                                       |
| 🔧 Tool call - read_query('{"sql": "WITH price_changes AS (\n    SELECT \n   ...') - ✓                                               |
|     ✓ 5 rows (0.5s)                                                                                                                  |
| 💬 Thinking: Now let me analyze the strength of the correlation across different time periods and price levels:                      |
| 🔧 Tool call - read_query('{"sql": "WITH price_quartiles AS (\n    SELECT \n ...') - ✓                                               |
|     ✓ 16 rows (0.5s)                                                                                                                 |
| 💬 Thinking: Let me also analyze the volatility and risk characteristics:                                                            |
| 🔧 Tool call - read_query('{"sql": "SELECT \n    -- Risk and return metrics\n...') - ✓                                               |
|     ✗ Failed:(error_code=500006, e...) (0.5s)                                                                                        |
| 💬 Thinking: Let me correct the query with the proper function names:                                                                |
| 🔧 Tool call - read_query('{"sql": "SELECT \n    -- Risk and return metrics\n...') - ✓                                               |
|     ✓ 1 rows (0.5s)                                                                                                                  |
| 💬 Thinking: Finally, let me examine the time series data to understand the temporal patterns:                                       |
| 🔧 Tool call - read_query('{"sql": "SELECT \n    DATE(time) as trading_date,\...') - ✓                                               |
|     ✓ 1 rows (0.5s)                                                                                                                  |
| 💬 Chat interaction completed successfully                                                                                           |
+---------------------------------------------------------------------------------------------------------------------------------------+

Generated SQL:

Generated Query
-- Generated SQL (copied)
SELECT
    -- Overall correlation and basic stats
    CORR(gold, bitcoin) as correlation_coefficient,
    COUNT(*) as total_observations,
    AVG(gold) as avg_gold_price,
    AVG(bitcoin) as avg_bitcoin_price,
    STDDEV(gold) as gold_volatility,
    STDDEV(bitcoin) as bitcoin_volatility,
    COVAR_POP(gold, bitcoin) as covariance,

    -- Price movement patterns
    (SELECT COUNT(*) FROM (
        SELECT
            CASE
                WHEN gold_change > 0 AND bitcoin_change > 0 THEN 'Both Up'
                WHEN gold_change < 0 AND bitcoin_change < 0 THEN 'Both Down'
                WHEN gold_change > 0 AND bitcoin_change < 0 THEN 'Gold Up, Bitcoin Down'
                WHEN gold_change < 0 AND bitcoin_change > 0 THEN 'Gold Down, Bitcoin Up'
            END as pattern
        FROM (
            SELECT
                gold - LAG(gold) OVER (ORDER BY time) as gold_change,
                bitcoin - LAG(bitcoin) OVER (ORDER BY time) as bitcoin_change
            FROM gold_vs_bitcoin
        ) WHERE gold_change IS NOT NULL
    ) WHERE pattern = 'Both Up') as both_up_count,

    (SELECT COUNT(*) FROM (
        SELECT
            CASE
                WHEN gold_change > 0 AND bitcoin_change > 0 THEN 'Both Up'
                WHEN gold_change < 0 AND bitcoin_change < 0 THEN 'Both Down'
                WHEN gold_change > 0 AND bitcoin_change < 0 THEN 'Gold Up, Bitcoin Down'
                WHEN gold_change < 0 AND bitcoin_change > 0 THEN 'Gold Down, Bitcoin Up'
            END as pattern
        FROM (
            SELECT
                gold - LAG(gold) OVER (ORDER BY time) as gold_change,
                bitcoin - LAG(bitcoin) OVER (ORDER BY time) as bitcoin_change
            FROM gold_vs_bitcoin
        ) WHERE gold_change IS NOT NULL
    ) WHERE pattern = 'Both Down') as both_down_count
FROM gold_vs_bitcoin
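
For readers less familiar with the aggregates in the generated query, the sketch below restates in plain Python what the query asks the database to compute: a population covariance, a Pearson correlation, and row-to-row changes of the kind produced by `LAG(...) OVER (ORDER BY time)`. This is not how Datus or DuckDB implement these functions; the helper names are made up for illustration, and the sample values are invented rather than taken from the demo database.

```python
from math import sqrt


def covar_pop(xs, ys):
    """Population covariance: the mean of the products of deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def corr(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations.
    Raises ZeroDivisionError if either series is constant."""
    return covar_pop(xs, ys) / sqrt(covar_pop(xs, xs) * covar_pop(ys, ys))


def lagged_changes(values):
    """Difference from the previous row, like `x - LAG(x) OVER (ORDER BY time)`.
    The first row has no predecessor, so the result is one element shorter."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


def count_moves(gold, bitcoin):
    """Count the steps where both series rise and where both fall."""
    pairs = list(zip(lagged_changes(gold), lagged_changes(bitcoin)))
    both_up = sum(1 for g, b in pairs if g > 0 and b > 0)
    both_down = sum(1 for g, b in pairs if g < 0 and b < 0)
    return both_up, both_down


if __name__ == "__main__":
    # Invented values, assumed to be already ordered by time.
    gold = [1.0, 2.0, 3.0, 2.0, 1.0]
    bitcoin = [5.0, 6.0, 5.0, 4.0, 3.0]
    print("correlation:", round(corr(gold, bitcoin), 3))
    print("both up / both down:", count_moves(gold, bitcoin))
```

Because the first row has no previous value, a table with N rows yields N - 1 changes, which is why the query filters on `gold_change IS NOT NULL`.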

Analysis Results:

Gold-Bitcoin Correlation Analysis
+===============================================================================================================+
|                                        Gold-Bitcoin Correlation Analysis                                     |
+===============================================================================================================+

                                                      📊 Overall Correlation

• Correlation Coefficient: -0.225 (Weak negative correlation)
• Total Observations: 954 data points
• Time Period: Single day - December 1, 2024 (16:00 to 16:15)


                                                        💰 Price Statistics

• Average Gold Price: 0.684
• Average Bitcoin Price: 0.785
• Gold Volatility: 0.200
• Bitcoin Volatility: 0.233
• Covariance: -0.0105


                                                    📈 Price Movement Patterns

Based on 953 observed price changes:

• Both Assets Moving Up: 216 times (22.67%)
• Both Assets Moving Down: 203 times (21.30%)
• Gold Up, Bitcoin Down: 189 times (19.83%)
• Gold Down, Bitcoin Up: 190 times (19.94%)


                                                          🎯 Key Insights

                                                   1. Weak Negative Correlation

The -0.225 correlation suggests these assets tend to move in opposite directions more often than together, though the relationship
is weak.

                                                   2. Similar Movement Patterns

Both assets show remarkably similar movement patterns:

• Approximately 44% of the time they move in the same direction
• Approximately 40% of the time they move in opposite directions

                                                     3. Volatility Comparison

Bitcoin shows slightly higher volatility (0.233) compared to Gold (0.200), which aligns with Bitcoin's reputation as a more
volatile asset.

                                                   4. Risk-Adjusted Performance

• Gold Sharpe Ratio: 3.42
• Bitcoin Sharpe Ratio: 3.37

Both assets show strong risk-adjusted returns during this period.


                                                           ⚠️ Limitations

This analysis is based on a very short time window (15 minutes of data). For more robust correlation analysis, data spanning
multiple days/weeks would be needed to observe:

• Longer-term correlation trends
• Market regime changes
• Impact of macroeconomic events
• Seasonal patterns


                                                    💡 Investment Implications

The weak negative correlation suggests these assets could provide some diversification benefits, though the relationship is not
strong enough for reliable hedging strategies based on this limited dataset.
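
Because the report is generated by an LLM, it can be worth spot-checking its arithmetic. The sketch below recomputes the percentages from the counts in the report and checks that the stated covariance and volatilities are roughly consistent with the stated correlation. All numbers are copied from the sample output above; the helper names are hypothetical, and the consistency check is only approximate because the report's figures are rounded and the volatility figures may be sample rather than population standard deviations.

```python
# Figures copied from the sample report above.
TOTAL_CHANGES = 953
PATTERN_COUNTS = {
    "Both Up": 216,
    "Both Down": 203,
    "Gold Up, Bitcoin Down": 189,
    "Gold Down, Bitcoin Up": 190,
}


def percent(count, total=TOTAL_CHANGES):
    """Share of `total`, as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


def implied_correlation(covariance, std_x, std_y):
    """Correlation implied by a covariance and two standard deviations."""
    return covariance / (std_x * std_y)


if __name__ == "__main__":
    for name, count in PATTERN_COUNTS.items():
        print(f"{name}: {percent(count)}%")

    same = PATTERN_COUNTS["Both Up"] + PATTERN_COUNTS["Both Down"]
    opposite = sum(PATTERN_COUNTS.values()) - same
    print(f"Same direction: {percent(same)}%")
    print(f"Opposite direction: {percent(opposite)}%")

    # Changes that fall into none of the four patterns.
    print("Unclassified changes:", TOTAL_CHANGES - sum(PATTERN_COUNTS.values()))

    print("Implied correlation:", round(implied_correlation(-0.0105, 0.200, 0.233), 3))
```

The four patterns do not add up to 953. A plausible reason is that the CASE expression in the generated query has no branch for a change of exactly zero, so steps where either price stayed flat fall outside all four patterns.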

If you want detailed information for every step, press y at the end:

Terminal
Would you like to check the details? (y/n): y

Datus will then show details in a new screen.

Step 4: Advanced Features

Context Management

Note

This section applies if you have initialized the knowledge base for metadata.

To add a file, metric, or table to the context, use @. Retrieving items manually gives the LLM more precise context, narrows the search scope, and improves answer accuracy.

Terminal
Datus> /@Table duckdb-demo.main.bank_failures

Datus will automatically analyze the table and add metadata to the context.

Context Analysis Output
Processing chat request...
+----------------------------------------------------------------------------------------------------------------------------------+
|                                                    Action Stream                                                                 |
+==================================================================================================================================+
| 🟡 User: @Table duckdb-demo.main.bank_failures                                                                                  |
| 💬 Generating response with tools...                                                                                            |
| 💬 Thinking: I'll provide a comprehensive analysis of the bank_failures table. Let me start by examining the data structure     |
| and then perform various analyses.                                                                                              |
| 🔧 Tool call - describe_table('{"table_name": "bank_failures"}') - ✓                                                            |
|     ✓ 7 columns (0.5s)                                                                                                          |
| 💬 Thinking: Now let me perform a comprehensive analysis of the bank failures data:                                             |
| 🔧 Tool call - read_query('{"sql": "-- Comprehensive Bank Failures Analysis\n...') - ✓                                          |
|     ✗ Failed:(error_code=500006, e...) (0.5s)                                                                                   |
| 💬 Thinking: Let me break this down into separate queries for a comprehensive analysis:                                         |
| 🔧 Tool call - read_query('{"sql": "-- 1. Basic overview statistics\nSELECT \...') - ✓                                          |
|     ✓ 1 rows (0.5s)                                                                                                             |
| 🔧 Tool call - read_query('{"sql": "-- 2. Yearly analysis of bank failures\nS...') - ✓                                          |
|     ✓ 14 rows (0.5s)                                                                                                            |
| 🔧 Tool call - read_query('{"sql": "-- 3. State-wise analysis (top 15 states)...') - ✓                                          |
|     ✓ 15 rows (0.5s)                                                                                                            |
| 🔧 Tool call - read_query('{"sql": "-- 4. Largest bank failures (top 20 by as...') - ✓                                          |
|     ✓ 20 rows (0.5s)                                                                                                            |
| 🔧 Tool call - read_query('{"sql": "-- 5. Acquisition analysis (top acquirers...') - ✓                                          |
|     ✓ 15 rows (0.5s)                                                                                                            |
| 🔧 Tool call - read_query('{"sql": "-- 6. Asset size distribution analysis\nW...') - ✓                                          |
|     ✓ 6 rows (0.5s)                                                                                                             |
| 🔧 Tool call - read_query('{"sql": "-- 7. Recent failures (2023-2024)\nSELECT...') - ✓                                          |
|     ✓ 8 rows (0.5s)                                                                                                             |
| 💬 Chat interaction completed successfully                                                                                      |
+----------------------------------------------------------------------------------------------------------------------------------+

+==================================================================================================================================+
|                                       🏦 Comprehensive Bank Failures Analysis (2008-2024)                                       |
+==================================================================================================================================+

                                                       📊 Executive Summary

• Total Failures: 545 banks failed between 2008-2024
• Total Assets: $1.696 trillion in failed bank assets
• Time Span: 16 years (January 2008 - October 2024)
• Geographic Reach: 42 states, 419 cities affected
• Average Bank Size: $3.11 billion


                                                    📈 Key Historical Patterns

                                                       Major Crisis Periods

1 2008-2009 Financial Crisis: 166 failures, $938.1B assets
  • Peak: 2009 (140 failures, $169.5B assets)
  • Includes Bear Stearns ($395B) and Washington Mutual ($307B)
2 2010-2014 Post-Crisis Cleanup: 270 failures, $150.0B assets
  • Peak: 2010 (157 failures - highest count)
  • Smaller institutions, regulatory cleanup
3 2023 Banking Stress: 6 failures, $572.7B assets
  • Major failures: First Republic ($229B), Silicon Valley Bank ($209B), Signature Bank ($118B)


                                                    🗺️ Geographic Distribution

                                                  Top 5 States by Failure Count

1 Georgia: 91 failures (16.7%) - $34.4B assets
2 Florida: 73 failures (13.4%) - $39.2B assets
3 Illinois: 66 failures (12.1%) - $37.7B assets
4 California: 43 failures (7.9%) - $559.8B assets
5 Minnesota: 23 failures (4.2%) - $3.2B assets

                                                   Top 5 States by Assets Lost

1 California: $559.8B (33.0% of total)
2 New York: $513.4B (30.3% of total)*
3 Washington: $318.9B (18.8% of total)
4 Illinois: $37.7B (2.2% of total)
5 Florida: $39.2B (2.3% of total)

*Includes Bear Stearns and Signature Bank


                                                      💰 Asset Size Analysis

                                                        Size Distribution


Asset Range   Failures   % of Total   Total Assets   % of Assets   Avg Size
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Under $100M   136        24.9%        $7.7B          0.5%          $56M
$100M-$500M   275        50.5%        $66.5B         3.9%          $242M
$500M-$1B     57         10.5%        $40.3B         2.4%          $707M
$1B-$5B       55         10.1%        $119.6B        7.1%          $2.2B
$5B-$10B      7          1.3%         $43.8B         2.6%          $6.3B
Over $10B     15         2.8%         $1,418.2B      83.6%         $94.5B


Key Insight: While 75% of failures were under $1B in assets, the 15 largest failures (>$10B) represent 83.6% of all failed assets.


                                                     🏆 Largest Bank Failures

1 Bear Stearns (2008) - $395.0B (J.P. Morgan Chase)
2 Washington Mutual (2008) - $307.0B (JPMorgan Chase)
3 First Republic Bank (2023) - $229.0B (JPMorgan Chase)
4 Silicon Valley Bank (2023) - $209.0B (First Citizens)
5 Signature Bank (2023) - $118.4B (Flagstar Bank)


                                                     🏛️ Acquisition Landscape

                                                    Top Acquiring Institutions

1 U.S. Bank, N.A.: 13 acquisitions ($38.7B)
2 State Bank and Trust: 10 acquisitions ($3.5B)
3 Ameris Bank: 9 acquisitions ($1.9B)
4 Stearns Bank, N.A.: 9 acquisitions ($1.7B)
5 First-Citizens Bank: 7 acquisitions ($7.8B)

                                                         Notable Patterns

• JPMorgan Chase: Acquired 3 institutions but captured $930B+ in assets
• 20 banks: Had no acquirer (insured depositors paid directly)
• Consolidation: Larger banks acquired most failed institutions


                                                   📅 Recent Trends (2023-2024)

                                                       2023 Major Failures

• First Republic Bank: $229B (May 2023)
• Silicon Valley Bank: $209B (March 2023)
• Signature Bank: $118B (March 2023)
• Silvergate Bank: $16B (March 2023)

                                                          2024 Failures

• Republic First Bancorp: $6B (April 2024)
• First National Bank of Lindsay: $108M (October 2024)


                                                         🎯 Key Insights

                                                  1. Systemic Risk Concentration

• 15 largest failures (2.8% of total) represent 83.6% of failed assets
• "Too big to fail" remains a significant concern

                                                   2. Geographic Vulnerability

• Southeast (GA, FL) had highest failure counts
• West Coast (CA, WA) had largest asset concentrations

                                                      3. Temporal Clustering

• Clear crisis periods: 2008-2009, 2010-2014, 2023
• Regulatory changes followed each crisis period

                                                         4. Size Matters

• Most failures were small banks
• But systemic risk comes from very large institutions

                                                     5. Acquisition Patterns

• Large banks benefit from crisis acquisitions
• Significant industry consolidation


                                                  ⚠️ Risk Management Implications

1 Regulatory Focus: Largest institutions require heightened supervision
2 Regional Monitoring: Southeast banking markets need closer oversight
3 Crisis Preparedness: Friday intervention patterns suggest need for rapid response capabilities
4 Systemic Monitoring: Asset concentration remains critical risk indicator

This analysis provides a comprehensive view of US bank failures over 16 years, highlighting patterns, risks, and regulatory
implications for the banking sector.
Would you like to check the details? (y/n): n

Tip

For more commands and options, see the CLI reference or type .help.

Next Steps

Now that you're up and running with Datus, explore more advanced features: