Quickstart
Get started with Datus Agent in just a few minutes. This guide will walk you through installation, setup, and your first interactions with Datus.
Step 1: Installation & Setup
Install Python 3.12
Datus requires a Python 3.12 environment. Choose your preferred method:
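Whichever method you choose, it can help to confirm that the interpreter inside the new environment really is 3.12 before installing anything. The following is a minimal sketch, not part of Datus; the helper name `is_supported` is made up for illustration, and it assumes "requires Python 3.12" means exactly the 3.12 series.

```python
import sys

REQUIRED = (3, 12)  # Python series this quickstart says Datus requires


def is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter's major.minor equals 3.12."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if is_supported() else "not the version this guide expects"
    print(f"Python {found}: {status}")
```

Run it with the interpreter from the environment you just created, so that the check reflects that environment rather than the system Python.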
Install Datus Agent
Initialize Configuration
Run the initialization command:
The setup will guide you through:
1. LLM Configuration - Configure your preferred LLM provider (OpenAI, DeepSeek, Claude, Kimi, Qwen)
2. Namespace Setup - Connect to your database. For a quick start, use the demo database:
Demo Database
Datus provides a pre-configured demo DuckDB database for testing.
Connection string: ~/.datus/sample/duckdb-demo.duckdb
3. Workspace Configuration - Set up your SQL files directory (default: ~/.datus/workspace)
4. Knowledge Base (Optional) - Initialize vector DB for metadata and reference SQL
After setup completes, you're ready to launch Datus!
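If you want to double-check the result of setup before launching, the paths quoted above can be tested with a few lines of Python. This is a sketch under assumptions: it presumes you accepted the default locations named in this guide, and `setup_report` is a hypothetical helper, not a Datus command.

```python
from pathlib import Path

# Paths quoted in this guide; adjust them if you chose different values during setup.
DEMO_DB = Path("~/.datus/sample/duckdb-demo.duckdb").expanduser()
WORKSPACE = Path("~/.datus/workspace").expanduser()


def setup_report(paths: dict[str, Path]) -> dict[str, bool]:
    """Map each label to whether its path currently exists on disk."""
    return {label: path.exists() for label, path in paths.items()}


if __name__ == "__main__":
    report = setup_report({"demo database": DEMO_DB, "workspace": WORKSPACE})
    for label, present in report.items():
        print(f"{label}: {'found' if present else 'missing'}")
```

A "missing" entry does not necessarily mean setup failed; it may only mean you picked a different location.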
Step 2: Launch Datus CLI
Start the Datus CLI with your configured namespace:
Configuration
You can connect to different databases by adding namespaces in agent.yml. See our Configuration guide for details.
Initializing AI capabilities in background...
Datus - AI-powered SQL command-line interface
Type '.help' for a list of commands or '.exit' to quit.
Namespace duckdb selected
Connected to duckdb using database duckdb
Context: Current: database: duckdb
Type SQL statements or use ! @ . commands to interact.
Datus>
Step 3: Start Using Datus
Tip
You can execute SQL in Datus just like in a SQL editor.
List all tables:
Tables in Database duckdb-demo
+-----------------------+
| Table Name            |
+=======================+
| bank_failures         |
| boxplot               |
| calendar              |
| candle                |
| christmas_cost        |
| companies             |
| country_stats_scatter |
| gold_vs_bitcoin       |
| japan_births_deaths   |
| japan_population      |
| metrics               |
| niger_population      |
| quotes                |
| radar                 |
| sankey                |
| search_trends         |
| tree                  |
+-----------------------+
You can ask any question you like. As an example, take the table named gold_vs_bitcoin and start by viewing its schema:
+------------------+------------------+------------------+------------------+------------------+------------------+
| column_name      | column_type      | null             | key              | default          | extra            |
+==================+==================+==================+==================+==================+==================+
| time             | TIMESTAMP        | YES              | None             | None             | None             |
| gold             | DOUBLE           | YES              | None             | None             | None             |
| bitcoin          | DOUBLE           | YES              | None             | None             | None             |
+------------------+------------------+------------------+------------------+------------------+------------------+
Returned 3 rows in 0.04 seconds
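The same table list and schema can be read straight from the demo file outside Datus, which is a handy cross-check. The sketch below assumes the third-party `duckdb` Python package is installed and that the demo database sits at the path given in Step 1; `inspect_demo` is a hypothetical helper name. `SHOW TABLES` and `DESCRIBE` are standard DuckDB statements, and `DESCRIBE` returns the same six columns shown above.

```python
from pathlib import Path

DEMO_DB = Path("~/.datus/sample/duckdb-demo.duckdb").expanduser()


def inspect_demo(db_path: Path, table: str = "gold_vs_bitcoin"):
    """Return (table names, column descriptions) read directly from a DuckDB file."""
    import duckdb  # third-party package: pip install duckdb

    # read_only avoids taking a write lock on the database file
    con = duckdb.connect(str(db_path), read_only=True)
    try:
        tables = [row[0] for row in con.execute("SHOW TABLES").fetchall()]
        columns = con.execute(f'DESCRIBE "{table}"').fetchall()
    finally:
        con.close()
    return tables, columns


if __name__ == "__main__":
    if DEMO_DB.exists():
        tables, columns = inspect_demo(DEMO_DB)
        print(tables)
        for column in columns:
            print(column)
    else:
        print(f"Demo database not found at {DEMO_DB}")
```

If another process holds the file open for writing, the read-only connection may be refused; in that case close the other session first.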
Tip
To chat, start a line with / and continue in natural language.
Suppose we want to understand the correlation between gold and Bitcoin. We can ask the Datus agent directly:
Datus will display its thinking process to show how it approaches the task, then provide a detailed report.
Detailed Analysis Output
Creating new chat session...
Processing chat request...
+---------------------------------------------------------------------------------------------------------------------------------------+
| Action Stream |
+=======================================================================================================================================+
| 🔧 Tool call - search_files('{"path": "/Users/yt/.datus/workspace", "pattern": ...') - ✓ |
| No matches found (0.5s) |
| 💬 Thinking: Let me check the directory structure to see what's available: |
| 🔧 Tool call - list_directory('{"path": "/Users/yt/.datus/workspace"}') - ✓ (0.5s) |
| 💬 Thinking: Now let me analyze the gold-Bitcoin correlation by examining the data and performing a comprehensive correlation |
| analysis: |
| 🔧 Tool call - read_query('{"sql": "SELECT \n COUNT(*) as total_records,\n...') - ✓ |
| ✓ 1 rows (0.5s) |
| 💬 Thinking: Now let me calculate the correlation coefficient and perform a detailed analysis: |
| 🔧 Tool call - read_query('{"sql": "SELECT \n -- Correlation coefficient\n...') - ✓ |
| ✓ 1 rows (0.5s) |
| 💬 Thinking: Let me get more detailed statistics and time-based analysis: |
| 🔧 Tool call - read_query('{"sql": "WITH daily_aggregates AS (\n SELECT \n...') - ✓ |
| ✓ 1 rows (0.5s) |
| 💬 Thinking: Let me check the distribution of the data and look for any patterns: |
| 🔧 Tool call - read_query('{"sql": "SELECT \n -- Price movement analysis\n...') - ✗ |
| ✗ Failed:(error_code=500006, e...) (0.5s) |
| 💬 Thinking: Let me fix the query to properly analyze price movement patterns: |
| 🔧 Tool call - read_query('{"sql": "WITH price_changes AS (\n SELECT \n ...') - ✓ |
| ✓ 5 rows (0.5s) |
| 💬 Thinking: Now let me analyze the strength of the correlation across different time periods and price levels: |
| 🔧 Tool call - read_query('{"sql": "WITH price_quartiles AS (\n SELECT \n ...') - ✓ |
| ✓ 16 rows (0.5s) |
| 💬 Thinking: Let me also analyze the volatility and risk characteristics: |
| 🔧 Tool call - read_query('{"sql": "SELECT \n -- Risk and return metrics\n...') - ✗ |
| ✗ Failed:(error_code=500006, e...) (0.5s) |
| 💬 Thinking: Let me correct the query with the proper function names: |
| 🔧 Tool call - read_query('{"sql": "SELECT \n -- Risk and return metrics\n...') - ✓ |
| ✓ 1 rows (0.5s) |
| 💬 Thinking: Finally, let me examine the time series data to understand the temporal patterns: |
| 🔧 Tool call - read_query('{"sql": "SELECT \n DATE(time) as trading_date,\...') - ✓ |
| ✓ 1 rows (0.5s) |
| 💬 Chat interaction completed successfully |
+---------------------------------------------------------------------------------------------------------------------------------------+
Generated SQL:
-- Generated SQL (copied)
SELECT
    -- Overall correlation and basic stats
    CORR(gold, bitcoin) as correlation_coefficient,
    COUNT(*) as total_observations,
    AVG(gold) as avg_gold_price,
    AVG(bitcoin) as avg_bitcoin_price,
    STDDEV(gold) as gold_volatility,
    STDDEV(bitcoin) as bitcoin_volatility,
    COVAR_POP(gold, bitcoin) as covariance,
    -- Price movement patterns
    (SELECT COUNT(*) FROM (
        SELECT
            CASE
                WHEN gold_change > 0 AND bitcoin_change > 0 THEN 'Both Up'
                WHEN gold_change < 0 AND bitcoin_change < 0 THEN 'Both Down'
                WHEN gold_change > 0 AND bitcoin_change < 0 THEN 'Gold Up, Bitcoin Down'
                WHEN gold_change < 0 AND bitcoin_change > 0 THEN 'Gold Down, Bitcoin Up'
            END as pattern
        FROM (
            SELECT
                gold - LAG(gold) OVER (ORDER BY time) as gold_change,
                bitcoin - LAG(bitcoin) OVER (ORDER BY time) as bitcoin_change
            FROM gold_vs_bitcoin
        ) WHERE gold_change IS NOT NULL
    ) WHERE pattern = 'Both Up') as both_up_count,
    (SELECT COUNT(*) FROM (
        SELECT
            CASE
                WHEN gold_change > 0 AND bitcoin_change > 0 THEN 'Both Up'
                WHEN gold_change < 0 AND bitcoin_change < 0 THEN 'Both Down'
                WHEN gold_change > 0 AND bitcoin_change < 0 THEN 'Gold Up, Bitcoin Down'
                WHEN gold_change < 0 AND bitcoin_change > 0 THEN 'Gold Down, Bitcoin Up'
            END as pattern
        FROM (
            SELECT
                gold - LAG(gold) OVER (ORDER BY time) as gold_change,
                bitcoin - LAG(bitcoin) OVER (ORDER BY time) as bitcoin_change
            FROM gold_vs_bitcoin
        ) WHERE gold_change IS NOT NULL
    ) WHERE pattern = 'Both Down') as both_down_count
FROM gold_vs_bitcoin
Analysis Results:
Gold-Bitcoin Correlation Analysis
+===============================================================================================================+
| Gold-Bitcoin Correlation Analysis |
+===============================================================================================================+
Overall Correlation
• Correlation Coefficient: -0.225 (Weak negative correlation)
• Total Observations: 954 data points
• Time Period: Single day - December 1, 2024 (16:00 to 16:15)
Price Statistics
• Average Gold Price: 0.684
• Average Bitcoin Price: 0.785
• Gold Volatility: 0.200
• Bitcoin Volatility: 0.233
• Covariance: -0.0105
Price Movement Patterns
Based on 953 observed price changes:
• Both Assets Moving Up: 216 times (22.67%)
• Both Assets Moving Down: 203 times (21.30%)
• Gold Up, Bitcoin Down: 189 times (19.83%)
• Gold Down, Bitcoin Up: 190 times (19.94%)
Key Insights
1. Weak Negative Correlation
The -0.225 correlation suggests these assets tend to move in opposite directions more often than together, though the relationship
is weak.
2. Similar Movement Patterns
Both assets show remarkably similar movement patterns:
• Approximately 44% of the time they move in the same direction
• Approximately 40% of the time they move in opposite directions
3. Volatility Comparison
Bitcoin shows slightly higher volatility (0.233) compared to Gold (0.200), which aligns with Bitcoin's reputation as a more
volatile asset.
4. Risk-Adjusted Performance
• Gold Sharpe Ratio: 3.42
• Bitcoin Sharpe Ratio: 3.37
Both assets show strong risk-adjusted returns during this period.
Limitations
This analysis is based on a very short time window (15 minutes of data). For more robust correlation analysis, data spanning
multiple days/weeks would be needed to observe:
• Longer-term correlation trends
• Market regime changes
• Impact of macroeconomic events
• Seasonal patterns
Investment Implications
The weak negative correlation suggests these assets could provide some diversification benefits, though the relationship is not
strong enough for reliable hedging strategies based on this limited dataset.
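Because the report is written by an LLM, it is worth spot-checking a few of its figures. The sketch below is plain arithmetic on numbers copied from the sample report; it does not query the database, and the variable names are illustrative. It relies on the identity that Pearson correlation equals covariance divided by the product of the two standard deviations.

```python
# Figures copied from the sample report above.
covariance = -0.0105
gold_std = 0.200
bitcoin_std = 0.233
reported_corr = -0.225

# Pearson correlation = covariance / (std_gold * std_bitcoin).
# The SQL mixes COVAR_POP (population) with STDDEV (sample); with 954 rows
# the difference between the two conventions is negligible at three decimals.
implied_corr = covariance / (gold_std * bitcoin_std)

changes = 953  # observed price changes, per the report
patterns = {
    "both_up": 216,
    "both_down": 203,
    "gold_up_bitcoin_down": 189,
    "gold_down_bitcoin_up": 190,
}
shares = {name: round(100 * count / changes, 2) for name, count in patterns.items()}
same_direction = patterns["both_up"] + patterns["both_down"]
opposite = patterns["gold_up_bitcoin_down"] + patterns["gold_down_bitcoin_up"]
unclassified = changes - sum(patterns.values())  # rows matching none of the four patterns

print(f"implied correlation: {implied_corr:.3f} (reported {reported_corr})")
print(f"pattern shares (%): {shares}")
print(f"same direction: {100 * same_direction / changes:.1f}%")
print(f"opposite direction: {100 * opposite / changes:.1f}%")
print(f"unclassified changes: {unclassified}")
```

The implied correlation agrees with the reported -0.225, and the four percentages match the report. The four patterns do not add up to 953, which is consistent with the SQL's strict comparisons: a step in which either price is unchanged falls into no category.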
If you want detailed information for every step, press y at the end:
Datus will then show details in a new screen.
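To make the movement-pattern part of the generated SQL easier to follow, here is a rough Python equivalent of its `LAG`-based logic: each row is compared with the previous one, and the pair of changes is classified. This is an illustrative sketch, not code that Datus runs; the function name and the sample values are made up.

```python
from collections import Counter


def movement_patterns(rows):
    """Count direction patterns between consecutive (gold, bitcoin) rows.

    `rows` must already be sorted by time, mirroring ORDER BY time in the SQL.
    """
    counts = Counter()
    for (prev_gold, prev_btc), (gold, btc) in zip(rows, rows[1:]):
        gold_change = gold - prev_gold  # gold - LAG(gold)
        btc_change = btc - prev_btc     # bitcoin - LAG(bitcoin)
        if gold_change > 0 and btc_change > 0:
            counts["Both Up"] += 1
        elif gold_change < 0 and btc_change < 0:
            counts["Both Down"] += 1
        elif gold_change > 0 and btc_change < 0:
            counts["Gold Up, Bitcoin Down"] += 1
        elif gold_change < 0 and btc_change > 0:
            counts["Gold Down, Bitcoin Up"] += 1
        else:
            # A zero change matches no CASE branch in the SQL (pattern is NULL).
            counts["Unclassified"] += 1
    return counts


# Made-up sample values, not taken from the demo database.
sample = [(0.50, 0.70), (0.55, 0.75), (0.52, 0.72), (0.58, 0.69), (0.58, 0.71)]
print(movement_patterns(sample))
```

Unlike the generated query, which repeats the same subquery once per pattern, this version counts every pattern in a single pass over the rows.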
Step 4: Advanced Features
Context Management
Note
This section applies if you have initialized the knowledge base for metadata.
To add an item such as a file, metric, or table to the context, reference it with @. Selecting context manually gives the LLM more precise context, narrows the search scope, and improves answer accuracy.
Datus will automatically analyze the table and add metadata to the context.
Context Analysis Output
Processing chat request...
+----------------------------------------------------------------------------------------------------------------------------------+
| Action Stream |
+==================================================================================================================================+
| User: @Table duckdb-demo.main.bank_failures |
| 💬 Generating response with tools... |
| 💬 Thinking: I'll provide a comprehensive analysis of the bank_failures table. Let me start by examining the data structure |
| and then perform various analyses. |
| 🔧 Tool call - describe_table('{"table_name": "bank_failures"}') - ✓ |
| ✓ 7 columns (0.5s) |
| 💬 Thinking: Now let me perform a comprehensive analysis of the bank failures data: |
| 🔧 Tool call - read_query('{"sql": "-- Comprehensive Bank Failures Analysis\n...') - ✗ |
| ✗ Failed:(error_code=500006, e...) (0.5s) |
| 💬 Thinking: Let me break this down into separate queries for a comprehensive analysis: |
| 🔧 Tool call - read_query('{"sql": "-- 1. Basic overview statistics\nSELECT \...') - ✓ |
| ✓ 1 rows (0.5s) |
| 🔧 Tool call - read_query('{"sql": "-- 2. Yearly analysis of bank failures\nS...') - ✓ |
| ✓ 14 rows (0.5s) |
| 🔧 Tool call - read_query('{"sql": "-- 3. State-wise analysis (top 15 states)...') - ✓ |
| ✓ 15 rows (0.5s) |
| 🔧 Tool call - read_query('{"sql": "-- 4. Largest bank failures (top 20 by as...') - ✓ |
| ✓ 20 rows (0.5s) |
| 🔧 Tool call - read_query('{"sql": "-- 5. Acquisition analysis (top acquirers...') - ✓ |
| ✓ 15 rows (0.5s) |
| 🔧 Tool call - read_query('{"sql": "-- 6. Asset size distribution analysis\nW...') - ✓ |
| ✓ 6 rows (0.5s) |
| 🔧 Tool call - read_query('{"sql": "-- 7. Recent failures (2023-2024)\nSELECT...') - ✓ |
| ✓ 8 rows (0.5s) |
| 💬 Chat interaction completed successfully |
+----------------------------------------------------------------------------------------------------------------------------------+
+==================================================================================================================================+
| Comprehensive Bank Failures Analysis (2008-2024) |
+==================================================================================================================================+
Executive Summary
• Total Failures: 545 banks failed between 2008-2024
• Total Assets: $1.696 trillion in failed bank assets
• Time Span: 16 years (January 2008 - October 2024)
• Geographic Reach: 42 states, 419 cities affected
• Average Bank Size: $3.11 billion
Key Historical Patterns
Major Crisis Periods
1 2008-2009 Financial Crisis: 166 failures, $938.1B assets
• Peak: 2009 (140 failures, $169.5B assets)
• Includes Bear Stearns ($395B) and Washington Mutual ($307B)
2 2010-2014 Post-Crisis Cleanup: 270 failures, $150.0B assets
• Peak: 2010 (157 failures - highest count)
• Smaller institutions, regulatory cleanup
3 2023 Banking Stress: 6 failures, $572.7B assets
• Major failures: First Republic ($229B), Silicon Valley Bank ($209B), Signature Bank ($118B)
Geographic Distribution
Top 5 States by Failure Count
1 Georgia: 91 failures (16.7%) - $34.4B assets
2 Florida: 73 failures (13.4%) - $39.2B assets
3 Illinois: 66 failures (12.1%) - $37.7B assets
4 California: 43 failures (7.9%) - $559.8B assets
5 Minnesota: 23 failures (4.2%) - $3.2B assets
Top 5 States by Assets Lost
1 California: $559.8B (33.0% of total)
2 New York: $513.4B (30.3% of total)*
3 Washington: $318.9B (18.8% of total)
4 Illinois: $37.7B (2.2% of total)
5 Florida: $39.2B (2.3% of total)
*Includes Bear Stearns and Signature Bank
Asset Size Analysis
Size Distribution
Asset Range    Failures   % of Total   Total Assets   % of Assets   Avg Size
────────────────────────────────────────────────────────────────────────────
Under $100M    136        24.9%        $7.7B          0.5%          $56M
$100M-$500M    275        50.5%        $66.5B         3.9%          $242M
$500M-$1B      57         10.5%        $40.3B         2.4%          $707M
$1B-$5B        55         10.1%        $119.6B        7.1%          $2.2B
$5B-$10B       7          1.3%         $43.8B         2.6%          $6.3B
Over $10B      15         2.8%         $1,418.2B      83.6%         $94.5B
Key Insight: While 75% of failures were under $1B in assets, the 15 largest failures (>$10B) represent 83.6% of all failed assets.
Largest Bank Failures
1 Bear Stearns (2008) - $395.0B (J.P. Morgan Chase)
2 Washington Mutual (2008) - $307.0B (JPMorgan Chase)
3 First Republic Bank (2023) - $229.0B (JPMorgan Chase)
4 Silicon Valley Bank (2023) - $209.0B (First Citizens)
5 Signature Bank (2023) - $118.4B (Flagstar Bank)
Acquisition Landscape
Top Acquiring Institutions
1 U.S. Bank, N.A.: 13 acquisitions ($38.7B)
2 State Bank and Trust: 10 acquisitions ($3.5B)
3 Ameris Bank: 9 acquisitions ($1.9B)
4 Stearns Bank, N.A.: 9 acquisitions ($1.7B)
5 First-Citizens Bank: 7 acquisitions ($7.8B)
Notable Patterns
• JPMorgan Chase: Acquired 3 institutions but captured $930B+ in assets
• 20 banks: Had no acquirer (insured depositors paid directly)
• Consolidation: Larger banks acquired most failed institutions
Recent Trends (2023-2024)
2023 Major Failures
• First Republic Bank: $229B (May 2023)
• Silicon Valley Bank: $209B (March 2023)
• Signature Bank: $118B (March 2023)
• Silvergate Bank: $16B (March 2023)
2024 Failures
• Republic First Bancorp: $6B (April 2024)
• First National Bank of Lindsay: $108M (October 2024)
Key Insights
1. Systemic Risk Concentration
• 15 largest failures (2.8% of total) represent 83.6% of failed assets
• "Too big to fail" remains a significant concern
2. Geographic Vulnerability
• Southeast (GA, FL) had highest failure counts
• West Coast (CA, WA) had largest asset concentrations
3. Temporal Clustering
• Clear crisis periods: 2008-2009, 2010-2014, 2023
• Regulatory changes followed each crisis period
4. Size Matters
• Most failures were small banks
• But systemic risk comes from very large institutions
5. Acquisition Patterns
• Large banks benefit from crisis acquisitions
• Significant industry consolidation
Risk Management Implications
1 Regulatory Focus: Largest institutions require heightened supervision
2 Regional Monitoring: Southeast banking markets need closer oversight
3 Crisis Preparedness: Friday intervention patterns suggest need for rapid response capabilities
4 Systemic Monitoring: Asset concentration remains critical risk indicator
This analysis provides a comprehensive view of US bank failures over 16 years, highlighting patterns, risks, and regulatory
implications for the banking sector.
Would you like to check the details? (y/n): n
Tip
For more commands and options, see the CLI reference or simply type .help.
Next Steps
Now that you're up and running with Datus, explore more advanced features:
- Contextual Data Engineering - Learn how to use data assets as context
- Configuration Guide - Connect to your own databases and customize settings
- CLI Reference - Discover all available commands and options
- MetricFlow - Generate and query metrics with datus-metricflow