Database Adapters¶
Datus Agent supports connecting to various databases through a plugin-based adapter system. This document explains the available adapters, how to install them, and how to configure your database connections.
Overview¶
Datus uses a modular adapter architecture that allows you to connect to different databases:
- Built-in Adapters: SQLite and DuckDB are included with the core package
- Plugin Adapters: Additional databases (MySQL, Snowflake, StarRocks, etc.) can be installed as separate packages
This design keeps the core package lightweight while allowing you to add support for specific databases as needed.
Supported Databases¶
| Database | Package | Installation | Status |
|---|---|---|---|
| SQLite | Built-in | Included | Ready |
| DuckDB | Built-in | Included | Ready |
| MySQL | datus-mysql | pip install datus-mysql |
Ready |
| PostgreSQL | datus-postgresql | pip install datus-postgresql |
Ready |
| StarRocks | datus-starrocks | pip install datus-starrocks |
Ready |
| Snowflake | datus-snowflake | pip install datus-snowflake |
Ready |
| ClickZetta | datus-clickzetta | pip install datus-clickzetta |
Ready |
| Hive | datus-hive | pip install datus-hive |
Ready |
| Spark | datus-spark | pip install datus-spark |
Ready |
| ClickHouse | datus-clickhouse | pip install datus-clickhouse |
Ready |
| Trino | datus-trino | pip install datus-trino |
Ready |
Installation¶
Built-in Databases¶
SQLite and DuckDB are included with Datus Agent and require no additional installation.
Plugin Adapters¶
Install the adapter package for your database:
# MySQL
pip install datus-mysql
# PostgreSQL
pip install datus-postgresql
# Snowflake
pip install datus-snowflake
# StarRocks
pip install datus-starrocks
# ClickZetta
pip install datus-clickzetta
# Hive
pip install datus-hive
# Spark
pip install datus-spark
# ClickHouse
pip install datus-clickhouse
# Trino
pip install datus-trino
Once installed, Datus Agent will automatically detect and load the adapter.
Configuration¶
Configure your database connection in the agent.yml file:
SQLite¶
DuckDB¶
MySQL¶
agent:
namespace:
production:
type: mysql
host: localhost
port: 3306
username: your_username
password: your_password
database: your_database
PostgreSQL¶
agent:
namespace:
production:
type: postgresql
host: localhost
port: 5432
username: your_username
password: your_password
database: your_database
schema: public # optional, default is public
sslmode: prefer # optional, default is prefer
Snowflake¶
agent:
namespace:
warehouse:
type: snowflake
account: your_account
username: your_username
password: your_password
warehouse: your_warehouse
database: your_database
schema: your_schema
role: your_role # optional
StarRocks¶
agent:
namespace:
analytics:
type: starrocks
host: localhost
port: 9030
username: root
password: your_password
database: your_database
ClickZetta¶
agent:
namespace:
lakehouse:
type: clickzetta
service: CLICKZETTA_SERVICE
username: CLICKZETTA_USERNAME
password: CLICKZETTA_PASSWORD
instance: CLICKZETTA_INSTANCE
workspace: CLICKZETTA_WORKSPACE
schema: CLICKZETTA_SCHEMA
vcluster: CLICKZETTA_VCLUSTER
Hive¶
agent:
namespace:
hive_data:
type: hive
host: 127.0.0.1
port: 10000
username: hive
database: default
auth: NONE # optional: NONE, LDAP, CUSTOM, KERBEROS
configuration: # optional Hive session config
hive.execution.engine: spark
Spark¶
agent:
namespace:
spark_data:
type: spark
host: localhost
port: 10000
username: spark
database: default
auth_mechanism: NONE # optional: NONE, PLAIN, KERBEROS
ClickHouse¶
agent:
namespace:
analytics:
type: clickhouse
host: localhost
port: 8123
username: default
password: your_password
database: your_database
Trino¶
agent:
namespace:
trino_data:
type: trino
host: localhost
port: 8080
username: trino
catalog: hive
schema: default
http_scheme: http # optional: http or https
Features by Adapter¶
Common Features¶
All adapters support:
- SQL query execution (SELECT, INSERT, UPDATE, DELETE)
- DDL operations (CREATE, ALTER, DROP)
- Metadata retrieval (tables, views, schemas)
- Sample data retrieval
- Connection pooling and timeout management
Adapter-Specific Features¶
MySQL¶
- INFORMATION_SCHEMA queries
- SHOW CREATE TABLE/VIEW support
- Full CRUD operations
PostgreSQL¶
- INFORMATION_SCHEMA queries
- Tables, views, and materialized views support
- Multi-schema namespace support
- SSL connection modes (disable, allow, prefer, require, verify-ca, verify-full)
- SQLAlchemy-based (psycopg2 driver)
Snowflake¶
- Multi-database and schema support
- Tables, views, and materialized views
- Arrow format for efficient data transfer
- Native SDK integration
StarRocks¶
- Multi-Catalog support
- Materialized view support
- MySQL protocol compatibility
ClickZetta¶
- Workspace and schema management
- Volume/Stage file operations
- Native SDK integration
Hive¶
- HiveServer2/Thrift protocol connection
- Hive session configuration support
- Multiple auth mechanisms (NONE, LDAP, CUSTOM, KERBEROS)
- Database context switching (USE statement)
Spark¶
- Spark Thrift Server connection via HiveServer2 protocol
- Multiple auth mechanisms (NONE, PLAIN, KERBEROS)
- Spark SQL dialect support
ClickHouse¶
- HTTP protocol connection
- No schema layer (databases serve as schemas)
- ClickHouse-specific DML syntax (ALTER TABLE UPDATE)
- Lightweight deletes support
Trino¶
- Three-level hierarchy: catalog → schema → table
- Cross-catalog query support
- Built-in TPC-H connector for benchmarking
- HTTP/HTTPS connection with SSL support
Troubleshooting¶
Adapter Not Found¶
If you see an error like Connector 'mysql' not found, make sure you have installed the corresponding adapter package:
Connection Issues¶
Check the following:
- Network connectivity: Ensure you can reach the database server
- Credentials: Verify username and password are correct
- Port: Confirm the correct port is specified
- Database name: Ensure the database exists
Driver Dependencies¶
Some adapters require additional system dependencies:
- MySQL: Requires
pymysql(installed automatically) - PostgreSQL: Requires
psycopg2-binary(installed automatically) - Snowflake: Requires
snowflake-connector-python(installed automatically) - Hive: Requires
pyhive,thrift,thrift-sasl,pure-sasl(installed automatically) - Spark: Requires
pyhive,thrift,thrift-sasl,pure-sasl(installed automatically) - ClickHouse: Requires
clickhouse-sqlalchemy(installed automatically) - Trino: Requires
trino(installed automatically)
Architecture¶
datus-agent (Core)
├── Built-in Adapters
│ ├── SQLite Connector
│ └── DuckDB Connector
│
└── Plugin System (Entry Points)
├── datus-sqlalchemy (Base layer)
│ ├── datus-mysql
│ ├── datus-postgresql
│ ├── datus-starrocks
│ ├── datus-hive
│ ├── datus-spark
│ ├── datus-clickhouse
│ └── datus-trino
│
└── Native SDK Adapters
├── datus-snowflake
└── datus-clickzetta
The adapter system uses Python's entry points mechanism for automatic discovery. When you install an adapter package, it registers itself with Datus Agent and becomes available for use.
Next Steps¶
- Quick Start Guide - Get started with Datus Agent
- Configuration Reference - Detailed configuration options