Clickhouse

Description: Enable agents to interact with Clickhouse databases (read only).

Author: Arcade

Code: GitHub

Auth: database connection string

PyPI VersionLicensePython VersionsWheel StatusDownloads

The Arcade Clickhouse toolkit provides a pre-built set of tools for interacting with Clickhouse databases in a read-only manner. This toolkit enables agents to discover database schemas, explore table structures, and execute SELECT queries safely. This toolkit is a companion to the blog post Designing SQL Tools for AI Agents.

This toolkit is meant to be an example of how to build a toolkit for a database, and is not intended to be used in production - you won’t find it listed in the Arcade dashboard or APIs. For production use, we recommend forking this repository and building your own toolkit with use-case specific tools.

Key Features

This toolkit demonstrates several important concepts for LLM-powered database interactions:

  • Schema Discovery: Automatically discover available database schemas
  • Table Exploration: Find all tables within a specific schema
  • Schema Inspection: Get detailed column information including data types, primary keys, and indexes
  • Safe Query Execution: Execute SELECT queries with built-in safety measures
  • Connection Pooling: Reuse database connections efficiently
  • Read-Only Access: Enforce read-only access to prevent data modification
  • Row Limits: Automatically limit query results to prevent overwhelming responses

Available Tools

Tool NameDescription
Clickhouse.DiscoverDatabasesDiscover all databases in a Clickhouse database.
Clickhouse.DiscoverSchemasDiscover all schemas in a Clickhouse database.
Clickhouse.DiscoverTablesDiscover all tables in a specific schema.
Clickhouse.GetTableSchemaGet the detailed schema of a specific table.
Clickhouse.ExecuteSelectQueryExecute a SELECT query with comprehensive clause support.

Note that all tools require the CLICKHOUSE_DATABASE_CONNECTION_STRING secret to be set.

Clickhouse.DiscoverDatabases

Discover all databases in a Clickhouse database. This tool returns a list of all available databases.

Clickhouse.DiscoverSchemas

Discover all schemas in a Clickhouse database. This tool returns a list of all available schemas, excluding the information_schema for security.

Clickhouse.DiscoverTables

Discover all tables in a specific schema. This tool should be used before any other tool that requires a table name.

Parameters:

  • schema_name (str): The database schema to discover tables in (default: “public”)

Clickhouse.GetTableSchema

Get the detailed schema of a specific table. This tool provides column information including data types, primary key indicators, and index information. Always use this tool before executing any query.

Parameters:

  • schema_name (str): The database schema containing the table
  • table_name (str): The name of the table to inspect

Clickhouse.ExecuteSelectQuery

Execute a SELECT query with comprehensive clause support. This tool allows you to build complex queries using individual clauses while maintaining safety and performance.

Parameters:

  • select_clause (str): Columns to select (without SELECT keyword)
  • from_clause (str): Table(s) to query from (without FROM keyword)
  • limit (int): Maximum rows to return (default: 100)
  • offset (int): Number of rows to skip (default: 0)
  • join_clause (str, optional): JOIN conditions (without JOIN keyword)
  • where_clause (str, optional): WHERE conditions (without WHERE keyword)
  • having_clause (str, optional): HAVING conditions (without HAVING keyword)
  • group_by_clause (str, optional): GROUP BY columns (without GROUP BY keyword)
  • order_by_clause (str, optional): ORDER BY columns (without ORDER BY keyword)
  • with_clause (str, optional): WITH clause for CTEs (without WITH keyword)

Query Construction: The final query is constructed as:

SELECT {select_clause} FROM {from_clause}
JOIN {join_clause}
WHERE {where_clause}
HAVING {having_clause}
GROUP BY {group_by_clause}
ORDER BY {order_by_clause}
LIMIT {limit} OFFSET {offset}

Best Practices:

  • Always use discover_tables and get_table_schema before executing queries
  • Never use “SELECT *” - always specify the columns you need
  • Order results by primary keys or important columns
  • Use case-insensitive string matching
  • Trim strings in queries
  • Prefer LIKE queries over exact matches or regex
  • Only join on indexed columns or primary keys

Usage Workflow

For optimal results, follow this workflow when using the Clickhouse toolkit:

  1. Discover Schemas: Use discover_schemas to see available schemas
  2. Discover Tables: Use discover_tables with your target schema
  3. Get Table Schema: Use get_table_schema for each table you plan to query
  4. Execute Query: Use execute_select_query with the schema information

This workflow ensures your agent has complete information about the database structure before attempting queries, reducing errors and improving query performance.