MongoDB
Description: Enable agents to interact with MongoDB databases (read only).
Author: Arcade
Code: GitHub
Auth: database connection string
The Arcade MongoDB toolkit provides a pre-built set of tools for interacting with MongoDB databases in a read-only manner. This toolkit enables agents to discover databases and collections, explore document structures, and execute queries safely. This toolkit is a companion to the blog post Designing SQL Tools for AI Agents.
This toolkit is meant to be an example of how to build a toolkit for a database and is not intended for production use; you won’t find it listed in the Arcade dashboard or APIs. For production use, we recommend forking this repository and building your own toolkit with use-case-specific tools.
Key Features
This toolkit demonstrates several important concepts for LLM-powered database interactions:
- Database Discovery: Automatically discover available databases in the MongoDB instance
- Collection Exploration: Find all collections within a specific database
- Schema Inference: Sample documents to infer schema structure and data types
- Safe Query Execution: Execute find queries with built-in safety measures
- Aggregation Support: Run complex aggregation pipelines for data analysis
- Document Counting: Count documents matching specific criteria
- Connection Pooling: Reuse database connections efficiently
- Read-Only Access: Enforce read-only access to prevent data modification
- Result Limits: Automatically limit query results to prevent overwhelming responses
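The read-only and result-limit features above could be enforced along the following lines. This is a minimal sketch, not the toolkit's actual implementation: the helper names `check_read_only` and `clamp_limit` and the cap of 100 are assumptions made for illustration. The only MongoDB facts relied on are that `$out` and `$merge` are the aggregation stages that write data.

```python
import json

# Aggregation stages that write data in MongoDB.
WRITE_STAGES = {"$out", "$merge"}
MAX_LIMIT = 100  # assumed cap, mirroring the documented default limit


def check_read_only(pipeline: list[str]) -> list[dict]:
    """Parse JSON-string stages and reject any stage that writes data."""
    stages = [json.loads(stage) for stage in pipeline]
    for stage in stages:
        written = WRITE_STAGES.intersection(stage)
        if written:
            raise ValueError(f"write stage not allowed: {sorted(written)}")
    return stages


def clamp_limit(requested: int, cap: int = MAX_LIMIT) -> int:
    """Keep the requested limit between 1 and the cap."""
    return max(1, min(requested, cap))
```

A guard like this runs before the pipeline reaches the database, so a rejected request never touches the server.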
Available Tools
Tool Name | Description |
---|---|
MongoDB.DiscoverDatabases | Discover all databases in the MongoDB instance. |
MongoDB.DiscoverCollections | Discover all collections in a specific database. |
MongoDB.GetCollectionSchema | Get the schema structure of a collection by sampling documents. |
MongoDB.FindDocuments | Find documents in a collection with filtering, projection, and sorting. |
MongoDB.CountDocuments | Count documents matching a specific filter. |
MongoDB.AggregateDocuments | Execute aggregation pipelines for complex data analysis. |
Note that all tools require the MONGODB_CONNECTION_STRING secret to be set.
MongoDB.DiscoverDatabases
Discover all databases in the MongoDB instance. This tool returns a list of all available databases, excluding system databases such as admin, config, and local for security.
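The exclusion of system databases can be sketched as a simple name filter. The function name `user_databases` is hypothetical, and the input list stands in for whatever the driver returns (with PyMongo, that would be `MongoClient.list_database_names()`).

```python
# System databases named in the description above.
SYSTEM_DATABASES = {"admin", "config", "local"}


def user_databases(all_names: list[str]) -> list[str]:
    """Return database names with the system databases removed, sorted."""
    return sorted(name for name in all_names if name not in SYSTEM_DATABASES)


# Example input; "shop" and "analytics" are made-up database names.
visible = user_databases(["admin", "shop", "local", "config", "analytics"])
```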
MongoDB.DiscoverCollections
Discover all collections in a specific database. This tool should be used before any other tool that requires a collection name.
Parameters:
- database_name (str): The database name to discover collections in
MongoDB.GetCollectionSchema
Get the schema structure of a collection by sampling documents. Since MongoDB is schema-less, this tool samples a configurable number of documents to infer the schema structure and data types. Always use this tool before executing any query.
Parameters:
- database_name (str): The database name containing the collection
- collection_name (str): The name of the collection to inspect
- sample_size (int): The number of documents to sample for schema discovery (default: 100)
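To illustrate what inferring a schema from sampled documents can look like, here is a minimal sketch that records the Python type names seen for each top-level field. It is an assumption about the general technique, not the toolkit's code; `infer_schema` and the sample documents are invented for the example, and nested fields are not expanded.

```python
from collections import defaultdict


def infer_schema(documents: list[dict]) -> dict[str, list[str]]:
    """Map each top-level field to the sorted type names observed for it."""
    seen: dict[str, set[str]] = defaultdict(set)
    for doc in documents:
        for field, value in doc.items():
            seen[field].add(type(value).__name__)
    return {field: sorted(types) for field, types in seen.items()}


# Two hand-written documents standing in for a sample from a collection.
sample = [
    {"_id": 1, "name": "Ada", "age": 36},
    {"_id": 2, "name": "Linus", "age": None, "tags": ["kernel"]},
]
schema = infer_schema(sample)
```

Because fields can be missing or change type between documents, a larger sample_size gives a more complete picture at the cost of reading more data.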
MongoDB.FindDocuments
Find documents in a collection with filtering, projection, and sorting. This tool allows you to build complex queries using MongoDB’s query operators while maintaining safety and performance.
Parameters:
- database_name (str): The database name to query
- collection_name (str): The collection name to query
- filter_dict (str, optional): MongoDB filter/query as JSON string. Leave None for no filter
- projection (str, optional): Fields to include/exclude as JSON string. Use 1 to include, 0 to exclude
- sort (list[str], optional): Sort criteria as list of JSON strings with ‘field’ and ‘direction’ keys
- limit (int): Maximum number of documents to return (default: 100)
- skip (int): Number of documents to skip (default: 0)
Best Practices:
- Always use discover_collections and get_collection_schema before executing queries
- Always specify projection to limit fields returned if you don’t need all data
- Always sort your results by the most important fields first
- Use appropriate MongoDB query operators for complex filtering ($gte, $lte, $in, $regex, etc.)
- Be mindful of case sensitivity when querying string fields
- Filter on indexed fields when possible (typically _id and commonly queried fields)
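Because filter_dict, projection, and sort are passed as JSON strings rather than native objects, it can help to build them as Python dicts and serialize them just before the call. The sketch below assumes a hypothetical helper `build_find_args`; the database, collection, and field names are made up, and the sort direction value `-1` (descending, following MongoDB convention) is a guess, since the accepted direction values are not specified here.

```python
import json
from typing import Any, Optional


def build_find_args(
    database_name: str,
    collection_name: str,
    filter_: Optional[dict] = None,
    projection: Optional[dict] = None,
    sort: Optional[list[dict]] = None,
    limit: int = 100,
    skip: int = 0,
) -> dict[str, Any]:
    """Serialize query parts into the string forms the parameters expect."""
    args: dict[str, Any] = {
        "database_name": database_name,
        "collection_name": collection_name,
        "limit": limit,
        "skip": skip,
    }
    if filter_ is not None:
        args["filter_dict"] = json.dumps(filter_)
    if projection is not None:
        args["projection"] = json.dumps(projection)
    if sort is not None:
        # One JSON string per sort criterion.
        args["sort"] = [json.dumps(criterion) for criterion in sort]
    return args


find_args = build_find_args(
    "shop",
    "orders",
    filter_={"total": {"$gte": 100}, "status": {"$in": ["paid", "shipped"]}},
    projection={"_id": 0, "status": 1, "total": 1},
    sort=[{"field": "total", "direction": -1}],  # direction value is assumed
    limit=20,
)
```

Omitting the optional keys entirely when they are None keeps the payload minimal and leaves the defaults to the tool.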
MongoDB.CountDocuments
Count documents in a collection matching the given filter. This tool is useful for getting quick counts without retrieving the actual documents.
Parameters:
- database_name (str): The database name to query
- collection_name (str): The collection name to query
- filter_dict (str, optional): MongoDB filter/query as JSON string. Leave None to count all documents
MongoDB.AggregateDocuments
Execute aggregation pipelines for complex data analysis. This tool allows you to run sophisticated data processing operations including grouping, filtering, and transformations.
Parameters:
- database_name (str): The database name to query
- collection_name (str): The collection name to query
- pipeline (list[str]): MongoDB aggregation pipeline as a list of JSON strings
- limit (int): Maximum number of results to return (default: 100)
Common Aggregation Stages:
- $match - filter documents
- $group - group documents and perform calculations
- $project - reshape documents
- $sort - sort documents
- $limit - limit results
- $lookup - join with other collections
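As an illustration of the pipeline format, the sketch below combines several of the stages listed above into a "top five customers by shipped order value" query and serializes each stage to its own JSON string. The database, collection, and field names (`shop`, `orders`, `status`, `customer_id`, `amount`) are invented for the example.

```python
import json

# Each stage is written as a dict first, which is easier to read and validate.
stages = [
    {"$match": {"status": "shipped"}},
    {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
    {"$limit": 5},
]

# The pipeline parameter expects a list of JSON strings, one per stage.
pipeline = [json.dumps(stage) for stage in stages]

aggregate_args = {
    "database_name": "shop",
    "collection_name": "orders",
    "pipeline": pipeline,
    "limit": 100,
}
```

Placing $match first narrows the working set before $group runs, which is generally the cheaper ordering.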
Usage Workflow
For optimal results, follow this workflow when using the MongoDB toolkit:
- Discover Databases: Use discover_databases to see available databases
- Discover Collections: Use discover_collections with your target database
- Get Collection Schema: Use get_collection_schema for each collection you plan to query
- Execute Queries: Use find_documents, count_documents, or aggregate_documents with the schema information
This workflow ensures your agent has complete information about the database structure before attempting queries, reducing errors and improving query performance.
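The workflow can be sketched as a single function that calls the tools in order. Everything about the plumbing here is an assumption: `call_tool` is a hypothetical callable standing in for whatever executes an Arcade tool, the discovery tools are assumed to return lists of names, and `fake_call_tool` returns canned data so the sketch runs offline.

```python
import json
from typing import Any, Callable

ToolCaller = Callable[[str, dict], Any]


def explore_and_query(
    call_tool: ToolCaller,
    database_name: str,
    collection_name: str,
    filter_: dict,
) -> tuple[Any, Any]:
    """Discover, inspect the schema, then query; return (schema, documents)."""
    databases = call_tool("MongoDB.DiscoverDatabases", {})
    if database_name not in databases:
        raise LookupError(f"unknown database: {database_name}")

    collections = call_tool(
        "MongoDB.DiscoverCollections", {"database_name": database_name}
    )
    if collection_name not in collections:
        raise LookupError(f"unknown collection: {collection_name}")

    target = {"database_name": database_name, "collection_name": collection_name}
    schema = call_tool("MongoDB.GetCollectionSchema", target)
    documents = call_tool(
        "MongoDB.FindDocuments", {**target, "filter_dict": json.dumps(filter_)}
    )
    return schema, documents


# Stand-in executor with canned responses (hypothetical data).
calls: list[str] = []


def fake_call_tool(tool_name: str, args: dict) -> Any:
    calls.append(tool_name)
    canned = {
        "MongoDB.DiscoverDatabases": ["shop"],
        "MongoDB.DiscoverCollections": ["orders"],
        "MongoDB.GetCollectionSchema": {"status": ["str"]},
        "MongoDB.FindDocuments": [{"status": "shipped"}],
    }
    return canned[tool_name]


schema, documents = explore_and_query(
    fake_call_tool, "shop", "orders", {"status": "shipped"}
)
```

Failing early on an unknown database or collection avoids sending a query that cannot succeed.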