A backend API that provides controlled access to Snowflake tables (and optionally DuckDB tables) for AI agents. It acts as a secure intermediary between AI agents and the underlying databases, with features for data access, sampling, and artifact management.
Detailed documentation can be found in the DOCS folder:
- Instructions - Comprehensive guide for the API implementation
- Sync Strategy - Technical details of Snowflake to DuckDB replication
- Plan - Development roadmap and milestones
- Todo - Current development tasks and progress
- Secure authentication with Swarm and Agent tokens
- Data access through DuckDB with Snowflake synchronization
- Data sampling and profiling capabilities
- Artifact storage and sharing (up to 50MB)
- Support for CSV and Parquet export formats
- Comprehensive logging and auditing
- Asynchronous task processing with Celery
- Task monitoring with Flower dashboard
- Python 3.11+
- Poetry for dependency management
- Docker and Docker Compose
- Access to Snowflake database
- Clone the repository:
git clone https://github.com/padak/ai_agents_data_api.git
cd ai_agents_data_api
- Set up environment variables:
cp .env.example .env
# Edit .env with your configuration
- Run with Docker Compose:
docker compose up --build
This will start all services:
- FastAPI application (http://localhost:8000)
- Celery workers for async tasks
- Celery Beat for scheduled tasks
- Flower dashboard (http://localhost:5555)
- Redis as the message broker
Alternatively, for development without Docker:
- Install dependencies:
poetry install
- Run the development server:
poetry run uvicorn app.main:app --reload
- Access the API documentation:
  - Open http://localhost:8000/docs in your browser
  - You'll see all available endpoints with interactive documentation
- Test Query Endpoints:
# Example: Execute a query
curl -X POST "http://localhost:8000/api/v1/queries/" \
-H "Content-Type: application/json" \
-d '{"query": "SELECT * FROM my_table LIMIT 10", "output_format": "CSV"}'
# Check query status
curl "http://localhost:8000/api/v1/queries/{job_id}"
- Test Sync Operations:
# Start table sync
curl -X POST "http://localhost:8000/api/v1/sync/start" \
-H "Content-Type: application/json" \
-d '{"table_name": "my_table", "schema_name": "my_schema", "strategy": "FULL"}'
# Check sync status
curl "http://localhost:8000/api/v1/sync/jobs/{sync_id}"
- Access Flower Dashboard:
  - Open http://localhost:5555 in your browser
  - Monitor active workers, task history, and success rates
  - View detailed task information and results
- Scheduled Tasks:
  - Query result cleanup: Runs daily to remove old query results
  - Table sync status updates: Runs hourly to update sync statistics
  - Configure schedules in app/tasks/celery_app.py
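As a rough illustration of the daily and hourly schedules described above, a Celery Beat schedule could look something like the sketch below. The entry names and task paths are hypothetical placeholders, not the names used in app/tasks/celery_app.py; only the `task` / `schedule` entry layout and the use of `timedelta` intervals follow Celery's documented `beat_schedule` format.

```python
from datetime import timedelta

# Sketch of a Celery Beat schedule for the two periodic jobs.
# ASSUMPTION: the entry names and dotted task paths are made up for
# illustration; check app/tasks/celery_app.py for the real ones.
BEAT_SCHEDULE = {
    "cleanup-old-query-results": {
        "task": "app.tasks.cleanup_query_results",  # hypothetical task path
        "schedule": timedelta(days=1),  # run once a day
    },
    "update-table-sync-status": {
        "task": "app.tasks.update_sync_stats",  # hypothetical task path
        "schedule": timedelta(hours=1),  # run once an hour
    },
}

# In a Celery application this dict would be attached to the app config:
#   celery_app.conf.beat_schedule = BEAT_SCHEDULE
```

Celery also accepts `crontab(...)` objects as the `schedule` value when a job should run at a fixed time of day rather than at a fixed interval.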
For detailed development instructions and documentation, please refer to the Development Guide.
MIT License - see the LICENSE file for details
# Get admin token
curl -X POST "http://localhost:8000/api/v1/auth/token" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "username=admin&password=admin_test_password"
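The same token request can be prepared from Python using only the standard library. This is a minimal sketch, not the project's client: the helper names are invented here, and the `access_token` field in the response is an assumption based on the usual OAuth2 password-flow response shape, so verify it against the interactive docs at /docs.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"


def build_token_request(username: str, password: str) -> urllib.request.Request:
    """Build the form-encoded POST that mirrors the curl command above."""
    body = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/auth/token",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def extract_token(response_body: bytes) -> str:
    """Pull the bearer token out of the JSON response.

    ASSUMPTION: the response contains an "access_token" field.
    """
    return json.loads(response_body)["access_token"]


def auth_header(token: str) -> dict[str, str]:
    """Header to attach to the authenticated requests."""
    return {"Authorization": f"Bearer {token}"}


# Usage against a running server (not executed here):
#   req = build_token_request("admin", "admin_test_password")
#   with urllib.request.urlopen(req) as resp:
#       headers = auth_header(extract_token(resp.read()))
```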
- List Tables in Schema:
curl -X GET "http://localhost:8000/api/v1/sync/tables/WORKSPACE_833213390" \
-H "Authorization: Bearer your_access_token"
- Register a Table:
curl -X POST "http://localhost:8000/api/v1/sync/tables/register" \
-H "Authorization: Bearer your_access_token" \
-H "Content-Type: application/json" \
-d '{
"table_name": "data",
"schema_name": "WORKSPACE_833213390"
}'
- Remove a Table:
curl -X DELETE "http://localhost:8000/api/v1/sync/tables/WORKSPACE_833213390/data" \
-H "Authorization: Bearer your_access_token"
- Start Table Sync:
curl -X POST "http://localhost:8000/api/v1/sync/start" \
-H "Authorization: Bearer your_access_token" \
-H "Content-Type: application/json" \
-d '{
"sync_request": {
"table_name": "data",
"schema_name": "WORKSPACE_833213390",
"strategy": "full"
}
}'
- Check Sync Job Status:
curl -X GET "http://localhost:8000/api/v1/sync/jobs/your_sync_id" \
-H "Authorization: Bearer your_access_token"
- Check Table Sync Status:
curl -X GET "http://localhost:8000/api/v1/sync/tables/WORKSPACE_833213390/data/status" \
-H "Authorization: Bearer your_access_token"
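The start-then-poll pattern implied by the sync commands above can be sketched in Python as follows. The URL paths and the `sync_request` payload shape are taken from the curl examples; everything else is an assumption made for illustration, in particular the `status` field and its `completed` / `failed` values, which the examples do not show. The HTTP call is passed in as a function so the polling logic can be tried without a running server.

```python
import time
from typing import Callable

BASE_URL = "http://localhost:8000/api/v1/sync"

# ASSUMPTION: the job response has a "status" field and these values
# mean the job is finished. Adjust to what the API actually returns.
TERMINAL_STATES = {"completed", "failed"}


def sync_job_url(sync_id: str) -> str:
    return f"{BASE_URL}/jobs/{sync_id}"


def table_status_url(schema: str, table: str) -> str:
    return f"{BASE_URL}/tables/{schema}/{table}/status"


def start_sync_payload(schema: str, table: str, strategy: str = "full") -> dict:
    """JSON body for POST /sync/start, shaped like the curl example."""
    return {
        "sync_request": {
            "table_name": table,
            "schema_name": schema,
            "strategy": strategy,
        }
    }


def wait_for_sync(
    fetch_job: Callable[[str], dict],
    sync_id: str,
    interval: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the job URL until it reports a terminal state or polls run out.

    fetch_job is any callable that GETs a URL (with the Authorization
    header) and returns the decoded JSON body as a dict.
    """
    for _ in range(max_polls):
        job = fetch_job(sync_job_url(sync_id))
        if str(job.get("status", "")).lower() in TERMINAL_STATES:
            return job
        sleep(interval)
    raise TimeoutError(f"sync job {sync_id} did not finish after {max_polls} polls")
```

Bounding the loop with `max_polls` keeps a stuck job from blocking the caller indefinitely; the table-level status endpoint can then be checked once the job itself reports completion.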
For more details about the API endpoints and their parameters, please refer to the API Documentation.