AI Agents Data API

A backend API that gives AI agents controlled access to Snowflake tables (and, optionally, DuckDB tables). It acts as a secure intermediary between AI agents and the underlying databases, with features for data access, sampling, and artifact management.

Documentation

Detailed documentation can be found in the DOCS folder:

  • Instructions - Comprehensive guide for the API implementation
  • Sync Strategy - Technical details of Snowflake to DuckDB replication
  • Plan - Development roadmap and milestones
  • Todo - Current development tasks and progress

Features

  • Secure authentication with Swarm and Agent tokens
  • Data access through DuckDB with Snowflake synchronization
  • Data sampling and profiling capabilities
  • Artifact storage and sharing (up to 50MB)
  • Support for CSV and Parquet export formats
  • Comprehensive logging and auditing
  • Asynchronous task processing with Celery
  • Task monitoring with Flower dashboard

Requirements

  • Python 3.11+
  • Poetry for dependency management
  • Docker and Docker Compose
  • Access to Snowflake database

Quick Start

  1. Clone the repository:

    git clone https://github.com/padak/ai_agents_data_api.git
    cd ai_agents_data_api
  2. Set up environment variables:

    cp .env.example .env
    # Edit .env with your configuration
  3. Run with Docker Compose:

    docker compose up --build

This starts all services defined in the Compose file.

Alternatively, for development without Docker:

  1. Install dependencies:

    poetry install
  2. Run the development server:

    poetry run uvicorn app.main:app --reload

Testing the API

  1. Access the API documentation:

  2. Test Query Endpoints:

    # Example: Execute a query
    curl -X POST "http://localhost:8000/api/v1/queries/" \
      -H "Content-Type: application/json" \
      -d '{"query": "SELECT * FROM my_table LIMIT 10", "output_format": "CSV"}'
    
    # Check query status
    curl "http://localhost:8000/api/v1/queries/{job_id}"
  3. Test Sync Operations:

    # Start table sync
    curl -X POST "http://localhost:8000/api/v1/sync/start" \
      -H "Content-Type: application/json" \
      -d '{"table_name": "my_table", "schema_name": "my_schema", "strategy": "FULL"}'
    
    # Check sync status
    curl "http://localhost:8000/api/v1/sync/jobs/{sync_id}"

Monitoring Tasks

  1. Access Flower Dashboard:

    • Open http://localhost:5555 in your browser
    • Monitor active workers, task history, and success rates
    • View detailed task information and results
  2. Scheduled Tasks:

    • Query result cleanup: Runs daily to remove old query results
    • Table sync status updates: Runs hourly to update sync statistics
    • Configure schedules in app/tasks/celery_app.py
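
For orientation, a daily and an hourly schedule in Celery beat could look roughly like the sketch below. The entry keys and task paths are hypothetical placeholders, not the names used in app/tasks/celery_app.py; only the `beat_schedule` structure and the use of `timedelta` intervals follow Celery's documented configuration.

```python
from datetime import timedelta

# Hypothetical entry keys and task paths; the real ones are defined in the project.
beat_schedule = {
    "cleanup-query-results": {
        "task": "app.tasks.cleanup_query_results",  # assumed task name
        "schedule": timedelta(days=1),  # run once per day
    },
    "update-sync-status": {
        "task": "app.tasks.update_sync_status",  # assumed task name
        "schedule": timedelta(hours=1),  # run once per hour
    },
}

# In a Celery application this dict would be attached with:
#   celery_app.conf.beat_schedule = beat_schedule
```

Interval schedules like these fire relative to when beat starts; a `crontab` schedule is the usual choice when a job must run at a fixed time of day.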

Development

For detailed development instructions and documentation, please refer to the Development Guide.

License

MIT License - see the LICENSE file for details

API Usage Examples

Authentication

# Get admin token
curl -X POST "http://localhost:8000/api/v1/auth/token" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "username=admin&password=admin_test_password"
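
The same token request can be built in Python. This is a sketch with the standard library only; it assumes the token response is JSON with an `access_token` field, which is the common OAuth2 password-flow shape but is not confirmed by this README.

```python
import urllib.parse
import urllib.request

TOKEN_URL = "http://localhost:8000/api/v1/auth/token"  # from the curl example


def build_token_request(username, password):
    """Build the form-encoded POST /auth/token request without sending it."""
    form = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(
        TOKEN_URL,
        data=form.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def bearer_header(token_response):
    """Turn a decoded token response into an Authorization header.

    Assumes the response carries the token under "access_token".
    """
    return {"Authorization": f"Bearer {token_response['access_token']}"}
```

Sending the request with `urllib.request.urlopen` and decoding the body with `json.load` yields the dict that `bearer_header` expects; the resulting header corresponds to the `Authorization: Bearer your_access_token` line in the examples that follow.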

Sync Operations

  1. List Tables in Schema:

    curl -X GET "http://localhost:8000/api/v1/sync/tables/WORKSPACE_833213390" \
      -H "Authorization: Bearer your_access_token"
  2. Register a Table:

    curl -X POST "http://localhost:8000/api/v1/sync/tables/register" \
      -H "Authorization: Bearer your_access_token" \
      -H "Content-Type: application/json" \
      -d '{
        "table_name": "data",
        "schema_name": "WORKSPACE_833213390"
      }'
  3. Remove a Table:

    curl -X DELETE "http://localhost:8000/api/v1/sync/tables/WORKSPACE_833213390/data" \
      -H "Authorization: Bearer your_access_token"
  4. Start Table Sync:

    curl -X POST "http://localhost:8000/api/v1/sync/start" \
      -H "Authorization: Bearer your_access_token" \
      -H "Content-Type: application/json" \
      -d '{
        "sync_request": {
          "table_name": "data",
          "schema_name": "WORKSPACE_833213390",
          "strategy": "full"
        }
      }'
  5. Check Sync Job Status:

    curl -X GET "http://localhost:8000/api/v1/sync/jobs/your_sync_id" \
      -H "Authorization: Bearer your_access_token"
  6. Check Table Sync Status:

    curl -X GET "http://localhost:8000/api/v1/sync/tables/WORKSPACE_833213390/data/status" \
      -H "Authorization: Bearer your_access_token"
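
The authenticated sync calls above share one pattern: a bearer token header plus an optional JSON body. A small Python sketch of that pattern follows, using only the standard library. The helper names are illustrative and not part of the project; the paths and payload shapes are copied from the curl examples in this section.

```python
import json
import urllib.parse
import urllib.request

SYNC_URL = "http://localhost:8000/api/v1/sync"  # from the curl examples


def authed_request(method, path, token, payload=None):
    """Build a sync API request with a bearer token and optional JSON body."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{SYNC_URL}{path}", data=data, headers=headers, method=method
    )


def register_table(token, schema, table):
    """POST /sync/tables/register with the table and schema names."""
    body = {"table_name": table, "schema_name": schema}
    return authed_request("POST", "/tables/register", token, body)


def start_sync(token, schema, table, strategy="full"):
    """POST /sync/start; the body is wrapped in "sync_request" as shown above."""
    body = {
        "sync_request": {
            "table_name": table,
            "schema_name": schema,
            "strategy": strategy,
        }
    }
    return authed_request("POST", "/start", token, body)


def table_status(token, schema, table):
    """GET /sync/tables/{schema}/{table}/status, URL-quoting both names."""
    schema_q = urllib.parse.quote(schema, safe="")
    table_q = urllib.parse.quote(table, safe="")
    return authed_request("GET", f"/tables/{schema_q}/{table_q}/status", token)
```

Each helper only builds a `urllib.request.Request`; passing it to `urllib.request.urlopen` sends it. Keeping construction separate from sending makes the requests easy to inspect or log before they reach the server.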

For more details about the API endpoints and their parameters, please refer to the API Documentation.
