Skip to content

Enterprise AI agent demonstrating MCP integration with Snowflake for autonomous data analysis

License

Notifications You must be signed in to change notification settings

pushkersahai/dataops-mcp-agent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DataOps MCP Agent

MCP-powered agent integrating with Snowflake for autonomous data analysis.

Overview

This project showcases how to build production-ready AI agents that can:

  • Query enterprise data warehouses using natural language
  • Execute cross-database joins autonomously
  • Maintain security through read-only access patterns
  • Provide audit trails for compliance

Architecture

User Query
    ↓
Claude (Anthropic API)
    ↓ (MCP Protocol)
MCP Server (Python)
    ↓
Snowflake Tools Layer
    ↓
Snowflake Data Warehouse

Key Components

  1. MCP Server: Standardized protocol for AI-tool integration
  2. Snowflake Tools: Secure, read-only database query layer
  3. Claude Integration: LLM orchestration with autonomous discovery
  4. Cross-Database Queries: Enterprise pattern for federated data access

Technical Stack

  • LLM: Claude 3.5 Sonnet (Anthropic)
  • Protocol: Model Context Protocol (MCP)
  • Data Platform: Snowflake
  • Language: Python 3.10
  • Key Libraries: anthropic, mcp, snowflake-connector-python

Business Value

For Enterprise Customers

  • Secure by design: Read-only queries, no data exfiltration
  • Governed access: All queries logged and auditable
  • Cross-platform: MCP works with any compatible LLM
  • Incremental adoption: Start with analytics, expand to operations

Features

  • Natural language to SQL translation
  • Cross-database join execution
  • Autonomous schema discovery
  • Business-focused response formatting
  • Query safety validation (SELECT-only)
  • Conversation context retention

Quick Start

Prerequisites

  • Python 3.10+
  • Snowflake trial account (free at snowflake.com)
  • Anthropic API key

Installation

git clone https://github.com/pushkersahai/dataops-mcp-agent.git
cd dataops-mcp-agent

python -m venv venv
source venv/Scripts/activate  # Windows Git Bash
# source venv/bin/activate     # Mac/Linux

pip install -r requirements.txt

cp .env.example .env
# Edit .env with your credentials

Setup Demo Data

python src/setup_demo_data.py

Creates two databases:

  • CUSTOMER_DB.MASTER.CUSTOMERS (10 customers)
  • SALES_DB.TRANSACTIONS.SALES_ORDERS (15 orders)

Run the Agent

# Autonomous mode (recommended for demos)
python src/claude_mcp_client_v2.py

# Interactive mode (asks for clarification)
python src/claude_mcp_client.py

Example Usage

You: What are the top 5 largest closed deals?

Claude is using 1 tool(s):
  - execute_query

Claude: Based on the sales data, here are the top 5 largest closed deals:

1. Retail Giants Inc - Snowflake Enterprise License: $1,200,000
2. Finance Partners Group - Snowflake Enterprise License: $1,100,000
3. Energy Solutions Corp - Snowflake Enterprise License: $950,000
4. Finance Partners Group - AWS Data Lake Solution: $890,000
5. Acme Corporation - Snowflake Enterprise License: $450,000

Total value: $4,590,000

More examples:

  • "Show me total revenue by industry"
  • "Which account manager has the most closed revenue?"
  • "What's our average deal size for Enterprise customers?"
  • "Show me all Technology customers with their purchase history"

Project Structure

dataops-mcp-agent/
├── src/
│   ├── mcp_server/
│   │   └── server.py                 # MCP server implementation
│   ├── tools/
│   │   └── snowflake_tools.py        # Snowflake query tools
│   ├── claude_mcp_client.py          # Interactive mode
│   ├── claude_mcp_client_v2.py       # Autonomous mode
│   ├── setup_demo_data.py            # Demo data generator
│   ├── test_snowflake_connection.py  # Connection validator
│   └── test_mcp_server.py            # Tool testing
├── docs/
│   ├── ARCHITECTURE.md               # Technical deep dive
│   └── PRODUCTIONIZATION.md          # Deployment guide
├── .env.example                      # Environment template
├── requirements.txt                  # Dependencies
└── README.md                         # This file

Security & Governance

Built-in Safeguards

  1. Read-only access: Only SELECT queries permitted
  2. Query validation: Blocks DROP, DELETE, UPDATE, INSERT
  3. No data storage: Results streamed, not persisted
  4. Credential isolation: API keys in environment variables
  5. Audit trail: All queries logged with timestamps

Enterprise Considerations

  • Role-Based Access Control (RBAC) in Snowflake
  • Network policies for IP whitelisting
  • OAuth integration for SSO
  • Query result size limits
  • Rate limiting on API calls

How MCP Differs from Traditional RAG

Aspect Traditional RAG MCP Agent
Data Access Copy to vector DB Query live data
Freshness Stale (periodic sync) Real-time
Security Data duplication Source of truth
Maintenance Embedding updates Schema-aware
Cost Storage + compute Query compute only

Production Deployment

See docs/PRODUCTIONIZATION.md for:

  • AWS Lambda deployment
  • Container orchestration (ECS/Kubernetes)
  • Monitoring and observability
  • Cost optimization strategies
  • Multi-tenant architectures
  • Compliance and governance

Testing

# Test Snowflake connection
python src/test_snowflake_connection.py

# Test MCP tools locally
python src/test_mcp_server.py

Roadmap

Potential enhancements:

  • Slack integration for team queries
  • dbt Cloud API integration for lineage analysis
  • Web UI with Streamlit
  • Query result caching
  • Natural language to dbt model generation
  • Multi-LLM support (GPT-4, Gemini)

Related Projects

  • Macro Regime Decision Agent - dbt + Snowflake financial analysis

Why MCP?

Model Context Protocol standardizes AI-tool integration:

  1. Interoperability: Same tools work with Claude, GPT-4, Gemini
  2. Reusability: Write once, use across projects
  3. Governance: Centralized access control
  4. Future-proof: Swap LLMs without rewriting integrations

License

MIT License

Author

Pushker Sahai www.linkedin.com/in/pushkersahai

About

Enterprise AI agent demonstrating MCP integration with Snowflake for autonomous data analysis

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Languages