MCP-powered agent integrating with Snowflake for autonomous data analysis.
This project showcases how to build production-ready AI agents that can:
- Query enterprise data warehouses using natural language
- Execute cross-database joins autonomously
- Maintain security through read-only access patterns
- Provide audit trails for compliance
User Query
↓
Claude (Anthropic API)
↓ (MCP Protocol)
MCP Server (Python)
↓
Snowflake Tools Layer
↓
Snowflake Data Warehouse
- MCP Server: Standardized protocol for AI-tool integration
- Snowflake Tools: Secure, read-only database query layer
- Claude Integration: LLM orchestration with autonomous discovery
- Cross-Database Queries: Enterprise pattern for federated data access
- LLM: Claude 3.5 Sonnet (Anthropic)
- Protocol: Model Context Protocol (MCP)
- Data Platform: Snowflake
- Language: Python 3.10
- Key Libraries: anthropic, mcp, snowflake-connector-python
- Secure by design: Read-only queries, no data exfiltration
- Governed access: All queries logged and auditable
- Cross-platform: MCP works with any compatible LLM
- Incremental adoption: Start with analytics, expand to operations
- Natural language to SQL translation
- Cross-database join execution
- Autonomous schema discovery
- Business-focused response formatting
- Query safety validation (SELECT-only)
- Conversation context retention
- Python 3.10+
- Snowflake trial account (free at snowflake.com)
- Anthropic API key
git clone https://github.com/pushkersahai/dataops-mcp-agent.git
cd dataops-mcp-agent
python -m venv venv
source venv/Scripts/activate # Windows Git Bash
# source venv/bin/activate # Mac/Linux
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your credentialspython src/setup_demo_data.pyCreates two databases:
- CUSTOMER_DB.MASTER.CUSTOMERS (10 customers)
- SALES_DB.TRANSACTIONS.SALES_ORDERS (15 orders)
# Autonomous mode (recommended for demos)
python src/claude_mcp_client_v2.py
# Interactive mode (asks for clarification)
python src/claude_mcp_client.pyYou: What are the top 5 largest closed deals?
Claude is using 1 tool(s):
- execute_query
Claude: Based on the sales data, here are the top 5 largest closed deals:
1. Retail Giants Inc - Snowflake Enterprise License: $1,200,000
2. Finance Partners Group - Snowflake Enterprise License: $1,100,000
3. Energy Solutions Corp - Snowflake Enterprise License: $950,000
4. Finance Partners Group - AWS Data Lake Solution: $890,000
5. Acme Corporation - Snowflake Enterprise License: $450,000
Total value: $4,590,000
More examples:
- "Show me total revenue by industry"
- "Which account manager has the most closed revenue?"
- "What's our average deal size for Enterprise customers?"
- "Show me all Technology customers with their purchase history"
dataops-mcp-agent/
├── src/
│ ├── mcp_server/
│ │ └── server.py # MCP server implementation
│ ├── tools/
│ │ └── snowflake_tools.py # Snowflake query tools
│ ├── claude_mcp_client.py # Interactive mode
│ ├── claude_mcp_client_v2.py # Autonomous mode
│ ├── setup_demo_data.py # Demo data generator
│ ├── test_snowflake_connection.py # Connection validator
│ └── test_mcp_server.py # Tool testing
├── docs/
│ ├── ARCHITECTURE.md # Technical deep dive
│ └── PRODUCTIONIZATION.md # Deployment guide
├── .env.example # Environment template
├── requirements.txt # Dependencies
└── README.md # This file
- Read-only access: Only SELECT queries permitted
- Query validation: Blocks DROP, DELETE, UPDATE, INSERT
- No data storage: Results streamed, not persisted
- Credential isolation: API keys in environment variables
- Audit trail: All queries logged with timestamps
- Role-Based Access Control (RBAC) in Snowflake
- Network policies for IP whitelisting
- OAuth integration for SSO
- Query result size limits
- Rate limiting on API calls
| Aspect | Traditional RAG | MCP Agent |
|---|---|---|
| Data Access | Copy to vector DB | Query live data |
| Freshness | Stale (periodic sync) | Real-time |
| Security | Data duplication | Source of truth |
| Maintenance | Embedding updates | Schema-aware |
| Cost | Storage + compute | Query compute only |
See docs/PRODUCTIONIZATION.md for:
- AWS Lambda deployment
- Container orchestration (ECS/Kubernetes)
- Monitoring and observability
- Cost optimization strategies
- Multi-tenant architectures
- Compliance and governance
# Test Snowflake connection
python src/test_snowflake_connection.py
# Test MCP tools locally
python src/test_mcp_server.pyPotential enhancements:
- Slack integration for team queries
- dbt Cloud API integration for lineage analysis
- Web UI with Streamlit
- Query result caching
- Natural language to dbt model generation
- Multi-LLM support (GPT-4, Gemini)
- Macro Regime Decision Agent - dbt + Snowflake financial analysis
Model Context Protocol standardizes AI-tool integration:
- Interoperability: Same tools work with Claude, GPT-4, Gemini
- Reusability: Write once, use across projects
- Governance: Centralized access control
- Future-proof: Swap LLMs without rewriting integrations
MIT License
Pushker Sahai www.linkedin.com/in/pushkersahai