Multi-Tenant Gateway Architecture
One gateway, unlimited APIs. Built for scale, designed for simplicity.
Evolution to Gateway-Only
As of August 2025, we've consolidated to a true gateway-only architecture. No more individual wrapper services - the gateway handles ALL MCP services through multi-tenant routing based on the Host header. Add 100+ services with just configuration files.
System Architecture
┌──────────────────────────────────────────────┐
│  Client (Claude.ai, ChatGPT)                 │
└────────────────┬─────────────────────────────┘
                 │ Calls your-api-mcp.mcpify.org
                 ↓
┌──────────────────────────────────────────────┐
│  Nginx (Wildcard)                            │
│  Routes ALL *-mcp.mcpify.org → Gateway:8080  │
└────────────────┬─────────────────────────────┘
                 │ Preserves Host header
                 ↓
┌──────────────────────────────────────────────┐
│  MCP Gateway Service (Port 8080)             │
│                                              │
│  • Parses service from Host header           │
│  • Handles MCP protocol for ALL services     │
│  • Token counting (90%+ cache hit rate)      │
│  • OAuth vault with auto-refresh             │
│  • Cross-service cache sharing               │
│  • Global rate limiting                      │
└────────────────┬─────────────────────────────┘
                 │ Loads config from storage
                 ↓
┌──────────────────────────────────────────────┐
│  Tool Configurations (Cloud Storage)         │
│                                              │
│  • hubspot.json      • stripe.json           │
│  • slack.json        • YOUR-API.json         │
└──────────────────────────────────────────────┘
Why Gateway-First?
The Numbers Don't Lie
When calling external APIs:
- Your MCP → Gateway: ~10ms (same region)
- Your MCP → External API: 200-800ms
At ~10ms against 200-800ms, gateway overhead is roughly 1-5% of total latency.
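The overhead figure follows from simple arithmetic on the two latencies quoted above. A minimal sketch, assuming the gateway hop and the external API call happen sequentially so that their latencies add (the function name is illustrative):

```python
def gateway_overhead_fraction(gateway_ms: float, api_ms: float) -> float:
    """Share of total request latency spent on the gateway hop.

    Assumes the gateway hop and the external API call run one after
    the other, so total latency is their sum.
    """
    return gateway_ms / (gateway_ms + api_ms)


# The two ends of the 200-800ms range quoted above, with a ~10ms gateway hop.
fast_api = gateway_overhead_fraction(10, 200)   # a little under 5%
slow_api = gateway_overhead_fraction(10, 800)   # a little over 1%
```

The slower the external API, the smaller the gateway's share of the total.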
Performance Characteristics
| Operation | Library (Local) | Gateway | Impact |
|-----------|----------------|---------|--------|
| Token Count (cached) | 1ms | 10ms | +9ms |
| Token Count (uncached) | 50ms | 15ms | -35ms ✅ |
| Truncation | 10ms | 15ms | +5ms |
| Cache Check | 1ms | 10ms | +9ms |
| **Total per API call** | 62ms | 50ms | **-12ms** ✅ |
Gateway is actually FASTER for real workloads due to 90%+ cache hit rate and cross-service sharing.
The 90/10 Rule
90% of functionality runs locally, 10% leverages the gateway for compute-intensive operations.
┌─────────────────────────────────────────────┐
│              Your MCP Service               │
├─────────────────────────────────────────────┤
│ 90% Local Library   │ 10% Gateway (opt)     │
├─────────────────────┼───────────────────────┤
│ • Tool definitions  │ • Heavy token ops     │
│ • Basic validation  │ • OAuth vault         │
│ • Local caching     │ • Cross-service cache │
│ • Field filtering   │ • Analytics           │
│ • Error handling    │ • Global rate limits  │
└─────────────────────┴───────────────────────┘
           ↓                      ↓
    Local Execution    Network Call to Gateway
     (No overhead)        (When beneficial)
Core Services
🧮 Token Intelligence
- Centralized token counting
- 90%+ cache hit rate in production
- Uses tiktoken for precise counts
- ~5000 counts/second with caching
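The caching idea behind centralized token counting can be sketched as a content-addressed cache in front of a tokenizer. This is an illustration only, not the gateway's code: the class name, the in-memory dictionary, and the whitespace-based stand-in counter are all assumptions. In production the counter would be a real tokenizer such as tiktoken, as noted above.

```python
import hashlib
from typing import Callable, Dict


class CachedTokenCounter:
    """Caches token counts keyed by a hash of the text (hypothetical helper)."""

    def __init__(self, count_fn: Callable[[str], int]) -> None:
        self._count_fn = count_fn          # the expensive tokenizer call
        self._cache: Dict[str, int] = {}   # stand-in for a shared cache
        self.hits = 0
        self.misses = 0

    def count(self, text: str) -> int:
        # Hash the text so the cache key has a fixed size regardless of input.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        cached = self._cache.get(key)
        if cached is not None:
            self.hits += 1
            return cached
        self.misses += 1
        value = self._count_fn(text)
        self._cache[key] = value
        return value

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def whitespace_count(text: str) -> int:
    """Stand-in tokenizer for the sketch: counts whitespace-separated words."""
    return len(text.split())
```

Because identical text always hashes to the same key, any service that sends a payload another service has already counted gets the cached result.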
💾 Smart Caching
- Cross-service response caching
- Field filtering support
- Cache full, serve partial
- Redis-backed with TTL control
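"Cache full, serve partial" can be read as: store the complete upstream response once, then apply field filtering when reading. The sketch below illustrates that reading with an in-memory dictionary standing in for Redis. The class, method names, and key format are assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Iterable, Optional, Tuple


class ResponseCache:
    """Stores full responses with a TTL and serves filtered views (hypothetical)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Dict[str, Any]]] = {}

    def put(self, key: str, response: Dict[str, Any]) -> None:
        # Always cache the full response, whatever fields the caller asked for.
        self._entries[key] = (self._clock() + self._ttl, dict(response))

    def get(self, key: str,
            fields: Optional[Iterable[str]] = None) -> Optional[Dict[str, Any]]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        if fields is None:
            return dict(response)
        wanted = set(fields)
        # Serve only the requested fields from the cached full response.
        return {name: value for name, value in response.items() if name in wanted}
```

Caching the full response means two callers that want different fields of the same record still share one cache entry.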
🔐 OAuth Vault
- Secure token storage
- Automatic refresh handling
- Multi-tenant isolation
- Encrypted at rest (Fernet)
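A rough sketch of how automatic refresh and tenant isolation might fit together: tokens are keyed by (tenant, service), and a token close to expiry is refreshed before it is handed out. Everything named here is hypothetical, including the 60-second skew and the refresh callback. Encryption at rest is deliberately left out to keep the sketch dependency-free.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class StoredToken:
    access_token: str
    expires_at: float  # absolute time on the vault's clock, in seconds


class TokenVault:
    """Keeps one token per (tenant, service) and refreshes near expiry."""

    def __init__(self,
                 refresh_fn: Callable[[str, str], StoredToken],
                 clock: Callable[[], float] = time.time,
                 skew_seconds: float = 60.0) -> None:
        self._refresh_fn = refresh_fn  # assumed to call the provider's token endpoint
        self._clock = clock
        self._skew = skew_seconds      # refresh slightly early to avoid races
        self._tokens: Dict[Tuple[str, str], StoredToken] = {}

    def store(self, tenant: str, service: str, token: StoredToken) -> None:
        self._tokens[(tenant, service)] = token

    def get_access_token(self, tenant: str, service: str) -> str:
        key = (tenant, service)
        token = self._tokens[key]  # KeyError: this tenant has no token for the service
        if self._clock() >= token.expires_at - self._skew:
            token = self._refresh_fn(tenant, service)
            self._tokens[key] = token
        return token.access_token
```

Keying on the (tenant, service) pair means a lookup for one tenant can never return another tenant's token, even when both use the same service.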
📊 Analytics
- Usage tracking per service
- Performance metrics
- Cost optimization insights
- Real-time dashboards
🚦 Rate Limiting
- Global API protection
- Multi-tier limits
- Per-service overrides
- Burst handling
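One common way to combine burst handling with per-service overrides is a token bucket per service. The sketch below uses that technique as an assumption. The source does not say which algorithm the gateway uses, and the class names and limits are illustrative.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TokenBucket:
    """Allows short bursts up to `burst`, refilling at `rate_per_second`."""

    def __init__(self, rate_per_second: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate_per_second
        self._capacity = float(burst)
        self._clock = clock
        self._tokens = float(burst)  # start full so an initial burst is allowed
        self._updated = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, capped at the burst capacity.
        elapsed = now - self._updated
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._updated = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class RateLimiter:
    """One bucket per service, with optional (rate, burst) overrides."""

    def __init__(self, default_rate: float, default_burst: int,
                 overrides: Optional[Dict[str, Tuple[float, int]]] = None,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._default = (default_rate, default_burst)
        self._overrides = overrides or {}
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, service: str) -> bool:
        bucket = self._buckets.get(service)
        if bucket is None:
            rate, burst = self._overrides.get(service, self._default)
            bucket = TokenBucket(rate, burst, self._clock)
            self._buckets[service] = bucket
        return bucket.allow()
```

A single-process dictionary only works for one gateway instance. Limits shared across instances would need a shared store.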
📄 Pagination
- Consistent handling
- Data snapshots
- Prevent mixed versions
- Session management
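The snapshot idea can be illustrated as follows: freeze the result set when a pagination session starts, then serve every page from that frozen copy so later changes upstream cannot leak into later pages. The class and method names are assumptions, and a real gateway would also expire old sessions.

```python
import uuid
from typing import Any, Dict, List, Sequence, Tuple


class SnapshotPaginator:
    """Serves pages from a per-session frozen copy of the data (hypothetical)."""

    def __init__(self, page_size: int) -> None:
        self._page_size = page_size
        self._snapshots: Dict[str, Tuple[Any, ...]] = {}

    def start(self, items: Sequence[Any]) -> str:
        """Freeze the current items and return a session id for later pages."""
        session_id = uuid.uuid4().hex
        self._snapshots[session_id] = tuple(items)  # copy, detached from the source
        return session_id

    def page(self, session_id: str, page_number: int) -> Tuple[List[Any], bool]:
        """Return (items, has_more) for a zero-based page of the snapshot."""
        snapshot = self._snapshots[session_id]
        start = page_number * self._page_size
        end = start + self._page_size
        return list(snapshot[start:end]), end < len(snapshot)

    def close(self, session_id: str) -> None:
        self._snapshots.pop(session_id, None)
```

Because each session reads only its own snapshot, page 3 is guaranteed to come from the same version of the data as page 1.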
Multi-Tenant Routing
How It Works
1. Request arrives at subdomain:
   hubspot-mcp.mcpify.org
2. Nginx routes ALL *-mcp domains to gateway:
   → gateway:8080
3. Gateway parses service from Host header:
   service = "hubspot"
4. Loads configuration from storage:
   configs/hubspot.json
5. Handles MCP protocol for that service:
   Returns tools, executes calls
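Steps 3 and 4 above can be sketched in a few lines. The hostname pattern and the `configs/<service>.json` layout are taken from the examples in this section. The regular expression, the function names, and the strict validation are assumptions for illustration, not the gateway's actual code.

```python
import re

# Assumed convention from the examples above: "<service>-mcp.mcpify.org".
# Only lowercase letters, digits, and hyphens are accepted in the service name,
# which also keeps characters like "/" and "." out of the config path.
HOST_PATTERN = re.compile(r"^(?P<service>[a-z0-9][a-z0-9-]*)-mcp\.mcpify\.org$")


def service_from_host(host_header: str) -> str:
    """Extract the service name from a Host header value."""
    # Drop an optional ":port" suffix and normalise case before matching.
    hostname = host_header.split(":", 1)[0].strip().lower()
    match = HOST_PATTERN.match(hostname)
    if match is None:
        raise ValueError(f"not a recognised MCP host: {host_header!r}")
    return match.group("service")


def config_path_for(service: str) -> str:
    """Map a service name to its configuration object in storage."""
    return f"configs/{service}.json"
```

For example, `config_path_for(service_from_host("hubspot-mcp.mcpify.org"))` yields `configs/hubspot.json`. Rejecting anything that does not match the pattern matters here because the service name ends up in a storage path.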
Resource Usage
Response Times
Benefits at Scale
For 20-30 Wrappers
Development Time
30 wrappers × 50 lines = 1,500 lines
vs 15,000 lines without gateway
Bug Fixes
Fix once in gateway
vs 30 different places
Monitoring
1 dashboard
vs 30 separate dashboards
Updates
Single gateway deployment
All wrappers benefit instantly
Security
API Key Authentication
All endpoints except /health require an API key
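A minimal sketch of such a check, assuming the key arrives in an `X-API-Key` header (the header name is an assumption and is not specified here) and that a single expected key is configured:

```python
import hmac
from typing import Mapping

# Paths that skip authentication; /health is the only one named above.
PUBLIC_PATHS = frozenset({"/health"})


def is_authorized(path: str, headers: Mapping[str, str], expected_key: str) -> bool:
    """Return True if the request may proceed."""
    if path in PUBLIC_PATHS:
        return True
    if not expected_key:
        # Fail closed when no key is configured.
        return False
    supplied = headers.get("X-API-Key", "")  # hypothetical header name
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))
```

Failing closed on an empty configured key prevents a misconfigured deployment from accepting requests that carry no key at all.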
OAuth Token Encryption
Tokens encrypted at rest using Fernet
CORS Protection
Only trusted origins allowed
Service Isolation
Multi-tenant with namespace separation
Minimal Permissions
Service account follows principle of least privilege