Gateway Architecture
The multi-tenant gateway that powers MCPify's AI integrations at scale
The MCPify Gateway is a high-performance, multi-tenant gateway that sits between AI agents and your APIs. It handles authentication, routing, rate limiting, caching, and monitoring while providing a unified interface for all your MCP services.
Architecture Overview
AI Agents Layer
Claude, GPT-5, Custom Agents
LLMs and AI agents connect via MCP protocol
MCPify Gateway
Core Infrastructure
Your APIs
External Services
CRM, ERP, Databases, Custom APIs, Third-party services
Core Components
Request Router
Intelligent routing based on service discovery and load balancing:
// Routing configuration example
{
  "routes": [
    {
      "pattern": "/services/crm/*",
      "backend": "https://api.crm.example.com",
      "loadBalancing": "round-robin",
      "healthCheck": "/health",
      "timeout": 30000
    },
    {
      "pattern": "/services/erp/*",
      "backend": ["https://erp1.example.com", "https://erp2.example.com"],
      "loadBalancing": "least-connections",
      "circuitBreaker": {
        "threshold": 5,
        "timeout": 60000
      }
    }
  ]
}
- Service discovery and registration
- Multiple load balancing strategies
- Circuit breaker for fault tolerance
- Automatic failover and retry logic
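The routing configuration above can be illustrated with a small sketch of pattern matching, round-robin backend selection, and a circuit breaker. This is not MCPify's implementation: the names `CircuitBreaker`, `Route`, and `resolve` are hypothetical, and the sketch assumes that `threshold` counts consecutive failures and that `timeout` is the open period in milliseconds, neither of which the configuration states explicitly.

```python
import fnmatch
import itertools
import time


class CircuitBreaker:
    """Opens after `threshold` consecutive failures (assumed semantics)."""

    def __init__(self, threshold=5, timeout_ms=60000, clock=time.monotonic):
        self.threshold = threshold
        self.timeout_s = timeout_ms / 1000
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        if self.opened_at is None:
            return True
        # Half-open: permit a trial request once the open period has elapsed.
        return self.clock() - self.opened_at >= self.timeout_s

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()


class Route:
    """One entry of the hypothetical `routes` list."""

    def __init__(self, pattern, backends, breaker=None):
        self.pattern = pattern
        # `backend` may be a single URL or a list of URLs in the config.
        self.backends = [backends] if isinstance(backends, str) else list(backends)
        self._cycle = itertools.cycle(self.backends)
        self.breaker = breaker or CircuitBreaker()

    def matches(self, path):
        return fnmatch.fnmatchcase(path, self.pattern)

    def next_backend(self):
        if not self.breaker.allow():
            raise RuntimeError("circuit open for " + self.pattern)
        return next(self._cycle)  # round-robin selection


def resolve(routes, path):
    """Return the backend URL for the first route whose pattern matches."""
    for route in routes:
        if route.matches(path):
            return route.next_backend()
    raise LookupError("no route for " + path)
```

Round-robin is used here for both routes to keep the sketch short; a least-connections strategy would additionally need a per-backend count of in-flight requests.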
Authentication Manager
Centralized authentication and credential management:
Supported Auth Types
- OAuth 2.0 (all flows)
- API Key authentication
- JWT tokens
- Basic authentication
- Custom headers
- mTLS certificates
Security Features
- Encrypted credential vault
- Automatic token refresh
- Credential rotation
- Scope management
- Multi-tenant isolation
- Audit logging
Rate Limiter
Sophisticated rate limiting to protect your APIs and manage costs:
// Rate limiting configuration
{
  "rateLimits": {
    "global": {
      "requests": 10000,
      "window": "1h"
    },
    "perTenant": {
      "requests": 1000,
      "window": "1h",
      "burst": 50
    },
    "perEndpoint": {
      "/services/crm/contacts": {
        "requests": 100,
        "window": "1m"
      },
      "/services/ai/generate": {
        "requests": 10,
        "window": "1m",
        "costBased": true,
        "maxCostPerHour": 100.00
      }
    }
  }
}
- Token bucket algorithm
- Sliding window counters
- Distributed rate limiting
- Cost-based throttling
- Priority queuing
- Graceful degradation
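To make the token bucket algorithm listed above concrete, here is a minimal single-process sketch. The class name `TokenBucket` and its parameters are hypothetical, and mapping `burst` to the bucket capacity and `requests` per `window` to the refill rate is an assumption about how the configuration would be interpreted. A distributed limiter would keep this state in a shared store rather than in memory.

```python
import time


class TokenBucket:
    """Token bucket: refills continuously at `rate_per_sec`, holds at most `capacity`."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self, cost=1.0):
        """Consume `cost` tokens if available; return whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        # Add the tokens earned since the last call, capped at the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Under the assumed mapping, the `perTenant` limit would correspond to `TokenBucket(rate_per_sec=1000 / 3600, capacity=50)`. The `cost` argument gives one possible hook for cost-based throttling, where expensive calls consume more than one token.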
Intelligent Cache Layer
Multi-tier caching that is transparent to AI agents:
L1 Cache - Edge (CDN)
Static resources, schemas, metadata • TTL: 1-24 hours
L2 Cache - Redis Cluster
API responses, session data • TTL: 5-60 minutes
L3 Cache - Application
Computed results, aggregations • TTL: 1-5 minutes
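The tiered lookup can be sketched as a read-through cache that checks each tier in order and fills the earlier tiers on a hit. This is a simplification under stated assumptions: the `Tier` class and `cached_fetch` function are hypothetical, in-memory dictionaries stand in for the CDN, Redis, and application caches, and every tier is treated as able to hold any value, whereas the gateway assigns different kinds of data to each tier.

```python
import time


class Tier:
    """In-memory stand-in for one cache tier with a fixed TTL."""

    def __init__(self, name, ttl_seconds, clock=time.monotonic):
        self.name = name
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # drop the expired entry
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)


def cached_fetch(tiers, key, fetch):
    """Return (value, source). `None` is treated as a miss, so it cannot be cached."""
    for i, tier in enumerate(tiers):
        value = tier.get(key)
        if value is not None:
            # Promote the value into the tiers that were checked before this one.
            for earlier in tiers[:i]:
                earlier.set(key, value)
            return value, tier.name
    value = fetch(key)  # miss in every tier: go to the backend
    for tier in tiers:
        tier.set(key, value)
    return value, "origin"
```

For example, `tiers = [Tier("edge", 3600), Tier("redis", 300), Tier("app", 60)]` uses TTLs picked from the ranges listed above.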
Streaming & Real-time Support
The gateway supports multiple streaming protocols for real-time AI interactions:
Server-Sent Events (SSE)
Primary protocol for MCP streaming responses
GET /services/chat/stream
Accept: text/event-stream
data: {"type":"delta","content":"Hello"}
data: {"type":"delta","content":" world"}
data: {"type":"done","tokens":15}
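A client consuming this stream might parse it as in the sketch below. The function names `iter_sse_events` and `collect_text` are hypothetical. The sketch assumes each event carries a JSON object in its `data:` field and that events are separated by blank lines, as the Server-Sent Events format requires; it reads from any iterable of text lines, so the HTTP transport is left out.

```python
import json


def iter_sse_events(lines):
    """Yield the decoded JSON payload of each SSE event from an iterable of lines."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "data":
            data_parts.append(value)


def collect_text(lines):
    """Concatenate `delta` content until a `done` event arrives."""
    text = []
    for event in iter_sse_events(lines):
        if event.get("type") == "delta":
            text.append(event["content"])
        elif event.get("type") == "done":
            break
    return "".join(text)
```

Fields other than `data` (such as `event`, `id`, and `retry`) are ignored here for brevity.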
WebSocket
Bidirectional communication for interactive sessions
ws://gateway.mcpify.org/ws
{
  "type": "subscribe",
  "channels": ["updates", "alerts"]
}
Monitoring & Analytics
Real-time Metrics
- Request volume
- Response times
- Error rates
- Cache hit ratio
Usage Analytics
- API usage by agent
- Token consumption
- Cost attribution
- Popular endpoints
Health Monitoring
- Service availability
- Circuit breaker status
- Queue depths
- Resource utilization
Deployment Options
🌐 Cloud (Default)
Fully managed multi-tenant gateway hosted by MCPify
- No infrastructure to manage
- Automatic scaling and updates
- Global edge locations
- 99.99% SLA
🏢 Private Cloud
Dedicated gateway instance in your cloud account
- Complete data isolation
- Custom configuration
- VPC peering support
- Managed by MCPify
🏠 On-Premise
Self-hosted gateway in your data center
- Full control and customization
- Air-gapped deployment option
- License-based pricing
- Enterprise support
Performance Characteristics
Latency
- P50: < 10 ms overhead
- P95: < 25 ms overhead
- P99: < 50 ms overhead
- Cache hits: < 5 ms
Throughput
- 100K+ requests/second
- 10K+ concurrent connections
- 1M+ cached responses/second
- Auto-scaling to demand