Gateway Architecture

The multi-tenant gateway that powers MCPify's AI integrations at scale

The MCPify Gateway is a high-performance, multi-tenant gateway that sits between AI agents and your APIs. It handles authentication, routing, rate limiting, caching, and monitoring while providing a unified interface for all your MCP services.

Architecture Overview

AI Agents Layer

Claude, GPT-5, Custom Agents

LLMs and AI agents connect to the gateway via the MCP protocol

↓ MCP Protocol (JSON-RPC over SSE/HTTP)

MCPify Gateway

Core Infrastructure
• Authentication & Authorization
• Request Routing & Load Balancing
• Rate Limiting & Throttling
• Response Caching
• Monitoring & Analytics
• Error Handling & Retries
↓ REST/GraphQL/SOAP

Your APIs

External Services

CRM, ERP, Databases, Custom APIs, Third-party services

Core Components

Request Router

Intelligent routing based on service discovery and load balancing:

// Routing configuration example
{
  "routes": [
    {
      "pattern": "/services/crm/*",
      "backend": "https://api.crm.example.com",
      "loadBalancing": "round-robin",
      "healthCheck": "/health",
      "timeout": 30000
    },
    {
      "pattern": "/services/erp/*",
      "backend": ["https://erp1.example.com", "https://erp2.example.com"],
      "loadBalancing": "least-connections",
      "circuitBreaker": {
        "threshold": 5,
        "timeout": 60000
      }
    }
  ]
}
  • Service discovery and registration
  • Multiple load balancing strategies
  • Circuit breaker for fault tolerance
  • Automatic failover and retry logic
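
The routing behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the gateway's actual implementation: the `Route`, `CircuitBreaker`, and `find_route` names are hypothetical, glob-style matching of `pattern` is an assumption, and the circuit breaker's `threshold` and `timeout` are assumed to mean "consecutive failures" and "cool-down in milliseconds".

```python
import fnmatch
import itertools
import time


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; permits a trial after `timeout_ms`."""

    def __init__(self, threshold, timeout_ms, clock=time.monotonic):
        self.threshold = threshold
        self.timeout_s = timeout_ms / 1000.0
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        if self.opened_at is None:
            return True
        # Once the cool-down has elapsed, let a trial request through (half-open).
        return self.clock() - self.opened_at >= self.timeout_s

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()


class Route:
    """One entry of the "routes" array (hypothetical in-memory form)."""

    def __init__(self, config):
        self.pattern = config["pattern"]
        backend = config["backend"]
        # The sample config uses both a single string and a list for "backend".
        self.backends = [backend] if isinstance(backend, str) else list(backend)
        self.strategy = config.get("loadBalancing", "round-robin")
        self._cycle = itertools.cycle(self.backends)
        self.active = {b: 0 for b in self.backends}  # in-flight requests per backend

    def matches(self, path):
        # Assumption: patterns are shell-style globs, so "*" matches any suffix.
        return fnmatch.fnmatchcase(path, self.pattern)

    def pick_backend(self):
        if self.strategy == "least-connections":
            return min(self.backends, key=lambda b: self.active[b])
        return next(self._cycle)


def find_route(routes, path):
    """Return the first route whose pattern matches the path, or None."""
    return next((r for r in routes if r.matches(path)), None)
```

A caller would look up the route for an incoming path, check the breaker's `allow()` before forwarding, increment and decrement `active` around the upstream call, and report the outcome with `record_success()` or `record_failure()`.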

Authentication Manager

Centralized authentication and credential management:

Supported Auth Types

  • OAuth 2.0 (all flows)
  • API Key authentication
  • JWT tokens
  • Basic authentication
  • Custom headers
  • mTLS certificates

Security Features

  • Encrypted credential vault
  • Automatic token refresh
  • Credential rotation
  • Scope management
  • Multi-tenant isolation
  • Audit logging
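
To make "automatic token refresh" and "multi-tenant isolation" concrete, here is a small sketch of a per-tenant token cache that fetches a new token shortly before the old one expires. Everything in it is an assumption for illustration: `TokenCache`, the `fetch_token` callback, and the 60-second refresh margin are hypothetical and do not describe MCPify's credential vault.

```python
import time


class TokenCache:
    """Caches one access token per (tenant, service) and refreshes it before expiry."""

    def __init__(self, fetch_token, refresh_margin_s=60, clock=time.time):
        # Hypothetical callback: (tenant, service) -> (token, expires_in_seconds)
        self.fetch_token = fetch_token
        self.refresh_margin_s = refresh_margin_s
        self.clock = clock
        self._tokens = {}  # (tenant, service) -> (token, expires_at)

    def get(self, tenant, service):
        # Keying on the tenant means one tenant never receives another's token.
        key = (tenant, service)
        cached = self._tokens.get(key)
        if cached and self.clock() < cached[1] - self.refresh_margin_s:
            return cached[0]
        # Missing or about to expire: fetch a fresh token and remember its expiry.
        token, expires_in = self.fetch_token(tenant, service)
        self._tokens[key] = (token, self.clock() + expires_in)
        return token
```

In practice `fetch_token` would wrap whichever flow the backend requires, such as an OAuth 2.0 refresh-token or client-credentials exchange.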

Rate Limiter

Rate limiting at global, per-tenant, and per-endpoint scope to protect your APIs and manage costs:

// Rate limiting configuration
{
  "rateLimits": {
    "global": {
      "requests": 10000,
      "window": "1h"
    },
    "perTenant": {
      "requests": 1000,
      "window": "1h",
      "burst": 50
    },
    "perEndpoint": {
      "/services/crm/contacts": {
        "requests": 100,
        "window": "1m"
      },
      "/services/ai/generate": {
        "requests": 10,
        "window": "1m",
        "costBased": true,
        "maxCostPerHour": 100.00
      }
    }
  }
}
  • Token bucket algorithm
  • Sliding window counters
  • Distributed rate limiting
  • Cost-based throttling
  • Priority queuing
  • Graceful degradation
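
As an illustration of the token bucket algorithm listed above, the sketch below builds a bucket from one rate-limit entry. It assumes that `burst` sets the bucket capacity and that `requests` per `window` sets the refill rate; the gateway may interpret these fields differently, and `TokenBucket`, `parse_window`, and `bucket_from_config` are hypothetical names.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained rate of `refill_per_s`."""

    def __init__(self, capacity, refill_per_s, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Add the tokens earned since the last call, capped at the capacity.
        earned = (now - self.updated) * self.refill_per_s
        self.tokens = min(self.capacity, self.tokens + earned)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def parse_window(window):
    """Convert a window such as "1m" or "1h" to seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(window[:-1]) * units[window[-1]]


def bucket_from_config(limit, clock=time.monotonic):
    """Build a bucket from an entry like {"requests": 1000, "window": "1h", "burst": 50}."""
    capacity = limit.get("burst", limit["requests"])  # assumption: burst = capacity
    rate = limit["requests"] / parse_window(limit["window"])
    return TokenBucket(capacity, rate, clock)
```

The `cost` argument gives one possible hook for cost-based throttling: an expensive call can consume more than one token.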

Intelligent Cache Layer

Multi-tier caching that is transparent to AI agents:

L1 Cache - Edge (CDN)

Static resources, schemas, metadata • TTL: 1-24 hours

L2 Cache - Redis Cluster

API responses, session data • TTL: 5-60 minutes

L3 Cache - Application

Computed results, aggregations • TTL: 1-5 minutes
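
One way to picture a tiered lookup is the sketch below, which checks each tier in order and falls through to the backend on a miss. It is a simplification under stated assumptions: the tiers are modelled as in-process dictionaries rather than a CDN or Redis, it assumes every tier may hold the same key (whereas the tiers above store different kinds of data), and `TTLCache` and `tiered_get` are hypothetical names.

```python
import time


class TTLCache:
    """A dictionary whose entries expire `ttl_s` seconds after being set."""

    def __init__(self, ttl_s, clock=time.monotonic):
        self.ttl_s = ttl_s
        self.clock = clock
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._data[key]  # drop the stale entry
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, self.clock() + self.ttl_s)


def tiered_get(tiers, key, load):
    """Check tiers in order; on a hit, backfill the earlier tiers that missed."""
    for i, tier in enumerate(tiers):
        value = tier.get(key)
        if value is not None:
            for earlier in tiers[:i]:
                earlier.set(key, value)
            return value
    # Every tier missed: call the backend and populate all tiers.
    value = load(key)
    for tier in tiers:
        tier.set(key, value)
    return value
```

Because the caller only ever invokes `tiered_get`, it cannot tell which tier served the response, which is the sense in which the cache is transparent.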

Streaming & Real-time Support

The gateway supports multiple streaming protocols for real-time AI interactions:

Server-Sent Events (SSE)

Primary protocol for MCP streaming responses

GET /services/chat/stream
Accept: text/event-stream

data: {"type":"delta","content":"Hello"}

data: {"type":"delta","content":" world"}

data: {"type":"done","tokens":15}
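
In the SSE wire format each event ends with a blank line, and consecutive `data:` lines belong to the same event. A client could decode a stream like the one above along these lines. This is a sketch only: `parse_sse` and `collect_text` are hypothetical helpers, and the `type`, `content`, and `tokens` fields are taken from the example payloads rather than from a documented schema.

```python
import json


def parse_sse(lines):
    """Yield the decoded JSON payload of each event in an iterable of text lines."""
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield json.loads("\n".join(data))
                data = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The format strips a single leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and comments are ignored here.


def collect_text(lines):
    """Concatenate the content of all "delta" events."""
    return "".join(
        event["content"] for event in parse_sse(lines) if event.get("type") == "delta"
    )
```

An event that is still pending when the stream ends without a closing blank line is dropped, which matches how browsers treat an incomplete event.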

WebSocket

Bidirectional communication for interactive sessions

ws://gateway.mcpify.org/ws
{
  "type": "subscribe",
  "channels": ["updates", "alerts"]
}

Monitoring & Analytics

Real-time Metrics

  • Request volume
  • Response times
  • Error rates
  • Cache hit ratio
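
For readers who want to reproduce these metrics from their own request logs, the sketch below derives them from a batch of records. The `RequestRecord` shape is hypothetical, counting only 5xx statuses as errors is an assumption, and the nearest-rank percentile method is one common choice rather than necessarily what the gateway's dashboards use.

```python
import math
from collections import namedtuple

# Hypothetical log record: latency in ms, HTTP status, and whether the cache served it.
RequestRecord = namedtuple("RequestRecord", ["latency_ms", "status", "cache_hit"])


def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list, for 0 < p <= 100."""
    rank = math.ceil(p * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(records):
    """Compute request volume, latency percentiles, error rate, and cache hit ratio."""
    latencies = sorted(r.latency_ms for r in records)
    total = len(records)
    return {
        "requests": total,
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
        "error_rate": sum(r.status >= 500 for r in records) / total,
        "cache_hit_ratio": sum(r.cache_hit for r in records) / total,
    }
```

`summarize` expects a non-empty list of records; a production collector would more likely use streaming histograms than sort every latency.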

Usage Analytics

  • API usage by agent
  • Token consumption
  • Cost attribution
  • Popular endpoints

Health Monitoring

  • Service availability
  • Circuit breaker status
  • Queue depths
  • Resource utilization

Deployment Options

🌐 Cloud (Default)

Fully managed multi-tenant gateway hosted by MCPify

  • No infrastructure to manage
  • Automatic scaling and updates
  • Global edge locations
  • 99.99% SLA

🏢 Private Cloud

Dedicated gateway instance in your cloud account

  • Complete data isolation
  • Custom configuration
  • VPC peering support
  • Managed by MCPify

🏠 On-Premise

Self-hosted gateway in your data center

  • Full control and customization
  • Air-gapped deployment option
  • License-based pricing
  • Enterprise support

Performance Characteristics

Latency

  • P50: < 10ms overhead
  • P95: < 25ms overhead
  • P99: < 50ms overhead
  • Cache hits: < 5ms

Throughput

  • 100K+ requests/second
  • 10K+ concurrent connections
  • 1M+ cached responses/second
  • Auto-scaling to demand
