Securing AI Knowledge Access: HTTP MCP Server Architecture for Enterprise
The AI Security Problem Nobody Talks About
AI assistants fundamentally change how we think about data access security. Traditional APIs serve deterministic requests with predictable outputs. AI assistants, however, synthesize information across multiple sources, cache context in conversation history, and may inadvertently leak sensitive data through generated responses.
When we deployed STDIO MCP servers for local development, security was straightforward: the server inherited the user’s desktop credentials, ran in their security context, and accessed only what they could access. But AI security challenges emerged immediately:
Conversation History Leakage: AI assistants maintain conversation context. If an assistant retrieves sensitive data in one turn, that context may persist across subsequent queries, even when discussing unrelated topics.
Cross-Tenant Data Bleed: Without per-request authentication, STDIO servers grant blanket access. An AI assistant authenticated for one team could theoretically access another team’s data if the server doesn’t enforce request-level authorization.
Credential Inheritance Risks: Desktop credentials include cached tokens, SSH keys, and environment variables. STDIO servers inherit everything, creating an unnecessarily broad attack surface.
HTTP MCP servers with a proper authentication architecture solve these problems, but the implementation details matter. Here is what works in production.
Security-First Transport Selection
The choice between STDIO and HTTP transport directly impacts your security posture with AI assistants:
STDIO Security Model:
- Credential Scope: Inherits all user environment credentials (broad access)
- Authentication: One-time at process launch (session-based)
- Audit Trail: Limited to process lifecycle logs
- Data Isolation: Shared memory space with AI assistant process
- Token Management: Uses cached desktop credentials (no rotation)
HTTP Security Model:
- Credential Scope: Bearer token per request (minimal privilege)
- Authentication: Every request validated independently (stateless)
- Audit Trail: Complete request/response logging with user context
- Data Isolation: Network boundary between AI assistant and data
- Token Management: Short-lived tokens with automatic rotation
For production AI knowledge systems, HTTP transport provides defense-in-depth that STDIO cannot match. The architectural pattern enforces zero-trust principles: authenticate every request, authorize every action, audit every access.
Decision Framework:
Use STDIO when:
- Local development with personal knowledge bases
- Single-user desktop workflows
- Trusted environment without sensitive data
- Sub-millisecond latency is critical
Use HTTP when:
- Shared corporate knowledge bases
- Multi-tenant AI assistant deployments
- Sensitive data requiring access control
- Compliance demands audit trails (SOC 2, ISO 27001, GDPR)
- AI assistants running in untrusted environments
MCP HTTP Lifecycle and Transport Patterns
Official MCP Lifecycle (Source: MCP Specification)
Every HTTP MCP server follows this three-phase lifecycle. Understanding where security controls apply in each phase is critical:
sequenceDiagram
participant Client as AI Assistant
participant Server as HTTP MCP Server
Note over Client,Server: Phase 1: Initialization
Client->>Server: initialize request (protocol version, capabilities)
Server->>Client: initialize response (server capabilities, session ID)
Client->>Server: initialized notification
Note over Client,Server: Phase 2: Operation
loop Message Exchange
Client->>Server: Tool requests, notifications
Server->>Client: Tool responses, server requests
end
Note over Client,Server: Phase 3: Shutdown
Client->>Server: Close HTTP connection(s)
Security Enhancement Points:
- Phase 1 (Initialization): MISE validates OAuth token, establishes session ID, negotiates security capabilities
- Phase 2 (Operation): Every tool request re-validates authentication, checks scopes, logs user context
- Phase 3 (Shutdown): Audit logs finalize, session terminates, tokens invalidated
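As a concrete example of the Phase 1 session control described above, here is a minimal sketch of issuing a session ID at initialization time. It assumes you manage session IDs yourself rather than relying on the C# SDK's built-in session handling; the helper name is illustrative.
using System.Security.Cryptography;
using Microsoft.AspNetCore.Http;

// Sketch: issue a cryptographically random session ID during Phase 1 and
// return it to the client via the Mcp-Session-Id header.
public static class McpSessionIssuer
{
    public static string Issue(HttpResponse response)
    {
        // 256 bits of entropy; opaque to the client, validated on every later request
        var sessionId = Convert.ToBase64String(RandomNumberGenerator.GetBytes(32));
        response.Headers.Append("Mcp-Session-Id", sessionId);
        return sessionId;
    }
}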
HTTP Streamable Transport Pattern (Source: MCP Specification)
The official MCP HTTP transport supports both simple POST/response and SSE streaming:
sequenceDiagram
participant Client as AI Assistant
participant Server as HTTP MCP Server
Note over Client,Server: Initialization
Client->>Server: POST /mcp (InitializeRequest)
Server->>Client: InitializeResponse + MCP-Session-Id header
Client->>Server: POST /mcp (InitializedNotification) + Session-Id
Server->>Client: 202 Accepted
Note over Client,Server: Client Requests (Two Patterns)
alt Single HTTP Response
Client->>Server: POST /mcp (tool request) + Session-Id
Server->>Client: JSON response (tool result)
else Server Opens SSE Stream
Client->>Server: POST /mcp (tool request) + Session-Id
Server->>Client: SSE stream starts
loop While Connection Open
Server->>Client: SSE event: progress update
Server->>Client: SSE event: partial results
end
Server->>Client: SSE event: final response
Note over Server: Stream terminates
end
Note over Client,Server: Server Notifications/Requests
Client->>Server: GET /mcp + Session-Id
loop While Connection Open
Server->>Client: SSE messages from server
end
Security Implications:
- MCP-Session-Id Header: Cryptographically secure session ID (JWT/UUID) prevents session hijacking. Validated on every request (see the validation sketch below).
- SSE Stream Security: Each SSE event includes user context for audit. If token expires mid-stream, stream terminates gracefully without exposing error details.
- Multiple Connection Handling: Each stream gets unique event IDs. Server never broadcasts same message across streams (prevents cross-user data leakage).
- Resumability: Last-Event-ID header allows stream resumption after network failures without replaying events from other users' streams.
Reference: MCP Transports Specification
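A minimal sketch of that per-request session check, assuming a hypothetical ISessionStore that maps session IDs to their owning user. The MCP C# SDK can manage sessions for you, so treat this as an illustration of the control rather than required code.
using Microsoft.AspNetCore.Http;

// Hypothetical session store abstraction (not part of the SDK).
public interface ISessionStore
{
    bool TryGetOwner(string sessionId, out string userId);
}

public class McpSessionValidationMiddleware(RequestDelegate next, ISessionStore sessions)
{
    public async Task InvokeAsync(HttpContext context)
    {
        if (context.Request.Path.StartsWithSegments("/mcp") &&
            context.Request.Headers.TryGetValue("Mcp-Session-Id", out var values))
        {
            var sessionId = values.ToString();

            // Reject unknown session IDs and IDs that belong to a different
            // authenticated user (prevents session hijacking).
            if (!sessions.TryGetOwner(sessionId, out var ownerId) ||
                ownerId != context.User.FindFirst("oid")?.Value)
            {
                // 404 signals an unknown or terminated session to the client
                context.Response.StatusCode = StatusCodes.Status404NotFound;
                return;
            }
        }

        await next(context);
    }
}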
Defense-in-Depth Authentication for AI Assistants
AI assistants introduce authentication challenges that traditional APIs don’t face. An AI assistant making dozens of knowledge base queries per minute needs:
Stateless Token Validation: Each request must be independently authenticated. Session-based auth (cookies, sticky sessions) fails with distributed AI assistant instances.
Scope-Based Authorization: AI assistants should access only the minimum data needed. Blanket “read all” permissions create data leakage risks when assistants synthesize responses.
Token Expiration Enforcement: Long-lived tokens cached by AI assistants become security liabilities. Short-lived tokens (15-60 minutes) with automatic rotation prevent credential theft impact.
Audit Trail with User Context: When an AI assistant retrieves sensitive data, security teams need to know which human user initiated the request, not just “AI assistant accessed database.”
MCP Authorization Specification: The official MCP authorization spec defines OAuth 2.1 requirements but leaves implementation details to server builders. This creates security gaps that production servers must address.
Production-Tested Authentication Architecture
The knowledge-mcp server implements a layered security model:
┌─────────────────────────────────────────────────────────┐
│ Layer 1: Network Security (HTTPS, TLS 1.3) │
├─────────────────────────────────────────────────────────┤
│ Layer 2: OAuth 2.1 Token Validation (Azure AD) │
│ • Validates JWT signature │
│ • Checks token expiration (15-minute window) │
│ • Verifies issuer and audience │
├─────────────────────────────────────────────────────────┤
│ Layer 3: Scope-Based Authorization │
│ • Required: knowledge_read (basic search) │
│ • Optional: knowledge_admin (management functions) │
├─────────────────────────────────────────────────────────┤
│ Layer 4: Request Validation (JSON-RPC 2.0) │
│ • Protocol compliance checks │
│ • Parameter type validation │
│ • Malicious payload detection │
├─────────────────────────────────────────────────────────┤
│ Layer 5: Audit Logging (User + Tool + Data Access) │
│ • User identity (UPN, object ID) │
│ • Tool invoked (search_knowledge, get_document) │
│ • Data accessed (document IDs, query terms) │
│ • Response metadata (result count, sensitivity level) │
└─────────────────────────────────────────────────────────┘
Why This Architecture Prevents AI Data Leaks:
- Per-Request Token Validation: Even if an AI assistant caches a token, it expires within 15 minutes. Stolen tokens have a minimal exploitation window.
- Scope Isolation: AI assistants searching documentation (knowledge_read) cannot access administrative functions (knowledge_admin). Compromised assistants can't escalate privileges (see the policy sketch below).
- User Context Preservation: Audit logs record the human user behind every AI assistant request. Security teams can trace data access back to the initiating user, not just "AI made a request."
- No Ambient Authority: Unlike STDIO servers that inherit desktop credentials, HTTP servers grant zero implicit permissions. Every capability must be explicitly authorized via token scopes.
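One way to wire the scope isolation above into ASP.NET Core is an authorization policy per scope. A minimal sketch, assuming scopes arrive in the standard scp claim (which can contain multiple space-separated values); the policy names are illustrative.
using System.Linq;
using System.Security.Claims;

// The scp claim is space-separated ("knowledge_read knowledge_admin"), so an
// exact claim-value match would break when a token carries multiple scopes.
static bool HasScope(ClaimsPrincipal user, string scope) =>
    (user.FindFirst("scp")?.Value ?? string.Empty)
        .Split(' ', StringSplitOptions.RemoveEmptyEntries)
        .Contains(scope);

builder.Services.AddAuthorization(options =>
{
    options.AddPolicy("KnowledgeRead", policy =>
        policy.RequireAssertion(ctx => HasScope(ctx.User, "knowledge_read")));

    options.AddPolicy("KnowledgeAdmin", policy =>
        policy.RequireAssertion(ctx => HasScope(ctx.User, "knowledge_admin")));
});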
Implementation: MISE vs Manual OAuth 2.1
Building OAuth 2.1 validation manually in Python required 400+ lines of code for:
- JWT signature validation
- Token expiration checks
- Issuer/audience verification
- Refresh token handling
- Conditional access policy enforcement
- Multi-tenant tenant ID validation
Using C# with MISE (Microsoft Identity Service Essentials) reduces this to a single configuration block:
services.AddAuthentication(JwtBearerDefaults.AuthenticationScheme)
.AddMicrosoftIdentityWebApi(configuration.GetSection("AzureAd"));
MISE provides:
- 20+ years of security hardening: Edge cases, attack vectors, and compliance requirements built-in
- Automatic token validation: Signature, expiration, issuer, audience checked on every request
- Conditional access integration: Device compliance, IP restrictions, MFA enforcement
- Audit logging: User context automatically captured and logged
- Token cache management: Secure storage, automatic rotation, multi-tenant isolation
For production AI knowledge systems handling sensitive data, the maturity difference matters. Custom OAuth implementations miss edge cases that attackers exploit.
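For context, this is a sketch of just the core token checks a manual implementation has to get right, using Microsoft.IdentityModel directly; tenantId, signingKeys, and bearerToken are assumed inputs, and this still omits key rollover, conditional access, and multi-tenant issuer handling.
using Microsoft.IdentityModel.Tokens;
using System.IdentityModel.Tokens.Jwt;

// Signature, lifetime, issuer, and audience validation — the part MISE handles for you.
var validationParameters = new TokenValidationParameters
{
    ValidIssuer = $"https://login.microsoftonline.com/{tenantId}/v2.0",
    ValidAudience = "api://knowledge-mcp",   // illustrative audience
    IssuerSigningKeys = signingKeys,         // keys fetched from the OIDC metadata endpoint
    ValidateIssuerSigningKey = true,
    ValidateLifetime = true,
    ClockSkew = TimeSpan.FromMinutes(1)
};

var principal = new JwtSecurityTokenHandler()
    .ValidateToken(bearerToken, validationParameters, out _);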
Protocol Compliance as Security Control
JSON-RPC 2.0 validation isn’t just about protocol compliance. It’s a critical security boundary preventing malicious payloads from reaching your knowledge base.
Attack Vectors JSON-RPC Middleware Prevents:
- Parameter Injection: Malformed parameters attempting SQL injection, command injection, or path traversal
- Method Enumeration: Probing for undocumented methods or internal APIs
- Batch Request Amplification: Sending thousands of requests in a single batch to cause denial-of-service
- Type Confusion Attacks: Sending strings where integers expected to trigger parsing errors and expose stack traces
- Oversized Payloads: Multi-megabyte requests designed to exhaust memory
Production Validation Pipeline:
public class JsonRpcMiddleware
{
    private readonly RequestDelegate next;
    private readonly IJsonRpcValidationPipeline validationPipeline;
    private readonly ILogger<JsonRpcMiddleware> logger;

    public JsonRpcMiddleware(
        RequestDelegate next,
        IJsonRpcValidationPipeline validationPipeline,
        ILogger<JsonRpcMiddleware> logger)
    {
        this.next = next;
        this.validationPipeline = validationPipeline;
        this.logger = logger;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        if (context.Request.Path.StartsWithSegments("/mcp"))
        {
            // Enable buffering for multiple reads (validation + handler)
            context.Request.EnableBuffering();

            // Validate payload before processing
            var result = await validationPipeline.ValidateAsync(context);
            if (!result.IsValid)
            {
                // Return JSON-RPC error, don't expose internal details
                context.Response.ContentType = "application/json";
                await context.Response.WriteAsync(result.JsonRpcError);

                // Log for security monitoring
                logger.LogWarning(
                    "JSON-RPC validation failed: {Error}, User: {User}",
                    result.ErrorCode,
                    context.User.Identity?.Name
                );
                return;
            }

            // Reset the buffered stream so the MCP handler reads the full body
            context.Request.Body.Position = 0;
        }

        await next(context);
    }
}
Critical Implementation Pattern: Request Buffering
HTTP request bodies are forward-only streams. Reading them consumes the stream, leaving nothing for downstream handlers. The EnableBuffering() call makes the stream seekable, allowing the validation middleware to read it and then reset the position for the MCP handler.
Without buffering:
1. Middleware reads body → Stream position = end
2. MCP handler reads body → Gets empty stream → Silent failure
With buffering:
1. EnableBuffering() → Stream becomes seekable
2. Middleware reads body → Validates
3. Middleware sets Position = 0 → Resets stream
4. MCP handler reads body → Gets full content
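A minimal sketch of that sequence as it appears inside the validation step (leaveOpen keeps the request stream usable after the read):
context.Request.EnableBuffering();                  // 1. Make the body seekable

using var reader = new StreamReader(
    context.Request.Body, leaveOpen: true);         // Don't dispose the request stream
var body = await reader.ReadToEndAsync();           // 2. Read the payload for validation

// ... run JSON-RPC validation against `body` here ...

context.Request.Body.Position = 0;                  // 3. Reset the stream so that
                                                    // 4. the MCP handler gets full content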
Validation Checks Implemented:
- JSON Structure: Valid JSON syntax, no truncated payloads
- Required Fields: jsonrpc: "2.0", method, and id present
- Method Allowlist: Only registered MCP methods allowed (prevents enumeration)
- Parameter Types: Schema validation for each tool’s parameters
- Batch Limits: Maximum 10 requests per batch (prevents amplification attacks)
- Payload Size: 1MB limit (prevents memory exhaustion)
These validations stop malformed requests before they reach your business logic, shrinking the attack surface exposed to your tools.
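A simplified sketch of the first few checks (structure, required fields, method allowlist, payload size); the allowlist contents and error codes are illustrative rather than the full IJsonRpcValidationPipeline.
using System.Text.Json;

public static class JsonRpcBasicValidator
{
    // Only methods the server actually registers; anything else is rejected.
    private static readonly HashSet<string> AllowedMethods = new()
    {
        "initialize", "notifications/initialized", "ping", "tools/list", "tools/call"
    };

    // Returns null when valid, otherwise a short error code for the JSON-RPC error response.
    public static string? Validate(string body)
    {
        if (body.Length > 1_048_576)
            return "payload_too_large";                        // 1 MB limit

        JsonDocument doc;
        try { doc = JsonDocument.Parse(body); }
        catch (JsonException) { return "parse_error"; }        // invalid or truncated JSON

        using (doc)
        {
            var root = doc.RootElement;

            if (!root.TryGetProperty("jsonrpc", out var version) ||
                version.GetString() != "2.0")
                return "invalid_request";                      // jsonrpc must be "2.0"

            if (!root.TryGetProperty("method", out var method) ||
                !AllowedMethods.Contains(method.GetString() ?? string.Empty))
                return "method_not_found";                     // enforce the allowlist

            return null;
        }
    }
}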
Streaming Responses and Data Leak Prevention
Server-Sent Events (SSE) for streaming MCP responses introduces both usability benefits and security considerations that don’t exist with traditional request-response patterns.
Security Challenge: Partial Response Exposure
With synchronous HTTP responses, either the entire response succeeds (200 OK) or fails (401/403/500). Authentication failures prevent any data transmission. SSE streaming, however, can begin sending data before completing full authorization checks:
Scenario: AI assistant queries "show me customer PII data"
1. SSE connection established (200 OK)
2. Start streaming: "Found 1000 customer records..."
3. Mid-stream authorization check: User lacks PII access
4. Connection terminated
5. Result: AI assistant received partial data before termination
Secure Streaming Pattern:
[HttpGet("/mcp/sse")]
public async Task StreamSearchResults(
string query,
CancellationToken cancellationToken)
{
// CRITICAL: Complete authorization BEFORE starting stream
var user = HttpContext.User;
if (!user.HasClaim("scp", "knowledge_read"))
{
return Forbid(); // Fails before any data sent
}
Response.Headers.Add("Content-Type", "text/event-stream");
Response.Headers.Add("Cache-Control", "no-cache");
Response.Headers.Add("X-Accel-Buffering", "no"); // Disable proxy buffering
await foreach (var result in knowledgeSearch.StreamResultsAsync(query))
{
// Each chunk includes user context for audit
var eventData = new
{
Type = "search_result",
Result = result,
UserId = user.FindFirst("oid")?.Value, // Azure AD object ID
Timestamp = DateTime.UtcNow
};
await Response.WriteAsync(
$"data: {JsonSerializer.Serialize(eventData)}\n\n",
cancellationToken
);
await Response.Body.FlushAsync(cancellationToken);
}
}
Security Controls for Streaming:
- Pre-Stream Authorization: Verify all permissions before sending first byte
- Per-Chunk Audit Logging: Log each streamed chunk with user context (not just initial request)
- Timeout Enforcement: Set maximum stream duration (5-10 minutes) to prevent connection exhaustion
- Graceful Degradation: If authorization changes mid-stream (token expires), terminate cleanly without exposing error details
- No Buffering Headers: X-Accel-Buffering: no prevents proxies from caching sensitive data
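The timeout control is the easiest to overlook. A sketch of enforcing a maximum stream duration by linking a timeout token to the request's abort token; the 5-minute limit and the StreamResultsAsync source are assumptions.
// Terminate the SSE stream after 5 minutes even if the client stays connected.
using var timeout = new CancellationTokenSource(TimeSpan.FromMinutes(5));
using var linked = CancellationTokenSource.CreateLinkedTokenSource(
    HttpContext.RequestAborted, timeout.Token);

await foreach (var result in knowledgeSearch
    .StreamResultsAsync(query)
    .WithCancellation(linked.Token))
{
    await Response.WriteAsync(
        $"data: {JsonSerializer.Serialize(result)}\n\n", linked.Token);
    await Response.Body.FlushAsync(linked.Token);
}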
When Streaming Increases Risk:
SSE streaming is appropriate for progress updates (“Searching 1000 documents, 25% complete”), but NOT for actual data transmission when:
- Data contains PII or confidential information
- User permissions might change during query execution
- Audit requirements demand complete request/response pairs
- Network reliability is poor (partial responses harder to retry)
When Streaming Reduces Risk:
Streaming actually improves security for:
- Large result sets (prevents memory exhaustion from buffering full response)
- Long-running queries (provides liveness without keeping full response in memory)
- Real-time feedback (user knows request is processing, reducing duplicate retries)
The knowledge-mcp server uses SSE for progress notifications only, not actual data streaming. Search results return via standard POST responses after complete authorization validation.
Multi-Tenant Security Isolation
Deploying AI knowledge systems on Azure Container Apps introduces multi-tenant security challenges that require architectural decisions beyond basic authentication.
Tenant Isolation Attack Vectors:
- Cross-Tenant Query Injection: AI assistant from Tenant A crafts query attempting to access Tenant B’s documents
- Shared Cache Poisoning: Search result cache leaks Tenant A’s data to Tenant B
- Log Aggregation Exposure: Centralized logging exposes one tenant’s queries to another
- Resource Exhaustion: Tenant A’s excessive queries consume resources, degrading Tenant B’s service
Production Security Architecture:
┌──────────────────────────────────────────────────────────┐
│ Azure Container Apps (Ingress + TLS Termination) │
│ • Per-tenant subdomain: tenantA.knowledge-mcp.io │
│ • Or path-based: knowledge-mcp.io/tenants/tenantA │
└──────────────────────────────────────────────────────────┘
↓
┌──────────────────────────────────────────────────────────┐
│ Tenant Resolver Middleware │
│ • Extracts tenant ID from JWT claims │
│ • Validates tenant exists and is active │
│ • Rejects requests with invalid/suspended tenants │
└──────────────────────────────────────────────────────────┘
↓
┌──────────────────────────────────────────────────────────┐
│ Tenant Context Injection │
│ • Adds tenant ID to HttpContext.Items │
│ • All downstream services inherit tenant context │
│ • Ensures queries include "tenantId = X" filter │
└──────────────────────────────────────────────────────────┘
↓
┌──────────────────────────────────────────────────────────┐
│ Per-Tenant Data Isolation │
│ • Search queries: WHERE tenantId = {context.TenantId} │
│ • Cache keys: Include tenant ID prefix │
│ • Logs: Tenant ID in every structured log entry │
└──────────────────────────────────────────────────────────┘
Implementation Pattern:
public class TenantResolverMiddleware
{
    private readonly RequestDelegate next;
    private readonly ITenantService tenantService; // tenant lookup service (interface name assumed)

    public TenantResolverMiddleware(RequestDelegate next, ITenantService tenantService)
    {
        this.next = next;
        this.tenantService = tenantService;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        // Extract tenant ID from Azure AD token
        var tenantIdClaim = context.User.FindFirst("tid");
        if (tenantIdClaim == null)
        {
            context.Response.StatusCode = 401;
            await context.Response.WriteAsync(
                "Missing tenant ID claim in token"
            );
            return;
        }

        var tenantId = tenantIdClaim.Value;

        // Validate tenant is active (check database or cache)
        var tenant = await tenantService.GetTenantAsync(tenantId);
        if (tenant == null || !tenant.IsActive)
        {
            context.Response.StatusCode = 403;
            await context.Response.WriteAsync(
                "Tenant is not active or does not exist"
            );
            return;
        }

        // Inject tenant context for downstream services
        context.Items["TenantId"] = tenantId;
        context.Items["TenantContext"] = tenant;

        await next(context);
    }
}
// Usage in MCP tool
[McpServerTool]
public async Task<object> SearchKnowledgeAsync(string query)
{
// CRITICAL: Tenant context automatically injected
var tenantId = httpContextAccessor.HttpContext
.Items["TenantId"] as string;
// All queries MUST include tenant filter
var results = await searchService.SearchAsync(
query,
tenantId, // Prevents cross-tenant access
maxResults: 10
);
return results;
}
Tenant Isolation Enforcement:
- Database Level: Row-level security policies filter by tenant ID
- Search Index Level: Partition indexes by tenant for performance + isolation
- Cache Level: Key prefix includes tenant ID (tenant:A:query:hash); see the sketch after this list
- Logging Level: Every log entry tagged with tenant ID for audit
- Rate Limiting: Per-tenant quotas prevent resource monopolization
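A sketch of the tenant-prefixed cache key from the list above; hashing the query keeps keys bounded, and the format follows the tenant:{id}:query:{hash} convention.
using System.Security.Cryptography;
using System.Text;

public static class TenantCacheKeys
{
    // Produces keys like "tenant:A:query:3F2B..." so cached results can never
    // collide across tenants, even for identical queries.
    public static string ForQuery(string tenantId, string query)
    {
        var hash = Convert.ToHexString(
            SHA256.HashData(Encoding.UTF8.GetBytes(query.Trim().ToLowerInvariant())));
        return $"tenant:{tenantId}:query:{hash}";
    }
}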
Dynamic URL Resolution for Container Apps:
Azure Container Apps generate environment-specific URLs:
- Dev: knowledge-mcp-dev.region.azurecontainerapps.io
- Prod: knowledge-mcp.region.azurecontainerapps.io
Hardcoding URLs breaks OAuth redirect URIs. Production pattern:
public static string GetServerUrl(
IConfiguration config,
IHostEnvironment env)
{
// Check explicit configuration first
var configuredUrl = config["ServerUrl"];
if (!string.IsNullOrEmpty(configuredUrl))
{
return configuredUrl;
}
// Azure Container Apps set this environment variable
var containerAppUrl = Environment.GetEnvironmentVariable(
"CONTAINER_APP_URL"
);
if (!string.IsNullOrEmpty(containerAppUrl))
{
return containerAppUrl;
}
// Fallback to localhost for development
return env.IsDevelopment()
? "http://localhost:8080"
: throw new InvalidOperationException(
"Server URL not configured for production"
);
}
This ensures OAuth redirect URIs work across environments without manual configuration changes.
Secure Tool Design Patterns
MCP tools are the attack surface where AI assistants interact with your knowledge base. Security patterns at the tool level prevent data exfiltration attempts.
Tool Security Principles:
[McpServerToolType]
public class SearchKnowledgeTool(
IKnowledgeSearch knowledgeSearch,
IHttpContextAccessor httpContextAccessor,
ILogger<SearchKnowledgeTool> logger)
{
[McpServerTool(Idempotent = true, ReadOnly = true)]
[DisplayName("search_knowledge")]
[Description("Search knowledge base with security controls")]
public async Task<object> SearchAsync(
[Description("Search query (max 500 characters)")] string query,
[Description("Max results (1-20, default 5)")] int maxResults = 5)
{
// SECURITY: Parameter validation prevents injection attacks
if (string.IsNullOrWhiteSpace(query) || query.Length > 500)
{
return new { success = false, error = "Invalid query length" };
}
if (maxResults < 1 || maxResults > 20)
{
return new { success = false, error = "Invalid maxResults range" };
}
// SECURITY: Extract user context for audit logging
var userId = httpContextAccessor.HttpContext?.User
.FindFirst("oid")?.Value ?? "unknown";
var tenantId = httpContextAccessor.HttpContext?.Items["TenantId"] as string;
logger.LogInformation(
"Tool: search_knowledge invoked | User: {UserId} | Tenant: {TenantId} | Query: {Query}",
userId, tenantId, query
);
try
{
// SECURITY: Tenant context automatically filters results
var results = await knowledgeSearch.SearchAsync(
query, tenantId, maxResults
);
// SECURITY: Redact sensitive fields before returning
var sanitizedResults = results.Select(r => new
{
Title = r.Title,
Summary = r.Summary,
Url = r.Url,
// OMIT: InternalId, AuthorEmail, RawContent
});
logger.LogInformation(
"Tool: search_knowledge completed | Results: {Count}",
sanitizedResults.Count()
);
return new
{
success = true,
results = sanitizedResults,
resultCount = sanitizedResults.Count()
};
}
catch (Exception ex)
{
// SECURITY: Log exception details but return generic error
logger.LogError(ex, "Tool: search_knowledge failed for user {UserId}", userId);
return new
{
success = false,
error = "Search failed. Please contact support."
// DO NOT return ex.Message (may expose internal details)
};
}
}
}
Tool-Level Security Controls:
- Input Validation: Always validate query length, result limits, parameter types before processing
- User Context Extraction: Every tool call must log which user (not just which AI assistant) initiated the request
- Tenant Context Enforcement: Tools should never accept tenant ID as a parameter (prevents tenant spoofing). Extract from HttpContext.
- Output Sanitization: Remove internal IDs, author information, system paths before returning results
- Generic Error Messages: Never expose exception details, stack traces, or internal paths to AI assistants
- Comprehensive Audit Logging: Log user, tenant, tool, query, and result count for every invocation
Why These Patterns Matter for AI Security:
AI assistants can craft adversarial queries attempting:
- SQL injection via search terms
- Path traversal via document ID parameters
- Tenant spoofing via manipulated context
- Error message mining via malformed inputs
Tool-level validation prevents these attacks before they reach your business logic.
Security Observability and Audit Compliance
AI assistants accessing corporate knowledge bases create audit requirements that traditional APIs don’t face. Security teams need to answer:
- Who accessed what data? Not “AI assistant queried database” but “User alice@example.com via AI assistant accessed Customer PII documents”
- What did they search for? Query terms may indicate data exfiltration attempts
- What results were returned? Document IDs and result counts for compliance audits
- Were there authorization failures? Failed access attempts indicate reconnaissance
Production Audit Logging Architecture:
public class AuditLoggingMiddleware
{
    private readonly RequestDelegate next;
    private readonly ILogger<AuditLoggingMiddleware> logger;

    public AuditLoggingMiddleware(RequestDelegate next, ILogger<AuditLoggingMiddleware> logger)
    {
        this.next = next;
        this.logger = logger;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        // Capture request start time
        var startTime = DateTime.UtcNow;

        try
        {
            await next(context);

            // User and tenant context are populated by the authentication and
            // tenant-resolver middleware that run later in the pipeline, so
            // read them after the downstream handlers have completed.
            var userId = context.User.FindFirst("oid")?.Value;
            var userPrincipalName = context.User.FindFirst("upn")?.Value;
            var tenantId = context.Items["TenantId"] as string;

            // Success: Log complete access details
            logger.LogInformation(
                "MCP Request | User: {UserPrincipalName} ({UserId}) | " +
                "Tenant: {TenantId} | Path: {Path} | Method: {Method} | " +
                "Status: {StatusCode} | Duration: {Duration}ms",
                userPrincipalName, userId, tenantId,
                context.Request.Path, context.Request.Method,
                context.Response.StatusCode,
                (DateTime.UtcNow - startTime).TotalMilliseconds
            );
        }
        catch (Exception ex)
        {
            // Failure: Log with exception details
            logger.LogError(ex,
                "MCP Request Failed | User: {UserPrincipalName} ({UserId}) | " +
                "Tenant: {TenantId} | Path: {Path} | Error: {Error}",
                context.User.FindFirst("upn")?.Value,
                context.User.FindFirst("oid")?.Value,
                context.Items["TenantId"] as string,
                context.Request.Path, ex.Message
            );
            throw;
        }
    }
}
Structured Logging for Security Analysis:
All logs emit as structured JSON to STDERR (the MCP specification requires this for STDIO servers; it remains good practice for HTTP):
{
"timestamp": "2026-02-04T15:30:00Z",
"level": "Information",
"message": "MCP Tool Invoked",
"properties": {
"userId": "a1b2c3d4-5678-90ab-cdef-1234567890ab",
"userPrincipalName": "alice@example.com",
"tenantId": "tenant-uuid",
"toolName": "search_knowledge",
"query": "customer PII data",
"maxResults": 10,
"resultCount": 5,
"documentsAccessed": ["doc-1", "doc-2", "doc-3", "doc-4", "doc-5"],
"duration": 1234
}
}
Security Analytics Queries (Kusto/Application Insights):
// Detect potential data exfiltration (excessive result requests)
traces
| where customDimensions.toolName == "search_knowledge"
| where customDimensions.maxResults >= 20
| summarize RequestCount = count() by UserId = tostring(customDimensions.userId), bin(timestamp, 1h)
| where RequestCount > 50
| project timestamp, UserId, RequestCount
// Identify failed authorization attempts (reconnaissance)
traces
| where message contains "Authorization failed" or message contains "Forbidden"
| extend UserId = tostring(customDimensions.userId)
| summarize FailureCount = count() by UserId, bin(timestamp, 5m)
| where FailureCount > 10
| project timestamp, UserId, FailureCount
// Track sensitive keyword searches
traces
| where customDimensions.query contains "PII" or
customDimensions.query contains "password" or
customDimensions.query contains "secret"
| project timestamp,
UserId = tostring(customDimensions.userId),
Query = tostring(customDimensions.query),
ResultCount = tostring(customDimensions.resultCount)
Compliance Automation:
Integrate audit logs with compliance platforms:
- SOC 2 Evidence: Export monthly access reports for auditors
- GDPR Requests: Retrieve all data accessed by specific user
- Incident Response: Timeline reconstruction when security events occur
- Usage Analytics: Identify over-privileged users or unnecessary data access
Production Security Hardening Checklist
Beyond authentication and authorization, production HTTP MCP servers require comprehensive security controls:
1. Security Headers (Prevent Client-Side Attacks)
app.Use(async (context, next) =>
{
// Prevent MIME type sniffing
context.Response.Headers.Append("X-Content-Type-Options", "nosniff");
// Prevent clickjacking attacks
context.Response.Headers.Append("X-Frame-Options", "DENY");
// Control referrer information leakage
context.Response.Headers.Append("Referrer-Policy", "strict-origin-when-cross-origin");
// Content Security Policy (prevent XSS)
context.Response.Headers.Append("Content-Security-Policy",
"default-src 'self'; script-src 'self'; object-src 'none'");
// HTTPS enforcement in production
if (!app.Environment.IsDevelopment())
{
context.Response.Headers.Append("Strict-Transport-Security",
"max-age=31536000; includeSubDomains; preload");
}
await next();
});
2. Rate Limiting (Prevent Abuse and DoS)
app.UseRateLimiter(new RateLimiterOptions
{
GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
{
// Rate limit by user ID (prevents single user abuse)
var userId = context.User.FindFirst("oid")?.Value ?? "anonymous";
return RateLimitPartition.GetTokenBucketLimiter(userId, _ => new()
{
TokenLimit = 100, // 100 requests
ReplenishmentPeriod = TimeSpan.FromMinutes(1),
TokensPerPeriod = 100,
AutoReplenishment = true
});
}),
RejectionStatusCode = 429
});
3. Request Size Limits (Prevent Memory Exhaustion)
builder.Services.Configure<KestrelServerOptions>(options =>
{
// Maximum request body size: 1 MB
options.Limits.MaxRequestBodySize = 1_048_576;
// Maximum concurrent connections
options.Limits.MaxConcurrentConnections = 1000;
// Keep-alive timeout: 60 seconds
options.Limits.KeepAliveTimeout = TimeSpan.FromSeconds(60);
});
4. CORS Configuration (Prevent Unauthorized Origins)
app.UseCors(policy => policy
.WithOrigins(
"https://vscode.dev",
"https://github.com/copilot",
"https://your-trusted-client.com"
)
.AllowCredentials()
.WithHeaders("Authorization", "Content-Type")
.WithMethods("POST", "GET") // Only allow necessary methods
);
5. Secret Management (Never Hardcode Credentials)
// Azure Key Vault integration
builder.Configuration.AddAzureKeyVault(
new Uri($"https://{keyVaultName}.vault.azure.net/"),
new DefaultAzureCredential()
);
// Access secrets securely
var connectionString = builder.Configuration["DatabaseConnectionString"];
var apiKey = builder.Configuration["AzureSearchApiKey"];
6. Dependency Security Scanning
<PropertyGroup>
<EnableNETAnalyzers>true</EnableNETAnalyzers>
<AnalysisMode>All</AnalysisMode>
<RunAnalyzersDuringBuild>true</RunAnalyzersDuringBuild>
</PropertyGroup>
7. Container Security Hardening
# Use minimal base image (reduces attack surface)
FROM mcr.microsoft.com/dotnet/aspnet:8.0-alpine AS base
# Run as non-root user
USER app
# Listen on an unprivileged port
ENV ASPNETCORE_URLS="http://+:8080"
EXPOSE 8080
# Health check endpoint
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD wget --quiet --tries=1 --spider http://localhost:8080/health/live || exit 1
Architecture Summary: Security-First HTTP MCP
The production architecture that emerged from these security requirements:
┌─────────────────────────────────────────────────────────────┐
│ Azure Container Apps │
│ • HTTPS Termination (TLS 1.3) │
│ • Auto-scaling (1-10 replicas) │
│ • Health-based routing │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Security Middleware Pipeline │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ 1. Forwarded Headers (preserve client IP) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ 2. Rate Limiting (100 req/min per user) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ 3. Audit Logging (capture user + tenant context) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ 4. MISE Authentication (OAuth 2.1 token validation) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ 5. Tenant Resolver (extract + validate tenant) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ 6. JSON-RPC Validation (protocol compliance) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ 7. Authorization (scope-based access control) │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ MCP Tool Registry │
│ • search_knowledge (with parameter validation) │
│ • get_search_health (readiness checks) │
│ • [Additional tools with security controls] │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Observability Layer │
│ • Application Insights (telemetry) │
│ • Structured Logs (JSON to STDERR) │
│ • Security Analytics (Kusto queries) │
└─────────────────────────────────────────────────────────────┘
Every layer enforces security principles: authenticate, authorize, audit, validate.
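A sketch of how this pipeline might be wired in Program.cs using the middleware classes from earlier sections; MapMcp is the C# SDK's endpoint helper, and the exact registration details will vary with your setup.
// Middleware order mirrors the diagram above; each layer runs before the next.
app.UseForwardedHeaders();                       // 1. Preserve client IP behind ingress
app.UseRateLimiter();                            // 2. Per-user rate limiting
app.UseMiddleware<AuditLoggingMiddleware>();     // 3. Logs user + tenant on the way out
app.UseAuthentication();                         // 4. MISE / OAuth 2.1 token validation
app.UseMiddleware<TenantResolverMiddleware>();   // 5. Extract + validate tenant
app.UseMiddleware<JsonRpcMiddleware>();          // 6. JSON-RPC protocol compliance
app.UseAuthorization();                          // 7. Scope-based access control

app.MapMcp("/mcp");                              // MCP endpoints from the C# SDK
app.Run();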
Key Takeaways for Production AI Security
Building HTTP MCP servers for enterprise AI assistants requires thinking beyond traditional API security:
1. Defense-in-Depth is Non-Negotiable
Authentication alone is insufficient. Layer MISE OAuth validation, scope-based authorization, JSON-RPC validation, parameter sanitization, and audit logging. Each layer catches attacks the previous layer missed.
2. Audit Trails Must Capture Human Users, Not Just AI Assistants
When security incidents occur, teams need “Alice accessed customer PII,” not “AI assistant queried database.” Every log entry must include user principal name and Azure AD object ID.
3. Tenant Isolation Requires Architectural Enforcement
Never trust client-provided tenant IDs. Extract from validated JWT claims, inject into HttpContext, and enforce at database query level. Cross-tenant data leakage is the highest-impact risk.
4. AI Assistants Amplify Attack Surfaces
Traditional APIs serve deterministic requests. AI assistants synthesize across sources, cache context, and may leak data through generated responses. Tool-level sanitization prevents internal IDs and system paths from reaching AI context.
5. Security Observability Enables Compliance
SOC 2, ISO 27001, and GDPR audits require proof of access controls. Structured logging with user/tenant/tool/data enables automated compliance evidence generation.
6. Production Maturity Matters
MISE’s 20 years of security hardening addresses edge cases (conditional access, device compliance, multi-tenant isolation) that custom OAuth implementations miss. For enterprise knowledge bases, ecosystem maturity reduces risk.
When These Patterns Apply
This architecture is designed for:
- Multi-tenant AI knowledge systems serving hundreds to thousands of users
- Sensitive data environments requiring SOC 2, ISO 27001, or GDPR compliance
- Corporate knowledge bases accessed by AI assistants (GitHub Copilot, VS Code, custom tools)
- Zero-trust security models where every request is independently authenticated and authorized
For local development or personal knowledge bases, STDIO transport may be sufficient. For production enterprise deployments, HTTP with defense-in-depth security is the proven pattern.
The next article provides complete C# implementation code for every component described here: MISE configuration, JSON-RPC middleware, multi-tenant isolation, secure tool design, and Azure Container Apps deployment.
Resources
- MCP Specification (HTTP Transport) – Official details for how MCP speaks JSON-RPC 2.0 over HTTP, including initialization, sessions, and streaming. modelcontextprotocol.io/specification/2025-06-18/basic/transports
- MCP Authorization – The spec section that covers OAuth-based authorization and how clients/servers are expected to enforce it. modelcontextprotocol.io/specification/2025-06-18/basic/authorization
- JSON-RPC 2.0 Specification – The base protocol this article leans on for error handling, batching, and message structure. jsonrpc.org/specification
- OAuth 2.0 (RFC 6749) – The core authorization framework behind the bearer tokens and scopes discussed here. datatracker.ietf.org/doc/html/rfc6749
- C# SDK for MCP – Strongly typed .NET SDK with attributes and helpers used in the HTTP server examples. github.com/modelcontextprotocol/csharp-sdk
- Azure Container Apps Documentation – Reference for the deployment and scaling model used in the architecture diagrams. learn.microsoft.com/azure/container-apps
Related Reading:
- Building Your First HTTP MCP Server (Implementation Guide)
- Securing AI Knowledge Access: STDIO MCP Server Patterns