
Building a Production-Ready MCP Server in .NET - A PostgreSQL Memory Example

By Dario on Apr 25, 2025

In the rapidly evolving world of AI integration, enabling Large Language Models (LLMs) to interact with external systems is becoming increasingly important. The Model Context Protocol (MCP) has emerged as a standardized way to connect AI models with various tools and data sources, making it easier to build powerful AI-enabled applications.

Today, we’ll explore how to implement a production-ready MCP server in .NET by examining a real-world example: the postg-mem project. This MCP server provides vector-based memory storage for AI applications using PostgreSQL with the pgvector extension, demonstrating how to go beyond simple “Hello World” examples to create a practical, useful service.

Understanding MCP and Why It Matters

The Model Context Protocol (MCP) is an open protocol that standardizes how applications provide context to LLMs. It acts like a “USB-C port for AI applications,” providing a universal way to connect AI models to different data sources and tools.

At its core, MCP follows a client-server architecture:

  • MCP Hosts: Applications like Claude Desktop, VS Code, or AI-powered tools that want to access data through MCP
  • MCP Clients: Protocol clients that maintain connections with servers
  • MCP Servers: Lightweight programs that expose specific capabilities through the standardized protocol
  • Data Sources: Your computer’s files, databases, or external services that the MCP servers can access

This architecture allows LLMs to interact with various systems through a consistent interface, making it easier to build complex AI applications without needing to implement custom integration code for each new tool or data source.

Why Vector Memory Matters for AI Applications

One of the key challenges in building AI applications is providing them with memory. LLMs by themselves are stateless—they don’t remember previous interactions beyond what’s included in the current prompt. To build AI agents that can maintain context over time, we need a way to store and retrieve information efficiently.

Vector databases are particularly well-suited for this task because they allow for semantic search—finding information based on meaning rather than just keywords. When we store text as vector embeddings (numerical representations that capture semantic meaning), we can find similar content even if it doesn’t share the same exact words.
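To make “similar meaning” concrete: similarity between two embeddings is typically measured with cosine similarity, the normalized dot product of the vectors. Here is a minimal standalone sketch, using tiny hand-made 3-dimensional vectors as stand-ins for real 384-dimensional embeddings:

```csharp
using System;

// Cosine similarity: dot(a, b) / (|a| * |b|). Values near 1 mean the
// vectors, and therefore the texts they encode, are semantically close.
static double CosineSimilarity(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Toy vectors: "cat" and "kitten" point in similar directions, "car" does not.
float[] cat = { 0.9f, 0.1f, 0.0f };
float[] kitten = { 0.8f, 0.2f, 0.1f };
float[] car = { 0.0f, 0.1f, 0.9f };

Console.WriteLine($"cat/kitten: {CosineSimilarity(cat, kitten):F2}"); // high, ~0.98
Console.WriteLine($"cat/car:    {CosineSimilarity(cat, car):F2}");    // low,  ~0.01
```

Real embedding models produce vectors whose geometry encodes meaning, so this same arithmetic lets a database rank stored text by relevance to a query.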

PostgreSQL with the pgvector extension gives us the best of both worlds:

  1. A robust, mature relational database for structured data
  2. Vector similarity search capabilities for semantic retrieval
  3. The ability to combine traditional queries with vector search

This combination makes it an excellent choice for implementing AI memory systems that need to store, organize, and retrieve information efficiently.

The PostgMem Project: An Overview

The postg-mem project implements an MCP server that provides vector memory storage using PostgreSQL and pgvector. It allows AI agents to:

  • Store structured memories with vector embeddings
  • Retrieve memories by ID
  • Perform semantic search through memories using vector similarity
  • Filter search results using tags

It’s built using .NET 9.0 and integrates with the MCP C# SDK, making it a great example of how to implement MCP servers in the .NET ecosystem.

Setting Up the Foundation

The PostgMem project uses a standard ASP.NET Core application structure. Let’s start by examining the key components needed to set up an MCP server in .NET.

Project Setup

The project is configured as an ASP.NET Core web application with the following NuGet packages:

<ItemGroup>
    <PackageReference Include="Configuration.Extensions.EnvironmentFile" Version="2.0.0" />
    <PackageReference Include="Microsoft.Extensions.Configuration.EnvironmentVariables" Version="9.0.3" />
    <PackageReference Include="Microsoft.Extensions.Hosting" Version="9.0.3" />
    <PackageReference Include="Microsoft.Extensions.Http" Version="9.0.3" />
    <PackageReference Include="ModelContextProtocol" Version="0.1.0-preview.5" />
    <PackageReference Include="ModelContextProtocol.AspNetCore" Version="0.1.0-preview.5" />
    <PackageReference Include="Npgsql" Version="9.0.3" />
    <PackageReference Include="pgvector" Version="0.3.1" />
    <PackageReference Include="Registrator.Net" Version="3.0.0" />
    <PackageReference Include="Serilog.Extensions.Logging.File" Version="3.0.0" />
</ItemGroup>

The two essential packages for MCP functionality are ModelContextProtocol and ModelContextProtocol.AspNetCore, which provide the core SDK and ASP.NET Core integration, respectively.

Application Startup

The Program.cs file sets up the MCP server along with the necessary services:

WebApplicationBuilder builder = WebApplication.CreateBuilder(args);

builder
    .Configuration
    .AddEnvironmentFile() 
    .AddEnvironmentVariables("POSTGMEM_"); 

builder
    .Logging
    .AddConsole(options => { options.LogToStandardErrorThreshold = LogLevel.Trace; })
    .AddFile("log.log", minimumLevel: LogLevel.Trace);

builder
    .Services
    .AddPostgMem()
    .AddMcpServer().WithTools<MemoryTools>();

WebApplication app = builder.Build();

app.MapMcp();
app.Run();

Let’s break down the key points:

  1. The application uses the Configuration.Extensions.EnvironmentFile package to load configuration from a .env file, which is a convenient way to manage environment-specific settings.

  2. It sets up logging to both the console and a file, ensuring that we have proper observability in production.

  3. It registers the PostgMem services using a custom extension method AddPostgMem(), which we’ll examine shortly.

  4. It adds the MCP server with the AddMcpServer() method and registers our memory tools using WithTools<MemoryTools>().

  5. Finally, it maps the MCP endpoints with app.MapMcp(), which sets up the necessary routes for the MCP protocol.

Service Registration Extensions

The AddPostgMem() extension method is defined in the ServiceCollectionExtensions.cs file and handles registering the various services needed for the application:

internal static class ServiceCollectionExtensions
{
    internal static IServiceCollection AddPostgMem(
        this IServiceCollection services)
    {
        services.AddEmbeddings();
        services.AddStorage();
        return services;
    }

    internal static IServiceCollection AddEmbeddings(
        this IServiceCollection services)
    {
        services
            .AddSingleton<EmbeddingSettings>(sp =>
                sp.GetRequiredService<IConfiguration>().GetSection("Embeddings").Get<EmbeddingSettings>() ??
                throw new ArgumentNullException("Embeddings Settings"))
            .AddHttpClient<IEmbeddingService, EmbeddingService>((sp, client) =>
            {
                EmbeddingSettings settings = sp.GetRequiredService<EmbeddingSettings>();
                client.BaseAddress = settings.ApiUrl;
                client.Timeout = settings.Timeout;
            });

        return services;
    }

    internal static IServiceCollection AddStorage(
        this IServiceCollection services)
    {
        services
            .AddSingleton(sp =>
            {
                string connectionString =
                    sp.GetRequiredService<IConfiguration>().GetConnectionString("Storage") ??
                    throw new ArgumentNullException("Storage Connection String");
                NpgsqlDataSourceBuilder sourceBuilder = new(connectionString);
                sourceBuilder.UseVector();
                return sourceBuilder.Build();
            });
        return services;
    }
}

This extension method registers two main services:

  1. The embedding service, which is responsible for generating vector embeddings from text.
  2. The storage service, which handles database operations using Npgsql with the pgvector extension.

Note the use of sourceBuilder.UseVector(), which enables the pgvector extension for Npgsql, allowing us to work with vector types in .NET.

The Memory Model and Storage

At the core of the PostgMem project is the memory model, which defines the structure of the data we’ll be storing:

public class Memory
{
    public Guid Id { get; init; }
    public string Type { get; init; } = string.Empty;
    public JsonDocument Content { get; init; } = JsonDocument.Parse("{}");
    public string Source { get; init; } = string.Empty;
    public Vector Embedding { get; init; } = new(new float[384]);
    public string[]? Tags { get; init; }
    public double Confidence { get; init; }
    public DateTime CreatedAt { get; init; }
    public DateTime UpdatedAt { get; init; }
}

This model includes:

  • A unique identifier
  • Type and source fields for categorization
  • JSON content stored as a JsonDocument
  • A vector embedding for semantic search
  • Tags for filtering
  • A confidence score
  • Creation and update timestamps

The Vector type comes from the pgvector package and represents a vector embedding in PostgreSQL.

The storage service implements the IStorage interface:

public interface IStorage
{
    Task<Memory> StoreMemory(
        string type,
        string content,
        string source,
        string[]? tags,
        double confidence,
        CancellationToken cancellationToken = default
    );

    Task<List<Memory>> Search(
        string query,
        int limit = 10,
        double minSimilarity = 0.7,
        string[]? filterTags = null,
        CancellationToken cancellationToken = default
    );

    Task<Memory?> Get(
        Guid id,
        CancellationToken cancellationToken = default
    );

    Task<bool> Delete(
        Guid id,
        CancellationToken cancellationToken = default
    );
}

This interface defines the core operations for our memory system:

  • StoreMemory: Stores a new memory with the provided metadata and content
  • Search: Performs a semantic search for memories similar to a query
  • Get: Retrieves a specific memory by ID
  • Delete: Removes a memory from the database
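The StoreMemory implementation itself isn’t reproduced here, but conceptually it generates an embedding for the content and issues a parameterized INSERT against the memories table. The following is an illustrative sketch of just the SQL and parameter mapping, not the project’s exact code; the real implementation binds @embedding through Npgsql’s pgvector type handler:

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch of the statement StoreMemory would execute.
// In the project this runs through NpgsqlCommand, with the embedding
// bound as a pgvector Vector parameter.
const string insertSql = """
    INSERT INTO memories
        (id, type, content, source, embedding, tags, confidence, created_at, updated_at)
    VALUES
        (@id, @type, @content::jsonb, @source, @embedding, @tags, @confidence, @createdAt, @updatedAt)
    """;

DateTime now = DateTime.UtcNow;
var parameters = new Dictionary<string, object>
{
    ["id"] = Guid.NewGuid(),                  // new identifier per memory
    ["type"] = "conversation",
    ["content"] = """{"fact": "example"}""",  // stored as JSONB
    ["source"] = "user",
    ["embedding"] = new float[384],           // from IEmbeddingService.Generate
    ["tags"] = new[] { "postgresql", "vectors" },
    ["confidence"] = 1.0,
    ["createdAt"] = now,                      // both timestamps set on insert
    ["updatedAt"] = now,
};

Console.WriteLine($"Binding {parameters.Count} parameters");
```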

Let’s look at how the Search method is implemented in the Storage class:

public async Task<List<Memory>> Search(
    string query,
    int limit = 10,
    double minSimilarity = 0.7,
    string[]? filterTags = null,
    CancellationToken cancellationToken = default
)
{
    // Generate embedding for the query
    float[] queryEmbedding = await _embeddingService.Generate(
        query,
        cancellationToken
    );
    
    await using NpgsqlConnection connection = await _dataSource.OpenConnectionAsync(cancellationToken);

    string sql =
        @"
        SELECT id, type, content, source, embedding, tags, confidence, created_at, updated_at
        FROM memories
        WHERE embedding <=> @embedding < @maxDistance";

    if (filterTags is { Length: > 0 })
    {
        sql += " AND tags @> @tags";
    }

    sql += " ORDER BY embedding <=> @embedding LIMIT @limit";

    await using NpgsqlCommand cmd = new(sql, connection);
    cmd.Parameters.AddWithValue("embedding", new Vector(queryEmbedding));
    cmd.Parameters.AddWithValue("maxDistance", 1 - minSimilarity); 
    cmd.Parameters.AddWithValue("limit", limit);

    if (filterTags != null && filterTags.Length > 0)
    {
        cmd.Parameters.AddWithValue("tags", filterTags);
    }

    List<Memory> memories = [];
    await using NpgsqlDataReader reader = await cmd.ExecuteReaderAsync(cancellationToken);

    while (await reader.ReadAsync(cancellationToken))
    {
        memories.Add(
            new Memory
            {
                Id = reader.GetGuid(0),
                Type = reader.GetString(1),
                Content = reader.GetFieldValue<JsonDocument>(2),
                Source = reader.GetString(3),
                Embedding = reader.GetFieldValue<Vector>(4),
                Tags = reader.GetFieldValue<string[]>(5),
                Confidence = reader.GetDouble(6),
                CreatedAt = reader.GetDateTime(7),
                UpdatedAt = reader.GetDateTime(8),
            }
        );
    }

    return memories;
}

This method:

  1. Generates a vector embedding for the search query using the embedding service
  2. Constructs a SQL query that uses pgvector’s cosine distance operator (<=>) to find similar memories
  3. Adds an array-containment clause (tags @> @tags) when filter tags are supplied, so only memories whose tags include every requested tag match
  4. Orders the results by similarity and limits the number of results
  5. Maps the database records to Memory objects and returns them

The <=> operator calculates the cosine distance between vectors, which is 1 minus the cosine similarity. A lower distance means higher similarity, so we’re finding memories with the closest embeddings to our query.
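This similarity-to-distance translation (maxDistance = 1 - minSimilarity) is worth internalizing, since pgvector thresholds are expressed as distances. A quick standalone sanity check:

```csharp
using System;

// pgvector's <=> operator returns cosine distance = 1 - cosine similarity,
// so a minimum-similarity threshold becomes a maximum-distance threshold.
static double ToMaxDistance(double minSimilarity) => 1 - minSimilarity;

double maxDistance = ToMaxDistance(0.7);
Console.WriteLine($"max distance: {maxDistance:F1}"); // max distance: 0.3

// Candidate rows as (id, cosine distance) pairs, mimicking the WHERE clause.
(string Id, double Distance)[] rows =
{
    ("a", 0.05), // similarity 0.95 (kept)
    ("b", 0.25), // similarity 0.75 (kept)
    ("c", 0.40), // similarity 0.60 (filtered out)
};

foreach ((string id, double distance) in rows)
{
    if (distance < maxDistance)
        Console.WriteLine($"{id}: similarity {1 - distance:F2}");
}
```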

Embedding Generation

To perform semantic search, we need to convert text into vector embeddings. The EmbeddingService handles this task:

public async Task<float[]> Generate(
    string text,
    CancellationToken cancellationToken = default
)
{
    try
    {
        _logger.LogDebug("Generating embedding for text of length {TextLength}", text.Length);

        EmbeddingRequest request = new() { Model = _settings.Model, Prompt = text };

        _logger.LogDebug("Sending request to embedding API at {ApiUrl}", _settings.ApiUrl);
        HttpResponseMessage response = await _httpClient.PostAsJsonAsync(
            "api/embeddings",
            request,
            cancellationToken
        );
        response.EnsureSuccessStatusCode();

        EmbeddingResponse? result =
            await response.Content.ReadFromJsonAsync<EmbeddingResponse>(
                cancellationToken: cancellationToken
            );
        if (result?.Embedding == null || result.Embedding.Length == 0)
        {
            throw new Exception("Failed to generate embedding: Empty response from API");
        }

        _logger.LogDebug("Successfully generated embedding with {Dimensions} dimensions", result.Embedding.Length);

        return result.Embedding;
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Error generating embedding: {ErrorMessage}", ex.Message);

        // Fallback to a random embedding in case of error
        _logger.LogWarning("Falling back to random embedding generation");
        Random random = new();
        float[] embedding = new float[384];
        for (int i = 0; i < embedding.Length; i++)
        {
            embedding[i] = (float)random.NextDouble();
        }

        // Normalize the embedding
        float sum = 0;
        for (int i = 0; i < embedding.Length; i++)
        {
            sum += embedding[i] * embedding[i];
        }

        float magnitude = (float)Math.Sqrt(sum);
        for (int i = 0; i < embedding.Length; i++)
        {
            embedding[i] /= magnitude;
        }

        return embedding;
    }
}

This service:

  1. Sends the text to an external embeddings API (by default, it uses Ollama)
  2. Parses the response to extract the vector embedding
  3. Includes a fallback mechanism that generates a random embedding if the API call fails
  4. Normalizes the fallback embedding to ensure it has unit length

The fallback mechanism is a deliberate availability trade-off: the server keeps responding even if the embedding API is temporarily unavailable. Be aware of the cost, though. A random embedding carries no semantic meaning, so a memory stored with one will effectively never be found by similarity search, and a search issued with one returns essentially arbitrary results. In production you may prefer to fail fast or queue the operation for retry instead.
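The EmbeddingRequest and EmbeddingResponse DTOs aren’t shown above, but the wire format they map to is small. A sketch of the JSON shapes exchanged with Ollama’s /api/embeddings endpoint, inferred from the service code (the exact field names are assumptions based on Ollama’s API):

```csharp
using System;
using System.Linq;
using System.Text.Json;

// The request carries "model" and "prompt"; this mirrors what
// PostAsJsonAsync serializes from EmbeddingRequest.
var request = new { model = "all-minilm:33m-l12-v2-fp16", prompt = "hello world" };
string requestJson = JsonSerializer.Serialize(request);
Console.WriteLine(requestJson);

// A miniature response body, parsed the way ReadFromJsonAsync maps the
// "embedding" array onto EmbeddingResponse.Embedding.
using JsonDocument doc = JsonDocument.Parse("""{"embedding":[0.1,0.2,0.3]}""");
float[] embedding = doc.RootElement
    .GetProperty("embedding")
    .EnumerateArray()
    .Select(e => e.GetSingle())
    .ToArray();
Console.WriteLine(embedding.Length); // 3
```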

Implementing MCP Tools

Now that we have the core services in place, let’s see how we expose them through MCP. The MemoryTools class implements the MCP tools that allow AI agents to interact with our memory system:

[McpServerToolType]
public class MemoryTools
{
    private readonly IStorage _storage;

    public MemoryTools(IStorage storage)
    {
        _storage = storage;
    }

    [McpServerTool, Description("Store a new memory in the database")]
    public async Task<string> Store(
        [Description("The type of memory (e.g., 'conversation', 'document', etc.)")] string type,
        [Description("The content of the memory as a JSON object")] string content,
        [Description("The source of the memory (e.g., 'user', 'system', etc.)")] string source,
        [Description("Optional tags to categorize the memory")] string[]? tags = null,
        [Description("Confidence score for the memory (0.0 to 1.0)")] double confidence = 1.0,
        CancellationToken cancellationToken = default
    )
    {
        // Store the memory
        Memory memory = await _storage.StoreMemory(
            type,
            content,
            source,
            tags,
            confidence,
            cancellationToken
        );

        return $"Memory stored successfully with ID: {memory.Id}";
    }

    [McpServerTool, Description("Search for memories similar to the provided text")]
    public async Task<string> Search(
        [Description("The text to search for similar memories")] string query,
        [Description("Maximum number of results to return")] int limit = 10,
        [Description("Minimum similarity threshold (0.0 to 1.0)")] double minSimilarity = 0.7,
        [Description("Optional tags to filter memories")] string[]? filterTags = null,
        CancellationToken cancellationToken = default
    )
    {
        // Search for similar memories
        List<Memory> memories = await _storage.Search(
            query,
            limit,
            minSimilarity,
            filterTags,
            cancellationToken
        );

        if (memories.Count == 0)
        {
            return "No memories found matching your query.";
        }

        // Format the results
        StringBuilder result = new();
        result.AppendLine($"Found {memories.Count} memories:");
        result.AppendLine();

        foreach (Memory memory in memories)
        {
            result.AppendLine($"ID: {memory.Id}");
            result.AppendLine($"Type: {memory.Type}");
            result.AppendLine($"Content: {memory.Content.RootElement}");
            result.AppendLine($"Source: {memory.Source}");
            result.AppendLine(
                $"Tags: {(memory.Tags != null ? string.Join(", ", memory.Tags) : "none")}"
            );
            result.AppendLine($"Confidence: {memory.Confidence:F2}");
            result.AppendLine($"Created: {memory.CreatedAt:yyyy-MM-dd HH:mm:ss}");
            result.AppendLine();
        }

        return result.ToString();
    }

    [McpServerTool, Description("Retrieve a specific memory by ID")]
    public async Task<string> Get(
        [Description("The ID of the memory to retrieve")] Guid id,
        CancellationToken cancellationToken = default
    )
    {
        Memory? memory = await _storage.Get(id, cancellationToken);

        if (memory == null)
        {
            return $"Memory with ID {id} not found.";
        }

        StringBuilder result = new();
        result.AppendLine($"ID: {memory.Id}");
        result.AppendLine($"Type: {memory.Type}");
        result.AppendLine($"Content: {memory.Content.RootElement}");
        result.AppendLine($"Source: {memory.Source}");
        result.AppendLine(
            $"Tags: {(memory.Tags != null ? string.Join(", ", memory.Tags) : "none")}"
        );
        result.AppendLine($"Confidence: {memory.Confidence:F2}");
        result.AppendLine($"Created: {memory.CreatedAt:yyyy-MM-dd HH:mm:ss}");
        result.AppendLine($"Updated: {memory.UpdatedAt:yyyy-MM-dd HH:mm:ss}");

        return result.ToString();
    }

    [McpServerTool, Description("Delete a memory by ID")]
    public async Task<string> Delete(
        [Description("The ID of the memory to delete")] Guid id,
        CancellationToken cancellationToken = default
    )
    {
        bool success = await _storage.Delete(id, cancellationToken);

        return success 
            ? $"Memory with ID {id} deleted successfully." 
            : $"Memory with ID {id} not found or could not be deleted.";
    }
}

The key elements here are:

  1. The McpServerToolType attribute on the class, which marks it as a container for MCP tools.
  2. The McpServerTool attribute on each method, which exposes it as an MCP tool.
  3. The Description attributes on methods and parameters, which provide information to the AI about what each tool and parameter does.
  4. The return type of string for all methods, which provides formatted responses that the AI can interpret.

These tools map directly to the operations defined in our IStorage interface, providing a clean, well-documented API for AI agents to interact with our memory system.

Running and Testing

To run the PostgMem MCP server, you’ll need:

  1. .NET 9.0 SDK
  2. PostgreSQL with the pgvector extension installed
  3. An embedding API (the default configuration uses Ollama)

First, set up the PostgreSQL database:

CREATE EXTENSION vector;

CREATE TABLE memories (
    id UUID PRIMARY KEY,
    type TEXT NOT NULL,
    content JSONB NOT NULL,
    source TEXT NOT NULL,
    embedding VECTOR(384) NOT NULL,
    tags TEXT[] NOT NULL,
    confidence DOUBLE PRECISION NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE NOT NULL,
    updated_at TIMESTAMP WITH TIME ZONE NOT NULL
);
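One addition worth considering for production: as written, every search compares the query embedding against every row in the table. pgvector supports approximate nearest-neighbor indexes; for example (an optional addition beyond the project’s schema, and HNSW requires pgvector 0.5.0 or later):

```sql
-- Approximate nearest-neighbor index for the cosine distance operator (<=>).
-- IVFFlat (USING ivfflat) is an alternative on older pgvector versions.
CREATE INDEX ON memories USING hnsw (embedding vector_cosine_ops);
```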

Then configure the application by creating a .env file:

ConnectionStrings__Storage="Host=localhost;Database=mcp_memory;Username=postgres;Password=postgres"
Embeddings__ApiUrl=http://localhost:11434/
Embeddings__Model=all-minilm:33m-l12-v2-fp16

Finally, run the application:

dotnet run

The MCP server will be available at http://localhost:5000 by default, ready to accept connections from MCP clients.

Using the MCP Server with AI Agents

Once the server is running, you can use it with any MCP-compatible client, such as GitHub Copilot in VS Code or Claude Desktop. Note that because this server exposes its endpoints over HTTP via app.MapMcp() rather than over stdio, the client should connect using an SSE configuration. Here’s an example mcp.json for VS Code (assuming the SDK’s default /sse route):

{
    "servers": {
        "postgmem": {
            "type": "sse",
            "url": "http://localhost:5000/sse"
        }
    }
}

With this configuration, the AI agent can interact with the memory system through natural language. For example:

  • “Store this fact about PostgreSQL: { ‘fact’: ‘PostgreSQL supports vector search through the pgvector extension’ }”
  • “What do you know about PostgreSQL vector search?”
  • “Retrieve the memory with ID 12345678-9abc-def0-1234-56789abcdef0”
  • “Delete the memory about PostgreSQL vector search”

The MCP server handles these requests, executes the appropriate operations, and returns the results to the AI agent, which can then incorporate them into its responses.

Production Considerations

While the PostgMem project provides a solid foundation for an MCP server, there are several considerations to keep in mind for production deployments:

Security

The example doesn’t include authentication or authorization mechanisms. In a production environment, you would want to:

  1. Implement authentication for MCP clients
  2. Add authorization rules to control which clients can access which tools
  3. Validate and sanitize inputs to prevent injection attacks
  4. Use HTTPS for all communications

Scaling

For high-traffic applications, consider:

  1. Using connection pooling for database access
  2. Setting up database replication for read-heavy workloads
  3. Implementing caching for frequently accessed memories
  4. Using a more scalable embedding service

Monitoring and Logging

The example includes basic logging, but a production system would benefit from:

  1. Structured logging with correlation IDs
  2. Metrics collection for key operations
  3. Alerts for error conditions
  4. Distributed tracing for complex request flows

Error Handling

While the embedding service includes fallback handling, a comprehensive approach would:

  1. Implement circuit breakers for external dependencies
  2. Add retry mechanisms with exponential backoff
  3. Provide graceful degradation paths for all failure modes
  4. Maintain detailed error logs for troubleshooting

Deployment

For deploying the MCP server:

  1. Containerize the application using Docker
  2. Use Kubernetes for orchestration
  3. Set up CI/CD pipelines for automated deployments
  4. Implement blue/green deployment strategies for zero-downtime updates

Beyond Memory: Other Applications

The PostgMem example focuses on memory storage, but the MCP server pattern can be applied to many other use cases:

  1. Document Management: Implement tools for storing, retrieving, and searching documents
  2. Data Analysis: Create tools for performing statistical analysis on datasets
  3. External API Integration: Build tools that interact with third-party services
  4. Process Automation: Develop tools that execute workflows in response to AI requests
  5. Content Generation: Create tools that generate or transform content based on AI inputs

By following the patterns established in the PostgMem project, you can build MCP servers for any of these applications, providing AI agents with powerful capabilities beyond what they can do on their own.

Conclusion

The PostgMem project demonstrates how to build a production-ready MCP server in .NET using PostgreSQL with pgvector for AI memory storage. By following this example, you can create your own MCP servers that provide AI agents with access to specialized tools and data sources.

Key takeaways from this exploration:

  1. MCP provides a standardized way for AI models to interact with external systems, enabling more powerful and flexible AI applications.
  2. .NET offers a robust platform for implementing MCP servers, with strong typing, excellent performance, and a rich ecosystem of libraries.
  3. PostgreSQL with pgvector provides an excellent foundation for AI memory systems, combining traditional database capabilities with vector similarity search.
  4. Building production-ready MCP servers requires careful attention to architecture, error handling, security, and performance considerations.

As AI continues to evolve, the ability to extend its capabilities through tools like MCP servers will become increasingly important. By mastering these techniques, you can build AI applications that combine the natural language understanding of LLMs with the specific functionality and data access that your use cases require.

© 2025 Dario Griffo. All rights reserved.