Serverless MCP on AWS Lambda, exposed via API Gateway, provisioned with Terraform
The Model Context Protocol (MCP), introduced by Anthropic, is about to turn one year old — and at Horaizon, we've been busy experimenting with it. We're excited to share that we've now successfully deployed our own MCP server on AWS Lambda, exposed to the internet via API Gateway, and fully provisioned through Terraform infrastructure-as-code.
Although a handful of similar deployments have been documented online (linked below), we believe ours is novel in managing the entire stack through a Terraform configuration.
Alongside the MCP server, we have also written an LLM layer on top of OpenAI's Responses API, which queries our custom server and serves responses to an Angular chatbot user interface. We're happy to demo this to potential clients, and we would be keen to discuss how we can best help your business with software solutions in the age of AI.
What is MCP and Why Does It Matter?
MCP allows LLMs to seamlessly integrate with external services through standardised servers. Think of it as a universal adapter:
Need payments? Connect your LLM to Stripe's MCP server.
Need knowledge? Plug it into Wikipedia's MCP server.
Need your own data? Host your custom MCP server.
This interoperability means businesses can bridge AI models with live services safely and flexibly.
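Under the hood, MCP messages are plain JSON-RPC 2.0, which is what makes this interoperability possible. As a rough sketch (the tool name and arguments below are hypothetical, not from any real server), a client asking a server to run a tool sends a request like this:

```python
import json

# A hypothetical MCP "tools/call" request, expressed as JSON-RPC 2.0.
# The tool name and arguments are illustrative only.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_customer_summary",          # hypothetical tool
        "arguments": {"customer_id": "c-123"},   # hypothetical input
    },
}
print(json.dumps(request, indent=2))
```

The server replies with a JSON-RPC response carrying the tool's result, which the LLM then uses to continue the conversation.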
At Horaizon, our use case centred on customer data interaction — enabling an LLM to query, analyse, and provide insights in real time.
Our Architecture at a Glance
Here's how we built and deployed our serverless MCP solution:
Building the MCP Server
For local development, we leveraged FastMCP to quickly set up an MCP server.
For serverless deployment on AWS, however, we had to work out how to run a 'server-based' application in Lambda. This is where the AWS Lambda Web Adapter came in: it proxies incoming requests to the MCP server running inside the Lambda function.
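For reference, the adapter's documented container pattern copies its extension binary into the image alongside your application; the version tag, dependencies, and entry point below are illustrative, not our actual build:

```dockerfile
FROM python:3.12-slim

# Attach the AWS Lambda Web Adapter by copying its extension binary
# into the image (pin a version tag appropriate for you).
COPY --from=public.ecr.aws/awsguru/aws-lambda-adapter:0.8.4 /lambda-adapter /opt/extensions/lambda-adapter

WORKDIR /app
COPY . .
RUN pip install fastmcp

# The adapter forwards Lambda invocations to the port your server
# listens on (8080 by default, configurable via AWS_LWA_PORT).
ENV AWS_LWA_PORT=8080
CMD ["python", "server.py"]
```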
Exposing the Endpoint
Once the Lambda was in place, we created an API Gateway endpoint that routes incoming requests to the Lambda function running our FastMCP server.
Terraform Automation
Instead of manual setup, our entire infrastructure is defined as code with Terraform — ensuring repeatability, scalability, and easy updates.
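To give a flavour of the configuration, here is a heavily trimmed sketch of the Lambda and HTTP API resources. All names are placeholders, and IAM roles, image builds, invoke permissions, and stages are omitted:

```hcl
# Illustrative sketch only: IAM, packaging, permissions, and stages omitted.
resource "aws_lambda_function" "mcp" {
  function_name = "mcp-server"                                  # assumed name
  package_type  = "Image"
  image_uri     = "${aws_ecr_repository.mcp.repository_url}:latest"  # hypothetical repo
  role          = aws_iam_role.lambda_exec.arn                  # defined elsewhere
}

resource "aws_apigatewayv2_api" "mcp" {
  name          = "mcp-api"
  protocol_type = "HTTP"
}

resource "aws_apigatewayv2_integration" "mcp" {
  api_id                 = aws_apigatewayv2_api.mcp.id
  integration_type       = "AWS_PROXY"
  integration_uri        = aws_lambda_function.mcp.invoke_arn
  payload_format_version = "2.0"
}

resource "aws_apigatewayv2_route" "mcp" {
  api_id    = aws_apigatewayv2_api.mcp.id
  route_key = "ANY /mcp/{proxy+}"
  target    = "integrations/${aws_apigatewayv2_integration.mcp.id}"
}
```

With this in place, `terraform apply` recreates the whole deployment from scratch, which is what gives us the repeatability mentioned above.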
Integrating with LLMs
Our particular use case for the MCP server involved interacting with customer data, so we wrote tools that allow an integrating LLM to retrieve this data and provide insights on it. The integrating LLM was built on top of OpenAI's Responses API, as the Chat Completions API does not currently support calling remote MCP servers as a built-in tool.
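The Responses API accepts a remote MCP server as a tool entry. A sketch of what that configuration looks like follows; the label and URL are placeholders, not our real endpoint:

```python
# Sketch of the MCP tool entry passed to OpenAI's Responses API.
# server_label and server_url are placeholders.
mcp_tool = {
    "type": "mcp",
    "server_label": "customer-data",
    "server_url": "https://example.execute-api.eu-west-2.amazonaws.com/mcp/",
    "require_approval": "never",  # skip per-call approval prompts
}

# With the openai package installed, the call would look like:
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.responses.create(
#       model="gpt-4.1",
#       tools=[mcp_tool],
#       input="Summarise recent activity for customer c-123",
#   )
```

The model then discovers the server's tools and calls them as needed during a single `responses.create` invocation.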
Frontend Delivery
The integrating LLM was then exposed to the Angular frontend through a WebSocket API, which supports long-lived connections. We again used Lambda functions to host the integrating LLM, with the WebSocket API routing messages to them.
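A rough sketch of the kind of handler that sits behind a WebSocket API follows; the route handling is simplified, and the mechanism for pushing replies back down the socket is shown only in comments:

```python
import json

def handler(event, context):
    """Entry point for API Gateway WebSocket invocations (sketch only)."""
    route = event["requestContext"]["routeKey"]
    connection_id = event["requestContext"]["connectionId"]

    # API Gateway invokes $connect / $disconnect around the session.
    if route in ("$connect", "$disconnect"):
        return {"statusCode": 200}

    # $default: hand the message to the integrating LLM, then push the
    # reply back down the socket via the management API, e.g.:
    #   client = boto3.client("apigatewaymanagementapi",
    #                         endpoint_url=f"https://{domain}/{stage}")
    #   client.post_to_connection(ConnectionId=connection_id, Data=reply)
    message = json.loads(event.get("body") or "{}")
    print(f"{connection_id}: {message}")
    return {"statusCode": 200}
```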
Word of Advice
By default, the Lambda Web Adapter performs its readiness check against the "/" path of your server. The FastMCP server's endpoint lives at /mcp/, so we set the AWS_LWA_READINESS_CHECK_PATH Lambda environment variable to "/mcp/"; without it, requests returned 404 Not Found responses.
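In Terraform terms, that is a one-line addition to the function's environment block (the port variable is shown for completeness and assumes the adapter's default):

```hcl
# On the Lambda running the MCP server:
environment {
  variables = {
    AWS_LWA_READINESS_CHECK_PATH = "/mcp/"  # FastMCP serves under /mcp/
    AWS_LWA_PORT                 = "8080"   # port the server listens on
  }
}
```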
The code is proprietary, so we won't be sharing it — but hopefully the above gives you a clear idea of how to go about setting up an MCP server on AWS with Terraform.
Standing on the Shoulders of Innovators
Our work builds on excellent explorations by:
Heeki Park — Building an MCP Server as an API Developer
Ran Isenberg — AWS Lambda MCP Cookbook
Both served as inspiration and reference points as we refined our Terraform-based solution.
At Horaizon, we're not just exploring AI infrastructure — we're building practical, production-grade solutions that unlock new possibilities for businesses.
Get in Touch
If you're exploring MCP for your own systems, we'd love to discuss how Horaizon can build and integrate a solution tailored to your needs. Reach us at contact@horaizon.co.uk
Stay tuned for more updates from the cutting edge of applied AI.
Want to build something similar?
We help teams design and ship AI agents that do real, measurable work.