This sample shows how to manage Generative AI APIs at scale using Azure API Management and Azure OpenAI Service for a simple chatbot.
(Like and fork this sample to receive the latest changes and updates)
This sample demonstrates how to build and manage a scalable AI application using Azure API Management (APIM) and Azure OpenAI Service. The sample includes a chat interface that sends messages through APIM to load-balanced Azure OpenAI endpoints, showcasing enterprise-grade API management for AI workloads.
The architecture is set up in the following way:
- Sample app
  - Frontend: Two files, `index.html` and `app.js`, that make requests to the backend.
  - Backend: A Node.js Express app that serves the frontend and makes requests to the Azure OpenAI instance.
- Azure OpenAI Service: Two instances of Azure OpenAI models, one primary endpoint and one secondary/failover endpoint.
- Azure API Management: Manages the Azure OpenAI instances and exposes them to the frontend.
- Managed identity: Used to authenticate the Azure API Management instance to the Azure OpenAI instances.
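To make the primary/failover idea concrete, here is a minimal conceptual sketch. In this sample the routing is done by Azure API Management, not by application code, so the `pickBackend` function and the `name`, `priority`, and `healthy` fields are hypothetical names used only for illustration.

```javascript
// Conceptual sketch only: API Management performs this routing in the sample.
// The backend objects and their fields are hypothetical.
function pickBackend(backends) {
  const healthy = backends.filter((b) => b.healthy);
  if (healthy.length === 0) return null; // no endpoint can take the request
  // Lower priority number wins, so the primary (1) is preferred over the failover (2).
  return healthy.reduce((best, b) => (b.priority < best.priority ? b : best));
}

const allHealthy = [
  { name: "primary", priority: 1, healthy: true },
  { name: "secondary", priority: 2, healthy: true },
];
const primaryDown = [
  { name: "primary", priority: 1, healthy: false },
  { name: "secondary", priority: 2, healthy: true },
];

console.log(pickBackend(allHealthy).name); // primary
console.log(pickBackend(primaryDown).name); // secondary
```

The point of putting this logic in the gateway is that the chat app only ever needs to know one endpoint.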
The documentation for this project is available in the DOC.md file.
You have a few options for getting started with this template. The quickest is GitHub Codespaces, since it sets up all the tools for you, but you can also set it up locally or use a VS Code dev container.
You can run this template virtually by using GitHub Codespaces. The button will open a web-based VS Code instance in your browser:
- Open the template (this may take several minutes to build the container).
- Open the terminal.
- Run `azd auth login` to sign in to Azure.
- Deploy resources with `azd up` (this may take several minutes). You'll be asked to provide:
  - An environment name, which will be used as a prefix for naming resources in your deployment
  - An Azure subscription
  - An Azure location
  - An `apimLocation`

  Note: Enter a value of `koreacentral`. This demo uses the new API Management SKUv2 tier, which is supported only in certain regions.
- Start the app: `cd src && npm start`
A related option is VS Code Dev Containers, which opens the project in your local VS Code using the Dev Containers extension.
You need to install VS Code, the Dev Containers extension, and Docker Desktop.
- Start Docker Desktop.
- Open the project (this may take several minutes to build the container).
- Open the terminal.
- Run `azd auth login` to sign in to Azure.
- Deploy resources with `azd up` (this may take several minutes). You'll be asked to provide:
  - An environment name, which will be used as a prefix for naming resources in your deployment
  - An Azure subscription
  - An Azure location
  - An `apimLocation`

  Note: Enter a value of `koreacentral`. This demo uses the new API Management SKUv2 tier, which is supported only in certain regions.
- Install dependencies and start the app: `cd src && npm install && npm start`

  This starts the app on http://localhost:3000; the API is available at http://localhost:1337.
Then you can get the project code:
- Fork the project to create your own copy of this repository.
- On your forked repository, select the Code button, then the Local tab, and copy the URL of your forked repository.
- Open a terminal and run this command to clone the repo: `git clone <your-repo-url>`
- Install dependencies and start the app: `cd src && npm install && npm start`

  This starts the app on http://localhost:3000; the API is available at http://localhost:1337.
- Azure OpenAI Service is available in specific regions
- Request quota increases if needed through Azure Portal
- API Management service tier v2 is currently available in limited regions
  - We recommend using `koreacentral` for this sample
- Azure OpenAI Service: Pay per token usage
- API Management: Cost varies by tier (this sample uses StandardV2)
- Consider enabling budgets and cost alerts
- Use managed identities for service-to-service authentication
- Enable APIM policies for rate limiting and circuit breaking
- Follow least-privilege access principles
- Monitor API usage and implement alerting.
- You can use the Azure pricing calculator to get an estimate.
Warning: To avoid unnecessary costs, remember to take down your app if it's no longer in use, either by deleting the resource group in the Portal or by running `azd down --purge`.
After running the `azd up` command, an environment file is generated for you at `src/.env`. Here is some of the key information added to the `.env` file:
APIM_ENDPOINT="<Your APIM Endpoint>"
API_SUFFIX="<Your API Suffix>"
API_VERSION="<Your API Version>"
DEPLOYMENT_ID="<Your Deployment Name>"
SUBSCRIPTION_KEY="<Your Subscription Key>"
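As a rough illustration of how these values could fit together, the sketch below assembles a chat request from them. The URL layout, the `api-key` header name, and the `buildChatRequest` helper are assumptions, not taken from this sample's backend; check the API settings in your API Management instance for the path and header it actually expects.

```javascript
// Hypothetical helper: builds the URL and fetch options for a chat request
// routed through API Management. URL layout and header name are assumptions.
function buildChatRequest(env, messages) {
  const base = env.APIM_ENDPOINT.replace(/\/+$/, ""); // drop trailing slashes
  const suffix = env.API_SUFFIX.replace(/^\/+|\/+$/g, ""); // drop surrounding slashes
  const url =
    `${base}/${suffix}/deployments/${encodeURIComponent(env.DEPLOYMENT_ID)}` +
    `/chat/completions?api-version=${encodeURIComponent(env.API_VERSION)}`;
  return {
    url,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "api-key": env.SUBSCRIPTION_KEY, // assumed header name
      },
      body: JSON.stringify({ messages }),
    },
  };
}

// Placeholder values only; these are not real endpoints or keys.
const exampleRequest = buildChatRequest(
  {
    APIM_ENDPOINT: "https://example-apim.azure-api.net/",
    API_SUFFIX: "openai",
    API_VERSION: "2024-02-01",
    DEPLOYMENT_ID: "my-deployment",
    SUBSCRIPTION_KEY: "<Your Subscription Key>",
  },
  [{ role: "user", content: "Hello" }]
);
console.log(exampleRequest.url);
```

The returned `url` and `options` can be passed straight to `fetch(url, options)`, which is available globally in Node.js 18 and later.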
Finding values using the Azure portal:
If you'd like to find the values in the `.env` file yourself, follow these steps:
Value | Instruction |
---|---|
APIM_ENDPOINT | Navigate to portal.azure.com -> Select rg -> Select APIM instance -> Go to Overview -> Copy Gateway URL |
API_SUFFIX | Navigate to portal.azure.com -> Select rg -> Select APIM instance -> Navigate to APIs/APIs -> open myAPI -> Go to Settings -> Copy API URL suffix |
API_VERSION | Open https://learn.microsoft.com/azure/ai-services/openai/reference#completions, Copy most recent Supported versions = 2024-02-01 |
DEPLOYMENT_ID | Navigate to portal.azure.com -> Select rg -> Select 1st OpenAI instance -> Go to Resource Management/Model deployments -> Click on Manage Deployments to open Azure AI Studio -> Copy Deployment name |
SUBSCRIPTION_KEY | Navigate to portal.azure.com -> Select rg -> Select APIM instance -> Go to APIs/Subscriptions -> Click show/hide keys on first row (Built-in all-access) -> Copy Primary key |
Once you're done, you can remove all deployed resources using `azd down --purge`.
Note: Some resources on Azure are only soft deleted for performance reasons and can be retrieved. With `--purge`, resources are hard deleted and cannot be retrieved.
- Manage your Gen AI APIs with Azure API Management Module
- Azure API Management Documentation
- Azure OpenAI Service Documentation
- If deployment fails, check:
  - Regional availability of services
  - Quota limits
  - Required permissions
- For application issues:
  - Review APIM diagnostic logs
  - Check Azure Monitor metrics
  - Validate environment variables
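For the environment-variable check, a small script along these lines can catch the most common mistakes before the app starts. It is a minimal sketch: the `findEnvProblems` helper is hypothetical, and it only assumes the five variable names listed in the generated `.env` file.

```javascript
// Hypothetical helper: reports variables that are missing, empty, or still
// set to a "<...>" placeholder. Variable names come from the generated .env file.
const REQUIRED_VARS = [
  "APIM_ENDPOINT",
  "API_SUFFIX",
  "API_VERSION",
  "DEPLOYMENT_ID",
  "SUBSCRIPTION_KEY",
];

function findEnvProblems(env) {
  const problems = [];
  for (const name of REQUIRED_VARS) {
    const value = String(env[name] ?? "").trim();
    if (value === "") {
      problems.push(`${name} is missing or empty`);
    } else if (/^<.*>$/.test(value)) {
      problems.push(`${name} still holds a placeholder value`);
    }
  }
  return problems;
}

// Example with a deliberately incomplete configuration.
const sampleProblems = findEnvProblems({
  APIM_ENDPOINT: "https://example-apim.azure-api.net",
  API_SUFFIX: "openai",
  API_VERSION: "2024-02-01",
  DEPLOYMENT_ID: "<Your Deployment Name>",
});
console.log(sampleProblems);
```

In the app itself you would pass `process.env` after the `.env` file has been loaded, and stop with a clear message if the returned list is not empty.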
If you can't find a solution to your problem, please open an issue in this repository.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines.
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.