From 59cb6fcbd54911f9b8f400472d07732ab9380d17 Mon Sep 17 00:00:00 2001 From: Luke Murray <24467442+lukemurraynz@users.noreply.github.com> Date: Wed, 26 Nov 2025 09:43:29 +1300 Subject: [PATCH] Adjust name from Azure AI Foundry to Microsoft Foundry --- README.md | 4 +-- bicep/README.md | 2 +- bicep/infra/components/bing-search/main.bicep | 2 +- bicep/infra/components/bing-search/readme.md | 2 +- docs/AI-Landing-Zones-Cost-Guide.md | 2 +- docs/AI-Landing-Zones-Design-Checklist.md | 28 +++++++++---------- terraform/readme.md | 6 ++-- 7 files changed, 23 insertions(+), 23 deletions(-) diff --git a/README.md b/README.md index bc66f84..17c318c 100644 --- a/README.md +++ b/README.md @@ -41,8 +41,8 @@ The AI Landing Zone cover the following design areas which are across the Cloud ## Use cases & Scenarios The AI Landing Zone act as a foundation architecture which can be leveraged to deploy a particular AI usecase & scenario on Azure, such as the below, either with its default architecture or by extending it with additional Azure services as needed. -- Chat Using Azure AI Foundry -- Agents Using Azure AI Foundry +- Chat Using Microsoft Foundry +- Agents Using Microsoft Foundry - Document generation - Conversational Agent - Build your own copilot diff --git a/bicep/README.md b/bicep/README.md index 3cfba95..53eb870 100644 --- a/bicep/README.md +++ b/bicep/README.md @@ -4,7 +4,7 @@ This repository contains a **Bicep template** that is built on top of **Azure Ve ## Architecture - This architecture delivers a full **AI Landing Zone** built around **Azure AI Foundry**. The **AI Foundry Agent service**, together with **AI Search, Cosmos DB, Storage, and Key Vault**, operates securely and seamlessly. A dedicated **Azure Container Apps** environment enables custom **GenAI applications**, and supporting services cover configuration, data, and observability. Thanks to its modular design, you can deploy everything or only the components you need. 
+ This architecture delivers a full **AI Landing Zone** built around **Microsoft Foundry**. The **AI Foundry Agent service**, together with **AI Search, Cosmos DB, Storage, and Key Vault**, operates securely and seamlessly. A dedicated **Azure Container Apps** environment enables custom **GenAI applications**, and supporting services cover configuration, data, and observability. Thanks to its modular design, you can deploy everything or only the components you need. ![Architecture](./docs/architecture.png) *AI Landing Zone* diff --git a/bicep/infra/components/bing-search/main.bicep b/bicep/infra/components/bing-search/main.bicep index cbecb2f..1d9cdf7 100644 --- a/bicep/infra/components/bing-search/main.bicep +++ b/bicep/infra/components/bing-search/main.bicep @@ -1,5 +1,5 @@ metadata name = 'bing-search' -metadata description = 'Create-or-reuse a Bing Grounding account and its Cognitive Services connection to be used by Azure AI Foundry.' +metadata description = 'Create-or-reuse a Bing Grounding account and its Cognitive Services connection to be used by Microsoft Foundry.' @description('Conditional. The name of the Azure Cognitive Services account to be used for the Bing Search tool. Required if `enableBingSearchConnection` is true.') param accountName string diff --git a/bicep/infra/components/bing-search/readme.md b/bicep/infra/components/bing-search/readme.md index 880b885..98aa10e 100644 --- a/bicep/infra/components/bing-search/readme.md +++ b/bicep/infra/components/bing-search/readme.md @@ -1,6 +1,6 @@ # bing-search `[AiMl/LandingZoneComponentsBingSearch]` -Create-or-reuse a Bing Grounding account and its Cognitive Services connection to be used by Azure AI Foundry. +Create-or-reuse a Bing Grounding account and its Cognitive Services connection to be used by Microsoft Foundry. 
## Navigation diff --git a/docs/AI-Landing-Zones-Cost-Guide.md b/docs/AI-Landing-Zones-Cost-Guide.md index 788168e..74d57f8 100644 --- a/docs/AI-Landing-Zones-Cost-Guide.md +++ b/docs/AI-Landing-Zones-Cost-Guide.md @@ -213,7 +213,7 @@ ### **3. AI & COGNITIVE SERVICES** -#### **Azure AI Foundry / AI Services** +#### **Microsoft Foundry / AI Services** - **Service:** Azure AI Services (multi-service account) - **Region:** East US - **API Calls:** Variable (usage-based) diff --git a/docs/AI-Landing-Zones-Design-Checklist.md b/docs/AI-Landing-Zones-Design-Checklist.md index 17cd69f..ad5e127 100644 --- a/docs/AI-Landing-Zones-Design-Checklist.md +++ b/docs/AI-Landing-Zones-Design-Checklist.md @@ -3,35 +3,35 @@ | ID | Design Area | Design Consideration | Design Recommendation | References | |-------|-------|--------------|--------------|--------------| | C-R1 | Compute | Determine the compute options of your models, orchestrators, self hosted agents and application (frontend, backend & Ingestion) in an AI Landing Zone.| Recommended to standardize the compute options across all components to ensure consistency and ease of management. Leverage PaaS compute options such as Azure Container Apps, Azure Apps Service, Azure Kubernetes Service to optimize resource utilization and simplify management.| [Compute for Models](https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/foundry-models-overview#model-deployment-managed-compute-and-serverless-api-deployments)
[Compute for orchestrators, self hosted agents and application](https://learn.microsoft.com/en-us/azure/architecture/guide/technology-choices/compute-decision-tree) | -| CO-R1 | Cost | Familirize yourself with pricing & billing model of Azure AI Foundry and its services | Estimate costs before using Azure AI Foundry and its services for an AI application | [Manage Cost](https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/manage-costs) | +| CO-R1 | Cost | Familiarize yourself with the pricing & billing model of Microsoft Foundry and its services | Estimate costs before using Microsoft Foundry and its services for an AI application | [Manage Cost](https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/manage-costs) | | CO-R2 | Cost | Consider having PTU & PAYGO Endpoints endpoints. If you have predictable workloads, use AI commitment tiers in Azure AI services. For Azure OpenAI models, use [provisioned throughput units](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/provisioned-throughput) (PTUs), which can be less expensive than pay-as-you-go (consumption-based) pricing. | It's common to combine PTU endpoints and consumption-based endpoints for cost optimization. Use PTUs on the AI model primary endpoint and a secondary, consumption-based AI endpoint for spillover. | [Introduce a gateway for multiple Azure OpenAI instances](https://learn.microsoft.com/en-us/azure/architecture/ai-ml/guide/azure-openai-gateway-multi-backend#multiple-azure-openai-instances-in-a-single-region-and-single-subscription) | | CO-R3 | Cost | Consider the various Azure OpenAI deployment types in particular the global deployment type. | Azure OpenAI models allow you to use different deployment types. Global deployment offers lower cost-per-token pricing on certain OpenAI models. 
| [Deployment types](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/deployment-types) | -| CO-R4 | Cost | Cosider implementing auto shutdown policy for compute resources in non-prod environment.| Define and enforce a policy stating that AI resources must use the automatic shutdown feature on virtual machines and compute instances in Azure AI Foundry and Azure Machine Learning. Automatic shutdown is applicable to nonproduction environments and production workloads that you can take offline for certain periods of time. | | +| CO-R4 | Cost | Consider implementing an auto shutdown policy for compute resources in non-prod environments.| Define and enforce a policy stating that AI resources must use the automatic shutdown feature on virtual machines and compute instances in Microsoft Foundry and Azure Machine Learning. Automatic shutdown is applicable to nonproduction environments and production workloads that you can take offline for certain periods of time. | | | D-R1 | Data | Determine how stateful entities such as threads, messages, and runs created during usage along with files uploaded during Foundry Agent Service setup or as part of a message are managed| Recommended to use standard setup of Agent service and store data in your own Azure resources, giving you full ownership and control. | [Storing customer data](https://learn.microsoft.com/en-us/azure/ai-foundry/agents/concepts/standard-agent-setup#leveraging-your-own-resources-for-storing-customer-data) | | D-R2 | Data | Determine whether thread storage (conversation history, agent definitions), file storage (uploaded documents), vector search (embeddings and retrieval) will be shared by all project or separated by projects | Considering a project represents a distinct application or use case, it's recommended to separate these storage components by project to ensure data isolation and better manageability. 
| [Project-based resource management](https://learn.microsoft.com/en-us/azure/ai-foundry/agents/concepts/capability-hosts#recommended-setup) | | D-R3 | Data | In case you have Microsoft Fabric, determine how you would surface data from it into AI Foundry | Recommended to leverage the Microsoft Fabric data agent for this purpose | [Use the Microsoft Fabric data agent](https://learn.microsoft.com/en-us/azure/ai-foundry/agents/how-to/tools/fabric?pivots=portal) | -| G-R1 | Governance | Consider using built-in AI related policies for governance of AI resources and AI apps | Automate policy enforcement with Azure Policy to enforce policies automatically across AI deployments, reducing human error. Apply AI policies to each management group. | [Azure AI Foundry Policies](https://learn.microsoft.com/en-us/azure/ai-services/policy-reference?context=%2Fazure%2Fai-studio%2Fcontext%2Fcontext), [Azure Machine Learning Policies](https://learn.microsoft.com/en-us/azure/machine-learning/policy-reference), [Azure AI services Policies](https://learn.microsoft.com/en-us/azure/ai-services/policy-reference), [Azure AI Search Policies](https://learn.microsoft.com/en-us/azure/search/policy-reference), | +| G-R1 | Governance | Consider using built-in AI related policies for governance of AI resources and AI apps | Automate policy enforcement with Azure Policy to enforce policies automatically across AI deployments, reducing human error. Apply AI policies to each management group. 
| [Microsoft Foundry Policies](https://learn.microsoft.com/en-us/azure/ai-services/policy-reference?context=%2Fazure%2Fai-studio%2Fcontext%2Fcontext), [Azure Machine Learning Policies](https://learn.microsoft.com/en-us/azure/machine-learning/policy-reference), [Azure AI services Policies](https://learn.microsoft.com/en-us/azure/ai-services/policy-reference), [Azure AI Search Policies](https://learn.microsoft.com/en-us/azure/search/policy-reference) | | G-R2 | Governance | Review applicable industry standards such as the [NIST Artificial Intelligence Risk Management Framework (AI RMF)](https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf) and [NIST AI RMF Playbook](https://airc.nist.gov/AI_RMF_Knowledge_Base/Playbook) and ensure alignment and compliance with them. | Apply the regulatory compliance policy initiatives | [Regulatory compliance initiatives](https://learn.microsoft.com/en-us/azure/governance/policy/samples/#regulatory-compliance) | | G-R3 | Governance | Consider implementing [responsible AI standards](https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/ai/govern#assess-ai-organizational-risks) | Use the Responsible AI dashboard to generate reports around model outputs. | [Responsible AI dashboard](https://learn.microsoft.com/en-us/azure/machine-learning/concept-responsible-ai-dashboard) | | G-R4 | Governance | Consider implementing Azure AI Content Safety APIs that can be called for testing to facilitate content safety testing. | Use Azure AI Content Safety to define a baseline content filter for your approved AI models. This safety system runs both the prompt and completion for your model through a group of classification models. These classification models detect and help prevent the output of harmful content across a range of categories. Content Safety provides features like prompt shields, groundedness detection, and protected material text detection. It scans images and text. 
| [Azure AI Content Safety](https://learn.microsoft.com/en-us/azure/ai-services/content-safety/overview) | -| G-R5 | Governance | Govern model availability in your organization across applications and workload | Use Azure Policy to manage which specific models your teams are allowed to deploy from the Azure AI Foundry model catalog. You have the option to use a built-in policy or create a custom policy. Since this approach uses an allowlist, begin with an _audit_ effect. The _audit_ effect allows you to monitor the models your teams are using without restricting deployments. Only switch to the _deny_ effect once you understand the AI development and experimentation needs of workload teams, so you don't hinder their progress unnecessarily. If you switch a policy to _deny_, it doesn't automatically remove noncompliant models that teams have already deployed. You must remediate those models manually. | [Built-in policy](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/built-in-policy-model-deployment) | +| G-R5 | Governance | Govern model availability in your organization across applications and workload | Use Azure Policy to manage which specific models your teams are allowed to deploy from the Microsoft Foundry model catalog. You have the option to use a built-in policy or create a custom policy. Since this approach uses an allowlist, begin with an _audit_ effect. The _audit_ effect allows you to monitor the models your teams are using without restricting deployments. Only switch to the _deny_ effect once you understand the AI development and experimentation needs of workload teams, so you don't hinder their progress unnecessarily. If you switch a policy to _deny_, it doesn't automatically remove noncompliant models that teams have already deployed. You must remediate those models manually. 
| [Built-in policy](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/built-in-policy-model-deployment) | | I-R1 | Identity | Consider using managed identities with least privilege access. | Use managed identity on all supported Azure services. Grant least privilege access to application resources that need to access AI model endpoints. Secure Azure service-to-service interactions. Use managed identity to allow Azure services to authenticate to each other without managing credentials.| [Managed identity](https://learn.microsoft.com/entra/identity/managed-identities-azure-resources/overview) | | I-R2 | Identity | Leverage MFA and PIM for sensitive accounts.| Enable multifactor authentication (MFA) and prefer secondary administrative accounts or just-in-time access with Privileged Identity Management (PIM) for sensitive accounts. Limit control plane access using services like Azure Bastion as secure entry points into private networks. | [Multifactor authentication](https://learn.microsoft.com/entra/identity/authentication/tutorial-enable-azure-mfa), [Privileged Identity Management](https://learn.microsoft.com/entra/id-governance/privileged-identity-management/pim-configure) | -| I-R3 | Identity | Use Microsoft Entra ID for authentication | Wherever possible, eliminate static API keys in favor of Microsoft Entra ID for authentication. This step enhances security through centralized identity management and reduces secret management overhead. Also limit the distribution of API keys. Instead, prefer identities in Microsoft Entra ID over API keys for authentication. Audit the list of individuals with API key access to ensure it's current. | For authentication guidance
[Azure AI Foundry](https://learn.microsoft.com/azure/ai-studio/concepts/rbac-ai-studio), [Azure OpenAI](https://learn.microsoft.com/azure/ai-services/openai/how-to/managed-identity), [Azure AI services](https://learn.microsoft.com/azure/ai-services/authentication), [Azure Machine Learning](https://learn.microsoft.com/azure/machine-learning/how-to-setup-authentication). | +| I-R3 | Identity | Use Microsoft Entra ID for authentication | Wherever possible, eliminate static API keys in favor of Microsoft Entra ID for authentication. This step enhances security through centralized identity management and reduces secret management overhead. Also limit the distribution of API keys. Instead, prefer identities in Microsoft Entra ID over API keys for authentication. Audit the list of individuals with API key access to ensure it's current. | For authentication guidance
[Microsoft Foundry](https://learn.microsoft.com/azure/ai-studio/concepts/rbac-ai-studio), [Azure OpenAI](https://learn.microsoft.com/azure/ai-services/openai/how-to/managed-identity), [Azure AI services](https://learn.microsoft.com/azure/ai-services/authentication), [Azure Machine Learning](https://learn.microsoft.com/azure/machine-learning/how-to-setup-authentication). | | I-R4 | Identity | Use Conditional Access policies | Implement Conditional Access policies that respond to unusual sign-in activity or suspicious behavior. Use signals like user location, device state, and sign-in behavior to trigger extra verification steps. Require MFA for accessing critical AI resources to enhance security. Restrict access to AI infrastructure based on geographic locations or trusted IP ranges. Ensure that only compliant devices (those meeting security requirements) can access AI resources. | [Conditional Access policies](https://learn.microsoft.com/entra/id-protection/howto-identity-protection-configure-risk-policies) | -| I-R5 | Identity | Configure least privilege access | Implement least privilege access by implementing role-based access control (RBAC) to provide minimal access to data and services. Assign roles to users and groups based on their responsibilities. Use Azure RBAC to fine-tune access control for specific resources such as virtual machines and storage accounts. Ensure users have only the minimum level of access necessary to perform their tasks. Regularly review and adjust permissions to prevent privilege creep.| [Role-based access control for Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/rbac-azure-ai-foundry) | +| I-R5 | Identity | Configure least privilege access | Implement least privilege access by implementing role-based access control (RBAC) to provide minimal access to data and services. Assign roles to users and groups based on their responsibilities. 
Use Azure RBAC to fine-tune access control for specific resources such as virtual machines and storage accounts. Ensure users have only the minimum level of access necessary to perform their tasks. Regularly review and adjust permissions to prevent privilege creep.| [Role-based access control for Microsoft Foundry](https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/rbac-azure-ai-foundry) | | I-R6 | Identity | Disable key based access and only have access to AI Model endpoints using Microsoft Entra ID. | Secure external access to AI model endpoints. Require clients to authenticate using Microsoft Entra ID when accessing AI model endpoints. | | | M-R1 | Monitoring | Monitor [AI models](https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/ai/platform/management#manage-ai-models), [AI resources](https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/ai/platform/management#manage-ai-operations), [AI data](https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/ai/platform/management#manage-ai-data) of the workload | Implement monitoring to ensure that it remain aligned with applications and workload KPIs | | M-R2 | Monitoring | Review recommended alerts for AI in Azure Monitor Baseline Alerts.| Enable recommended alert rules to receive notifications of deviations that indicate a decline in workload health. | [Azure Monitor Baseline Alerts](https://azure.github.io/azure-monitor-baseline-alerts/patterns/artificial-intelligence/) | -| M-R3 | Monitoring | Monitor the performance of a generative AI application | For generative AI workloads, use Azure AI Foundry's built-in [evaluation](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/evaluation-metrics-built-in) and [manual](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/evaluate-prompts-playground) monitoring capabilities. Also monitor latency in response times or the accuracy of vector search results to enhance user experiences. 
In Azure AI Foundry, [enable tracing](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/develop/trace-local-sdk) to collect trace data for each request, aggregated metrics, and user feedback. Enable [diagnostic logging](https://learn.microsoft.com/en-us/azure/ai-services/diagnostic-logging) for each Azure AI service. | -| M-R4 | Monitoring | Have diagnostic settings configured to capture logs and metrics of all deployed resources to a log analytics workspace. | Use diagnostic settings to capture logs and metrics for all key services, such as Azure AI Foundry and [Azure AI services](https://learn.microsoft.com/en-us/azure/ai-services/diagnostic-logging). Specific services should capture audit logs and relevant service-specific logs. | -| M-R5 | Monitoring | Monitor model & data drift.| Track accuracy and data drift continuously in generative and nongenerative AI to ensure that models remain relevant. Monitoring can alert you when model predictions or large language model responses deviate from expected behavior. This deviation indicates a need for retraining or adjustment. Set up custom alerts to detect performance thresholds. This approach enables early intervention when problems arise. Use [evaluations in Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/evaluation-approach-gen-ai). | +| M-R3 | Monitoring | Monitor the performance of a generative AI application | For generative AI workloads, use Microsoft Foundry's built-in [evaluation](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/evaluation-metrics-built-in) and [manual](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/evaluate-prompts-playground) monitoring capabilities. Also monitor latency in response times or the accuracy of vector search results to enhance user experiences. 
In Microsoft Foundry, [enable tracing](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/develop/trace-local-sdk) to collect trace data for each request, aggregated metrics, and user feedback. Enable [diagnostic logging](https://learn.microsoft.com/en-us/azure/ai-services/diagnostic-logging) for each Azure AI service. | +| M-R4 | Monitoring | Have diagnostic settings configured to capture logs and metrics of all deployed resources to a log analytics workspace. | Use diagnostic settings to capture logs and metrics for all key services, such as Microsoft Foundry and [Azure AI services](https://learn.microsoft.com/en-us/azure/ai-services/diagnostic-logging). Specific services should capture audit logs and relevant service-specific logs. | +| M-R5 | Monitoring | Monitor model & data drift.| Track accuracy and data drift continuously in generative and nongenerative AI to ensure that models remain relevant. Monitoring can alert you when model predictions or large language model responses deviate from expected behavior. This deviation indicates a need for retraining or adjustment. Set up custom alerts to detect performance thresholds. This approach enables early intervention when problems arise. Use [evaluations in Microsoft Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/evaluation-approach-gen-ai). | | M-R6 | Monitoring | Leverage Azure Monitor insights and Azure network watcher for trouble shooting networking issues. | Use services such as Azure Monitor Network Insights and Azure Network Watcher to gain visibility into network performance and health. | -| R-R1 | Reliability | **Multi Region Disaster Recovery**: Consider Multi Region in at least two regions to provide high availability and ensure for Disaster Recovery.| Establish a policy for business continuity and disaster recovery for your AI endpoints and AI data. Configure baseline disaster recovery for resources that host your AI model endpoints. 
These resources include [Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/disaster-recovery), [Azure Machine Learning](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-high-availability-machine-learning), [Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/business-continuity-disaster-recovery), or Azure AI services. All Azure data stores, such as [Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/common/storage-disaster-recovery-guidance), [Azure Cosmos DB](https://learn.microsoft.com/en-us/azure/cosmos-db/online-backup-and-restore), and [Azure SQL Database](https://learn.microsoft.com/en-us/azure/azure-sql/accelerated-database-recovery), provide reliability and disaster recovery guidance that you should follow. Implement multi-region deployments to ensure high availability and resiliency for both generative and nongenerative AI systems For more information, see multi-region deployment in [Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/disaster-recovery#plan-for-multi-regional-deployment), [Azure Machine Learning](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-high-availability-machine-learning#plan-for-multi-regional-deployment), and [Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/business-continuity-disaster-recovery).| +| R-R1 | Reliability | **Multi Region Disaster Recovery**: Consider deploying across at least two regions to provide high availability and ensure Disaster Recovery.| Establish a policy for business continuity and disaster recovery for your AI endpoints and AI data. Configure baseline disaster recovery for resources that host your AI model endpoints. 
These resources include [Microsoft Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/disaster-recovery), [Azure Machine Learning](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-high-availability-machine-learning), [Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/business-continuity-disaster-recovery), or Azure AI services. All Azure data stores, such as [Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/common/storage-disaster-recovery-guidance), [Azure Cosmos DB](https://learn.microsoft.com/en-us/azure/cosmos-db/online-backup-and-restore), and [Azure SQL Database](https://learn.microsoft.com/en-us/azure/azure-sql/accelerated-database-recovery), provide reliability and disaster recovery guidance that you should follow. Implement multi-region deployments to ensure high availability and resiliency for both generative and nongenerative AI systems. For more information, see multi-region deployment in [Microsoft Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/disaster-recovery#plan-for-multi-regional-deployment), [Azure Machine Learning](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-high-availability-machine-learning#plan-for-multi-regional-deployment), and [Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/business-continuity-disaster-recovery).| | R-R1 | Resource Organization | Select regions based on the combination of regional availability of Azure Services and their configuration. | Before deployment, ensure that there's [availability in the region](https://azure.microsoft.com/explore/global-infrastructure/products-by-region/#products-by-region_tab5) for the AI resources that you need. Certain regions might not provide specific AI services or might have limited features, which can affect the functionality of your solution. This limitation can also affect the scalability of your deployment. 
For example, Azure OpenAI service availability can vary based on your deployment model. These deployment models include global standard, global provisioned, regional standard, and regional provisioned. Check the AI service to confirm that you have access to the necessary resources. || | R-R2 | Resource Organization | Review quota required to deploy the resources. | Consider the quota or subscription limits in your chosen region as your AI workloads grow. Azure services have regional subscription limits. These limits can affect large-scale AI model deployments, such as large inference workloads. To prevent disruptions, contact Azure support in advance if you foresee a need for extra capacity. || | R-R3 | Resource Organization | Consider Azure subscription and region quota limits | Align the resource organization with [Azure’s subscription quota limitations](https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/azure-subscription-service-limits) to avoid unexpected service disruptions. || -| R-R4 | Resource Organization | Consider scaling through multi-account and multi-project deployment.| Azure offers tools like Azure AI Foundry [Resource and projects](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/ai-resources) to enforce governance and security. Use an AI Foundry Resource per billing boundary to allocate costs across different teams. For more information, see [Manage AI deployments](https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/ai/manage#manage-ai-deployment). Use distinct AI Foundry resoruces to organize and manage AI artifacts like datasets, models, and experiments. AI Foundry resoruces centralize resource management and simplify access control. 
For example, use [projects](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/ai-resources#organize-work-in-projects-for-customization) within Azure AI Foundry to manage resources and permissions efficiently, facilitating collaboration while maintaining security boundaries. || +| R-R4 | Resource Organization | Consider scaling through multi-account and multi-project deployment.| Azure offers tools like Microsoft Foundry [Resource and projects](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/ai-resources) to enforce governance and security. Use an AI Foundry Resource per billing boundary to allocate costs across different teams. For more information, see [Manage AI deployments](https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/ai/manage#manage-ai-deployment). Use distinct AI Foundry resources to organize and manage AI artifacts like datasets, models, and experiments. AI Foundry resources centralize resource management and simplify access control. For example, use [projects](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/ai-resources#organize-work-in-projects-for-customization) within Microsoft Foundry to manage resources and permissions efficiently, facilitating collaboration while maintaining security boundaries. || | S-R1 | Security | Review the Microsoft Defender for Cloud Recommendations for compliance. | MDC can help [discover generative AI workloads](https://learn.microsoft.com/en-us/azure/defender-for-cloud/identify-ai-workload-model) and in [predeployment generative AI artifacts.](https://learn.microsoft.com/en-us/azure/defender-for-cloud/explore-ai-risk) Also [AI security posture management](https://learn.microsoft.com/en-us/azure/defender-for-cloud/ai-security-posture) in Microsoft Defender for Cloud can be used to automate detection and remediation of generative AI risks. Defender for Cloud provides a cost-effective approach for detecting configurations in your deployed resources that aren't secure. 
You should also enable [AI threat protection.](https://learn.microsoft.com/en-us/azure/defender-for-cloud/ai-threat-protection)| | | S-R2 | Security | Comply with Microsoft Cloud Security Baseline | Leverage [Azure security baselines](https://learn.microsoft.com/en-us/security/benchmark/azure/security-baselines-overview) and follow [Azure Service Guides](https://learn.microsoft.com/en-us/azure/well-architected/service-guides/?product=popular) for security guidance.| | | S-R3 | Security | Leverage Purview to secure data in an AI landing zone. | Sensitive data in AI workflows increases the risk of insider threats, data leaks and data oversharing. Use tools like [Microsoft Purview Insider Risk Management](https://learn.microsoft.com/en-us/purview/insider-risk-management)to assess enterprise-wide data risks and prioritize them based on data sensitivity. | | @@ -39,10 +39,10 @@ | S-R5 | Security | Monitor outputs and apply prompt shielding using AI Content Safety. | Regularly inspect the data returned by AI models to detect and mitigate risks associated with malicious or unpredictable user prompts. Implement [Prompt Shields](https://learn.microsoft.com/en-us/azure/ai-services/content-safety/concepts/jailbreak-detection) to scan text for the risk of a user input attack on generative Al models. | | | N-R1 | Networking | Leverrage DDoS in case the workload is public facing. In case of existing platform landing zone, the central DDoS service should be used instead. | [Azure DDoS Protection](https://learn.microsoft.com/en-us/azure/ddos-protection/ddos-protection-overview) should be enabled to safeguard AI services from potential disruptions and downtime caused by distributed denial of service attacks. Enable Azure DDoS protection at the virtual network level to defend against traffic floods targeting internet-facing applications.| | | N-R2 | Networking | Use a jumpbox that can be accessed through bastion. 
If an existing platform landing zone is in place, the central jumpbox and Bastion service should be used instead. | AI development access should use a jumpbox within the virtual network of the workload or through a connectivity hub virtual network. Use Azure Bastion to securely connect to virtual machines interacting with AI services. Azure Bastion provides secure RDP/SSH connectivity without exposing VMs to the public internet. Enable Azure Bastion to ensure encrypted session data and protect access through TLS-based RDP/SSH connections. | | -| N-R3 | Networking | Use Private endpoint for the AI resources and infact for all PaaS services. | No PaaS services or AI model endpoints should be accessible from the public internet. Private endpoints to provide private connectivity to Azure services within a virtual network. Private endpoints provide secure, private access to PaaS portals like Azure AI Foundry. For Azure AI Foundry, Configure the [managed virtual network](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/configure-managed-network) and use [private endpoints](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/configure-private-link). | | +| N-R3 | Networking | Use private endpoints for the AI resources and, in fact, for all PaaS services. | No PaaS services or AI model endpoints should be accessible from the public internet. Use private endpoints to provide private connectivity to Azure services within a virtual network. Private endpoints provide secure, private access to PaaS portals like Microsoft Foundry. For Microsoft Foundry, configure the [managed virtual network](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/configure-managed-network) and use [private endpoints](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/configure-private-link). | | | N-R4 | Networking | Use Network Security Groups in the AI Landing Zone on all virtual networks implemented as part of the architecture.
| Utilize [network security groups](https://learn.microsoft.com/en-us/azure/virtual-network/network-security-groups-overview) (NSGs) to define and apply access policies that govern inbound and outbound traffic to and from AI workloads. These controls can be used to implement the principle of least privilege, ensuring that only essential communication is permitted. | | | N-R5 | Networking | Use App Gateway or Azure Front Door with WAF in the AI Landing Zone for the application's public front end, based on regional or global deployment. | [Azure WAF](https://learn.microsoft.com/en-us/azure/web-application-firewall/overview) helps protect your AI workloads from common web vulnerabilities, including SQL injection and cross-site scripting attacks. Configure Azure WAF on [Application Gateway](https://learn.microsoft.com/en-us/azure/web-application-firewall/ag/ag-overview) for workloads that require enhanced security against malicious web traffic. For Azure AI Services, restrict access to select [virtual networks](https://learn.microsoft.com/en-us/azure/ai-services/cognitive-services-virtual-networks#scenarios) or use [private endpoints](https://learn.microsoft.com/en-us/azure/ai-services/cognitive-services-virtual-networks#use-private-endpoints). | | | N-R6 | Networking | Consider APIM as an AI gateway in the AI landing zone with AI Foundry. | The AI Landing Zone should use [Azure API Management](https://learn.microsoft.com/en-us/azure/api-management/genai-gateway-capabilities#backend-load-balancer-and-circuit-breaker) for load balancing API requests to AI endpoints. Consider using Azure API Management (APIM) as a generative AI gateway within your virtual networks. A generative AI gateway sits between your front end and the AI endpoints.
Combining Application Gateway, WAF policies, and APIM within the virtual network is an established [architecture](https://github.com/Azure/apim-landing-zone-accelerator/blob/main/scenarios/workload-genai/README.md#scenario-3-azure-api-management---generative-ai-resources-as-backend) in generative AI solutions. For more information, see [AI Hub architecture](https://github.com/Azure-Samples/ai-hub-gateway-solution-accelerator#ai-hub-gateway-landing-zone-accelerator) and [Deploy Azure API Management instance to multiple Azure regions](https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-deploy-multi-region). A [generative AI gateway](https://learn.microsoft.com/en-us/azure/api-management/genai-gateway-capabilities) allows you to track and throttle token usage, apply circuit breakers, and route requests to different AI endpoints to control costs.
_Consider a generative AI gateway for monitoring._ A reverse proxy like Azure API Management allows you to implement logging and monitoring that aren't native to the platform. API Management allows you to collect source IPs, input text, and output text. For more information, see [Implement logging and monitoring for Azure OpenAI Service language models](https://learn.microsoft.com/en-us/azure/architecture/ai-ml/openai/architecture/log-monitor-azure-openai). [Azure API Management](https://learn.microsoft.com/en-us/azure/architecture/ai-ml/guide/azure-openai-gateway-guide) (APIM) can help ensure consistent security across AI workloads. Use its built-in policies for traffic control and security enforcement. Integrate APIM with Microsoft Entra ID to centralize authentication and authorization and ensure that only authorized users or applications interact with your AI models. Ensure you configure least-privilege access on the [reverse proxy's managed identity](https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-use-managed-service-identity). For more information, see [AI authentication with APIM](https://learn.microsoft.com/en-us/azure/architecture/ai-ml/guide/azure-openai-gateway-custom-authentication#general-recommendations). | | | N-R7 | Networking | Use a firewall either in the AI Landing Zone or from the platform landing zone (preferred), along with a UDR to the Azure or third-party firewall. | [Azure Firewall](https://learn.microsoft.com/en-us/azure/firewall/overview) enforces security policies for outgoing traffic before it reaches the internet. Use it to control and monitor outgoing traffic, and enable SNAT to conceal internal IP addresses by translating private IPs to the firewall's public IP. This ensures secure and identifiable outbound traffic for better monitoring and security.
| | -| N-R8 | Networking | Use Private DNS Zones either in the AI Landing Zone or from the platform landing zone (preferred) for [integrated private endpoints with Private DNS Zones](https://learn.microsoft.com/en-us/azure/private-link/private-endpoint-dns-integration) for proper DNS resolution and successful private endpoint functionality. | Private DNS zones centralize and secure DNS management for accessing PaaS services within your AI network. Set up Azure policies that enforce private DNS zones and require private endpoints to ensure secure, internal DNS resolutions. If you don't have central Private DNS Zones, the DNS forwarding doesn't work until you add conditional forwarding manually. For example, see [using custom DNS](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-custom-dns) with Azure AI Foundry hubs and Azure Machine Learning workspace. Custom DNS servers manage PaaS connectivity within the network, bypassing public DNS. Configure private DNS zones in Azure to resolve PaaS service names securely and route all traffic through private networking channels. | | -| N-R9 | Networking | Restrict Outbound by default in the AI Landing Zone should provide guidance and implementation restricting outbound access by default. | Limiting outbound traffic from your AI model endpoints helps protect sensitive data and maintain the integrity of your AI models. For minimizing data exfiltration risks, restricting outbound traffic to approved services or fully qualified domain names (FQDNs) and maintain a list of trusted sources. You should only allow unrestricted internet outbound traffic if you need access to public machine learning resources but regularly monitor and update your systems. 
For more information, see [Azure AI services](https://learn.microsoft.com/en-us/azure/ai-services/cognitive-services-data-loss-prevention), [Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/configure-managed-network), and [Azure Machine Learning.](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-network-isolation-planning#allow-only-approved-outbound-mode) | \ No newline at end of file +| N-R8 | Networking | Use Private DNS Zones either in the AI Landing Zone or from the platform landing zone (preferred) for [integrated private endpoints with Private DNS Zones](https://learn.microsoft.com/en-us/azure/private-link/private-endpoint-dns-integration) for proper DNS resolution and successful private endpoint functionality. | Private DNS zones centralize and secure DNS management for accessing PaaS services within your AI network. Set up Azure policies that enforce private DNS zones and require private endpoints to ensure secure, internal DNS resolution. If you don't have central Private DNS Zones, DNS forwarding doesn't work until you add conditional forwarding manually. For example, see [using custom DNS](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-custom-dns) with Microsoft Foundry hubs and Azure Machine Learning workspaces. Custom DNS servers manage PaaS connectivity within the network, bypassing public DNS. Configure private DNS zones in Azure to resolve PaaS service names securely and route all traffic through private networking channels. | | +| N-R9 | Networking | Restrict outbound access by default in the AI Landing Zone, providing guidance and an implementation for doing so. | Limiting outbound traffic from your AI model endpoints helps protect sensitive data and maintain the integrity of your AI models. To minimize data exfiltration risks, restrict outbound traffic to approved services or fully qualified domain names (FQDNs) and maintain a list of trusted sources.
You should only allow unrestricted internet outbound traffic if you need access to public machine learning resources, and even then you should regularly monitor and update your systems. For more information, see [Azure AI services](https://learn.microsoft.com/en-us/azure/ai-services/cognitive-services-data-loss-prevention), [Microsoft Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/configure-managed-network), and [Azure Machine Learning](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-network-isolation-planning#allow-only-approved-outbound-mode). | \ No newline at end of file diff --git a/terraform/readme.md b/terraform/readme.md index 3c67a07..0e0449b 100644 --- a/terraform/readme.md +++ b/terraform/readme.md @@ -350,7 +350,7 @@ az resource list --resource-group ai-lz-rg-01 --output table 3. Click on `ai-lz-rg-01` 4. Explore the deployed resources -**Azure AI Foundry:** +**Microsoft Foundry:** 1. Go to [ai.azure.com](https://ai.azure.com) 2. You should see your AI Hub and Project @@ -465,7 +465,7 @@ terraform init 3. **Deploy AI Models:** - The deployment includes GPT-4o model deployment - - You can deploy additional models through Azure AI Foundry + - You can deploy additional models through Microsoft Foundry 4. **Set Up CI/CD:** - Integrate with Azure DevOps or GitHub Actions @@ -475,7 +475,7 @@ terraform init ## Additional Resources -- [Azure AI Foundry Documentation](https://learn.microsoft.com/azure/ai-studio/) +- [Microsoft Foundry Documentation](https://learn.microsoft.com/azure/ai-studio/) - [Terraform Azure Provider Documentation](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs) - [Azure AI Landing Zone Pattern](https://github.com/Azure/AI-Landing-Zones)
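The private-endpoint and private-DNS recommendations in the checklist above (N-R3, N-R8) can be sketched in the repository's Bicep style. This is a minimal illustration only, not part of this patch: all names, parameter values, and the API version are assumptions, and the `privatelink` zone shown is the one commonly used for Cognitive Services (AI Foundry) accounts.

```bicep
// Hypothetical sketch: a private endpoint for an AI Foundry (Cognitive
// Services) account, plus the DNS zone group that registers its record
// in a central privatelink Private DNS Zone (see N-R8).
param location string = resourceGroup().location
param subnetId string          // subnet that hosts private endpoints
param accountId string         // resource ID of the AI Foundry account
param privateDnsZoneId string  // e.g. zone 'privatelink.cognitiveservices.azure.com'

resource pe 'Microsoft.Network/privateEndpoints@2023-11-01' = {
  name: 'pep-foundry'
  location: location
  properties: {
    subnet: { id: subnetId }
    privateLinkServiceConnections: [
      {
        name: 'foundry-connection'
        properties: {
          privateLinkServiceId: accountId
          groupIds: [ 'account' ] // Cognitive Services sub-resource
        }
      }
    ]
  }
}

resource dnsGroup 'Microsoft.Network/privateEndpoints/privateDnsZoneGroups@2023-11-01' = {
  parent: pe
  name: 'default'
  properties: {
    privateDnsZoneConfigs: [
      {
        name: 'cognitiveservices'
        properties: { privateDnsZoneId: privateDnsZoneId }
      }
    ]
  }
}
```

With the zone group in place, clients inside the virtual network resolve the account's hostname to the private endpoint IP, while public network access on the account itself can be disabled, which is the posture N-R3 recommends.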