Domain 1 of 5 | 25–30% of Exam | Azure AI Apps and Agents Developer Associate
Highlighted row = this study page. All percentages are approximate.
| # | Domain | Weight |
|---|---|---|
| 1 | 🎯 Plan and manage an Azure AI solution (THIS PAGE) | 25–30% |
| 2 | Implement generative AI and agentic solutions | 30–35% |
| 3 | Implement computer vision solutions | 10–15% |
| 4 | Implement text analysis solutions | 10–15% |
| 5 | Implement information extraction solutions | 10–15% |
The Microsoft Certified: Azure AI Apps and Agents Developer Associate (AI-103) validates expertise in building, deploying, and managing intelligent AI applications and autonomous agents on the Microsoft Azure platform, specifically using the Azure AI Foundry portal and platform.
The exam was launched in beta in April 2026 and covers the full lifecycle: from selecting and provisioning AI models, securing access and network endpoints, monitoring usage and costs, ensuring responsible AI compliance, all the way through CI/CD pipelines for agentic AI solutions.
Azure AI Studio → Azure AI Foundry: Microsoft rebranded Azure AI Studio as Azure AI Foundry in late 2024. The AI-103 exam tests the new hub/project model and Foundry-specific features. If you hold AI-102, expect significant differences in tooling terminology.
This page covers all eight sub-topic clusters tested in Domain 1:
Comprehensive study material for all Domain 1 sub-topics
Azure AI Foundry is the rebranded and expanded version of Azure AI Studio (renamed in late 2024). It is the unified platform for discovering, building, testing, and deploying AI models and solutions on Azure. The Foundry portal integrates model catalog browsing, project management, prompt flow authoring, evaluation, and monitoring into a single interface.
Exam tip: If a question references "Azure AI Studio," treat it as the pre-Foundry name for the same platform. Exam questions will predominantly use "Azure AI Foundry" terminology.
The Foundry organizational hierarchy has two main levels:
The model catalog aggregates models from multiple providers into a single discovery surface:
Models can be deployed via serverless API (pay-per-token, no compute provisioning) or as managed compute deployments (dedicated GPU instances with provisioned throughput).
Serverless API: No infrastructure to manage. Pay per token consumed. Best for variable or low-volume workloads. Model hosted by Microsoft. Shared capacity.
Provisioned Throughput: Reserved capacity with guaranteed tokens-per-minute (TPM). Best for production workloads needing predictable latency. Billed hourly regardless of usage.
| Model | Best For | Key Trait |
|---|---|---|
| GPT-4o | Multimodal (text + vision) | Balanced speed & capability; default for most production apps |
| GPT-4 Turbo | Long-context reasoning | 128K context window; older but still widely used |
| GPT-4o-mini | Cost-sensitive tasks | Faster, cheaper; smaller context but good for classification/extraction |
| o1 / o3 series | Complex reasoning | "Thinking" models with internal chain-of-thought; slower, more expensive |
| text-embedding-3-large | RAG / vector search | High-dimensional embeddings for semantic similarity |
When deploying an Azure OpenAI model, you choose one of four deployment types: Global Standard, Standard, Provisioned, or Batch.
Exam trap: "Provisioned" and "Global Standard" are often confused. Provisioned = predictable SLA + hourly billing. Global Standard = best effort + pay-per-token. If a question asks for guaranteed throughput, choose Provisioned.
TPM (Tokens Per Minute): The total number of tokens (input + output) your deployment can process per minute. It determines how large and how many requests you can handle. Exceeding TPM results in HTTP 429 (rate limit) errors.
RPM (Requests Per Minute): The maximum number of individual API calls per minute. It is derived automatically from your TPM allocation (approximately 6 RPM per 1,000 TPM for most models, so 60K TPM gives about 360 RPM).
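A rough sketch of how the two limits interact, assuming the commonly documented ratio of 6 RPM per 1,000 TPM (the ratio can differ by model, and the function names here are illustrative rather than part of any SDK):

```python
from typing import Optional


def rpm_from_tpm(tpm: int, rpm_per_1k_tpm: int = 6) -> int:
    """Derive the requests-per-minute limit from a TPM allocation.

    The default ratio of 6 RPM per 1,000 TPM is an assumption; check the
    quota page for the specific model.
    """
    return tpm * rpm_per_1k_tpm // 1000


def limit_hit(tpm_limit: int, requests_per_min: int,
              tokens_per_request: int) -> Optional[str]:
    """Return which limit a steady workload would exceed, or None."""
    if requests_per_min > rpm_from_tpm(tpm_limit):
        return "RPM"  # many small requests
    if requests_per_min * tokens_per_request > tpm_limit:
        return "TPM"  # a few large requests
    return None
```

With a 60K TPM deployment, 400 tiny requests per minute would exceed the derived RPM limit, while 20 requests of 4,000 tokens each would exceed TPM instead. Either case would surface as HTTP 429.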
Azure OpenAI includes built-in content filters that score input and output content across 4 harm categories: hate, sexual, violence, and self-harm.
Each category uses a severity scale from 0 (safe) to 7 (severe). You configure a threshold per category (Low=2, Medium=4, High=6). Content scoring at or above threshold is blocked.
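The threshold rule can be sketched in a few lines. This is only an illustration of the comparison logic under the Low=2, Medium=4, High=6 mapping stated above; the dictionary shapes and function name are hypothetical and do not mirror the service's configuration format.

```python
# Threshold names mapped to the severity at which blocking starts.
THRESHOLDS = {"low": 2, "medium": 4, "high": 6}


def blocked_categories(severities: dict, config: dict) -> list:
    """Return the categories whose severity is at or above the configured threshold.

    severities: category -> severity score (0 to 7)
    config:     category -> "low" | "medium" | "high"
    """
    return sorted(
        category
        for category, severity in severities.items()
        if severity >= THRESHOLDS[config[category]]
    )


scores = {"hate": 2, "sexual": 0, "violence": 4, "self_harm": 5}
settings = {category: "medium" for category in scores}
print(blocked_categories(scores, settings))
```

Each category is compared against its own threshold, so a single category at or above its limit is enough to block the content.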
Additional filter features: custom blocklists (block specific terms/phrases), jailbreak detection (prompt injection attempts), protected material detection (copyright content), groundedness detection (hallucination in RAG responses).
Azure OpenAI models have explicit versions (e.g., gpt-4o-2024-11-20). When creating a deployment you can choose:
Why managed identity over API keys? Managed identities authenticate via Microsoft Entra ID (formerly Azure AD). No credentials stored in code or config files. Tokens are automatically rotated. Supports audit logging. This is the preferred production pattern.
| Role | Permissions | When to Use |
|---|---|---|
| Cognitive Services OpenAI User | Call inference endpoints; cannot view keys or manage deployments | Application service accounts, end-user authenticated apps |
| Cognitive Services OpenAI Contributor | Create/manage deployments; view keys; fine-tuning | Developers building and testing models |
| Cognitive Services Contributor | Full resource management except IAM | DevOps/MLOps engineers |
| Owner | All permissions including IAM role assignments | Admins only; follow least privilege |
Key-based authentication: Uses the resource API key, sent in the api-key request header (Ocp-Apim-Subscription-Key for other Azure AI services). Simple to implement, but it carries risks: keys can be leaked, there is no per-caller audit trail, and revocation requires key rotation, which impacts all callers.
Entra ID token authentication: Uses OAuth 2.0 bearer tokens issued by Microsoft Entra ID. Benefits: per-caller identity in audit logs, token expiry (short-lived), revoke by disabling identity, works with managed identities (no credential management).
Best practice rule: Always prefer Entra ID token-based authentication for production. Reserve key-based auth for development/testing or third-party integrations that cannot use managed identities.
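At the HTTP level, the two authentication styles differ only in which header carries the credential. The helper below is a hypothetical sketch, not an SDK function; in a real application the bearer token would typically come from the azure-identity package, for example `DefaultAzureCredential().get_token("https://cognitiveservices.azure.com/.default").token`.

```python
from typing import Optional


def auth_headers(api_key: Optional[str] = None,
                 bearer_token: Optional[str] = None) -> dict:
    """Build request headers for an Azure OpenAI REST call.

    Prefers the Entra ID bearer token when both credentials are supplied,
    matching the "tokens for production" guidance.
    """
    if bearer_token:
        return {"Authorization": f"Bearer {bearer_token}"}
    if api_key:
        return {"api-key": api_key}
    raise ValueError("Provide either an API key or a bearer token")
```

Keeping the choice in one place makes it easier to move from key-based auth in development to token-based auth in production without touching the calling code.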
Store all secrets (API keys, connection strings, third-party credentials) in Azure Key Vault. Foundry hubs support a connected Key Vault that stores secrets for all connected resources. Applications retrieve secrets at runtime using managed identity, never hardcoded values.
Key Vault access policies vs RBAC: Modern pattern uses Azure RBAC (Key Vault Secrets User role) rather than legacy access policies.
Defender for Cloud provides security posture management and threat protection for AI services. Key capabilities: configuration recommendations (e.g., "enable private endpoint"), threat detection alerts (e.g., unusual spike in token consumption indicating key compromise), regulatory compliance dashboards.
By default, Azure AI services accept connections from the public internet (over HTTPS). For production environments with strict data security requirements, you should disable public access and use private endpoints.
A private endpoint assigns a private IP address from your VNet to the Azure AI service, making all traffic traverse the Azure backbone network without traversing the public internet. The service appears as if it lives inside your VNet.
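One quick way to confirm that a private endpoint is in effect is to resolve the service hostname from a machine inside the VNet and check that the answer is a private address. The sketch below uses only the standard library; the function name is illustrative, and the resolver is injectable so the logic can be exercised without network access.

```python
import ipaddress
import socket


def resolves_privately(hostname: str, resolver=socket.gethostbyname) -> bool:
    """Return True if the hostname resolves to a private (RFC 1918 style) IP."""
    address = ipaddress.ip_address(resolver(hostname))
    return address.is_private
```

Run from inside the VNet, `resolves_privately("myaccount.openai.azure.com")` should return True once the private endpoint and its DNS records are configured; from the public internet the same name would resolve to a public address.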
myaccount.openai.azure.com resolves to the private IP (not the public IP) from within the VNet. Zone: privatelink.openai.azure.com
Azure AI Foundry components (compute instances, managed online endpoints) need outbound access to specific Microsoft endpoints for model downloads, container registry pulls, and monitoring data. In locked-down VNets, you must add allowlist rules for:
Model deployment to Azure AI Foundry can be automated using Azure DevOps pipelines. A typical pipeline includes stages: validate configuration → evaluate model → deploy to staging → run integration tests → promote to production.
GitHub Actions workflows authenticate to Azure using either a Service Principal (client secret or certificate) or Workload Identity Federation (OIDC; preferred, no long-lived secrets). The azure/login action handles authentication, then subsequent actions deploy AI resources.
Azure AI Foundry Prompt Flow is a development framework for building LLM-based applications. In a CI/CD context, prompt flows can be:
Before promoting a new model version or prompt configuration to production, evaluation gates check metrics such as:
If evaluation scores fall below defined thresholds, the pipeline fails the gate and blocks deployment. This is the primary mechanism to prevent quality regressions in AI application delivery.
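An evaluation gate can be as small as a script that compares scores against thresholds and returns a non-zero exit code on failure. The sketch below assumes the evaluation step has already produced a dictionary of metric scores; the metric names and threshold values are illustrative.

```python
def failed_metrics(scores: dict, thresholds: dict) -> dict:
    """Return metrics scoring below their threshold (missing metrics fail)."""
    failures = {}
    for metric, minimum in thresholds.items():
        score = scores.get(metric)
        if score is None or score < minimum:
            failures[metric] = score
    return failures


def run_gate(scores: dict, thresholds: dict) -> int:
    """Print any failures and return a process exit code (0 = pass, 1 = fail)."""
    failures = failed_metrics(scores, thresholds)
    for metric, score in failures.items():
        print(f"GATE FAILED: {metric}={score} (minimum {thresholds[metric]})")
    return 1 if failures else 0
```

A pipeline step would call `sys.exit(run_gate(scores, thresholds))`, so a failing gate stops the run before the deployment stage.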
Blue-green deployment maintains two identical production environments. The "blue" environment runs the current model; "green" is the new version. Traffic is switched to green only after validation. Rollback is instant: switch traffic back to blue. In Azure AI Foundry, implement using traffic splitting on managed online endpoints (e.g., 90% to blue, 10% to green during canary testing).
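Traffic splitting is usually expressed as a mapping from deployment name to a percentage that sums to 100. The sketch below generates a hypothetical rollout schedule; the deployment names "blue" and "green" and the step sizes are assumptions, and applying each step to an endpoint is left to the deployment tooling.

```python
def canary_steps(green_shares=(10, 50, 100)) -> list:
    """Build successive traffic maps that shift load from blue to green."""
    steps = []
    for green in green_shares:
        if not 0 <= green <= 100:
            raise ValueError(f"traffic share out of range: {green}")
        steps.append({"blue": 100 - green, "green": green})
    return steps


# Rolling back at any point means restoring all traffic to blue.
ROLLBACK = {"blue": 100, "green": 0}

for traffic in canary_steps():
    print(traffic)
```

The first step matches the 90/10 canary split mentioned above; the final step completes the switch to green.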
Use Bicep or ARM templates to provision Azure AI Foundry hubs, projects, Azure OpenAI accounts, and deployments repeatably across environments. Key resources to define in IaC:
Enable Diagnostic Settings on Azure OpenAI and AI Foundry resources to route logs and metrics to:
Key log categories to enable: RequestResponse, Audit, ChatCompletions
In Log Analytics, Azure OpenAI logs appear in the AzureDiagnostics table (legacy) or resource-specific tables. Key KQL queries for exam scenarios:
Application Insights tracks end-to-end request traces for AI applications built on top of Azure OpenAI. Key telemetry: request latency (dependency tracking to OpenAI endpoint), token usage per request, failure rates, dependency failures.
The Azure OpenAI SDK automatically integrates with Application Insights when configured with a connection string. Use distributed tracing to correlate frontend requests → your app → OpenAI API calls.
Azure AI Foundry provides built-in tracing for prompt flows and agent runs. Tracing captures:
OpenTelemetry standard: Foundry tracing is built on OpenTelemetry. Traces can be exported to Application Insights or any OTLP-compatible backend. Look for questions about "spans" or "traces" in monitoring context.
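To make the trace/span vocabulary concrete, the sketch below models a trace as a tree of spans using plain dataclasses. It is a conceptual illustration only, not the OpenTelemetry API, and the span names and durations are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """One timed operation; a trace is the tree rooted at the top-level span."""
    name: str
    duration_ms: float
    children: list = field(default_factory=list)


def flatten(span: Span, depth: int = 0):
    """Yield (depth, span) pairs in depth-first order."""
    yield depth, span
    for child in span.children:
        yield from flatten(child, depth + 1)


def slowest_leaf(root: Span) -> Span:
    """Find the leaf span (no children) that took the longest."""
    leaves = [s for _, s in flatten(root) if not s.children]
    return max(leaves, key=lambda s: s.duration_ms)


trace = Span("flow_run", 1200, [
    Span("retrieval", 150),
    Span("llm_call", 900),
    Span("tool_call", 100),
])
print(slowest_leaf(trace).name)
```

Reading a real trace works the same way: the root span covers the whole flow or agent run, and the child spans show which step (retrieval, model call, tool call) accounts for the latency.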
Use Azure Cost Management to analyze AI service spending. Tag Azure OpenAI deployments with project/team/environment tags to enable granular cost attribution. Set Azure Budgets with alert thresholds (e.g., alerts at 80% and 100% of the monthly budget). Note: budgets do not automatically throttle or stop Azure services; they only alert. Actual throttling requires quota management at the service level.
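The alert-only behavior of budgets can be illustrated with a small calculation. This sketch just reports which percentage thresholds the current spend has crossed; the function name and default thresholds are illustrative, and nothing in it stops or throttles a service.

```python
def triggered_alerts(spend: float, budget: float,
                     thresholds_pct=(80, 100)) -> list:
    """Return the budget thresholds (in percent) that current spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    used_pct = spend / budget * 100
    return [t for t in thresholds_pct if used_pct >= t]
```

For a 1,000-unit monthly budget, a spend of 850 would trigger only the 80% alert, while 1,200 would trigger both alerts and the workload would keep running.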
| Principle | Definition | Example Violation |
|---|---|---|
| Fairness | AI systems should treat all people equitably, without bias based on protected characteristics | Loan model approves applications at lower rates for one ethnic group |
| Reliability & Safety | AI should behave as intended, be safe to use, and handle unexpected inputs gracefully | Autonomous agent crashes production system during an edge case |
| Privacy & Security | AI should respect data privacy, handle personal data responsibly, and resist attacks | LLM reveals PII from training data in outputs |
| Inclusiveness | AI should empower and benefit all people, including those with disabilities and from all backgrounds | Voice assistant only works well for native English speakers |
| Transparency | AI systems and their limitations should be understandable; users should know when they're interacting with AI | Chatbot claims to be human without disclosure |
| Accountability | People and organizations should be accountable for AI systems and their impacts | No process to audit or appeal AI-driven decisions |
A dedicated Azure AI service (separate from Azure OpenAI content filters) for analyzing text and images for harmful content. Used for user-generated content moderation, custom model output validation, and multilingual content analysis.
Harm categories and severity scale:
Before deploying AI systems that could significantly impact people, Microsoft recommends completing a Responsible AI Impact Assessment. This structured document captures: intended use cases, potential harms and affected groups, mitigation measures implemented, monitoring plan post-deployment, and escalation processes. Required for high-risk AI applications; strongly recommended for all customer-facing AI.
Each Azure subscription gets a default TPM quota per model per region. Quota is model-specific (GPT-4o has different quota limits than GPT-4o-mini). Default quotas are typically 30K–240K TPM depending on subscription type and region.
Quota is regional: A 100K TPM quota for GPT-4o in East US is separate from a 100K TPM quota in West Europe. Quota cannot be automatically moved between regions. A Global Standard deployment draws from a global pool rather than a regional pool.
If default quota is insufficient, submit a quota increase request through the Azure portal (Subscription > Usage + Quotas > Request Increase). For Azure OpenAI, quota increases go through a separate review process and may not be immediate. Best practice: submit requests 2–4 weeks before planned production launch.
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| GPT-4o | ~$2.50 | ~$10.00 |
| GPT-4o-mini | ~$0.15 | ~$0.60 |
| o1 | ~$15.00 | ~$60.00 |
| text-embedding-3-large | ~$0.13 | N/A |
* Prices are approximate and subject to change. Exam tests concepts, not exact prices.
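A rough cost estimate multiplies token volume by the per-million price for input and output separately. The sketch below uses the approximate prices from the table above, so treat its results as ballpark figures only; the function and dictionary names are illustrative.

```python
# Approximate (input, output) USD prices per 1M tokens, from the table above.
PRICES_PER_1M = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "o1": (15.00, 60.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate spend in USD for a given token volume."""
    input_price, output_price = PRICES_PER_1M[model]
    total = input_tokens * input_price + output_tokens * output_price
    return round(total / 1_000_000, 2)


# Example: 10M input tokens and 2M output tokens per month.
for model in PRICES_PER_1M:
    print(model, estimate_cost(model, 10_000_000, 2_000_000))
```

For the same volume, GPT-4o comes to about $45 and GPT-4o-mini to about $2.70, which is why model choice is often the largest cost lever.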
Tag resources with project, environment, and team for granular cost reporting and chargeback.
Provisioned Throughput Units (PTUs) can be purchased as reservations (1-year or 3-year) for significant discounts vs pay-as-you-go hourly rates. Reservations commit to a fixed PTU count for the reservation term. Best for stable, high-volume production workloads with predictable usage patterns.
Memorable patterns to lock in Domain 1 concepts for exam day
Memory sentence: "Fairly Reliable Products Include Transparent Accountability"
Severity scale: 0 (safe) → 2 (low) → 4 (medium) → 6 (high) → 7 (severe). Each category is evaluated independently.
Use Key-based auth when: Rapid prototyping, third-party tools that can't do OAuth, simple local testing.
Use Entra ID Token auth when: Production environment, needs audit trail (who called what), managed identity available, revocability required without impacting other callers.
Think: A key fits any lock: simple, but risky if copied. A token is like a badge with your photo: tied to your identity, it expires and can be deactivated without changing the lock.
Multiple projects can share one hub. One project belongs to exactly one hub. Projects can have project-specific connected resources in addition to hub-level ones.
You can hit the RPM limit with tiny requests, OR hit the TPM limit with a few large ones. Rate limiting (HTTP 429) triggers when either limit is exceeded.
Public internet access = using the front door (anyone can see you). Private endpoint = using the VIP back entrance through Azure's backbone (private, invisible to internet).
Exam trick: If a question says "the application should call Azure OpenAI," the answer is almost always "Cognitive Services OpenAI User," not Contributor or Owner.
All scored 1–5 in Azure AI Foundry's built-in evaluation. Groundedness is most critical for RAG-based applications.
Select your background to get a tailored Domain 1 study plan
You already know Azure AI concepts. Focus on what's new or changed in AI-103 vs AI-102.
AI-102 uses "Cognitive Services resources" and "Azure AI Studio projects." AI-103 uses the new Foundry hub + project hierarchy. Understand what moves from resource-level to hub-level (networking, security) vs what's project-level (deployments, flows).
AI-102 focused on provisioning Azure OpenAI resources. AI-103 tests knowledge of Global Standard vs Standard vs Provisioned (PTU) vs Batch deployment types and when to choose each. This is heavily tested.
AI-103 introduces Azure AI Foundry's built-in tracing based on OpenTelemetry. Understand the difference between traces (end-to-end) and spans (individual operations), and how this differs from Application Insights monitoring you knew from AI-102.
AI-103 has much stronger emphasis on DevOps integration: GitHub Actions, Azure DevOps pipelines, IaC with Bicep/ARM, blue-green deployment for models, and evaluation gates in CI pipelines. AI-102 barely touched this.
The 6 principles are the same as AI-102 but AI-103 adds groundedness detection, protected material detection, and prompt injection detection from Azure AI Content Safety. Understand the distinction between Content Safety service vs Azure OpenAI built-in filters.
TPM/RPM concepts carry over from AI-102 but you now need to know provisioned throughput reservations, PTU pricing model, and the distinction between budget alerts (no enforcement) vs spending limits (enforcement, limited subscription types).
Build your knowledge from the ground up with this structured sequence.
Spend time actually exploring the Azure AI Foundry portal (free trial available). Understand the hub/project hierarchy, model catalog, and connected resources before attempting any exam questions. Hands-on beats reading for this topic.
Understand Microsoft Entra ID (formerly Azure AD), managed identities (system vs user-assigned), and Azure RBAC roles. This foundational knowledge is required for AI-103's security topics. Complete Microsoft Learn's "Describe Azure identity, access, and security" module first.
The 4 deployment types (Global Standard, Standard, Provisioned, Batch) are heavily tested. Create a comparison table mapping each deployment type to: pricing model, throughput guarantee, data residency, and use case. Memorize these before the exam.
Networking concepts trip up many candidates. Understand the difference between service endpoints and private endpoints, why private endpoints are preferred for AI services, and the role of Private DNS zones in making private endpoints work correctly.
Use the FRPITA mnemonic from the Memory Hooks tab. Practice identifying which principle is violated in given scenarios. Fairness (demographic bias), Reliability (edge case failures), Privacy (data leakage), Inclusiveness (accessibility gaps), Transparency (AI disclosure), Accountability (no audit process).
Learn which tool answers which question: Azure Monitor/Log Analytics for infrastructure metrics and audit logs; Application Insights for request tracing in your app code; Foundry Tracing for debugging prompt flows and agent runs at the span level.
Understand TPM vs RPM and how they affect pricing. Know that Azure Budgets alert but don't enforce for production subscriptions. Practice calculating rough cost estimates using token pricing for given model/volume scenarios.
You know how to code. Focus on Azure-specific patterns and operational concerns.
If you're currently using API keys in environment variables, understand why managed identities are superior for Azure deployments. Practice: App Service → System-assigned managed identity → Cognitive Services OpenAI User role → no key needed. Learn the DefaultAzureCredential class in the Azure Identity SDK.
You already know CI/CD, so apply it to AI model deployments. Learn Workload Identity Federation for GitHub Actions (OIDC, no service principal secrets). Study the Azure CLI commands for deploying Azure OpenAI model deployments. Understand evaluation gates: how to fail a pipeline when groundedness drops below threshold.
Learn Bicep/ARM templates for AI resources: Microsoft.CognitiveServices/accounts (Azure OpenAI), deployments sub-resource, private endpoints, diagnostic settings. Practice deploying a full Foundry hub+project stack with IaC. This is heavily relevant for MLOps scenarios in the exam.
Understand Prompt Flow as a DAG-based workflow for LLM applications. Learn how to version prompt flows in Git (YAML), run evaluation flows in code, and deploy flows as endpoints. The azure-ai-projects Python SDK is the primary interface for Foundry operations.
Your apps need the minimum required role. Application calling Azure OpenAI = Cognitive Services OpenAI User. Pipeline deploying a model = Cognitive Services Contributor. Never give your application Owner rights. Map each of your use cases to the correct RBAC role.
You likely know OpenTelemetry. Azure AI Foundry uses OTLP for its tracing. Understand traces (flow execution) vs spans (individual LLM calls, tool calls, retrieval steps). Know that Foundry traces export to Application Insights via OTLP exporter. Useful for debugging agent reasoning chains.
Azure AI Content Safety has its own SDK. As a developer, understand how to integrate it: analyze text before sending to LLM (input moderation), analyze LLM output before sending to users (output moderation), and use groundedness detection in RAG pipelines. Know the API response structure including severity scores per category.
Verified links for AI-103 exam preparation
Microsoft's official study guide with full exam objectives, skills measured breakdown, and recommended learning paths for the AI-103 exam.
learn.microsoft.com → AI-103 Study Guide
Complete documentation for the Azure AI Foundry platform including hub/project setup, model catalog, prompt flow, evaluations, and deployment options.
learn.microsoft.com → Azure AI Foundry
Documentation for Azure AI Content Safety service including harm categories, severity levels, custom blocklists, groundedness detection, and jailbreak detection.
learn.microsoft.com → AI Content Safety
Microsoft's official Responsible AI framework including the 6 principles, responsible AI practices, and the Responsible AI Impact Assessment template.
microsoft.com → Responsible AI
Official Microsoft certification page for Azure AI Apps and Agents Developer Associate. Register for the exam, view exam details, and access the sandbox environments.
learn.microsoft.com → AI-103 Certification
Reference for Azure OpenAI Service: model deployment, quotas, content filtering, fine-tuning, API reference, and pricing details.
learn.microsoft.com → Azure OpenAI Service
Documentation on private endpoints and Private Link for Azure AI services, including Private DNS zone configuration and VNet integration.
learn.microsoft.com → Azure Private Link