
Merge pull request #2518 from MicrosoftDocs/main
Publish to live, Sunday 4 AM PST, 1/26
ttorble authored Jan 26, 2025
2 parents c67b444 + e1eaa53 commit 9cc44af
Showing 18 changed files with 109 additions and 87 deletions.
8 changes: 6 additions & 2 deletions articles/ai-foundry/model-inference/concepts/endpoints.md
Original file line number Diff line number Diff line change
@@ -38,7 +38,11 @@ To learn more about how to create deployments see [Add and configure model deplo

## Azure AI inference endpoint

The Azure AI inference endpoint allows customers to use a single endpoint with the same authentication and schema to generate inference for the deployed models in the resource. This endpoint follows the [Azure AI model inference API](../../../ai-studio/reference/reference-model-inference-api.md) which all the models in Azure AI model inference support.
The Azure AI inference endpoint allows customers to use a single endpoint with the same authentication and schema to generate inference for the deployed models in the resource. This endpoint follows the [Azure AI model inference API](../../../ai-studio/reference/reference-model-inference-api.md), which all the models in Azure AI model inference support. It supports the following modalities:

* Text embeddings
* Image embeddings
* Chat completions
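
Because every modality shares the same base endpoint, requests differ only in the route appended to it. The following sketch is a hypothetical helper, not part of any SDK; the resource name is a placeholder and the URL pattern follows this article:

```python
# Hypothetical helper: composes the full Azure AI inference URL for a modality
# route. The resource name is a placeholder.
def inference_url(resource: str, route: str, api_version: str = "2024-05-01-preview") -> str:
    base = f"https://{resource}.services.ai.azure.com/models"
    return f"{base}{route}?api-version={api_version}"

# Same endpoint, different routes per modality:
print(inference_url("my-resource", "/embeddings"))         # text embeddings
print(inference_url("my-resource", "/images/embeddings"))  # image embeddings
print(inference_url("my-resource", "/chat/completions"))   # chat completions
```

The API version shown is the one used elsewhere in these articles; check the reference section for the version that applies to your resource.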

You can see the endpoint URL and credentials in the **Overview** section:

@@ -84,4 +88,4 @@ The Azure OpenAI endpoint is supported by the **OpenAI SDK (`AzureOpenAI` class)
## Next steps

- [Models](models.md)
- [Deployment types](deployment-types.md)
- [Deployment types](deployment-types.md)
4 changes: 2 additions & 2 deletions articles/ai-foundry/model-inference/how-to/inference.md
Original file line number Diff line number Diff line change
@@ -26,9 +26,9 @@ Azure AI services expose multiple endpoints depending on the type of work you're
> * Azure AI model inference endpoint
> * Azure OpenAI endpoint
The **Azure AI inference endpoint** allows customers to use a single endpoint with the same authentication and schema to generate inference for the deployed models in the resource. All the models support this capability. This endpoint follows the [Azure AI model inference API](../../../ai-studio/reference/reference-model-inference-api.md).
The **Azure AI inference endpoint** (usually with the form `https://<resource-name>.services.ai.azure.com/models`) allows customers to use a single endpoint with the same authentication and schema to generate inference for the deployed models in the resource. All the models support this capability. This endpoint follows the [Azure AI model inference API](../../../ai-studio/reference/reference-model-inference-api.md).

**Azure OpenAI** models deployed to AI services also support the Azure OpenAI API. This endpoint exposes the full capabilities of OpenAI models and supports more features like assistants, threads, files, and batch inference.
**Azure OpenAI** models deployed to AI services also support the Azure OpenAI API (usually with the form `https://<resource-name>.openai.azure.com`). This endpoint exposes the full capabilities of OpenAI models and supports more features like assistants, threads, files, and batch inference.

To learn more about how to apply the **Azure OpenAI endpoint** see [Azure OpenAI service documentation](../../../ai-services/openai/overview.md).
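
As a sketch, the choice between the two endpoint forms can be summarized as follows (the helper and resource name are illustrative only, not part of any SDK):

```python
# Illustrative only: returns the endpoint form described above for a resource.
def endpoint_for(resource: str, use_azure_openai_api: bool) -> str:
    if use_azure_openai_api:
        # Azure OpenAI endpoint: full OpenAI capabilities (assistants, files, batch).
        return f"https://{resource}.openai.azure.com"
    # Azure AI inference endpoint: one URL and schema for all deployed models.
    return f"https://{resource}.services.ai.azure.com/models"

print(endpoint_for("contoso", use_azure_openai_api=False))
```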

Original file line number Diff line number Diff line change
@@ -26,7 +26,7 @@ from azure.ai.inference import ChatCompletionsClient
from azure.identity import DefaultAzureCredential

model = ChatCompletionsClient(
endpoint=os.environ["AZUREAI_ENDPOINT_URL"],
endpoint="https://<resource>.services.ai.azure.com/models",
credential=DefaultAzureCredential(),
model="mistral-large-2407",
)
@@ -48,7 +48,7 @@ import { isUnexpected } from "@azure-rest/ai-inference";
import { DefaultAzureCredential } from "@azure/identity";

const client = new ModelClient(
process.env.AZUREAI_ENDPOINT_URL,
"https://<resource>.services.ai.azure.com/models",
new DefaultAzureCredential(),
"mistral-large-2407"
);
@@ -80,7 +80,7 @@ Then, you can use the package to consume the model. The following example shows

```csharp
ChatCompletionsClient client = new ChatCompletionsClient(
new Uri(Environment.GetEnvironmentVariable("AZURE_INFERENCE_ENDPOINT")),
new Uri("https://<resource>.services.ai.azure.com/models"),
new DefaultAzureCredential(includeInteractiveCredentials: true),
"mistral-large-2407"
);
@@ -108,7 +108,7 @@ Then, you can use the package to consume the model. The following example shows
```java
ChatCompletionsClient client = new ChatCompletionsClientBuilder()
.credential(new DefaultAzureCredentialBuilder().build())
.endpoint("{endpoint}")
.endpoint("https://<resource>.services.ai.azure.com/models")
.model("mistral-large-2407")
.buildClient();
```
@@ -122,7 +122,7 @@ Use the reference section to explore the API design and which parameters are ava
__Request__

```HTTP/1.1
POST models/chat/completions?api-version=2024-04-01-preview
POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
Authorization: Bearer <bearer-token>
Content-Type: application/json
```
Original file line number Diff line number Diff line change
@@ -26,7 +26,7 @@ from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

model = ChatCompletionsClient(
endpoint=os.environ["AZUREAI_ENDPOINT_URL"],
endpoint="https://<resource>.services.ai.azure.com/models",
credential=AzureKeyCredential(os.environ["AZUREAI_ENDPOINT_KEY"]),
)
```
@@ -49,7 +49,7 @@ import { isUnexpected } from "@azure-rest/ai-inference";
import { AzureKeyCredential } from "@azure/core-auth";

const client = new ModelClient(
process.env.AZUREAI_ENDPOINT_URL,
"https://<resource>.services.ai.azure.com/models",
new AzureKeyCredential(process.env.AZUREAI_ENDPOINT_KEY)
);
```
@@ -76,7 +76,7 @@ Then, you can use the package to consume the model. The following example shows

```csharp
ChatCompletionsClient client = new ChatCompletionsClient(
new Uri(Environment.GetEnvironmentVariable("AZURE_INFERENCE_ENDPOINT")),
new Uri("https://<resource>.services.ai.azure.com/models"),
new AzureKeyCredential(Environment.GetEnvironmentVariable("AZURE_INFERENCE_CREDENTIAL"))
);
```
@@ -114,7 +114,7 @@ Use the reference section to explore the API design and which parameters are ava
__Request__

```HTTP/1.1
POST models/chat/completions?api-version=2024-04-01-preview
POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
Authorization: Bearer <bearer-token>
Content-Type: application/json
```
Original file line number Diff line number Diff line change
@@ -77,7 +77,7 @@ for (ChatChoice choice : chatCompletions.getChoices()) {
__Request__

```HTTP/1.1
POST models/chat/completions?api-version=2024-04-01-preview
POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
Authorization: Bearer <bearer-token>
Content-Type: application/json
```
Original file line number Diff line number Diff line change
@@ -26,7 +26,7 @@ from azure.ai.inference import EmbeddingsClient
from azure.core.credentials import AzureKeyCredential

client = EmbeddingsClient(
endpoint=os.environ["AZUREAI_ENDPOINT_URL"],
endpoint="https://<resource>.services.ai.azure.com/models",
credential=AzureKeyCredential(os.environ["AZUREAI_ENDPOINT_KEY"]),
)
```
@@ -39,7 +39,7 @@ from azure.ai.inference import EmbeddingsClient
from azure.identity import DefaultAzureCredential

client = EmbeddingsClient(
endpoint=os.environ["AZUREAI_ENDPOINT_URL"],
endpoint="https://<resource>.services.ai.azure.com/models",
credential=DefaultAzureCredential(),
)
```
@@ -62,7 +62,7 @@ import { isUnexpected } from "@azure-rest/ai-inference";
import { AzureKeyCredential } from "@azure/core-auth";

const client = new ModelClient(
process.env.AZUREAI_ENDPOINT_URL,
"https://<resource>.services.ai.azure.com/models",
new AzureKeyCredential(process.env.AZUREAI_ENDPOINT_KEY)
);
```
@@ -75,7 +75,7 @@ import { isUnexpected } from "@azure-rest/ai-inference";
import { DefaultAzureCredential } from "@azure/identity";

const client = new ModelClient(
process.env.AZUREAI_ENDPOINT_URL,
"https://<resource>.services.ai.azure.com/models",
new DefaultAzureCredential()
);
```
@@ -108,7 +108,7 @@ Then, you can use the package to consume the model. The following example shows

```csharp
EmbeddingsClient client = new EmbeddingsClient(
new Uri(Environment.GetEnvironmentVariable("AZURE_INFERENCE_ENDPOINT")),
new Uri("https://<resource>.services.ai.azure.com/models"),
new AzureKeyCredential(Environment.GetEnvironmentVariable("AZURE_INFERENCE_CREDENTIAL"))
);
```
@@ -117,7 +117,7 @@ For endpoint with support for Microsoft Entra ID (formerly Azure Active Director

```csharp
EmbeddingsClient client = new EmbeddingsClient(
new Uri(Environment.GetEnvironmentVariable("AZURE_INFERENCE_ENDPOINT")),
new Uri("https://<resource>.services.ai.azure.com/models"),
new DefaultAzureCredential(includeInteractiveCredentials: true)
);
```
@@ -131,7 +131,7 @@ Use the reference section to explore the API design and which parameters are ava
__Request__

```HTTP/1.1
POST models/embeddings?api-version=2024-04-01-preview
POST https://<resource>.services.ai.azure.com/models/embeddings?api-version=2024-05-01-preview
Authorization: Bearer <bearer-token>
Content-Type: application/json
```
Original file line number Diff line number Diff line change
@@ -53,7 +53,7 @@ Console.WriteLine($"Response: {response.Data.Embeddings}");
__Request__

```HTTP/1.1
POST models/embeddings?api-version=2024-04-01-preview
POST https://<resource>.services.ai.azure.com/models/embeddings?api-version=2024-05-01-preview
Authorization: Bearer <bearer-token>
Content-Type: application/json
```
Original file line number Diff line number Diff line change
@@ -122,7 +122,7 @@ try {
__Request__

```HTTP/1.1
POST /chat/completions?api-version=2024-04-01-preview
POST /chat/completions?api-version=2024-05-01-preview
Authorization: Bearer <bearer-token>
Content-Type: application/json
```
Original file line number Diff line number Diff line change
@@ -9,7 +9,7 @@ author: santiagxf

* An Azure subscription. If you're using [GitHub Models](https://docs.github.com/en/github-models/), you can upgrade your experience and create an Azure subscription in the process. Read [Upgrade from GitHub Models to Azure AI model inference](../how-to/quickstart-github-models.md) if that applies to you.

* An Azure AI services resource. For more information, see [Create an Azure AI Services resource](../../../ai-services/multi-service-resource.md?context=/azure/ai-services/model-inference/context/context).
* An Azure AI services resource. For more information, see [Create an Azure AI Services resource](../how-to/quickstart-create-resources.md).

* The endpoint URL and key.

Original file line number Diff line number Diff line change
@@ -28,18 +28,18 @@ To use chat completion models in your application, you need:

## Use chat completions

To use the text embeddings, use the route `/chat/completions` along with your credential indicated in `api-key`. `Authorization` header is also supported with the format `Bearer <key>`.
To use chat completions, use the route `/chat/completions` appended to the base URL, along with your credential passed in the `api-key` header. The `Authorization` header is also supported with the format `Bearer <key>`.

```http
POST /chat/completions
POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
Content-Type: application/json
api-key: <key>
```

If you have configured the resource with **Microsoft Entra ID** support, pass your token in the `Authorization` header:

```http
POST /chat/completions
POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
Content-Type: application/json
Authorization: Bearer <token>
```
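
The two authentication modes above differ only in which header carries the credential. A small illustrative helper (not part of any SDK) makes the contrast explicit:

```python
# Illustrative only: builds the authentication header for either mode above.
def auth_header(credential: str, use_entra_id: bool) -> dict:
    if use_entra_id:
        # Microsoft Entra ID: bearer token in the Authorization header.
        return {"Authorization": f"Bearer {credential}"}
    # Resource key in the api-key header.
    return {"api-key": credential}

print(auth_header("<key>", use_entra_id=False))
```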
@@ -287,8 +287,7 @@ Some models can create JSON outputs. Set `response_format` to `json_object` to e
The Azure AI Model Inference API allows you to pass extra parameters to the model. The following code example shows how to pass the extra parameter `logprobs` to the model.

```http
POST /chat/completions HTTP/1.1
Host: <ENDPOINT_URI>
POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
Authorization: Bearer <TOKEN>
Content-Type: application/json
extra-parameters: pass-through
@@ -565,7 +564,7 @@ Now, create a chat completion request with the image:

```json
{
"model": "mistral-large-2407",
"model": "phi-3.5-vision-instruct",
"messages": [
{
"role": "user",
@@ -597,7 +596,7 @@ The response is as follows, where you can see the model's usage statistics:
"id": "0a1234b5de6789f01gh2i345j6789klm",
"object": "chat.completion",
"created": 1718726686,
"model": "mistral-large-2407",
"model": "phi-3.5-vision-instruct",
"choices": [
{
"index": 0,
Original file line number Diff line number Diff line change
@@ -28,18 +28,18 @@ To use embedding models in your application, you need:

## Use embeddings

To use the text embeddings, use the route `/embeddings` along with your credential indicated in `api-key`. `Authorization` header is also supported with the format `Bearer <key>`.
To use text embeddings, use the route `/embeddings` appended to the base URL, along with your credential passed in the `api-key` header. The `Authorization` header is also supported with the format `Bearer <key>`.

```http
POST /embeddings
POST https://<resource>.services.ai.azure.com/models/embeddings?api-version=2024-05-01-preview
Content-Type: application/json
api-key: <key>
```

If you have configured the resource with **Microsoft Entra ID** support, pass your token in the `Authorization` header:

```http
POST /embeddings
POST https://<resource>.services.ai.azure.com/models/embeddings?api-version=2024-05-01-preview
Content-Type: application/json
Authorization: Bearer <token>
```
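
The request body then carries the text to embed. A minimal sketch of the payload follows; the model name is a placeholder, and the `model` and `input` fields follow the Azure AI model inference API:

```python
import json

# Placeholder deployment name; the payload shape follows the
# Azure AI model inference API embeddings route.
payload = {
    "model": "my-embedding-deployment",
    "input": ["The ultimate answer to the question of life"],
}
print(json.dumps(payload, indent=2))
```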
Original file line number Diff line number Diff line change
@@ -30,18 +30,18 @@ To use embedding models in your application, you need:

## Use embeddings

To use the text embeddings, use the route `/images/embeddings` along with your credential indicated in `api-key`. `Authorization` header is also supported with the format `Bearer <key>`.
To use image embeddings, use the route `/images/embeddings` appended to the base URL, along with your credential passed in the `api-key` header. The `Authorization` header is also supported with the format `Bearer <key>`.

```http
POST /images/embeddings
POST https://<resource>.services.ai.azure.com/models/images/embeddings?api-version=2024-05-01-preview
Content-Type: application/json
api-key: <key>
```

If you configured the resource with **Microsoft Entra ID** support, pass your token in the `Authorization` header:

```http
POST /images/embeddings
POST https://<resource>.services.ai.azure.com/models/images/embeddings?api-version=2024-05-01-preview
Content-Type: application/json
Authorization: Bearer <token>
```
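
The payload mirrors the text embeddings route, except each input carries a base64-encoded image. A minimal sketch follows; the deployment name and image bytes are placeholders, and the per-item `image` field is an assumption based on the image embeddings route:

```python
import base64
import json

# Placeholders: the image would normally be read from a file on disk.
image_bytes = b"\x89PNG\r\n\x1a\n..."  # not a real image
payload = {
    "model": "my-image-embedding-deployment",  # placeholder deployment name
    "input": [{"image": base64.b64encode(image_bytes).decode("ascii")}],
}
print(json.dumps(payload)[:80])
```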
2 changes: 2 additions & 0 deletions articles/ai-foundry/model-inference/index.yml
Original file line number Diff line number Diff line change
@@ -58,6 +58,8 @@ landingContent:
url: ./how-to/configure-content-filters.md
- text: Configure blocklists (preview)
url: ./how-to/use-blocklists.md
- text: Configure key-less authentication
url: ./how-to/configure-entra-id.md
- text: Manage cost
url: ./how-to/manage-costs.md
- text: Create resources
12 changes: 8 additions & 4 deletions articles/ai-foundry/model-inference/toc.yml
Original file line number Diff line number Diff line change
@@ -34,7 +34,7 @@ items:
href: ./concepts/deployment-types.md
- name: Model versions
href: ./concepts/model-versions.md
- name: Safety and compliance
- name: Responsible AI
items:
- name: Content filtering
href: ./concepts/content-filter.md
@@ -48,14 +48,18 @@ items:
href: ./how-to/github/create-model-deployments.md
- name: Connect your AI project
href: ./how-to/configure-project-connection.md
- name: Safety and compliance
- name: Responsible AI
items:
- name: Configure content filtering
href: ./how-to/configure-content-filters.md
- name: Use blocklists
href: ./how-to/use-blocklists.md
- name: Configure key-less authentication with Microsoft Entra ID
href: ./how-to/configure-entra-id.md
- name: Security & Governance
items:
- name: Configure key-less authentication
href: ./how-to/configure-entra-id.md
- name: Control model deployment with custom policies
href: /azure/ai-studio/how-to/custom-policy-model-deployment?context=/azure/ai-foundry/model-inference/context/context
- name: Manage cost
href: ./how-to/manage-costs.md
- name: Quotas and limits
5 changes: 5 additions & 0 deletions articles/ai-services/agents/concepts/model-region-support.md
Original file line number Diff line number Diff line change
@@ -21,11 +21,16 @@ Azure AI Agent Service supports the same models as the chat completions API in A

| **Region** | **gpt-4o**, **2024-05-13** | **gpt-4o**, **2024-08-06** | **gpt-4o-mini**, **2024-07-18** | **gpt-4**, **0613** | **gpt-4**, **1106-Preview** | **gpt-4**, **0125-Preview** | **gpt-4**, **turbo-2024-04-09** | **gpt-4-32k**, **0613** | **gpt-35-turbo**, **0613** | **gpt-35-turbo**, **1106** | **gpt-35-turbo**, **0125** | **gpt-35-turbo-16k**, **0613** |
|:--------------|:--------------------------:|:--------------------------:|:-------------------------------:|:-------------------:|:---------------------------:|:---------------------------:|:-------------------------------:|:-----------------------:|:--------------------------:|:--------------------------:|:--------------------------:|:------------------------------:|
| australiaeast | - | - | - | ✅ | ✅ | - | - | ✅ | ✅ | ✅ | ✅ | ✅ |
| eastus | ✅ | ✅ | ✅ | - | - | ✅ | ✅ | - | ✅ | - | ✅ | ✅ |
| eastus2 | ✅ | ✅ | ✅ | - | ✅ | - | ✅ | - | ✅ | - | ✅ | ✅ |
| francecentral | - | - | - | ✅ | ✅ | - | - | ✅ | ✅ | ✅ | - | ✅ |
| japaneast | - | - | - | - | - | - | - | - | ✅ | - | ✅ | ✅ |
| norwayeast | - | - | - | - | ✅ | - | - | - | - | - | - | - |
| swedencentral | ✅ | ✅ | ✅ | ✅ | ✅ | - | ✅ | ✅ | ✅ | ✅ | - | ✅ |
| uksouth | - | - | - | - | ✅ | ✅ | - | - | ✅ | ✅ | ✅ | ✅ |
| westus | ✅ | ✅ | ✅ | - | ✅ | - | ✅ | - | - | ✅ | ✅ | - |
| westus3 | ✅ | ✅ | ✅ | - | ✅ | - | ✅ | - | - | - | ✅ | - |


## More models
2 changes: 1 addition & 1 deletion articles/ai-services/agents/overview.md
Original file line number Diff line number Diff line change
@@ -13,7 +13,7 @@ ms.custom: azure-ai-agents

# What is Azure AI Agent Service?

Azure AI Agent Service is a fully managed service designed to empower developers to securely build, deploy, and scale high-quality, and extensible AI agents without needing to manage the underlying compute and storage resources. What originally took hundreds of lines of code to support [client side function calling](/azure/ai-services/openai/how-to/function-calling) can now be done in just a few lines of code with Azure AI Agent Service.
[Azure AI Agent Service](https://techcommunity.microsoft.com/blog/azure-ai-services-blog/introducing-azure-ai-agent-service/4298357) is a fully managed service designed to empower developers to securely build, deploy, and scale high-quality, and extensible AI agents without needing to manage the underlying compute and storage resources. What originally took hundreds of lines of code to support [client side function calling](/azure/ai-services/openai/how-to/function-calling) can now be done in just a few lines of code with Azure AI Agent Service.

## What is an AI agent?
