diff --git a/docs/intelligentapps/bulkrun.md b/docs/intelligentapps/bulkrun.md new file mode 100644 index 0000000000..e0a2c402c2 --- /dev/null +++ b/docs/intelligentapps/bulkrun.md @@ -0,0 +1,40 @@ +--- +Order: 4 +Area: intelligentapps +TOCTitle: Bulk Run +ContentId: +PageTitle: Bulk Run Prompts +DateApproved: +MetaDescription: Run a set of prompts in an imported dataset, individually or in a full batch, against the selected genAI models and parameters. +MetaSocialImage: +--- + +# Run multiple prompts in bulk + +The bulk run feature in AI Toolkit allows you to run multiple prompts in a batch. When you use the playground, you can only run one prompt manually at a time, in the order they're listed. Bulk run takes a dataset as input, where each row in the dataset has a prompt as the minimal requirement. Typically, the dataset has multiple rows. Once imported, you can select any prompt to run, or run all prompts on the selected model. The responses are displayed in the same dataset view. The results from running the dataset can be exported. + +To start a bulk run: + +1. In the AI Toolkit view, select **TOOLS** > **Bulk Run** to open the Bulk Run view. + + +1. Select either a sample dataset or import a local JSONL file that has a `query` field to use as prompts. + + ![Select dataset](./images/bulkrun/dataset.png) + +1. Once the dataset is loaded, select **Run** or **Rerun** on any prompt to run a single prompt. + + + Like in the playground, you can select the AI model, add context for your prompt, and change inference parameters. + + ![Bulk run prompts](./images/bulkrun/bulkrun_one.png) + +1. Select **Run all** at the top of the Bulk Run view to automatically run through all queries. The responses are shown in the **response** column. + + There is an option to only run the remaining queries that have not yet been run. + + ![Run all](./images/bulkrun/runall.png) + +1. Select the **Export** button to export the results to a JSONL file. + +1. 
Select **Import** to import another dataset in JSONL format for the bulk run. \ No newline at end of file diff --git a/docs/intelligentapps/evaluation.md b/docs/intelligentapps/evaluation.md new file mode 100644 index 0000000000..f9315ed5cd --- /dev/null +++ b/docs/intelligentapps/evaluation.md @@ -0,0 +1,52 @@ +--- +Order: 5 +Area: intelligentapps +TOCTitle: Evaluation +ContentId: +PageTitle: AI Evaluation +DateApproved: +MetaDescription: Import a dataset with LLM or SLM outputs, or rerun it for the queries. Run evaluation jobs with popular evaluators like F1 score, relevance, coherence, similarity... Find, visualize, and compare the evaluation results in tables or charts. +MetaSocialImage: +--- + +# Model evaluation + +AI engineers often need to evaluate models with different parameters or prompts on a dataset, comparing the outputs to the ground truth and computing evaluator values from the comparisons. AI Toolkit allows you to perform evaluations with minimal effort. + +![Start evaluation](./images/evaluation/evaluation.png) + +## Start an evaluation job + +1. In the AI Toolkit view, select **TOOLS** > **Evaluation** to open the Evaluation view. +1. Select the **Create Evaluation** button and provide the following information: + + - **Evaluation job name:** the default, or a name you specify + - **Evaluator:** currently, you can select from the built-in evaluators. + ![Evaluators](./images/evaluation/evaluators.png) + - **Judging model:** a model from the list that is used as the judging model for the evaluators that require one. + - **Dataset:** you can start with a sample dataset for learning purposes, or import a JSONL file with the fields `query`, `response`, and `ground truth`. +1. Once you provide all the necessary information for the evaluation, a new evaluation job is created. You will be prompted to open your new evaluation job details. + + ![Open evaluation](./images/evaluation/openevaluation.png) + +1. Verify your dataset and select **Run Evaluation** to start the evaluation. 
+ + ![Run Evaluation](./images/evaluation/runevaluation.png) + +## Monitor the evaluation job + +Once an evaluation job is started, you can find its status in the evaluation job view. + +![Running evaluation](./images/evaluation/running.png) + +Each evaluation job has a link to the dataset that was used, logs from the evaluation process, a timestamp, and a link to the details of the evaluation. + +## Find results of evaluation + +Open the evaluation job details; the view has a column for each selected evaluator with its numerical values. Some may have aggregate values. + +You can also select **Open In Data Wrangler** to open the data with the Data Wrangler extension. + +> Install the Data Wrangler extension if you don't have it yet. + +![Data Wrangler](./images/evaluation/datawrangler.png) \ No newline at end of file diff --git a/docs/intelligentapps/faq.md b/docs/intelligentapps/faq.md new file mode 100644 index 0000000000..d356862c3b --- /dev/null +++ b/docs/intelligentapps/faq.md @@ -0,0 +1,129 @@ +--- +Order: 7 +Area: intelligentapps +TOCTitle: FAQ +ContentId: +PageTitle: FAQ for AI Toolkit +DateApproved: +MetaDescription: Find answers to frequently asked questions (FAQ) about using AI Toolkit. Get troubleshooting recommendations. +MetaSocialImage: +--- + +# AI Toolkit FAQ + +## Models + +### How can I find my remote model endpoint and authentication header? + +Here are some examples of how to find your endpoint and authentication headers in common OpenAI service providers. For other providers, you can check out their documentation about the chat completion endpoint and authentication header. + +#### Example 1: Azure OpenAI + +1. Go to the `Deployments` blade in Azure OpenAI Studio and select a deployment, for example, `gpt-4o`. If you don't have a deployment yet, you can check out [the documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal) about how to create a deployment. 
+ + ![Select model deployment](./images/faq/6-aoai-deployments.png) + + ![Find model endpoint](./images/faq/7-aoai-model.png) + +2. As shown in the last screenshot, you can retrieve your chat completion endpoint from the `Target URI` property in the `Endpoint` section. + +3. You can retrieve your API key from the `Key` property in the `Endpoint` section. After you copy the API key, **enter it in the format `api-key: ` for the authentication header** in AI Toolkit. See the [Azure OpenAI service documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#request-header-2) to learn more about the authentication header. + +#### Example 2: OpenAI + +1. For now, the chat completion endpoint is fixed as `https://api.openai.com/v1/chat/completions`. See the [OpenAI documentation](https://platform.openai.com/docs/api-reference/chat/create) to learn more about it. + +2. Go to the [OpenAI documentation](https://platform.openai.com/docs/api-reference/authentication) and select `API Keys` or `Project API Keys` to create or retrieve your API key. After you copy the API key, **enter it in the format `Authorization: Bearer ` for the authentication header** in AI Toolkit. See the OpenAI documentation for more information. + + ![Find model access key](./images/faq/8-openai-key.png) + + +### How do I edit the endpoint URL or authentication header? + +If you enter the wrong endpoint or authentication header, you may encounter errors when inferencing. Select `Edit settings.json` to open the Visual Studio Code settings. You may also type the command `Open User Settings (JSON)` in the Visual Studio Code Command Palette to open it, and go to the `windowsaistudio.remoteInfereneEndpoints` section. + +![Edit](./images/faq/9-edit.png) + +Here, you can edit or remove existing endpoint URLs or authentication headers. After you save the settings, the model list in the tree view or playground will refresh automatically. 
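As an illustrative sketch, an entry in this section of `settings.json` might look like the following. The field names inside the entry (`name`, `endpoint`, `authHeader`) and all values are assumptions for illustration, not the extension's documented schema; match them to the entries AI Toolkit generated for you.

```json
{
    "windowsaistudio.remoteInfereneEndpoints": [
        {
            // Illustrative fields and values; compare with your own generated entry.
            "name": "my-azure-openai-gpt-4o",
            "endpoint": "https://YOUR_RESOURCE.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-02-01",
            "authHeader": "api-key: YOUR_API_KEY"
        }
    ]
}
```

After you save the file, the model list refreshes as described above.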
+ +![Edit endpoint in settings](./images/faq/10-edit-settings.png) + +### How can I join the waitlist for OpenAI o1-mini or OpenAI o1-preview? + +The OpenAI o1 series models are specifically designed to tackle reasoning and problem-solving tasks with increased focus and capability. These models spend more time processing and understanding the user's request, making them exceptionally strong in areas like science, coding, math, and similar fields. For example, o1 can be used by healthcare researchers to annotate cell sequencing data, by physicists to generate complicated mathematical formulas needed for quantum optics, and by developers in all fields to build and execute multi-step workflows. + +IMPORTANT: The o1-preview model is available for limited access. To try the model in the playground, registration is required, and access will be granted based on Microsoft’s eligibility criteria. + +You can visit the [GitHub model market](https://aka.ms/github-model-marketplace) to find OpenAI o1-mini or OpenAI o1-preview and join the waitlist. + +### Can I use my own models or other models from Hugging Face? + +If your own model supports the OpenAI API contract, you can host the model in the cloud and add it to AI Toolkit as a custom model. You need to provide key information such as the model endpoint URL, access key, and model name. + +## Finetune + +### There are many fine-tune settings. Do I need to worry about all of them? + +No, you can just run with the default settings and the current dataset in the project to test. If you want, you can also pick your own dataset, but you will need to tweak some settings; see [this tutorial](walkthrough-hf-dataset.md) for more info. + +### AI Toolkit would not scaffold the fine-tuning project + +Make sure to check the prerequisites before installing the extension. More details are at [Prerequisites](README.md#prerequisites). 
+ +### I have an NVIDIA GPU device but the prerequisites check fails + +If you have an NVIDIA GPU device but the prerequisites check fails with "GPU is not detected", make sure that the latest driver is installed. You can check and download the driver at the [NVIDIA site](https://www.nvidia.com/Download/index.aspx?lang=en-us). +Also, make sure that it is installed in the path. To check, run `nvidia-smi` from the command line. + +### I generated the project but Conda activate fails to find the environment + +There might have been an issue setting up the environment. You can manually initialize the environment using `bash /mnt/[PROJECT_PATH]/setup/first_time_setup.sh` from inside the workspace. + +### When using a Hugging Face dataset, how do I get it? + +Before you start the `python finetuning/invoke_olive.py` command, make sure you run `huggingface-cli login`. This ensures that the dataset can be downloaded on your behalf. + +## Environment + +### Does the extension work in Linux or other systems? + +Yes, AI Toolkit runs on Windows, macOS, and Linux. + +### How can I disable the Conda auto activation from my WSL? + +To disable the Conda auto activation in WSL, run `conda config --set auto_activate_base false`. This disables the base environment. + +### Do you support containers today? + +We are currently working on container support, and it will be enabled in a future release. + +### Why do you need GitHub and Hugging Face credentials? + +We host all the project templates in GitHub, and the base models are hosted in Azure or Hugging Face, which requires accounts to get access to them from the APIs. + +### I am getting an error downloading Llama2 + +Please ensure that you request access to Llama through the [Llama 2 sign up page](https://github.com/llama2-onnx/signup). This is needed to comply with Meta's trade compliance. 
+ +### Can't save project inside WSL instance +Because remote sessions are currently not supported when running the AI Toolkit actions, you cannot save your project while connected to WSL. To close remote connections, select "WSL" at the bottom left of the screen and choose "Close Remote Connections". + +### Error: GitHub API forbidden + +We host the project templates in the GitHub repository *microsoft/windows-ai-studio-templates*, and the extension calls the GitHub API to load the repo content. If you are in Microsoft, you may need to authorize the Microsoft organization to avoid this forbidden error. + +See [this issue](https://github.com/microsoft/vscode-ai-toolkit/issues/70#issuecomment-2126089884) for a workaround. The detailed steps are: +- Sign out of your GitHub account in VS Code +- Reload VS Code and AI Toolkit, and you will be asked to sign in to GitHub again +- [Important] On the browser's authorization page, make sure to authorize the app to access the "Microsoft" org + ![Authorize Access](./images/faq/faq-github-api-forbidden.png) + +### Cannot list, load, or download ONNX model + +Check the 'AI Toolkit' log in the output panel. If you see an *Agent* error or something like: + +![Agent Failure](./images/faq/faq-onnx-agent.png) + +Please close all VS Code instances and reopen VS Code. + +(*This is caused by the underlying ONNX agent closing unexpectedly; the step above restarts the agent.*) \ No newline at end of file diff --git a/docs/intelligentapps/finetune.md b/docs/intelligentapps/finetune.md new file mode 100644 index 0000000000..299380c8f7 --- /dev/null +++ b/docs/intelligentapps/finetune.md @@ -0,0 +1,275 @@ +--- +Order: 6 +Area: intelligentapps +TOCTitle: Finetune +ContentId: +PageTitle: Finetune AI Models +DateApproved: +MetaDescription: Use a custom dataset to finetune a generative AI model in the Azure cloud or locally with GPUs. Deploy the finetuned model to the Azure cloud or download incremental files from the finetuned model. 
+MetaSocialImage: +--- + +# Finetune models + +Finetuning an AI model is a common practice that allows you to use your custom dataset to run **finetune** jobs on a pre-trained model in a computing environment with GPUs. AI Toolkit currently supports finetuning SLMs on a local machine with a GPU or in the cloud (Azure Container Apps) with a GPU. + +A finetuned model can be downloaded locally for inference testing with GPUs, or quantized to run locally on CPUs. A finetuned model can also be deployed to a cloud environment as a remote model. + +## **[Preview]** Finetune AI models on Azure with AI Toolkit for VS Code + +AI Toolkit for VS Code now supports provisioning Azure Container Apps to run model fine-tuning and an inference endpoint in the cloud. + +### Set up your cloud environment +1. To run the model fine-tuning and inference in your remote Azure Container Apps environment, make sure your subscription has enough GPU capacity. Submit a [support ticket](https://azure.microsoft.com/support/create-ticket/) to request the required capacity for your application. [Get More Info about GPU capacity](https://learn.microsoft.com/en-us/azure/container-apps/workload-profiles-overview) +2. Make sure you have a [HuggingFace account](https://huggingface.co/) and [generate an access token](https://huggingface.co/docs/hub/security-tokens) if you are using a private dataset on HuggingFace or your base model needs access control. +3. Accept the LICENSE on HuggingFace if you are fine-tuning Mistral or Llama. +4. Enable the Remote Fine-tuning and Inference feature flag in the AI Toolkit for VS Code + 1. Open the VS Code Settings by selecting *File -> Preferences -> Settings*. + 2. Navigate to *Extensions* and select *AI Toolkit*. + 3. Select the *"Enable to run fine-tuning and inference on Azure Container Apps"* option. + 4. Reload VS Code for the changes to take effect. + ![AI Toolkit Settings](./images/finetune/settings.png) + +### Scaffold a finetune project +1. 
Run `AI Toolkit: Focus on Tools View` from the Command Palette. +2. Navigate to `Fine-tuning` to access the model catalog. Select a model for fine-tuning. Assign a name to your project and select its location on your machine. Then, hit the *"Configure Project"* button. +![Panel: Select Model](./images/finetune/panel-select-model.png) +3. Project Configuration + 1. Avoid enabling the *"Fine-tune locally"* option. + 2. The Olive configuration settings will appear with pre-set default values. Please adjust and fill in these configurations as needed. + 3. Move on to *Generate Project*. This stage leverages WSL and involves setting up a new Conda environment, preparing for future updates that include Dev Containers. +![Panel: Configure the Model](./images/finetune/panel-config-model.png) + 4. Select *"Relaunch Window In Workspace"* to open your finetune project. +![Panel: Generate Project](./images/finetune/panel-generate-project.png) + +> **Note:** The project currently works either locally or remotely within the AI Toolkit for VS Code. If you choose *"Fine-tune locally"* during project creation, it will run exclusively in WSL without cloud resources. Otherwise, the project will be restricted to run in the remote Azure Container Apps environment. + +### Provision Azure Resources +To get started, you need to provision the Azure resources for remote fine-tuning. From the Command Palette, find and execute `AI Toolkit: Provision Azure Container Apps job for fine-tuning`. During this process, you will be prompted to select your Azure Subscription and resource group. + +![Provision Fine-Tuning](./images/finetune/command-provision-finetune.png) + +Monitor the progress of the provisioning through the link displayed in the output channel. +![Provision Progress](./images/finetune/log-finetining-progress.png) + +### Run Fine-tuning +To start the remote fine-tuning job, execute the `AI Toolkit: Run fine-tuning` command. 
+![Run Fine-tuning](./images/finetune/command-run-finetuning.png) + +The extension performs the following operations: +1. Synchronize your workspace with Azure Files. +1. Trigger the Azure Container Apps job using the commands specified in `./infra/fintuning.config.json`. + +QLoRA will be used for fine-tuning, and the finetune process will create LoRA adapters for the model to use during inference. + +The results of the fine-tuning will be stored in Azure Files. +To explore the output files in the Azure File share, you can navigate to the Azure portal using the link provided in the output panel. Alternatively, you can directly access the Azure portal and locate the storage account named `STORAGE_ACCOUNT_NAME` as defined in `./infra/fintuning.config.json` and the file share named `FILE_SHARE_NAME` as defined in `./infra/fintuning.config.json`. + +![file-share](./images/finetune/log-finetuning-files.png) + +### View Logs +Once the fine-tuning job has been started, you can access the system and console logs by visiting the Azure portal. +Alternatively, you can view the console logs directly in the VSCode output panel. +> **Note:** The job might take a few minutes to initiate. If there is already a running job, the current one may be queued to start later. + +![log-button](./images/finetune/notification-finetune.png) + +#### View and Query Logs on Azure + +After the fine-tuning job is triggered, you can view logs on Azure by selecting the "*Open Logs in Azure Portal*" button from the VSCode notification. + +Or, if you've already opened the Azure Portal, find the job history in the "*Execution history*" panel of the Azure Container Apps job. + +![Job Execution History](./images/finetune/finetune-job-history.png) + +There are two types of logs, "*Console*" and "*System*". +- Console logs are messages from your app, including `stderr` and `stdout` messages. This is what you have already seen in the streaming logs section. 
+- System logs are messages from the Azure Container Apps service, including the status of service-level events. + +To view and query your logs, select the "*Console*" button and navigate to the Log Analytics page, where you can view all logs and write your queries. + +![Job Log Analytics](./images/finetune/finetune-job-log-query.png) + + +> For more information about Azure Container Apps Logs, see [Application Logging in Azure Container Apps](https://learn.microsoft.com/azure/container-apps/logging). + + +#### View streaming logs in VSCode +After initiating the fine-tuning job, you can also view the streaming logs by selecting the "*Show Streaming Logs in VS Code*" button in the VSCode notification. +Or you can execute the command `AI Toolkit: Show the running fine-tuning job streaming logs`. +![Streaming Log Command](./images/finetune/command-show-streaming-log.png) + +The streaming log of the running fine-tuning job will be displayed in the output panel. + +![Streaming Log Output](./images/finetune/log-finetuning-res.png) + +> **Note:** +> 1. The job might be queued due to insufficient resources. If the log is not displayed, wait for a while and then execute the command to reconnect to the streaming log. +> 2. The streaming log may time out and disconnect. However, it can be reconnected by executing the command again. + + +## Inferencing with the fine-tuned model +After the adapters are trained in the remote environment, use a simple Gradio application to interact with the model. + +![Fine-tune complete](./images/finetune/log-finetuning-res.png) + +### Provision Azure Resources +Similar to the fine-tuning process, you need to set up the Azure resources for remote inference by executing `AI Toolkit: Provision Azure Container Apps for inference` from the Command Palette. During this setup, you will be asked to select your Azure Subscription and resource group. 
+![Provision Inference Resource](./images/finetune/command-provision-inference.png) + +By default, the subscription and the resource group for inference should match those used for fine-tuning. The inference will use the same Azure Container Apps environment and access the model and model adapter stored in Azure Files, which were generated during the fine-tuning step. + +### Deployment for Inference +If you wish to revise the inference code or reload the inference model, please execute the `AI Toolkit: Deploy for inference` command. This will synchronize your latest code with ACA and restart the replica. + +![Deploy for inference](./images/finetune/command-deploy.png) + +After the deployment completes successfully, the model is ready for evaluation using this endpoint. +You can access the inference API by selecting the "*Go to Inference Endpoint*" button displayed in the VSCode notification. Alternatively, the web API endpoint can be found under `ACA_APP_ENDPOINT` in `./infra/inference.config.json` and in the output panel. + +![App Endpoint](./images/finetune/notification-deploy.png) + +> **Note:** The inference endpoint may require a few minutes to become fully operational. + + + +## Advanced usage + +### Fine-tune project components + +| Folder | Contents | +| ------ |--------- | +| `infra` | Contains all necessary configurations for remote operations. | +| `infra/provision/finetuning.parameters.json` | Holds parameters for the Bicep templates, used for provisioning Azure resources for fine-tuning. | +| `infra/provision/finetuning.bicep` | Contains templates for provisioning Azure resources for fine-tuning. | +| `infra/finetuning.config.json` | The configuration file, generated by the `AI Toolkit: Provision Azure Container Apps job for fine-tuning` command. It is used as input for other remote commands. 
| + +### Configuring Secrets for fine-tuning in Azure Container Apps +Azure Container Apps secrets provide a secure way to store and manage sensitive data within Azure Container Apps, like HuggingFace tokens and Weights & Biases API keys. Using the AI Toolkit Command Palette, you can input the secrets into the provisioned Azure Container Apps job (as stored in `./finetuning.config.json`). These secrets are then set as **environment variables** in all containers. + +#### Steps: +1. In the Command Palette, type and select `AI Toolkit: Add Azure Container Apps Job secret for fine-tuning` +![Add secret](./images/finetune/command-add-secret.png) + +1. Input Secret Name and Value: You'll be prompted to input the name and value of the secret. + ![Input secret name](./images/finetune/input-secret-name.png) + ![Input secret](./images/finetune/input-secret.png) + For example, if you're using a private HuggingFace dataset or models that need Hugging Face access control, set your HuggingFace token as an environment variable [`HF_TOKEN`](https://huggingface.co/docs/huggingface_hub/package_reference/environment_variables#hftoken) to avoid the need for manual login on the Hugging Face Hub. + +After you've set up the secret, you can now use it in your Azure Container App. The secret will be set in the environment variables of your container app. + +### Configuring Azure resource provisioning for fine-tuning +This guide will help you configure the `AI Toolkit: Provision Azure Container Apps job for fine-tuning` command. + +You can find configuration parameters in the `./infra/provision/finetuning.parameters.json` file. Here are the details: +| Parameter | Description | +| --------- |------------ | +| `defaultCommands` | This is the default command to start a fine-tuning job. It can be overwritten in `./infra/finetuning.config.json`. | +| `maximumInstanceCount` | This parameter sets the maximum capacity of GPU instances. 
| +| `timeout` | This sets the timeout for the Azure Container Apps fine-tuning job in seconds. The default value is 10800, which equals 3 hours. If the Azure Container Apps job reaches this timeout, the fine-tuning process halts. However, checkpoints are saved by default, allowing the fine-tuning process to resume from the last checkpoint instead of starting over if it is run again. | +| `location` | This is the location where Azure resources are provisioned. The default value is the same as the chosen resource group's location. | +| `storageAccountName`, `fileShareName`, `acaEnvironmentName`, `acaEnvironmentStorageName`, `acaJobName`, `acaLogAnalyticsName` | These parameters are used to name the Azure resources for provisioning. You can input a new, unused resource name to create your own custom-named resources, or you can input the name of an already existing Azure resource if you'd prefer to use that. For details, refer to the section [Using existing Azure Resources](#using-existing-azure-resources). | + +### Using existing Azure Resources +If you have existing Azure resources that need to be configured for fine-tuning, you can specify their names in the `./infra/provision/finetuning.parameters.json` file and then run the `AI Toolkit: Provision Azure Container Apps job for fine-tuning` from the Command Palette. This will update the resources you've specified and create any that are missing. + +For example, if you have an existing Azure container environment, your `./infra/provision/finetuning.parameters.json` should look like this: + +```json +{ + "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#", + "contentVersion": "1.0.0.0", + "parameters": { + ... + "acaEnvironmentName": { + "value": "" + }, + "acaEnvironmentStorageName": { + "value": null + }, + ... + } + } +``` + +### Manual provision +If you prefer to manually set up the Azure resources, you can use the provided Bicep files in the `./infra/provision` folders. 
If you've already set up and configured all the Azure resources without using the AI Toolkit Command Palette, you can simply enter the resource names in the `finetune.config.json` file. + +For example: + +```json +{ + "SUBSCRIPTION_ID": "", + "RESOURCE_GROUP_NAME": "", + "STORAGE_ACCOUNT_NAME": "", + "FILE_SHARE_NAME": "", + "ACA_JOB_NAME": "", + "COMMANDS": [ + "cd /mount", + "pip install huggingface-hub==0.22.2", + "huggingface-cli download --local-dir ./model-cache/ --local-dir-use-symlinks False", + "pip install -r ./setup/requirements.txt", + "python3 ./finetuning/invoke_olive.py && find models/ -print | grep adapter/adapter" + ] +} +``` + +### Inference Components Included in the Template + +| Folder | Contents | +| ------ |--------- | +| `infra` | Contains all necessary configurations for remote operations. | +| `infra/provision/inference.parameters.json` | Holds parameters for the Bicep templates, used for provisioning Azure resources for inference. | +| `infra/provision/inference.bicep` | Contains templates for provisioning Azure resources for inference. | +| `infra/inference.config.json` | The configuration file, generated by the `AI Toolkit: Provision Azure Container Apps for inference` command. It is used as input for other remote commands. | + +### Configuring Azure Resource Provisioning +This guide will help you configure the `AI Toolkit: Provision Azure Container Apps for inference` command. + +You can find configuration parameters in the `./infra/provision/inference.parameters.json` file. Here are the details: +| Parameter | Description | +| --------- |------------ | +| `defaultCommands` | These are the commands to initiate a web API. | +| `maximumInstanceCount` | This parameter sets the maximum capacity of GPU instances. | +| `location` | This is the location where Azure resources are provisioned. The default value is the same as the chosen resource group's location. 
| +| `storageAccountName`, `fileShareName`, `acaEnvironmentName`, `acaEnvironmentStorageName`, `acaAppName`, `acaLogAnalyticsName` | These parameters are used to name the Azure resources for provisioning. By default, they will be the same as the fine-tuning resource names. You can input a new, unused resource name to create your own custom-named resources, or you can input the name of an already existing Azure resource if you'd prefer to use that. For details, refer to the section [Using existing Azure Resources](#using-existing-azure-resources). | + +### Using Existing Azure Resources +By default, the inference provisioning uses the same Azure Container Apps environment, Storage Account, Azure File Share, and Azure Log Analytics that were used for fine-tuning. A separate Azure Container App is created solely for the inference API. + +If you have customized the Azure resources during the fine-tuning step or want to use your own existing Azure resources for inference, specify their names in the `./infra/inference.parameters.json` file. Then, run the `AI Toolkit: Provision Azure Container Apps for inference` command from the Command Palette. This updates any specified resources and creates any that are missing. + +For example, if you have an existing Azure container environment, your `./infra/inference.parameters.json` should look like this: + +```json +{ + "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#", + "contentVersion": "1.0.0.0", + "parameters": { + ... + "acaEnvironmentName": { + "value": "" + }, + "acaEnvironmentStorageName": { + "value": null + }, + ... + } + } +``` + +### Manual Provision +If you prefer to manually configure the Azure resources, you can use the provided Bicep files in the `./infra/provision` folders. If you have already set up and configured all the Azure resources without using the AI Toolkit Command Palette, you can simply enter the resource names in the `inference.config.json` file. 
+ +For example: + +```json +{ + "SUBSCRIPTION_ID": "", + "RESOURCE_GROUP_NAME": "", + "STORAGE_ACCOUNT_NAME": "", + "FILE_SHARE_NAME": "", + "ACA_APP_NAME": "", + "ACA_APP_ENDPOINT": "" +} +``` \ No newline at end of file diff --git a/docs/intelligentapps/images/bulkrun/bulkrun_one.png b/docs/intelligentapps/images/bulkrun/bulkrun_one.png new file mode 100644 index 0000000000..cf4941a95c --- /dev/null +++ b/docs/intelligentapps/images/bulkrun/bulkrun_one.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:09b0b2bb881fe2b40c1b48cb1aef05057eda410be856372f3a27d03705e26ef9 +size 86873 diff --git a/docs/intelligentapps/images/bulkrun/dataset.png b/docs/intelligentapps/images/bulkrun/dataset.png new file mode 100644 index 0000000000..58381e90e0 --- /dev/null +++ b/docs/intelligentapps/images/bulkrun/dataset.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:29475940953f4af6637d1d087eab78069c3b4e653ef37c43f2c26d98ca91a780 +size 23083 diff --git a/docs/intelligentapps/images/bulkrun/runall.png b/docs/intelligentapps/images/bulkrun/runall.png new file mode 100644 index 0000000000..a042864b33 --- /dev/null +++ b/docs/intelligentapps/images/bulkrun/runall.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cee6e5559e5a8964782e6bfa5e12be22e62fbe195a76dbd925d16ed2d380a0e6 +size 27399 diff --git a/docs/intelligentapps/images/evaluation/datawrangler.png b/docs/intelligentapps/images/evaluation/datawrangler.png new file mode 100644 index 0000000000..eae5415394 --- /dev/null +++ b/docs/intelligentapps/images/evaluation/datawrangler.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b5afda46356f83da763ff04ad1b003d6d27ec85b67b10003ccd460ab708af04d +size 56050 diff --git a/docs/intelligentapps/images/evaluation/evaluation.png b/docs/intelligentapps/images/evaluation/evaluation.png new file mode 100644 index 0000000000..f4e2657bc4 --- /dev/null +++ 
b/docs/intelligentapps/images/evaluation/evaluation.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:684e4434bf2d8d5bb81ffbdf883adacacabe1d371f88e036873c12e59bc0bfdc +size 69549 diff --git a/docs/intelligentapps/images/evaluation/evaluationdetails.png b/docs/intelligentapps/images/evaluation/evaluationdetails.png new file mode 100644 index 0000000000..c641a8b5cb --- /dev/null +++ b/docs/intelligentapps/images/evaluation/evaluationdetails.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7115ffb3fa66353f183f38b51f846d39fc5e1e054679652d91a59e204ff57c44 +size 63652 diff --git a/docs/intelligentapps/images/evaluation/evaluators.png b/docs/intelligentapps/images/evaluation/evaluators.png new file mode 100644 index 0000000000..5dceeb30b7 --- /dev/null +++ b/docs/intelligentapps/images/evaluation/evaluators.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:62c262d38f4c60518be6a5e0be3079b7e7842f23f760a2aa2fb29dc0b7a3ea6a +size 66916 diff --git a/docs/intelligentapps/images/evaluation/openevaluation.png b/docs/intelligentapps/images/evaluation/openevaluation.png new file mode 100644 index 0000000000..2d72e730b2 --- /dev/null +++ b/docs/intelligentapps/images/evaluation/openevaluation.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:47a703d1eaf97ac37fe3dc18daae959615b817e1862cd40358f293332f51cdf1 +size 10310 diff --git a/docs/intelligentapps/images/evaluation/runevaluation.png b/docs/intelligentapps/images/evaluation/runevaluation.png new file mode 100644 index 0000000000..0400081ad8 --- /dev/null +++ b/docs/intelligentapps/images/evaluation/runevaluation.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:738f7217f08dccfb91cdc6d6bfc32e89161059174c944358a21811aa25c5bcf3 +size 66196 diff --git a/docs/intelligentapps/images/evaluation/running.png b/docs/intelligentapps/images/evaluation/running.png new file mode 100644 index 0000000000..ab8ba7d781 
--- /dev/null +++ b/docs/intelligentapps/images/evaluation/running.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0a624a63e4e0e8aafa743c900d00d679401ae9ce556f0b17af4a947736dce892 +size 18278 diff --git a/docs/intelligentapps/images/faq/10-edit-settings.png b/docs/intelligentapps/images/faq/10-edit-settings.png new file mode 100644 index 0000000000..4a197bfb28 --- /dev/null +++ b/docs/intelligentapps/images/faq/10-edit-settings.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:86da73a3b5da95ed76f2714ecf84a3940ddbb533f8e1ced0bf5cd371f34b9133 +size 191056 diff --git a/docs/intelligentapps/images/faq/6-aoai-deployments.png b/docs/intelligentapps/images/faq/6-aoai-deployments.png new file mode 100644 index 0000000000..a3a1df73d9 --- /dev/null +++ b/docs/intelligentapps/images/faq/6-aoai-deployments.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:eab3d2cd949f997b1de5ddc4e6c91c61ceb4f687787581497b526ca70498ea95 +size 442690 diff --git a/docs/intelligentapps/images/faq/7-aoai-model.png b/docs/intelligentapps/images/faq/7-aoai-model.png new file mode 100644 index 0000000000..87e9c502c8 --- /dev/null +++ b/docs/intelligentapps/images/faq/7-aoai-model.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7722f13adcc6d287e234d8198e819ac1ff0ca3b1b085ed1abd047d49c4f4e76b +size 462854 diff --git a/docs/intelligentapps/images/faq/8-openai-key.png b/docs/intelligentapps/images/faq/8-openai-key.png new file mode 100644 index 0000000000..d167b0be50 --- /dev/null +++ b/docs/intelligentapps/images/faq/8-openai-key.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9a27eac034733af7819e354948ff3edd6b45ffe05ef093ff7d6ef411d702a7a3 +size 236551 diff --git a/docs/intelligentapps/images/faq/9-edit.png b/docs/intelligentapps/images/faq/9-edit.png new file mode 100644 index 0000000000..ab66591ec1 --- /dev/null +++ b/docs/intelligentapps/images/faq/9-edit.png @@ -0,0 
+1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7da34a58e0535ddf5bc5608faaec3a80162d15848b47564a368bb9778ea8651d +size 260736 diff --git a/docs/intelligentapps/images/faq/faq-github-api-forbidden.png b/docs/intelligentapps/images/faq/faq-github-api-forbidden.png new file mode 100644 index 0000000000..92fb301a5a --- /dev/null +++ b/docs/intelligentapps/images/faq/faq-github-api-forbidden.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d206b83d866cc66c504f61e694160f0084c34be47edbe2178becc6ec1b40cb93 +size 63936 diff --git a/docs/intelligentapps/images/faq/faq-onnx-agent.png b/docs/intelligentapps/images/faq/faq-onnx-agent.png new file mode 100644 index 0000000000..f4ebd54429 --- /dev/null +++ b/docs/intelligentapps/images/faq/faq-onnx-agent.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:182d532c04f0dc8b8dd6d8bd14cee1b6997df7851fd0bd58a1a7c8848c65c0b0 +size 42966 diff --git a/docs/intelligentapps/images/finetune/command-add-secret.png b/docs/intelligentapps/images/finetune/command-add-secret.png new file mode 100644 index 0000000000..e95d8dd46e --- /dev/null +++ b/docs/intelligentapps/images/finetune/command-add-secret.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b29629be12a9cb0636f58d7bdb4f193a9faf9a998e07240387c6553635d64e56 +size 268099 diff --git a/docs/intelligentapps/images/finetune/command-deploy.png b/docs/intelligentapps/images/finetune/command-deploy.png new file mode 100644 index 0000000000..ea0a1d88e5 --- /dev/null +++ b/docs/intelligentapps/images/finetune/command-deploy.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9655a840b35486eba66868eceee6218cc83437c1aa9715d8d00e5a44de1228c5 +size 525504 diff --git a/docs/intelligentapps/images/finetune/command-focus-resource-view.png b/docs/intelligentapps/images/finetune/command-focus-resource-view.png new file mode 100644 index 0000000000..094c0b1447 --- /dev/null +++ 
b/docs/intelligentapps/images/finetune/command-focus-resource-view.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3e4f2e9f203520da6eb53d5896da663a262ec493c655068c5d80532954133f26 +size 201497 diff --git a/docs/intelligentapps/images/finetune/command-provision-finetune.png b/docs/intelligentapps/images/finetune/command-provision-finetune.png new file mode 100644 index 0000000000..d1b1402da2 --- /dev/null +++ b/docs/intelligentapps/images/finetune/command-provision-finetune.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ca216046175260d06a6f2b9db54089042f21fcc9fe8cf4e3254ff5cabbfda788 +size 278588 diff --git a/docs/intelligentapps/images/finetune/command-provision-inference.png b/docs/intelligentapps/images/finetune/command-provision-inference.png new file mode 100644 index 0000000000..77cba9ed49 --- /dev/null +++ b/docs/intelligentapps/images/finetune/command-provision-inference.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fbf3021539839187c6726dd2ac9441c289b2ed76469486319e2287152ac80413 +size 424515 diff --git a/docs/intelligentapps/images/finetune/command-run-finetuning.png b/docs/intelligentapps/images/finetune/command-run-finetuning.png new file mode 100644 index 0000000000..dd83dbb973 --- /dev/null +++ b/docs/intelligentapps/images/finetune/command-run-finetuning.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:24481c8cc7894a3aca18a1917b409c4bc3a7c2136644034e0d4b10621638d313 +size 588593 diff --git a/docs/intelligentapps/images/finetune/command-show-streaming-log.png b/docs/intelligentapps/images/finetune/command-show-streaming-log.png new file mode 100644 index 0000000000..a7e142ea37 --- /dev/null +++ b/docs/intelligentapps/images/finetune/command-show-streaming-log.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c7d9c6228d69caf7e47d949c499784a69812465ba9cedb217d298b973db0ef7b +size 585702 diff --git 
a/docs/intelligentapps/images/finetune/finetune-job-history.png b/docs/intelligentapps/images/finetune/finetune-job-history.png new file mode 100644 index 0000000000..bc1f498a48 --- /dev/null +++ b/docs/intelligentapps/images/finetune/finetune-job-history.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9ee00c826d75192816aa3cd21cc06b1cf3d8494222e4a9af6b9e2b3d3b870b86 +size 135226 diff --git a/docs/intelligentapps/images/finetune/finetune-job-log-query.png b/docs/intelligentapps/images/finetune/finetune-job-log-query.png new file mode 100644 index 0000000000..f1a3f08af9 --- /dev/null +++ b/docs/intelligentapps/images/finetune/finetune-job-log-query.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:734c5d9831351b1989998f32a823d759fc95c585ff955afa3a9732aa59dae6f2 +size 216856 diff --git a/docs/intelligentapps/images/finetune/input-secret-name.png b/docs/intelligentapps/images/finetune/input-secret-name.png new file mode 100644 index 0000000000..051d0c18d0 --- /dev/null +++ b/docs/intelligentapps/images/finetune/input-secret-name.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cdf9de42a2c2b95e5b8d97b83c49bee337180b67d5af85e6efba01cd9323cc72 +size 25049 diff --git a/docs/intelligentapps/images/finetune/input-secret.png b/docs/intelligentapps/images/finetune/input-secret.png new file mode 100644 index 0000000000..ce46ab090e --- /dev/null +++ b/docs/intelligentapps/images/finetune/input-secret.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b07fc55b3dac5d7b421edf694676b904b1b74102479bffb86cb098c13c0a8fce +size 26902 diff --git a/docs/intelligentapps/images/finetune/log-finetining-progress.png b/docs/intelligentapps/images/finetune/log-finetining-progress.png new file mode 100644 index 0000000000..c0b53d4021 --- /dev/null +++ b/docs/intelligentapps/images/finetune/log-finetining-progress.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:303f873921aebe42543570a429041af978c5d6b6b5053f7f3aaa14d95bb60cdb +size 377798 diff --git a/docs/intelligentapps/images/finetune/log-finetuning-files.png b/docs/intelligentapps/images/finetune/log-finetuning-files.png new file mode 100644 index 0000000000..8b19735c5b --- /dev/null +++ b/docs/intelligentapps/images/finetune/log-finetuning-files.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e5d7ba635aba6939c9e49636fd393b0f4b30f911fffa6e7bfc3ce7e481e629e0 +size 560892 diff --git a/docs/intelligentapps/images/finetune/log-finetuning-res.png b/docs/intelligentapps/images/finetune/log-finetuning-res.png new file mode 100644 index 0000000000..953eebfc43 --- /dev/null +++ b/docs/intelligentapps/images/finetune/log-finetuning-res.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:50b7a73a88c9497bcc6a16882e6e36876d07693f25b4834662fcb27383ad72a1 +size 530700 diff --git a/docs/intelligentapps/images/finetune/notification-deploy.png b/docs/intelligentapps/images/finetune/notification-deploy.png new file mode 100644 index 0000000000..99de346ffb --- /dev/null +++ b/docs/intelligentapps/images/finetune/notification-deploy.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6345ce00aa9a432a27f22a2c44cbe3513c12cbd9965d872b922fe3609225d3c7 +size 551471 diff --git a/docs/intelligentapps/images/finetune/notification-finetune.png b/docs/intelligentapps/images/finetune/notification-finetune.png new file mode 100644 index 0000000000..40fe109840 --- /dev/null +++ b/docs/intelligentapps/images/finetune/notification-finetune.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0ea490ba83878d778daeac396aa655176b4ba25007127286053301d6186323d8 +size 391263 diff --git a/docs/intelligentapps/images/finetune/panel-config-model.png b/docs/intelligentapps/images/finetune/panel-config-model.png new file mode 100644 index 0000000000..f824e0f1ac --- /dev/null +++ 
b/docs/intelligentapps/images/finetune/panel-config-model.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d9f516f4522759d49b01621966451fc1d7dae8f5e2dd388ff1909f8bba2b92b9 +size 239539 diff --git a/docs/intelligentapps/images/finetune/panel-generate-project.png b/docs/intelligentapps/images/finetune/panel-generate-project.png new file mode 100644 index 0000000000..88d360dd80 --- /dev/null +++ b/docs/intelligentapps/images/finetune/panel-generate-project.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ad0135987ed132e66d92a087f9489e0cfe2aa359e72e4309f78bd052c1aa8348 +size 169417 diff --git a/docs/intelligentapps/images/finetune/panel-select-model.png b/docs/intelligentapps/images/finetune/panel-select-model.png new file mode 100644 index 0000000000..0430f0f713 --- /dev/null +++ b/docs/intelligentapps/images/finetune/panel-select-model.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f15fbb0af5e07a71439c94a4da047424c3c7a761c247a2b7449a393a425d478d +size 251909 diff --git a/docs/intelligentapps/images/finetune/settings.png b/docs/intelligentapps/images/finetune/settings.png new file mode 100644 index 0000000000..1cb8dbe0f9 --- /dev/null +++ b/docs/intelligentapps/images/finetune/settings.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4968002c8310abaad401c65346fa0b345215f0965cf311a3c93510f4bce2b05e +size 305446 diff --git a/docs/intelligentapps/images/models/byom.png b/docs/intelligentapps/images/models/byom.png new file mode 100644 index 0000000000..668d2e0b44 --- /dev/null +++ b/docs/intelligentapps/images/models/byom.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f99e3da23759d75dc4b71732be2481c45c68259fc15c0cc203a69a2ee7226cb2 +size 45836 diff --git a/docs/intelligentapps/images/models/model_catalog.png b/docs/intelligentapps/images/models/model_catalog.png new file mode 100644 index 0000000000..18482b3824 --- /dev/null +++ 
b/docs/intelligentapps/images/models/model_catalog.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9f8351f7213b63ab479fd69b9f6d1fabad4f70c1b141429a47f048b61721cef0 +size 119633 diff --git a/docs/intelligentapps/images/models/select-models.png b/docs/intelligentapps/images/models/select-models.png new file mode 100644 index 0000000000..705183915a --- /dev/null +++ b/docs/intelligentapps/images/models/select-models.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:190966047dc6ca5868b62133a926f282a89cb2a9b0d6f313c401bb674926aa13 +size 152115 diff --git a/docs/intelligentapps/images/models/select-ollama.png b/docs/intelligentapps/images/models/select-ollama.png new file mode 100644 index 0000000000..0d24e5f9d4 --- /dev/null +++ b/docs/intelligentapps/images/models/select-ollama.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ed28fdf216e28741c443f25b5067d20d094f1d0655d4fe7c66317a8b074c104b +size 115845 diff --git a/docs/intelligentapps/images/models/select-type.png b/docs/intelligentapps/images/models/select-type.png new file mode 100644 index 0000000000..4981054bfb --- /dev/null +++ b/docs/intelligentapps/images/models/select-type.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e988a1139e1c81392dc2890354d6b339b9a6a77feaab53cc3f10c452d1a5a107 +size 115616 diff --git a/docs/intelligentapps/images/overview/get_started.png b/docs/intelligentapps/images/overview/get_started.png new file mode 100644 index 0000000000..d5e5b3944e --- /dev/null +++ b/docs/intelligentapps/images/overview/get_started.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:247d45fe9dbed6f0c6c4a5ae61d92544e1cb4cc88ac0d835258de51a1bce40f4 +size 287068 diff --git a/docs/intelligentapps/images/overview/install.png b/docs/intelligentapps/images/overview/install.png new file mode 100644 index 0000000000..72cfd0d966 --- /dev/null +++ 
b/docs/intelligentapps/images/overview/install.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:280f631d5af5a2c267073847b1853ae85fb833f44da8f45dde3964f1a7444c25 +size 104833 diff --git a/docs/intelligentapps/images/playground/0-entrypoint-treeview.png b/docs/intelligentapps/images/playground/0-entrypoint-treeview.png new file mode 100644 index 0000000000..08f1a580cf --- /dev/null +++ b/docs/intelligentapps/images/playground/0-entrypoint-treeview.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:53d54adbb4beee43751d9aa676539cfa56022bb16512088967943386c4a1d47f +size 176932 diff --git a/docs/intelligentapps/images/playground/1-entrypoint-command.png b/docs/intelligentapps/images/playground/1-entrypoint-command.png new file mode 100644 index 0000000000..791d7a9c72 --- /dev/null +++ b/docs/intelligentapps/images/playground/1-entrypoint-command.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ae9b1dbbcc73123752787471ae508360ceceeca3abfe53cdc8364a91fd42a6bb +size 208234 diff --git a/docs/intelligentapps/images/playground/2-model-name.png b/docs/intelligentapps/images/playground/2-model-name.png new file mode 100644 index 0000000000..5d8fe3e454 --- /dev/null +++ b/docs/intelligentapps/images/playground/2-model-name.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:07f9379c5c880a92eaa1c31e44ea77ce772d92aafeb611611050e553380d836d +size 185433 diff --git a/docs/intelligentapps/images/playground/3-endpoint.png b/docs/intelligentapps/images/playground/3-endpoint.png new file mode 100644 index 0000000000..e17d0b9522 --- /dev/null +++ b/docs/intelligentapps/images/playground/3-endpoint.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d0b2d99c9dd7fd9601a22d7d89aea8ae0fcc157b25eac855150d2a9d82fc886a +size 197164 diff --git a/docs/intelligentapps/images/playground/4-auth-header.png b/docs/intelligentapps/images/playground/4-auth-header.png new file 
mode 100644 index 0000000000..fec5b7d622 --- /dev/null +++ b/docs/intelligentapps/images/playground/4-auth-header.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:96932f24e1640163e5fa8344106399cb96bf8d50d056e39fa511fbfb3e19b909 +size 191108 diff --git a/docs/intelligentapps/images/playground/5-inference.png b/docs/intelligentapps/images/playground/5-inference.png new file mode 100644 index 0000000000..074b58732a --- /dev/null +++ b/docs/intelligentapps/images/playground/5-inference.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:28e8defb1415cf1279f322e8ce02c3b677f69c6faf26c5e8c751cb4aa09c2e52 +size 248140 diff --git a/docs/intelligentapps/images/playground/attachment.png b/docs/intelligentapps/images/playground/attachment.png new file mode 100644 index 0000000000..9e93824d29 --- /dev/null +++ b/docs/intelligentapps/images/playground/attachment.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c1e1bedc954acb51d7ff05468edde2c93ff606be6d1d823b175a9a245531c8e5 +size 16211 diff --git a/docs/intelligentapps/images/playground/parameters.png b/docs/intelligentapps/images/playground/parameters.png new file mode 100644 index 0000000000..e0eadb1d0d --- /dev/null +++ b/docs/intelligentapps/images/playground/parameters.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0304855d0e69945c4f5d9195367babf9c812d243649ca283b717141583daabb9 +size 28318 diff --git a/docs/intelligentapps/images/playground/playground.png b/docs/intelligentapps/images/playground/playground.png new file mode 100644 index 0000000000..908d78a0fc --- /dev/null +++ b/docs/intelligentapps/images/playground/playground.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:976196ecb65ea747419503289773d1c674a0d25bedc21c8bf73a57b20bb4a856 +size 240110 diff --git a/docs/intelligentapps/models.md b/docs/intelligentapps/models.md new file mode 100644 index 0000000000..5b810452a6 --- /dev/null 
+++ b/docs/intelligentapps/models.md
@@ -0,0 +1,107 @@
+---
+Order: 2
+Area: intelligentapps
+TOCTitle: Models
+ContentId:
+PageTitle: AI Models in AI Toolkit
+DateApproved:
+MetaDescription: Find a popular generative AI model by publisher and source. Bring your own model that is hosted with a URL, or select an Ollama model.
+MetaSocialImage:
+---
+
+# Models in AI Toolkit
+
+AI Toolkit supports a broad range of generative AI models. Both Small Language Models (SLMs) and Large Language Models (LLMs) are supported.
+
+In the model catalog, you can access models from various sources:
+
+- GitHub-hosted models (Llama3, Phi-3, Mistral models)
+- Publisher-hosted models (OpenAI ChatGPT models, Anthropic Claude, Google Gemini)
+- Locally downloaded models, for example from Hugging Face
+- Locally running Ollama models
+- Bring-your-own-models that you connect to remotely
+
+## Find a model
+
+To find a model in the model catalog:
+
+1. In the AI Toolkit view, select **CATALOG** > **Models** to open the model catalog
+
+1. Use the filters to reduce the list of available models
+
+    - **Hosted by**: AI Toolkit supports GitHub, ONNX, OpenAI, Anthropic, and Google as model hosting sources.
+
+    - **Publisher**: The publisher of the AI model, such as Microsoft, Meta, Google, OpenAI, Anthropic, Mistral AI, and more.
+
+    - **Tasks**: Currently, only `Text Generation` is supported.
+
+    - **Model type**: Filter models that can run remotely, or locally on CPU, GPU, or NPU. This filter depends on your local device capabilities.
+
+    - **Fine-tuning Support**: A toggle that shows only the models that can be used for fine-tuning.
+
+    ![Select model in model catalog](./images/models/model_catalog.png)
+
+To reference a self-hosted model or locally running Ollama model:
+
+1. Select **+ Add model** in the model catalog
+
+1. Choose between an Ollama model or a custom model in the Quick Pick
+
+1. Provide details to add the model
+
+Select a model card in the model catalog to view more details of the selected model.
+
+## License and sign-in
+
+Some models require a publisher or hosting-service license and an account to sign in. In that case, you are prompted to sign in when you select the model in the model catalog to run it in the playground.
+
+## Select a model for testing
+
+You can test run each of the models in the playground for chat completions.
+
+On each model card, there are several options:
+
+- **Try in Playground**: load the selected model for testing in the playground without downloading it
+- **Download**: download the model from a source like Hugging Face
+- **Load in Playground**: load a downloaded model into the playground for chat
+
+## Bring your own models
+
+AI Toolkit's playground also supports remote models.
+
+If you have a self-hosted or deployed model that is accessible from the internet, you can add it to AI Toolkit and use it in the playground.
+
+1. Hover over **MY MODELS** in the tree view, and select the `+` icon to add a remote model to AI Toolkit.
+1. Fill in the requested information, such as model name, display name, model hosting URL, and optional authentication string.
+
+![Bring Your Own Models](./images/models/byom.png)
+
+## Add Ollama models
+
+Ollama enables many popular genAI models to run locally on the CPU via GGUF quantization. If you have Ollama installed on your local machine with downloaded Ollama models, you can add them to AI Toolkit to use in the playground.
+
+### Prerequisites
+
+- AI Toolkit v0.6.2 or newer.
+- [Ollama](https://ollama.com/download) (tested on Ollama v0.4.1)
+
+### Steps to add local Ollama into AI Toolkit
+
+1. Select the "+" icon while hovering over **MY MODELS** in the tree view, or select the **+ Add model** button in the model catalog or playground.
+
+1. Select **Add an Ollama model**
+
+    ![Select model type to add](./images/models/select-type.png)
+
+1. Select **Select models from Ollama library**.
If you run the Ollama runtime at a different address, choose **Provide custom Ollama endpoint** to specify an Ollama endpoint.
+
+    ![Select Ollama models](./images/models/select-ollama.png)
+
+1. Select the models you want to add to AI Toolkit.
+
+    ![Select available models to add](./images/models/select-models.png)
+
+    > Note that AI Toolkit only shows models that are already downloaded in Ollama and not already added to AI Toolkit. To download a model from Ollama, run `ollama pull <model-name>`. You can see the list of models supported by Ollama in the [Ollama library](https://ollama.com/library) or refer to the [Ollama documentation](https://github.com/ollama/ollama).
+
+1. You will see the added Ollama model in the tree view's **MY MODELS** list. Use this Ollama model in the same way as the other models in the playground.
+
+    > Attachments are not yet supported for Ollama models, because AI Toolkit connects to Ollama through its [OpenAI-compatible endpoint](https://github.com/ollama/ollama/blob/main/docs/openai.md), which doesn't support attachments yet.
diff --git a/docs/intelligentapps/overview.md b/docs/intelligentapps/overview.md
new file mode 100644
index 0000000000..c38bccadc3
--- /dev/null
+++ b/docs/intelligentapps/overview.md
@@ -0,0 +1,50 @@
+---
+Order: 1
+Area: intelligentapps
+TOCTitle: Overview
+ContentId:
+PageTitle: AI Toolkit Overview
+DateApproved:
+MetaDescription: Develop and test AI apps with AI Toolkit for Visual Studio Code. Inference test, batch run, evaluate, finetune and deploy LLMs and SLMs.
+MetaSocialImage:
+---
+
+# AI Toolkit for Visual Studio Code
+
+AI Toolkit for Visual Studio Code is an extension that helps developers and AI engineers easily build AI apps by developing and testing with generative AI models, locally or in the cloud. AI Toolkit supports most genAI models on the market.
+
+AI engineers can use AI Toolkit to easily discover and try popular AI models in a playground that has attachment support, run multiple prompts in batch mode, evaluate prompts in a dataset against AI models with popular evaluators, and fine-tune and deploy AI models.
+
+## Key features
+
+- [Model catalog](/docs/intelligentapps/models.md) with rich generative AI model sources (GitHub, ONNX, OpenAI, Anthropic, Google, ...)
+- [Bring Your Own Models](/docs/intelligentapps/models.md#bring-your-own-models) from a remotely hosted model, or Ollama models that are running locally
+- [Playground](/docs/intelligentapps/playground.md) for model inference or testing via chat
+- Attachment support for multi-modal language models
+- [Batch run prompts](/docs/intelligentapps/bulkrun.md) for selected AI models
+- [Evaluate an AI model with a dataset](/docs/intelligentapps/evaluation.md) for supported popular evaluators like F1 score, relevance, similarity, coherence, and more
+
+## Who should use AI Toolkit?
+
+Any developer who wants to explore, test, evaluate, and fine-tune generative AI models when building AI apps.
+
+## Install and setup
+
+You can install the AI Toolkit from the Extensions view in VS Code:
+
+> Install the AI Toolkit for VS Code
+
+You can switch the installation between the formally released version for stable features and the pre-release version for early access to new features. Check the What's New information during installation for the detailed feature list of each version.
+
+## Getting started
+
+After you install the AI Toolkit, follow the steps in the AI Toolkit getting started walkthrough. The walkthrough takes you through the playground, where you can use chat to interact with AI models.
+
+1. Select the AI Toolkit view in the Activity Bar
+
+1.
In the **Help and Feedback** section, select **Get Started** to open the walkthrough
+
+    ![Getting started](./images/overview/get_started.png)
+
+Learn more about [models](./models.md) and the [playground](./playground.md)
\ No newline at end of file
diff --git a/docs/intelligentapps/playground.md b/docs/intelligentapps/playground.md
new file mode 100644
index 0000000000..70aed29928
--- /dev/null
+++ b/docs/intelligentapps/playground.md
@@ -0,0 +1,47 @@
+---
+Order: 3
+Area: intelligentapps
+TOCTitle: Playground
+ContentId:
+PageTitle: AI Model Playground
+DateApproved:
+MetaDescription: Chat with a selected generative AI model in the playground. Change the system prompt and parameters. Add attachments for multi-modal models. Keep chat history.
+MetaSocialImage:
+---
+
+# AI Toolkit playground
+
+The AI Toolkit playground is where you can interact with your AI models and try different prompts with different model parameter settings. You can also use the playground to interact with multi-modal models that support attachments in different input formats.
+
+![Playground view](./images/playground/playground.png)
+
+## Test a model in the playground
+
+There are multiple options to open the model playground:
+
+- In AI Toolkit, select **Playground** in the tree view
+
+- Select **Load in Playground** or **Try in Playground** from a model card in the model catalog
+
+To test a model in the playground:
+
+1. Select a model
+1. Insert context instructions (optional)
+1. Change model parameters (optional)
+1. Type prompts or questions in the chat box at the bottom of the playground
+1. Send the prompts and wait for responses
+
+![Setting model parameters](./images/playground/parameters.png)
+
+From the chat input box, you can also clear the chat history or add attachments for the prompt.
+
+## Add attachments for multi-modal models
+
+Multi-modal models allow you to include attachments in different formats, such as images, voice, video, or documents, and send them together with your prompts to the AI model.
You can ask questions about the contents of the attachments.
+
+For models that support attachments, the attachment (paperclip) icon is shown. Select the icon, and follow the instructions to attach one or more local files and use them with your prompt.
+
+![Adding attachments](./images/playground/attachment.png)
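+The playground takes care of encoding attachments for you. If you later call a multi-modal model's OpenAI-compatible chat API from your own code, an image attachment is commonly passed as a base64 data URL inside the message content. The sketch below shows how such a message payload is typically assembled; the exact schema varies by provider, and the `build_multimodal_message` helper is purely illustrative:
+
+```python
+import base64
+
+def build_multimodal_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
+    """Pair a text prompt with an inline base64-encoded image, in the
+    OpenAI-style chat message format used by many compatible APIs."""
+    encoded = base64.b64encode(image_bytes).decode("ascii")
+    return {
+        "role": "user",
+        "content": [
+            {"type": "text", "text": prompt},
+            # The image travels inline as a data URL; many providers
+            # also accept a plain HTTPS URL here instead.
+            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
+        ],
+    }
+
+# Example: attach image bytes to a question before sending the message
+# list to a chat-completions endpoint.
+with_attachment = build_multimodal_message(
+    "What is shown in this image?",
+    b"\x89PNG\r\n...",  # placeholder bytes; read a real file in practice
+)
+```
+
+This only sketches the wire format; in the playground itself you simply select the paperclip icon and the extension handles the rest.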