[DRAFT] Run llm tests for anthropic in ci #517

Closed
wants to merge 29 commits into from
29 commits
26099d4
Gemini tests for CI: WIP
rjambrecic Jan 14, 2025
1b0923f
Mark all openai tests also with llm mark
rjambrecic Jan 14, 2025
6587f5b
Change skipping openai tests to skip llm tests
rjambrecic Jan 14, 2025
089af36
Update openai.yml to have llm-service matrix parameter
rjambrecic Jan 14, 2025
2d1a092
Comment out environment in openai.yml just for testing
rjambrecic Jan 14, 2025
1dc7b4f
Comment out environment in openai.yml just for testing
rjambrecic Jan 14, 2025
8ed3f06
openai.yml work in progress
rjambrecic Jan 14, 2025
98486ca
openai.yml work in progress
rjambrecic Jan 14, 2025
37ed55a
openai.yml work in progress
rjambrecic Jan 14, 2025
aa36bcc
openai.yml work in progress
rjambrecic Jan 14, 2025
880f536
openai.yml WIP
rjambrecic Jan 14, 2025
6baae30
CI refactoring
rjambrecic Jan 15, 2025
f067b6a
Remove llm pytest mark
rjambrecic Jan 15, 2025
1b536de
llm.yml WIP
rjambrecic Jan 15, 2025
8b944ef
llm.yml WIP
rjambrecic Jan 15, 2025
a38e9ac
llm.yml WIP
rjambrecic Jan 15, 2025
a61ab46
llm.yml WIP
rjambrecic Jan 15, 2025
4c660a7
llm.yml WIP
rjambrecic Jan 15, 2025
5eb309b
Remove pytest skip for windows and mac if test is marked with openai
rjambrecic Jan 15, 2025
e2a6602
Gemini LLM tests WIP
rjambrecic Jan 15, 2025
0c17e52
Gemini LLM tests WIP
rjambrecic Jan 15, 2025
46a7c6b
10 gemini tests added
rjambrecic Jan 15, 2025
62d3a39
Rename test scripts
rjambrecic Jan 15, 2025
e7b6298
Update docs
rjambrecic Jan 15, 2025
6fc3602
Update llm.yml
rjambrecic Jan 15, 2025
0aa52c9
Merge remote-tracking branch 'origin/main' into run-llm-tests-for-gem…
rjambrecic Jan 16, 2025
c325721
Tests for Anthropic LLMs: WIP
rjambrecic Jan 16, 2025
137d862
Tests for Anthropic LLMs: WIP
rjambrecic Jan 16, 2025
8a2d8d6
Add anthropic-claude-sonnet to conftest
rjambrecic Jan 16, 2025
8 changes: 4 additions & 4 deletions .github/workflows/build.yml
@@ -91,19 +91,19 @@ jobs:
# Remove the line below once https://github.com/docker/docker-py/issues/3256 is merged
run: |
uv pip install --system "requests<2.32.0"
bash scripts/test_skip_openai.sh
bash scripts/test-skip-llm.sh
- name: Test with pytest skipping openai and docker tests
if: matrix.python-version != '3.10' && matrix.os != 'ubuntu-latest'
run: |
bash scripts/test_skip_openai.sh --skip-docker
bash scripts/test-skip-llm.sh --skip-docker
- name: Coverage with Redis
if: matrix.python-version == '3.10'
run: |
uv pip install --system -e .[redis,websockets]
bash scripts/test_skip_openai.sh
bash scripts/test-skip-llm.sh
- name: Test with Cosmos DB
run: |
bash scripts/test.sh test/cache/test_cosmos_db_cache.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/cache/test_cosmos_db_cache.py
- name: Upload coverage to Codecov
if: matrix.python-version == '3.10'
uses: codecov/codecov-action@v3
4 changes: 2 additions & 2 deletions .github/workflows/contrib-graph-rag-tests.yml
@@ -60,7 +60,7 @@ jobs:
OAI_CONFIG_LIST: ${{ secrets.OAI_CONFIG_LIST }}
run: |
uv pip install --system "pytest-cov>=5"
bash scripts/test.sh test/agentchat/contrib/graph_rag/test_falkor_graph_rag.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/agentchat/contrib/graph_rag/test_falkor_graph_rag.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -108,7 +108,7 @@ jobs:
OAI_CONFIG_LIST: ${{ secrets.OAI_CONFIG_LIST }}
run: |
uv pip install --system "pytest-cov>=5"
bash scripts/test.sh test/agentchat/contrib/graph_rag/test_neo4j_graph_rag.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/agentchat/contrib/graph_rag/test_neo4j_graph_rag.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
42 changes: 21 additions & 21 deletions .github/workflows/contrib-tests.yml
@@ -61,7 +61,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/test_retrieve_utils.py test/agentchat/contrib/retrievechat/test_retrievechat.py test/agentchat/contrib/retrievechat/test_qdrant_retrievechat.py test/agentchat/contrib/vectordb -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/test_retrieve_utils.py test/agentchat/contrib/retrievechat/test_retrievechat.py test/agentchat/contrib/retrievechat/test_qdrant_retrievechat.py test/agentchat/contrib/vectordb
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -131,7 +131,7 @@ jobs:
- name: Coverage
run: |
uv pip install --system "pytest-cov>=5"
bash scripts/test.sh test/test_retrieve_utils.py test/agentchat/contrib/retrievechat test/agentchat/contrib/vectordb -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/test_retrieve_utils.py test/agentchat/contrib/retrievechat test/agentchat/contrib/vectordb
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -162,7 +162,7 @@ jobs:
uv pip install --system -e .
- name: Coverage
run: |
bash scripts/test.sh test/agentchat/contrib/agent_eval/ -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/agentchat/contrib/agent_eval/
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -199,7 +199,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/agentchat/contrib/test_gpt_assistant.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/agentchat/contrib/test_gpt_assistant.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -236,7 +236,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/agentchat/contrib/capabilities/test_teachable_agent.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/agentchat/contrib/capabilities/test_teachable_agent.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -273,7 +273,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/test_browser_utils.py test/agentchat/contrib/test_web_surfer.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/test_browser_utils.py test/agentchat/contrib/test_web_surfer.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -312,11 +312,11 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/agentchat/contrib/test_img_utils.py test/agentchat/contrib/test_lmm.py test/agentchat/contrib/test_llava.py test/agentchat/contrib/capabilities/test_vision_capability.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/agentchat/contrib/test_img_utils.py test/agentchat/contrib/test_lmm.py test/agentchat/contrib/test_llava.py test/agentchat/contrib/capabilities/test_vision_capability.py
- name: Image Gen Coverage
if: ${{ matrix.os != 'windows-latest' && matrix.python-version != '3.13' }}
run: |
bash scripts/test.sh test/agentchat/contrib/capabilities/test_image_generation_capability.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/agentchat/contrib/capabilities/test_image_generation_capability.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -358,7 +358,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/oai/test_gemini.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/oai/test_gemini.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -395,7 +395,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/agentchat/contrib/capabilities/test_transform_messages.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/agentchat/contrib/capabilities/test_transform_messages.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -427,7 +427,7 @@ jobs:
uv pip install --system llama-index llama-index-llms-openai
- name: Coverage
run: |
bash scripts/test.sh test/agentchat/contrib/test_llamaindex_conversable_agent.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/agentchat/contrib/test_llamaindex_conversable_agent.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -470,7 +470,7 @@ jobs:

- name: Coverage
run: |
bash scripts/test.sh test/oai/test_anthropic.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/oai/test_anthropic.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -512,7 +512,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/oai/test_cerebras.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/oai/test_cerebras.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -554,7 +554,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/oai/test_mistral.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/oai/test_mistral.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -596,7 +596,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/oai/test_together.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/oai/test_together.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -638,7 +638,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/oai/test_groq.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/oai/test_groq.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -676,7 +676,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/oai/test_cohere.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/oai/test_cohere.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -718,7 +718,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/oai/test_ollama.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/oai/test_ollama.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -760,7 +760,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/oai/test_bedrock.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/oai/test_bedrock.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -802,7 +802,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/agentchat/contrib/test_swarm.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/agentchat/contrib/test_swarm.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
@@ -855,7 +855,7 @@ jobs:
fi
- name: Coverage
run: |
bash scripts/test.sh test/agentchat/contrib/test_reasoning_agent.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/agentchat/contrib/test_reasoning_agent.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
2 changes: 1 addition & 1 deletion .github/workflows/docs-test.yml
@@ -40,7 +40,7 @@ jobs:
uv pip install --system -e ".[test,docs]"
- name: Run documentation tests
run: |
bash scripts/test.sh test/website/test_process_api_reference.py test/website/test_process_notebooks.py -m "not openai"
bash scripts/test-contrib-skip-llm.sh test/website/test_process_api_reference.py test/website/test_process_notebooks.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
35 changes: 12 additions & 23 deletions .github/workflows/openai.yml → .github/workflows/llm.yml
@@ -1,7 +1,7 @@
# This workflow will install Python dependencies and run tests with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: OpenAI
name: LLM Tests

on:
pull_request:
@@ -25,7 +25,8 @@ jobs:
strategy:
matrix:
os: [ubuntu-latest]
python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
python-version: ["3.9"]
llm: ["openai", "gemini", "anthropic"]
runs-on: ${{ matrix.os }}
environment: openai1
services:
@@ -48,36 +49,24 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install packages and dependencies
if: matrix.llm == 'openai'
run: |
docker --version
uv pip install --system -e ".[test]"
uv pip install --system -e ".[test,redis,interop]"
python -c "import autogen"
- name: Install packages for test when needed
if: matrix.python-version == '3.9'
- name: Install packages for ${{ matrix.llm }}
if: matrix.llm == 'gemini' || matrix.llm == 'anthropic'
run: |
uv pip install --system docker
uv pip install --system -e .[redis,interop]
- name: Coverage
if: matrix.python-version == '3.9'
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
OAI_CONFIG_LIST: ${{ secrets.OAI_CONFIG_LIST }}
run: |
bash scripts/test.sh test --ignore=test/agentchat/contrib --durations=10 --durations-min=1.0
- name: Coverage and check notebook outputs
if: matrix.python-version != '3.9'
docker --version
uv pip install --system -e ".[test,redis,interop,${{ matrix.llm }}]"
python -c "import autogen"
- name: Tests - ${{ matrix.llm }}
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
WOLFRAM_ALPHA_APPID: ${{ secrets.WOLFRAM_ALPHA_APPID }}
OAI_CONFIG_LIST: ${{ secrets.OAI_CONFIG_LIST }}
run: |
uv pip install --system nbconvert nbformat ipykernel
bash scripts/test.sh test/test_notebook.py --durations=10 --durations-min=1.0
cat "$(pwd)/test/executed_openai_notebook_output.txt"
run: bash scripts/test-llm.sh -m "${{ matrix.llm }}"
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
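The `llm` matrix in this workflow expands into one job per provider, and each job passes a single pytest marker to the test script. As a rough local illustration (a sketch only: `print_llm_matrix_commands` is a hypothetical helper that is not part of this PR, and the script path assumes the `scripts/test-llm.sh` file added in this diff), the per-provider commands can be listed like this:

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the PR): prints the command that each
# entry of the `llm` matrix would run. It does not execute any tests.
print_llm_matrix_commands() {
  local llm
  for llm in openai gemini anthropic; do
    echo "bash scripts/test-llm.sh -m \"${llm}\""
  done
}

print_llm_matrix_commands
```

Running one of the printed commands by hand should approximate a single matrix job, provided the matching API keys are exported in the environment.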
1 change: 1 addition & 0 deletions pyproject.toml
@@ -249,6 +249,7 @@ markers = [
"all",
"openai",
"gemini",
"anthropic",
"redis",
"docker",
]
Original file line number Diff line number Diff line change
@@ -4,4 +4,4 @@

#!/usr/bin/env bash

bash scripts/test.sh -m "not openai" --ignore=test/agentchat/contrib "$@"
bash scripts/test.sh -m "not (openai or gemini or anthropic)" "$@"
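The skip expression above hard-codes the provider list, so every new provider marker has to be added by hand. One possible way to keep the list in a single place is sketched below; `build_skip_expr` is a hypothetical helper and is not part of this PR.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the PR): builds a pytest marker
# expression of the form "not (a or b or c)" from its arguments.
build_skip_expr() {
  if [[ $# -eq 0 ]]; then
    echo "error: at least one marker is required" >&2
    return 1
  fi
  local expr="$1" mark
  shift
  for mark in "$@"; do
    expr="${expr} or ${mark}"
  done
  echo "not (${expr})"
}

build_skip_expr openai gemini anthropic
```

Under that assumption the script body could read `bash scripts/test.sh -m "$(build_skip_expr openai gemini anthropic)" "$@"`, which produces the same expression as the hard-coded line.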
27 changes: 27 additions & 0 deletions scripts/test-llm.sh
@@ -0,0 +1,27 @@
#!/usr/bin/env bash
#
# Copyright (c) 2023 - 2024, Owners of https://github.com/ag2ai
#
# SPDX-License-Identifier: Apache-2.0

# Default mark if none is provided
DEFAULT_MARK="openai or gemini or anthropic"

# Initialize MARK as the default value
MARK="$DEFAULT_MARK"

# Parse arguments for the -m flag
while [[ $# -gt 0 ]]; do
case $1 in
-m)
MARK="$2" # Set MARK to the provided value
shift 2 # Remove -m and its value from arguments
;;
*)
break # If no more flags, stop processing options
;;
esac
done

# Call the test script with the correct mark and any remaining arguments
bash scripts/test.sh "$@" -m "$MARK" --ignore=test/agentchat/contrib
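The option loop above calls `shift 2` even when `-m` is the last argument, in which case the shift fails and the default mark is silently kept. The sketch below mirrors the same parsing with an explicit check for that case. It only prints the command it would run, and `build_llm_test_cmd` is a hypothetical name used for illustration, not something defined in this PR.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the PR): mirrors the -m parsing of
# scripts/test-llm.sh but prints the resulting command instead of running it.
build_llm_test_cmd() {
  local mark="openai or gemini or anthropic"  # default when -m is absent
  while [[ $# -gt 0 ]]; do
    case $1 in
      -m)
        if [[ $# -lt 2 ]]; then
          echo "error: -m requires a value" >&2
          return 2
        fi
        mark="$2"
        shift 2
        ;;
      *)
        break  # first non-option argument ends option parsing
        ;;
    esac
  done
  local rest="$*"
  if [[ -n $rest ]]; then
    rest=" ${rest}"
  fi
  echo "bash scripts/test.sh${rest} -m \"${mark}\" --ignore=test/agentchat/contrib"
}

build_llm_test_cmd -m gemini
```

Because parsing stops at the first argument that is not `-m`, a later `-m` in the remaining arguments is forwarded as-is, so the mark is best placed first.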
7 changes: 7 additions & 0 deletions scripts/test-skip-llm.sh
@@ -0,0 +1,7 @@
#!/usr/bin/env bash
#
# Copyright (c) 2023 - 2024, Owners of https://github.com/ag2ai
#
# SPDX-License-Identifier: Apache-2.0

bash scripts/test-contrib-skip-llm.sh --ignore=test/agentchat/contrib "$@"
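The wrapper above adds one flag and forwards everything else through `"$@"`. To see which arguments finally reach `scripts/test.sh`, the chain can be simulated with shell functions. This is a sketch under assumptions: the function names are hypothetical stand-ins, and it assumes `scripts/test-contrib-skip-llm.sh` contains the `-m "not (openai or gemini or anthropic)"` line shown earlier in this diff.

```shell
#!/usr/bin/env bash

# Stand-in for scripts/test.sh: prints each received argument on its own line.
test_sh() {
  printf '%s\n' "$@"
}

# Stand-in for scripts/test-contrib-skip-llm.sh (contents assumed, see lead-in).
test_contrib_skip_llm() {
  test_sh -m "not (openai or gemini or anthropic)" "$@"
}

# Stand-in for scripts/test-skip-llm.sh.
test_skip_llm() {
  test_contrib_skip_llm --ignore=test/agentchat/contrib "$@"
}

test_skip_llm --skip-docker
```

The marker expression arrives as a single argument because every layer quotes `"$@"`, and caller flags such as `--skip-docker` end up after the wrapper's own flags.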