Add Quarto evaluation workflow and test suite #24
Open
karangattu wants to merge 5 commits into t-kalinowski:main from karangattu:main
2925325 Add Quarto evaluation workflow and test suite (karangattu)
d7785ba Fix solver function reference and update callout example (karangattu)
babfe86 Update eval workflow to use quartohelp package (karangattu)
46650d3 Update vitals package source in workflow (karangattu)
13c44ae Update R dependencies in eval-and-publish workflow (karangattu)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,106 @@ | ||
| # Workflow to run evaluations on push to main and publish reports to Posit Connect | ||
| on: | ||
| push: | ||
| branches: [main] | ||
| workflow_dispatch: | ||
| | ||
| name: Evaluate and Publish | ||
| | ||
| permissions: | ||
| contents: read | ||
| | ||
| jobs: | ||
| evaluate: | ||
| runs-on: ubuntu-latest | ||
| name: Run Evaluation and Publish to Connect | ||
| | ||
| env: | ||
| GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} | ||
| R_KEEP_PKG_SOURCE: yes | ||
| | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| | ||
| - uses: r-lib/actions/setup-r@v2 | ||
| with: | ||
| r-version: release | ||
| use-public-rspm: true | ||
| | ||
| - uses: r-lib/actions/setup-r-dependencies@v2 | ||
| with: | ||
| extra-packages: | | ||
| tidyverse/vitals | ||
| any::ellmer | ||
| any::readr | ||
| any::ragnar | ||
| any::rsconnect | ||
| needs: check | ||
| | ||
| - name: Install quartohelp package | ||
| run: | | ||
| R CMD INSTALL . | ||
| | ||
| - name: Download ragnar store | ||
| env: | ||
| OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }} | ||
| run: | | ||
| Rscript -e 'quartohelp::update_store()' | ||
| | ||
| - name: Run Evaluation | ||
| env: | ||
| OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }} | ||
| run: | | ||
| Rscript evals/eval_quartohelp.R | ||
| | ||
| - name: Publish to Posit Connect | ||
| env: | ||
| CONNECT_SERVER: ${{ secrets.CONNECT_SERVER }} | ||
| CONNECT_API_KEY: ${{ secrets.CONNECT_API_KEY }} | ||
| run: | | ||
| Rscript -e ' | ||
| library(rsconnect) | ||
| | ||
| # Configure Posit Connect account | ||
| rsconnect::addConnectServer( | ||
| url = Sys.getenv("CONNECT_SERVER"), | ||
| name = "posit-connect" | ||
| ) | ||
| | ||
| rsconnect::connectApiUser( | ||
| account = "github-actions", | ||
| server = "posit-connect", | ||
| apiKey = Sys.getenv("CONNECT_API_KEY") | ||
| ) | ||
| | ||
| # Use a fixed app name so it always updates the same deployment | ||
| app_name <- "quartohelp-eval" | ||
| | ||
| # Deploy the evaluation bundle as static content | ||
| rsconnect::deployApp( | ||
| appDir = "./quarto_eval_bundle", | ||
| appName = app_name, | ||
| appTitle = "Quartohelp Evaluation Report", | ||
| server = "posit-connect", | ||
| account = "github-actions", | ||
| forceUpdate = TRUE, | ||
| launch.browser = FALSE | ||
| ) | ||
| | ||
| # Output the URL | ||
| cat("DEPLOY_URL=", rsconnect::deployments("./quarto_eval_bundle")$url[1], "\n", | ||
| file = Sys.getenv("GITHUB_OUTPUT"), append = TRUE, sep = "") | ||
| ' | ||
| | ||
| - name: Upload evaluation artifacts | ||
| uses: actions/upload-artifact@v4 | ||
| with: | ||
| name: evaluation-bundle | ||
| path: quarto_eval_bundle/ | ||
| retention-days: 30 | ||
| | ||
| - name: Upload evaluation logs | ||
| uses: actions/upload-artifact@v4 | ||
| with: | ||
| name: evaluation-logs | ||
| path: logs/ | ||
| retention-days: 30 | ||

evals/eval_quartohelp.R

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| library(vitals) | ||
| library(ellmer) | ||
| library(readr) | ||
| source("R/store.r") | ||
| source("R/client.R") | ||
| | ||
| quarto_evaluation_suite <- read_csv( | ||
| "evals/quarto_evaluation_suite.csv", | ||
| col_types = cols(input = col_character(), target = col_character()) | ||
| ) | ||
| | ||
| cat( | ||
| "Loaded evaluation suite with", | ||
| nrow(quarto_evaluation_suite), | ||
| "test cases\n" | ||
| ) | ||
| | ||
| vitals::vitals_log_dir_set("./logs") | ||
| | ||
| tsk <- Task$new( | ||
| dataset = quarto_evaluation_suite, | ||
| solver = generate(solver_chat = chat_quartohelp), | ||
| scorer = model_graded_qa( | ||
| scorer_chat = chat_openai(model = "gpt-5-nano-2025-08-07"), | ||
| partial_credit = TRUE | ||
| ) | ||
| ) | ||
| | ||
| cat("Total test cases:", nrow(quarto_evaluation_suite), "\n\n") | ||
| | ||
| tsk$eval(view = FALSE) | ||
| | ||
| bundle_dir <- "./quarto_eval_bundle" | ||
| vitals_bundle(output_dir = bundle_dir, overwrite = TRUE) | ||
| | ||
| cat("\n✅ Evaluation complete!\n") | ||
| cat("📦 Bundle created at: ", bundle_dir, "\n", sep = "") |

evals/quarto_evaluation_suite.csv

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| input,target | ||
| "I need to build a multi-language data science website with R analysis, Python machine learning models, interactive visualizations, and automatic deployment to GitHub Pages whenever I push changes. What's the full setup?","Should search knowledge store, provide _quarto.yml and example .qmd files with R and Python chunks, mention GitHub Actions, link to websites, computation, publishing, and CI/CD docs" | ||
| "I'm creating a book with 20 chapters where each chapter has Python code, cross-referenced figures and tables, citations, and needs to output as both HTML website and PDF. The PDF needs different styling than HTML. How do I organize this?","Should search knowledge store, provide _quarto.yml for book project with multiple formats, example chapter .qmd with Python, cross-refs, citations, format-specific options, link to books, multi-format, cross-references, citations docs" | ||
| "I want to build a data dashboard that updates daily from an API, shows interactive plots with Plotly, includes filterable tables, has tabs for different data views, works on mobile, gets published automatically to Quarto Pub, and sends alerts if data quality issues are detected. What technologies and Quarto features do I need?","Should search knowledge store, provide .qmd dashboard example with Plotly, note external automation for API/alerts, mention responsive design, publishing commands, link to dashboards, interactivity, publishing docs" | ||
| "I'm creating an academic journal article with R analysis, need to submit to multiple journals with different citation styles, include supplementary materials as separate documents, cross-reference between main text and supplement, generate Word and PDF versions, and manage co-author comments. What's the workflow?","Should search knowledge store, provide manuscript .qmd example with R code, multiple CSL files, cross-references, multi-format output, link to manuscripts, citations, cross-references, multi-format docs" | ||
| "I want to create parameterized reports that run nightly via cron, execute R and Python code, generate different outputs based on parameters, email the results, and cache expensive computations between runs. What Quarto features should I use?","Should search knowledge store, provide .qmd with params in YAML, R and Python chunks, mention freeze for caching, explain external automation needed for cron/email, link to parameters, computation, project execution docs" | ||
| "How do I create a presentation with Quarto using Reveal.js?","Should search knowledge store, provide .qmd example with format: revealjs, link to presentations documentation" | ||
| "What's the syntax for cross-referencing figures in Quarto?","Should search knowledge store, provide .qmd example with #fig-label and @fig-label syntax, link to cross-references documentation" | ||
| "How do I create and reference tables in Quarto?","Should search knowledge store, provide .qmd example with #tbl-label and @tbl-label syntax, link to tables and cross-references docs" | ||
| "How do I add Shiny interactivity to a Quarto document?","Should search knowledge store, provide .qmd example with server: shiny in YAML, link to Shiny documentation" | ||
| "What's a callout block?","Should search knowledge store, provide .qmd example with :::{.callout-*} syntax, link to callouts documentation" | ||
| "What are Quarto extensions and how do I install them?","Should search knowledge store, provide quarto add command example, may include .qmd using extension, link to extensions documentation" | ||
| "What's the command to render a Quarto document?","Should search knowledge store, provide quarto render command example, may include basic .qmd file, link to CLI documentation" | ||
| "How do I fix this error: 'pandoc: command not found'?","Should search knowledge store for installation/troubleshooting docs, if not found inform user docs don't contain this specific error solution, may suggest checking installation, link to installation docs if available" | ||
| "How do I learn Python?","Should search knowledge store, recognize completely unrelated to Quarto, inform user this is outside scope of Quarto documentation, do not provide answer" | ||
| "Can Quarto solve world hunger?","Should search knowledge store (will fail), recognize nonsensical question, respond gracefully that this is outside Quarto scope" |
I think you can just add a `local::.` to the `extra-packages` list of the previous `setup-r-dependencies` action, which will automatically pull in all the quartohelp dependencies! https://github.com/r-lib/actions/tree/v2-branch/setup-r-dependencies#installing-the-local-package
I am getting an error, `Error in library(readr) : there is no package called ‘readr’`, if I do this:
I need the dev version of vitals since the feature is not on the CRAN version yet.
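
One possible way to reconcile the two comments above is to keep the explicit package list from the diff and add `local::.` alongside it. The fragment below is only a sketch, not the change the author committed. It assumes that `readr`, `rsconnect`, and the development version of vitals are not declared in quartohelp's DESCRIPTION, which would explain the `library(readr)` error when they are dropped from `extra-packages`.

```yaml
# Sketch only: the package list from the diff, plus local::.
# Assumption: readr, rsconnect, and dev vitals are not declared in
# quartohelp's DESCRIPTION, so they stay listed explicitly.
- uses: r-lib/actions/setup-r-dependencies@v2
  with:
    extra-packages: |
      local::.
      tidyverse/vitals
      any::ellmer
      any::readr
      any::ragnar
      any::rsconnect
    needs: check
```

If this works as the linked README describes, the separate "Install quartohelp package" step (`R CMD INSTALL .`) would probably become redundant, since `local::.` installs the package from the checked-out repository.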