Changes from 4 commits
106 changes: 106 additions & 0 deletions .github/workflows/eval-and-publish.yaml
@@ -0,0 +1,106 @@
# Workflow to run evaluations on push to main and publish reports to Posit Connect
on:
    push:
        branches: [main]
    workflow_dispatch:

name: Evaluate and Publish

permissions:
    contents: read

jobs:
    evaluate:
        runs-on: ubuntu-latest
        name: Run Evaluation and Publish to Connect

        env:
            GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
            R_KEEP_PKG_SOURCE: yes

        steps:
            - uses: actions/checkout@v4

            - uses: r-lib/actions/setup-r@v2
              with:
                  r-version: release
                  use-public-rspm: true

            - uses: r-lib/actions/setup-r-dependencies@v2
              with:
                  extra-packages: |
                      tidyverse/vitals
                      any::ellmer
                      any::readr
                      any::ragnar
                      any::rsconnect
                  needs: check

            - name: Install quartohelp package
              run: |
                  R CMD INSTALL .
Owner

I think you can just add local::. to the extra-packages list of the previous setup-r-dependencies step, which will install quartohelp and automatically pull in all of its dependencies: https://github.com/r-lib/actions/tree/v2-branch/setup-r-dependencies#installing-the-local-package

Author

If I do this, I get the error "Error in library(readr) : there is no package called ‘readr’":

            - uses: r-lib/actions/setup-r-dependencies@v2
              with:
                  extra-packages: |
                      local::.
                      tidyverse/vitals
                  needs: check

            - name: Download ragnar store
              env:
                  OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
              run: |
                  Rscript -e 'quartohelp::update_store()'

I need the dev version of vitals because the feature is not in the CRAN release yet.
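
One possible explanation, offered as a sketch rather than a confirmed diagnosis: local::. only pulls in the packages declared in quartohelp's DESCRIPTION, while evals/eval_quartohelp.R calls library(readr) and the publish step calls library(rsconnect). The fragment below assumes neither package is declared in DESCRIPTION (the DESCRIPTION file is not shown in this PR), so it keeps them in extra-packages alongside local::. and the dev version of vitals.

```yaml
# Sketch only: assumes readr and rsconnect are NOT declared in
# quartohelp's DESCRIPTION, so they still have to be requested explicitly.
- uses: r-lib/actions/setup-r-dependencies@v2
  with:
      extra-packages: |
          local::.
          tidyverse/vitals
          any::readr
          any::rsconnect
      needs: check
```

With local::. in the list, the separate "Install quartohelp package" step should no longer be needed. If readr is only used by the evaluation script, declaring it under Suggests in DESCRIPTION may be an alternative to listing it here.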


            - name: Download ragnar store
              env:
                  OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
              run: |
                  Rscript -e 'quartohelp::update_store()'

            - name: Run Evaluation
              env:
                  OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
              run: |
                  Rscript evals/eval_quartohelp.R

            - name: Publish to Posit Connect
              env:
                  CONNECT_SERVER: ${{ secrets.CONNECT_SERVER }}
                  CONNECT_API_KEY: ${{ secrets.CONNECT_API_KEY }}
              run: |
                  Rscript -e '
                  library(rsconnect)

                  # Configure Posit Connect account
                  rsconnect::addConnectServer(
                      url = Sys.getenv("CONNECT_SERVER"),
                      name = "posit-connect"
                  )

                  rsconnect::connectApiUser(
                      account = "github-actions",
                      server = "posit-connect",
                      apiKey = Sys.getenv("CONNECT_API_KEY")
                  )

                  # Use a fixed app name so it always updates the same deployment
                  app_name <- "quartohelp-eval"

                  # Deploy the evaluation bundle as static content
                  rsconnect::deployApp(
                      appDir = "./quarto_eval_bundle",
                      appName = app_name,
                      appTitle = "Quartohelp Evaluation Report",
                      server = "posit-connect",
                      account = "github-actions",
                      forceUpdate = TRUE,
                      launch.browser = FALSE
                  )

                  # Output the URL
                  cat("DEPLOY_URL=", rsconnect::deployments("./quarto_eval_bundle")$url[1], "\n",
                      file = Sys.getenv("GITHUB_OUTPUT"), append = TRUE, sep = "")
                  '

            - name: Upload evaluation artifacts
              uses: actions/upload-artifact@v4
              with:
                  name: evaluation-bundle
                  path: quarto_eval_bundle/
                  retention-days: 30

            - name: Upload evaluation logs
              uses: actions/upload-artifact@v4
              with:
                  name: evaluation-logs
                  path: logs/
                  retention-days: 30
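
The publish step writes DEPLOY_URL to GITHUB_OUTPUT, but the step has no id, so later steps cannot read that value. The fragment below is a minimal sketch of one way to consume it. The step id `publish` and the "Report deployment URL" step are hypothetical additions, not part of this PR.

```yaml
# Sketch only: `publish` is a hypothetical step id; env and run stay
# exactly as in the existing "Publish to Posit Connect" step.
- name: Publish to Posit Connect
  id: publish
  # env: and run: unchanged from the workflow above

# Hypothetical follow-up step: surface the URL in the job summary.
- name: Report deployment URL
  env:
      DEPLOY_URL: ${{ steps.publish.outputs.DEPLOY_URL }}
  run: |
      echo "Deployed to $DEPLOY_URL" >> "$GITHUB_STEP_SUMMARY"
```

Passing the value through env instead of interpolating the expression directly into the script keeps the URL out of the shell command text.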
37 changes: 37 additions & 0 deletions evals/eval_quartohelp.R
@@ -0,0 +1,37 @@
library(vitals)
library(ellmer)
library(readr)
source("R/store.r")
source("R/client.R")

quarto_evaluation_suite <- read_csv(
  "evals/quarto_evaluation_suite.csv",
  col_types = cols(input = col_character(), target = col_character())
)

cat(
  "Loaded evaluation suite with",
  nrow(quarto_evaluation_suite),
  "test cases\n"
)

vitals::vitals_log_dir_set("./logs")

tsk <- Task$new(
  dataset = quarto_evaluation_suite,
  solver = generate(solver_chat = chat_quartohelp),
  scorer = model_graded_qa(
    scorer_chat = chat_openai(model = "gpt-5-nano-2025-08-07"),
    partial_credit = TRUE
  )
)

cat("Total test cases:", nrow(quarto_evaluation_suite), "\n\n")

tsk$eval(view = FALSE)

bundle_dir <- "./quarto_eval_bundle"
vitals_bundle(output_dir = bundle_dir, overwrite = TRUE)

cat("\n✅ Evaluation complete!\n")
cat("📦 Bundle created at: ", bundle_dir, "\n", sep = "")
16 changes: 16 additions & 0 deletions evals/quarto_evaluation_suite.csv
@@ -0,0 +1,16 @@
input,target
"I need to build a multi-language data science website with R analysis, Python machine learning models, interactive visualizations, and automatic deployment to GitHub Pages whenever I push changes. What's the full setup?","Should search knowledge store, provide _quarto.yml and example .qmd files with R and Python chunks, mention GitHub Actions, link to websites, computation, publishing, and CI/CD docs"
"I'm creating a book with 20 chapters where each chapter has Python code, cross-referenced figures and tables, citations, and needs to output as both HTML website and PDF. The PDF needs different styling than HTML. How do I organize this?","Should search knowledge store, provide _quarto.yml for book project with multiple formats, example chapter .qmd with Python, cross-refs, citations, format-specific options, link to books, multi-format, cross-references, citations docs"
"I want to build a data dashboard that updates daily from an API, shows interactive plots with Plotly, includes filterable tables, has tabs for different data views, works on mobile, gets published automatically to Quarto Pub, and sends alerts if data quality issues are detected. What technologies and Quarto features do I need?","Should search knowledge store, provide .qmd dashboard example with Plotly, note external automation for API/alerts, mention responsive design, publishing commands, link to dashboards, interactivity, publishing docs"
"I'm creating an academic journal article with R analysis, need to submit to multiple journals with different citation styles, include supplementary materials as separate documents, cross-reference between main text and supplement, generate Word and PDF versions, and manage co-author comments. What's the workflow?","Should search knowledge store, provide manuscript .qmd example with R code, multiple CSL files, cross-references, multi-format output, link to manuscripts, citations, cross-references, multi-format docs"
"I want to create parameterized reports that run nightly via cron, execute R and Python code, generate different outputs based on parameters, email the results, and cache expensive computations between runs. What Quarto features should I use?","Should search knowledge store, provide .qmd with params in YAML, R and Python chunks, mention freeze for caching, explain external automation needed for cron/email, link to parameters, computation, project execution docs"
"How do I create a presentation with Quarto using Reveal.js?","Should search knowledge store, provide .qmd example with format: revealjs, link to presentations documentation"
"What's the syntax for cross-referencing figures in Quarto?","Should search knowledge store, provide .qmd example with #fig-label and @fig-label syntax, link to cross-references documentation"
"How do I create and reference tables in Quarto?","Should search knowledge store, provide .qmd example with #tbl-label and @tbl-label syntax, link to tables and cross-references docs"
"How do I add Shiny interactivity to a Quarto document?","Should search knowledge store, provide .qmd example with server: shiny in YAML, link to Shiny documentation"
"What's a callout block?","Should search knowledge store, provide .qmd example with :::{.callout-*} syntax, link to callouts documentation"
"What are Quarto extensions and how do I install them?","Should search knowledge store, provide quarto add command example, may include .qmd using extension, link to extensions documentation"
"What's the command to render a Quarto document?","Should search knowledge store, provide quarto render command example, may include basic .qmd file, link to CLI documentation"
"How do I fix this error: 'pandoc: command not found'?","Should search knowledge store for installation/troubleshooting docs, if not found inform user docs don't contain this specific error solution, may suggest checking installation, link to installation docs if available"
"How do I learn Python?","Should search knowledge store, recognize completely unrelated to Quarto, inform user this is outside scope of Quarto documentation, do not provide answer"
"Can Quarto solve world hunger?","Should search knowledge store (will fail), recognize nonsensical question, respond gracefully that this is outside Quarto scope"