2 changes: 1 addition & 1 deletion Meetings/AIsessions/index.md
@@ -5,7 +5,7 @@ title: AI sessions

During 2025, a set of meetings presented and interrogated the use of Artificial Intelligence (AI) in reading systems (RS).

-* July 17th, [Daniel Weck, AI use cases and technical considerations in Thorium Reader, an open source reading system]
+* July 17th, [Daniel Weck, AI use cases and technical considerations in Thorium Reader, an open source reading system](https://w3c.github.io/publishingcg/Meetings/Minutes/2025-07-17-publishingcg.html)
* June 19th, [Lars Wallin, Colibrio approach of AI](https://www.w3.org/2025/06/11-publishingcg-minutes.html)
* May 15th, [Senthil Nathan, CEO of Ailaysa, about Steps towards Responsible Digital Publishing: Content Exclusion and AI Training, and present the Chaï reader AI capacities.](https://www.w3.org/2025/05/15-publishingcg-minutes.html)
* March 20th [Nick Brown (VP Product Vitalsource): From Principles to Practice - Responsible AI for Enhanced Student Engagement in Reading Systems](https://www.w3.org/2025/03/20-publishingcg-minutes.html)
230 changes: 230 additions & 0 deletions Meetings/Minutes/2024-02-15-publishingcg.html

Large diffs are not rendered by default.

129 changes: 129 additions & 0 deletions Meetings/Minutes/2025-03-20-publishingcg.html
@@ -0,0 +1,129 @@

<!DOCTYPE html>
<html lang=en>
<head>
<meta charset=utf-8>
<title>W3C Publishing Community Group Plenary: Nick Brown (VP Product Vitalsource): From Principles to Practice - Responsible AI for Enhanced Student Engagement in Reading Systems</title>
<meta name=viewport content="width=device-width">
<link rel="stylesheet" type="text/css" title="2018" href="https://www.w3.org/StyleSheets/scribe2/public.css">
<link rel="alternate stylesheet" type="text/css" title="2004" href="https://www.w3.org/StyleSheets/base.css">
<link rel="alternate stylesheet" type="text/css" title="2004" href="https://www.w3.org/StyleSheets/public.css">
<link rel="alternate stylesheet" type="text/css" title="2004" href="https://www.w3.org/2004/02/minutes-style.css">
<link rel="alternate stylesheet" type="text/css" title="Fancy" href="https://www.w3.org/StyleSheets/scribe2/fancy.css">
<link rel="alternate stylesheet" type="text/css" title="Typewriter" href="https://www.w3.org/StyleSheets/scribe2/tt-member.css">
</head>

<body>
<header>
<p><a href="https://www.w3.org/"><img src="https://www.w3.org/StyleSheets/TR/2016/logos/W3C" alt=W3C border=0 height=48 width=72></a></p>

<h1>W3C Publishing Community Group Plenary: Nick Brown (VP Product Vitalsource): From Principles to Practice - Responsible AI for Enhanced Student Engagement in Reading Systems</h1>
<p>20 March 2025</p>

<nav id=links>
<a href="https://www.w3.org/2025/03/20-publishingcg-irc"><img alt="IRC log." title="IRC log" src="https://www.w3.org/StyleSheets/scribe2/text-plain.png"></a>
</nav>
</header>

<div id=prelims>
<div id=attendees>
<h2>Attendees</h2>
<dl class=intro>
<dt>Present</dt><dd>gautierchomel, liisamk, rickj, wolfgang</dd>
<dt>Regrets</dt><dd>-</dd>
<dt>Chair</dt><dd>wolfgang</dd>
<dt>Scribe</dt><dd>rickj</dd>
</dl>
</div>

<nav id=toc>
<h2>Contents</h2>
<ol>
</ol>
</nav>
</div>

<main id=meeting class=meeting>
<h2>Meeting minutes</h2>
<section><p id=1a9f class=irc><cite>&lt;wolfgang&gt;</cite> Agenda: Nick Brown (VP Product Vitalsource): From Principles to Practice - Responsible AI for Enhanced Student Engagement in Reading Systems</p>
<p id=00d2 class="phone s01"><cite>wolfgang:</cite> welcome to all</p>
<p id=cb30 class="phone s02"><cite>Nick:</cite> From Principles to Practice overview<br>
<span id=93b6>… VitalSource overview... talking about 'the engagement challenge'</span><br>
<span id=f00d>… who are we? 28M units delivered last year, 19M+ users served, users around the world, localized in 37 languages</span><br>
<span id=7580>… we act as a 'learning delivery network' sitting between learning providers and institutions/students/bookstores</span><br>
<span id=4563>… How do we do this? 1. Driving day 1 affordable access for millions. 2. Helping students stay engaged on Day 2 and beyond</span><br>
<span id=3b3d>… <a href="https://research.vitalsource.com/research">https://<wbr>research.vitalsource.com/<wbr>research</a> researching what works</span><br>
<span id=1631>… more than 30 published papers about learning science</span><br>
<span id=46dd>… engagement is step 1</span></p>
<p id=ee39 class=summary><a href="https://dl.acm.org/doi/10.1145/3576050.3576086">https://<wbr>dl.acm.org/<wbr>doi/<wbr>10.1145/<wbr>3576050.3576086</a> referenced<br>
<span id=a203>… shows students not reading</span></p>
<p id=a319 class=summary>14% average number of assigned textbook pages read by students<br>
<span id=9d32>… one solution: VitalSource CoachMe.</span></p>
<p id=0770 class=summary>feature inside Bookshelf reader, adds AI generated low-stakes formative practice questions inside the textbook reading experience</p>
<p id=7712 class=summary>generated from the textbook, not from a general-purpose LLM</p>
<p id=b7a4 class=summary>instructors can assign engagement with these questions as a part of their grade</p>
<p id=1437 class=summary>uses VitalSource proprietary AI for automatic question generation. Based on the Doer Effect from Carnegie Mellon OLI work</p>
<p id=6259 class=summary>'practice while reading causes 6x more learning gain than reading alone'</p>
<p id=4493 class=summary>bring active learning to a 'passive medium'</p>
<p id=4b75 class=summary>(deeper dive into how it works)</p>
<p id=9c8c class=summary>timeline of deployment and achievement</p>
<p id=38b8 class=summary>21 million AI generated questions answered in a learning context</p>
<p id=7e70 class=summary>published 35 peer reviewed papers</p>
<p id=79d8 class=summary>3 best paper awards</p>
<p id=ab45 class=summary>A/B test in an Iowa State University course using the same book, measuring how many times students opened their book; it showed a dramatic improvement in engagement when CoachMe questions were assigned as part of the grade</p>
<p id=abc4 class=summary>multi-school study showing increased student engagement<br>
<span id=df52>… 'but what about Gen AI?'</span></p>
<p id=0ce8 class=summary><a href="https://www.aacu.org/research/leading-through-disruption">https://<wbr>www.aacu.org/<wbr>research/<wbr>leading-through-disruption</a> discussed</p>
<p id=ac9a class=summary>explosion of GenAI usage</p>
<p id=2d79 class=summary>Key opportunities: things we can do for students and faculty that could not be done before</p>
<p id=1179 class=summary>Pros and cons of using AI</p>
<p id=3694 class=summary>The importance of responsible AI use: <a href="https://get.vitalsource.com/ai-principles">https://<wbr>get.vitalsource.com/<wbr>ai-principles</a></p>
<p id=283f class=summary>referenced 1EdTech rubric <a href="https://www.1edtech.org/standards/ai-rubric">https://<wbr>www.1edtech.org/<wbr>standards/<wbr>ai-rubric</a></p>
<p id=9878 class=summary>Why do these principles matter?</p>
<p id=d9c3 class=summary>avoid 'AI for AI's sake', focus on real learning gains not hype, maintain strong publisher and institutional partnerships</p>
<p id=8b1c class=summary>where we are heading next...</p>
<p id=ffff class=summary>High quality AI answers aligned with textbook content, no model training, no IP leakage, SOC2 compliant use with LLMs, DRM protected</p>
<p id=f39e class=summary>Q&amp;A</p>
<p id=5157 class="phone s03"><cite>gautierchomel:</cite> when content/questions are generated, we need an evaluation methodology and a way to advise the user this is AI generated. Have you been thinking about this? Is it possible?</p>
<p id=d93c class="phone s04"><cite>uptownnickbrown:</cite> that concern drove a lot of our development and decisions. 'generation' is a misnomer, as nothing is 'generated'. There is zero risk, as the sentence for the 'fill in the blank' is from the book.<br>
<span id=a1cc>… there are feedback mechanisms inside the book to give positive/negative feedback on the questions</span><br>
<span id=7c9b>… recommend: have clear disclaimers that this is AI generated content. It's the right ethical thing to do.</span><br>
<span id=a01a>… measuring quality is also critical, and hard.</span><br>
<span id=8d0a>… automated judges that use LLMs to judge LLM responses, trained to do different tasks in different ways. Judge a few different things at the same time (factually accurate, ...)</span><br>
<span id=6cb9>… reduce hallucinations by having the concrete source material</span><br>
<span id=7b79>… comfortable with a 'the book does not address that question' response</span><br>
<span id=bd6d>… also trying to evaluate the underlying pedagogy behind the model</span><br>
<span id=cc10>… 'can't answer that... best I can do is recommend you read page 62 of the book' type of answers</span></p>
<p id=e8a6 class="phone s05"><cite>liisamk:</cite> have you thought about using AI for 'other things' with your reading system?</p>
<p id=84e2 class="phone s04"><cite>uptownnickbrown:</cite> a few ways to tie this into the reading system. Things like making good flashcards (hard to do now).<br>
<span id=4b6d>… also looking at ways to evolve search beyond a 'find' function</span><br>
<span id=455d>… search with mixed languages...</span><br>
<span id=0ed6>… alt-text for screen readers</span><br>
<span id=dadf>… lots of places to pervade the reading system</span></p>
<p id=4fb2 class="phone s06"><cite>Michalis0:</cite> are questions created in real time, or pre-loaded?</p>
<p id=7225 class="phone s04"><cite>uptownnickbrown:</cite> yes, the CoachMe questions are pre-generated<br>
<span id=9654>… also aligned with where to ask the question in the book flow</span><br>
<span id=cf0e>… may evolve over time as AI improvements come</span><br>
<span id=8122>… over 1,000,000 questions in production at scale</span></p>
<p id=f518 class="phone s01"><cite>wolfgang:</cite> am I right that you mainly use NLP and not LLMs?</p>
<p id=6635 class="phone s04"><cite>uptownnickbrown:</cite> Yes. That's exactly how we built CoachMe</p>
<p id=464d class="phone s01"><cite>wolfgang:</cite> did you take into account the AI legislation from the EU?</p>
<p id=7b76 class="phone s04"><cite>uptownnickbrown:</cite> we are starting to think more about that now, as it was not in place at the time.</p>
</section>
</main>


<address>Minutes manually created (not a transcript), formatted by <a
href="https://w3c.github.io/scribe2/scribedoc.html"
>scribe.perl</a> version 244 (Thu Feb 27 01:23:09 2025 UTC).</address>

<div class=diagnostics>
<h2>Diagnostics</h2>
<p class=warning>Found 'Agenda:' not followed by a URL: 'Nick Brown (VP Product Vitalsource): From Principles to Practice - Responsible AI for Enhanced Student Engagement in Reading Systems'.</p>
<p class=warning>Maybe present: Michalis0, Nick, uptownnickbrown</p>
<p class=warning>All speakers: gautierchomel, liisamk, Michalis0, Nick, uptownnickbrown, wolfgang</p>
<p class=warning>Active on IRC: gautierchomel, liisamk, Michalis0, rickj, wolfgang</p>
</div>
</body>
</html>
98 changes: 98 additions & 0 deletions Meetings/Minutes/2025-05-15-publishingcg.html
@@ -0,0 +1,98 @@

<!DOCTYPE html>
<html lang=en>
<head>
<meta charset=utf-8>
<title>Publishing CG plenary: Senthil Nathan (Ailaysa, Chennai) discusses steps towards Responsible Digital Publishing: Content Exclusion and AI Training</title>
<meta name=viewport content="width=device-width">
<link rel="stylesheet" type="text/css" title="2018" href="https://www.w3.org/StyleSheets/scribe2/public.css">
<link rel="alternate stylesheet" type="text/css" title="2004" href="https://www.w3.org/StyleSheets/base.css">
<link rel="alternate stylesheet" type="text/css" title="2004" href="https://www.w3.org/StyleSheets/public.css">
<link rel="alternate stylesheet" type="text/css" title="2004" href="https://www.w3.org/2004/02/minutes-style.css">
<link rel="alternate stylesheet" type="text/css" title="Fancy" href="https://www.w3.org/StyleSheets/scribe2/fancy.css">
<link rel="alternate stylesheet" type="text/css" title="Typewriter" href="https://www.w3.org/StyleSheets/scribe2/tt-member.css">
</head>

<body>
<header>
<p><a href="https://www.w3.org/"><img src="https://www.w3.org/StyleSheets/TR/2016/logos/W3C" alt=W3C border=0 height=48 width=72></a></p>

<h1>Publishing CG plenary: Senthil Nathan (Ailaysa, Chennai) discusses steps towards Responsible Digital Publishing: Content Exclusion and AI Training</h1>
<p>15 May 2025</p>

<nav id=links>
<a href="https://www.w3.org/2025/05/15-publishingcg-irc"><img alt="IRC log." title="IRC log" src="https://www.w3.org/StyleSheets/scribe2/text-plain.png"></a>
</nav>
</header>

<div id=prelims>
<div id=attendees>
<h2>Attendees</h2>
<dl class=intro>
<dt>Present</dt><dd>miia</dd>
<dt>Regrets</dt><dd>-</dd>
<dt>Chair</dt><dd>-</dd>
<dt>Scribe</dt><dd>wolfgang</dd>
</dl>
</div>

<nav id=toc>
<h2>Contents</h2>
<ol>
</ol>
</nav>
</div>

<main id=meeting class=meeting>
<h2>Meeting minutes</h2>
<section><p id=2f67 class="phone s01"><cite>gautier:</cite> Talk of Senthil (Ailaysa, Chennai) - we are taking notes</p>
<p id=ffff class="phone s02"><cite>senthil:</cite> speak about the concept, then provide a demo, then Q &amp; A<br>
<span id=83ec>… Senthil Nathan from Ailaysa - AI company - content translation based on AI - taking content in different languages - international book fair in Chennai - introduced products into publishing - before mainly translation/localization - automatic translations using AI</span><br>
<span id=c64c>… concepts: how to develop a responsible content in an AI context - we cannot have walled days - great data rush for training AI systems without knowledge and permission of owners - awareness that quality content is very important for AI - quality data should come from publishers, media companies, research institutes - shifting to being active negotiators</span></p>
<p id=6833 class=summary>
<span id=bc8b>… Exclusion of content as training data - where content is used, responsible usage + permission needed - in 2024 people are actively discussing this - should be a fair deal with proper compensation - illegal scraping was a big problem - is coming to an end - much reduced now</span><br>
<span id=b2fb>… terms of permission are set by both parties - technical barriers can now be easily implemented - clear legal terms prohibiting use without limits - content watermarking and provenance tracking tools</span><br>
<span id=33ef>… to include: fair licensing terms - mandatory source citations in AI output - quality control: selective participation with responsible AI companies - usage tracking: monitoring how content influences AI responses - consent frameworks: granular control over AI uses</span><br>
<span id=e0a1>… factors: technical, business, regulatory and market dynamics</span><br>
<span id=c854>… AI-specific exclusion protocols (better than robots.txt) - rise of new AI-crawlers (require new blocking mechanisms) - dynamic paywalls and anti-scraping tech - emergence of content-tracking tools</span><br>
<span id=ac82>… blockers (NYT, Guardian) vs. partners (Axel Springer with OpenAI) vs. open access (But seeking attribution) vs. wait-and-see</span><br>
<span id=2910>… EU: AI Act - US: considering legal framework - courses of copyright offices</span><br>
<span id=b117>… market: growing need for high-quality content - AI is not thinking, algorithmic, not creative - publishers see new revenue streams via partnerships - data brokers like literary agency - syndication rights</span><br>
<span id=abad>… principle of fair monetization - important to track extent of usage and kinds of usage</span><br>
<span id=74c6>… from authoring to reading: AI environment is set - book discovery enhanced through LLM recommendation and search systems - going beyond metadata and keywords: asking questions on the contents of the book (e.g. ChaiReader)</span><br>
<span id=7028>… future options: read book in another language such as Tamil thanks to automatic translation or as audiobook - in libraries, bookstores, schools the use of books may change</span><br>
<span id=7b6a>… HarperCollins works with MS, also Sage, CUP,</span><br>
<span id=d81c>… have to find common ground between publishers and AI companies</span></p>
<p id=10000 class=summary>Demo Chai Reader: Reading, Chatting and Buying in one portal - multilingual Q&amp;A - buy routine integrated - in future: book recommendations based on search terms - translation of a book into a target language</p>
<p id=ba98 class="phone s01"><cite>gautier:</cite> when I'm chatting with a book, answers only from book content - LLM only used to prepare a nice answer - not training each book in LLM -</p>
<p id=ae9a class="phone s02"><cite>Senthil:</cite> completely separated</p>
<p id=10001 class="phone s03"><cite>michalis:</cite> concerned that access to content should be fair use - esp. in the US - next months will be critical in legal aspects</p>
<p id=d3e4 class="phone s02"><cite>senthil:</cite> big publishers have great interest - different for small publishers or even authors -</p>
<p id=3482 class="phone s03"><cite>michalis:</cite> in education or academic this would be quite useful</p>
<p id=b23a class="phone s02"><cite>senthil:</cite> exactly - useful to explore several books in parallel to formulate an answer - we work with EDRLabs to improve on it - ChaiReader still in Beta - working with publishers - can chat with a collection of books, not only one at a time - impact of &quot;AI on economics&quot; - reasoning capacity - more important than just referring back - great thing for book discovery</p>
<p id=3411 class="phone s04"><cite>ivan:</cite> aren't you forced to make some sort of ranking between books consumed - need a local ranking for books you have</p>
<p id=cec1 class="phone s02"><cite>senthil:</cite> possible to rank or categorize dependent on prompting</p>
<p id=9358 class="phone s05"><cite>vishal:</cite> the more correct the prompt, the more precise the answer will be - if 3 books have an answer - semantic ranking combined with keyword level ranking - still experimental feature - as Google and Amazon do</p>
<p id=7ff5 class="phone s04"><cite>ivan:</cite> in some cases this is not the best answer - in scholarly usage - ranking by systems outside your bookshop - based on reputation of answers - you use LLM only for niceties of input and output</p>
<p id=e49a class="phone s05"><cite>vishal:</cite> reinforcement learning - librarian knows the authors - deepseek uses this feature - integrate human expertise into machine</p>
<p id=8327 class="phone s02"><cite>senthil:</cite> good question</p>
</section>
</main>


<address>Minutes manually created (not a transcript), formatted by <a
href="https://w3c.github.io/scribe2/scribedoc.html"
>scribe.perl</a> version 244 (Thu Feb 27 01:23:09 2025 UTC).</address>

<div class=diagnostics>
<h2>Diagnostics</h2>
<p class=warning>Maybe present: gautier, ivan, michalis, senthil, vishal</p>
<p class=warning>All speakers: gautier, ivan, michalis, senthil, vishal</p>
<p class=warning>Active on IRC: gautierchomel, Michalis, wolfgang</p>
</div>
</body>
</html>