diff --git a/Meetings/AIsessions/index.md b/Meetings/AIsessions/index.md index 53d693b..e4a3441 100644 --- a/Meetings/AIsessions/index.md +++ b/Meetings/AIsessions/index.md @@ -5,7 +5,7 @@ title: AI sessions During 2025, a set of meeting presented and interrogated the use of Artificial Intelligence (AI) in reading systems (RS). -* July 17th, [Daniel Weck, AI use cases and technical considerations in Thorium Reader, an open source reading system] +* July 17th, [Daniel Weck, AI use cases and technical considerations in Thorium Reader, an open source reading system](https://w3c.github.io/publishingcg/Meetings/Minutes/2025-07-17-publishingcg.html) * June 19th, [Lars Wallin, Colibrio approach of AI](https://www.w3.org/2025/06/11-publishingcg-minutes.html) * May 15th, [Senthil Nathan, CEO of Ailaysa, about Steps towards Responsible Digital Publishing: Content Exclusion and AI Training, and present the Chaï reader AI capacities.](https://www.w3.org/2025/05/15-publishingcg-minutes.html) * March 20th [Nick Brown (VP Product Vitalsource): From Principles to Practice - Responsible AI for Enhanced Student Engagement in Reading Systems](https://www.w3.org/2025/03/20-publishingcg-minutes.html) diff --git a/Meetings/Minutes/2024-02-15-publishingcg.html b/Meetings/Minutes/2024-02-15-publishingcg.html new file mode 100644 index 0000000..01b3682 --- /dev/null +++ b/Meetings/Minutes/2024-02-15-publishingcg.html @@ -0,0 +1,230 @@ + + + + + +Publishing Community Group: Plenary Session – 15 February 2024 + + + + + + + + + + +
+

W3C

+ +

+Publishing Community Group: Plenary Session

+

15 February 2024

+ + +
+ +
+
+

Attendees

+
+
Present
ashley, AvneeshSingh, CharlesL, duga, George, gpellegrino, graham, Hadrien, ivan, jeffrey_griggs, jonas_lillqvist, ken_jones, LauraB, laurent_le_meur, liisamk, miia_kirsi, mike_baker, paul_belfanti, paul_gilius, rickj, sagiv, tzviya, vincent_nicotina, wendyreid, wolfgang
+
Regrets
-
+
Chair
wolfgang
+
Scribe
duga, wendyreid
+
+
+ + +
+ +
+

Meeting minutes

+

<wolfgang> date: 2024-02-15

+

<tzviya> scribe?

+

wolfgang: Welcome to the plenary, starting with an update from the accessibility task force

+

<AvneeshSingh> https://w3c.github.io/publ-a11y/UX-Guide-Metadata/draft/principles/?updated

+
+ +
+

Accessibility task force

+

AvneeshSingh: Focus is on the guide for retailers and distributors for understanding the metadata
+… this is an update to an existing doc
+… we have found that one approach doesn't work for everyone
+… so now we have different types of metadata targeted to different groups
+… VitalSource and ??? have committed to implement
+… We are no longer using such fixed recs, instead being a little broader

+

<AvneeshSingh> https://w3c.github.io/publ-a11y/UX-Guide-Metadata/draft/principles/?updated

+

AvneeshSingh: There are some technique documents explaining how to extract metadata
+… There are more technique docs coming for other types of metadata

+

<AvneeshSingh> https://www.w3.org/publishing/a11y/schema-a11y-summary/

+

<AvneeshSingh> https://www.w3.org/publishing/a11y/audio-playback/

+

AvneeshSingh: See the two links to the docs

+

George: Trying to get feedback for the schema document.
+… There are some English strings in there, we are trying to make sure they are good in English, but we plan to add a localization method for them
+… VitalSource will localize some of these

+

AvneeshSingh: We realized there are some things on the edge of accessibility, but have broader implications
+… We want to bring these back to the group so we only work on things people are interested in

+

George: #70 - get citation
+… It's hard to get a page number for citations

+

<wendyreid> w3c/publishingcg#70

+

George: we want to see if the CG is interested in the feature

+

<wendyreid> w3c/publishingcg#71

+

George: Next bookmarks, annotations and export of such
+… There is partial support in some reading systems
+… hard to compare. Purely RS, but we would test and evaluate
+… #72 - read aloud

+

<wendyreid> w3c/publishingcg#72

+

George: There are a lot of distracting things (footnotes, citation refs, etc)
+… DAISY has skippability that can be toggled
+… Finally virtual pages, discussed but never resolved
+… VitalSource has implemented something, as has Lars

+

<ivan> w3c/publishingcg#73

+

George: Companies like Ebsco (???) could insert real page numbers, but they want it to be a generally used algorithm
+… So other versions of the book would have the same page breaks
+… Would like to hear what people think
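For illustration only, one shape a "generally used algorithm" for virtual pages could take is to derive synthetic page breaks purely from character counts, so that any reading system applying the same rule to the same content computes identical breaks. The TypeScript sketch below is hypothetical; the 1024-character page size and the function names are assumptions, not anything the group has agreed.

```typescript
// Deterministic "virtual page" sketch: synthetic pages come only from character
// counts, so different reading systems using the same chunk size agree on breaks.
// The 1024-character page size is an illustrative assumption, not a spec value.
const CHARS_PER_VIRTUAL_PAGE = 1024;

interface VirtualPageMap {
  totalPages: number;
  pageAt(globalCharOffset: number): number; // 1-based page for an offset into the whole book
}

function buildVirtualPageMap(spineTextLengths: number[]): VirtualPageMap {
  const totalChars = spineTextLengths.reduce((sum, n) => sum + n, 0);
  const totalPages = Math.max(1, Math.ceil(totalChars / CHARS_PER_VIRTUAL_PAGE));
  return {
    totalPages,
    pageAt(globalCharOffset: number): number {
      const clamped = Math.min(Math.max(globalCharOffset, 0), Math.max(totalChars - 1, 0));
      return Math.floor(clamped / CHARS_PER_VIRTUAL_PAGE) + 1;
    },
  };
}

// Example: three spine items with 3000, 12000 and 800 characters of text content.
const map = buildVirtualPageMap([3000, 12000, 800]);
console.log(map.totalPages);   // 16
console.log(map.pageAt(3100)); // 4
```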

+

ivan: For 71, EDRlab may have a project going here - can we work with them?
+… Are 70 and 73 related?

+

<wendyreid> https://docs.google.com/document/d/11GypOjE9xOTaINATl5bxVIA3Mc9jzNBGCr6GT_KNaQ4/edit?pli=1

+

ivan: The page numbers for 70 seem to be very related to 73

+

George: Unless there is an alternative to page numbers

+

ivan: Also related to annotations (want to refer to the text)
+… need to reference the page somehow
+… Need to anchor somehow

+

George: These aren't shared annotations

+

Hadrien: Not from EDRlab technically, but involved with them
+… the EPUB itself contains the annotations; when you open the EPUB you get the annotations
+… the idea is that it is self-contained
+… tricky part is not just anchoring, but also context
+… often need to embed a lot of information
+… Also know percentage into the book, DOM ranges, etc
+… need both anchors and context
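To make the "anchors and context" point concrete, the sketch below shows the kind of record a self-contained, exportable annotation might carry. The selector shapes follow the W3C Web Annotation model; the progression field, file path and other specifics are illustrative assumptions rather than an agreed format.

```typescript
// Hypothetical portable annotation: an anchor (where in the markup) plus context
// (surrounding text) so it can be re-attached even if the markup shifts slightly.
interface TextQuoteSelector {
  type: "TextQuoteSelector";
  exact: string;    // the highlighted text itself
  prefix?: string;  // a little text before it, for disambiguation
  suffix?: string;  // a little text after it
}

interface TextPositionSelector {
  type: "TextPositionSelector";
  start: number;    // character offsets into the resource's text content
  end: number;
}

interface PortableAnnotation {
  source: string;                                          // spine item the annotation lives in
  selectors: (TextQuoteSelector | TextPositionSelector)[];
  progression?: number;                                    // 0..1 position in the book (assumption)
  body?: string;                                           // the reader's note, if any
  created: string;                                         // ISO timestamp
}

const example: PortableAnnotation = {
  source: "OEBPS/chapter-03.xhtml",
  selectors: [
    { type: "TextQuoteSelector", exact: "the whale", prefix: "harpoon at ", suffix: ", and" },
    { type: "TextPositionSelector", start: 1042, end: 1051 },
  ],
  progression: 0.37,
  body: "Compare with chapter 1.",
  created: "2024-02-15T16:00:00Z",
};

console.log(JSON.stringify(example, null, 2));
```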

+

duga: Just wanted to say 70/73 are likely the same
+… the TF determined that the only way people understand progress is through page numbers; no one likes percentages, which makes 70 and 73 the same
+… also the impression of the indexers

+
+ +
+

Anti counterfeit Task Force

+

liisamk: Anti counterfeit TF
+… not much discussion since the start of the year
+… We are socializing the ISCC in the hopes of bringing in more people
+… We are kicking back up on Friday
+… The next piece is digging into how ISCC and ??? work together
+… how do we start socializing the next piece of the trust chain

+
+ +
+

Fixed Layout Accessibility TF

+

wendyreid: FL a11y TF

+

https://w3c.github.io/epub-specs/epub33/fxl-a11y/

+

wendyreid: Working on guidance doc for FL a11y - it is now in complete draft state
+… please read and give feedback
+… Mostly for publishers/authors, but there are also recs for RS at the end

+


+

wendyreid: discuss media overlay, tables, etc
+… will produce some samples

+

w3c/publishingcg#69

+

<vince> can someone repost the fixed layout guidelines here? Sorry, I just joined the IRC now, and can't see any links above

+

George: I heard that Hadrien recommended that if you could get correct reading order, we could claim accessibility for FL documents
+… that is the biggest issue, getting things in the correct order

+

wendyreid: Have discussed a reflowable mode
+… It's still experimental, not in the main document
+… would like to explore possibility of this and how to specify this for reading systems

+

George: Would there be a validator?

+

wendyreid: Still way too early

+

Hadrien: Context - this is looking at the EU directive and what people can actually do
+… For instance you need to be able to do ??? and you can't, but we might be able to make a system that could
+… We need to document how TTS actually works, very few people know
+… Out of that work we can make a best practice document
+… documenting would be the first step

+

CircularKen: The document we have been working on is the start of what we will eventually be able to do
+… already in the doc is reading order and image descriptions
+… Then we can add an additional way to read it once we have this groundwork laid

+

w3c/publishingcg#69

+
+ +
+

Generate reflowable fixed layout books

+

wolfgang: One aspect is a11y and the other is adapting to different viewports

+


+

CircularKen: Fundamentally need to start with well made and designed docs
+… we should discard placement, etc and just have a simple replacement of CSS that disregards positioning and styling
+… Then we have the order with descriptions, etc with page markers. Should plan on designed version and a stripped back reflow version

+

wendyreid: Ken described it as we have it
+… A lot of what we say now is text should be live (actual text), visual order and programmatic should match, we should have image descriptions
+… Also some recs on what to do when you cross the fold (spread over 2 FL pages)
+… That is the current emphasis. Follow best practices and the advanced stuff will follow

+

AvneeshSingh: Have we looked at the problem from the other side?
+… That is can we start with something that has all the proper structure, then create a FL doc entirely from CSS

+

wendyreid: You mean convert reflow to FL?

+

<Zakim> tzviya, you wanted to respond

+

AvneeshSingh: Yes, basically start with a proper flowing doc, then just apply CSS to make it look right

+

tzviya: This isn't really considering how publishers work
+… doing something like that is probably not feasible

+

liisamk: There is a moment here to socialize good use of FL
+… there are a lot of people who use FL when they just don't want to make a flowing doc
+… because it is easier
+… This may be a good opportunity to push people to flowing text since they are thinking about what it really means to make something FL
+… very little needs to be FL

+

wendyreid: We need to get people to question whether content needs to be FL
+… Sometimes positioning helps with a11y (having images adjacent to text may help some readers)
+… so sometimes FL can help a11y

+

liisamk: This gets me back to mixed formats
+… A single FL page in a reflowable book would be really nice

+

<AvneeshSingh> +1 Liisa

+

Also +1 to Liisa

+
+ +
+

Extracting textual content

+

Hadrien: Goes way beyond extracting text
+… eg language used
+… Need a separate structure for TTS and creating reader mode
+… though could use the same structure for both

+

George: We are getting feedback from students with dyslexia that read aloud is inadequate for their needs
+… I agree we need to improve the description of how it is done (highlighting, speed, etc)

+


+

George: Need a lot of control in the TTS

+

wolfgang: Next, how do we render in reflowable mode

+

AvneeshSingh: This kind of thing is done by the screen readers
+… Is there a need to tie this to FL?
+… isn't it more of a generic thing, how to extract and read the text?
+… Seems like a big topic, and screen readers have researched it for years

+

Hadrien: Agree, this is beyond FL
+… some specific FL things do exist (e.g. small content chunks)
+… but in general it should be for all epub

+

wendyreid: ARIA wg is also interested in the same topic
+… may even become an all-web topic

+

George: Do they join us or do we join them?

+

wendyreid: Good question. May even need its own CG. Hard to tell at this point

+
+ +
+

Webtoons

+

wendyreid: PMWG is discussing a potential change for this
+… current proposal is to expand FLOW-CONTINUOUS to FL.

+

laurent_: We did not discuss pronunciation for TTS purposes

+
+
+ + +
Minutes manually created (not a transcript), formatted by scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).
+ + + diff --git a/Meetings/Minutes/2025-03-20-publishingcg.html b/Meetings/Minutes/2025-03-20-publishingcg.html new file mode 100644 index 0000000..3cf5915 --- /dev/null +++ b/Meetings/Minutes/2025-03-20-publishingcg.html @@ -0,0 +1,129 @@ + + + + + +W3C Publishing Community Group Plenary: Nick Brown (VP Product Vitalsource): From Principles to Practice - Responsible AI for Enhanced Student Engagement in Reading Systems + + + + + + + + + + +
+

W3C

+ +

W3C Publishing Community Group Plenary: Nick Brown (VP Product Vitalsource): From Principles to Practice - Responsible AI for Enhanced Student Engagement in Reading Systems

+

20 March 2025

+ + +
+ +
+
+

Attendees

+
+
Present
gautierchomel, liisamk, rickj, wolfgang
+
Regrets
-
+
Chair
wolfgang
+
Scribe
rickj
+
+
+ + +
+ +
+

Meeting minutes

+

<wolfgang> Agenda: Nick Brown (VP Product Vitalsource): From Principles to Practice - Responsible AI for Enhanced Student Engagement in Reading Systems

+

wolfgang: welcome to all

+

Nick: From Principles to Practice overview
+… VitalSource overview... talking about 'the engagement challenge'
+… who are we? 28 m units delivered last year, 19+m users served, users around the world, localized in 37 languages
+… we act as a 'learning delivery network' sitting between learning providers and institutions/students/bookstores
+… How do we do this? 1. Driving day 1 affordable access for millions. 2. Helping students stay engaged on Day 2 and beyond
+https://research.vitalsource.com/research researching what works
+… more than 30 published papers about learning science
+… engagement is step 1

+

https://dl.acm.org/doi/10.1145/3576050.3576086 referenced
+… shows students not reading

+

14% - the average share of assigned textbook pages read by students
+… one solution: VitalSource CoachMe.

+

feature inside Bookshelf reader, adds AI generated low-stakes formative practice questions inside the textbook reading experience

+

generated from the textbook, not a general LLM model

+

instructors can assign engagement with these questions as a part of their grade

+

uses VitalSource proprietary AI for automatic question generation. Based on the Doer Effect from Carnegie Mellon OLI work

+

'practice while reading causes 6x more learning gain than reading alone'

+

bring active learning to a 'passive medium'

+

(deeper dive into how it works)

+

timeline of deployment and achievement

+

21 million AI generated questions answered in a learning context

+

published 35 peer reviewed papers

+

3 best paper awards

+

a study of an Iowa State University course, using an A/B test with the same book and looking at how many times students opened their book, showing a dramatic improvement in engagement when CoachMe questions are assigned as part of the grade

+

multiple school study showing increase in engagement by students
+… 'but what about Gen AI?'

+

https://www.aacu.org/research/leading-through-disruption discussed

+

explosion of GenAI usage

+

Key opportunities: things we can do for students and faculty that could not be done before

+

Pros and cons of using AI

+

The importance of responsible AI use: https://get.vitalsource.com/ai-principles

+

referenced 1EdTech rubric https://www.1edtech.org/standards/ai-rubric

+

Why do these principles matter?

+

avoid 'AI for AI's sake', focus on real learning gains not hype, maintain strong publisher and institutional partnerships

+

where we are heading next...

+

High quality AI answers aligned with textbook content, no model training, no IP leakage, SOC2 compliant use with LLMs, DRM protected

+

Q&A

+

gautierchomel: when content/questions are generated, we need an evaluation methodology and a way to advise the user this is AI generated. Have you been thinking about this? Is it possible?

+

uptownnickbrown: that concern drove a lot of our development and decisions. 'generation' is a misnomer, as nothing is 'generated'. There is zero risk, as the sentence for the 'fill in the blank' is from the book.
+… there are feedback mechanisms inside the book to give positive/negative feedback on the questions
+… recommend: have clear disclaimers that this is AI generated content. It's the right ethical thing to do.
+… measuring quality is also critical, and hard.
+… automated judges that use LLMs to judge LLM responses, trained to do different tasks in different ways. Judge a few different things at the same time (factually accurate, ...)
+… reduce hallucinations by having the concrete source material
+… comfortable with a 'the book does not address that question' response
+… also trying to evaluate the underlying pedagogy behind the model
+… 'can't answer that... best I can do is recommend you read page 62 of the book' type of answers
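As a rough sketch of the "automated judges" idea mentioned above (a second LLM call grading a generated question against the source passage on several criteria at once), something like the following could be used. It is not VitalSource's pipeline; it assumes the OpenAI Node SDK, an API key in the environment, and illustrative criteria and model names.

```typescript
// Hypothetical "LLM as judge" check for an auto-generated practice question.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

interface JudgeVerdict {
  factually_grounded: boolean;   // answerable from the passage alone
  unambiguous: boolean;          // exactly one defensible answer
  pedagogically_useful: boolean;
  comments: string;
}

async function judgeQuestion(passage: string, question: string): Promise<JudgeVerdict> {
  const response = await client.chat.completions.create({
    model: "gpt-4o-mini",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          "You grade auto-generated practice questions. Reply with JSON only, shaped as " +
          '{"factually_grounded": bool, "unambiguous": bool, "pedagogically_useful": bool, "comments": string}.',
      },
      { role: "user", content: `Passage:\n${passage}\n\nQuestion:\n${question}` },
    ],
  });
  return JSON.parse(response.choices[0].message.content ?? "{}") as JudgeVerdict;
}

// judgeQuestion(chapterText, "Photosynthesis converts ____ into glucose.").then(console.log);
```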

+

liisamk: have you thought about using AI for 'other things' with your reading system?

+

uptownnickbrown: a few ways to tie this into the reading system. Things like making good flashcards (hard to do now).
+… also looking at ways to evolve search beyond a 'find' function
+… search with mixed languages...
+… alt-text for screen readers
+… lots of places to pervade the reading system

+

Michalis0: are questions created in real time, or pre-loaded?

+

uptownnickbrown: yes, the CoachMe questions are pre-generated
+… also aligned with where to ask the question in the book flow
+… may evolve over time as AI improvements come
+… over 1,000,000 questions in production at scale

+

wolfgang: am I right that you mainly use NLP and not LLMs?

+

uptownnickbrown: Yes. That's exactly how we built CoachMe

+

wolfgang: did you take into account the AI legislation from the EU?

+

uptownnickbrown: we are starting to think more about that now, as it was not in place at the time.

+


+
+
+ + +
Minutes manually created (not a transcript), formatted by scribe.perl version 244 (Thu Feb 27 01:23:09 2025 UTC).
+ +
+

Diagnostics

+

Found 'Agenda:' not followed by a URL: 'Nick Brown (VP Product Vitalsource): From Principles to Practice - Responsible AI for Enhanced Student Engagement in Reading Systems'.

+

Maybe present: Michalis0, Nick, uptownnickbrown

+

All speakers: gautierchomel, liisamk, Michalis0, Nick, uptownnickbrown, wolfgang

+

Active on IRC: gautierchomel, liisamk, Michalis0, rickj, wolfgang

+
+ + diff --git a/Meetings/Minutes/2025-05-15-publishingcg.html b/Meetings/Minutes/2025-05-15-publishingcg.html new file mode 100644 index 0000000..3c141e5 --- /dev/null +++ b/Meetings/Minutes/2025-05-15-publishingcg.html @@ -0,0 +1,98 @@ + + + + + +Publishing CG plenary: Senthil Nathan (Ailaysa, Chennai) discuss steps towards Responsible Digital Publishing: Content Exclusion and AI Training + + + + + + + + + + +
+

W3C

+ +

Publishing CG plenary: Senthil Nathan (Ailaysa, Chennai) discusses steps towards Responsible Digital Publishing: Content Exclusion and AI Training

+

15 May 2025

+ + +
+ +
+
+

Attendees

+
+
Present
miia
+
Regrets
-
+
Chair
-
+
Scribe
wolfgang
+
+
+ + +
+ +
+

Meeting minutes

+

gautier: Talk of Senthil (Ailaysa, Chennai) - we are taking notes

+


+

senthil: speak about the concept, then provide a demo, then Q & A
+… Senthil Nathan from Ailaysa - AI company - content translation based on AI - taking content in different languages - international book fair in Chennai - introduced products into publishing - before mainly translation/localization - automatic translations using AI
+… concepts: how to develop responsible content in an AI context - we cannot have walled gardens - great data rush for training AI systems without knowledge and permission of owners - awareness that quality content is very important for AI - quality data should come from publishers, media companies, research institutes - shifting to being active negotiators

+

+… Content exclusion of content as training data - in case of use, responsible usage and permission needed - in 2024 people are actively discussing - should be a fair deal with proper compensation - illegal scraping was a big problem - is coming to an end - much reduced now
+… terms of permission are set by both parties - technical barriers can now be easily implemented - clear legal terms prohibiting use without limits - content watermarking and provenance tracking tools
+… to include: fair licensing terms - mandatory source citations in AI output - quality control: selective participation with responsible AI companies - usage tracking: monitoring how content influences AI responses - consent frameworks: granular control over AI uses
+… factors: technical, business, regulatory and market dynamics
+… AI-specific exclusion protocols (better than robots.txt) - rise of new AI-crawlers (require new blocking mechanisms) - dynamic paywalls and anti-scraping tech - emergence of content-tracking tools
+… blockers (NYT, Guardian) vs. partners (Axel Springer with OpenAI) vs. open access (But seeking attribution) vs. wait-and-see
+… EU: AI Act - US: considering legal framework - courses of copyright offices
+… market: growing need for high-quality content - AI is not thinking, algorithmic, not creative - publishers see new revenue streams via partnerships - data brokers like literary agency - syndication rights
+… principle of fair monetization - important to track extent of usage and kinds of usage
+… from authoring to reading: AI environment is set - book discovery enhanced through LLM recommendation and search systems - going beyond metadata and keywords: asking questions on the contents of the book (e.g. ChaiReader)
+… future options: read the book in another language such as Tamil thanks to automatic translation, or as an audiobook - in libraries, bookstores, schools the use of books may be changed -
+… HarperCollins works with MS, also Sage, CUP,
+… have to find common ground between publishers and AI companies
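As an illustration of the "new blocking mechanisms" point above, a server can refuse requests from known AI training crawlers by user agent. The sketch below is hypothetical (it assumes Express, and the crawler list is deliberately short and non-exhaustive); a real deployment would combine this with robots.txt and TDM reservation signals rather than rely on it alone.

```typescript
// Minimal user-agent gate against known AI training crawlers (illustrative only).
import express from "express";

const AI_CRAWLER_TOKENS = ["GPTBot", "CCBot", "ClaudeBot"]; // example tokens, not exhaustive

const app = express();

app.use((req, res, next) => {
  const ua = req.get("user-agent") ?? "";
  if (AI_CRAWLER_TOKENS.some((token) => ua.includes(token))) {
    // 403 rather than 404: the content exists, but scraping for training is not permitted.
    res.status(403).send("AI training crawlers are not permitted on this site.");
    return;
  }
  next();
});

app.get("/", (_req, res) => res.send("Catalogue page"));

app.listen(3000);
```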

+

Demo Chai Reader: Reading, Chatting and Buying in one portal - multilingual Q&A - buy routine integrated - in future: book recommendations based on search terms - translation of a book into a target language

+

gautier: when I'm chatting with a book, answers only from book content - LLM only used to prepare a nice answer - not training each book in LLM -

+

Senthil: completely separated

+

michalis: concerned that access to content should be fair use - esp. in the US -next months will be critical in legal aspects

+

senthil: big publishers have great interest - different for small publishers or even authors -

+

michalis: in education or academic this would be quite useful

+

senthil: exactly - useful to explore several books in parallel to formulate an answer - we work with EDRLab to improve on it - ChaiReader still in Beta - working with publishers - can chat with a collection of books, not only one at the same time - impact of "AI on economics" - reasoning capacity - more important than just referring back - great thing for book discovery

+

ivan: aren't you forced to make some sort of ranking between books consumed - need a local ranking for books you have

+

senthil: possible to rank or categorize dependent on prompting

+

vishal: the more correct the prompt, the more precise the answer will be - if 3 books have an answer - semantic ranking combined with keyword level ranking - still experimental feature - as Google and Amazon do

+

ivan: in some cases this is not the best answer - in scholarly usage - ranking by systems outside your bookshop - based on reputation of answers - you use LLM only for niceties of input and output

+

vishal: reinforcement learning - librarian knows the authors - deepseek uses this feature - integrate human expertise into machine

+

senthil: good question

+


+
+
+ + +
Minutes manually created (not a transcript), formatted by scribe.perl version 244 (Thu Feb 27 01:23:09 2025 UTC).
+ +
+

Diagnostics

+

Maybe present: gautier, ivan, michalis, senthil, vishal

+

All speakers: gautier, ivan, michalis, senthil, vishal

+

Active on IRC: gautierchomel, Michalis, wolfgang

+
+ + diff --git a/Meetings/Minutes/2025-06-11-publishingcg.html b/Meetings/Minutes/2025-06-11-publishingcg.html new file mode 100644 index 0000000..efe710a --- /dev/null +++ b/Meetings/Minutes/2025-06-11-publishingcg.html @@ -0,0 +1,100 @@ + + + + + +W3C Publishing Community Group Plenary: "Advanced Features in Colibrio Reader" – 11 June 2025 + + + + + + + + + + +
+

W3C

+ +

– DRAFT –
+W3C Publishing Community Group Plenary: "Advanced Features in Colibrio Reader"

+

11 June 2025

+ + +
+ +
+
+

Attendees

+
+
Present
gautierchomel, jimsaya, jonas, Lars, wolfgang
+
Regrets
-
+
Chair
wolfgang
+
Scribe
gautierchomel
+
+
+ + +
+ +
+

Meeting minutes

+

Lars: I have no formal presentation. I have been experimenting with AI in Colibrio for a long time now. I am particularly interested in having conversations with a book, because I am a fan of fiction.

+

Lars: I have been using OpenAI; experimenting with their API is easy, it runs in the browser and it is client side. Last year, they released an assistant API that takes care of the boring stuff.

+

Lars: to use LLMs we need tools; those are the APIs. Without tools, the LLM is a huge but unintelligent base of knowledge. To be more precise we feed it very contextual information. You can give the context in a prompt: prompt engineering is about being the tool yourself. The more precise you are, the more accurate the answer you get. To go further in using LLMs we use context inputs from other databases. That's the role of the API.

+

Lars: adding context is costly and time consuming. It needs to be structured and expressed in a way understandable by the LLM. This complexity needs to be managed.

+

Lars: showing screen. This is the vanilla reader, available online. You need an OpenAI key to make it work. It can be costly when you work with images, less when it is about text. At the opening of the book, I strip away unnecessary markup and keep only the HTML semantics. I clean down to the bare minimum of code. That is pushed to the LLM as embeddings. Think about it as a computer edition of the book, an edition made for computers, a numerical representation of the book. It feeds a vector database. You can then ask questions, queries, of the database.

+

Lars: A note, this embedded version, in my opinion, should be built and sold by the publisher.

+

Lars: Anyway that's an important part because this is the step allowing to get contextualised answers.

+

Lars: next, I open the dialog, a chat box built in the app, and start to ask.

+

Lars: the question goes to the vector database, which performs semantic search and provides chunks of 500 characters to the LLM, which formulates the answers displayed to me.

+

Lars: To make sure the responses are from the book, the app performs a search and provides link references for each part of the answer. So you can activate the link and go to the part of the book stating that.
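The pipeline described here (clean the markup, embed 500-character chunks into a vector store, retrieve the best matches for a question, and have the LLM answer only from those chunks while keeping track of which ones were used) can be sketched roughly as below. This is not Colibrio's code; it assumes the OpenAI Node SDK, and an in-memory array with cosine similarity stands in for the vector database.

```typescript
import OpenAI from "openai";

const client = new OpenAI();
const CHUNK_SIZE = 500; // characters, as described in the session

function chunk(text: string): string[] {
  const out: string[] = [];
  for (let i = 0; i < text.length; i += CHUNK_SIZE) out.push(text.slice(i, i + CHUNK_SIZE));
  return out;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

async function embed(texts: string[]): Promise<number[][]> {
  const res = await client.embeddings.create({ model: "text-embedding-3-small", input: texts });
  return res.data.map((d) => d.embedding);
}

async function askBook(bookText: string, question: string) {
  const chunks = chunk(bookText);
  const [questionVec, ...chunkVecs] = await embed([question, ...chunks]);
  const topMatches = chunkVecs
    .map((vec, i) => ({ i, score: cosine(questionVec, vec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 4);

  const context = topMatches.map((m) => `[chunk ${m.i}] ${chunks[m.i]}`).join("\n\n");
  const completion = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: "Answer only from the provided book excerpts. Cite chunk numbers." },
      { role: "user", content: `${context}\n\nQuestion: ${question}` },
    ],
  });
  // The chunk indices let the UI link each part of the answer back into the book.
  return { answer: completion.choices[0].message.content, sources: topMatches.map((m) => m.i) };
}
```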

+

Lars: the models are not smart; it is the context and the apparatus deployed by the app developers that make them useful. As a consequence, the better the quality of the book, the better the answers. Metadata are important too; we extract and use them to feed the database.

+

Lars: Metadata, semantics, table of contents - all the ebook apparatus is used here. It is our best chance to get good results.

+

<wolfgang> Gautier: publishers rely on AI systems - risks involved for customers

+

Gautier: there is a risk of loop. AI analysing data created by AI.

+

Lars: yes, that's a major problem actually, with all digital content

+

Gautier: so probably it is of use to have a refine property to indicate that "this metadata or content was AI produced".

+

Lars: For sure! So we could alert the user, give a proportion of risk.

+

Lars: the LLM hype is too much, but still, the results are good. Let's see with images. Here I send the image plus context, including the visible content alongside the image (we call it the visible range), and always in the context of the book thanks to the embedded version stored in a database. I get good results. Trying with a contemporary art photo and a world map with data represented on it. This is complex to achieve in the production pipeline. It is easier in the reading system because we have the complete numerical representation of the book stored in a database.

+

jonas: what is included in the visible range?

+

Lars: text that is available on the visual page. It is risky to expand too much; it could interpolate topics from other parts of the book. We could experiment with adding title structure, for example.

+

wolfgang: I feel, for science content, the chapter level can be the context.

+

Lars: this is to experiment, there are many different books fortunately! The solution will differ largely depending on this diversity. The more granular the information (semantics, metadata, structure) you give, the better the result you'll get. A schema attribute would be a strong help, for example. Be smart when you build your ebook and you'll get strong feedback.

+

Lars: I am also adding semantic search and translation. All we add is meant for non visual readers, they have a stronger need.

+

Lars: it also works with local models so you are not obliged to send your content to feed the LLM. It is slower but it works.

+

jonas: what happens with copyrighted material?

+

Lars: never use free services. I pay for OpenAI, and the contract says they don't use my content for training. That's why we just provide a way to give your API key; then you are responsible. I don't want to take that responsibility.

+

Lars: also, publishers should build and sell rights on an embedded version - meaning licensing your content, but ready for machine usage.

+

jonas: for libraries it's tricky, we usually don't own copyright.

+

Lars: you would need to buy two licences, one for public reading and one for machine usage.

+

wolfgang: in fact all the knowledge used in your system comes from the book. The LLM is only a vehicle here.

+

Lars: yes, the LLM is a conversational interface, good at language, but we need to give it the knowledge by running other code alongside.

+

Lars: and adding control checks to make the answer accurate and verifiable. That's part of the agreement with computers: we want to be able to check, because they don't always tell the truth.

+
+
+ + +
Minutes manually created (not a transcript), formatted by scribe.perl version 244 (Thu Feb 27 01:23:09 2025 UTC).
+ +
+

Diagnostics

+

Succeeded: s/LLm/LLM/

+

Succeeded: s/by/buy/

+

Maybe present: Gautier

+

All speakers: Gautier, jonas, Lars, wolfgang

+

Active on IRC: gautierchomel, wolfgang

+
+ + diff --git a/Meetings/Minutes/2025-07-17-publishingcg.html b/Meetings/Minutes/2025-07-17-publishingcg.html new file mode 100644 index 0000000..379229e --- /dev/null +++ b/Meetings/Minutes/2025-07-17-publishingcg.html @@ -0,0 +1,97 @@ + + + + + +Publishing CG plenary: "AI use cases and technical considerations in Thorium Reader, an open source reading system" + + + + + + + + + + +
+

W3C

+ +

Publishing CG plenary: "AI use cases and technical considerations in Thorium Reader, an open source reading system"

+

17 July 2025

+ + +
+ +
+
+

Attendees

+
+
Present
DanielWeck, gautierchomel, George, james, ori, vladimir, wolfgang
+
Regrets
-
+
Chair
gautierchomel, wolfgang
+
Scribe
wolfgang
+
+
+ + +
+ +
+

Meeting minutes

+

Gautier: This session is part of a series on AI in RS.
+… today Daniel Weck, lead developer of Thorium (reading system from EDRLab)

+

DanielWeck: AI-generated content descriptions in Thorium - unreleased experiment, thus work in progress - lays foundations for user experience we want to follow - demo two page spread with image of Jules Verne with description - inspect the HTML
+… image has empty @alt and empty title - link takes me to an appendix where the image is displayed - but here the @alt is not empty - it says "linked image"
+… would be great if we had some help from AI - Thorium has a zoom feature - leads to room for textual description - can choose an LLM to generate description -
+… decide between "short" or "extended" description - I can edit the system prompt, but there is a default system prompt - Gemini very good at discovering people in images - extended description would have two paragraphs - advanced view of system prompt in JSON format - additional information in prompt in this format
+… select text from answer - run a search on the Internet to get more information
+… new work from W3C WG - complex image (bar chart) - link to extended description - rich text that is not part of a short description - plan to create a modal interface where you might consult AI

+

(1) user sees descriptions (2) chat with AI (3) do further research on the web - familiar chat UI - modal overlay - default system prompt which sets useful boundaries - we also feed in metadata
+… request short or extended descriptions easily - just "one shot" - we need to inform the user that an AI will hallucinate
+… MCP Model Context Protocol for tool calls out of scope - RAG also not implemented - beyond basic embedding - also not local LLMs - response times OK, but not the quality - Gemini better for image descriptions
+… you may give metadata as embedded context for the prompt - advanced users may edit the system prompt and might remove blatantly irrelevant metadata
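A rough sketch of the kind of request described here, with a default but editable system prompt, publication metadata folded in as context, and a short/extended switch, is given below. It is not Thorium's implementation; it assumes the OpenAI Node SDK's image-input message shape, whereas Thorium targets several providers through an abstraction layer.

```typescript
import OpenAI from "openai";

const client = new OpenAI();

const DEFAULT_SYSTEM_PROMPT =
  "You describe images from an ebook for a reader who cannot see them. " +
  "Use only what is visible in the image and the publication context provided. " +
  "Say plainly when something cannot be determined.";

interface DescriptionRequest {
  imageDataUrl: string;             // e.g. "data:image/jpeg;base64,..."
  metadata: Record<string, string>; // title, author, language, ...
  length: "short" | "extended";
  systemPrompt?: string;            // advanced users may override the default
}

async function describeImage(req: DescriptionRequest): Promise<string> {
  const instruction =
    req.length === "short"
      ? "Give a one-sentence description."
      : "Give a two-paragraph extended description.";

  const response = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content:
          (req.systemPrompt ?? DEFAULT_SYSTEM_PROMPT) +
          "\nPublication context: " + JSON.stringify(req.metadata),
      },
      {
        role: "user",
        content: [
          { type: "text", text: instruction },
          { type: "image_url", image_url: { url: req.imageDataUrl } },
        ],
      },
    ],
  });
  return response.choices[0].message.content ?? "";
}
```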

+

George: publishers are not happy with AI getting trained with their copyrighted materials. Any protections?

+

Daniel: All conversations in the chat with AI are used for training if I don't pay for using the LLM - If I were to pay for the service, the data remain private - always depends on the terms and conditions of a particular model - for publishers the TDM reservation protocol allows opting in or out - Thorium would respect this
+… any ideas how that could be solved?

+

George: if image is not used for training, publishers are OK with that.

+

Daniel: Thorium would have to police the use of data by an LLM - Would Thorium have to blacklist some models?

+

James: Publishers are very twitchy about copyrighted material - with an EPUB you can mark the TDM or place a couple of metatags - 6 or 7 different ways to signal that training is not accepted - training is an issue -
+… on-device LLMs would be helpful

+

Daniel: Publishers don't want RS to create friction - with image copies and text scanning, it's so easily done (e.g. on a Mac) - we have to send the image to the AI, but can't control what the LLM will be doing with it

+

James: could a publisher embed a token ?

+

Daniel: agreement with Mistral - access token for EDRLab - could run on a Thorium server - Thorium doesn't transport the key itself, but uses it in accessing the LLM to answer users' requests

+

Ori: if you are using the user's API key, you can't know what the AI does with it - Gemini says they don't use it for training, no idea what OpenAI does - using another key is problematic
+… Gemini doesn't feed requests for image descriptions to humans

+

Daniel: main stumbling block: potential legal issues - we could enable it in nightly builds, but not in production builds

+

George: JPEG has metadata in it - is that transmitted?

+

Daniel: in FB Messenger or Signal I check that GPS data is erased before I share pictures - with AI once the image payload is transmitted - it will be readable for AI

+

Ori: guess it will not ingest geographical data

+

Daniel: most LLMs have restrictions - in Thorium we don't create requests for LLMs manually - we feed image data into an abstraction interface
+… abstraction layer is fully client-side - it allows us to speak Javascript -

+

Ori: had to reduce size of image - don't send EXIF or geographical data

+

Daniel: images processed before sending them on the wire - reduction in size before sending
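A minimal sketch of that client-side preprocessing step: downscaling and re-encoding through a canvas shrinks the payload, and because the image is re-encoded, the resulting JPEG does not carry over the original EXIF/GPS metadata. Illustrative browser-side TypeScript, not Thorium's actual pipeline.

```typescript
// Downscale and re-encode an image before it leaves the device.
async function preprocessImage(original: Blob, maxDimension = 1024): Promise<Blob> {
  const bitmap = await createImageBitmap(original);
  const scale = Math.min(1, maxDimension / Math.max(bitmap.width, bitmap.height));

  const canvas = document.createElement("canvas");
  canvas.width = Math.round(bitmap.width * scale);
  canvas.height = Math.round(bitmap.height * scale);

  const ctx = canvas.getContext("2d");
  if (!ctx) throw new Error("2D canvas not available");
  ctx.drawImage(bitmap, 0, 0, canvas.width, canvas.height);
  bitmap.close();

  return new Promise<Blob>((resolve, reject) =>
    canvas.toBlob(
      (blob) => (blob ? resolve(blob) : reject(new Error("JPEG encoding failed"))),
      "image/jpeg",
      0.85 // quality trade-off: smaller payload on the wire
    )
  );
}
```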

+

gautier: WCAG criteria; description must offer same service as the image - a way to fulfil this - focus on authored description (if available) - real success for WCAG requirement

+
+
+ + +
Minutes manually created (not a transcript), formatted by scribe.perl version 244 (Thu Feb 27 01:23:09 2025 UTC).
+ +
+

Diagnostics

+

No scribenick or scribe found. Guessed: wolfgang

+

Maybe present: Daniel, Gautier

+

All speakers: Daniel, DanielWeck, Gautier, George, James, Ori

+

Active on IRC: DanielWeck, gautierchomel, George, wolfgang

+
+ + diff --git a/Meetings/Minutes/index.md b/Meetings/Minutes/index.md index db800d7..fb0ddc2 100644 --- a/Meetings/Minutes/index.md +++ b/Meetings/Minutes/index.md @@ -5,6 +5,13 @@ title: minutes # Meeting Minutes -[Search for minutes in the mailing list](https://www.w3.org/Search/Mail/Public/search?lists=public-publishingcg&keywords=minutes) +* [2025-07-17](2025-07-17-publishingcg.html) +* [2025-06-11](2025-06-11-publishingcg.html) +* [2025-05-15](2025-05-15-publishingcg.html) +* [2025-03-20](2025-03-20-publishingcg.html) +* [2024-02-15](2024-02-15-publishingcg.html) +* [2022-08-10](2022-08-10-publishingcg.html) +* [2020-10-21](2020-10-21-publishingcg.html) -March 20th,2025 . \ No newline at end of file +Not all minutes have been posted to this GitHub repo; to find more, you can +[Search for minutes in the mailing list](https://www.w3.org/Search/Mail/Public/search?lists=public-publishingcg&keywords=minutes)