Commit 3d2d55e

Merge pull request #33 from radekosmulski/tutorial
add a tutorial walking the reader through adding llms.txt to their project
2 parents a26fc16 + 0024f99 commit 3d2d55e

2 files changed

+187
-2
lines changed

nbs/nbdev.qmd

+184
@@ -0,0 +1,184 @@
1+
---
2+
title: "How to help LLMs understand your nbdev project"
3+
date: 2024-01-20
4+
author: "Radek Osmulski"
5+
description: "A practical guide to making your nbdev library LLM-friendly using llms.txt, with automation techniques and implementation examples."
6+
---
7+
8+
## Overview
9+
10+
This tutorial demonstrates how to add `llms.txt` to your nbdev project, creating a clear interface between your code and LLMs. You'll learn to generate `llms-ctx.txt` and `llms-ctx-full.txt` files and integrate them with your documentation.
11+
12+
While this guide focuses on `nbdev`, the underlying principles and tools are framework-agnostic and can help make any codebase more accessible to LLMs.
13+
14+
Let's explore how to implement this.
15+
16+
## The key ingredient: llms.txt
17+
18+
The foundation of LLM-friendly documentation is the `llms.txt` file. At its core, it is just a Markdown file with information about your library, served at a specific URL (the root of your site followed by `/llms.txt`).
19+
20+
However, it needs to follow the structure outlined in the [llms.txt format](https://llmstxt.org/#format).
21+
22+
Do not be intimidated by the specification, though. In reality, it offers a lot of flexibility, and by conforming to it you'll gain access to several helpful tools that we will look at shortly.
23+
24+
First, let's start working on our `llms.txt` file. If you would like to, you can open your favorite editor and start working on an `llms.txt` for your library as we go along.
25+
26+
Here is how the `llms.txt` file could begin:
27+
28+
```markdown
# FastHTML

> FastHTML is a python library which...

When writing FastHTML apps remember to:

- Thing to remember
```
37+
38+
The required elements are:
39+
40+
- the H1 header (FastHTML)
41+
- a blockquote with a short summary of the project (FastHTML is a python library which...)
42+
43+
These can be followed by zero or more paragraphs and lists. Usually, this is the place where you would add a short description of your library.
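As a rough illustration of the two required elements as this tutorial describes them, the sketch below checks that a file opens with an H1 followed by a blockquote. The helper name `has_required_header` and the regular expression are assumptions made for this example; they are not taken from the llms.txt specification or from any library.

```python
import re

# Hypothetical pattern: an H1 line, one or more newlines, then a blockquote line.
_HEADER = re.compile(r"\A#[ \t]+(?P<title>.+)\n+>\s*(?P<summary>.+)")


def has_required_header(text: str) -> bool:
    """Return True if `text` starts with an H1 followed by a blockquote summary."""
    return _HEADER.match(text.lstrip()) is not None


sample = """# FastHTML

> FastHTML is a python library which...

When writing FastHTML apps remember to:

- Thing to remember
"""

print(has_required_header(sample))             # True
print(has_required_header("No heading here"))  # False
```

A check like this could run before publishing the docs, so a missing title or summary is caught early.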
44+
45+
The description can be as simple as this (this is an excerpt from the [llms.txt](https://fastcore.fast.ai/llms.txt) for [fastcore](https://fastcore.fast.ai/)):
46+
47+
```
Here are some tips on using fastcore:

- **Liberal imports**: Utilize `from fastcore.module import *` freely. The library is designed for safe wildcard imports.
- **Enhanced list operations**: Substitute `list` with `L`. This provides advanced indexing, method chaining, and additional functionality while maintaining list-like behavior.
- **Extend existing classes**: Apply the `@patch` decorator to add methods to classes, including built-ins, without subclassing. This enables more flexible code organization.
```
54+
55+
Here are a few ideas to make writing the description easier:
56+
57+
- Consider the content you already have that can be used as a starting point (e.g. your project's README, blog posts and articles, social media discussions, etc.)
58+
- Think of how you would describe your library to a new team member --- this often yields the right balance of precision and comprehension.
59+
- Use an LLM to help you synthesize content from multiple sources into cohesive prose (though you might need to do some post-processing to combat the LLM's tendency to be verbose).
60+
61+
### Adding resource sections
62+
63+
After the optional description, you can include zero or more sections starting with an H2 heading and containing links to supplementary resources.
64+
65+
Markdown files are strongly recommended here as they offer a good balance of structure and readability. You could attempt linking to other formats, but your results may vary. For instance, HTML tends to be verbose, and formats like CSV rarely contain information that lends itself well to documenting functionality.
66+
67+
Here's an example of what this section might look like:
68+
69+
```markdown
## Docs

- [Surreal](https://host/README.md): Tiny jQuery alternative with Locality of Behavior
- [FastHTML quick start](https://host/quickstart.html.md): An overview of FastHTML features

## Examples

- [Todo app](https://host/adv_app.py)
```
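To see how regular the link lines in the example above are, here is a minimal sketch that splits each one into a title, a URL, and an optional description. The `parse_links` helper and its regular expression are assumptions for illustration only, not the parser used by any particular tool.

```python
import re

# Assumed shape of a resource line: "- [title](url): optional description"
LINK = re.compile(r"-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?")


def parse_links(section_body: str) -> list[dict]:
    """Parse each link line of a section body into a title/url/desc dict."""
    links = []
    for line in section_body.splitlines():
        m = LINK.match(line.strip())
        if m:  # lines that are not links are skipped
            links.append(m.groupdict())
    return links


docs = """
- [Surreal](https://host/README.md): Tiny jQuery alternative with Locality of Behavior
- [FastHTML quick start](https://host/quickstart.html.md): An overview of FastHTML features
"""

for link in parse_links(docs):
    print(link["title"], "->", link["url"])
```

The description after the colon is optional, so a bare link such as the `Todo app` entry parses with `desc` set to `None`.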
79+
80+
### The Optional section
81+
82+
If you'd like to, you can include a section with `Optional` as the heading. This section has a special meaning and provides a mechanism for managing context size. Resources listed in this section appear only in `llms-ctx-full.txt`, while being omitted from `llms-ctx.txt`. This allows the user (be that a human or an agent) to choose the right amount of context based on their use case and the capabilities of the LLM they plan to use. In summary, the three files contain:
83+
84+
- `llms.txt`: just the initial section with an optional description and optional resource sections with unexpanded links
85+
- `llms-ctx.txt`: as above but with links expanded apart from the 'Optional' section
86+
- `llms-ctx-full.txt`: all sections with expanded links
87+
88+
Here is a small example of the `Optional` section:
89+
90+
```markdown
## Optional

- [Starlette docs](https://host/starlette-sml.md): A subset of the Starlette docs
```
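The selection rule for the `Optional` section can be sketched in a few lines. The function name `sections_to_expand` and its `full` flag are hypothetical and exist only to model the difference between the two context files.

```python
def sections_to_expand(section_names: list[str], full: bool) -> list[str]:
    """Pick which H2 sections get their links expanded.

    full=False models llms-ctx.txt (skips 'Optional');
    full=True models llms-ctx-full.txt (keeps every section).
    """
    return [name for name in section_names if full or name != "Optional"]


names = ["Docs", "Examples", "Optional"]
print(sections_to_expand(names, full=False))  # ['Docs', 'Examples']
print(sections_to_expand(names, full=True))   # ['Docs', 'Examples', 'Optional']
```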
95+
96+
Your `llms.txt` file is now complete! Give yourself a pat on the back for a job well done, and let's move on to the next, automated step.
97+
98+
## Generating context files
99+
100+
The [llms-txt](https://llmstxt.org/intro.html) library automates the process of generating context files from your `llms.txt`. It can be used either through its command-line interface or as a Python module.
101+
102+
Using the CLI:
103+
104+
```bash
llms_txt2ctx llms.txt --save_nbdev_fname llms-ctx.txt
```
107+
108+
Or via Python:
109+
110+
```python
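from pathlib import Path  # imported explicitly; the wildcard import may not re-export it
from llms_txt import *

# Read the llms.txt file and parse it into a structured object
samp = Path('llms-sample.txt').read_text()
parsed = parse_llms_file(samp)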
```
115+
116+
The CLI reads your `llms.txt` file, retrieves the linked content, and writes out the result. This is the process that takes your `llms.txt` and turns it into `llms-ctx.txt` and `llms-ctx-full.txt`. The Python snippet covers the first part of that process: parsing the file.
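Conceptually, expanding links means replacing each link with the content it points to. The sketch below illustrates that idea only: `build_context`, the injected `fetch` callable, and the Markdown output layout are all assumptions, and the real llms-txt library may structure its output differently. Passing the fetcher in as an argument keeps the example free of network access.

```python
from typing import Callable


def build_context(sections: dict[str, list[dict]], fetch: Callable[[str], str]) -> str:
    """Concatenate the fetched content of every linked resource.

    `sections` maps an H2 heading to a list of {"title", "url"} dicts.
    `fetch` takes a URL and returns that document's text.
    """
    parts = []
    for heading, links in sections.items():
        parts.append(f"## {heading}")
        for link in links:
            parts.append(f"### {link['title']}\n\n{fetch(link['url'])}")
    return "\n\n".join(parts)


# Stand-in for HTTP requests: a dict of URL -> document text (made-up content).
fake_site = {
    "https://host/README.md": "Surreal readme text",
    "https://host/adv_app.py": "print('todo app')",
}
sections = {
    "Docs": [{"title": "Surreal", "url": "https://host/README.md"}],
    "Examples": [{"title": "Todo app", "url": "https://host/adv_app.py"}],
}

print(build_context(sections, fake_site.__getitem__))
```

In a real setup, `fetch` would issue an HTTP request, and any `Optional` section would be filtered out beforehand when building the smaller context file.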
117+
118+
There is another exciting library, [pysymbol-llm](https://github.com/AnswerDotAI/pysymbol-llm), that will allow us to add even more useful information in an automated way.
119+
120+
## Enhancing context with API reference
121+
122+
While LLMs generally understand high-level concepts, they often struggle with implementation details, especially when their training data is outdated. Providing a comprehensive list of your library's symbols (functions, classes, and their documentation) helps bridge this gap.
123+
124+
This is where the [pysymbol-llm](https://github.com/AnswerDotAI/pysymbol-llm) library enters the picture. It generates a complete API reference in Markdown, extracting existing docstrings along the way.
125+
126+
This is a short excerpt from the [apilist.txt](https://fastcore.fast.ai/apilist.txt) for `fastcore`:
127+
128+
```markdown
# fastcore Module Documentation

## fastcore.ansi

> Filters for processing ANSI colors.

- `def strip_ansi(source)`
    Remove ANSI escape codes from text.

- `def ansi2html(text)`
    Convert ANSI colors to HTML colors.

- `def ansi2latex(text)`
    Convert ANSI colors to LaTeX colors.
```
144+
145+
The tool works great even with larger libraries. For instance, generating the API reference for numpy requires just one command:
146+
147+
```bash
148+
pysym2md numpy
149+
```
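For intuition about what such a listing involves, here is a minimal standard-library sketch that walks a module's public functions and prints each signature with the first line of its docstring. It is not how pysymbol-llm is implemented; the `api_list` helper is hypothetical and handles only top-level functions.

```python
import inspect
import textwrap


def api_list(module) -> str:
    """Render a module's public functions as a Markdown list with docstring summaries."""
    lines = [f"## {module.__name__}", ""]
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        # Skip private names and functions merely imported from elsewhere.
        if name.startswith("_") or obj.__module__ != module.__name__:
            continue
        doc = (inspect.getdoc(obj) or "").split("\n")[0]
        lines.append(f"- `def {name}{inspect.signature(obj)}`")
        lines.append(f"    {doc}")
        lines.append("")
    return "\n".join(lines)


# textwrap is used only because it ships with Python and is small.
print(api_list(textwrap))
```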
150+
151+
To implement this in your project, generate an `apilist.txt`, serve it alongside your documentation, and reference it from your `llms.txt` file.
152+
153+
## Configuration
154+
155+
The final step is to configure your nbdev project to generate and serve these context files. This requires three changes:
156+
157+
1. Add your `llms.txt` file to the `nbs` directory of your project.
158+
159+
2. Add the required dependencies to `settings.ini`:
160+
```
161+
dev_requirements = pysymbol_llm llms-txt
162+
```
163+
3. Configure Quarto's build process in `nbs/_quarto.yml`:
164+
```
165+
project:
166+
type: website
167+
pre-render:
168+
- pysym2md --output_file apilist.txt nbdev
169+
post-render:
170+
- llms_txt2ctx llms.txt --optional true --save_nbdev_fname llms-ctx-full.txt
171+
- llms_txt2ctx llms.txt --save_nbdev_fname llms-ctx.txt
172+
resources:
173+
- "*.txt"
174+
```
175+
176+
Remember to manually add a link to the generated `apilist.txt` in your `llms.txt` file. Once you commit these changes and rebuild your docs, your library will be ready for deeper, more accurate conversations with LLMs!
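Because the link to `apilist.txt` is added by hand, it is easy to forget. A small check like the following could catch that before publishing. The `links_to` helper and the sample file contents are made up for this example.

```python
import re


def links_to(llms_text: str, filename: str) -> bool:
    """Return True if any Markdown link target in `llms_text` ends with `filename`."""
    targets = re.findall(r"\]\(([^)]+)\)", llms_text)
    return any(t.rstrip("/").endswith(filename) for t in targets)


llms = """# mylib

> A made-up library used only for this example.

## API

- [API list](https://host/apilist.txt): A succinct list of all functions and methods
"""

print(links_to(llms, "apilist.txt"))
```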
177+
178+
## Learning from examples
179+
180+
It is often useful to study how others went about implementing the thing we are working on. The [fastcore](https://fastcore.fast.ai/llms.txt) and [FastHTML](https://docs.fastht.ml/) projects offer a good reference, and you can find additional examples in the [llmstxt.site](https://llmstxt.site/) and [llmstxt.cloud](https://directory.llmstxt.cloud/) directories.
181+
182+
To see all the necessary changes in one place, here's a [complete example](https://github.com/AnswerDotAI/nbdev/pull/1485/files) of adding `llms.txt` to an existing nbdev project.
183+
184+
Providing the right context opens up new possibilities for AI-assisted development and exploring topics you might want to learn more about. We hope this guide helps you and the users of your library take advantage of these exciting new tools.

nbs/sidebar.yml

+3-2
@@ -11,6 +11,7 @@ website:
       - section: Editors and IDEs
         contents:
           - ed.md
-      - section: Guidlines
+      - section: Tutorials
         contents:
-          - domains.md
+          - domains.md
+          - nbdev.qmd
