Commit 725ada2

Add sitemap generator script to use from GH action
Signed-off-by: Kate Goldenring <[email protected]>
1 parent ed1f826 commit 725ada2

3 files changed: 63 additions, 4 deletions

Diff for: .github/workflows/mdbook.yml

+5 -3
@@ -30,7 +30,6 @@ jobs:
     runs-on: ubuntu-latest
     env:
       MDBOOK_VERSION: 0.4.21
-      SITEMAP_GEN_VERSION: 0.2.0
       ALERTS_VERSION: 0.6.7
       PUBLISH_DOMAIN: component-model.bytecodealliance.org
     steps:
@@ -40,17 +39,20 @@ jobs:
           curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf -y | sh
           rustup update
           cargo install --version ${MDBOOK_VERSION} mdbook
-          cargo install --version ${SITEMAP_GEN_VERSION} mdbook-sitemap-generator
           cargo install --version ${ALERTS_VERSION} mdbook-alerts
       - name: Setup Pages
         id: pages
         uses: actions/configure-pages@v3
       - name: Build with mdBook
         run: mdbook build component-model
+      - name: Setup Python
+        uses: actions/setup-python@v2
+        with:
+          python-version: 3.10
       - name: Generate sitemap
         run: |
           cd component-model
-          mdbook-sitemap-generator -d ${PUBLISH_DOMAIN} -o book/sitemap.xml
+          python3 ../scripts/generate_sitemap.py --domain "component-model.bytecodealliance.org" --higher-priority "design" --output-path book/sitemap.xml
           cd ..
       - name: Upload artifact
         uses: actions/upload-pages-artifact@v2
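
Review note on `python-version: 3.10`: an unquoted `3.10` in YAML looks like a number, so a YAML loader will typically read it as the float 3.1, whereas `0.4.21` stays a string because it is not a valid number. The sketch below only mimics that resolution step in plain Python; `resolve_scalar` is a hypothetical helper written for this illustration, not part of any YAML library, and whether the workflow run for this commit was affected is not shown here. A common remedy is to quote the value (`python-version: "3.10"`).

```python
# Sketch only: imitates how a YAML loader treats an unquoted scalar that
# looks numeric. Real loaders apply fuller rules; resolve_scalar is a
# hypothetical helper for this illustration.
def resolve_scalar(text: str):
    """Return a float when the text parses as a number, else the text itself."""
    try:
        return float(text)
    except ValueError:
        return text


unquoted = resolve_scalar("3.10")      # roughly what `python-version: 3.10` yields
version_pin = resolve_scalar("0.4.21")  # not a number, so it stays a string
quoted = "3.10"                        # what `python-version: "3.10"` yields

print(str(unquoted))   # the trailing zero is lost
print(version_pin)
print(quoted)
```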

Diff for: CONTRIBUTING.md

-1
@@ -14,7 +14,6 @@ This repository also makes use of mdBook plugins. To install mdBook and the plug
 
 ```console
 cargo install --version 0.4.21 mdbook
-cargo install --version 0.2.0 mdbook-sitemap-generator
 cargo install --version 0.6.7 mdbook-alerts
 ```
 

Diff for: scripts/generate_sitemap.py

+58
@@ -0,0 +1,58 @@
+import os
+from urllib.parse import urljoin
+from datetime import datetime
+import argparse
+
+def parse_summary():
+    """Parse URLs from the SUMMARY.md file."""
+    with open("src/SUMMARY.md", "r") as file:
+        for line in file:
+            if "](" in line:
+                url = line.split("](")[1].split(")")[0]
+                # Add .html extension if not the root URL
+                if url.endswith(".md"):
+                    url = url[:-3] + ".html"
+                yield url
+
+def determine_priority(url_path, higher_priority_section):
+    """Determine the priority based on the URL path and specified higher priority section."""
+    if url_path.count("/") <= 1: # Pages directly under the base URL
+        return "1.0"
+    elif higher_priority_section and url_path.startswith(f"./{higher_priority_section}"): # Pages in the specified higher priority section
+        return "0.8"
+    else:
+        return "0.5" # All other pages
+
+def generate_sitemap(domain, output_path, higher_priority_section):
+    """Generate a sitemap XML file from SUMMARY.md structure."""
+    domain = "https://" + domain
+    urls = parse_summary() # Add base URL to the list of URLs
+    urls = [""] + list(urls)
+
+    sitemap = '<?xml version="1.0" encoding="UTF-8"?>\n'
+    sitemap += '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
+
+    for url in urls:
+        full_url = urljoin(domain, url)
+        priority = determine_priority(url, higher_priority_section)
+
+        sitemap += " <url>\n"
+        sitemap += f" <loc>{full_url}</loc>\n"
+        sitemap += " <changefreq>weekly</changefreq>\n"
+        sitemap += f" <priority>{priority}</priority>\n"
+        sitemap += " </url>\n"
+
+    sitemap += "</urlset>"
+
+    # Write the sitemap to the specified output path
+    with open(output_path, "w") as file:
+        file.write(sitemap)
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(description="Generate a sitemap for mdBook")
+    parser.add_argument("-d", "--domain", required=True, help="Domain for the mdBook site (e.g., component-model.bytecodealliance.org)")
+    parser.add_argument("-o", "--output-path", default="sitemap.xml", help="Output path for the sitemap file")
+    parser.add_argument("-p", "--higher-priority", help="Subsection path (e.g., 'design') to assign a higher priority of 0.8")
+    args = parser.parse_args()
+
+    generate_sitemap(args.domain, args.output_path, args.higher_priority)
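
Review note on link parsing and priorities: the sketch below exercises the same rules as `parse_summary` and `determine_priority` against an in-memory sample, to make the slash-counting behaviour easier to see. `parse_links` is a hypothetical variant that takes lines instead of opening `src/SUMMARY.md`, and the sample entries are invented for illustration; the real SUMMARY.md is not part of this commit.

```python
def parse_links(lines):
    """Yield link targets from SUMMARY.md-style lines, mapping .md to .html."""
    for line in lines:
        if "](" in line:
            url = line.split("](")[1].split(")")[0]
            if url.endswith(".md"):
                url = url[:-3] + ".html"
            yield url


def determine_priority(url_path, higher_priority_section):
    """Same rules as the script: shallow paths first, then the chosen section."""
    if url_path.count("/") <= 1:
        return "1.0"
    elif higher_priority_section and url_path.startswith(f"./{higher_priority_section}"):
        return "0.8"
    else:
        return "0.5"


# Hypothetical SUMMARY.md content, not taken from the repository.
SAMPLE = """\
# Summary

- [Introduction](./introduction.md)
- [Why Components](./design/why-component-model.md)
    - [Rust](./language-support/rust.md)
- [No Prefix](design/wit.md)
"""

urls = list(parse_links(SAMPLE.splitlines()))
priorities = {url: determine_priority(url, "design") for url in urls}
for url, priority in priorities.items():
    print(priority, url)
```

Under these rules a link written as `design/wit.md` (no leading `./`) contains only one slash, so it is treated as a top-level page and gets 1.0 rather than 0.8. The 0.8 tier therefore assumes SUMMARY.md links consistently start with `./`.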

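Review note on XML output: the script interpolates each URL into `<loc>` without escaping, which is fine for plain paths but would yield invalid XML if a link ever contained `&` or `<`. One possible alternative, not what the commit does, is to build the document with the standard library's `xml.etree.ElementTree`, which escapes text on serialization. `build_sitemap`, the `example.org` base URL, and the sample entries are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, entries):
    """Build sitemap XML from (relative_url, priority) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for rel, priority in entries:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = urljoin(base_url, rel)
        ET.SubElement(url_el, "changefreq").text = "weekly"
        ET.SubElement(url_el, "priority").text = priority
    # tostring() escapes &, < and > in element text.
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical entries; the second one contains a character that needs escaping.
xml_text = build_sitemap(
    "https://example.org/",
    [("", "1.0"), ("./a/b.html?x=1&y=2", "0.5")],
)
print(xml_text)
```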