Upgrade to wombat 3.8.8 and other Python/JS dependencies #251

Merged: 2 commits, Feb 17, 2025
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -8,15 +8,15 @@ repos:
 - id: trailing-whitespace
 - id: end-of-file-fixer
 - repo: https://github.com/psf/black
-rev: '24.10.0'
+rev: '25.1.0'
 hooks:
 - id: black
 - repo: https://github.com/astral-sh/ruff-pre-commit
-rev: v0.9.2
+rev: v0.9.6
 hooks:
 - id: ruff
 - repo: https://github.com/RobertCraigie/pyright-python
-rev: v1.1.391
+rev: v1.1.394
 hooks:
 - id: pyright
 name: pyright (system)
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## [Unreleased]

+### Changed
+
+- Upgrade to wombat 3.8.8 and other Python/JS dependencies (#249)
+
 ## [5.1.0] - 2025-01-21

 ### Changed
14 changes: 7 additions & 7 deletions javascript/package.json
@@ -5,16 +5,16 @@
 "license": "GPL-3.0-or-later",
 "author": "openZIM",
 "devDependencies": {
-"@rollup/plugin-commonjs": "28.0.1",
-"@rollup/plugin-node-resolve": "15.3.0",
+"@rollup/plugin-commonjs": "28.0.2",
+"@rollup/plugin-node-resolve": "16.0.0",
 "@rollup/plugin-strip": "^3.0.4",
 "@rollup/plugin-terser": "0.4.4",
-"ava": "^6.1.3",
+"ava": "^6.2.0",
 "babel-eslint": "^10.1.0",
-"eslint": "9.13.0",
-"eslint-config-prettier": "9.1.0",
-"prettier": "3.3.3",
-"rollup": "4.24.0",
+"eslint": "9.20.1",
+"eslint-config-prettier": "10.0.1",
+"prettier": "3.5.1",
+"rollup": "4.34.7",
 "rollup-plugin-version-injector": "^1.3.3"
 },
 "scripts": {
999 changes: 535 additions & 464 deletions javascript/yarn.lock

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion openzim.toml
@@ -6,5 +6,5 @@ execute_after=[

 [files.assets.actions."wombat.js"]
 action="get_file"
-source="https://cdn.jsdelivr.net/npm/@webrecorder/wombat@3.8.7/dist/wombat.js"
+source="https://cdn.jsdelivr.net/npm/@webrecorder/wombat@3.8.8/dist/wombat.js"
 target_file="wombat.js"
20 changes: 10 additions & 10 deletions pyproject.toml
@@ -56,30 +56,30 @@ scripts = [

 ]
 lint = [
-"black==24.10.0",
-"ruff==0.9.2",
+"black==25.1.0",
+"ruff==0.9.6",
 ]
 check = [
-"pyright==1.1.391",
+"pyright==1.1.394",
 "pytest==8.3.4",
 ]
 test = [
 "pytest==8.3.4",
 "pytest-mock==3.14.0",
-"coverage==7.6.10",
+"coverage==7.6.12",
 ]
 docs = [
 "mkdocs==1.6.1",
-"mkdocstrings[python]==0.27.0",
-"mkdocs-material==9.5.50",
-"pymdown-extensions==10.14",
+"mkdocstrings[python]==0.28.1",
+"mkdocs-material==9.6.4",
+"pymdown-extensions==10.14.3",
 "mkdocs-gen-files==0.5.0",
 "mkdocs-literate-nav==0.6.1",
-"mkdocs-include-markdown-plugin==7.1.2",
+"mkdocs-include-markdown-plugin==7.1.4",
 ]
 dev = [
-"ipython==8.31.0",
-"pre-commit==4.0.1",
+"ipython==8.32.0",
+"pre-commit==4.1.0",
 "zimscraperlib[scripts]",
 "zimscraperlib[lint]",
 "zimscraperlib[test]",
4 changes: 2 additions & 2 deletions src/zimscraperlib/filesystem.py
@@ -1,6 +1,6 @@
-""" Files manipulation tools
+"""Files manipulation tools

-Shortcuts to retrieve mime type using magic"""
+Shortcuts to retrieve mime type using magic"""

 import pathlib
 from contextlib import contextmanager
2 changes: 1 addition & 1 deletion src/zimscraperlib/fix_ogvjs_dist.py
@@ -1,4 +1,4 @@
-""" quick script to fix videojs-ogvjs so that it triggers on webm mimetype """
+"""quick script to fix videojs-ogvjs so that it triggers on webm mimetype"""

 import logging
 import pathlib
8 changes: 5 additions & 3 deletions src/zimscraperlib/html.py
@@ -1,4 +1,4 @@
-""" Tools to work with HTML contents """
+"""Tools to work with HTML contents"""

 import pathlib
 from typing import BinaryIO, TextIO
@@ -43,11 +43,13 @@ def find_language_in(content: str | BinaryIO | TextIO, mime_type: str) -> str:
 continue
 if (
 nodename == "meta"
-and not node.attrs.get("http-equiv", "").lower()
+and not node.attrs.get(
+"http-equiv", ""
+).lower() # pyright:ignore[reportUnknownMemberType, reportAttributeAccessIssue]
 == "content-language"
 ):
 continue
-return node.attrs[key]
+return node.attrs[key] # pyright:ignore[reportReturnType]
 return ""


30 changes: 15 additions & 15 deletions src/zimscraperlib/image/optimization.py
@@ -1,22 +1,22 @@
-""" An image optimization module to optimize the following image formats:
+"""An image optimization module to optimize the following image formats:

-- JPEG (using optimize-images)
-- PNG (using optimize-images)
-- GIF (using gifsicle with lossy optimization)
-- WebP (using Pillow)
+- JPEG (using optimize-images)
+- PNG (using optimize-images)
+- GIF (using gifsicle with lossy optimization)
+- WebP (using Pillow)

-Some important notes:
-- This makes use of the --lossy option from gifsicle which is present
-only in versions above 1.92.
-If the package manager has a lower version, you can build gifsicle
-from source and install or
-do not use the lossiness option.
+Some important notes:
+- This makes use of the --lossy option from gifsicle which is present
+only in versions above 1.92.
+If the package manager has a lower version, you can build gifsicle
+from source and install or
+do not use the lossiness option.

-- Presets for the optimizer are available in zimscraperlib.image.presets.
+- Presets for the optimizer are available in zimscraperlib.image.presets.

-- If no options for an image optimization is passed, the optimizer
-can still run on default settings which give
-a bit less size than the original images but maintain a high quality. """
+- If no options for an image optimization is passed, the optimizer
+can still run on default settings which give
+a bit less size than the original images but maintain a high quality."""

 import io
 import os
2 changes: 1 addition & 1 deletion src/zimscraperlib/misc.py
@@ -1,4 +1,4 @@
-""" Miscelaneous utils"""
+"""Miscelaneous utils"""

 from typing import TypeVar

2 changes: 1 addition & 1 deletion src/zimscraperlib/rewriting/css.py
@@ -1,4 +1,4 @@
-""" CSS Rewriting
+"""CSS Rewriting

 This modules contains tools to rewrite CSS retrieved from an online source so that it
 can safely operate within a ZIM, linking only to ZIM entries everytime a URL is used.
10 changes: 7 additions & 3 deletions src/zimscraperlib/rewriting/html.py
@@ -1,4 +1,4 @@
-""" HTML Rewriting
+"""HTML Rewriting

 This modules contains tools to rewrite HTML retrieved from an online source so that it
 can safely operate within a ZIM.
@@ -101,8 +101,12 @@ def extract_base_href(content: str) -> str | None:
 if not soup.head:
 return None
 for base in soup.head.find_all("base"):
-if base.has_attr("href"):
-return base["href"]
+if base.has_attr( # pyright:ignore[reportUnknownMemberType, reportAttributeAccessIssue]
+"href"
+):
+return base[ # pyright:ignore[reportIndexIssue, reportUnknownVariableType, reportArgumentType, reportReturnType]
+"href"
+]
 return None
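
The hunk above only adds pyright suppressions; the lookup itself (first `<base>` element inside `<head>` that carries an `href`) is unchanged. For readers who want to see that lookup in isolation, here is a minimal sketch that swaps the library's BeautifulSoup traversal for the standard library's `html.parser`. The names `_BaseHrefParser` and `extract_base_href_sketch` are hypothetical and not part of zimscraperlib, and edge-case behaviour may differ from the real function.

```python
from __future__ import annotations

from html.parser import HTMLParser


class _BaseHrefParser(HTMLParser):
    """Hypothetical helper: remembers the first <base href=...> seen inside <head>."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.base_href: str | None = None

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "base" and self.in_head and self.base_href is None:
            # attrs is a list of (name, value) pairs; value is None for bare attributes
            href = dict(attrs).get("href")
            if href is not None:
                self.base_href = href

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def extract_base_href_sketch(content: str) -> str | None:
    """Return the href of the first <base> in <head>, or None when there is none."""
    parser = _BaseHrefParser()
    parser.feed(content)
    parser.close()
    return parser.base_href
```

Because the value comes back as a plain `str | None`, this variant needs no type-checker suppressions, which is the trade-off the diff makes in the other direction by keeping BeautifulSoup and silencing pyright.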


2 changes: 1 addition & 1 deletion src/zimscraperlib/rewriting/js.py
@@ -1,4 +1,4 @@
-""" JS Rewriting
+"""JS Rewriting

 This modules contains tools to rewrite JS retrieved from an online source so that it
 can safely operate within a ZIM. It is based on the assumption that wombat.js will be
2 changes: 1 addition & 1 deletion src/zimscraperlib/rewriting/url_rewriting.py
@@ -1,4 +1,4 @@
-""" URL rewriting tools
+"""URL rewriting tools

 This module is about url and entry path rewriting.

20 changes: 10 additions & 10 deletions src/zimscraperlib/types.py
@@ -1,16 +1,16 @@
-""" File extensions to MIME-Type mapping
+"""File extensions to MIME-Type mapping

-All libzim *articles* contains the mime-type of their content, for the libzim
-reader to properly return it.
+All libzim *articles* contains the mime-type of their content, for the libzim
+reader to properly return it.

-Providing accurate mime-type for ZIM Article is important to prevent broken features
-upon reading.
-Ex.: youtube scraper uses Web Assembly files (.wasm) for the WebM codecs.
-Without the proper mime-type, wasm files are returned as octet-stream and thus
-not loaded efficiently.
+Providing accurate mime-type for ZIM Article is important to prevent broken features
+upon reading.
+Ex.: youtube scraper uses Web Assembly files (.wasm) for the WebM codecs.
+Without the proper mime-type, wasm files are returned as octet-stream and thus
+not loaded efficiently.

-Should your scraper need additional mapping, use mimetypes.add_type() and it will
-be automatically used. """
+Should your scraper need additional mapping, use mimetypes.add_type() and it will
+be automatically used."""

 import mimetypes
 import pathlib
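
The docstring in this hunk recommends `mimetypes.add_type()` for extra mappings. A minimal sketch of that registration, using only the standard library, might look like the following. The helper `guess_mime` and the sample filename are hypothetical and only illustrate the lookup; they are not zimscraperlib's API.

```python
import mimetypes

# Register the mapping once, before any file is added to the ZIM.
# On recent Python versions .wasm may already be known; registering again is harmless.
mimetypes.add_type("application/wasm", ".wasm")


def guess_mime(filename: str) -> str:
    """Hypothetical helper: guess a MIME type, falling back to octet-stream."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime or "application/octet-stream"


print(guess_mime("ogv-decoder-video-vp9.wasm"))
```

The fallback to `application/octet-stream` mirrors the failure mode the docstring describes: without a registered mapping, a `.wasm` file would be served with that generic type.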
2 changes: 1 addition & 1 deletion src/zimscraperlib/uri.py
@@ -1,4 +1,4 @@
-""" URI handling module"""
+"""URI handling module"""

 import urllib.parse

12 changes: 6 additions & 6 deletions src/zimscraperlib/zim/__init__.py
@@ -1,10 +1,10 @@
-""" ZIM file creation tools
+"""ZIM file creation tools

-zim.creator: create files by manually adding each article
-zim.filesystem: zimwriterfs-like creation from a build folder
-zim.providers: contentProvider for serving libzim with data
-zim.items: item to add to creator
-zim.archive: read ZIM files, accessing or searching its content"""
+zim.creator: create files by manually adding each article
+zim.filesystem: zimwriterfs-like creation from a build folder
+zim.providers: contentProvider for serving libzim with data
+zim.items: item to add to creator
+zim.archive: read ZIM files, accessing or searching its content"""

 from libzim.writer import Blob # pyright: ignore[reportMissingModuleSource]

2 changes: 1 addition & 1 deletion src/zimscraperlib/zim/_libkiwix.py
@@ -1,4 +1,4 @@
-r""" [INTERNAL] libkiwix's internal features copies
+r"""[INTERNAL] libkiwix's internal features copies

 CAUTION: this is __not__ part of zimscraperlib's API. Don't use outside scraperlib!

12 changes: 6 additions & 6 deletions src/zimscraperlib/zim/archive.py
@@ -1,10 +1,10 @@
-""" ZIM Archive helper
+"""ZIM Archive helper

-Convenient subclass of libzim.reader.Archive with:
-- direct access to Item from path
-- direct access to suggestions and suggestions count
-- direct access to search results and number of results
-- public Entry access by Id"""
+Convenient subclass of libzim.reader.Archive with:
+- direct access to Item from path
+- direct access to suggestions and suggestions count
+- direct access to search results and number of results
+- public Entry access by Id"""

 from collections.abc import Iterable
 from types import TracebackType
30 changes: 15 additions & 15 deletions src/zimscraperlib/zim/creator.py
@@ -1,18 +1,18 @@
-""" ZIM Creator helper
-
-Convenient subclass of libzim.writer.Creator with:
-- easier configuration of commonly set props during init
-- start/stop methods to bypass the contextmanager
-- method to create an entry directly from args
-- direct method to add redirects without title
-- prevent exeption on double call to close()
-
-Convenient subclasses of libzim.writer.Item with:
-- metadata set on initialization
-- metadata stored on object
-Sister subclass StaticItem (inheriting from it) with:
-- content stored on object
-- can be used to store a filepath and content read from it (not stored) """
+"""ZIM Creator helper
+
+Convenient subclass of libzim.writer.Creator with:
+- easier configuration of commonly set props during init
+- start/stop methods to bypass the contextmanager
+- method to create an entry directly from args
+- direct method to add redirects without title
+- prevent exeption on double call to close()
+
+Convenient subclasses of libzim.writer.Item with:
+- metadata set on initialization
+- metadata stored on object
+Sister subclass StaticItem (inheriting from it) with:
+- content stored on object
+- can be used to store a filepath and content read from it (not stored)"""

 import io
 import logging
36 changes: 18 additions & 18 deletions src/zimscraperlib/zim/filesystem.py
@@ -1,27 +1,27 @@
-""" zimwriterfs-like tools to convert a build folder into a ZIM
+"""zimwriterfs-like tools to convert a build folder into a ZIM

-make_zim_file behaves in a similar way to zimwriterfs and expects the same options:
+make_zim_file behaves in a similar way to zimwriterfs and expects the same options:

-- Guesses file mime-type from filenames
-- Add all files to respective namespaces based on mime type
-- Add redirects from a zimwriterfs-compatible redirects TSV
-- Adds common metadata
+- Guesses file mime-type from filenames
+- Add all files to respective namespaces based on mime type
+- Add redirects from a zimwriterfs-compatible redirects TSV
+- Adds common metadata

-Also included:
-- Add redirect from a list of (source, destination, title) strings
+Also included:
+- Add redirect from a list of (source, destination, title) strings

-Note: due to the lack of a cancel() method in the libzim itself, it is not possible
-to stop a zim creation process. Should an error occur in your code, a Zim file
-with up-to-that-moment content will be created at destination.
+Note: due to the lack of a cancel() method in the libzim itself, it is not possible
+to stop a zim creation process. Should an error occur in your code, a Zim file
+with up-to-that-moment content will be created at destination.

-To prevent this (creating an unwanted Zim file) from happening,
-a workaround is in place. It prevents the libzim from finishing its process.
-While it results in no Zim file being created, it results in the zim temp folder
-to be left on disk and very frequently leads to a segmentation fault at garbage
-collection (on exit mostly).
+To prevent this (creating an unwanted Zim file) from happening,
+a workaround is in place. It prevents the libzim from finishing its process.
+While it results in no Zim file being created, it results in the zim temp folder
+to be left on disk and very frequently leads to a segmentation fault at garbage
+collection (on exit mostly).

-Meaning you should exit right after an exception in your code (during zim creation)
-Use workaround_nocancel=False to disable the workaround. """
+Meaning you should exit right after an exception in your code (during zim creation)
+Use workaround_nocancel=False to disable the workaround."""

 import datetime
 import pathlib
2 changes: 1 addition & 1 deletion src/zimscraperlib/zim/indexing.py
@@ -1,4 +1,4 @@
-""" Special item with customized index data and helper classes """
+"""Special item with customized index data and helper classes"""

 import io
 import pathlib
2 changes: 1 addition & 1 deletion src/zimscraperlib/zim/items.py
@@ -1,4 +1,4 @@
-""" libzim Item helpers """
+"""libzim Item helpers"""

 import io
 import pathlib