Cannot build documentation on Mac OS #32203

@jrhe

Description

System Info

  • transformers version: 4.44.0.dev0
  • Platform: macOS-14.3-arm64-arm-64bit
  • Python version: 3.8.19
  • Huggingface_hub version: 0.23.4
  • Safetensors version: 0.4.3
  • Accelerate version: 0.32.1
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.1.2 (False)
  • Tensorflow version (GPU?): 2.13.0 (False)
  • Flax version (CPU?/GPU?/TPU?): 0.7.0 (cpu)
  • Jax version: 0.4.13
  • JaxLib version: 0.4.13
  • Using distributed or parallel set-up in script?: no

Who can help?

@stevhliu - N.B. a fix has been found and a PR will be made on doc-builder. Raising the issue here to document it in case anyone else runs into it in the meantime.

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Run steps to build documentation as described in https://github.com/huggingface/transformers/tree/main/docs on Mac OS.


Initial build docs for transformers docs/source/en/ /var/folders/g7/h9hst8551g74rd1jsf7txvj40000gn/T/tmpqf9yjhon/transformers/main/en
Building the MDX files:  49%|███████████████████████████████████████████████████████████████████▌                                                                       | 209/430 [00:10<00:14, 15.64it/s]/Users/jon/repos/github.com/huggingface/transformers/src/transformers/deepspeed.py:24: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
  warnings.warn(
Building the MDX files:  58%|████████████████████████████████████████████████████████████████████████████████▏                                                          | 248/430 [00:13<00:09, 18.72it/s]
Traceback (most recent call last):
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/build_doc.py", line 197, in build_mdx_files
    content, new_anchors, source_files, errors = resolve_autodoc(
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/build_doc.py", line 123, in resolve_autodoc
    doc = autodoc(
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/autodoc.py", line 490, in autodoc
    methods = find_documented_methods(obj)
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/autodoc.py", line 431, in find_documented_methods
    superclasses = clas.mro()[1:]
  File "/Users/jon/repos/github.com/huggingface/transformers/src/transformers/utils/import_utils.py", line 1526, in __getattribute__
    requires_backends(cls, cls._backends)
  File "/Users/jon/repos/github.com/huggingface/transformers/src/transformers/utils/import_utils.py", line 1514, in requires_backends
    raise ImportError("".join(failed))
ImportError:
TFBertTokenizer requires the tensorflow_text library but it was not found in your environment. You can install it with pip as
explained here: https://www.tensorflow.org/text/guide/tf_text_intro.
Please note that you may need to restart your runtime after installation.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/jon/.pyenv/versions/transformers/bin/doc-builder", line 8, in <module>
    sys.exit(main())
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/commands/doc_builder_cli.py", line 47, in main
    args.func(args)
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/commands/preview.py", line 175, in preview_command
    source_files_mapping = build_doc(
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/build_doc.py", line 367, in build_doc
    anchors_mapping, source_files_mapping = build_mdx_files(
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/build_doc.py", line 230, in build_mdx_files
    raise type(e)(f"There was an error when converting {file} to the MDX format.\n" + e.args[0]) from e
ImportError: There was an error when converting docs/source/en/model_doc/bert.md to the MDX format.

TFBertTokenizer requires the tensorflow_text library but it was not found in your environment. You can install it with pip as
explained here: https://www.tensorflow.org/text/guide/tf_text_intro.
Please note that you may need to restart your runtime after installation.

tensorflow_text is unavailable on macOS.

The error occurs when doc-builder's autodoc calls mro() on the dummy TFBertTokenizer from transformers.utils.dummy_tensorflow_text_objects. Looking up mro on that class invokes __getattribute__ on DummyObject, the metaclass defined in transformers.utils.import_utils, which calls requires_backends, which in turn raises the ImportError.
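
The mechanism can be reproduced without transformers or tensorflow installed. The sketch below is a simplified illustration, not the real code: `DummyMeta`, `FakeTFTokenizer`, and `safe_superclasses` are hypothetical names, and the guard shown is only one possible way to skip such classes, not necessarily what the doc-builder PR does.

```python
class DummyMeta(type):
    """Hypothetical, simplified stand-in for transformers' DummyObject metaclass."""

    def __getattribute__(cls, key):
        # Underscore-prefixed attributes are served normally.
        if key.startswith("_"):
            return super().__getattribute__(key)
        # Any public attribute lookup, including `mro`, fails because the
        # backend is treated as missing in this sketch.
        raise ImportError(f"{cls.__name__} requires the {cls._backends[0]} library")


class FakeTFTokenizer(metaclass=DummyMeta):
    """Hypothetical dummy class guarded by an unavailable backend."""

    _backends = ["tensorflow_text"]


def safe_superclasses(clas):
    """Return the superclasses of `clas`, or an empty list when the lookup
    raises ImportError (assumed guard, not doc-builder's actual code)."""
    try:
        return clas.mro()[1:]
    except ImportError:
        return []


if __name__ == "__main__":
    try:
        FakeTFTokenizer.mro()
    except ImportError as err:
        print(f"mro() failed: {err}")
    print(safe_superclasses(FakeTFTokenizer))
```

Calling `FakeTFTokenizer.mro()` directly raises the ImportError, mirroring the traceback above, while the guarded helper lets a caller carry on with the remaining classes.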

Expected behavior

Either:

  1. Docs can be built, without auto-generated documentation for the platform-specific dependencies.
  2. Building documentation is not supported on macOS, and this is documented in https://github.com/huggingface/transformers/blob/main/docs/README.md

Option 1 is probably preferable. Documentation of platform-specific dependencies will still be built by CI on GitHub Actions.
