ValueError: filedescriptor out of range in select() #97

@vpv-csc

We are doing digital preservation. In some cases we scrape metadata from thousands of image files in the same Python process. As far as I understand, pyexiftool handles multiple files in -stay_open mode. We frequently see the error "ValueError: filedescriptor out of range in select()" in production:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/exiftool/exiftool.py", line 812, in run
    self._ver = self._parse_ver()
  File "/usr/lib/python3.9/site-packages/exiftool/exiftool.py", line 1199, in _parse_ver
    return self.execute("-ver").strip()
  File "/usr/lib/python3.9/site-packages/exiftool/helper.py", line 132, in execute
    result: Union[str, bytes] = super().execute(*str_bytes_params, **kwargs)
  File "/usr/lib/python3.9/site-packages/exiftool/exiftool.py", line 1009, in execute
    raw_stdout = _read_fd_endswith(fdout, seq_ready.encode(self._encoding), self._block_size)
  File "/usr/lib/python3.9/site-packages/exiftool/exiftool.py", line 142, in _read_fd_endswith
    inputready, outputready, exceptready = select.select([fd], [], [])
ValueError: filedescriptor out of range in select()
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/bin/check-sip-digital-objects-3", line 8, in <module>
    sys.exit(main())
  File "/usr/lib/python3.9/site-packages/ipt/scripts/check_sip_digital_objects.py", line 39, in main
    report = validation_report(
  File "/usr/lib/python3.9/site-packages/ipt/scripts/check_sip_digital_objects.py", line 448, in validation_report
    for result in validation(mets_path=mets_path, catalog_path=catalog_path):
  File "/usr/lib/python3.9/site-packages/ipt/scripts/check_sip_digital_objects.py", line 356, in validation
    yield _validate(metadata_info)
  File "/usr/lib/python3.9/site-packages/ipt/scripts/check_sip_digital_objects.py", line 328, in _validate
    scraper_result, streams, grade = check_well_formed(
  File "/usr/lib/python3.9/site-packages/ipt/scripts/check_sip_digital_objects.py", line 174, in check_well_formed
    (mime, version) = scraper.detect_filetype()
  File "/usr/lib/python3.9/site-packages/file_scraper/scraper.py", line 242, in detect_filetype
    self._identify()
  File "/usr/lib/python3.9/site-packages/file_scraper/scraper.py", line 77, in _identify
    self._update_filetype(exiftool_detector)
  File "/usr/lib/python3.9/site-packages/file_scraper/scraper.py", line 89, in _update_filetype
    tool.detect()
  File "/usr/lib/python3.9/site-packages/file_scraper/detectors.py", line 339, in detect
    with exiftool.ExifToolHelper() as et:
  File "/usr/lib/python3.9/site-packages/exiftool/exiftool.py", line 317, in __enter__
    self.run()
  File "/usr/lib/python3.9/site-packages/exiftool/helper.py", line 150, in run
    super().run()
  File "/usr/lib/python3.9/site-packages/exiftool/exiftool.py", line 816, in run
    raise ExifToolVersionError(f"Error retrieving Exiftool info.  Is your Exiftool version ('exiftool -ver') >= required version ('{constants.EXIFTOOL_MINIMUM_VERSION}')?")
exiftool.exceptions.ExifToolVersionError: Error retrieving Exiftool info.  Is your Exiftool version ('exiftool -ver') >= required version ('12.15')?

If you happen to be interested in the check-sip-digital-objects(-3) command seen in the backtrace, that's here: https://github.com/Digital-Preservation-Finland/dpres-ipt
And our scraping tool is here: https://github.com/Digital-Preservation-Finland/file-scraper/

We are running pyexiftool 0.5.5, which we packaged ourselves. As far as we can tell, 0.5.6 does not change anything related to this issue.
Exiftool is 12.70.
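
To check whether the process really holds descriptors numbered 1024 or higher when the error occurs, something like the following could be called just before creating the ExifToolHelper. This is a minimal diagnostic sketch, not part of pyexiftool: the function name open_fd_numbers is hypothetical, and it assumes a platform that exposes open descriptors under /proc/self/fd (Linux) or /dev/fd.

```python
import os


def open_fd_numbers():
    """Return the sorted descriptor numbers currently open in this process.

    Hypothetical helper. It relies on /proc/self/fd (Linux) or /dev/fd being
    available and returns an empty list if neither can be listed. The listing
    may also include the temporary descriptor used to read the directory.
    """
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        try:
            names = os.listdir(fd_dir)
        except OSError:
            continue
        return sorted(int(name) for name in names if name.isdigit())
    return []


if __name__ == "__main__":
    fds = open_fd_numbers()
    highest = fds[-1] if fds else None
    print(f"{len(fds)} descriptors open, highest number: {highest}")
```

If the highest number keeps growing while files are processed, descriptors are accumulating somewhere in the process, and any new pipe will eventually get a number that select() cannot handle.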

The same issue was reported earlier on the ExifTool forum: https://exiftool.org/forum/index.php?topic=11067.0

man 2 select says:

WARNING: select() can monitor only file descriptors numbers that are less than FD_SETSIZE (1024)—an unreasonably low limit for many modern applications—and this limitation will not change. All modern applications should instead use poll(2) or epoll(7), which do not suffer this limitation.
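
The limitation and the suggested alternative can be illustrated with the standard library alone. The sketch below is not pyexiftool's code: wait_readable and demo_high_fd are hypothetical names, and it assumes a POSIX platform where select.poll is available and FD_SETSIZE is 1024. It duplicates a pipe onto a descriptor numbered 1024 or higher, then waits on it with both select.select() and select.poll().

```python
import fcntl
import os
import select

FD_SETSIZE = 1024  # assumed value (typical on Linux), not queried from libc


def wait_readable(fd, timeout_ms=None):
    """Wait until fd reports an event (data or hang-up) using poll().

    Returns False on timeout. Unlike select(), poll() has no upper bound
    on the descriptor number.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return any(ready_fd == fd for ready_fd, _ in poller.poll(timeout_ms))


def demo_high_fd():
    """Return (select_failed, poll_ok) for a descriptor >= FD_SETSIZE.

    Returns None if the process limit on open files is too low to create
    such a descriptor.
    """
    read_fd, write_fd = os.pipe()
    high_fd = None
    try:
        try:
            # Duplicate onto the lowest free descriptor >= FD_SETSIZE.
            high_fd = fcntl.fcntl(read_fd, fcntl.F_DUPFD, FD_SETSIZE)
        except OSError:
            return None
        os.write(write_fd, b"x")
        try:
            select.select([high_fd], [], [], 0)
            select_failed = False
        except ValueError:
            # "filedescriptor out of range in select()"
            select_failed = True
        return select_failed, wait_readable(high_fd, 1000)
    finally:
        for fd in (read_fd, write_fd, high_fd):
            if fd is not None:
                os.close(fd)


if __name__ == "__main__":
    print(demo_high_fd())
```

A poll-based wait like this could stand in for the select.select([fd], [], []) call in _read_fd_endswith on POSIX systems. The selectors module from the standard library is another option, since it picks an available mechanism automatically.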

Labels: enhancement, question