Minor: logging ingest from files returns incorrect "percentage remaining" values #276

@seasidesparrow

Description

When processing a list of bibcodes from the command line (run.py --ignore-json-fingerprints -b@/tmp/bibcodes.txt), the logger writes a message after each batch of 100 bibcodes is processed, indicating both how many bibcodes remain and what percentage has been processed. However, the percentages are incorrect.

For example, when processing a list of 682 bibcodes, the log files contain the following:

"message": "There are 582 records left (5.8% completed)"
"message": "There are 482 records left (4.8% completed)"
"message": "There are 382 records left (3.8% completed)"
"message": "There are 282 records left (2.8% completed)"
"message": "There are 182 records left (1.8% completed)"
"message": "There are 82 records left (0.8% completed)"

The calculation is done at L87-91 of run.py:

        if i / step > j:
            logger.info('There are %s records left (%0.1f%% completed)'
                        % (len(records)-i, ((len(records)-i) / 100.0)))
            j = i / step
        i += bpj

The result of this logic is that the code prints the number of remaining bibcodes divided by 100, which is neither a percentage nor the number of bibcodes completed. The percentage completed is i divided by the step size, not the remaining count divided by 100. In this case, step is 6.82. So the logging statement should use len(records)-i for the number remaining, and i/step (the value that is assigned to j on the following line) for the percentage completed.
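
As a rough illustration of the corrected arithmetic, here is a minimal standalone sketch. It is not the code in run.py: the helper name `progress_message` and its parameters are hypothetical, and it assumes step equals the total record count divided by 100 (consistent with step being 6.82 for 682 records), so that i/step is the same as 100 * i / total.

```python
def progress_message(total, processed):
    """Build the progress log message (hypothetical helper, not part of run.py).

    total     -- number of records in the input list (len(records) in the issue)
    processed -- number of records handled so far (i in the issue)
    """
    remaining = total - processed
    # Percentage completed is processed / total * 100. Assuming
    # step == total / 100.0, this equals i / step from the snippet above.
    percent_done = 100.0 * processed / total if total else 100.0
    return 'There are %s records left (%0.1f%% completed)' % (remaining, percent_done)


# Replay the 682-bibcode example from this report, one message per 100 records.
for processed in range(100, 682, 100):
    print(progress_message(682, processed))
```

With 682 records, the first message would read "There are 582 records left (14.7% completed)" rather than "5.8% completed".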
