Disable curation & spreadsheet handling #33

Following extensive testing that showed oracle performed much better than classic docmatching, we disabled curation of daily and weekly docmatching in June 2024. The crontab that automatically looks for curated files was disabled in mid-June, but that left some of the curation infrastructure still active within the pipeline itself. The following still need to be disabled:
- uploading to Google Sheets
- broadcast of docmatching status via Slack
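
One way to switch both steps off without deleting the code is to gate them behind configuration flags. The sketch below is illustrative only: `PipelineConfig`, `post_match_steps`, and the flag names are hypothetical and are not taken from `run.py`; the upload and notify callables stand in for whatever the pipeline currently calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PipelineConfig:
    # Hypothetical flags; both default to off now that curation is disabled.
    upload_to_sheets: bool = False
    notify_slack: bool = False


def post_match_steps(
    result_file: str,
    config: PipelineConfig,
    upload: Optional[Callable[[str], None]] = None,
    notify: Optional[Callable[[str], None]] = None,
) -> List[str]:
    """Run the optional post-matching steps and return the names of those executed."""
    ran = []
    if config.upload_to_sheets and upload is not None:
        upload(result_file)  # stand-in for the Google Sheets upload
        ran.append("sheets")
    if config.notify_slack and notify is not None:
        notify(f"docmatching finished: {result_file}")  # stand-in for the Slack broadcast
        ran.append("slack")
    return ran
```

With the default configuration neither callable is invoked, so the matching step itself can keep running unchanged while the curation side effects stay off.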

Additionally, the existing docmatching process exports results to a backoffice file formatted for Google Sheets; see `adsdocmatch/match_w_metadata.py`, L86. Right now, the backoffice matching script reprocesses this file using `grep/sed/awk` to extract only the rows flagged as "Match" into a three-column file (preprint bibcode, published bibcode, and score), and it is this result that is uploaded to oracle (using the `-mf` option) on a daily/weekly basis.
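
For reference, the `grep/sed/awk` step could be replaced by writing the three-column file directly. The following is a minimal sketch, assuming the export is tab-delimited with a header row; the column names (`preprint_bibcode`, `published_bibcode`, `label`, `score`) and the function `extract_matches` are hypothetical and would need to be mapped to whatever `match_w_metadata.py` actually writes.

```python
import csv
import io


def extract_matches(
    src,
    dst,
    label_col="label",  # hypothetical column holding "Match" / other labels
    cols=("preprint_bibcode", "published_bibcode", "score"),  # hypothetical names
    delimiter="\t",
):
    """Copy rows labelled exactly "Match" from src to dst as three tab-separated columns.

    Returns the number of rows written.
    """
    reader = csv.DictReader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    count = 0
    for row in reader:
        if (row.get(label_col) or "").strip() == "Match":
            writer.writerow([(row.get(c) or "").strip() for c in cols])
            count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory demonstration with made-up identifiers.
    sample = io.StringIO(
        "preprint_bibcode\tpublished_bibcode\tlabel\tscore\n"
        "A\tB\tMatch\t0.98\n"
        "C\tD\tNot Match\t0.10\n"
    )
    out = io.StringIO()
    extract_matches(sample, out)
    print(out.getvalue(), end="")
```

The label comparison is an exact string match, so a row labelled "Not Match" is skipped, which a plain substring `grep Match` would not guarantee.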

We can handle the two items listed above easily with changes to `run.py`, and there is a PR in progress that addresses them. The export-format issue will require a small amount of extra coding both in this repository and in the backoffice `match` scripts; we will need to update classic so that it assumes the file it receives is already correctly formatted.
