This repository has been archived by the owner on Apr 12, 2019. It is now read-only.

Data Maintenance #79

Open
kordianbruck opened this issue Apr 17, 2017 · 1 comment

Comments

@kordianbruck
Contributor

kordianbruck commented Apr 17, 2017

Hey guys,

One thing that might still be open: can the scraping CLI command be run multiple times without inserting duplicates?

We want to maintain the data for at least the next 2-3 years, so running the CLI multiple times should check whether the rows are already in the database and update the existing entities accordingly.

We want to put this into a cron job and run it every week or month.

Thanks

@kordianbruck kordianbruck modified the milestones: 20.04 - Pre Hackathon, 23.04 Hackathon Apr 17, 2017
@sacdallago
Member

http://docs.sequelizejs.com/en/latest/api/model/#upsertvalues-options-promisecreated might be a place to start as a substitute for the inserts, but only if you have logic in place to match two documents. Eventually you might need to define a distance between two objects, dictated by the objects' fields (or features), and a threshold by which you consider two objects similar :) But for now, I would say that upsert instead of create should do, matching on name & birthdate?
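
To make the idea concrete, here is a rough sketch of what "upsert on name & birthdate" could look like. It is not the project's code and does not call Sequelize: a plain `Map` stands in for the table, and every name in it (`naturalKey`, `upsertPerson`, `similarity`, `findSimilar`, the field names, the sample records) is made up for illustration. The similarity function is a deliberately naive stand-in for the "distance plus threshold" idea, assuming exact equality per field.

```javascript
// In-memory stand-in for a database table. A real run would go through
// the ORM's upsert instead. All names below are hypothetical.

// Build a stable key from the two fields suggested above: name and birthdate.
// The name is trimmed, lower-cased and whitespace-collapsed so that small
// formatting differences between scrapes do not create a second row.
function naturalKey(person) {
  const name = person.name.trim().toLowerCase().replace(/\s+/g, " ");
  return `${name}|${person.birthdate}`;
}

// Insert the record if its key is new, otherwise merge the new fields into
// the existing row. Returns true when a row was created, false when updated.
function upsertPerson(table, person) {
  const key = naturalKey(person);
  const existing = table.get(key);
  table.set(key, { ...existing, ...person });
  return existing === undefined;
}

// Fraction of the compared fields on which two records agree exactly
// (0 = no field matches, 1 = all fields match).
function similarity(a, b, fields) {
  const matches = fields.filter((f) => a[f] !== undefined && a[f] === b[f]).length;
  return matches / fields.length;
}

// Return the first stored record whose similarity reaches the threshold,
// or undefined if none does.
function findSimilar(table, person, fields, threshold) {
  for (const row of table.values()) {
    if (similarity(row, person, fields) >= threshold) return row;
  }
  return undefined;
}

// Made-up sample data: the same person scraped twice with different formatting.
const table = new Map();
const scraped = [
  { name: "Jane Doe", birthdate: "1970-01-01", role: "member" },
  { name: " jane  doe", birthdate: "1970-01-01", role: "chair" },
];

// Simulate two cron runs over the same scraped data.
for (let run = 0; run < 2; run++) {
  for (const person of scraped) upsertPerson(table, person);
}

console.log(table.size); // 1: repeated runs update the row instead of duplicating it
```

With a real database the same effect would need a unique constraint on the matching columns, so that the upsert has something to conflict on; whether name & birthdate are unique enough for this data set is an open assumption.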
