Listed Variant ClinVar IDs return 404 #1179

Open
korikuzma opened this issue Jan 23, 2025 · 2 comments · May be fixed by #1182

Comments

@korikuzma

I was talking with @larrybabb and we noticed that some variants list several ClinVar IDs, not all of which resolve. For example, CIViC variant ID 33 lists the ClinVar IDs 16609, 376282, and 376280; the last two return 404 Not Found. The BRAF V600E variant has the same issue, so we assume other variants also list ClinVar IDs that do not exist. We're not sure what these old ClinVar IDs referred to. Should the lists be updated to include only ClinVar IDs that do not return 404?

@acoffman
Member

Interesting find! The ClinVar IDs are a manually curated field, and it appears that the two now returning 404 were added in 2019:

Image

So they presumably existed at one point but no longer do. I am not familiar with how ClinVar handles record deletion, but it looks like not even a placeholder is left afterwards.

Generally speaking, for manually curated fields, each change has to be proposed and then reviewed by an editor before it is accepted. In cases like this, where some automation is possible, we can have CIViCbot propose changes for editors to review. It would be nice if ClinVar produced some sort of feed of record deletions, but I'm not seeing anything. I can check how many of the ClinVar IDs we have in CIViC have since been removed, and have CIViCbot suggest revisions to the affected records.

Depending on how many we find, it may make sense to have this be a recurring check that happens periodically.
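For illustration only, a recurring check along these lines could be sketched as below. This is not CIViC's actual implementation: the URL pattern, the use of a `HEAD` request, and the names `http_status` and `find_missing_ids` are all assumptions, and a real job would need to respect NCBI's rate limits. The status lookup is injectable so the logic can be exercised without network access.

```python
from typing import Callable, Iterable
import urllib.error
import urllib.request

# Assumed URL pattern for a ClinVar variation page; verify before relying on it.
CLINVAR_URL = "https://www.ncbi.nlm.nih.gov/clinvar/variation/{id}/"


def http_status(clinvar_id: int, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a ClinVar variation page.

    Assumes the server answers HEAD requests; switch to GET if it does not.
    """
    request = urllib.request.Request(
        CLINVAR_URL.format(id=clinvar_id), method="HEAD"
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is carried on the error.
        return err.code


def find_missing_ids(
    clinvar_ids: Iterable[int],
    get_status: Callable[[int], int] = http_status,
) -> list[int]:
    """Return IDs whose page responds with 404, in input order, de-duplicated."""
    missing = []
    seen = set()
    for clinvar_id in clinvar_ids:
        if clinvar_id in seen:
            continue
        seen.add(clinvar_id)
        if get_status(clinvar_id) == 404:
            missing.append(clinvar_id)
    return missing
```

With a stubbed lookup that mirrors the report above (`{16609: 200, 376282: 404, 376280: 404}`), `find_missing_ids([16609, 376282, 376280], get_status=stub.__getitem__)` returns `[376282, 376280]`. The resulting list would then feed whatever mechanism proposes revisions for editor review.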

@acoffman
Member

acoffman commented Jan 28, 2025

It looks like we link to the following ClinVar IDs which no longer exist:

376090
376340
376137
375939
376069
376288
375880
375879
375883
376282
376280
375991
375992
375990
376205
376343
375995
375890
375892
375988
375968
375967
375871
375875
376055
376250
376345
376141
376142
376143
376144
375878
376218
76768
76769
376168
376133
13859
375966
376324
376349
376350
376351
376352
376094
376353
376354
376357
376358
376359
375938
376136
376362
375971
375973
376341
376081
376189
375957
376157
376156
376317
375888
376001
376070
375942
76687
40471
375877
376221
375980
375934
376167
376075
226028
376181
376183
375897
376096
376245
376053
375912
376101
376208
376079
376085
376086
376087
376089
376124
376295
376117
375977
375900
376281
376095
376251
376122
376120
376211
376119
375943
171079
376003
376047
376118
376123
376121
376435
376248
376219

Given how close together many of these IDs are, I assume a lot of this comes from the removal of the DoCM data set, though not all of it. I will have a PR open shortly that will automatically flag the involved Variants for review.
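The observation that many of the IDs sit close together can be checked mechanically by grouping them into runs. The sketch below is a hypothetical helper, not part of CIViC; the `max_gap` threshold of 1000 is an arbitrary assumption, and a tight cluster only suggests a shared origin rather than proving the IDs came from DoCM.

```python
def group_nearby_ids(ids, max_gap=1000):
    """Group IDs into runs where neighbouring sorted IDs differ by at most max_gap."""
    groups = []
    for value in sorted(set(ids)):
        if groups and value - groups[-1][-1] <= max_gap:
            # Close enough to the previous ID: extend the current run.
            groups[-1].append(value)
        else:
            # Gap too large (or first ID): start a new run.
            groups.append([value])
    return groups


# A small sample taken from the list above.
sample = [376282, 376280, 13859, 76768, 76769, 226028]
for run in group_nearby_ids(sample):
    print(run[0], run[-1], len(run))
```

Runs with many members point to a batch of records that were probably added and removed together, while singletons such as 13859 or 226028 may need individual investigation.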

acoffman linked a pull request on Jan 28, 2025 that will close this issue.