Releases: PatentsView/PatentsView-DB

Historic Parser

11 Dec 17:59
dcdd51b
v0.1

Raw Parser

Release for data update 06/30/2020

22 Sep 14:54
37395f9

Data Changes

  1. H patents are temporarily removed from the database pending reparsing. Currently, H patent numbers are incorrectly loaded with the corresponding check digits (from the raw data) included as part of the patent number. The data has been reparsed to remove the erroneous mapping and will be added to the database in the next update.
  2. Claims data has been reparsed to include newlines in the text, and dependent field extraction has been improved. Data from 1976 - 2000 (newlines, dependent field improvement) and 2005 - 2020 (newlines) has been posted on the bulk download page. Data from 2001 - 2004 is being processed and will be posted as it becomes available.

Pregrant Publications Data

  1. A beta version of USPTO's pre-grant publication data is now available at www.patentsview.org/download/pregrantpublications.html. Users should note that this is a pre-release product and may be missing data elements. We encourage users to report any issues they find in the data.

API Changes

  1. The API has moved to Amazon's Beanstalk platform, and consequently the URL has changed. The new URL is https://api.patentsview.org/. Previous URLs will redirect to the new URL, but POST requests to the previous URLs will not work. The redirection is a temporary failsafe, and users should switch to the new URL.
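Because POST requests do not survive the redirect, client code needs to target the new host directly. The following is a minimal sketch of building such a request with the Python standard library. The endpoint path (`/patents/query`) and the `q`/`f` payload keys are assumptions for illustration and are not stated in this release note; the request is only constructed here, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.patentsview.org"  # API host announced in this release
PATENTS_QUERY_PATH = "/patents/query"     # assumption: endpoint path is not given in the note


def build_post_request(query, fields, base_url=API_BASE):
    """Build (but do not send) a JSON POST request against the new API host."""
    # Assumption: the payload uses "q" for the query and "f" for the returned fields.
    payload = json.dumps({"q": query, "f": fields}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + PATENTS_QUERY_PATH,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_post_request(
    {"_gte": {"patent_date": "2020-01-01"}},
    ["patent_number", "patent_date"],
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Keeping the host in a single constant means a future URL change only needs one edit.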

Querytool Changes

  1. To reduce communication delays during a Querytool failure, we have implemented an email alert system. We hope this system will help us resolve Querytool errors more quickly.

Release for data update 03/31/2020

10 Jun 15:14

Bulk Download Changes

  • Line breaks retained in text data:
    • Claims: all text from 2001 and later will retain line breaks.
    • Brief Summary Text:
      * Data from 2020 and later will retain line breaks.
      * Line breaks will be added to older data at the first opportunity to reparse it.
    • Detailed Description Text:
      * Data from 2020 and later will retain line breaks.
      * Line breaks will be added to older data at the first opportunity to reparse it.
    • Draw Description Text: Line breaks are not included at this time.
  • Location ID added to patent_assignee and patent_inventor
    • Previously, identifying the location of a patent by way of its assignee required joining patent_assignee with location_assignee and then with the location table. A similar join was needed for the patent inventor. To reduce this complexity, the patent_assignee and patent_inventor tables will carry an additional field, location_id, which maps to the id field of the location table. This makes the data in location_assignee and location_inventor redundant, and future releases will not carry these two tables.
  • Read In Scripts:
    • Example Python & R scripts that demonstrate reading each bulk download file will be available here: Read In Scripts. This is a work in progress and will be updated over time.
  • Planned changes after the 2020.03.22v1 release (documentation and details will be added with the release):
    • Claims:
      • Remove duplicates in some of the claims yearly files where the first set of records (about 300K) are duplicated.
      • Remove NULL text data in some of the claims files.
      • Recode NUM field and add documentation.
      • Recode Exemplary field (replacing TRUE/FALSE with 0/1)
      • Re-order header to be consistent with data dictionary
    • Brief Summary Text:
      • Break files into yearly files
    • Draw Description Text:
      • Break files into yearly files
      • Include line breaks in the text
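
The location_id change described in the list above can be illustrated with a small in-memory SQLite sketch. The table layouts are simplified assumptions, the join between patent_assignee and location_assignee on assignee_id is assumed rather than stated in these notes, and all rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified, hypothetical layouts; the real tables carry more columns.
    CREATE TABLE location (id TEXT PRIMARY KEY, city TEXT, state TEXT);
    CREATE TABLE location_assignee (location_id TEXT, assignee_id TEXT);
    CREATE TABLE patent_assignee (patent_id TEXT, assignee_id TEXT, location_id TEXT);
    """
)
conn.executemany(
    "INSERT INTO location VALUES (?, ?, ?)",
    [("loc1", "Durham", "NC"), ("loc2", "Austin", "TX")],
)
conn.executemany(
    "INSERT INTO location_assignee VALUES (?, ?)",
    [("loc1", "a1"), ("loc2", "a2")],
)
conn.executemany(
    "INSERT INTO patent_assignee VALUES (?, ?, ?)",
    [("p1", "a1", "loc1"), ("p2", "a2", "loc2")],
)

# Previous approach: go through the location_assignee bridge table.
OLD_QUERY = """
    SELECT pa.patent_id, l.city, l.state
    FROM patent_assignee AS pa
    JOIN location_assignee AS la ON la.assignee_id = pa.assignee_id
    JOIN location AS l ON l.id = la.location_id
    ORDER BY pa.patent_id
"""

# New approach: join directly on patent_assignee.location_id.
NEW_QUERY = """
    SELECT pa.patent_id, l.city, l.state
    FROM patent_assignee AS pa
    JOIN location AS l ON l.id = pa.location_id
    ORDER BY pa.patent_id
"""

old_rows = conn.execute(OLD_QUERY).fetchall()
new_rows = conn.execute(NEW_QUERY).fetchall()
print(old_rows == new_rows)
```

On this toy data both queries return the same rows; the direct join simply drops one table from the query. The same pattern would apply to patent_inventor and location_inventor.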

| Table | File(s) | Data Contains Line Break | Field Separator | Quote Settings | Quote Character |
|---|---|---|---|---|---|
| claims | Yearly files from 1976 - 2005 | No | \t | Non Numeric Fields Quoted | " |
| claims | Yearly files from 2005 - 2020 | Yes | \t | Non Numeric Fields Quoted | " |
| brf_sum_text | Single bulk file | Yes | \t | Non Numeric Fields Quoted | " |
| detail_desc_text | 2020 data file | Yes | \t | Non Numeric Fields Quoted | " |
| detail_desc_text | 2019 data file | No | \t | Non Numeric Fields Quoted | " |
| detail_desc_text | Yearly files from 1976 - 2018 | No | \t | Unquoted | N/A |
| draw_desc_text | Single bulk file | No | \t | Non Numeric Fields Quoted | " |
| all other tables | Single bulk file | No | \t | Non Numeric Fields Quoted | " |
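
The file settings listed above matter mostly when a quoted text field contains a line break, since a naive line-by-line reader would split one record into two. Below is a minimal sketch using Python's standard csv module; it is not the project's official read-in script, and the column names and sample rows are hypothetical.

```python
import csv
import io


def read_bulk_rows(handle, quoted=True):
    """Yield rows from a tab-separated bulk download file as lists of strings.

    quoted=True:  fields may be wrapped in double quotes and may span lines.
    quoted=False: the file has no quoting, so quote characters are literal text.
    """
    if quoted:
        reader = csv.reader(handle, delimiter="\t", quotechar='"')
    else:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    yield from reader


# Hypothetical sample: the second record's text field contains a line break.
SAMPLE = (
    '"uuid"\t"patent_id"\t"text"\t"sequence"\n'
    '"a1"\t"10000000"\t"1. A widget."\t1\n'
    '"a2"\t"10000000"\t"2. The widget of claim 1,\nwherein the widget is blue."\t2\n'
)

rows = list(read_bulk_rows(io.StringIO(SAMPLE, newline="")))
header, records = rows[0], rows[1:]
print(len(records))
print(repr(records[1][2]))
```

When reading a real file, open it with `newline=""` so the csv module sees embedded line breaks unchanged. Very long text fields may also require raising the parser limit with `csv.field_size_limit`.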