JSON Data Cleanup #119

@superteece

The JSON word lists are not formatted consistently, which makes using them programmatically a tad more difficult than it should be. For example, "term": "blackhat-whitehat" is a combination entry covering two terms, whereas "term": "blast-radius" is a single term.
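To illustrate the ambiguity, here is a minimal sketch of one way to normalize such entries. Only the "term" key and the two example values come from the word list; the `terms` field, the `COMBINED_SLUGS` set, and the `normalize_entry` helper are hypothetical names made up for this example. Since a hyphen alone can't tell a combination entry from a single hyphenated term, the sketch assumes the combination entries are listed by hand.

```python
import json

# Hypothetical: slugs known to bundle two terms. These must be curated by hand,
# because a hyphen alone cannot distinguish them from single hyphenated terms.
COMBINED_SLUGS = {"blackhat-whitehat"}


def normalize_entry(entry):
    """Return a copy of entry with an explicit 'terms' list (assumed field name)."""
    slug = entry["term"]
    terms = slug.split("-") if slug in COMBINED_SLUGS else [slug]
    return {**entry, "terms": terms}


if __name__ == "__main__":
    entries = [{"term": "blackhat-whitehat"}, {"term": "blast-radius"}]
    print(json.dumps([normalize_entry(e) for e in entries], indent=2))
```

With an explicit list per entry, a consumer no longer has to guess whether a hyphen separates two terms or belongs to one.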

I've written a Semgrep rules generator that uses your word list and data to scan code projects for these terms and report each occurrence as a finding, ranked according to the term's tier. To do this, I also had to clean up the JSON. You can see the revamped word list here: https://gitlab.com/SuperTeece/inclusive-language-semgrep-rules/-/blob/595cb5cf1d27c9006f59832c3aa864a7827e947c/data/word-lists.json
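For anyone curious what such a generator might look like, here is a rough sketch, not the code from the linked repository. The `make_rule` helper, the rule id prefix, the message text, and the tier-to-severity mapping are all assumptions for illustration; the rule keys (`id`, `languages`, `severity`, `message`, `pattern-regex`) are standard Semgrep rule fields.

```python
import json
import re

# Hypothetical tier-to-severity mapping; not taken from the actual generator.
SEVERITY_BY_TIER = {1: "ERROR", 2: "WARNING", 3: "INFO"}


def make_rule(term, tier):
    """Build one Semgrep rule (as a dict) for a single cleaned-up term."""
    # Allow a hyphen, underscore, space, or nothing between the words of a term.
    parts = [re.escape(p) for p in term.split("-")]
    pattern = r"(?i)\b" + r"[-_ ]?".join(parts) + r"\b"
    return {
        "id": "inclusive-language-" + term,
        "languages": ["generic"],
        "severity": SEVERITY_BY_TIER[tier],
        "message": f"Found flagged term '{term}' (tier {tier}).",
        "pattern-regex": pattern,
    }


if __name__ == "__main__":
    # Semgrep configs are usually YAML; JSON output is shown here only to
    # avoid a third-party YAML dependency in the sketch.
    print(json.dumps({"rules": [make_rule("blast-radius", 1)]}, indent=2))
```

Note that this only works cleanly when every entry is a single term, which is why the combination entries had to be split first.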
