(feat): expand scope of information #30

@seanmcbroom

Purpose

Expanding the database with data from additional sources (example sentences, pitch accent data, estimated JLPT levels) would make it more versatile and valuable. If this information is integrated directly into the source, developers won’t need to build parallel processes later to fetch or merge external data.

Implementation Idea

Treat enrichment as a post-processing step that runs after JMdict is processed. Create a separate processor for each dataset (pitch accent, JLPT lists, example sentence corpora). Each processor attaches its supplemental data to the existing entries in a structured, consistent way. JLPT levels are unofficial, but they can be estimated from vocabulary used in past exams or from publicly available frequency lists.

This approach ensures modularity (each processor handles its own data source) while keeping the final dataset unified and easy to query.
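
To make the idea concrete, here is a minimal sketch of what per-dataset processors could look like. Everything in it is an assumption for illustration: the `Entry` shape, the `Enricher` protocol, the `extra` field, and the sample lookup tables are hypothetical and do not reflect the project's current types or any real data file format.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Protocol


@dataclass
class Entry:
    """Simplified stand-in for a processed JMdict entry (hypothetical shape)."""
    id: int
    kanji: list[str]
    readings: list[str]
    # Supplemental data, keyed by the name of the processor that produced it.
    extra: dict[str, Any] = field(default_factory=dict)


class Enricher(Protocol):
    """One processor per external dataset (hypothetical interface)."""
    name: str

    def enrich(self, entry: Entry) -> Any | None:
        """Return supplemental data for the entry, or None if there is none."""
        ...


class JlptEnricher:
    name = "jlpt"

    def __init__(self, levels: dict[str, int]) -> None:
        # Maps a written form or reading to an estimated JLPT level.
        self.levels = levels

    def enrich(self, entry: Entry) -> Any | None:
        for form in entry.kanji + entry.readings:
            if form in self.levels:
                # Flag the level as an estimate, since the data is unofficial.
                return {"level": self.levels[form], "official": False}
        return None


class PitchAccentEnricher:
    name = "pitch_accent"

    def __init__(self, accents: dict[tuple[str, str], list[int]]) -> None:
        # Keyed by (written form, reading) to disambiguate homographs.
        self.accents = accents

    def enrich(self, entry: Entry) -> Any | None:
        result: dict[str, list[int]] = {}
        for reading in entry.readings:
            # Kana-only entries have no kanji form, so fall back to the reading.
            for form in entry.kanji or [reading]:
                pattern = self.accents.get((form, reading))
                if pattern is not None:
                    result[reading] = pattern
                    break
        return result or None


def run_enrichment(entries: list[Entry], enrichers: list[Enricher]) -> list[Entry]:
    """Post-processing step: attach each dataset's output under its own key."""
    for entry in entries:
        for enricher in enrichers:
            data = enricher.enrich(entry)
            if data is not None:
                entry.extra[enricher.name] = data
    return entries


# Illustrative sample data only, not taken from a real dataset file.
entries = [Entry(1, ["食べる"], ["たべる"]), Entry(2, [], ["ああ"])]
enrichers = [
    JlptEnricher({"食べる": 5}),
    PitchAccentEnricher({("食べる", "たべる"): [2]}),
]
run_enrichment(entries, enrichers)
```

Because each processor writes under its own key, adding another dataset (for example, example sentences) would only mean adding another class, and entries with no match are left untouched.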

Labels: enhancement (New feature or request)