Updates to parse Heiner Eichmann's allged.ged #12

Open
wants to merge 58 commits into
base: master
58 commits
473e3f4
add cool enum & trait for generic add data
pirtleshell Mar 1, 2021
e702719
start refactor into parsing with Parsable trait
pirtleshell Mar 1, 2021
5926a7e
add Parsable trait to Event
pirtleshell Mar 1, 2021
b58a287
implement Parsable for Header
pirtleshell Mar 1, 2021
dc97950
add Parsable to Family
pirtleshell Mar 1, 2021
5032f17
add handle_unexpected_token debug func
pirtleshell Mar 19, 2021
ebc5339
skip some unhandled tags, don't panic!
pirtleshell Mar 19, 2021
e9f7661
better debug function names & more graceful skips
pirtleshell Mar 20, 2021
a6334b1
handle Y line value for events
pirtleshell Mar 20, 2021
b847200
handle all the events! :tada: :beers:
pirtleshell Mar 20, 2021
8875d60
successfully parse allged.ged with only warnings!
pirtleshell Mar 20, 2021
62ddf9a
allow repeat event facts for family
pirtleshell Mar 20, 2021
8f910e7
Merge branch 'refactor/parseable' into main
pirtleshell Mar 20, 2021
ed3adef
refactor out family_link to own file
pirtleshell Mar 20, 2021
95cdb7f
impl more Parseable types
pirtleshell Mar 20, 2021
644736b
use take_tag & it parses CustomTag strs
pirtleshell Mar 20, 2021
4ed9780
support event descriptors
pirtleshell Mar 20, 2021
04fe045
set & get level directly from parser
pirtleshell Mar 20, 2021
7bdb0bd
dbg() outputs level & skip_block requires no args
pirtleshell Mar 20, 2021
592f852
unhandled block skippers don't take level arg
pirtleshell Mar 20, 2021
83cc2e3
parse_citation does not need level arg
pirtleshell Mar 20, 2021
d94e706
remove more level fn signatures
pirtleshell Mar 20, 2021
c235cd3
Handle header structure, including header source
ge3224 Aug 28, 2022
d3e5029
Merge branch 'handle-head-sour'
ge3224 Nov 3, 2022
9e4b8c9
Add Parse trait and implement types.
ge3224 Nov 4, 2022
b0cec26
Merge branch 'parse-trait'
ge3224 Nov 8, 2022
4afc2b1
Resolve conflicts after merge with parse-trait
ge3224 Nov 8, 2022
d3dab7c
Fix warning in tests/json_feature.rs
ge3224 Nov 8, 2022
4be608c
Handle multimedia_record
ge3224 Nov 16, 2022
9e6491f
Modify comment in Multimedia struct
ge3224 Nov 20, 2022
77c127c
Modify README
ge3224 Nov 20, 2022
51bdfdc
Modify README
ge3224 Nov 20, 2022
5061062
Handle additional tags in submitter
ge3224 Nov 22, 2022
1d6cf1c
Make some util helper functions methods of Tokenizer
ge3224 Nov 23, 2022
dba368c
Format tokenizer and copyright
ge3224 Nov 25, 2022
123566e
Handle more tags in SourceCitation and Event structures
ge3224 Nov 25, 2022
3a03f8b
Refactor of several structs and tests
ge3224 Nov 26, 2022
8c6b26b
Add tests for Date
ge3224 Nov 26, 2022
c31c9a2
Modify test for Date
ge3224 Nov 26, 2022
9674aff
Handle LineValue for events, e.g. RESI
ge3224 Nov 26, 2022
f97c873
Modify Continuation and Gender datasets
ge3224 Nov 27, 2022
d222adc
Add Cremation event type
ge3224 Nov 27, 2022
d67b095
Modify FamilyLink dataset
ge3224 Nov 27, 2022
7ff27b5
Add handling for Submission Record
ge3224 Nov 28, 2022
88bebe2
Modify event, individual, family, and source
ge3224 Nov 28, 2022
3ba2fcd
Remove typo in Event
ge3224 Nov 28, 2022
6b08f25
Generalize documentation re: SubmissionRecord
ge3224 Nov 28, 2022
9540260
Modify example in SubmissionRecord docs
ge3224 Nov 28, 2022
db662d0
Clarify docs for Event
ge3224 Nov 28, 2022
551dcaf
Modify README
ge3224 Nov 28, 2022
842ef4e
Add handler for Individual Attributes
ge3224 Nov 28, 2022
98ceb18
Handle NOTE, CHAN for Individual; NOTE for FamilyLink
ge3224 Nov 28, 2022
c8146f0
Modify Family & Source datasets to handle more tags
ge3224 Nov 30, 2022
872806b
Modify README
ge3224 Nov 30, 2022
f6a10de
Refactor Parser implementations
ge3224 Dec 1, 2022
69706e4
Handle possible subset data of UserDefinedDatasets
ge3224 Dec 3, 2022
fd45867
Add more documentation individual types
ge3224 Dec 12, 2022
7f2cc55
Update README
ge3224 Jan 5, 2023
1 change: 1 addition & 0 deletions .gitignore
@@ -1 +1,2 @@
 /target
+.DS_Store
10 changes: 6 additions & 4 deletions Cargo.lock

Some generated files are not rendered by default.

66 changes: 66 additions & 0 deletions README.md
@@ -0,0 +1,66 @@
# rust-gedcom

<!-- <a href="https://crates.io/crates/gedcom"> -->
<!-- <img style="display: inline!important" src="https://img.shields.io/crates/v/gedcom.svg"></img> -->
<!-- </a> -->
<!-- <a href="https://docs.rs/gedcom"> -->
<!-- <img style="display: inline!important" src="https://docs.rs/gedcom/badge.svg"></img> -->
<!-- </a> -->

> A GEDCOM parser written in Rust 🦀

## About this project

GEDCOM is a file format for sharing genealogical information like family trees.

`rust-gedcom` hopes to be ~~fully~~ mostly compliant with the [Gedcom 5.5.1 specification](https://edge.fscdn.org/assets/img/documents/ged551-5bac5e57fe88dd37df0e153d9c515335.pdf).

Later specifications, such as [5.5.2](https://jfcardinal.github.io/GEDCOM-5.5.2/gedcom-5.5.2.html) and [7.0.11](https://gedcom.io/specifications/FamilySearchGEDCOMv7.html#purpose-and-content-of-the-familysearch-gedcom-specification), are useful in assessing which tags are worth supporting or not.

## Usage

This crate comes in two parts. The first is a binary called `parse_gedcom`, mostly used for testing & development. It prints the `GedcomData` object and some stats about the GEDCOM file passed into it:
```bash
parse_gedcom ./tests/fixtures/sample.ged

# outputs tree data here w/ stats
# ----------------------
# | Gedcom Data Stats: |
# ----------------------
# submissions: 0
# submitters: 1
# individuals: 3
# families: 2
# repositories: 1
# sources: 1
# multimedia: 0
# ----------------------
```

The second is a library containing the parser.

## JSON Serializing/Deserializing with `serde`
This crate has an optional feature called `json` that implements `Serialize` & `Deserialize` for the gedcom data structure. This allows you to easily integrate with the web.

For more info about serde, [check them out](https://serde.rs/)!

The feature is not enabled by default. The crate has zero dependencies when you use only the GEDCOM parsing functionality.

Use the json feature with any version >=0.2.1 by adding the following to your Cargo.toml:
```toml
gedcom = { version = "<version>", features = ["json"] }
```

## 🚧 Progress 🚧

Parts of the specification are not yet implemented, and the project is subject to change. Development proceeds by parsing a GEDCOM file and then fixing whatever errors or omissions turn up. In its current state, the parser handles [Heiner Eichmann's](http://heiner-eichmann.de/gedcom/allged.htm) [`allged.ged`](tests/fixtures/allged.ged) in its entirety.

Here are some notes about parsed data & tags. Page references are to the [Gedcom 5.5.1 specification](https://edge.fscdn.org/assets/img/documents/ged551-5bac5e57fe88dd37df0e153d9c515335.pdf).

### Top-level tags

Tags for families (`FAM`), individuals (`INDI`), repositories (`REPO`), sources (`SOUR`), and submitters (`SUBM`) are handled. Many of the most common sub-tags for these records are handled, though some are not yet parsed. Mileage may vary.
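
As a rough illustration of what a top-level record looks like in a GEDCOM file, here is a minimal sketch using only the standard library. It is not this crate's parser: `RecordKind`, `classify_record`, and `count_records` are hypothetical names used only for this example, and it assumes each record starts with a line of the form `0 @XREF@ TAG`.

```rust
/// Kinds of level-0 records named in the paragraph above.
/// (Hypothetical type for illustration; not part of the crate's API.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum RecordKind {
    Family,
    Individual,
    Repository,
    Source,
    Submitter,
    Other,
}

/// Classifies one GEDCOM line. Returns `None` unless the line starts a
/// level-0 record of the form `0 @XREF@ TAG [value]`.
pub fn classify_record(line: &str) -> Option<(String, RecordKind)> {
    let mut parts = line.split_whitespace();
    if parts.next()? != "0" {
        return None;
    }
    let xref = parts.next()?;
    // Lines such as `0 HEAD` or `0 TRLR` carry no cross-reference id.
    if !(xref.len() > 2 && xref.starts_with('@') && xref.ends_with('@')) {
        return None;
    }
    let kind = match parts.next()? {
        "FAM" => RecordKind::Family,
        "INDI" => RecordKind::Individual,
        "REPO" => RecordKind::Repository,
        "SOUR" => RecordKind::Source,
        "SUBM" => RecordKind::Submitter,
        _ => RecordKind::Other,
    };
    Some((xref.trim_matches('@').to_string(), kind))
}

/// Counts how many records of `kind` appear in `text`.
pub fn count_records(text: &str, kind: RecordKind) -> usize {
    text.lines()
        .filter_map(classify_record)
        .filter(|(_, k)| *k == kind)
        .count()
}
```

Calling `count_records(contents, RecordKind::Individual)` on a file's contents gives a count comparable to the `individuals` line in the stats output, under the assumptions stated above.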

## License

Licensed under [MIT](license.md).
77 changes: 0 additions & 77 deletions readme.md

This file was deleted.

10 changes: 6 additions & 4 deletions src/bin.rs
@@ -1,9 +1,11 @@
-use gedcom::parser::Parser;
-use gedcom::GedcomData;
+// use ged::{GedcomDocument, GedcomData};
+
 use std::env;
 use std::fs;
 use std::path::PathBuf;

+use gedcom::{GedcomData, GedcomDocument};
+
 fn main() {
     let args: Vec<String> = env::args().collect();
     match args.len() {
@@ -21,8 +23,8 @@ fn main() {
 let data: GedcomData;

 if let Ok(contents) = read_relative(filename) {
-    let mut parser = Parser::new(contents.chars());
-    data = parser.parse_record();
+    let mut doc = GedcomDocument::new(contents.chars());
+    data = doc.parse_document();

     println!("Parsing complete!");
     // println!("\n\n{:#?}", data);