merged (again??)!
Franz Matthies committed Sep 23, 2016
2 parents 4cefa8c + 2d3f8ea commit 6b18534
Showing 25 changed files with 122 additions and 11 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,5 +1,6 @@

# /
*.tmp
.settings
.project
.classpath
7 changes: 6 additions & 1 deletion jcore-ace-reader/README.md
@@ -1,5 +1,10 @@
# JCoRe ACE Collection Reader
A Collection Reader that converts ACE-XML ([Automatic Content Extraction](https://www.ldc.upenn.edu/collaborations/past-projects/ace)) files to CAS objects.

**Descriptor Path**:
```
de.julielab.jcore.reader.ace.desc.jcore-ace-reader
```

### Objective
The JULIE Lab ACE Reader is a UIMA Collection Reader (CR). It reads the English section of the ACE 2005 Multilingual Training Corpus data, which is given as XML files, and converts it to types defined in the UIMA type system that we provide as well.
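The descriptor path added above is a dotted name rather than a file path. A minimal sketch of how such a name could be mapped to a classpath resource, assuming it follows UIMA's import-by-name convention (dots become slashes and `.xml` is appended). The class `DescriptorNames` and its methods are hypothetical helpers for illustration and are not part of JCoRe:

```java
/** Hypothetical helper (not part of JCoRe) for dotted UIMA descriptor names. */
final class DescriptorNames {
    private DescriptorNames() {}

    /**
     * Maps a dotted descriptor name to a classpath resource path,
     * assuming the UIMA import-by-name convention: '.' becomes '/'
     * and ".xml" is appended.
     */
    static String toResourcePath(String descriptorName) {
        if (descriptorName == null || descriptorName.isEmpty()) {
            throw new IllegalArgumentException("descriptor name must not be empty");
        }
        return descriptorName.replace('.', '/') + ".xml";
    }

    /** Returns true if the descriptor can be found on the current classpath. */
    static boolean isOnClasspath(String descriptorName) {
        ClassLoader loader = DescriptorNames.class.getClassLoader();
        return loader.getResource(toResourcePath(descriptorName)) != null;
    }

    public static void main(String[] args) {
        String name = "de.julielab.jcore.reader.ace.desc.jcore-ace-reader";
        System.out.println(toResourcePath(name));
        System.out.println("found on classpath: " + isOnClasspath(name));
    }
}
```

Under this assumption, the same mapping would apply to every descriptor path listed in the README files of this commit.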
7 changes: 6 additions & 1 deletion jcore-acronym-ae/README.md
@@ -1,5 +1,10 @@
# JCoRe Acronym Analysis Engine
This is a reimplementation of the Schwartz and Hearst Algorithm for the resolution of acronyms (short form → long form).

**Descriptor Path**:
```
de.julielab.jcore.ae.acronymtagger.desc.jcore-acronym-ae
```

### Objective
JULIE Lab Acronym Annotator (JACRO) is a UIMA Analysis Engine that annotates acronyms with their full forms when these are introduced locally in the current document. The functionality of the engine is based on the simple algorithm for abbreviation recognition by Schwartz and Hearst. We have reimplemented the algorithm and extended it with respect to some pattern definitions and normalizations.
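For orientation, the core matching step of the Schwartz and Hearst algorithm can be sketched as follows. This follows the published algorithm only; it is not JACRO's code and omits the extended patterns and normalizations mentioned above. The class name `AcronymSketch` is hypothetical:

```java
/** Sketch of the Schwartz and Hearst long-form matching step (not JACRO's code). */
final class AcronymSketch {
    private AcronymSketch() {}

    /**
     * Scans shortForm and longForm from right to left. Every letter or digit
     * of the short form must occur in the long form in the same order, and the
     * first short-form character must start a word. Returns the shortest
     * matching suffix of longForm, or null if there is no match.
     */
    static String findBestLongForm(String shortForm, String longForm) {
        int lIndex = longForm.length() - 1;
        for (int sIndex = shortForm.length() - 1; sIndex >= 0; sIndex--) {
            char c = Character.toLowerCase(shortForm.charAt(sIndex));
            if (!Character.isLetterOrDigit(c)) {
                continue; // punctuation in the short form is ignored
            }
            // Move left until the character matches; the first short-form
            // character must additionally sit at the beginning of a word.
            while ((lIndex >= 0 && Character.toLowerCase(longForm.charAt(lIndex)) != c)
                    || (sIndex == 0 && lIndex > 0
                        && Character.isLetterOrDigit(longForm.charAt(lIndex - 1)))) {
                lIndex--;
            }
            if (lIndex < 0) {
                return null; // ran out of long-form text
            }
            lIndex--;
        }
        // Cut the candidate at the start of the word where matching ended.
        lIndex = longForm.lastIndexOf(' ', lIndex) + 1;
        return longForm.substring(lIndex);
    }

    public static void main(String[] args) {
        System.out.println(findBestLongForm("CAS", "the Common Analysis Structure"));
    }
}
```

The right-to-left scan is what keeps the algorithm simple: it needs no training data and only one pass over the candidate text preceding the parenthesized short form.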
7 changes: 6 additions & 1 deletion jcore-bionlp09event-consumer/README.md
@@ -1,5 +1,10 @@
# JCoRe BioNLP 09 Event Consumer
Consumer that writes CAS annotations into the [BioNLP Shared Task](http://www.nactem.ac.uk/tsujii/GENIA/SharedTask/index.shtml#data) format.

**Descriptor Path**:
```
de.julielab.jcore.consumer.bionlp09event.desc.jcore-bionlp09event-consumer
```

### Objective
This consumer takes the annotations specified in **Capabilities** and outputs three separate text files for each document to `outDirectory`. The text files follow the format of the [BioNLP Shared Task](http://www.nactem.ac.uk/tsujii/GENIA/SharedTask/index.shtml#data) and can therefore be evaluated with the shared task's evaluation tool or, for the test files, with its online evaluation.
7 changes: 6 additions & 1 deletion jcore-bionlp09event-reader/README.md
@@ -1,5 +1,10 @@
# JCoRe BioNLP 09 Event Reader
Reader that converts [BioNLP Shared Task](http://www.nactem.ac.uk/tsujii/GENIA/SharedTask/index.shtml#data) formatted files to CAS objects.

**Descriptor Path**:
```
de.julielab.jcore.reader.bionlp09event.desc.jcore-bionlp09event-reader
```

### Objective
This reader takes as input a folder with `txt`,`a1` & `a2` files - format of the [BioNLP Shared Task](http://www.nactem.ac.uk/tsujii/GENIA/SharedTask/index.shtml#data) - and creates CAS annotations as described in **Capabilities**. Therefore it serves e.g. as reader in [relation extraction pipelines](https://github.com/JULIELab/jcore-pipelines/tree/master/jcore-relation-extraction-pipeline).
7 changes: 6 additions & 1 deletion jcore-cas2iob-consumer/README.md
@@ -1,5 +1,10 @@
# JCoRe CAS2IOB Consumer
Consumer that generates IOB formatted files for specified annotations.

**Descriptor Path**:
```
de.julielab.jcore.consumer.cas2iob.desc.jcore-cas2iob-consumer
```

### Objective
This consumer writes the annotations in a UIMA CAS out to the IOB format. If two annotations compete for the same token, the annotation with the longer span is preferred.
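The longer-span rule can be illustrated with a small, self-contained sketch. It is not the consumer's actual implementation: the `IobSketch` and `Span` classes are hypothetical, spans are given as token indices with an exclusive end, and the sketch assumes that a shorter annotation is dropped entirely when any of its tokens is already covered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative IOB tagging with a longest-span-wins rule (not the consumer's code). */
final class IobSketch {
    private IobSketch() {}

    /** Annotation over tokens [begin, end) with an entity label. */
    static final class Span {
        final int begin;
        final int end;
        final String label;

        Span(int begin, int end, String label) {
            this.begin = begin;
            this.end = end;
            this.label = label;
        }

        int length() {
            return end - begin;
        }
    }

    /** Returns one IOB tag per token; uncovered tokens are tagged "O". */
    static String[] toIob(int tokenCount, List<Span> spans) {
        String[] tags = new String[tokenCount];
        Arrays.fill(tags, "O");
        boolean[] taken = new boolean[tokenCount];

        // Longest spans first; the stable sort keeps input order for ties.
        List<Span> sorted = new ArrayList<>(spans);
        sorted.sort((a, b) -> Integer.compare(b.length(), a.length()));

        for (Span s : sorted) {
            boolean free = true;
            for (int i = s.begin; i < s.end; i++) {
                if (taken[i]) {
                    free = false;
                    break;
                }
            }
            if (!free) {
                continue; // assumption: overlapping shorter spans are dropped
            }
            for (int i = s.begin; i < s.end; i++) {
                tags[i] = (i == s.begin ? "B-" : "I-") + s.label;
                taken[i] = true;
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(
                new Span(0, 1, "GENE"),
                new Span(0, 3, "PROTEIN"),
                new Span(4, 5, "GENE"));
        System.out.println(String.join(" ", toIob(5, spans)));
    }
}
```

With the sample input, the one-token `GENE` span at position 0 loses against the three-token `PROTEIN` span, while the non-overlapping `GENE` span at position 4 is kept.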
10 changes: 9 additions & 1 deletion jcore-coordination-baseline-ae/README.md
@@ -1,5 +1,13 @@
# JCoRe Coordination Baseline Analysis Engine
This is the baseline to the [JCoRe Coordination Analysis Engine](https://github.com/JULIELab/jcore-base/tree/issue7-coordFix/jcore-coordination-ae). As of now the Coordination AE is only available in its baseline form for the JCoRe package. The full-fledged AE is work-in-progress and not yet ready for distribution (see issue 7).

**Descriptor Path**:
```
de.julielab.jcore.ae.coordbaseline.desc.jcore-coordination-baseline-ae-conjunct
de.julielab.jcore.ae.coordbaseline.desc.jcore-coordination-baseline-ae-coordination
de.julielab.jcore.ae.coordbaseline.desc.jcore-coordination-baseline-ae-eee
de.julielab.jcore.ae.coordbaseline.desc.jcore-coordination-baseline-ae-ellipsis
```

### Objective

7 changes: 6 additions & 1 deletion jcore-dta-reader/README.md
@@ -1,6 +1,11 @@
# JCoRe DTA Collection Reader
Reader for DTA files (German digital humanities corpus).
DTA uses a TEI variant, cf. http://www.deutschestextarchiv.de/doku/basisformat

**Descriptor Path**:
```
de.julielab.jcore.reader.dta.desc.jcore-dta-reader
```

### Objective

7 changes: 6 additions & 1 deletion jcore-file-reader/README.md
@@ -1,4 +1,9 @@
JCoRe File Reader for reading in text files.

**Descriptor Path**:
```
de.julielab.jcore.reader.file.desc.jcore-file-reader
```

### Objective
This is a reader for reading in text files, providing them to UIMA for further processing.
7 changes: 6 additions & 1 deletion jcore-iexml-consumer/README.md
@@ -1,2 +1,7 @@
# JCoRe IEXML Consumer
Consumer that generates stand-off IEXML files as used in the mantra project/challenge

**Descriptor Path**:
```
de.julielab.jcore.consumer.iexml.desc.jcore-iexml-consumer
```
7 changes: 6 additions & 1 deletion jcore-iexml-reader/README.md
@@ -1,5 +1,10 @@
# JCoRe IEXML Collection Reader
Reader for IEXML files as used in the mantra project/challenge

**Descriptor Path**:
```
de.julielab.jcore.reader.iexml.desc.jcore-iexml-reader
```

### Objective

@@ -137,6 +137,8 @@ public void train(InstanceList instList, Pipe dataPipe) {
CRFTrainerByLabelLikelihood crfTrainer = new CRFTrainerByLabelLikelihood(model);

// do the training with unlimited amount of iterations
// --> refrained from using modified version of mallet;
// it's now the original source
boolean b = crfTrainer.train(instList);
LOGGER.info("SentencesSplitter training: model converged: " + b);

2 changes: 2 additions & 0 deletions jcore-jtbd-ae/src/main/java/de/julielab/jtbd/Tokenizer.java
@@ -312,6 +312,8 @@ void train(final InstanceList instList, final Pipe myPipe) {
model);

// do the training with unlimited amount of iterations
// --> refrained from using modified version of mallet;
// it's now the original source
final boolean b = crfTrainer.train(instList);
LOGGER.info("Tokenizer training: model converged: " + b);

6 changes: 6 additions & 0 deletions jcore-lingpipe-porterstemmer-ae/README.md
@@ -0,0 +1,6 @@
# Lingpipe Porterstemmer

**Descriptor Path**:
```
de.julielab.jcore.ae.lingpipe.porterstemmer.desc.jcore-lingpipe-porterstemmer-ae
```
4 changes: 4 additions & 0 deletions jcore-mantra-xml-types/README.md
@@ -0,0 +1,4 @@
# JCoRe Mantra XML Types

### Reference
[1] Some Reference
7 changes: 6 additions & 1 deletion jcore-muc7-reader/README.md
@@ -1,5 +1,10 @@
# JCoRe MUC7 Collection Reader
Reader that converts MUC-7 (Message Understanding Conference) files to CAS objects

**Descriptor Path**:
```
de.julielab.jcore.reader.muc7.desc.jcore-muc7-reader
```

### Objective
The MUC7 Reader reads in the data from the Message Understanding Conference (MUC) 7 Corpus.
4 changes: 4 additions & 0 deletions jcore-opennlp-chunk-ae/README.md
@@ -0,0 +1,4 @@
# JCoRe OpenNLP Chunker Wrapper

### Reference
[1] Some Reference
4 changes: 4 additions & 0 deletions jcore-opennlp-parser-ae/README.md
@@ -0,0 +1,4 @@
# JCoRe OpenNLP Parser Wrapper

### Reference
[1] Some Reference
4 changes: 4 additions & 0 deletions jcore-opennlp-postag-ae/README.md
@@ -0,0 +1,4 @@
# JCoRe OpenNLP POS Tagger Wrapper

### Reference
[1] Some Reference
4 changes: 4 additions & 0 deletions jcore-opennlp-sentence-ae/README.md
@@ -0,0 +1,4 @@
# JCoRe OpenNLP Sentence Segmenter Wrapper

### Reference
[1] Some Reference
4 changes: 4 additions & 0 deletions jcore-opennlp-token-ae/README.md
@@ -0,0 +1,4 @@
# JCoRe OpenNLP Tokenizer Wrapper

### Reference
[1] Some Reference
7 changes: 7 additions & 0 deletions jcore-stanford-lemmatizer/README.md
@@ -0,0 +1,7 @@
# Stanford Lemmatizer


**Descriptor Path**:
```
de.julielab.jcore.ae.stanford.lemma.desc.jcore-stanford-lemmatizer
```
2 changes: 2 additions & 0 deletions jcore-utilities/README.md
@@ -0,0 +1,2 @@
# JCoRe Utilities

6 changes: 6 additions & 0 deletions jcore-xmi-writer/README.md
@@ -0,0 +1,6 @@
# JCoRe XMI Writer

**Descriptor Path**:
```
de.julielab.jcore.consumer.xmi.desc.jcore-xmi-writer
```
3 changes: 3 additions & 0 deletions pom.xml
@@ -70,6 +70,9 @@
<module>jcore-xml-mapper</module>
<module>jcore-xml-reader</module>
<module>jcore-xmi-writer</module>
<module>jcore-file-reader</module>
<module>jcore-lingpipe-porterstemmer-ae</module>
<module>jcore-dta-reader</module>
</modules>
<scm>
<connection>scm:git:git://github.com/JULIELab/jcore-base.git</connection>
