merged (again??)!

JULIELab · Sep 23, 2016 · 6b18534
2 parents 4cefa8c + 2d3f8ea

Showing 25 changed files with 122 additions and 11 deletions.
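
The `Descriptor Path` entries that this commit adds to the READMEs look like UIMA import-by-name identifiers. The following is a minimal sketch, assuming they follow UIMA's usual convention for name imports (dots become slashes, `.xml` is appended, and the result is looked up on the classpath). The class and method names are hypothetical and not part of JCoRe.

```java
public class DescriptorPaths {

    // Hypothetical helper: maps a dotted descriptor name to the classpath
    // resource a UIMA by-name import would look for (assumed convention).
    static String toClasspathResource(String descriptorName) {
        return descriptorName.replace('.', '/') + ".xml";
    }

    public static void main(String[] args) {
        // Descriptor path taken from the jcore-file-reader README in this commit.
        String name = "de.julielab.jcore.reader.file.desc.jcore-file-reader";
        String resource = toClasspathResource(name);
        System.out.println(resource);

        // getResource returns null unless the component's artifact is on the classpath.
        java.net.URL url = DescriptorPaths.class.getClassLoader().getResource(resource);
        System.out.println(url != null ? "found: " + url : "not on classpath");
    }
}
```

If the assumption holds, the same dotted string can be used wherever UIMA accepts a descriptor name, provided the component's JAR is on the classpath.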
diff --git a/.gitignore b/.gitignore
@@ -1,5 +1,6 @@
 
 # /
+*.tmp
 .settings
 .project
 .classpath

diff --git a/jcore-ace-reader/README.md b/jcore-ace-reader/README.md
@@ -1,5 +1,10 @@
 # JCoRe ACE Collection Reader
-A Collection Reader that converts ACE-XML ([Automatic Content Extraction](https://www.ldc.upenn.edu/collaborations/past-projects/ace)) files to CAS objects.
+A Collection Reader that converts ACE-XML ([Automatic Content Extraction](https://www.ldc.upenn.edu/collaborations/past-projects/ace)) files to CAS objects.  
+
+**Descriptor Path**:
+```
+de.julielab.jcore.reader.ace.desc.jcore-ace-reader
+```
 
 ### Objective
 The JULIE Lab ACE Reader is a UIMA Collection Reader (CR). It reads the English section of the ACE 2005 Multilingual Training Corpus data, which is given as XML files, and converts it to types defined in the UIMA type system that we provide as well.

diff --git a/jcore-acronym-ae/README.md b/jcore-acronym-ae/README.md
@@ -1,5 +1,10 @@
 # JCoRe Acronym Analysis Engine
-This is a reimplementation of the Schwartz and Hearst Algorithm for the resolution of acronyms (short form → long form).
+This is a reimplementation of the Schwartz and Hearst Algorithm for the resolution of acronyms (short form → long form).  
+
+**Descriptor Path**:
+```
+de.julielab.jcore.ae.acronymtagger.desc.jcore-acronym-ae
+```
 
 ### Objective
 JULIE Lab Acronym Annotator (JACRO) is an UIMA Analysis Engine that annotates acronyms with their full-forms when locally introduced in the current document. The functionality of the engine is based on the simple algorithm for abbreviation recognition by Schwartz and Hearst. We have reimplemented the algorithm and extended it with respect to some pattern definitions and normalizations.

diff --git a/jcore-bionlp09event-consumer/README.md b/jcore-bionlp09event-consumer/README.md
@@ -1,5 +1,10 @@
 # JCoRe BioNLP 09 Event Consumer
-Consumer that writes CAS annotations into the [BioNLP Shared Task](http://www.nactem.ac.uk/tsujii/GENIA/SharedTask/index.shtml#data) format.
+Consumer that writes CAS annotations into the [BioNLP Shared Task](http://www.nactem.ac.uk/tsujii/GENIA/SharedTask/index.shtml#data) format.  
+
+**Descriptor Path**:
+```
+de.julielab.jcore.consumer.bionlp09event.desc.jcore-bionlp09event-consumer
+```
 
 ### Objective
 This consumer takes the annotations specified in **Capabilities** and outputs three seperate text files for each document to `outDirectory`. The text files follow the format of the [BioNLP Shared Task](http://www.nactem.ac.uk/tsujii/GENIA/SharedTask/index.shtml#data) and are therefore applicable for being evaluated by their eval tool or the online evaluation of the test files.

diff --git a/jcore-bionlp09event-reader/README.md b/jcore-bionlp09event-reader/README.md
@@ -1,5 +1,10 @@
 # JCoRe BioNLP 09 Event Reader
-Reader that converts [BioNLP Shared Task](http://www.nactem.ac.uk/tsujii/GENIA/SharedTask/index.shtml#data) formatted files to CAS objects.
+Reader that converts [BioNLP Shared Task](http://www.nactem.ac.uk/tsujii/GENIA/SharedTask/index.shtml#data) formatted files to CAS objects.  
+
+**Descriptor Path**:
+```
+de.julielab.jcore.reader.bionlp09event.desc.jcore-bionlp09event-reader
+```
 
 ### Objective
 This reader takes as input a folder with `txt`,`a1` & `a2` files - format of the [BioNLP Shared Task](http://www.nactem.ac.uk/tsujii/GENIA/SharedTask/index.shtml#data) - and creates CAS annotations as described in **Capabilities**. Therefore it serves e.g. as reader in [relation extraction pipelines](https://github.com/JULIELab/jcore-pipelines/tree/master/jcore-relation-extraction-pipeline).

diff --git a/jcore-cas2iob-consumer/README.md b/jcore-cas2iob-consumer/README.md
@@ -1,5 +1,10 @@
 # JCoRe CAS2IOB Consumer
-Consumer that generates IOB formatted files for specified annotations.
+Consumer that generates IOB formatted files for specified annotations.  
+
+**Descriptor Path**:
+```
+de.julielab.jcore.consumer.cas2iob.desc.jcore-cas2iob-consumer
+```
 
 ### Objective
 This consumer writes annotations in a UIMA CAS out to the IOB format. If two annotations are concurring for the same token, the annotation with the longer span is preferred.

diff --git a/jcore-coordination-baseline-ae/README.md b/jcore-coordination-baseline-ae/README.md
@@ -1,5 +1,13 @@
 # JCoRe Coordination Baseline Analysis Engine
-This is the baseline to the [JCoRe Coordination Analysis Engine](https://github.com/JULIELab/jcore-base/tree/issue7-coordFix/jcore-coordination-ae). As of now the Coordination AE is only available in its baseline form for the JCoRe package. The full-fledged AE is work-in-progress and not yet ready for distribution (see issue 7).
+This is the baseline to the [JCoRe Coordination Analysis Engine](https://github.com/JULIELab/jcore-base/tree/issue7-coordFix/jcore-coordination-ae). As of now the Coordination AE is only available in its baseline form for the JCoRe package. The full-fledged AE is work-in-progress and not yet ready for distribution (see issue 7).  
+
+**Descriptor Path**:
+```
+de.julielab.jcore.ae.coordbaseline.desc.jcore-coordination-baseline-ae-conjunct
+de.julielab.jcore.ae.coordbaseline.desc.jcore-coordination-baseline-ae-coordination
+de.julielab.jcore.ae.coordbaseline.desc.jcore-coordination-baseline-ae-eee
+de.julielab.jcore.ae.coordbaseline.desc.jcore-coordination-baseline-ae-ellipsis
+```
 
 ### Objective
 

diff --git a/jcore-dta-reader/README.md b/jcore-dta-reader/README.md
@@ -1,6 +1,11 @@
 # JCoRe DTA Collection Reader
 Reader for DTA files (German digital humanties corpus).
-DTA uses a TEI variant, cf. http://www.deutschestextarchiv.de/doku/basisformat
+DTA uses a TEI variant, cf. http://www.deutschestextarchiv.de/doku/basisformat  
+
+**Descriptor Path**:
+```
+de.julielab.jcore.reader.dta.desc.jcore-dta-reader
+```
 
 ### Objective
 

diff --git a/jcore-file-reader/README.md b/jcore-file-reader/README.md
@@ -1,4 +1,9 @@
- JCoRe File Reader for reading in text files.
+ JCoRe File Reader for reading in text files.  
+
+**Descriptor Path**:
+```
+de.julielab.jcore.reader.file.desc.jcore-file-reader
+```
 
 ### Objective
  This is a reader for reading in text files, providing them to UIMA for further processing.

diff --git a/jcore-iexml-consumer/README.md b/jcore-iexml-consumer/README.md
@@ -1,2 +1,7 @@
 # JCoRe IEXML Consumer
-Consumer that generates stand-off IEXML files as used in the mantra project/challenge
+Consumer that generates stand-off IEXML files as used in the mantra project/challenge  
+
+**Descriptor Path**:
+```
+de.julielab.jcore.consumer.iexml.desc.jcore-iexml-consumer
+```
diff --git a/jcore-iexml-reader/README.md b/jcore-iexml-reader/README.md
@@ -1,5 +1,10 @@
 # JCoRe IEXML Collection Reader
-Reader for IEXML files as used in the mantra project/challenge
+Reader for IEXML files as used in the mantra project/challenge  
+
+**Descriptor Path**:
+```
+de.julielab.jcore.reader.iexml.desc.jcore-iexml-reader
+```
 
 ### Objective
 

diff --git a/jcore-jsbd-ae/src/main/java/de/julielab/jsbd/SentenceSplitter.java b/jcore-jsbd-ae/src/main/java/de/julielab/jsbd/SentenceSplitter.java
@@ -137,6 +137,8 @@ public void train(InstanceList instList, Pipe dataPipe) {
 		CRFTrainerByLabelLikelihood crfTrainer = new CRFTrainerByLabelLikelihood(model);
 
 		// do the training with unlimited amount of iterations
+		// --> refrained from using modified version of mallet;
+		// it's now the original source
 		boolean b = crfTrainer.train(instList);
 		LOGGER.info("SentencesSplitter training: model converged: " + b);
 

diff --git a/jcore-jtbd-ae/src/main/java/de/julielab/jtbd/Tokenizer.java b/jcore-jtbd-ae/src/main/java/de/julielab/jtbd/Tokenizer.java
@@ -312,6 +312,8 @@ void train(final InstanceList instList, final Pipe myPipe) {
 				model);
 
 		// do the training with unlimited amount of iterations
+		// --> refrained from using modified version of mallet;
+		// it's now the original source
 		final boolean b = crfTrainer.train(instList);
 		LOGGER.info("Tokenizer training: model converged: " + b);
 

diff --git a/jcore-lingpipe-porterstemmer-ae/README.md b/jcore-lingpipe-porterstemmer-ae/README.md
@@ -0,0 +1,6 @@
+# Lingpipe  Porterstemmer
+
+**Descriptor Path**:
+```
+de.julielab.jcore.ae.lingpipe.porterstemmer.desc.jcore-lingpipe-porterstemmer-ae
+```
diff --git a/jcore-mantra-xml-types/README.md b/jcore-mantra-xml-types/README.md
@@ -0,0 +1,4 @@
+# JCoRe Mantra XML Types
+
+### Reference
+[1] Some Reference
diff --git a/jcore-muc7-reader/README.md b/jcore-muc7-reader/README.md
@@ -1,5 +1,10 @@
 # JCoRe MUC7 Collection Reader
-Reader that converts MUC-7 (Message Understanding Conference) files to CAS objects
+Reader that converts MUC-7 (Message Understanding Conference) files to CAS objects  
+
+**Descriptor Path**:
+```
+de.julielab.jcore.reader.muc7.desc.jcore-muc7-reader
+```
 
 ### Objective
 The MUC7 Reader reads in the data from the Message Understanding Conference (MUC) 7 Corpus.

diff --git a/jcore-opennlp-chunk-ae/README.md b/jcore-opennlp-chunk-ae/README.md
@@ -0,0 +1,4 @@
+# JCoRe OpenNLP Chunker Wrapper
+
+### Reference
+[1] Some Reference
diff --git a/jcore-opennlp-parser-ae/README.md b/jcore-opennlp-parser-ae/README.md
@@ -0,0 +1,4 @@
+# JCoRe OpenNLP Parser Wrapper
+
+### Reference
+[1] Some Reference
diff --git a/jcore-opennlp-postag-ae/README.md b/jcore-opennlp-postag-ae/README.md
@@ -0,0 +1,4 @@
+# JCoRe OpenNLP POS Tagger Wrapper
+
+### Reference
+[1] Some Reference
diff --git a/jcore-opennlp-sentence-ae/README.md b/jcore-opennlp-sentence-ae/README.md
@@ -0,0 +1,4 @@
+# JCoRe OpenNLP Sentence Segmenter Wrapper
+
+### Reference
+[1] Some Reference
diff --git a/jcore-opennlp-token-ae/README.md b/jcore-opennlp-token-ae/README.md
@@ -0,0 +1,4 @@
+# JCoRe OpenNLP Tokenizer Wrapper
+
+### Reference
+[1] Some Reference
diff --git a/jcore-stanford-lemmatizer/README.md b/jcore-stanford-lemmatizer/README.md
@@ -0,0 +1,7 @@
+# Stanford Lemmatizer
+
+
+**Descriptor Path**:
+```
+de.julielab.jcore.ae.stanford.lemma.desc.jcore-stanford-lemmatizer
+```
diff --git a/jcore-utilities/README.md b/jcore-utilities/README.md
@@ -0,0 +1,2 @@
+# JCoRe Utilities
+
diff --git a/jcore-xmi-writer/README.md b/jcore-xmi-writer/README.md
@@ -0,0 +1,6 @@
+# JCoRe XMI Writer
+
+**Descriptor Path**:
+```
+de.julielab.jcore.consumer.xmi.desc.jcore-xmi-writer
+```
diff --git a/pom.xml b/pom.xml
@@ -70,6 +70,9 @@
 		<module>jcore-xml-mapper</module>
 		<module>jcore-xml-reader</module>
 		<module>jcore-xmi-writer</module>
+		<module>jcore-file-reader</module>
+		<module>jcore-lingpipe-porterstemmer-ae</module>
+		<module>jcore-dta-reader</module>
 	</modules>
 	<scm>
         <connection>scm:git:git://github.com/JULIELab/jcore-base.git</connection>