Commit 9189308

bi-directional relationships added
1 parent a741839 commit 9189308

2 files changed

+41
-37
lines changed

lib/rules.js

+1-1
@@ -480,7 +480,7 @@ const PrimaryMaterial = {
480480
communicationMode: RepositoryCollection.communicationMode,
481481
inLanguage: {
482482
clause:
483-
'MUST have a inLanguage property, or the RepositoryObject that is `partOf` MUST have a inLanguage property, referencing a Language item (language my be inhereted from the parent RepoObject)',
483+
'MUST have an `inLanguage` property, or the RepositoryObject that is `isPartOf` MUST have an `inLanguage` property, referencing a Language item (language my be inhereted from the parent RepoObject)',
484484
validate({ error, results }, values) {
485485
if (!values || !values.length) {
486486
error('There is no language property');

profile-text/profile.md

+40-36
@@ -50,8 +50,8 @@ This profile covers various kinds of crate metadata:
5050
the abstract structure of nested collections (e.g. collections/corpora or other
5151
curated datasets) and objects of study; linguistic Items, Sessions or Texts).
5252
This profile assumes that a repository (for example, an OCFL storage root,
53-
with an API for accessing it) exists and that it can at a minimum support
54-
(a) listing all items of the repository and returning their RO-Crate metadata, and
53+
with an API for accessing it) exists and that it can at a minimum support
54+
(a) listing all items of the repository and returning their RO-Crate metadata, and
5555
(b) retrieving an item given its ID.
5656

5757
- **Types of language data** - is this resource a dialogue? A written text? A
@@ -105,10 +105,21 @@ transmitted either as:
105105
Object data.
106106

107107
Distributed Collections may reference member collections or Objects in
108-
hasMember property but should not include descriptions of Objects that
108+
\`hasMember\` property but should not include descriptions of Objects that
109109
are stored elsewhere in the repository.
110110

111-
Objects are placed in a Collection using the \`memberOf\` property (\`pcdm:memberOf\`), which is required. The reverse may also be encoded using the \`hasMember\` property on a Collection.
111+
<br>
112+
113+
## Bi-directional relationships
114+
115+
| Term | Inverse Of | Micro |
116+
| ------------- | ---------- | ------------ |
117+
| \`hasPart\` || \`isPartOf\` |
118+
| \`hasMember\` || \`memberOf\` |
119+
120+
Objects are placed in a Collection using the \`memberOf\` property (\`pcdm:memberOf\`), which is required. The inverse will be encoded automatically using the \`hasMember\` property on a Collection. Similarly, if using \`hasMember\`, \`memberOf\` will also be automatically encoded.
121+
122+
Depending on the data, one term relationship may be preferable. For example, if you are describing multiple files in a spreadsheet, it is easier to use \`isPartOf\` at the file level referencing the object it belongs to, rather than listing all the \`hasPart\` entries at the object level.
112123

113124
The following diagram shows how these relationships are encoded in a single "bundled" RO-Crate.
114125

@@ -132,7 +143,7 @@ repository developers.
132143
files.
133144

134145
- The collection and all its files can easily be transferred in a
135-
single transaction - say 20Gb total.
146+
single transaction - say 20 GB total.
136147

137148
- All the material in the corpus shares the same license for reuse.
138149

@@ -142,9 +153,9 @@ repository developers.
142153

143154
- The collection is not yet stable:
144155

145-
- New items are being added or changed.
156+
- New items are being added or changed.
146157

147-
- There is a chance that some data may have to be taken down or withdrawn at the request of participants.
158+
- There is a chance that some data may have to be taken down or withdrawn at the request of participants.
148159

149160
- The total size of the collection will present challenges for
150161
data transfer.
@@ -161,7 +172,7 @@ objects such as PARADISEC collections which bring together items
161172
collected in a region or on a session with informants. This follows the
162173
Alveo usage:
163174

164-
> Items \[*Objects* in this model\] are grouped into collections which might
175+
> Items \[_Objects_ in this model\] are grouped into collections which might
165176
> correspond to curated corpora such as ACE or informal collections such as a
166177
> sample of documents from the AustLit archive
167178
> ([http://www.austlit.edu.au/](http://www.austlit.edu.au/)).
@@ -172,7 +183,7 @@ resolvable ID (within the context of a repository or service) of the
172183
parent Collection. The Collection may also list its members in a hasMember
173184
property, but this is not required.
174185

175-
The root dataset must have at least these \@type values: \["Dataset",
186+
The root dataset must have at least these \`@type\` values: \["Dataset",
176187
"RepositoryCollection"\]
177188

178189
### A RepositoryCollection:
@@ -187,28 +198,28 @@ An Object is a single unit linked to tightly related files, for example,
187198
a dialogue or session in a speech study, or a work (document) in a written
188199
corpus. This is based on the use of the term _Item_ in Alveo:
189200

190-
>The data model that we have developed for the storage of language
191-
>resources is built around the concept of an item which corresponds
192-
>(loosely) to a record of a single communication event. An item is
193-
>often associated with a single text, audio or video resource but could
194-
>include a number of resources, for example, the different channels of
195-
>audio recording, or an audio recording and associated textual
196-
>transcript. Items are grouped into collections which might correspond
197-
>to curated corpora such as ACE or informal collections such as a
198-
>sample of documents from the AustLit archive
199-
>(<http://www.austlit.edu.au/>).
200-
><https://www.researchonline.mq.edu.au/vital/access/services/Download/mq:37347/DS01>
201+
> The data model that we have developed for the storage of language
202+
> resources is built around the concept of an item which corresponds
203+
> (loosely) to a record of a single communication event. An item is
204+
> often associated with a single text, audio or video resource but could
205+
> include a number of resources, for example, the different channels of
206+
> audio recording, or an audio recording and associated textual
207+
> transcript. Items are grouped into collections which might correspond
208+
> to curated corpora such as ACE or informal collections such as a
209+
> sample of documents from the AustLit archive
210+
> (<http://www.austlit.edu.au/>).
211+
> <https://www.researchonline.mq.edu.au/vital/access/services/Download/mq:37347/DS01>
201212
202213
The definition of an object is necessarily loose and needs to reflect
203214
what data owners have chosen to do with their collections in the past.
204215

205216
If an RO-Crate contains a single Object the Root Dataset would have a
206217
\`@type\` property of ["Dataset", "RepositoryObject"] with a
207-
conformsTo property pointing to the language-data-commons Object profile
218+
\`conformsTo\` property pointing to the language-data-commons Object profile
208219
(this document).
209220

210221
If an RO-Crate contains an entire collection then each Object has a
211-
\`@type\` property of ["Dataset", "RepositoryObject"] and a conformsTo
222+
\`@type\` property of ["Dataset", "RepositoryObject"] and a \`conformsTo\`
212223
property referencing this document. For example:
213224

214225
Objects SHOULD have files (which may be included in an RO-Crate for the
@@ -244,7 +255,6 @@ analysis of the \`PrimaryMaterial\` or \`DerivedMaterial\`.
244255

245256
\`PrimaryMaterial\` may be a video or audio file if it is available or may be a ContextualEntity referencing a primary text such as a book.
246257

247-
248258
#### A [File, PrimaryMaterial]:
249259

250260
${rules.PrimaryMaterial}
@@ -253,10 +263,8 @@ ${rules.PrimaryMaterial}
253263

254264
DerivedMaterial is a non-analytical derivation from PrimaryMaterial, for example, downsampled video or excerpted text.
255265

256-
257266
${rules.DerivedMaterial}
258267

259-
260268
#### A [File, DerivedMaterial]:
261269

262270
${rules.DerivedMaterial}
@@ -274,12 +282,11 @@ ${rules.Annotation}
274282
CSV or similar tabular files are often used to represent transcribed
275283
speech or sign language data, sometimes also with time codes. To enable
276284
automated location of which column is which, use a [frictionless Table
277-
Schema](https://specs.frictionlessdata.io/table-schema/) described by a File entity in the crate.
285+
Schema](https://specs.frictionlessdata.io/table-schema/) described by a File entity in the crate.
278286

279287
For example:
280288
${exampleEntities('art', ['art_schema.json'])}
281289

282-
283290
### Language
284291

285292
#### A [Language] entity:
@@ -290,7 +297,7 @@ ${rules.Language}
290297

291298
## Places
292299

293-
The place in which data was collected may be indicated using the \`contentLocation\` property.
300+
The place in which data was collected may be indicated using the \`contentLocation\` property.
294301

295302
${exampleEntities('paradisec-item-NT1-001', ['./', 'https://www.ethnologue.com/country/VU', '#Vanuatu'])}
296303

@@ -301,7 +308,7 @@ ${exampleEntities('paradisec-item-NT1-001', ['./', 'https://www.ethnologue.com/c
301308
Identifiers for Objects and Collections MUST be URIs.
302309

303310
Internally, identifiers for all entities that do not have their own URIs
304-
may use the Archive and Packaging identifier scheme ([ARCP]), which allows for a DNS-like namespacing of
311+
may use the Archive and Packaging identifier scheme (ARCP), which allows for a DNS-like namespacing of
305312
identifiers. For example, the Sydney Speaks corpus top-level
306313
collection would have the ID:
307314

@@ -327,15 +334,15 @@ Some corpora express ages and other demographics of participants - this
327334
presents a data modelling challenge, as age and some other variables change
328335
over time, so if the same person appears over time then we need to have a
329336
base Person with DoB etc. and then time-based instances of the person
330-
with an age, social status, gender etc. *at that time*.
337+
with an age, social status, gender etc. _at that time_.
331338

332339
There are three levels at which contributions to an object can be
333340
modelled:
334341

335342
1. Include one or more Person items as context in a crate and reference
336343
them with properties such as schema:creator or the
337344
language-data-commons contribution properties such as [ldac:compiler]
338-
or [ldac:depositor]. The \@id of the person MUST be a URI and SHOULD
345+
or [ldac:depositor]. The \`@id\` of the person MUST be a URI and SHOULD
339346
be re-used where the same person appears in multiple objects in a
340347
collection or repository.
341348

@@ -363,12 +370,12 @@ modelled:
363370

364371
Where data is collected from participants in a speech study with
365372
elicitation tasks such as "sessions" (see this [IMDI
366-
document]](https://www.mpi.nl/ISLE/documents/draft/ISLE_MetaData_2.5.pdf))
373+
document](https://www.mpi.nl/ISLE/documents/draft/ISLE_MetaData_2.5.pdf))
367374
or field interviews this can be recorded in metadata via the
368375
CollectionEvent class.
369376

370377
The indirection in this conforms-to relationship is to allow multiple
371-
objects to have a conformsTo property which indicates that they conform
378+
objects to have a \`conformsTo\` property which indicates that they conform
372379
to the _same_ schema while having a local copy of the schema, as per
373380
RO-Crate best practice of having all local context to use a data
374381
packages in the package where possible.
@@ -393,9 +400,6 @@ Dakota: The University of North Dakota M.A.
393400

394401
[https://www.mpi.nl/ISLE/documents/docs_frame.html](https://www.mpi.nl/ISLE/documents/docs_frame.html)
395402

396-
397-
398-
399403
[ldac:PersonSnapshot]: https://purl.archive.org/language-data-commons/terms#PersonSnapshot
400404
[ldac:depositor]: https://purl.archive.org/language-data-commons/terms#depositor
401405
[ldac:compiler]: https://purl.archive.org/language-data-commons/terms#compiler
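
Note on the "Bi-directional relationships" section added to profile-text/profile.md: the new text says the inverse of `memberOf`/`hasMember` is encoded automatically, but this commit does not show the code that does so. The following is only a minimal sketch of how such inverse filling could work over an RO-Crate `@graph`. The `INVERSES` table, `asRefs` and `addInverses` are hypothetical names, not part of lib/rules.js or of any RO-Crate library, and the `hasPart`/`isPartOf` pairing is taken from the table in the diff.

```javascript
// Hypothetical helper, for illustration only (not the project's implementation).
// Each property maps to the property that expresses the same link in reverse.
const INVERSES = {
  hasPart: 'isPartOf',
  isPartOf: 'hasPart',
  hasMember: 'memberOf',
  memberOf: 'hasMember',
};

// Normalise a property value (missing, single reference, or array) to an array.
function asRefs(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Return a shallow copy of the graph in which every relationship listed in
// INVERSES is also recorded in the opposite direction on the target entity.
function addInverses(graph) {
  const copy = graph.map((entity) => ({ ...entity }));
  const byId = new Map(copy.map((entity) => [entity['@id'], entity]));
  for (const entity of copy) {
    for (const [prop, inverse] of Object.entries(INVERSES)) {
      for (const ref of asRefs(entity[prop])) {
        const target = byId.get(ref['@id']);
        if (!target) continue; // target is not described in this crate
        const existing = asRefs(target[inverse]);
        if (!existing.some((r) => r['@id'] === entity['@id'])) {
          target[inverse] = [...existing, { '@id': entity['@id'] }];
        }
      }
    }
  }
  return copy;
}

// Usage with made-up identifiers: a file points at its object with isPartOf,
// and the object points at its collection with memberOf.
const crate = addInverses([
  { '@id': '#collection', '@type': ['Dataset', 'RepositoryCollection'] },
  { '@id': '#object', '@type': ['Dataset', 'RepositoryObject'], memberOf: { '@id': '#collection' } },
  { '@id': 'recording.wav', '@type': ['File', 'PrimaryMaterial'], isPartOf: { '@id': '#object' } },
]);
console.log(JSON.stringify(crate[0].hasMember)); // [{"@id":"#object"}]
console.log(JSON.stringify(crate[1].hasPart)); // [{"@id":"recording.wav"}]
```

The sketch skips references whose target is not in the graph, which matches the profile's statement that Distributed Collections should not include descriptions of Objects stored elsewhere in the repository.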
