@@ -89,9 +89,8 @@ the public domain of software.
89
89
Non-software licences
90
90
---------------------
91
91
92
- Open source software licences can also be used for works that are not software.
93
- They are often also the best choice, especially if the works in question are
94
- edited and versioned as source code.
92
+ Open source software licences are often used for works that are not software.
93
+ However, they are often not the best choice.
95
94
96
95
Data, media, :abbr:`etc. (et cetera)`
97
96
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -104,37 +103,49 @@ they are `not recommended for software
104
103
<https://creativecommons.org/faq/#can-i-apply-a-creative-commons-license-to-software>`_.
105
104
106
105
.. tip::
107
- The `DEAL consortium <https://deal-konsortium.de/en/ >`_ recommends the CC BY
108
- licence for Open Access publications, see also `Why CC BY is the Best Choice
109
- for Open Access Publishing <https://deal-konsortium.de/en/why-ccby> `_.
110
-
111
- :abbr: `RADAR ( Research Data Repository ) `, a cross-disciplinary repository for
112
- archiving and publishing research data, recommends only one of the `CC
113
- licences
114
- <https://radar.products.fiz-karlsruhe.de/en/radarfeatures/lizenzen-fuer-forschungsdaten#cc-licenses> `_.
115
-
116
- The `Open Knowledge Foundation <https://okfn.org/en/ >`_ has also published a set
117
- of `Open Data Commons <https://opendatacommons.org >`_ licences for
118
- data/databases:
119
-
120
- `Open Data Commons Open Database License (ODbL) v1.0 <https://opendatacommons.org/licenses/odbl/1-0/ >`_
121
- Attribution and sharing under equal terms.
122
- `Open Data Commons Attribution License (ODC-By) v1.0 <https://opendatacommons.org/licenses/by/1-0/ >`_
123
- Attribution.
124
- `Open Data Commons Public Domain Dedication and License (PDDL) v1.0 <https://opendatacommons.org/licenses/pddl/1-0/ >`_
125
- The PDDL places the data in the public domain and waives all rights.
126
-
127
- `GovData <https://www.govdata.de >`_ has submitted the *Data Licence Germany * in two variants:
128
-
129
- * `Datenlizenz Deutschland – Namensnennung – Version 2.0
130
- <https://www.govdata.de/dl-de/by-2-0> `_
131
- * `Datenlizenz Deutschland – Zero – Version 2.0
132
- <https://www.govdata.de/dl-de/zero-2-0> `_
133
-
134
- When using the `Community Data License Agreement – Permissive, Version 2.0 <https://cdla.dev/permissive-2-0/ >`_ the copyright notices must be retained.
135
-
136
- Another possible licence for artistic works is the `Free Art License 1.3
137
- <https://artlibre.org/licence/lal/en/> `_.
106
+ * The `DEAL consortium <https://deal-konsortium.de/en/>`_ recommends the CC
107
+ BY licence for Open Access publications of research results; see also `Why
108
+ CC BY is the Best Choice for Open Access Publishing
109
+ <https://deal-konsortium.de/en/why-ccby>`_.
110
+
111
+ * :abbr:`RADAR (Research Data Repository)`, a cross-disciplinary repository
112
+ for archiving and publishing research data, recommends only one of the `CC
113
+ licences
114
+ <https://radar.products.fiz-karlsruhe.de/en/radarfeatures/lizenzen-fuer-forschungsdaten#cc-licenses>`_.
115
+
116
+ * The `Open Knowledge Foundation <https://okfn.org/en/>`_ has also published a
117
+ set of `Open Data Commons <https://opendatacommons.org>`_ licences for
118
+ data/databases:
119
+
120
+ `Open Data Commons Open Database License (ODbL) v1.0 <https://opendatacommons.org/licenses/odbl/1-0/>`_
121
+ Attribution and sharing under equal terms.
122
+ `Open Data Commons Attribution License (ODC-By) v1.0 <https://opendatacommons.org/licenses/by/1-0/>`_
123
+ Attribution.
124
+ `Open Data Commons Public Domain Dedication and License (PDDL) v1.0 <https://opendatacommons.org/licenses/pddl/1-0/>`_
125
+ The PDDL places the data in the public domain and waives all rights.
126
+
127
+ * `GovData <https://www.govdata.de>`_ has published the *Data Licence Germany*
128
+ in two variants:
129
+
130
+ * `Datenlizenz Deutschland – Namensnennung – Version 2.0
131
+ <https://www.govdata.de/dl-de/by-2-0>`_
132
+ * `Datenlizenz Deutschland – Zero – Version 2.0
133
+ <https://www.govdata.de/dl-de/zero-2-0>`_
134
+
135
+ * The `Community Data License Agreement <https://cdla.dev>`_ project provides
136
+ four agreements:
137
+
138
+ * `Community Data License Agreement – Permissive, Version 2.0
139
+ <https://cdla.dev/permissive-2-0/>`_
140
+ * `Community Data License Agreement – Sharing, Version 1.0
141
+ <https://cdla.dev/sharing-1-0/>`_
142
+ * `Open Use of Data Agreement, Version 1.0
143
+ <https://cdla.dev/open-use-of-data-agreement-v1-0/>`_
144
+ * `Computational Use of Data Agreement, Version 1.0
145
+ <https://cdla.dev/computational-use-of-data-agreement-v1-0/>`_
146
+
147
+ * Another possible licence for artistic works is the `Free Art License 1.3
148
+ <https://artlibre.org/licence/lal/en/>`_.
138
149
139
150
Machine learning models
140
151
~~~~~~~~~~~~~~~~~~~~~~~
@@ -164,19 +175,95 @@ been developed for a company or specific models:
164
175
* `BigScience OpenRAIL-M (Responsible AI License
165
176
<https://www.licenses.ai/blog/2022/8/26/bigscience-open-rail-m-license>`_
166
177
167
- RAIL-D
178
+ There are other `Responsible AI Licenses (RAIL) <https://www.licenses.ai>`_
179
+ with various restrictions on use:
180
+
181
+ OpenRAIL-D
168
182
contains usage restrictions that only apply to the data.
169
- RAIL -A
183
+ OpenRAIL-A
170
184
contains usage restrictions that only apply to the
171
185
application/executability.
172
- RAIL -M
186
+ OpenRAIL-M
173
187
contains usage restrictions that only apply to the model.
174
- RAIL-S
188
+
189
+ .. seealso::
190
+ `RAIL-M
191
+ <https://www.licenses.ai/blog/2022/8/26/bigscience-open-rail-m-license>`_
192
+
193
+ OpenRAIL-S
175
194
contains usage restrictions that only apply to the source code.
176
195
196
+ AI models that are licensed under an open source licence but whose training data
197
+ and programs have **not** been published are not compliant with the `Debian
198
+ Free Software Guidelines (DFSG)
199
+ <https://de.wikipedia.org/wiki/Debian_Free_Software_Guidelines>`_; see also
200
+ `Interpretation of DFSG on Artificial Intelligence (AI) Models
201
+ <https://www.debian.org/vote/2025/vote_002>`_.
202
+
203
+ .. _osaid:
204
+
205
+ For the `Open Source Initiative (OSI) <https://opensource.org/>`__, the
206
+ definition of open source AI goes well beyond the freedom to use a model: it
207
+ must also be possible to study how the model was created, and to modify it and
208
+ share it with others for any purpose. These four freedoms (use, study, modify,
209
+ share) require:
210
+
211
+ Open Data
212
+ Sufficiently detailed information about the data used to train the system so
213
+ that an essentially equivalent system can be built
214
+ Open Code
215
+ The complete source code used to train and operate the system under licences
216
+ approved by the OSI
217
+ Open Weights
218
+ Model parameters, such as weights or other configuration settings under
219
+ licences approved by the OSI
220
+
221
+ Accordingly, the OSI developed `OSAID 1.0
222
+ <https://opensource.org/ai/open-source-ai-definition>`_, whose requirements
223
+ the following models meet:
224
+
225
+ * EleutherAI: `Pythia <https://github.com/EleutherAI/pythia>`_, `GPT-J
226
+ <https://www.eleuther.ai/artifacts/gpt-j>`_
227
+ * The Allen Institute for Artificial Intelligence: `OLMo 2
228
+ <https://allenai.org/olmo>`_, `Molmo <https://allenai.org/blog/molmo>`_
229
+ * LLM360: `K2
230
+ <https://huggingface.co/collections/LLM360/k2-6622ae6911e3eb6219690039>`_,
231
+ `Amber
232
+ <https://huggingface.co/collections/LLM360/amber-65e7333ff73c7bbb014f2f2f>`_,
233
+ `CrystalCoder <https://huggingface.co/LLM360/Crystal>`_
234
+ * Google: `T5
235
+ <https://github.com/google-research/text-to-text-transfer-transformer>`_
236
+
237
+ Presumably the following models would also fulfil the requirements if they
238
+ changed their licences or legal terms:
239
+
240
+ * BigScience: `Bloom <https://huggingface.co/bigscience/bloom>`_
241
+ * BigCode: `StarCoder 2 <https://github.com/bigcode-project/starcoder2>`_
242
+ * Technology Innovation Institute: `Falcon
243
+ <https://huggingface.co/collections/tiiuae/falcon-h1-6819f2795bc406da60fab8df>`_
244
+
245
+ However, some models were analysed and failed because they lack required
246
+ components and/or their legal agreements are incompatible with open source principles:
247
+
248
+ * Meta: Llama 2
249
+ * xAI: Grok
250
+ * Microsoft: Phi-2
251
+ * Mistral AI: Mixtral
252
+
177
253
.. seealso::
178
254
* `Licensing Machine Learning models
179
- <https://book.the-turing-way.org/reproducible-research/licensing/licensing-ml> `_ by The Turing Way Community
255
+ <https://book.the-turing-way.org/reproducible-research/licensing/licensing-ml>`_
256
+ by The Turing Way Community
257
+ * Alek Tarkowski, Open Future in partnership with the Open Source Initiative:
258
+ `Data Governance in Open Source AI
259
+ <https://opensource.org/wp-content/uploads/2025/02/2025-OSI-DataGovernanceOSAI-final-v5.pdf>`_
260
+
261
+ Databases
262
+ ~~~~~~~~~
263
+
264
+ One of the few licences for databases is the `Open Data Commons Open Database
265
+ License (ODbL) v1.0 <https://opendatacommons.org/licenses/odbl/1-0/>`_, which is
266
+ used, for example, by `OpenStreetMap (OSM) <https://www.openstreetmap.org>`_.
180
267
181
268
Documentation
182
269
~~~~~~~~~~~~~