xizhao edited this page Nov 21, 2014 · 1 revision

License detection agents such as bSAM and F1 compare the text in a file with the text of reference licenses to determine which licenses match and which match best. Think of this as a sophisticated diff(1). A reference license can be the full license text (for example, the complete text of GPL v2), or it can be text that declares the license (for example, "This is under GPL v2"). In this example there would be two license_ref records, one for each text the analysis agent uses as a reference.

The nomos license agent does not use reference licenses. It works with heuristics, typically expressed as regular expressions, that can also involve procedural logic: for example, if regular expression B matches and text C does not appear in the file, report the license as "license A". Even though nomos does not use the license_ref table during analysis, it must still refer to a license_ref record when it records its results in the database, so that license reports can use the same logic regardless of which agent produced them. Identifying licenses through the license_ref table in this common way also makes it possible to compare the results of different analysis agents.
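
The kind of heuristic rule described for nomos can be sketched as follows. This is only an illustration: the pattern, the excluded text, and the function name `detect_license` are hypothetical, since the actual nomos rules are not shown on this page.

```python
import re

# Hypothetical stand-ins for "regular expression B" and "text C";
# the real nomos patterns are not given in this page.
PATTERN_B = re.compile(r"licensed under the Example License", re.IGNORECASE)
TEXT_C = "or any later version"


def detect_license(file_text: str) -> list[str]:
    """Return the reporting names of the licenses detected in file_text."""
    found = []
    # Rule: regular expression B matches and text C is absent,
    # so the license is reported as "license A".
    if PATTERN_B.search(file_text) and TEXT_C not in file_text:
        found.append("license A")
    return found
```

Because the rule yields only a name, the agent still needs a license_ref record with that name in order to store the result.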

When a license agent determines a license, it must know which license_ref record to use to report it. (The license_file table shows which licenses, that is, which license_ref records, were found in which file.) In the case of nomos, after it determines which licenses are in a file, it looks up each license by name (rf_shortname) and records the file and the primary key (rf_pk) of the license_ref record it found in the license_file table. In the case of F1, the agent knows which reference license it used to make the match and records that rf_pk in license_file. Because nomos and F1 use completely different methods to determine a license, each has its own license_ref records. These records share common rf_shortnames, however, so although the two agents may refer to different license_ref records, you can still compare their output via the shortnames. For this reason, every new license agent must use the shared shortnames.
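
A minimal sketch of that comparison, using an in-memory SQLite database as a stand-in for the real database. The license_file columns (`file_name`, `agent`, `rf_fk`) and the sample rows are simplified assumptions for illustration, not the actual schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE license_ref (
        rf_pk        INTEGER PRIMARY KEY,
        rf_shortname TEXT NOT NULL
    );
    -- Assumed, simplified columns; the real license_file schema is not shown here.
    CREATE TABLE license_file (
        file_name TEXT NOT NULL,
        agent     TEXT NOT NULL,
        rf_fk     INTEGER NOT NULL REFERENCES license_ref(rf_pk)
    );
    -- Distinct license_ref records that share one reporting name.
    INSERT INTO license_ref VALUES (1, 'GPL v2'), (2, 'GPL v2'), (3, 'GPL v2');
    -- nomos recorded rf_pk 1; F1 recorded the reference text it matched, rf_pk 3.
    INSERT INTO license_file VALUES ('a.c', 'nomos', 1), ('a.c', 'F1', 3);
    """
)


def shortnames_by_agent(conn, file_name):
    """Map each agent to the set of rf_shortname values it reported for a file."""
    rows = conn.execute(
        "SELECT lf.agent, lr.rf_shortname "
        "FROM license_file lf JOIN license_ref lr ON lr.rf_pk = lf.rf_fk "
        "WHERE lf.file_name = ?",
        (file_name,),
    )
    result = {}
    for agent, shortname in rows:
        result.setdefault(agent, set()).add(shortname)
    return result


report = shortnames_by_agent(conn, "a.c")
# The agents point at different rf_pk values but agree on the shortname.
agents_agree = report["nomos"] == report["F1"]
```

Joining license_file to license_ref and grouping on rf_shortname keeps the individual rf_pk available for anyone who needs the exact criteria behind a determination.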

What about the original bSAM license agent? The idea of a single license table structure for all (new) license agents was introduced in v1.2. Since there is no good way to combine the bSAM/licterms agents with the new tables, and because we plan to deprecate bSAM/licterms, no attempt was made to backport the old records into the new tables. This means there is no easy way to compare the output of bSAM/licterms with that of the new nomos and F1 agents.

| Column | Meaning |
| --- | --- |
| rf_pk | Primary key. |
| rf_shortname | The common name for the license, e.g. AGPL v3, AFL v2.1. This is the name under which the license is reported. Multiple rows can have the same rf_shortname: people declare a license in many different ways, and there is one row for each. Because they all mean the same license, rf_shortname groups the different license_ref records together so that they are reported identically. The data is still recorded in license_file by individual rf_pk, so if someone wants to know the criteria for a specific license determination, the data is in the database. |
| rf_fullname | Full license name (e.g. Academic Free License v2.1). Since the nomos agent doesn't use reference licenses, it looks up this name to get the rf_pk of the license to match. |
| rf_text | License text used by the analysis agent. In the case of nomos, this text isn't used by the agent, but it may be populated with text describing how the agent determined this license (e.g. the regex used). |
| rf_url | URL of the license. |
| rf_add_date | Date the license was added to this table. |
| rf_notes | Notes a user can record about this license. Since a note is attached to a license_ref record, multiple records that refer to the same reporting name (rf_shortname) each have their own notes. |
| rf_active | Only reference licenses (license_ref records) with rf_active=True are used by the analysis engine. Of nomos and F1, only F1 uses this. Since license_ref records cannot be removed (because existing analyses may refer to them), the way to keep a license_ref record from being used is to set rf_active=False. |
| rf_text_updatable | If true, rf_text may be updated. This is used by Admin > License to allow or disallow editing the license text. It should be true for nomos license records and false for F1 license records. |
| rf_md5 | Where rf_text is used by the agent (F1), this is the MD5 of that text. It has a unique constraint to ensure that no duplicate licenses are added to the table. |
| marydone | Will be renamed or removed before release. It is simply a user flag that can be attached to a record (how you use it is up to you). |
| rf_FSFfree | FUTURE: True if this license is FSF free. |
| rf_copyleft | FUTURE: True if this is a copyleft license. |
| rf_OSIapproved | FUTURE: True if this license is OSI approved. |
| rf_GPLv2compatible | FUTURE: True if this license is GPL v2 compatible. |
| rf_GPLv3compatible | FUTURE: True if this license is GPL v3 compatible. |
| rf_Fedora | FUTURE: True if this license is a "Good" Fedora license. |
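
The roles of rf_md5 and rf_active can be sketched as follows, again with an in-memory SQLite table standing in for the real one. The helper names (`add_reference`, `active_references`, `deactivate`) are hypothetical, and the table holds only the columns needed for the illustration.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE license_ref (
        rf_pk        INTEGER PRIMARY KEY,
        rf_shortname TEXT NOT NULL,
        rf_text      TEXT,
        rf_active    INTEGER NOT NULL DEFAULT 1,
        rf_md5       TEXT UNIQUE  -- rejects a second row with identical text
    )
    """
)


def add_reference(conn, shortname, text):
    """Insert a reference license; return False if the same text already exists."""
    md5 = hashlib.md5(text.encode("utf-8")).hexdigest()
    try:
        conn.execute(
            "INSERT INTO license_ref (rf_shortname, rf_text, rf_md5) VALUES (?, ?, ?)",
            (shortname, text, md5),
        )
    except sqlite3.IntegrityError:
        return False
    return True


def active_references(conn):
    """Return the rf_pk of every record an agent such as F1 would still use."""
    query = "SELECT rf_pk FROM license_ref WHERE rf_active = 1 ORDER BY rf_pk"
    return [row[0] for row in conn.execute(query)]


def deactivate(conn, rf_pk):
    """Retire a record without deleting it, so existing analyses stay valid."""
    conn.execute("UPDATE license_ref SET rf_active = 0 WHERE rf_pk = ?", (rf_pk,))
```

Deactivating instead of deleting means rows in license_file that already point at the record continue to resolve.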