Unofficial Draft
Copyright © 2025 the document editors/authors. Text is available under the Creative Commons Attribution 4.0 International Public License; additional terms may apply.
This document defines the data model of MetaBelgica. It specifies how authority data can be represented in RDF and the MetaBelgica Wikibase.
This document was created as part of the BELSPO-funded research infrastructure project MetaBelgica.
The goal of MetaBelgica is to provide high quality reference data for Belgian Cultural Heritage. Data is collaboratively maintained by initially four Federal Scientific Institutes (FSIs) in a Wikibase instance. This also sets the scope: reference data about authorities linked to Federal Collections. For semantic interoperability we will use the Resource Description Framework (RDF) to define a common data model. Additionally we provide a Wikibase data model as well as a mapping between the RDF data model and the Wikibase data model.
This document specifies overall considerations about the data model. Detailed documentation about the RDF vocabulary is available via the tool Widoco.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST, MUST NOT, OPTIONAL, RECOMMENDED, REQUIRED, SHALL, SHALL NOT, SHOULD, and SHOULD NOT in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
Conformance requirements are expressed with a combination of descriptive assertions and [RFC2119] terminology. The key words MAY, MUST, MUST NOT, OPTIONAL, RECOMMENDED, REQUIRED, SHALL, SHALL NOT, SHOULD, and SHOULD NOT in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.
Throughout the document, the following terminology is used.
MetaBelgica
Collaboratively maintained
Public interface
Within MetaBelgica we maintain authorities, represented as interlinked entities. Generally we distinguish the following three types of entities; additionally, the concept of time is represented as a property.
In the following we elaborate on the different entities and properties.
Similarly, organizations will be linked to locations to indicate where an organization is established.
Each of these entities has a number of properties such as related dates, but most importantly external identifiers to other reliable databases like ISNI.
This data model follows the Open World Assumption (OWA): a missing property does not mean that the value does not exist, only that it is not known. For example, when we do not know whether a person is dead, we do not record "unknown" or something similar; we leave the property empty.
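The consequence of the Open World Assumption for software consuming the data can be sketched as follows. This is a minimal illustration, not part of the data model: the class `Person`, its field names, and the helper `is_known_deceased` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    """Hypothetical person record; field names are illustrative only."""
    preferred_name: str
    # None means "not known", never "does not exist".
    date_of_birth: Optional[str] = None
    date_of_death: Optional[str] = None


def is_known_deceased(person: Person) -> Optional[bool]:
    """Return True if a date of death is recorded, otherwise None (unknown).

    Under the Open World Assumption a missing date of death never yields
    False: we cannot conclude that the person is alive.
    """
    return True if person.date_of_death is not None else None
```

The three-valued return type is the design choice worth noting: a consumer that collapses `None` to `False` would silently switch to a closed-world reading of the data.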
Persons are the creators of, or otherwise contributors to, cultural content. More specifically, a person within MetaBelgica can also be a public identity such as a pseudonym.
Our goal is to provide high quality reference data, which requires the disambiguation of persons. Hence we define a number of properties that help to distinguish persons from one another. In other words, two persons with the same name can still be distinguished by additional properties.
Historically, in authority control, a single uniform name was used to uniquely identify an authority record, so the name served as a kind of identifier. Instead of the name, we use a unique and persistent identifier to refer to persons. Nevertheless, we still use the preferred name spelling as the preferred label of a person.
Names might be spelled differently in other languages, and nicknames or aliases might be recorded. As mentioned above, historically a single uniform name was chosen as the preferred name spelling. Other spellings were recorded as well to improve information retrieval.
This is the recorded date of birth of a person according to the Gregorian calendar.
We allow dates according to the ISO8601:2019 extension [ISO8601:2019-2] that is based on the Extended Date Time Format (EDTF) from the Library of Congress. This also allows uncertain dates to be recorded in various ways.
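To illustrate what recording an uncertain date can look like, the following sketch reads a small year-only subset of EDTF expressions: a four-character year in which "X" marks an unspecified digit, optionally followed by "?" (uncertain), "~" (approximate) or "%" (both). The names `QualifiedYear` and `parse_year` are hypothetical, and the accepted subset is an assumption made for brevity; it does not validate against the full standard.

```python
import re
from dataclasses import dataclass

# Year-only subset: four characters (digits or X), optional qualifier.
# This pattern is a simplification and an assumption of this sketch.
_YEAR = re.compile(r"^(?P<year>[0-9X]{4})(?P<qualifier>[?~%]?)$")


@dataclass
class QualifiedYear:
    year: str          # e.g. "1984" or "198X"
    uncertain: bool    # qualifier "?" or "%"
    approximate: bool  # qualifier "~" or "%"
    unspecified: bool  # at least one digit replaced by "X"


def parse_year(value: str) -> QualifiedYear:
    """Parse a year expression from the subset described above."""
    match = _YEAR.match(value)
    if match is None:
        raise ValueError(f"not a supported year expression: {value!r}")
    year = match.group("year")
    qualifier = match.group("qualifier")
    return QualifiedYear(
        year=year,
        uncertain=qualifier in ("?", "%"),
        approximate=qualifier in ("~", "%"),
        unspecified="X" in year,
    )
```

For example, `parse_year("1984?")` yields a year flagged as uncertain but not approximate, while `parse_year("198X")` yields a year with an unspecified last digit.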
This is the recorded date of death of a person according to the Gregorian calendar.
This is the occupation of a person according to a controlled vocabulary.
This property links a person to an entity of type Location (see definition below).
This property links a person to an entity of type Location (see definition below).
Todo
Todo
This concept is not an entity on its own, but modelled as a property.
Within MetaBelgica we collaboratively maintain authority data, which requires a data governance strategy. To support data governance, we define additional administrative entities and properties. The section on mappings discusses how these entities and properties can be used in the Wikibase both to indicate data governance aspects, such as what can be shown publicly, and to enforce them.
The Data Privacy Vocabulary [DPV] provides an extensive data model to cover many legal and practical aspects of data processing. For simplicity we only specify entities and properties needed for our use case, but we aim to reuse existing [DPV] terms as much as possible and to align our own terms to ensure compatibility.
Todo: mention DPV concepts
One goal of MetaBelgica is to provide its reference data to the public. However, certain entities or properties should not be made openly available.
On the one hand we indicate whether an entity or property should be shown publicly by using a visibility property,
and on the other hand we indicate the reason why with a property legal ground.
We employ the property visibility to indicate the envisioned target audience.
In MetaBelgica we distinguish between the following three use cases:
internal: An entity/property is only meant to be used internally; one example is other administrative properties.
shared: An entity/property can be shared in a research context and in the frame of a data sharing agreement with a trusted partner; one example is sensitive data such as gender information.
public: An entity/property is meant to be displayed publicly; one example is the name of a person record.
Indicating a visibility is merely an annotation; you (or the software using the data) still need to use this annotation and act/filter accordingly!
The visibility property MUST be applicable to annotate a whole entity, a whole property or a specific property instance.
Entities annotated with internal or shared MUST NOT be shown publicly.
Entities annotated with internal MUST NOT be shared with third-parties.
An entity may have several properties. If the entity itself has the visibility public,
but one of its properties has the visibility internal or shared,
then the values of that property MUST NOT be shown publicly.
For example, the property gender has the visibility shared, hence all person entities with visibility public may be displayed publicly, but the value of the gender property shall not be shown.
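Since the visibility is only an annotation, consuming software has to filter on it. The following is a minimal sketch of such a filter. The dictionary layout of an entity, the function name `view_for`, and the rule that a missing visibility annotation is treated as not visible are all assumptions of this example, not part of the specification.

```python
from typing import Optional

# Which visibility values each audience may see. The audience names reuse
# the three visibility values defined above.
_ALLOWED = {
    "public": {"public"},
    "shared": {"public", "shared"},
    "internal": {"public", "shared", "internal"},
}


def view_for(entity: dict, audience: str) -> Optional[dict]:
    """Return the part of an entity that the given audience may see.

    Returns None if the entity as a whole must not be shown. Properties
    without a visibility annotation are dropped (a conservative assumption).
    """
    allowed = _ALLOWED[audience]
    if entity.get("visibility") not in allowed:
        return None
    properties = {
        name: prop["value"]
        for name, prop in entity["properties"].items()
        if prop.get("visibility") in allowed
    }
    return {"id": entity["id"], "properties": properties}
```

With a public person entity whose name is public and whose gender is shared, `view_for(entity, "public")` keeps the name and omits the gender value, while `view_for(entity, "shared")` keeps both.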
We employ the property legal basis to qualify why an entity, a property or a property instance has a certain visibility.
The property legal basis MUST link to one of the following legal bases (future versions of this specification may extend this list):
Freedom of Information: Used to indicate that an entity, property or property value should be publicly visible. Based on Belgian law, covering the access of administrative documents held by public bodies.
Deceased: Used to indicate that all personal data can be publicly visible, GDPR no longer applies.
Consent: Used to indicate that something can be publicly visible, because the person described by the record (data-subject) provided consent to display the property value.
Opt-out: Used to indicate that something must not be publicly visible, because the person described by the record (data-subject) opted out.
Sensitive data: Used to indicate that a property or property value should not be publicly visible as it is sensitive data which also due to data minimization must not be made publicly visible.
When the legal basis Consent is used, there MUST be a link to a Consent entity that provides further details about the consent.
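The closed list of legal bases and the Consent requirement lend themselves to a simple validation check. The sketch below is illustrative only: the names `LegalBasis`, `LegalBasisAnnotation`, `consent_entity_id` and `validate` are hypothetical and do not correspond to terms of the MetaBelgica vocabulary.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class LegalBasis(Enum):
    """The legal bases listed above."""
    FREEDOM_OF_INFORMATION = "Freedom of Information"
    DECEASED = "Deceased"
    CONSENT = "Consent"
    OPT_OUT = "Opt-out"
    SENSITIVE_DATA = "Sensitive data"


@dataclass
class LegalBasisAnnotation:
    basis: LegalBasis
    # Hypothetical identifier of the linked Consent entity, if any.
    consent_entity_id: Optional[str] = None


def validate(annotation: LegalBasisAnnotation) -> List[str]:
    """Return a list of violations; an empty list means the annotation is valid."""
    errors = []
    if annotation.basis is LegalBasis.CONSENT and not annotation.consent_entity_id:
        errors.append("legal basis Consent requires a link to a Consent entity")
    return errors
```

Modelling the legal bases as an enumeration means a value outside the list is rejected at construction time, for example `LegalBasis("Contract")` raises a `ValueError`.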
The following default values apply, for example for the initial bulk-load from MetaBelgica partners.
Freedom of Information.
Deceased
With respect to GDPR's data minimization, certain data should be adapted. Todo: elaborate
yearOnly MUST be applied to the year of birth.
For interoperability we provide a number of mappings to different RDF vocabularies, but also to the Wikibase data model in use. We intend to provide the data of the MetaBelgica platform in different data formats, based on the following mappings.
Todo: Wikibase data model
The conceptual MetaBelgica data model, implemented as Wikibase entities and properties, can be used to support various data governance use cases.
It can be used in a consistent way and only relies on the properties visibility, applies To and legal ground, optionally with start date and end date.
The following use cases are supported which means that even fine-grained control is possible:
Additionally, the Wikibase data model can be used to annotate provenance about the visibility with a legal ground property when used as a qualifier.
This documents why something is public or not, and with start and end date qualifiers a whole provenance record can be built.
This is useful, for example, to indicate that an opt-in only occurred at some point in time or to indicate from when an opt-out was requested.
Placing the visibility property directly on the property that is annotated seems intuitive, but then the additional legal ground would qualify the property and not the visibility. Additionally, the history of data governance with start and end dates cannot be modeled. Yet when using the visibility property as a claim, additional metadata like the legal ground and possible start and end dates can be consistently modeled with qualifiers.
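The claim-with-qualifiers pattern can be sketched as a plain data structure. This is not the Wikibase JSON format: the class `VisibilityClaim`, its field names, the helper `visibility_on`, and the choice to treat the end date as inclusive are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class VisibilityClaim:
    """A visibility claim whose other fields play the role of qualifiers."""
    visibility: str                      # "internal", "shared" or "public"
    legal_ground: str                    # why this visibility applies
    applies_to: Optional[str] = None     # property name; None = whole entity
    start_date: Optional[date] = None
    end_date: Optional[date] = None      # treated as inclusive (assumption)

    def active_on(self, day: date) -> bool:
        after_start = self.start_date is None or self.start_date <= day
        before_end = self.end_date is None or day <= self.end_date
        return after_start and before_end


def visibility_on(claims: List[VisibilityClaim],
                  applies_to: Optional[str],
                  day: date) -> Optional[str]:
    """Return the visibility of the first matching claim active on `day`."""
    for claim in claims:
        if claim.applies_to == applies_to and claim.active_on(day):
            return claim.visibility
    return None
```

A history such as "public by consent until the end of February 2024, internal after an opt-out from 1 March 2024" then becomes two claims on the same target, and the visibility on any given day can be looked up without losing the earlier record.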
Todo: Schema.org mapping
mb:legalDeposit a dpv:LegalBasis ; dpv:hasJurisdiction yy ; dpv:hasLaw xx .
Todo: SKOS mapping
Todo: CIDOC-CRM mapping
Todo: OSLO mapping