Input for "Apples to apples" Hackathon
Prepared by: Herbert Van de Sompel (DANS, Ghent University IDLab)
Created 20220225
1. Background
An outcome of the "Apples to Apples" FAIR Evaluation Stakeholder meetings held on February 7th and 10th 2022
was a solution (using a subset of the
FAIR Signposting Profile) to the problems that Mark Wilkinson wanted to address:
- Use describedby link(s) on the landing page to support discovery of metadata
  - Use the type attribute on the link to express the media type of the metadata
  - In case the media type of the metadata is application/ld+json, use the profile attribute to express the JSON-LD profile
  - No type attribute suggests the possibility to do HTTP content negotiation for metadata formats
- Use item link(s) on the landing page to support discovery of data
  - Use the type attribute on the link to express the media type of the data
- Use a cite-as link on the landing page to express the PID as an HTTP URI
  - For clients that arrive at the landing page from a place other than the PID
  - Removes the perceived ambiguity regarding links that have the landing page URI as default anchor, because the combination of an HTTP 302 Found redirect from the PID (expressed as HTTP URI) to the landing page and a cite-as link from the landing page to the PID (expressed as HTTP URI) conveys an equivalence between PID (expressed as HTTP URI) and landing page.
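The equivalence described in the last bullet can be sketched as a small check. This is only an illustration: the function name and its parameters are hypothetical, and the caller is assumed to have already fetched the redirect response for the PID and the cite-as link from the landing page.

```python
def conveys_equivalence(pid_uri, redirect_status, redirect_location,
                        landing_page_uri, cite_as_target):
    """Return True when the PID redirect and the cite-as link point at each other.

    All arguments are plain values the caller has already collected;
    nothing is fetched here.
    """
    # The PID (expressed as HTTP URI) must redirect to the landing page.
    # Only 302 Found is accepted, because that is the status named in the text.
    redirects_to_landing = (
        redirect_status == 302 and redirect_location == landing_page_uri
    )
    # The landing page must point back at the PID with a cite-as link.
    links_back_to_pid = cite_as_target == pid_uri
    return redirects_to_landing and links_back_to_pid
```

A real resolver may redirect in several hops, so a client would typically follow the chain before applying a check of this kind.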
2. The cite-as, describedby, item link relation types
The typed links that are considered for the hackathon are a subset of those
recommended for the
FAIR Signposting Profile. The below images compare
the typed links recommended for the FAIR Signposting Profile (top) and the subset used for the hackathon (bottom).
The below table shows cardinality and description for the describedby, item, and cite-as links that are provided with the landing page as link context. As is the case with all link relation types used in Signposting, these relation types are registered in the IANA Link Relation Registry with inclusion of the formal specifications in which they have been defined.
Link Relation Type | Link Cardinality | Comment |
cite-as | 1 | Provide a link that has the persistent identifier of the scholarly object (e.g. DOI expressed as an HTTP URI) as target. |
describedby | 1 or more | Provide one or more links that have the URI of metadata that describes the scholarly object in a commonly used format as target. On each link, provide the media type of the metadata in the type attribute. Common media types include application/x-bibtex (BibTeX), application/vnd.citationstyles.csl+json (CiteProc JSON), application/x-research-info-systems (RIS), application/vnd.datacite.datacite+xml (DataCite XML), application/vnd.datacite.datacite+json (DataCite JSON), application/vnd.jats+xml (JATS), application/vnd.codemeta.ld+json (Codemeta), text/x-bibliography (Formatted text citation). Many other bibliographic formats are in use that have text/plain, application/xml, application/json, or application/ld+json as media type. When providing metadata that describes the scholarly object using these media types, use the profile attribute on the link to convey, by means of an HTTP URI, the specific format of the metadata. For example, for metadata expressed as application/xml, provide the XML Namespace URI in the profile attribute. Note that the use of the profile attribute is standardized for certain media types (e.g. for application/ld+json) but not for others (e.g. application/xml). However, the Web Linking RFC does support extension attributes. It seems logical to choose profile as an extension attribute to provide detailed information for media types for which the profile attribute has not explicitly been standardized. |
item | 1 or more | The landing page is modeled as a collection of content resources. As such, provide links that have content resources (e.g. the PDF article, the CSV dataset, the ZIPped software repository) as target. When doing so, a link should be provided for each content resource. Use the type attribute on each link to convey the media type of the content resource. However, be aware that the number of content resources for a scholarly object might be large. As such, in order to avoid the risk of the HTTP header becoming too large, it may be safer to provide these links using the Link Set approach. |
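The guidance on the profile attribute for describedby links can be illustrated with a minimal sketch. The helper name describedby_link is hypothetical, and treating a missing profile on a generic media type as an error is a stricter reading than the table itself, chosen here only to make the rule visible.

```python
# Generic media types named in the table above; for these the specific
# metadata format should be conveyed with a profile attribute.
GENERIC_MEDIA_TYPES = {
    "text/plain",
    "application/xml",
    "application/json",
    "application/ld+json",
}


def describedby_link(target, media_type, profile=None):
    """Render one describedby link as a Link header value (hypothetical helper)."""
    if media_type in GENERIC_MEDIA_TYPES and profile is None:
        # Assumption of this sketch: refuse to emit an ambiguous link.
        raise ValueError(f"{media_type} is generic; supply a profile URI")
    params = ['rel="describedby"', f'type="{media_type}"']
    if profile is not None:
        params.append(f'profile="{profile}"')
    return f"<{target}> ; " + " ; ".join(params)
```

For example, calling describedby_link with a CiteProc JSON media type needs no profile, whereas an application/ld+json link would carry a profile URI chosen by the repository.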
3. Two implementation approaches: links by-value or by-reference
Typed links can be conveyed in two ways:
- Approach 1 - By value in the HTTP Link header and/or the HTML <link> element.
- Approach 2 - By reference in a Link Set, which is made discoverable by means of a link with the linkset relation type in the HTTP Link header and/or the HTML <link> element.
Approach 2 is attractive when it can be expected that a large number of typed links need to be conveyed, for example, when the research outputs hosted by a repository typically have a significant number of content files (i.e. many item links), as is sometimes the case with data repositories. It is also attractive when implementing a larger set of typed links recommended by the FAIR Signposting Profile. This is especially the case when also implementing typed links pertaining to content resources. In that case, all typed links can be contained in a single Link Set that is made discoverable from both landing page and content resources.
Link Sets are specified in the IETF Internet Draft (well on its way to become an RFC)
Linkset: Media Types and a Link Relation Type for Link Sets:
- A Link Set is a collection of typed links, including links pertaining to the resource that makes the set of links discoverable.
- A Link Set is made discoverable by means of a typed link with the linkset relation type that will be added to the IANA Link Relation Registry as soon as the Linkset Internet Draft becomes an RFC. A type attribute on that link conveys the media type that is used to serialize the Link Set.
- Two approaches exist to serialize a Link Set: one is JSON-based (media type application/linkset+json) and the other uses the same format as the payload of the HTTP Link header (media type application/linkset).
- For all typed links in a Link Set, both link context and link target must be explicitly provided and expressed as absolute URIs. This allows a Link Set to be interpreted unambiguously without the need to retain contextual information such as the URI where it is published.
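The requirement that link context and link target be absolute URIs can be checked mechanically. The sketch below operates on an already parsed application/linkset+json document; the function names are hypothetical, and "absolute" is approximated as "has a scheme and an authority", which suits HTTP(S) URIs but not schemes such as urn.

```python
from urllib.parse import urlsplit


def is_absolute_uri(value):
    """Approximate check: the URI has both a scheme and an authority."""
    parts = urlsplit(value)
    return bool(parts.scheme) and bool(parts.netloc)


def relative_uris(linkset_doc):
    """List every anchor or href in a parsed Link Set that is missing or not absolute.

    Returns (member, value) pairs, where member is "anchor" or the relation type.
    """
    problems = []
    for context in linkset_doc.get("linkset", []):
        anchor = context.get("anchor")
        if anchor is None or not is_absolute_uri(anchor):
            problems.append(("anchor", anchor))
        for relation, targets in context.items():
            if relation == "anchor":
                continue
            for target in targets:
                href = target.get("href")
                if href is None or not is_absolute_uri(href):
                    problems.append((relation, href))
    return problems
```

An empty result means the Link Set can be interpreted without knowing where it was published.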
4. Examples
This section shows examples for providing the cite-as, describedby, and item links with the landing page as link context, using the by-value and by-reference approaches.
The examples use the dataset with PID https://doi.org/10.34894/SRSB8I.
The below table lists important resources for that dataset. Note that various metadata formats are available via content negotiation with https://doi.org/10.34894/SRSB8I as described in the section Supported Content Types of the DataCite Content Negotiation manual. The example provides a link to only one of those formats.
Scholarly Object Resources | HTTP URI | Media Type |
Persistent Identifier | https://doi.org/10.34894/SRSB8I | |
Landing Page | https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/SRSB8I | text/html |
Content Resource 1 - PDF file | https://dataverse.nl/api/access/datafile/192732 | application/pdf |
Content Resource 2 - SPSS file | https://dataverse.nl/api/access/datafile/192733 | application/x-spss-sav |
Metadata Description 1 | https://doi.org/10.34894/SRSB8I | application/vnd.citationstyles.csl+json |
Metadata Description 2 | https://dataverse.nl/api/datasets/export?exporter=schema.org&persistentId=doi:10.34894/SRSB8I | application/ld+json |
4.1. Example of typed links by-value
The typed links can be provided in the page's HTTP Link header and/or in <link> elements in the <head> section of that page's HTML. Remember that typed links in the HTTP Link header are accessible via both HTTP HEAD and GET, while links in the HTML are only accessible via HTTP GET.
When using the Link header approach:
- the required typed links can be conveyed in a single HTTP Link header or using multiple Link headers, one header per link
- line breaks must not be used in Link headers as they are not allowed per RFC7230; only spaces and tabs are supported as separators
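The two rules above can be sketched as a small serializer that emits all links in a single Link header value and refuses line breaks. The function name link_header_value and the (target, params) input shape are assumptions made for this illustration only.

```python
def link_header_value(links):
    """Serialize (target, params) pairs into one single-line Link header value.

    `links` is a list of (target_uri, params_dict) tuples; parameter order
    follows the insertion order of each dict.
    """
    parts = []
    for target, params in links:
        rendered = [f"<{target}>"]
        rendered += [f'{name}="{value}"' for name, value in params.items()]
        parts.append(" ; ".join(rendered))
    # Several links in one header are separated by commas.
    value = ", ".join(parts)
    # Line breaks are not allowed in the header payload.
    if "\r" in value or "\n" in value:
        raise ValueError("line breaks are not allowed in a Link header")
    return value
```

To use one header per link instead, the same function can be called once per link with a one-element list.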
The below examples show responses to HTTP requests issued against the URI of the landing page of our example dataset.
In the first example, the HTTP header approach is used to convey the typed links, with each link provided in a separate Link header. Note that the same links could simultaneously be provided via <link> elements in the HTML's <head>.
In the second example, the typed links are provided in the HTML's <head> but not in the HTTP Link header.
Typed links with the landing page as anchor via the HTTP Link header |
$ curl --head "https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/SRSB8I"
HTTP/1.1 200 OK
Date: Fri, 25 Feb 2022 15:31:22 GMT
Content-Type: text/html
Content-Length: 25414
Link: <https://doi.org/10.34894/SRSB8I> ; rel="cite-as"
Link: <https://dataverse.nl/api/access/datafile/192732> ; rel="item" ; type="application/pdf"
Link: <https://dataverse.nl/api/access/datafile/192733> ; rel="item" ; type="application/x-spss-sav"
Link: <https://doi.org/10.34894/SRSB8I> ; rel="describedby" ; type="application/vnd.citationstyles.csl+json"
Link: <https://dataverse.nl/api/datasets/export?exporter=schema.org&persistentId=doi:10.34894/SRSB8I> ; rel="describedby" ; type="application/ld+json"
|
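A client consuming such a response needs to turn Link header values back into typed links. The following is a simplified, regex-based sketch rather than a complete Web Linking parser: the names are hypothetical, and it assumes every parameter has the name=value form, either quoted or as a bare token.

```python
import re

# One link: <target> followed by zero or more ";name=value" parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^\s;,]+))*)')
# One parameter inside the parameter run captured above.
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,]+))')


def parse_link_header(value):
    """Parse one Link header value into a list of dicts with href plus parameters."""
    links = []
    for target, raw_params in LINK_RE.findall(value):
        params = {}
        for name, quoted, token in PARAM_RE.findall(raw_params):
            params[name.lower()] = quoted if quoted else token
        links.append({"href": target, **params})
    return links
```

When a response carries several Link headers, as in the example above, the function would be applied to each header value and the results concatenated.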
Typed links with the landing page as anchor via HTML <link> elements |
$ curl --include "https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/SRSB8I"
HTTP/1.1 200 OK
Date: Fri, 25 Feb 2022 15:31:22 GMT
Content-Type: text/html
Content-Length: 25414
<html lang="en">
<head>
<meta charset="utf-8">
<link rel="cite-as" href="https://doi.org/10.34894/SRSB8I">
<link rel="item" type="application/pdf" href="https://dataverse.nl/api/access/datafile/192732">
<link rel="item" type="application/x-spss-sav" href="https://dataverse.nl/api/access/datafile/192733">
<link rel="describedby" type="application/vnd.citationstyles.csl+json" href="https://doi.org/10.34894/SRSB8I" >
<link rel="describedby" type="application/ld+json"
href="https://dataverse.nl/api/datasets/export?exporter=schema.org&persistentId=doi:10.34894/SRSB8I">
...
|
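When the links are only available in the HTML, a client has to read them from the <link> elements. The sketch below uses Python's standard html.parser module; the class and function names are hypothetical, and only the three relation types discussed in this document are collected.

```python
from html.parser import HTMLParser

# Relation types considered for the hackathon.
SIGNPOSTING_RELS = {"cite-as", "describedby", "item"}


class LinkCollector(HTMLParser):
    """Collect (rel, href, type) tuples from <link> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated relation types.
        for rel in (attributes.get("rel") or "").lower().split():
            if rel in SIGNPOSTING_RELS:
                self.links.append(
                    (rel, attributes.get("href"), attributes.get("type"))
                )


def typed_links_from_html(html):
    """Return the Signposting links found in an HTML document."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links
```

Links with other relation types, such as stylesheets, are ignored; a type of None indicates that no type attribute was provided.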
4.2. Example of typed links by-reference
In this approach, a single Link Set contains all typed links. This Link Set is made discoverable through the provision of linkset links on the landing page. For the landing page, this can be done using the HTTP Link header and/or via <link> elements in the HTML's <head>. The below examples illustrate this through the inclusion of two linkset links pointing to different serializations of the Link Set. Note that the responses also contain the by-value links that have the landing page as anchor.
For the purpose of the examples, we will assume that:
- an application/linkset+json Link Set is provided at https://dataverse.nl/api/datasets/linkset/1?persistentId=doi:10.34894/SRSB8I
- an application/linkset Link Set is provided at https://dataverse.nl/api/datasets/linkset/2?persistentId=doi:10.34894/SRSB8I
Landing page provides linkset links in HTTP Link header |
$ curl --head "https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/SRSB8I"
HTTP/1.1 200 OK
Date: Fri, 25 Feb 2022 15:33:15 GMT
Content-Type: text/html
Content-Length: 25414
Link: <https://doi.org/10.34894/SRSB8I> ; rel="cite-as"
Link: <https://dataverse.nl/api/access/datafile/192732> ; rel="item" ; type="application/pdf"
Link: <https://dataverse.nl/api/access/datafile/192733> ; rel="item" ; type="application/x-spss-sav"
Link: <https://doi.org/10.34894/SRSB8I> ; rel="describedby" ; type="application/vnd.citationstyles.csl+json"
Link: <https://dataverse.nl/api/datasets/export?exporter=schema.org&persistentId=doi:10.34894/SRSB8I> ; rel="describedby" ; type="application/ld+json"
Link: <https://dataverse.nl/api/datasets/linkset/1?persistentId=doi:10.34894/SRSB8I> ; rel="linkset" ; type="application/linkset+json"
Link: <https://dataverse.nl/api/datasets/linkset/2?persistentId=doi:10.34894/SRSB8I> ; rel="linkset" ; type="application/linkset"
|
Landing page provides linkset links in HTML <link> |
$ curl --include "https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/SRSB8I"
HTTP/1.1 200 OK
Date: Fri, 25 Feb 2022 15:33:15 GMT
Content-Type: text/html
Content-Length: 25414
<html lang="en">
<head>
<meta charset="utf-8">
<link rel="cite-as" href="https://doi.org/10.34894/SRSB8I">
<link rel="item" type="application/pdf" href="https://dataverse.nl/api/access/datafile/192732">
<link rel="item" type="application/x-spss-sav" href="https://dataverse.nl/api/access/datafile/192733">
<link rel="describedby" type="application/vnd.citationstyles.csl+json" href="https://doi.org/10.34894/SRSB8I" >
<link rel="describedby" type="application/ld+json"
href="https://dataverse.nl/api/datasets/export?exporter=schema.org&persistentId=doi:10.34894/SRSB8I">
<link rel="linkset" type="application/linkset+json"
href="https://dataverse.nl/api/datasets/linkset/1?persistentId=doi:10.34894/SRSB8I">
<link rel="linkset" type="application/linkset"
href="https://dataverse.nl/api/datasets/linkset/2?persistentId=doi:10.34894/SRSB8I">
...
|
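Once the typed links of the landing page have been collected, a client can pick one of the advertised Link Set serializations. The sketch below assumes the links are available as dicts with rel, href and type keys; the function name and the preference for the JSON serialization are choices of this illustration, not something the approach prescribes.

```python
# Order of preference is an assumption of this sketch.
PREFERRED_TYPES = ["application/linkset+json", "application/linkset"]


def choose_linkset(links):
    """Return (href, media_type) of the preferred linkset link, or None."""
    candidates = [link for link in links if link.get("rel") == "linkset"]
    for media_type in PREFERRED_TYPES:
        for link in candidates:
            if link.get("type") == media_type:
                return link["href"], media_type
    return None
```

The returned media type can then be sent in the Accept header of the request for the Link Set.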
An actual Link Set is obtained by issuing an HTTP GET on a URI discovered as described above. The below examples show responses to requests issued against https://dataverse.nl/api/datasets/linkset/1?persistentId=doi:10.34894/SRSB8I (application/linkset+json serialization) and https://dataverse.nl/api/datasets/linkset/2?persistentId=doi:10.34894/SRSB8I (application/linkset serialization), respectively.
Note that:
- the application/linkset+json serialization is described in detail in Linkset: Media Types and a Link Relation Type for Link Sets
- the application/linkset serialization is the same as the payload of an HTTP Link header that contains one or more links, with the only difference that application/linkset does allow for line breaks whereas the Link header payload does not
- for all typed links, both link context and link target must be explicitly provided and expressed as absolute URIs
Typed links via a Link Set in application/linkset+json serialization |
$ curl --include -H "Accept: application/linkset+json" "https://dataverse.nl/api/datasets/linkset/1?persistentId=doi:10.34894/SRSB8I"
HTTP/1.1 200 OK
Date: Fri, 25 Feb 2022 15:56:44 GMT
Server: Apache-Coyote/1.1
Content-Length: 844
Content-Type: application/linkset+json
Connection: close
{
"linkset": [
{
"anchor": "https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/SRSB8I",
"cite-as": [
{
"href": "https://doi.org/10.34894/SRSB8I"
}
],
"item": [
{
"href": "https://dataverse.nl/api/access/datafile/192732",
"type": "application/pdf"
},
{
"href": "https://dataverse.nl/api/access/datafile/192733",
"type": "application/x-spss-sav"
}
],
"describedby": [
{
"href": "https://doi.org/10.34894/SRSB8I",
"type": "application/vnd.citationstyles.csl+json"
},
{
"href": "https://dataverse.nl/api/datasets/export?exporter=schema.org&persistentId=doi:10.34894/SRSB8I",
"type": "application/ld+json"
}
]
}
]
}
|
Typed links via a Link Set in application/linkset serialization |
$ curl --include -H "Accept: application/linkset" "https://dataverse.nl/api/datasets/linkset/2?persistentId=doi:10.34894/SRSB8I"
HTTP/1.1 200 OK
Date: Fri, 25 Feb 2022 15:56:44 GMT
Server: Apache-Coyote/1.1
Content-Length: 931
Content-Type: application/linkset
Connection: close
<https://doi.org/10.34894/SRSB8I>
; rel="cite-as"
; anchor="https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/SRSB8I",
<https://dataverse.nl/api/access/datafile/192732>
; rel="item" ; type="application/pdf"
; anchor="https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/SRSB8I",
<https://dataverse.nl/api/access/datafile/192733>
; rel="item" ; type="application/x-spss-sav"
; anchor="https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/SRSB8I",
<https://doi.org/10.34894/SRSB8I>
; rel="describedby" ; type="application/vnd.citationstyles.csl+json"
; anchor="https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/SRSB8I",
<https://dataverse.nl/api/datasets/export?exporter=schema.org&persistentId=doi:10.34894/SRSB8I>
; rel="describedby" ; type="application/ld+json"
; anchor="https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/SRSB8I"
|
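The two serializations carry the same information, which can be illustrated by converting a parsed application/linkset+json document into the application/linkset text form. This is a minimal sketch: the function name is hypothetical, and only the href and type members of each target are carried over, so any other target attributes would be dropped.

```python
def linkset_json_to_text(linkset_doc):
    """Render a parsed linkset+json document in application/linkset form.

    Each link is written with its rel, optional type, and an explicit anchor,
    one parameter per line, mirroring the layout of the example above.
    """
    entries = []
    for context in linkset_doc["linkset"]:
        anchor = context["anchor"]
        for relation, targets in context.items():
            if relation == "anchor":
                continue
            for target in targets:
                parts = [f'<{target["href"]}>', f'rel="{relation}"']
                if "type" in target:
                    parts.append(f'type="{target["type"]}"')
                parts.append(f'anchor="{anchor}"')
                # Line breaks are allowed in application/linkset.
                entries.append("\n ; ".join(parts))
    return ",\n".join(entries)
```

Because the anchor is repeated on every link, the output remains interpretable on its own, independent of the URI at which it is served.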