Signposting the Scholarly Web

Signposting is an approach to make the scholarly web more friendly to machines. It uses Typed Links as a means to clarify patterns that occur repeatedly in scholarly portals. For resources of any media type, these typed links are provided in HTTP Link headers. For HTML resources, they may additionally be provided in HTML link elements. Throughout this site, examples use the former approach.

When visiting scholarly portals, readers can easily figure out landing pages, links to bibliographic records, authorship, etc. But, because portals use different conventions to convey such patterns, machines have a hard time finding their way around.

As a portal administrator or operator of scholarly infrastructure, you can change that by implementing some of the Signposting patterns listed on this site. Doing so will allow machines to navigate scholarly portals in a uniform manner, which in turn will lead to applications that make things easier for readers too.

Signposting the Scholarly web in Slideshare

Image courtesy of Patrick Hochstenbach.

As an example, Herbert Van de Sompel and Michael L. Nelson are the authors of the paper with DOI https://doi.org/10.1045/november2015-vandesompel; their respective ORCIDs are http://orcid.org/0000-0002-0715-6126 and http://orcid.org/0000-0003-3749-8116. CrossRef, the DOI registration agency, can express this authorship in a Link header provided in the response to an HTTP HEAD/GET issued against https://doi.org/10.1045/november2015-vandesompel:

HTTP/1.1 302 Found
Server: Apache-Coyote/1.1
Vary: Accept
Location: http://www.dlib.org/dlib/november15/vandesompel/11vandesompel.html
Link: <http://orcid.org/0000-0002-0715-6126> ; rel="author",
      <http://orcid.org/0000-0003-3749-8116> ; rel="author"
Expires: Tue, 31 May 2016 17:18:50 GMT
Content-Type: text/html;charset=utf-8
Content-Length: 217
Date: Tue, 31 May 2016 16:38:15 GMT
Connection: keep-alive
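
A client can pull the author URIs out of a response like this with a small amount of parsing. The following is a minimal sketch, not a complete parser: the names parse_link_header and targets_for are hypothetical, and the regular expressions assume well-formed values such as the one shown above, with no "<" or unbalanced quotes inside parameter values.

```python
import re

# One link-value: <target> followed by everything up to the next "<".
LINK_VALUE = re.compile(r'<([^>]*)>([^<]*)')
# One parameter: ;name="quoted value" or ;name=token
PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,"]+))')


def parse_link_header(value):
    """Return a list of (target, params) pairs from a Link header value."""
    links = []
    for target, raw_params in LINK_VALUE.findall(value):
        params = {}
        for name, quoted, token in PARAM.findall(raw_params):
            params[name.lower()] = quoted or token
        links.append((target, params))
    return links


def targets_for(value, rel):
    """Return the targets whose rel parameter includes the given relation type."""
    return [
        target
        for target, params in parse_link_header(value)
        if rel in params.get("rel", "").split()
    ]


# The Link header from the example response, folded line included.
example = (
    '<http://orcid.org/0000-0002-0715-6126> ; rel="author",\n'
    '      <http://orcid.org/0000-0003-3749-8116> ; rel="author"'
)
print(targets_for(example, "author"))
```

Run against the example header, this prints the two ORCID URIs. A production client would probably prefer a parser that fully implements the Link header grammar.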

There are several reasons why the use of HTTP Link headers is appealing for scholarly portals. The header approach can be used for resources of any media type, not just HTML. Hence, images, datasets, PDFs, etc. can all uniformly use the same approach to clarify patterns. Also, headers are accessible using HTTP HEAD requests, which return only transaction metadata, not content. As such, headers can be obtained for massive resources, such as big datasets or high-resolution images, without actually downloading these resources. In a similar way, HTTP HEAD requests could be used to obtain headers for restricted content, including paywalled articles. However, the HTTP headers of a page's own response, including the Link header, are not accessible to JavaScript running in that page. Therefore, for HTML resources, the typed links may additionally be provided using the link element in the HTML head element.
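
Both routes can be sketched with the Python standard library. The function and class names below (head_link_headers, LinkElementParser, html_typed_links) are hypothetical, and the code is an illustration under simple assumptions rather than a reference client. It uses http.client because that module does not follow redirects, so the Link header on a 302 response, as in the DOI example, remains visible.

```python
import http.client
from html.parser import HTMLParser
from urllib.parse import urlsplit


def head_link_headers(url, timeout=10):
    """Issue a HEAD request and return (status, list of raw Link header values).

    Only headers are transferred; the resource body is never downloaded.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.headers.get_all("Link") or []
    finally:
        conn.close()


class LinkElementParser(HTMLParser):
    """Collect (href, [relation types]) from link elements in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        href, rel = attributes.get("href"), attributes.get("rel")
        if href and rel:
            self.links.append((href, rel.split()))


def html_typed_links(markup):
    """Return the typed links expressed as link elements in the given HTML."""
    parser = LinkElementParser()
    parser.feed(markup)
    return parser.links
```

For instance, head_link_headers("https://doi.org/10.1045/november2015-vandesompel") would return the status code together with whatever Link header values the server sends, if any, while html_typed_links can be applied to an HTML page that has already been retrieved.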

Signposting Patterns

Which pattern do you want machines to understand?

About

There is very little interoperability among scholarly portals on the web. Most portals focus on access via the user interface. Some provide APIs for machine access, in which case each portal exposes its own. But how about some uniform approaches to allow machines to interact with scholarly portals?

Understanding that resources are scarce, Signposting proposes a really simple approach, based on Typed Links conveyed in HTTP Link headers, to clarify patterns that occur repeatedly in scholarly portals. Typed Links are used to help machines answer questions like "What is the DOI of this PDF publication?", "Where to find the publication resources from the landing page?", "Where to find the BibTeX metadata that describes the publication?", and "What is the ORCID of the author of this publication?"

Signposting is not a formal standardization effort. It's just an accumulation of ideas from people who have spent a lot of time thinking about the web and scholarly communication on the web, working on specifications to improve on the interoperability status quo, and witnessing some specifications being adopted and others not. Those scarce resources, you know.

The Signposting approach is fully aligned with hypermedia (REST, HATEOAS) lines of thinking regarding web interoperability. Implementation of a pattern should be straightforward and would help machines significantly, which would allow the emergence of new applications that make the life of a reader easier. That means a great return on investment.

Please share feedback on the Signposting Google Group.


Some background material related to the Signposting effort:

Feedback

Do you have feedback regarding patterns listed on this site? Are you thinking of another pattern that should be addressed? Do you have concrete suggestions for addressing another pattern? Do you have ideas on how to promote Signposting in the scholarly community? Do you want to join the cause to make the scholarly web more friendly to machines?

Please let yourself be heard on the Signposting Google Group.