Citations – proposal
The goal of this proposal is to create a specification for handling citation metadata in CLARIN repositories. The specification relies only on the requirement that all CLARIN B-centre repositories provide "human access to language resources/services and their metadata" (Checklist for CLARIN B-Centres, v. 7.4.1, section 6.d), which in practice is realised as HTML pages for the resources.
We recommend embedding item metadata in three different formats in the HTML source of the item web page served by the repository:
- Google Scholar format (HTML <meta /> elements), supported in DSpace by default
- Google Datasets format (ld+json), see https://github.com/ufal/clarin-dspace/issues/878
  - implemented in CLARIN-DSpace v5, regression in v7, currently not supported
- CSL JSON for bibliographic managers, exporting to other formats, and formatting citations
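As an illustration of the three formats side by side, the following Python sketch renders them for one item. It is only a sketch under stated assumptions: the item values, the function names, and the repository URL are hypothetical, and the proposal does not fix how CSL JSON is embedded. Placing it in a script element typed with the CSL JSON media type (application/vnd.citationstyles.csl+json) is an assumption made here.

```python
import html
import json

# Hypothetical item; a real repository would read these values from its metadata store.
EXAMPLE_ITEM = {
    "id": "example-corpus",
    "title": "Example Corpus",
    "authors": [{"family": "Doe", "given": "Jane"}],
    "year": 2020,
    "url": "https://repository.example.org/item/1",
}


def _script(mime_type: str, payload: dict) -> str:
    # Escape "</" so the JSON cannot terminate the <script> element early.
    body = json.dumps(payload, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="{mime_type}">{body}</script>'


def render_citation_head(item: dict) -> str:
    """Return HTML head fragments describing one item in three formats."""
    lines = []

    # 1. Google Scholar: citation_* <meta /> elements.
    lines.append(
        f'<meta name="citation_title" content="{html.escape(item["title"])}" />'
    )
    for a in item["authors"]:
        name = html.escape(f'{a["family"]}, {a["given"]}')
        lines.append(f'<meta name="citation_author" content="{name}" />')
    lines.append(
        f'<meta name="citation_publication_date" content="{item["year"]}" />'
    )

    # 2. Google Datasets: a schema.org Dataset serialised as JSON-LD.
    lines.append(_script("application/ld+json", {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": item["title"],
        "url": item["url"],
        "datePublished": str(item["year"]),
        "creator": [
            {"@type": "Person", "name": f'{a["given"]} {a["family"]}'}
            for a in item["authors"]
        ],
    }))

    # 3. CSL JSON (the embedding mechanism is an assumption of this sketch).
    lines.append(_script("application/vnd.citationstyles.csl+json", {
        "id": item["id"],
        "type": "dataset",
        "title": item["title"],
        "author": item["authors"],
        "issued": {"date-parts": [[item["year"]]]},
        "URL": item["url"],
    }))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_citation_head(EXAMPLE_ITEM))
```

All three fragments are derived from the same item record, so the formats cannot drift apart.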
Citation Style Language (CSL) is a de facto standard for formatting citations and bibliographies on the web. It is used by dozens of tools, services, and libraries, and the project's repository provides 10,000+ citation styles. Technically, CSL consists of the CSL JSON data format for citation data and of citation styles that a JavaScript library uses to format the citations. There are also libraries that export CSL JSON to many other bibliographic formats. We propose that all CLARIN repositories, and also the VLO, support all three of the following functionalities based on the inclusion of the CSL JSON format.
The CSL project provides thousands of citation styles, together with JavaScript libraries that transform the JSON metadata into formatted citations. Dozens of web-based projects, such as Mendeley, CitacePro, RefWorks, CrossRef, and the Zenodo repository, use this approach.
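To make the data flow concrete, here is a minimal Python sketch of a CSL JSON record for a dataset and a hand-written formatter. The record values and the function name are hypothetical, and the formatter is only a stand-in: a real deployment would pass the same record and a CSL style file to a CSL processor rather than hard-code one output layout.

```python
# Hypothetical CSL JSON record for a dataset (field names follow CSL JSON).
RECORD = {
    "id": "example-corpus",
    "type": "dataset",
    "title": "Example Corpus",
    "author": [{"family": "Doe", "given": "Jane"}],
    "issued": {"date-parts": [[2020]]},
    "publisher": "Example Repository",
    "URL": "https://repository.example.org/item/1",
}


def format_citation(record: dict) -> str:
    """Format a record in one fixed author-date layout.

    This is NOT a CSL processor; it only shows which CSL JSON fields
    a citation style typically consumes.
    """
    authors = "; ".join(
        f'{a["family"]}, {a["given"]}' for a in record.get("author", [])
    )
    year = record["issued"]["date-parts"][0][0]
    parts = [f"{authors} ({year}).", f'{record["title"]}.']
    if record.get("publisher"):
        parts.append(f'{record["publisher"]}.')
    if record.get("URL"):
        parts.append(record["URL"])
    return " ".join(parts)


if __name__ == "__main__":
    print(format_citation(RECORD))
```

The point of CSL is that the record stays the same while the style is swapped, so the repository only has to publish the record.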
Several projects (citeproc-js, Citation.js, and others) provide import and export of other bibliographic formats. CLARIN would therefore not need to decide whether to support the traditional BibTeX format, using the @misc type for datasets, or the newer biblatex, which is less widely compatible but supports the @dataset type. Instead, integrating an appropriate library would allow CSL JSON to be exported to many bibliographic formats.
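The @misc versus @dataset choice then becomes an export option rather than a policy decision. The Python sketch below shows the idea with a hand-written converter; the function name and record are hypothetical, it covers only a few fields, and it does not escape special LaTeX characters. In practice one of the existing conversion libraries would do this work.

```python
# Hypothetical CSL JSON record, as a repository might embed it.
RECORD = {
    "id": "example-corpus",
    "type": "dataset",
    "title": "Example Corpus",
    "author": [{"family": "Doe", "given": "Jane"}],
    "issued": {"date-parts": [[2020]]},
    "publisher": "Example Repository",
    "URL": "https://repository.example.org/item/1",
}


def csl_to_bibtex(record: dict, use_biblatex: bool = False) -> str:
    """Convert a CSL JSON record to a BibTeX or biblatex entry (sketch only)."""
    # Traditional BibTeX has no dataset type, so it falls back to @misc;
    # biblatex offers @dataset.
    entry_type = "dataset" if use_biblatex else "misc"
    authors = " and ".join(
        f'{a["family"]}, {a["given"]}' for a in record.get("author", [])
    )
    fields = {
        "author": authors,
        "title": record.get("title", ""),
        "year": str(record["issued"]["date-parts"][0][0]),
        "publisher": record.get("publisher", ""),
        "url": record.get("URL", ""),
    }
    # Skip empty fields; no LaTeX escaping is attempted here.
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@{entry_type}{{{record['id']},\n{body}\n}}"


if __name__ == "__main__":
    print(csl_to_bibtex(RECORD))
    print(csl_to_bibtex(RECORD, use_biblatex=True))
```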
Several web-based bibliographic services, such as Mendeley and Zotero, provide browser plug-ins that simplify adding a resource found on the web to the user's bibliographic library. When the web page includes metadata directly in CSL JSON, the browser plug-in does not need to analyse the HTML to identify the metadata attributes; it takes the ready-made bibliographic record directly.
This would give users of such services simple, ideally one-click, access to the highest-quality metadata for any resource in CLARIN repositories and the VLO.
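From the client side, picking up a ready-made record can be sketched as follows. This Python example is not how any particular plug-in is implemented. It assumes, hypothetically, that the repository embeds the record in a script element typed application/vnd.citationstyles.csl+json, and the class and function names are invented for illustration.

```python
import json
from html.parser import HTMLParser

# Assumed embedding convention; the proposal does not prescribe one.
CSL_MIME = "application/vnd.citationstyles.csl+json"


class CslJsonExtractor(HTMLParser):
    """Collect CSL JSON records from matching <script> elements."""

    def __init__(self):
        super().__init__()
        self._in_csl = False
        self._chunks = []
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == CSL_MIME:
            self._in_csl = True
            self._chunks = []

    def handle_data(self, data):
        # Script content is delivered as raw text; buffer it until the end tag.
        if self._in_csl:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_csl:
            self.records.append(json.loads("".join(self._chunks)))
            self._in_csl = False


def extract_csl_records(page: str) -> list:
    parser = CslJsonExtractor()
    parser.feed(page)
    parser.close()
    return parser.records


# Hypothetical item page.
PAGE = """<html><head>
<title>Example Corpus</title>
<script type="application/vnd.citationstyles.csl+json">
{"id": "example-corpus", "type": "dataset", "title": "Example Corpus"}
</script>
</head><body></body></html>"""

if __name__ == "__main__":
    print(extract_csl_records(PAGE))
```

No heuristics over the visible HTML are involved: the client parses one JSON payload and has the complete record.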
Zotero now supports an explicit @dataset type (see the end of https://forums.zotero.org/discussion/63616/new-citation-type-research-data-dataset), but how well it is supported across export formats and citation styles remains to be investigated.
For support of the approach outlined in this proposal in CLARIN-DSpace, see https://github.com/ufal/clarin-dspace/issues/567.