Adrian Pohl / @acka47
Linked Open Data, Hochschulbibliothekszentrum NRW (hbz)
Show & Tell, World Wide Web, 2017-07-27
This presentation:
http://slides.lobid.org/lobid-show-and-tell/
North Rhine-Westphalian Library Service Centre, est. 1973
Software services for libraries in NRW and beyond
E.g. union catalog, discovery portal DigiBib, interlibrary loan (ILL), digitization & digital preservation, consortial acquisition
See also hbz flyer in English (PDF)
LOD-based data infrastructure
Research & Development since 2010
Search UIs for end users & web APIs (read-only) for web developers
Version 2.0 recently went into production
Based on data from different sources
Cataloging libraries: 56 academic & special libraries, 1,000 institute/departmental libraries
20 million records and 45 million holdings
Cataloging environment: Aleph
Source format: Aleph MAB2 XML
Daily export of Aleph MAB2 XML via the Aleph publishing mechanism
Transformation to N-Triples with Metafacture
Conversion to JSON-LD with addition of some concept labels with Etikett
Result is indexed into Elasticsearch (see the sketch below)
Example resource: Aleph export, JSON-LD
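A minimal sketch of the tail end of this pipeline, in Python rather than the Metafacture/Etikett tooling actually used; the input file name, the context and the index layout are assumptions for illustration:

```python
# Sketch: compact a JSON-LD document and index it into Elasticsearch.
# The real pipeline uses Metafacture and Etikett; file name, context
# and index names here are illustrative only.
import json

import requests
from pyld import jsonld

with open("resource.json") as f:                   # hypothetical input file
    doc = json.load(f)

context = {"@vocab": "http://purl.org/dc/terms/"}  # assumed vocabulary
compacted = jsonld.compact(doc, context)

# Index under the resource's own identifier so lookups by ID are direct.
resource_id = compacted["@id"].rsplit("/", 1)[-1]
requests.put("http://localhost:9200/resources/resource/" + resource_id,
             json=compacted)
```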
Data set
API, including response format
RDF properties and classes
Provenance
See also Data on the Web Best Practices
But examples, if given at all, are often only an annex to the documentation
A "descriptive approach" is predominant
Why not attach structured data (name, description, URI etc.) directly to examples?
Today, this is no problem with annotation tools like hypothes.is (see the sketch after the following list)
Name
Description
Coverage
Use cases
Provenance
URI
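As a sketch of what such structured data could look like for one annotated example field, covering exactly the items above; every name and value here is illustrative, not lobid's actual vocabulary:

```python
# Illustrative structured annotation for one field of a live example
# record, carrying the items listed above. All values are made up.
annotation = {
    "name": "contribution",
    "description": "Agents that contributed to the resource and their roles",
    "coverage": "Present on most title records",
    "useCases": ["Build a list of works illustrated by a given person"],
    "provenance": "Derived from MAB2 fields during the transformation",
    "uri": "http://example.org/ontology#contribution",
}
```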
Contextualization of the documentation
Examples stay up to date, because live data is annotated
API users can give feedback via hypothes.is
Intuitive usage
Enables quicker and better understanding
SPARQL is nice for complex queries
For many use cases, a performant API is more reliable and convenient (see the example below)
Also, you can implement Linked Data Fragments (LDF) on top of that API
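For instance, a keyword search against the read-only API might look like this; the endpoint path, query parameter and response field name are assumptions here, not guaranteed by the lobid documentation:

```python
# Sketch: one HTTP GET against the read-only lobid API. Endpoint path,
# query parameter and the "totalItems" field are assumed for illustration.
import requests

resp = requests.get("http://lobid.org/resources/search",
                    params={"q": "open access"},
                    headers={"Accept": "application/json"})
resp.raise_for_status()
result = resp.json()
print(result.get("totalItems"), "hits")
```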
Reuse of existing vocabularies might be overrated
Rather add one level of indirection than use subproperties (see also this post)
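A sketch of that indirection, shown as a Python dict of the resulting JSON-LD: instead of minting a subproperty per role, one generic property points at an intermediate node that carries the role. Property names are illustrative:

```python
# Subproperty approach (avoided): {"illustrator": "Jane Doe"}
# Indirection: one generic "contribution" property plus a role node.
resource = {
    "id": "http://example.org/resources/123",
    "contribution": [{
        "agent": {"id": "http://example.org/agents/42", "label": "Jane Doe"},
        "role": {"id": "http://id.loc.gov/vocabulary/relators/ill",
                 "label": "Illustrator"},
    }],
}
```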
Creating nice JSON-LD isn't straightforward yet (especially with more complex data & blank nodes), and there are different ways of doing it
When something's missing, contribute to existing libraries instead of rolling your own
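One concrete pain point, sketched with pyld: the same triples can come out flat or nested depending on whether you merely convert or also frame, and blank nodes make the flat form hard to read. URIs here are illustrative:

```python
# Sketch: blank nodes make "nice" JSON-LD non-trivial. from_rdf yields a
# flat @graph; framing nests the blank node under the resource.
from pyld import jsonld

nquads = """
<http://example.org/r1> <http://example.org/contribution> _:b0 .
_:b0 <http://example.org/label> "Jane Doe" .
"""
flat = jsonld.from_rdf(nquads, {"format": "application/n-quads"})

frame = {
    "@context": {"contribution": "http://example.org/contribution",
                 "label": "http://example.org/label"},
    "@id": "http://example.org/r1",
}
print(jsonld.frame(flat, frame))  # nested: contribution -> {label: ...}
```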
Sometimes it's better to build a new house instead of fixing an old one
Ideally: publish well-formed, consistent and correct JSON-LD
More realistically: provide reliable and performant interfaces for getting data and updates (e.g. dumps + ResourceSync; see the sketch at the end)
Write API (BIBFRAME- or MARC-based)
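A sketch of the ResourceSync side, since that spec reuses the sitemap format: a consumer reads the resource list and fetches each member. The list URL is hypothetical:

```python
# Sketch: read a ResourceSync resource list (sitemap XML) and print the
# member URLs a harvester would download. The list URL is made up.
import xml.etree.ElementTree as ET

import requests

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
resp = requests.get("http://example.org/dataset/resourcelist.xml")
root = ET.fromstring(resp.content)

for loc in root.findall("sm:url/sm:loc", NS):
    print("would fetch:", loc.text)
```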