Recap:

Metadata management is an important aspect of publishing. Metadata is the key information about a publisher’s products that needs to be shared with the rest of the supply chain – from the basic title, author, ISBN, cover image, publication date and price, to richer information like abstracts or summaries, author bios, open access licenses and details of distribution arrangements. It is often communicated via ONIX, a standard data file format used by publishers, retailers, libraries and various intermediaries across many countries. ONIX encompasses bibliographic information about the book itself, but also the vital marketing collateral and commercial arrangements. If you’re unfamiliar with ONIX, there’s a 15-minute briefing video available at https://tinyurl.com/3eyy3a9f.

EDItEUR is a member-supported independent trade association and standards body that’s best known for developing, supporting and promoting the ONIX and Thema standards, ensuring our ‘metadata supply chain’ remains based on free-to-use, open standards. The latest versions of its standards specifications can be found on the EDItEUR website https://www.editeur.org/.


In this series, Graham Bell, Executive Director of EDItEUR, shares occasional updates on EDItEUR’s work and the standards it manages.

In the previous article, I explained how a standard data framework like ONIX can be extended to meet new challenges as the book business evolves, using the advent of OA monograph publishing as an example. Over the course of 2024, there were two new evolutions of ONIX, each ratified by an international steering committee convened by EDItEUR and then published, in March and October respectively. Many IT system vendors have already implemented one or both of these updates in the systems they supply to publishers.

Each of the updates builds on the preceding version – from release 3.1 through revisions 3.1.1 and 3.1.2, each is a straightforward addition to support a new data requirement.

The first update, 3.1 revision 1 in March 2024, added new data fields to support EU-style opt-outs from the copyright exception that allows text and data mining (TDM) of digital products. One of the key current uses of TDM is for AI training of large language models (LLMs), and in many jurisdictions, TDM of published content is allowed under a copyright exception. When in place, opt-outs proscribe the use of the product’s content for TDM, except for research purposes. (Note that the opt-outs apply to the EU only, and an equivalent opt-out should wherever possible also be embedded in the content itself using the W3C’s TDM reservation protocol [https://www.w3.org/community/reports/tdmrep/CG-FINAL-tdmrep-20240510/].)

Earlier versions of ONIX can express these opt-outs where they apply to the content of the product itself. ONIX 3.1.1 enables opt-outs that relate to the collateral material that supports the product, collateral that is embedded within the ONIX metadata itself – for example text of a marketing description or a front cover image (the latter could be used to train a different type of generative AI). These opt-outs can apply to collateral for both digital and physical products.

The most recent update of ONIX, revision 3.1.2 published at the end of October, adds another new capability, inclusion of full postal addresses, to fulfil a requirement related to the EU’s new General Product Safety Regulation. This requires that the postal address of a party within the EU responsible for safety aspects of the product be made available within the supply chain, as well as the normal digital contact details. The responsible party might be the publisher or an importer. Hitherto, ONIX had not dealt with postal addresses, so this required the addition of new data fields within the standard.

The relevant product safety contacts, with personal or departmental phone, e-mail and now postal addresses, can be included in a structure that could already carry details of other types of contacts (for example, contacts for accessibility requests, CIP, permissions and so on):


<ProductContact>
<ProductContactRole>10</ProductContactRole> <!-- safety contact -->
<ProductContactName>Mondadori Libri SpA</ProductContactName>
<ContactName>Giacinta Zampa</ContactName>
<EmailAddress>g.zampa@mondadorilibri.it</EmailAddress>
<StreetAddress>Via Mondadori 1</StreetAddress>
<LocationName>Segrate</LocationName>
<PostalCode>MI 20090</PostalCode>
<CountryCode>IT</CountryCode>
</ProductContact>


(The above example – the contact name and e-mail are fictitious – comes from EDItEUR’s ONIX Implementation and Best Practice Guide [DOI:10.4400/ejtx], which contains many realistic examples of ONIX usage and is a key resource for anyone implementing ONIX 3.0 or 3.1. The tags from <StreetAddress> to <CountryCode> are new in ONIX 3.1.2.)
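
For a recipient of ONIX data, the new address fields are read like any other element in the composite. The following is a minimal sketch in Python, assuming the fragment above is processed on its own and without XML namespaces (a real ONIX message may declare one). The function name safety_postal_address and the constants are hypothetical, introduced only for illustration; the tag names and the role code 10 are taken from the example above.

```python
import xml.etree.ElementTree as ET

# The composite from the example above, as a standalone fragment.
# Assumption: no XML namespace is declared; a real ONIX message may declare one.
FRAGMENT = """
<ProductContact>
  <ProductContactRole>10</ProductContactRole>
  <ProductContactName>Mondadori Libri SpA</ProductContactName>
  <ContactName>Giacinta Zampa</ContactName>
  <EmailAddress>g.zampa@mondadorilibri.it</EmailAddress>
  <StreetAddress>Via Mondadori 1</StreetAddress>
  <LocationName>Segrate</LocationName>
  <PostalCode>MI 20090</PostalCode>
  <CountryCode>IT</CountryCode>
</ProductContact>
"""

SAFETY_CONTACT_ROLE = "10"  # role code used for the safety contact in the example
ADDRESS_TAGS = ("StreetAddress", "LocationName", "PostalCode", "CountryCode")


def safety_postal_address(contact):
    """Return the address lines of a safety contact, or None for other roles.

    Hypothetical helper, not part of any ONIX toolkit.
    """
    if contact.findtext("ProductContactRole") != SAFETY_CONTACT_ROLE:
        return None
    lines = [contact.findtext("ProductContactName")]
    lines += [contact.findtext(tag) for tag in ADDRESS_TAGS]
    # Skip absent elements: records that predate 3.1.2 carry no address tags.
    return [line.strip() for line in lines if line]


contact = ET.fromstring(FRAGMENT.strip())
address = safety_postal_address(contact)
print("\n".join(address))
```

Skipping absent elements rather than raising an error means the same code can handle records that carry only the older phone and e-mail details.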

To ensure you’re informed when EDItEUR releases new revisions of ONIX, join the ONIX mailing list (send a blank e-mail to onix+subscribe@groups.io). The list also carries discussion of technical queries, and tips for ONIX best practice. Your inbox will not be flooded with hundreds of messages, as there are usually only one or two per week.

The latest Thema update, version 1.6, was two years in the making. A working group of EDItEUR members and invited experts was convened by EDItEUR to consider each of several hundred proposals for new categories. Proposals were judged on criteria including distinctiveness, whether the meaning of a proposed category could already be expressed by combining existing Thema categories and qualifiers, whether the proposed new concept would be understood globally, and the likelihood of significant use of any new category. After the working group winnowed the suggestions down, the proposals were circulated for comments to Thema user groups around the world, then ratified by an international steering committee to ensure broad acceptance of the revisions.

In all, there are around 130 new subject categories and over 400 new qualifiers, but as with all previous Thema updates, the new revision is fully backwards-compatible with all previous versions – no existing category has changed its meaning, although there are a few minor tweaks of exact wording to improve clarity and aid translation. So if you’re already using Thema, your backlist does not need to be re-categorised, except where you may wish to take advantage of those extra new categories.

Some of the key additions include:

  • qualifiers for ‘the literature of a particular place’ (these form groupings of books that are part of the ‘culture’ of a place, unrelated to the setting of a story, the nationality of the author, the language of the writing or the country of publication)
  • expansion of the range of popular fiction genres, most particularly within romance 
  • new qualifiers to highlight books about or in support of individual UN Sustainable Development Goals (SDGs – see https://sdgs.un.org/goals)
  • further improvements to the precise wording of various headings and notes in support of diversity, equity and inclusion (DEI) and decolonisation of the scheme
  • more detailed geographical qualifiers for coverage of Ukraine, Australia, Canada and other countries

Here’s an example showing how one of the new subject codes can be included in an ONIX product record:


<Subject>
<MainSubject/>
<SubjectSchemeIdentifier>93</SubjectSchemeIdentifier> <!-- Thema -->
<SubjectSchemeVersion>1.6</SubjectSchemeVersion>
<SubjectCode>5YS-UN-D</SubjectCode>           <!-- about SDG goal 4 -->
</Subject>
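
For anyone generating ONIX programmatically, the same composite can be assembled in a few lines. The following is a minimal sketch in Python using only the standard library. The helper name thema_subject and its parameters are hypothetical, and the scheme identifier, version and code values are simply copied from the example above rather than looked up in any code list.

```python
import xml.etree.ElementTree as ET


def thema_subject(scheme_id, version, code, main=False):
    """Build a <Subject> composite in the element order shown in the example.

    Hypothetical helper for illustration; it does not validate the code
    against the Thema code lists.
    """
    subject = ET.Element("Subject")
    if main:
        ET.SubElement(subject, "MainSubject")  # empty flag element
    ET.SubElement(subject, "SubjectSchemeIdentifier").text = scheme_id
    ET.SubElement(subject, "SubjectSchemeVersion").text = version
    ET.SubElement(subject, "SubjectCode").text = code
    return subject


# Values copied from the example above.
subject = thema_subject("93", "1.6", "5YS-UN-D", main=True)
xml_text = ET.tostring(subject, encoding="unicode")
print(xml_text)
```

Passing the scheme version as a parameter makes it straightforward to keep emitting "1.5" during the transitional period and switch to "1.6" once trading partners are ready.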


There is a list of all the added categories on the EDItEUR website (https://www.editeur.org/151/Thema/#New), and the Thema online category browser (https://ns.editeur.org/thema) has been updated. The categories, headings and notes are freely available in Excel, XML, JSON and readable HTML formats from https://www.editeur.org/151/Thema/#Code_lists. To ensure you’re informed when EDItEUR releases new revisions of Thema, and when new or updated translations become available, join the Thema mailing list (send a blank e-mail to thema+subscribe@groups.io).

It will inevitably take a few months for the supply chain as a whole to update to version 1.6, so in the meantime, the previous version of Thema, v1.5 – which many organisations will continue to use for a transitional period – is now at https://ns.editeur.org/thema15.

That wraps up a busy year of new releases for the EDItEUR ONIX and Thema standards, but if after evaluating Thema 1.6 you have suggestions for new subject categories or qualifiers, let us know – we’re already collecting suggestions for version 1.7!