Preparations for updates to two EDItEUR standards are proceeding in parallel. Thema, our international and multilingual subject classification, is moving gradually towards version 1.7 on its regular biennial cycle, and the ONIX metadata framework is approaching publication of version 3.1.3, the third minor revision of release 3.1 since its initial introduction in March 2023.

For ONIX, three revisions in three years seems fast, but it’s a deliberate shift to ‘smaller’ but more frequent incremental updates. Each update adds functionality and meets new business requirements while maintaining full backwards compatibility. And since each enhancement to the standard specification is optional, each can be released ‘when it’s ready’. While the previous two updates – plus a great deal of other work during 2025 – were triggered by changes in the legal environment (AI opt-outs, GPSR and EUDR), version 3.1.3 is not. It provides extra flexibility around authorship of epitext (reviews, endorsements and the like) that is carried in the ONIX record. The clearest use case is that of a scholarly publisher aiming to provide not just the name of a reviewer of a book but also their ORCID, professional affiliation and a brief description alongside their review. This extra metadata might initially be inserted into the ONIX so it can be transferred to the publisher’s own website, but in time, it’s likely to be displayed in online reseller stores too. The 3.1.3 update makes a minor addition to <NameAsSubject> too. Software developers might note that this makes <TextSource> and <NameAsSubject> much more obviously subsets of the structure of <Contributor>, so that IT systems can potentially use the same data structure to hold names of book creators, names of people the book is about, names of people who have commented on the book, and names of participants in events about the book.
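For developers, the idea of one shared data structure for all four kinds of name might look something like the following. This is only an illustrative sketch, not part of the ONIX specification: the class name, field names and context labels are all hypothetical, and the ORCID shown is a placeholder value.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class NameContext(Enum):
    """Where a name is used in the record (labels are hypothetical)."""
    CONTRIBUTOR = "contributor"              # a creator of the book
    NAME_AS_SUBJECT = "name_as_subject"      # a person the book is about
    TEXT_SOURCE = "text_source"              # the author of a review or endorsement
    EVENT_PARTICIPANT = "event_participant"  # a participant in an event about the book


@dataclass
class NamedPerson:
    """One structure reused for every context in which a name can appear."""
    context: NameContext
    person_name: str
    orcid: Optional[str] = None        # optional identifier
    affiliation: Optional[str] = None  # optional professional affiliation
    description: Optional[str] = None  # optional brief description

    def display_line(self) -> str:
        """Return the name, followed by affiliation and ORCID where present."""
        extras = [part for part in (self.affiliation, self.orcid) if part]
        if not extras:
            return self.person_name
        return f"{self.person_name} ({', '.join(extras)})"


# The same class holds a reviewer with rich details and a creator with only a name.
reviewer = NamedPerson(
    context=NameContext.TEXT_SOURCE,
    person_name="A. Reviewer",
    orcid="0000-0002-1825-0097",  # placeholder ORCID for illustration
    affiliation="Example University",
    description="Reader in book history",
)
author = NamedPerson(context=NameContext.CONTRIBUTOR, person_name="B. Author")
```

Because only the context label differs, code that stores, displays or matches names would not need separate handling for creators, subjects, reviewers and event participants.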

At the time of writing, the above ONIX revision is a proposal, but it’s likely to be ratified by the ONIX International Steering Committee and publication is planned for late March. EDItEUR members have early access to the draft 3.1.3 specification and XML schemas upon request.

A warning for any remaining users of ONIX 2.1 – this version has been obsolete for several years, and the old documentation will be removed from the EDItEUR website at the end of March 2026. The codelist browser for ONIX 2.1 will also be removed.

Thema revisions take much longer to develop, so that standard has a much clearer two-yearly update cycle. Progress on the next version, 1.7, is well advanced, and a working group of the International Steering Committee has been reviewing suggestions for new Thema subject categories and qualifiers – well over 1000 in all – whittling them down based on criteria agreed in advance, and leaving perhaps 50–75 proposals that will make the final draft. This working group process is vital because of the sheer number of suggestions. The working group’s international composition also builds knowledge and shares expertise, and it has resulted in a mature and globally relevant industry standard. Again, EDItEUR members will have early access to late-stage version 1.7 drafts upon request.

EDItEUR will be at the London Book Fair, sharing booth 5G101 with the Book Industry Study Group. E-mail info@editeur.org if you want to set up a meeting, or less formally, just drop by for a chat about ONIX, Thema, EDItX, ISNI, DOI, or just publishing and standards in general.

In contrast to extensions of the standards themselves, as described above, there are also extensions of the scope of use of the standards – brand new use cases. One such is long-term secure preservation of scholarly publications in so-called ‘dark archives’ like CLOCKSS and Portico. These are repositories dedicated to preserving digital content that is inaccessible in normal times but can be accessed after some catastrophic event. It has long been the case that the preservation location of a book can be recorded in its ONIX metadata, but in addition, the metadata itself can be preserved ‘inside’ the book.

For EPUB files, embedding ONIX metadata within the EPUB package to be preserved is relatively straightforward – within the <metadata> section of the EPUB package document, provide a <link> that carries the relative URL of the ONIX metadata file. There is an example illustrating this in the W3C’s latest EPUB 3.3 Recommendation (W3C-speak for a standard) at https://www.w3.org/TR/epub-33/#example-identifying-a-record-type-via-a-property. Although this example uses an ONIX file that is outside the EPUB itself (the link uses an absolute URL), the previous example shows a metadata record embedded in the EPUB package itself via a relative URL – obviously more appropriate for the preservation use case.
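As a rough illustration of the EPUB approach, the Python sketch below parses a trimmed-down package document and reports each link to an ONIX record, noting whether the link is relative (a record embedded in the package) or absolute (a record held elsewhere). It assumes the link uses rel="record" with properties="onix", as in the W3C example. The sample package document, the ISBN, the file path meta/onix.xml and the function name find_onix_links are all invented for this illustration, and the sample is not a complete, valid package document.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple
from urllib.parse import urlparse

OPF_NS = "http://www.idpf.org/2007/opf"  # namespace of the EPUB package document

# A cut-down package document for illustration only (manifest and spine left empty).
SAMPLE_OPF = """
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:isbn:9780000000002</dc:identifier>
    <dc:title>Example title</dc:title>
    <dc:language>en</dc:language>
    <link rel="record" href="meta/onix.xml"
          media-type="application/xml" properties="onix"/>
  </metadata>
  <manifest/>
  <spine/>
</package>
"""


def find_onix_links(opf_xml: str) -> List[Tuple[str, bool]]:
    """Return (href, is_embedded) for each ONIX record link in the metadata.

    is_embedded is True when the href is a relative URL, meaning it points
    to a file inside the EPUB package rather than to an external location.
    """
    root = ET.fromstring(opf_xml.strip())
    results = []
    for link in root.iterfind(f"{{{OPF_NS}}}metadata/{{{OPF_NS}}}link"):
        rels = link.get("rel", "").split()          # space-separated values
        props = link.get("properties", "").split()  # space-separated values
        href = link.get("href", "")
        if "record" in rels and "onix" in props and href:
            parsed = urlparse(href)
            is_embedded = not parsed.scheme and not parsed.netloc
            results.append((href, is_embedded))
    return results


for href, is_embedded in find_onix_links(SAMPLE_OPF):
    where = "embedded in the package" if is_embedded else "outside the package"
    print(f"ONIX record at {href} ({where})")
```

A relative href is resolved against the location of the package document, so in this sketch the ONIX file would sit in a meta folder alongside it.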

For PDF files – still used for a significant proportion of scholarly books – embedding ONIX metadata is a little trickier. Here credit must go to the PDF Association, which has recently provided technical guidance that has now been incorporated into an ONIX application note, ‘Embedding ONIX metadata in PDFs’. The impetus behind this was the provision of rich metadata including accessibility information within the PDF, primarily for library use, but it’s equally suitable for preservation.

In either case, PDF or EPUB, the ONIX must be a complete, valid ONIX file, release 3.0.8 or later, typically containing a single Product record (that of the book being preserved). The PDF application note provides other guidance on the nature of the ONIX itself, and this applies equally to EPUBs. And of course, the dynamic nature of ONIX – metadata records are always subject to later updates – means that any ONIX embedded for long-term preservation purposes will age. Clearly, core attributes like title, authorship, ISBN or publication date are fixed at publication, but statuses, prices, rights and peritext can change post-pub. Some of the latter could be omitted from the ONIX, but it must be recognised that preserved metadata is a snapshot of the metadata taken at or a little before the moment of preservation. Neither technique requires particular action from the preservation service, as the embedded metadata is simply a part of the preserved EPUB or PDF file itself.

This is a good example of expansion of ONIX into a new use case – the long-term preservation of the scholarly record – without requiring changes to the ONIX standard itself.

Metadata management is an important aspect of publishing. It’s the key information about a publisher’s products that needs to be shared with the remainder of the supply chain – from the basic title, author, ISBN, cover image, publication date and price, to richer information like abstracts or summaries, contributor biographies, open access licenses and details of international distribution arrangements. Metadata is often communicated via ONIX, a standard data file format that publishers, retailers, libraries and various intermediaries use, across many countries. It encompasses bibliographic information about the book itself, but also the vital marketing collateral and commercial arrangements. If you’re unfamiliar with ONIX, there’s a 15-minute video briefing available at https://tinyurl.com/3eyy3a9f.

EDItEUR is an independent member-supported trade association and standards body that’s best known for developing, supporting and promoting the ONIX and Thema standards, ensuring our ‘metadata supply chain’ remains based on free-to-use, open standards. The latest versions of its standards specifications can be found on the EDItEUR website at https://www.editeur.org/.