Metadata is an important part of the publishing sector. It’s the key information about publishers’ products – from titles, ISBNs, cover images, publication dates and prices, to richer information like abstracts or summaries, author bios, open access licenses and details of distribution arrangements. It encompasses bibliographic information about the book itself, but also the vital marketing collateral and commercial arrangements. Metadata is often communicated via ONIX, a standard data file format that publishers, retailers, libraries and intermediaries use across many countries. If you’re unfamiliar with ONIX, there’s a 15-minute briefing video available at https://tinyurl.com/3eyy3a9f.

EDItEUR is a member-supported independent trade association and standards body that’s best known for developing, supporting and promoting the ONIX and Thema standards, ensuring our ‘metadata supply chain’ remains based on free-to-use, open standards. The latest versions of its standards specifications can be found on the EDItEUR website https://www.editeur.org/.

 

ONIX is the most widely adopted and used of EDItEUR’s standards. But what does it look like, and what is it used for?

ONIX is a message, sent from one computer to another, that’s used to keep two databases of book metadata in sync. The originating database, often the publisher’s database, is the message sender, and the recipient might be a retailer, a library, or another industry service provider or intermediary such as a ‘books in print’ service or an ISBN agency. Each of these organisations holds rich information about books and book-related products, and the idea is that any changes in the metadata managed by the publisher should be reflected at the retailer or in the databases of the other recipients. The sending and receiving of messages is arranged in a metadata supply chain.

The data begins with the publisher’s product planning. Planning to publish in a year or more? Begin collating data about the planned product now. What form will the book take: hardcover, softcover, or perhaps an e-book or digital audio download? What’s the working title? Who’s the primary author? Will other contributors be involved? What publication date are you aiming for, and what price have you used in your financial modelling? With just that information, you’ve got the bare minimum you need for an ONIX ‘product record’ that can be sent out to your trading partners.

 

<ProductForm>BC</ProductForm>

 

That one-line snippet of ONIX data tells the recipient your planned product is a softcover – recipients know the form of the product can always be found between the <ProductForm> and </ProductForm> tags. And BC is a code taken from a controlled vocabulary that lists all the key product forms (see https://ns.editeur.org/onix/en/150). ONIX data is tightly structured like this, so that data handling can be highly automated. And the codes are used to give the data a measure of language-independence – while BC means softcover or paperback in English, it means Livre broché in French, Libro in brossura in Italian and Pehmeäkantinen kirja in Finnish.
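To make the automation point concrete, here is a minimal Python sketch of how a recipient might read the snippet and display the code in a local language. The function name and the lookup table are hypothetical, and the table holds only the four labels quoted above; a real system would load the complete code list published by EDItEUR.

```python
import xml.etree.ElementTree as ET

# Labels for code BC exactly as quoted in the article. This table is an
# illustrative stand-in for the full controlled vocabulary, not a copy of it.
PRODUCT_FORM_LABELS = {
    "BC": {
        "en": "Softcover or paperback",
        "fr": "Livre broché",
        "it": "Libro in brossura",
        "fi": "Pehmeäkantinen kirja",
    },
}


def describe_product_form(snippet: str, language: str) -> str:
    """Return a human-readable label for a <ProductForm> snippet (hypothetical helper)."""
    element = ET.fromstring(snippet)
    if element.tag != "ProductForm":
        raise ValueError(f"expected <ProductForm>, got <{element.tag}>")
    code = (element.text or "").strip()
    # The code is the same for every recipient; only the displayed label differs.
    return PRODUCT_FORM_LABELS[code][language]


snippet = "<ProductForm>BC</ProductForm>"
for language in ("en", "fr", "it", "fi"):
    print(language, describe_product_form(snippet, language))
```

The sender never transmits the words ‘softcover’ or ‘Livre broché’. Each recipient maps the shared code to whatever label suits its own users.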

Not every chunk of data is coded, of course – other tags carry text, dates or plain numbers. And not all tags are as simple as <ProductForm>. It might take ten or more tags to describe a contributor well, with not just their name but a detailed role, locations they are associated with, a short biography, links to their socials, birth and death dates, and an identifier like an ISNI (https://isni.org) to avoid any confusion with another contributor of a similar name. The latest version of ONIX is release 3.1, and a 3.1.1 revision is likely to be released this April. Release 3.1 has around 500 tags in all, able to convey almost any aspect of your product that is commercially relevant – although a typical product might use only 150 or so.
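As a rough illustration of how several tags nest to describe one contributor, the Python sketch below builds a small contributor block. The tag names and the codes A01 (author) and 16 (ISNI) follow my understanding of ONIX 3 and should be checked against the current specification and code lists. The helper function, the contributor’s name, the all-zero identifier and the biography are invented placeholders.

```python
import xml.etree.ElementTree as ET


def build_contributor(sequence, role_code, name, isni=None, biography=None):
    """Build a simplified contributor block (hypothetical helper, not a full ONIX composite)."""
    contributor = ET.Element("Contributor")
    ET.SubElement(contributor, "SequenceNumber").text = str(sequence)
    ET.SubElement(contributor, "ContributorRole").text = role_code
    if isni:
        # An identifier distinguishes this person from others with a similar name.
        identifier = ET.SubElement(contributor, "NameIdentifier")
        ET.SubElement(identifier, "NameIDType").text = "16"  # assumed code for ISNI
        ET.SubElement(identifier, "IDValue").text = isni
    ET.SubElement(contributor, "PersonName").text = name
    if biography:
        ET.SubElement(contributor, "BiographicalNote").text = biography
    return contributor


contributor = build_contributor(
    sequence=1,
    role_code="A01",                # assumed code for 'by (author)'
    name="Jane Example",            # placeholder name
    isni="0000000000000000",        # placeholder, not a real ISNI
    biography="Hypothetical author used for illustration.",
)
ET.indent(contributor)              # pretty-printing; requires Python 3.9 or later
print(ET.tostring(contributor, encoding="unicode"))
```

Even this cut-down version uses eight tags, and it leaves out locations, dates and links, which is how a well-described contributor quickly reaches ten or more.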

Plans change, of course. If you distribute details of your planned product to your supply chain partners several months in advance of its publication, then inevitably you’ll need to update that metadata by sending another ONIX message. Your working title becomes the final title, the planned publication date is postponed, or you incorporate press reviews into your records. So all ONIX metadata is subject to later updates, even after publication, as new data sent out replaces the older data already received. Don’t wait until you’re sure the metadata is ‘final’. ONIX data is expected to be dynamic in this way, changing throughout the lifecycle of the product.
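The ‘new data replaces old’ behaviour can be sketched in a few lines of Python. This is a simplified model rather than a description of any real recipient system: it assumes each record carries a stable reference chosen by the sender, and that a later record with the same reference replaces the earlier one in full. The class, the function, and the reference, titles and dates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductRecord:
    record_reference: str   # assumed stable key chosen by the sender
    title: str
    publication_date: str   # YYYYMMDD
    product_form: str


def apply_message(database: dict, message: list) -> None:
    """Apply one message: each record replaces any earlier record with the same key."""
    for record in message:
        database[record.record_reference] = record


retailer_db = {}

# Early announcement, months ahead of publication.
apply_message(retailer_db, [
    ProductRecord("example.com.0001", "Working Title", "20260301", "BC"),
])

# Later update: the title is finalised and the publication date postponed.
apply_message(retailer_db, [
    ProductRecord("example.com.0001", "Final Title", "20260615", "BC"),
])

print(retailer_db["example.com.0001"])
```

Because the update carries the whole record, the recipient never has to work out what changed. It simply stores the latest version it has received.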

Why collect so much data? The aim is to disseminate your product metadata widely – and to keep it up-to-date – for myriad different purposes, aimed both at increasing the efficiency of the supply chain itself and at delivering rich bibliographic information, marketing collateral and commercial details to help customers find and buy the books they need. Better data reduces costs and increases revenues.