Metadata is an important part of the publishing sector. It’s the key information about publishers’ products – from titles, ISBNs, cover images, publication dates and prices, to richer information like abstracts or summaries, author bios, open access licenses and details of distribution arrangements. It encompasses bibliographic information about the book itself, but also the vital marketing collateral and commercial arrangements. Metadata is often communicated via ONIX, a standard data file format that publishers, retailers, libraries and intermediaries use across many countries. If you’re unfamiliar with ONIX, there’s a 15-minute briefing video available at https://tinyurl.com/3eyy3a9f. 

EDItEUR is a member-supported independent trade association and standards body that’s best known for developing, supporting and promoting the ONIX and Thema standards, ensuring our ‘metadata supply chain’ remains based on free-to-use, open standards. The latest versions of its standards specifications can be found on the EDItEUR website https://www.editeur.org/. 

 

Last time, I introduced ONIX, the most widely adopted of EDItEUR’s standards. It provides an international and multilingual framework for metadata describing books, other book-like products and some other product types that flow along the book supply chain. 

Few publishers or resellers will adopt a standard like ONIX simply because it is ‘a standard’. EDItEUR takes the view that any standard has to solve real-world business problems. So what are the benefits of adopting ONIX broadly across the book supply chain? 

On the one hand it’s simple: it’s about reducing the costs of duplicated data entry and data processing, via automated communication and data sharing within a publishing organization and across the supply chain. Adopting ONIX encourages data interoperability across diverse IT systems, enabling the growth of an ecosystem of off-the-shelf product management software. ONIX provides the opportunity to develop robust business processes for the creation and maintenance of metadata, and thus it builds a foundation for excellent metadata quality and consistency – and metadata accuracy, consistency and timeliness in turn promote a growth in sales. 

So – reduced costs and boosted sales. But beyond that, ONIX has a truly international approach, suitable for use across geographies, languages, scripts, currencies, and for all types of books, whether digital or physical, for audio, and for many other book-related product types. 

Having a widely adopted global standard also promotes a common understanding of metadata concepts and an accepted industry-wide lexicon, understood at the points of creation and of use. That consistent and stable understanding of the metadata reduces change management issues and widens the pool of staff who already understand it. EDItEUR’s robust governance provides continuity and protects investments in metadata, systems and staff training. 

  

ONIX is created and supported by expert practitioners, whether EDItEUR staff or EDItEUR members, and the standard both informs and is informed by national and international consensus and metadata best practices. 

A good example here might be the way that author names are stored and communicated in ONIX. At first sight, names might appear simple: just a given name and a family name might work in some parts of the world. But ONIX takes a global approach: there are personal names and corporate names. Corporate names might appear in more than one language (“International Publishers Association” and “União Internacional de Editores”), and personal names can be real or pseudonymous. Both types are subject to change over time, and can be written in different scripts. And personal names can’t be easily sorted into alphabetical order – a common requirement – unless broken down into parts (or reversed, as is library practice, which can compromise the display of the name). Therefore, best practice in ONIX is to use up to eight different data fields, to hold the key names (usually but not always the family name) plus various name parts that precede or follow the key names: 

  • <TitlesBeforeNames> – Dr., Prof., Sheik or other honorifics 
  • <NamesBeforeKey> – usually the Western-style given names 
  • <PrefixToKey> – d’, de la, von, etc. 
  • <KeyNames> – the only mandatory part of a name, and the part used first for sorting purposes 
  • <NamesAfterKey> – used particularly for Chinese, Hungarian or other names where the given names follow the family names 
  • <SuffixToKey> – Jr., III (the third), etc. 
  • <LettersAfterNames> – PhD, MD and other qualifications 
  • <TitlesAfterNames> – other titles 
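
As a rough illustration of how a data recipient might hold these eight parts, here is a minimal Python sketch. The class name PersonNameParts and the function display_name are hypothetical and are not part of ONIX; only the element names in the comments come from the list above. Joining the parts with single spaces is a simplifying assumption: real display rules may need commas before qualifications, or no space after a prefix such as d’.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonNameParts:
    """Hypothetical container mirroring the eight ONIX name-part elements."""
    key_names: str                              # <KeyNames>, the only mandatory part
    titles_before_names: Optional[str] = None   # <TitlesBeforeNames>
    names_before_key: Optional[str] = None      # <NamesBeforeKey>
    prefix_to_key: Optional[str] = None         # <PrefixToKey>
    names_after_key: Optional[str] = None       # <NamesAfterKey>
    suffix_to_key: Optional[str] = None         # <SuffixToKey>
    letters_after_names: Optional[str] = None   # <LettersAfterNames>
    titles_after_names: Optional[str] = None    # <TitlesAfterNames>


def display_name(name: PersonNameParts) -> str:
    """Assemble a display form by joining the parts that are present,
    in the order listed above (assumption: single spaces between parts)."""
    ordered = [
        name.titles_before_names,
        name.names_before_key,
        name.prefix_to_key,
        name.key_names,
        name.names_after_key,
        name.suffix_to_key,
        name.letters_after_names,
        name.titles_after_names,
    ]
    return " ".join(part for part in ordered if part)


# Western-style name: given names precede the key name.
western = PersonNameParts(
    key_names="Beethoven", names_before_key="Ludwig", prefix_to_key="van"
)

# Hungarian-style name: given name follows the family name.
hungarian = PersonNameParts(key_names="Kovács", names_after_key="János")

print(display_name(western))
print(display_name(hungarian))
```

Because only the key names are required, the same structure covers both a fully detailed name and a name with a single part.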

For scripts like Kanji without an inherent alphabetical order, a phonetic gloss can also be incorporated within each part of the name. 

Because the ONIX author name data is granular in this way, data sender and data recipient don’t have to agree on whether the prefix forms part of the key name (so every author with ‘von’ or ‘van de’ sorts under the letter V), or whether the prefix is ignored for sorting purposes – while the prefix can still easily be displayed immediately before the key names. Each data recipient can apply its own sorting rules. 
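
To make the point about recipient-specific sorting concrete, the following Python sketch sorts the same three names under two different policies. Everything here is illustrative: the Name tuple and the functions key_with_prefix, key_ignoring_prefix and display are hypothetical helpers rather than anything defined by ONIX, and case-folded string comparison stands in for whatever locale-aware collation a real system would use.

```python
from typing import NamedTuple


class Name(NamedTuple):
    """Hypothetical subset of the name parts, enough to show sorting."""
    names_before_key: str   # <NamesBeforeKey>
    prefix_to_key: str      # <PrefixToKey>, empty string if absent
    key_names: str          # <KeyNames>


def key_with_prefix(n: Name):
    """Policy A: the prefix counts, so 'van Gogh' files under V."""
    full_key = f"{n.prefix_to_key} {n.key_names}".strip()
    return (full_key.casefold(), n.names_before_key.casefold())


def key_ignoring_prefix(n: Name):
    """Policy B: the prefix is ignored, so 'van Gogh' files under G."""
    return (n.key_names.casefold(), n.names_before_key.casefold())


def display(n: Name) -> str:
    """Display form is the same under both policies: prefix before key names."""
    return " ".join(p for p in (n.names_before_key, n.prefix_to_key, n.key_names) if p)


authors = [
    Name("Vincent", "van", "Gogh"),
    Name("Jane", "", "Austen"),
    Name("Johann Wolfgang", "von", "Goethe"),
]

by_prefix = [display(n) for n in sorted(authors, key=key_with_prefix)]
ignoring_prefix = [display(n) for n in sorted(authors, key=key_ignoring_prefix)]

print(by_prefix)        # Austen, van Gogh, von Goethe
print(ignoring_prefix)  # Austen, Goethe, Gogh
```

The sender supplies one set of granular data, and each recipient chooses the key function that matches its own filing convention without changing how the name is displayed.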

Although the name structure above seems relatively complex (and there are further options not mentioned here), this structure (see the diagram), plus two simple options <PersonName> and <PersonNameInverted>, meet the requirements of the supply chain, are relatively well understood by ONIX users, are supported in widely available ONIX-compatible IT solutions, and have persisted in ONIX for two decades. Initially, the structure might appear a little daunting, but if the requirements are understood and the data structure is designed well, granular data is an investment that pays off with flexibility for reuse.  

Graham Bell 

Executive director, EDItEUR