Last edited by dcapillae
July 28, 2020 | History

Forging the guidelines

Let's start by listing some issues where it seems helpful to reach a decision one way or the other. In some cases these decisions might be set in stone already - in which case we should indicate how they're dealt with. There's no real order to this - if someone thinks it can be ordered nicely, please do!

Work titles

If translations aren't their own work (see below), which should be the title of the work?

Relevant questions

If the original-language one, how do we mark the English language one in a way that is useful to (what I guess is a majority of) members who are English speakers when they try to find works originally released in, say, Russian? "Мастер и Маргарита" probably won't say much to a random user trying to find "The Master and Margarita".

If the English one, what do we do with those works which haven't been translated into English? Furthermore, what do we do with users who, understandably, want to use the original title for works in their non-English native language?

Opinions / Comments

Tom M

Work title - it doesn't need to have just one if the I18N is done right. Everyone can have the title in their own language selected by their UI preferences. Strings get stored as a text/language tuple and each language can have one, even for single valued fields/properties.

Nicolás

I pretty much agree with Tom, but especially if i18n is done right and people can choose what language to use for titles, I do think it's useful to have a way to select the original/native title as a "primary" name/alias. That way, people who want to always see the "real" title can do that, and people who want to see the title in their own language can do it too.
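
To make the two comments above concrete, here is a minimal sketch of titles stored per language, with the original-language title kept as the "primary" one. The `WorkTitles` class, its field names and the two-letter language codes are assumptions for illustration, not the Open Library schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence


@dataclass
class WorkTitles:
    """Titles of one work keyed by language code (hypothetical structure)."""

    original_language: str
    by_language: Dict[str, str] = field(default_factory=dict)

    def original(self) -> Optional[str]:
        # The "primary" title: the one in the work's original language.
        return self.by_language.get(self.original_language)

    def display(self, preferred: Sequence[str] = ()) -> Optional[str]:
        # Return the first title available in the user's preferred languages,
        # falling back to the original title when none of them is present.
        for lang in preferred:
            if lang in self.by_language:
                return self.by_language[lang]
        return self.original()


titles = WorkTitles(
    original_language="ru",
    by_language={"ru": "Мастер и Маргарита", "en": "The Master and Margarita"},
)
print(titles.display(["en"]))  # user prefers English
print(titles.display(["et"]))  # no Estonian title stored, falls back to the original
```

A user who always wants the "real" title would simply get `original()`, while everyone else gets `display()` with their UI language preferences.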

Ben

I think the original title should be the work title, if available. In search results an English or preferred language title could be shown under/above the original title, so that users see that they are looking for a translation.

Hi-storian

The work title should be the original language title. Editions have their own title (in Librarian mode), and that title should be the translated language title. Sometimes different translations of the same work into the same language have different titles ... so you see how the Edition Title is the only practical solution.

LeadSongDog

I tend to agree with Hi-storian's view, but given that we are starting from a position of erroneously having many work records with slight variations in spelling of title or author, the merge behind this will be some time coming. Still, the earliest known version should be the objective. Whether that needs to be transliterated is a related issue. Note that WorldCat hasn't solved this either. They eventually resolve it by redirecting works to the earliest-assigned (lowest numbered) OCLCn.


Translations

Should different translations get their own work?

Relevant questions

If not, how can we group editions of the same translation in a useful way? It sounds like something that certainly would improve the usability of the data.

If yes, how do we link them to the main work? (since if they're unlinked, they're not very useful).

Opinions / Comments

Tom M

Translations are not separate works, but it's useful to have a "translation" record which records information about a specific translation (translator, date, target language) and collects together all the editions of that language.

Nicolás

Translations shouldn't be considered works, but they should exist between books and editions (as per work - expression - manifestation in FRBR).
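
A rough sketch of what a "translation" layer between works and editions could look like follows. All class and field names are hypothetical and the translator names are placeholders; this only illustrates the grouping idea, not an existing Open Library model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Edition:
    title: str


@dataclass
class Translation:
    """One translation of a work; it collects the editions of that translation."""

    language: str
    translator: Optional[str] = None
    editions: List[Edition] = field(default_factory=list)


@dataclass
class Work:
    original_title: str
    original_language: str
    translations: List[Translation] = field(default_factory=list)

    def editions_in(self, language: str) -> List[Edition]:
        # All editions in one target language, across every translation into it.
        return [
            edition
            for translation in self.translations
            if translation.language == language
            for edition in translation.editions
        ]


# Placeholder translators: two different English translations of the same work.
work = Work("Мастер и Маргарита", "ru")
first = Translation("en", "Translator A", [Edition("The Master and Margarita")])
second = Translation("en", "Translator B", [Edition("The Master and Margarita")])
work.translations += [first, second]
print(len(work.editions_in("en")))
```

Editions of the same translation stay grouped under their `Translation` record, and the link back to the main work is the containing `Work`.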

hjgr

If you edit edition details in librarian mode, you can already tick a checkbox to mark the edition as a translation, and provide the original title and language. We should use that.

Ben

hjgr is correct. The fields are rarely used though. I'm not sure if I like Tom's "translation grouping" or Nicolás's "expression" better. They seem not too different, but an expression could hold more distinctions. I rarely see translation dates in translated books, by the way.

Hi-storian

Agree with hjgr. As for "grouping", perhaps a simpler solution is to allow the user to sort the editions of a work by date (ascending or descending) or by language (then by date). That would create the "grouping" you're looking for without getting too complex on the code or making any changes to the database.
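
The sorting suggested above could be sketched roughly like this. The dictionary keys (`title`, `language`, `year`) and the sample editions are made up for the example; undated editions are placed last within each group, which is an assumption about the desired behaviour.

```python
from typing import Dict, List


def sort_editions(
    editions: List[Dict], by_language: bool = False, descending: bool = False
) -> List[Dict]:
    """Sort editions by date, optionally grouped by language first."""
    dated = [e for e in editions if e.get("year") is not None]
    undated = [e for e in editions if e.get("year") is None]
    dated.sort(key=lambda e: e["year"], reverse=descending)
    ordered = dated + undated
    if by_language:
        # Python's sort is stable, so the date order survives inside each language.
        ordered.sort(key=lambda e: e.get("language") or "")
    return ordered


# Made-up sample editions.
editions = [
    {"title": "A", "language": "eng", "year": 1995},
    {"title": "B", "language": "rus", "year": 1967},
    {"title": "C", "language": "eng", "year": 1980},
    {"title": "D", "language": "eng", "year": None},
]
print([e["title"] for e in sort_editions(editions, by_language=True)])
```

This gives a "grouping" by language purely at display time, without any change to the stored records.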

LeadSongDog

Yes. The user interface should move that out of the librarian mode to the main form. Also, the "original work" should be linked to a work ID, not a text string containing one version of the title.


Author names

The only guideline for author names I can find is to use natural order (one I certainly agree with, too). But some more are needed. The main one I see: Should the name for the author profile be the native name, or an adaptation? This is especially important for romanisations. For what it's worth, in MusicBrainz we decided to standardise on the native name of the author, with the rest as aliases.

Relevant questions

If the native name is used, how do we ensure people unfamiliar with it can easily find the author? Same for the Anglicised name.

Opinions / Comments

Tom M

As for work names (that is, "it doesn't need to have just one if the I18N is done right. Everyone can have the [name] in their own language selected by their UI preferences").

Nicolás

Again as for work names, I'd prefer a combination of that + a "primary" native name.

LeadSongDog

Linkage matters more than specific spelling. I've taken to throwing a VIAF link or a Wikipedia link into the bio field until the UI eventually supports that more directly.


Pseudonyms

Should pseudonyms share an author entry with the "legal name" author?

Opinions / Comments

Tom M

No opinion given (yet). Comment: Freebase breaks with what I understand to be library practice and collects all personal pseudonyms together as aliases for the main author record. Collective pseudonyms and house names are still cataloged separately. I think it's worth discussing whether pseudonyms should be cataloged separately or together with the main author.

Nicolás

I'd rather have pseudonyms as separate authors if and only if we can link them to their "legal names". If we can't add a way of linking them, fully merged sounds more useful than fully separate. Ideally, if fully merged, we'd have an option "Author: ThisGuy as Pseudonym"
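
As a sketch of the "separate but linked" option, each pseudonym could be its own author record pointing at the legal-name record. The `Author` class, the `pseudonym_of` field and the keys below are assumptions for illustration; the names are the placeholders used in the comment above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Author:
    key: str
    name: str
    pseudonym_of: Optional[str] = None  # key of the legal-name record, if any


def legal_author(authors: Dict[str, Author], key: str) -> Author:
    author = authors[key]
    return authors[author.pseudonym_of] if author.pseudonym_of else author


def credit(authors: Dict[str, Author], key: str) -> str:
    # Renders "ThisGuy as Pseudonym" for a pseudonym, or the plain name otherwise.
    author = authors[key]
    legal = legal_author(authors, key)
    return author.name if legal is author else f"{legal.name} as {author.name}"


def identities(authors: Dict[str, Author], key: str) -> List[str]:
    # The legal-name key followed by the keys of all linked pseudonyms.
    legal = legal_author(authors, key)
    linked = sorted(a.key for a in authors.values() if a.pseudonym_of == legal.key)
    return [legal.key] + linked


# Hypothetical keys and placeholder names.
authors = {
    "/authors/A1": Author("/authors/A1", "ThisGuy"),
    "/authors/A2": Author("/authors/A2", "Pseudonym", pseudonym_of="/authors/A1"),
}
print(credit(authors, "/authors/A2"))
```

With the link in place, a search on either record can be widened to `identities(...)`, so separate records do not have to mean separate results.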

Ben

Agree with Nicolás.


Authors in editions

Whatever the name we choose for the author entry, do we want to store the specific way the author is credited in an edition? (while still linking it to the same author entry). MusicBrainz's first answer to this was "no", which ended up turning into "yes" (it is now possible). To the average user, seeing "Мастер и Маргарита" by "Mikhail Bulgakov" or "The Master and Margarita" by "Михаил Афанасьевич Булгаков" might feel strange and somehow wrong. But one of the two will happen whatever we choose as the author's name. How do we solve this?

Opinions / Comments

Tom M

"Credited as" or equivalent can go on the edition record for completeness
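
One way a "credited as" value could sit on the edition record while still pointing at a single author entry is sketched below. The `EditionAuthor` class, its fields and the author key are hypothetical; which name is the canonical one is exactly the open question above, so the choice here is arbitrary.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EditionAuthor:
    """Link from an edition to an author entry, with an optional printed credit."""

    author_key: str
    credited_as: Optional[str] = None


def shown_name(entry: EditionAuthor, canonical_names: Dict[str, str]) -> str:
    # Prefer the name as printed on this edition, else the author entry's name.
    return entry.credited_as or canonical_names[entry.author_key]


# Hypothetical author key; the native name is used as the canonical one here.
canonical_names = {"/authors/EXAMPLE": "Михаил Афанасьевич Булгаков"}
english_edition = EditionAuthor("/authors/EXAMPLE", credited_as="Mikhail Bulgakov")
russian_edition = EditionAuthor("/authors/EXAMPLE")

print(shown_name(english_edition, canonical_names))
print(shown_name(russian_edition, canonical_names))
```

Both editions still resolve to the same author entry, so the mismatch between title language and author name only has to be solved at display time.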

Nicolás

I'd prefer having this but it's not so necessary, just a nice thing to have.

Ben

On Discogs, the Artist Name Variations help enormously with identification of music releases (although artists may be more creative with name variations than authors). I'm in favour of recording exact author credits.
This is what the "by statement" ("by-line") can be used for, found in the librarian mode just under contributors. I think in libraries it is used to add contributors too. A separate field could be used instead, if anyone feels strongly about keeping "[by] Bla Bla ; [trans. by] Another B. Bla ..." - wait, what am I saying? By-lines should eventually be removed completely. I want fields, not punctuation as field separators within fields :)

Hi-storian

Agree that the By line (in Librarian mode) would be the place for the name as it appears on that edition.


Printing runs

Do we want to store printing run info at all? I am not familiar with whether this data is seen as useful in libraries or not, but hopefully others will be!

Opinions / Comments

Tom M

Don't catalog

Nicolás

I have no problems with ignoring this completely.

Ben

It could be interesting for collectors, or perhaps college students who want to know in which print the errata were incorporated in the text (that could be a new edition too). If OL wants to attract users from these 'interest groups', more info is better (presentation of the info should make less important info less visible). Can be stored as notes for now, if user wants to.

Hi-storian

Agree with Ben that printing info has use in certain cases, but not in most cases. If a difference in printings is essential, I'd create a separate edition, with the printing info as a Note.


Other people

What should we do with non-authors? Currently we just show them as plain text (in tiny print). I'd argue at least translators should be more prominent, although not as much as the main author - and maybe get author pages as well. It might also be interesting to do something more than small plain text for audiobook narrators, who are fairly important for the audiobook. Also illustrators, who are usually more important than authors for kids' picture books and even for some adult books.

Opinions / Comments

Nicolás

All non-author people should ideally get their own entities. The Estonian libraries treat them in pretty much the same way as authors, and I see the LoC does the same at least for translators (although neither seems to have actual pages for people). If we don't give them proper entities, we should at least link the plain text stuff to searches like they do, so that it is possible to click on "TranslatorName, translator" and launch a search of all things translated by TranslatorName.

hjgr

Having a page for every person seems a good approach to me. The content could be presented differently depending on the functions a person has been given across the whole database. Plus: a person can have more than one function and still have one central page storing all related information.

Ben

All entities listed as contributors (maybe excluding those with a "thanks" role) should be real entities. Related issues are setting the type of an entity (person, pseudonym, organisation, conference?), discouraging users from entering more than the name (e.g. 'digitised by': "thanks to a donation from X, Y could digitise the whole collection" (example made up, but it happens)) and having a dropdown list that shows all possible entities.


ISBNs

Should we normalise ISBN10 to ISBN13, or store what the book has?

Relevant questions

Should we have code to remove unneeded characters and validate the ISBN?

Opinions / Comments

Tom M

Accept both 10 & 13, normalize to 13 with no punctuation for storage/search. Check & warn on check digit, but don't reject (because sometimes they're printed with bad check digits).

Nicolás

Accept both ISBN10 and ISBN13 as input. Possibly normalise to 13 (and certainly normalise to no punctuation) for storage, in any case make sure the search can automatically do the conversion and find the same entries when given either of the two. Check for check digit validity and ask for confirmation ("mark this checkbox to confirm this is correct" or the like), but allow them in if that's marked.

Hi-storian

The parsing of an ISBN has information value. ISBN10 is a historical format (and the only format available prior to the introduction of ISBN13). ISBN13 is the format going forward. As far as input goes, accept both ISBN10 and ISBN13. ISBN10 can be automatically converted into ISBN13, but not all ISBN13s have a corresponding ISBN10 (those assigned under the 979 prefix do not). If you decide to store ISBNs without dashes, you should still display them with dashes.
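
The approach described in the comments above (accept both formats, strip punctuation, normalise to ISBN13, check the check digit but only warn) could be sketched as follows. The function names are made up for this example. The check-digit arithmetic is the standard one for ISBN10 and ISBN13; re-inserting dashes for display needs the registration group range tables and is not covered here.

```python
import re
from typing import Optional, Tuple


def clean_isbn(raw: str) -> str:
    """Drop dashes, spaces and any other punctuation; keep digits and X."""
    return re.sub(r"[^0-9Xx]", "", raw).upper()


def isbn10_check_digit(first9: str) -> str:
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check_digit(first12: str) -> str:
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def to_isbn13(isbn10: str) -> str:
    body = "978" + isbn10[:9]
    return body + isbn13_check_digit(body)


def to_isbn10(isbn13: str) -> Optional[str]:
    # Only 978-prefixed ISBN13s have an ISBN10 form; 979 ones do not.
    if not isbn13.startswith("978"):
        return None
    body = isbn13[3:12]
    return body + isbn10_check_digit(body)


def normalize(raw: str) -> Tuple[str, bool]:
    """Return (ISBN13 without punctuation, whether the input check digit was valid).

    A bad check digit is reported rather than rejected, so the caller can
    warn or ask for confirmation.
    """
    isbn = clean_isbn(raw)
    if re.fullmatch(r"[0-9]{9}[0-9X]", isbn):
        return to_isbn13(isbn), isbn10_check_digit(isbn[:9]) == isbn[9]
    if re.fullmatch(r"[0-9]{13}", isbn):
        return isbn, isbn13_check_digit(isbn[:12]) == isbn[12]
    raise ValueError(f"not an ISBN10 or ISBN13: {raw!r}")


print(normalize("0-306-40615-2"))
print(normalize("978-0-306-40615-7"))
```

Because both inputs normalise to the same stored value, a search given either form finds the same entries. Note that when an ISBN10 with a bad check digit is converted, the resulting ISBN13 has a freshly computed check digit, so the warning flag has to be kept alongside it.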


URL linking

Should we have specific selectable "labels" for URLs? For example, at MusicBrainz we don't have free text labels at all, but a list of "types" of URLs from which to choose (a lot of which are autodetected when the user pastes a URL). For example, pasting a Wikipedia URL will do some cleanup on the URL to standardise it, automatically select the "Wikipedia" relationship and load it on the sidebar with a nice Wikipedia icon: http://musicbrainz.org/artist/f66b7fa3-1731-40ec-a2d9-710aa9e07c5d

While I don't see a reason to stop allowing free text labels, should we define a set of labels that can be detected and auto-filled from the URL, so that all links to Wikipedia/VIAF/whatever are shown in the same way instead of having various labels applied depending on the user? If we want them, which pages should we autodetect? (note also that explicit Wikipedia links would allow us to auto-load descriptions for books, if desired)
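
A minimal sketch of label autodetection that still allows free text follows. The `URL_TYPES` table and the function names are assumptions for illustration; which sites should actually be detected is the open question above.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical table of (label, host suffix) pairs to autodetect.
URL_TYPES = [
    ("Wikipedia", "wikipedia.org"),
    ("VIAF", "viaf.org"),
    ("MusicBrainz", "musicbrainz.org"),
]


def detect_label(url: str) -> Optional[str]:
    """Return a standard label if the URL's host matches a known site."""
    host = (urlsplit(url).hostname or "").lower()
    for label, suffix in URL_TYPES:
        if host == suffix or host.endswith("." + suffix):
            return label
    return None


def label_for(url: str, free_text: Optional[str] = None) -> Optional[str]:
    # A detected label wins; otherwise keep whatever the user typed.
    return detect_label(url) or free_text


print(label_for("https://en.wikipedia.org/wiki/The_Master_and_Margarita", "wiki page"))
print(label_for("https://example.com/review", "A review"))
```

Matching on the host suffix means every language edition of Wikipedia gets the same label, whatever the user typed.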

Opinions / Comments

Tom M

No opinion given (yet). Comment: Freebase uses a similar scheme to the one described for MusicBrainz. The identifier gets inserted into a template URL for external links to OpenLibrary, MusicBrainz, Wikipedia etc.

Nicolás

I think it's certainly useful to make it simpler for people to add links, and making sure they don't have to think about which label to use makes both the entry simpler and the data cleaner, so I see this one as a very nice benefit.

LeadSongDog

Getting the info matters most. Code can always come along later to match patterns, reformat links, etc.


Works within works

How should we deal with the (common) cases where works are contained in other works? This can go both over books (book series, which at least FRBR seems to call works) and under them (books of short stories, compilations of essays or articles). The latter case seems especially interesting - and brings up two problems.

Problem 1 is how to link each particular subwork with its author(s).

Problem 2 is how to link a specific subwork (like a short story by Poe) to all book-works in which it appears.

Opinions / Comments

Nicolás

Comment: This seems like a quite complicated case and one that would require large changes to the UI, but it's also really useful information to provide. For a reader, finding and reading Poe's Berenice shouldn't need to be any harder than finding and reading Arthur Gordon Pym just because only the latter is published in a standalone manner. The difference seems comparable to release groups vs. works in MusicBrainz: release groups are containers for all versions of an album, while works are the actual compositions. In this case, I'd see "book" as the equivalent of a release group, and "work" as an actual... well, literary work. And books as having "tracklists", except that as opposed to music albums, a majority of them would only contain one "track" / literary work (but some could have hundreds).

Karen C

I should note here that the FRBR community itself has not solved this issue, even after a great deal of study. I'd try to explain their proposed solution, but others have shown that it has problems.

We did talk about using the Table of Contents area for included works, but that would require adding a linking capability. The linking itself isn't terribly difficult, but making an easy-to-use user input interface for that has some challenges.

I consider the UI to be the lesser problem here - not because it's simple, but because it does sound doable. The main problem I see is the whole definition of work = book right now in OL - I'd say books are containers, and they often but not always contain only one work (I guess sometimes they contain less than one work, too?). Literary works themselves would seem to be a different kind of entity. Nicolás

Nicolás, that is exactly the problem that the FRBR folks took up. The problem comes about with the many-to-many aspect of containers with more than one work and works in more than one container. It requires a graph view of the book universe, and we aren't designed for that at the moment. I would love to see a design that has both books as containers (e.g. things that get an ISBN) and works as separate from, but linked to, such containers. FRBR does NOT take that view, but has the usual library cataloging emphasis on describing the containers the library owns.
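
The many-to-many relation between containers and works discussed above can be sketched as a small two-way index. The `Catalog` class and its method names are hypothetical, and the anthology titles are placeholders. It is only meant to show that both problems (subwork to authors, subwork to all containing books) reduce to link lookups once books and works are separate entities.

```python
from collections import defaultdict
from typing import Dict, List


class Catalog:
    """Books (containers) and literary works, linked many-to-many."""

    def __init__(self) -> None:
        self.authors_of_work: Dict[str, List[str]] = {}
        self._works_in_book: Dict[str, List[str]] = defaultdict(list)
        self._books_with_work: Dict[str, List[str]] = defaultdict(list)

    def add_work(self, work: str, authors: List[str]) -> None:
        # Problem 1: authorship hangs off the work, not the container.
        self.authors_of_work[work] = list(authors)

    def add_to_book(self, book: str, work: str) -> None:
        if work not in self.authors_of_work:
            raise KeyError(f"unknown work: {work}")
        if work not in self._works_in_book[book]:
            self._works_in_book[book].append(work)
            self._books_with_work[work].append(book)

    def contents(self, book: str) -> List[str]:
        # The book's "tracklist".
        return list(self._works_in_book.get(book, []))

    def books_containing(self, work: str) -> List[str]:
        # Problem 2: every container in which a given work appears.
        return list(self._books_with_work.get(work, []))


catalog = Catalog()
catalog.add_work("Berenice", ["Edgar Allan Poe"])
catalog.add_work("Arthur Gordon Pym", ["Edgar Allan Poe"])
catalog.add_to_book("Anthology A (placeholder)", "Berenice")
catalog.add_to_book("Anthology B (placeholder)", "Berenice")
catalog.add_to_book("Arthur Gordon Pym (standalone)", "Arthur Gordon Pym")
print(catalog.books_containing("Berenice"))
```

Most books would have a single entry in `contents(...)`, which keeps the common case as simple as it is today.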

Hi-storian

This question was brought up as /community/books. "Collective works" are distinct works in their own right. The Table of Contents is the way to go on this. Linking items that have been printed as separate works would be an interesting approach.

LeadSongDog

Various flavours of this come up: sequences of articles within multiple volumes of serials are very painful. In the basic case of compilations and anthologies for simplicity, I favour showing the editor(s) where the author would normally be, then providing any further info in the table of contents.

History

July 28, 2020 Edited by dcapillae Removed navigation menu
July 27, 2020 Edited by dcapillae highlighted active page in the navigation menu
July 27, 2020 Edited by dcapillae highlighted active page in the navigation menu
July 27, 2020 Edited by dcapillae Updated link to "Becoming a Librarian"
April 8, 2013 Created by Nicolás Tamargo de Eguren Long first draft