So I’ve been not blogging for a while–I managed to arrange my job so I could devote myself to some serious software development, and have been reminded of both what I liked AND what I didn’t about how obsessive I can get about coding. I get really caught up in it. Hopefully some news on what I’ve coded soon.
But meanwhile, I’ve been wanting for a while to write about authority control and identifiers, a topic I have written about before. There are some points I’ve really wanted to make, but I’ve had a bit of writer’s block on it, because it is so hard to talk clearly on this subject—it’s hard to even think clearly on this subject. But I think it’s crucial, and I think there are some important things to be said, that I’m getting a bit clearer thinking about and saying.
After trying to figure out how to say these things clearly, I decided that first we need to establish some basic agreement about the purpose of authority control. I’m sorry that this ends up being so lengthy, but I think it’s necessary to be clear.
The Purpose of Authority Control
The purpose of authority control is to make sets of objects. The typical library examples are the set of all works written by author X; the set of all ‘editions’ (aka ‘versions’, aka expressions/manifestations in FRBR terminology) of a given work; the set of all works representing a given subject. The need to assemble these traditional sets is expressed (not entirely clearly) in the current version of our traditional cataloging principles as: “to locate… all resources belonging to the same work… all works and expressions of a given person, family, or corporate body… all resources on a given subject”. Of course, the sets to be assembled may go beyond these traditional primary ones: “all resources defined by other criteria”.
So that is a fundamental purpose of cataloging, according to the principles.
But why should we also consider this the primary purpose of authority control? Because you need authority control to accomplish this goal with the degree of accuracy we want: to guard against things ending up in the same set that ought to be in different sets (the problem of dealing with polysemy), and to guard against things ending up in different sets that ought to be in the same set (the problem of dealing with synonymy).
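To make those two failure modes concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the titles, the names, and the `person-1`/`person-2` tags standing in for "which person the cataloger knows is meant". It is not how any real catalog stores its data.

```python
from collections import defaultdict

# Hypothetical records: (title, name as transcribed, person actually meant).
# The third field is what a cataloger knows; an uncontrolled index does not.
records = [
    ("Book A", "Smith, John", "person-1"),
    ("Book B", "Smith, J.", "person-1"),    # same person, variant form of name
    ("Book C", "Smith, John", "person-2"),  # different person, same name
]

def group_by(records, key_index):
    """Build sets of titles keyed by the chosen field of each record."""
    sets = defaultdict(set)
    for record in records:
        sets[record[key_index]].add(record[0])
    return dict(sets)

uncontrolled = group_by(records, 1)  # sets formed from the raw name strings
controlled = group_by(records, 2)    # sets formed from the identity meant

print(uncontrolled)
print(controlled)
```

Grouping on the raw string puts Book A and Book C in one set (polysemy) and leaves Book B in a set of its own (synonymy). Grouping on the controlled value gives the sets we actually wanted.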
So far this should be fairly uncontroversial. This is indeed the traditional understanding of the purpose of authority control, right? But it’s important to say it clearly.
Another way of describing the purpose of authority control
There’s another way of describing this purpose.
We can also say that the purpose of authority control is to establish unambiguous relationships between entities–or in practice, it’s really clearer to say between our records for given entities. For instance, to establish a precise and unambiguous relationship from (the record for) a work to (the record for) an author; or from (the record for) a version of a work to (the record for) the work as a whole; or from (the record for) a work to (the record for) a subject.
It is important to understand that this is not a second purpose of authority control, it’s in fact just another way of saying the same thing, using different language to describe the same thing. The two ways of describing this purpose are in fact ‘logically’, ‘mathematically’, or ‘semantically’ identical.
“To put things into sets” <=> “to create relationships”
“To create the set of (the records for) all works by a given author” <=> “To establish a relationship between (the record for) an author and (the record for) each work by that author”
“To create the set of all editions of a given work” <=> “To establish a relationship between (the record for) a given work and (the records for) each edition of that work”.
Etc. If you’ve accomplished the left-side goals above, you’ve necessarily accomplished the right-side ones too, because they are logically equivalent. Just different ways of saying the same thing. [Incidentally, the whole theory of relational databases is built on the mathematical/logical equivalence of these two languages.]
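The equivalence can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the identifiers (`author-1`, `work-10`, and so on) and the function names are hypothetical, and the "relationship" is modelled as nothing more than a set of (author, work) pairs.

```python
from collections import defaultdict

# Relationship language: a set of (author record, work record) pairs.
wrote = {
    ("author-1", "work-10"),
    ("author-1", "work-11"),
    ("author-2", "work-20"),
}

def relationships_to_sets(pairs):
    """Set language: for each author, the set of that author's works."""
    sets = defaultdict(set)
    for author, work in pairs:
        sets[author].add(work)
    return dict(sets)

def sets_to_relationships(sets):
    """Back again: one (author, work) pair per member of each set."""
    return {(author, work) for author, works in sets.items() for work in works}

works_by_author = relationships_to_sets(wrote)

# Converting to sets and back loses nothing: same information, two notations.
assert sets_to_relationships(works_by_author) == wrote
```

One caveat on the sketch: an author with no works at all has an (empty) set but no pairs, so that edge case would need separate handling. It doesn't affect the point being made here.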
So we still only have one single Purpose of Authority Control we’ve discussed, but we have two different ways of talking about/understanding it. Both those ways are useful. Either one may make more sense than the other in a particular context, and if we can be comfortable talking in both languages and switching from one to the other as convenient, we can be more powerful in our understandings and communication with each other.
But it’s still just one purpose we’ve identified. Which I claim is in fact THE purpose of authority control.
With me so far? Is this still uncontroversial?
Traditional means of accomplishing authority control
So, traditionally, through AACR2 and its predecessor practices and rules (as well as similar rules and customs internationally), we accomplish authority control using what used to be called ‘headings’. I think we are best to still use that term, heading, to refer specifically to that traditional system of strings.
These traditional headings are strings formulated to represent a particular entity. For instance, “Burroughs, William, 1914-1997” represents a particular person, and a particular (authority) record for that person. By attaching that heading to records for works (or editions of works), we can accomplish the purpose of authority control: to establish the set of all works authored by that person (aka: to create a relationship between the record for that person and the records for each work or edition).
Another example heading, this one a “name-title heading”, is “Shakespeare, William, 1564-1616. Antony and Cleopatra”. This heading represents a Work, and therefore referencing this heading in a record for an edition of the work can be used to establish the relationship between that edition and the overall work–that is, in the alternate language, to establish that edition’s membership in the set of all editions of the work.
I submit that the purpose of our traditional system of headings is authority control, that is, to establish sets of things (aka, to establish relationships between things). That was the motivation for creating the system of headings, and that is the purpose of the system of headings.
Are you still with me? Is this still uncontroversial?
Other Subsidiary Purposes
Note that if the purpose of authority control is to establish sets/relationships, this doesn’t say anything about labelling these sets or entities in the user interface, whether that is a bound catalog, a card catalog, or a computer monitor.
We have traditionally used our traditional headings to provide these labels. This next is important: This is not just another way of expressing the same purpose, this is a different purpose. In the computer age, it would be possible to accomplish the purpose of authority control with ‘dumb identifiers’. Let’s say the OCLC number, or LCCN, or even better yet, let’s imagine that the DOI system were somehow extended to all bibliographic items of concern to us, no matter how old, no matter electronic (yet) or not. By recording this DOI (in place of our traditional headings), we could still establish sets/relationships in exactly the same way–we could still accomplish the purpose of authority control. Without accomplishing the subsidiary purpose of providing a label to show the user (whether cataloger or searcher). This demonstrates that they are two different purposes.
Ah, you say, but we need to accomplish that second purpose too! We need to label these sets and these entities (at the end of our relationships) somehow, we need to show the user something! That is certainly true. You may even argue as to the qualities and characteristics required of that “somehow and something”, and those may be more debatable among us. But my point here is simply to understand that these are separate purposes.
Purposes at cross-purposes?
Why does this matter? Well, one reason is because we must recognize that these purposes sometimes are at cross-purposes to each other, and if The Purpose of authority control is establishing sets/relationships, then whenever accomplishing one purpose well and efficiently comes at the expense of accomplishing another as well and efficiently as possible, that primary purpose ought ideally to win out.
For example, for the primary purpose of authority control, it is highly desirable that those headings (our current mechanism of accomplishing the purpose) never change. The heading is used to establish a relationship, and if the heading is ever changed, all the records recording the old heading have to be found and updated–it would be better if this never had to be done (because many will inevitably slip through the cracks, leading to harm to the very purpose of authority control–as most of our catalogs demonstrate).
However, if you are also using this traditional heading as a label to the user, it may be desirable that it DOES change. For instance, to add a death date (a practice that I understand has fallen out of favor—evidence that the community realized it harmed the primary purpose of authority control, and therefore was undesirable). Or, say, if you discover that there was a mis-spelling in the original heading!
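Here is a rough Python sketch of that conflict, where the heading string is doing double duty as both link and label. The heading, the record ids, and the two catalogs are all invented for the example; the only assumption that matters is that one copy of the data sits somewhere the update never reaches.

```python
OLD_HEADING = "Doe, Jane, 1900-"
NEW_HEADING = "Doe, Jane, 1900-1980"  # same person, death date added

# Hypothetical bibliographic records: record id -> heading string used as link.
local_catalog = {"rec-1": OLD_HEADING, "rec-2": OLD_HEADING}
other_catalog = {"rec-3": OLD_HEADING}  # a copy the update never reaches

def works_under(heading, *catalogs):
    """The set of records linked to a heading, matched by exact string."""
    return {
        rec_id
        for catalog in catalogs
        for rec_id, recorded in catalog.items()
        if recorded == heading
    }

before = works_under(OLD_HEADING, local_catalog, other_catalog)

# Improve the label, but only where we can reach the records.
for rec_id in local_catalog:
    local_catalog[rec_id] = NEW_HEADING

after_new = works_under(NEW_HEADING, local_catalog, other_catalog)
after_old = works_under(OLD_HEADING, local_catalog, other_catalog)

print(before, after_new, after_old)
```

Before the change there is one set of three records. Afterwards the same person’s works are split across two sets, purely because the label was improved.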
Likewise, for this primary purpose, it would be desirable that all catalogers and systems across the planet use the same heading for a given entity. That way, we could all easily share information on set membership/relationships to that entity. However, of course it is highly undesirable and impractical for us to all use the same display to the user. In China, Chinese script must be used for an author’s name, in Saudi Arabia Arabic script is highly preferable, etc. What if it were possible to take a large step toward us using the same headings internationally, a step which could give us all much greater efficiency at creating these sets/relationships (the purpose of authority control), and therefore better and more complete data for us all–but we couldn’t take that step, because it would mess up our displays? We would be sacrificing the primary purpose of authority control (and thus of headings) for its secondary one! But we’d have little choice–we can’t make American patrons read headings in Chinese script.
Another reason to recognize that these are separate purposes is that as we try to conceptualize what we do in a way that allows us to ‘inter-operate’ (conceptually and actually) with other communities, we will encounter other communities that do separate the accomplishment of these two purposes, perhaps using a system or device that accomplishes one without accomplishing the other.
Then and Now
So, there is some significant inconvenience in the nature of our traditional heading system as used for two purposes, one primary and one secondary. So how did we end up with it? In the pre-computer days, the inconvenience was outweighed by a huge convenience to us to use a heading to accomplish both these purposes.
The way we accomplished the purpose of authority control was by putting records in alphabetical lists–quite literal, physically ordered lists, originally in a bound catalog, then a card catalog. Ordered by those headings. If we had instead used a ‘dumb identifier’ like a DOI to exercise the purpose of authority control, we would STILL have had to put records, very physically and concretely, in lists, because such a very physically ordered list was the only good way we had of searching a file. And in the pre-internet (and even pre-telephone!) days, how would these ‘dumb identifiers’ possibly be efficiently communicated to everyone who needed to know them? It would be virtually impossible. In the pre-computer era, changing a label would be hugely impractical anyway, so the difficulty of doing it was irrelevant. Likewise, international cooperative cataloging, or providing different labels for different contexts (Chinese speaking user vs. English speaking user), all were hugely impractical anyway.
[The traditional system of headings has the theoretical advantage of allowing different people in different places to come up with the same heading for a given entity without communicating with each other. At one time, that was crucial, but that was a very long time ago. A cataloger today who assigned such a heading without looking it up in the NAF/SAF/etc would be guilty of cataloging negligence! Cheap, fast and easy communication between catalogers (via central authority files) is crucial to the contemporary actually existing practice of cooperative cataloging, notwithstanding that the traditional heading system can, sort of, in theory, make do without it.]
My primary purpose in this discussion is to establish an agreement about the fundamental purpose of authority control–what authority control is, and what it is for–to allow me to go on to my next step, which is discussing the nature of the concepts of ‘identifiers’, ‘access points’, ‘headings’, ‘main entries’, etc.–how they overlap, how they are distinguished, how they ought to be understood. With the goal of a critique of the FRAD document, which I think has confused some important things. So if anyone has possibly made it this far, do you agree with me on what is the primary purpose of authority control, and thus of our traditional headings? Do you agree with labelling as a secondary purpose (if one that must be accomplished somehow, and to certain standards), which the primary purpose should not be sacrificed to?
If so, I believe logic will inexorably lead you to some somewhat more controversial claims about how we understand identifiers et al that I will lay out in the next part. Which may have to wait until I’m back from my imminent vacation, at the end of August.
I will leave you with one thing. Recognizing the inconvenience of the combination of purposes of our headings, I will describe one possible system of exercising authority control which could escape these problems–while (this is important) still providing the exact same interfaces we have now for users (and almost exactly the same for catalogers).
Instead of using our existing heading to record a relationship from one record to another, we could use a ‘dumb identifier’ such as an OCLC number or a DOI. But we could also record the traditional heading as being related to that very same ‘dumb identifier’ (for instance, by recording it in the authority file). Our primary purpose of authority control is still accomplished, but now we can change the _displayed label_ (the heading itself) whenever we want (to fix a spelling error, to add a death date), without interfering with that primary purpose–without requiring us to change that ‘dumb identifier’. Likewise, we could have _several_ label headings hanging off the authority record (one for Chinese, one for English, etc.), to allow display of these headings in the appropriate local script, but we could all still participate in one big cooperative cataloging environment by all using the same mechanism to establish the primary purpose of authority control.
And if desired, the exact same interfaces (actual and imagined) could be provided under this new system. Those headings, hypothetically displaced by ‘dumb identifiers’ for accomplishing the primary purpose of authority control, could still be displayed in EVERY place you could display them in the system where the headings are in fact themselves used for the primary purpose of authority control.
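A minimal Python sketch of that arrangement follows. It rests on assumptions of my own: the identifier `id-0001`, the record ids, the heading text, and the shape of the `authority_file` dictionary are all hypothetical, and the Chinese-script label is left as a placeholder rather than a real name.

```python
# Hypothetical authority file: dumb identifier -> display labels by language.
authority_file = {
    "id-0001": {
        "en": "Doe, Jane, 1900-",
        "zh": "(Chinese-script form of the name)",  # placeholder, not real data
    },
}

# Bibliographic records carry only the identifier, never the heading string.
bib_records = {"rec-1": "id-0001", "rec-2": "id-0001"}

def works_by(identifier):
    """Primary purpose: the set of records related to one entity."""
    return {rec for rec, creator in bib_records.items() if creator == identifier}

def display_label(identifier, language):
    """Secondary purpose: look up the heading to show a given user."""
    return authority_file[identifier][language]

before = works_by("id-0001")

# Add the death date: one edit in the authority file, no record is touched.
authority_file["id-0001"]["en"] = "Doe, Jane, 1900-1980"

after = works_by("id-0001")
assert before == after  # the set survives the label change intact

for rec in sorted(after):
    print(rec, display_label(bib_records[rec], "en"))
```

The user still sees a heading everywhere a heading was shown before; it is simply looked up at display time instead of being stored in every record.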
It is important for the next step to understand that this is possible, and that some systems we want to inter-operate with will likely separate things in this way. Whether it is desirable to us to do so is kind of a separate discussion (although an interesting one, and it’s probably clear that I think it is).