Tag Archive for database

What is this ‘Linked Data’ thing all about?

By Richard Light

Image: PenClipartVectors via pixabay (CC0)

You may have come across an enthusiast (like me!) who tells you that you should be publishing your museum collection as Linked Data. Your reaction may well have been to shrug, say “I don’t know what it is and I don’t know how to do this”, and get back to cataloguing your collection and recording your collections management work. At this stage in the game, that would probably be a wise choice.
This post tries to explain “what Linked Data is” from a cultural heritage point of view, what the possibilities are, and why it is currently really hard to do it.

The Web as a distributed database

We all know how the Web works. You find a page containing information that interests you: this usually involves using a well-known search engine. This initial page of search results contains lots of links to relevant pages, and you simply click on the links that look relevant to go to those pages. On each new page there are more links to follow. If you’re really lucky you can end up going round in circles. This is ‘browsing the Web’. It’s fine as far as it goes, for looking up and reading information, one page at a time.
However, if you want to treat these pages as data (for example, to add background information into an object catalogue record), you will find they are quite limited. You can copy and paste some (or all!) of a web page into one of your records, but you will find that you either end up with annoying HTML markup in your data along with the text, or that the markup disappears and all the text is kludged together. Either way, you can’t expect to extract data from web pages in a format which is compatible with your collections management system.
Linked Data works in the same way as web pages. The key difference is that each ‘page’ is actually a (sort of) database entry, containing structured data. You can browse from one Linked Data page to another, just as you browse web pages. The Linked Data web is, in effect, a loosely joined-up database that spans the entire Internet.

Using URLs to identify concepts

Linked Data, from our perspective, is something we could use to describe the entities that make up the cultural heritage world. These include people, places, events … and objects. A key feature of the Linked Data approach is that each concept has its own unique identifier. This is a URL, which follows exactly the same rules as the URLs which identify web pages. So this is a Linked Data identifier for a person from the Getty’s ULAN (Union List of Artist Names) thesaurus:
http://vocab.getty.edu/ulan/500077287
Pop that URL into your browser, and you will see a slightly strange web page, which lists the facts known about this person. The page heading makes it clear that this person is John Gerald Platt – something that isn’t clear from the URL.
So far, not very exciting – but this is where the Linked Data magic comes in. Ask for the same URL in a different way, and you get real data back. I’ll gloss over the exact way you do this [1] and the technical details of the data [2], and give you a sense of how it looks. This is a fragment of the XML version of John Gerald’s data:

This fragment lists the biographical data that is available. The key point is that each biographical statement has its own Linked Data URL, for example http://vocab.getty.edu/ulan/bio/4000231223, which you can look up:

This biographical fragment contains some real data: two dates and a summary description. There are also URLs for John Gerald’s gender and place of birth, which you could track down and extract data from. You’ll notice that these URLs come from different Getty thesauri: the gender URL comes from the AAT (Art and Architecture Thesaurus) and the place of birth from the TGN (Thesaurus of Geographic Names). This is a good way to do Linked Data: use existing frameworks to express the concepts you want to make statements about, rather than inventing new ones.
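Asking for a Linked Data URL “in a different way” usually means HTTP content negotiation: the client sends an Accept header naming the data format it wants instead of an HTML page. The following is a minimal sketch in Python, assuming that the Getty server honours `Accept: application/rdf+xml` for this URL. The function name `build_linked_data_request` is made up for illustration, and the request is only built here, not sent.

```python
from urllib.request import Request

# The ULAN identifier for John Gerald Platt quoted above.
ULAN_PLATT = "http://vocab.getty.edu/ulan/500077287"


def build_linked_data_request(url, media_type="application/rdf+xml"):
    """Build (but do not send) a GET request that asks for data, not HTML.

    The default media type is an assumption about what the server offers.
    """
    return Request(url, headers={"Accept": media_type})


request = build_linked_data_request(ULAN_PLATT)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing `request` to `urllib.request.urlopen` would perform the actual lookup; what comes back depends on the server.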
The really nice thing about using someone else’s Linked Data URLs in your records is that they give you additional data ‘for free’. For example, if you use a geographical resource like Geonames [3] you get access to geolocation data for each place, which means you can publish distribution maps full of little pins at the cost of a little programming.

Publishing your collection as Linked Data

So let’s return to my original suggestion: that you publish information about your collection objects as Linked Data. There are two good reasons to do this: you stake a claim to your own material in the Linked Data world; and you provide an API for others to use when they want to access your data. I’ve had a go at doing this for U.K. museums, and a couple of them have taken up the opportunity [4].

However, as I flagged up at the start, there are also good reasons not to publish your collection as Linked Data. Three which spring to mind: I’ll bet your collections management system lacks any support to help you add Linked Data URLs to your catalogue records; your web publishing software environment lacks any means of using Linked Data to add value to your web presence; and (perhaps most importantly) we currently lack Linked Data frameworks for the concepts we really want to share information about: people, places and events.

I’ll talk about these topics in more detail in a future post: in the meantime I look forward to responding to your comments and questions.

Richard Light is a U.K.-based information scientist and software developer who has been involved in museum information systems for nearly all his career. He helped computerize the Sedgwick Museum, Cambridge back in the days of punched paper tape and mainframes, and then worked on data standards and systems with the Museum Documentation Association (now Collections Trust). Since 1991 he has been an independent cultural heritage consultant, specializing in markup languages and Linked Data. He is the Chair of Free UK Genealogy [5] and is a regular attendee at CIDOC [6] meetings: something all museum documentation folk should do!


"Various" is not a category, and "object" is anything

Accession and Category: encoding or collections division

Each of these 3,000 objects of Mexican Folk Art needs, and has, a category. Thanks to Aleida García for the picture. www.imasonline.org

For a museum collections registrar, assigning an accession code and a category to each object in the collection is indispensable. These are more than mere numbers: they carry a large amount of information themselves, or open the door to more details.

These codes are a “QR” avant la lettre. In collections management software they become starting points for numerous computerized searches; the software’s search fields can cover all the numbering and terminology that these codes contain.

The accession number is universal and indispensable; the category seems to be less appropriate for some museums. However, I give more attention to this second part. While the accession number usually combines the year in which an item entered the collection with a sequence number within that calendar year [for example 2012.0034], the category defines object type, purpose and meaning. The category should not be an encoding of aesthetic judgements or of some supposed superiority or value, whether natural, cultural or naturalcultural (artistic, scientific, technological, religious, etc.). A categorization can, and should, include as many subcategories as necessary. OBJECT TYPES, for example:

[PAINTings / ABStracts-0148];

[FURniture / CONsole-0025];

[VEHicles / AUTOmotives / TRUcks-0012];

[TOOLs / HAMmer-1135];

[CLOcks-0982];

[TAPestries-0023];

[PRINt / POSter-1128];

[CLOTHing / SHIrt / MALe / AFRican-0089];

[LITURgical / CHAlice / GREEk / ORTHOdox / CHURch-0051];

[MUSic / INSTRuments / WIND instruments / HEBrew-0129];

[MACHine TOOls / PERCussive / DRIll-0023]…

I refer here only to cultural and technological objects, because I lack knowledge of natural collections, whether biological or mineral.
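To show how such codes can serve as starting points for computerized searches, here is a minimal Python sketch that splits them into parts. It rests on assumptions that the examples suggest but the text does not state: that slashes separate category levels from broadest to narrowest, and that the digits after the hyphen are a running number within the narrowest category. The names `CategoryCode`, `parse_category_code` and `parse_accession_number` are invented for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class CategoryCode:
    path: list   # category levels, broadest first (assumed ordering)
    serial: int  # assumed: running number within the narrowest level


def parse_category_code(code):
    """Split a code such as '[TOOLs / HAMmer-1135]' into levels and number."""
    inner = code.strip().rstrip(";").strip("[]")
    levels_part, _, serial = inner.rpartition("-")
    if not levels_part or not serial.isdigit():
        raise ValueError(f"not a category code: {code!r}")
    levels = [level.strip() for level in levels_part.split("/")]
    return CategoryCode(path=levels, serial=int(serial))


def parse_accession_number(number):
    """Split '2012.0034' into (year of entry, sequence within that year)."""
    match = re.fullmatch(r"(\d{4})\.(\d+)", number)
    if match is None:
        raise ValueError(f"not an accession number: {number!r}")
    return int(match.group(1)), int(match.group(2))


truck = parse_category_code("[VEHicles / AUTOmotives / TRUcks-0012];")
print(truck.path)                           # ['VEHicles', 'AUTOmotives', 'TRUcks']
print(truck.serial)                         # 12
print(parse_accession_number("2012.0034"))  # (2012, 34)
```

Once a code is split into levels, a search for everything under “VEHicles” becomes a simple comparison on the first element of `path`.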

I typed “OBJECT TYPES” in uppercase because the little word “object”, when used improperly, generates false, vague and overly generic information, which is unacceptable for a museum collection. The same goes for the little word “VARIOUS” (Miscellaneous). Every object, of whatever type (natural, cultural, technological or naturalcultural), has a name and belongs to a genre, type, species, family, etc. This applies even when it comes to intangible cultural heritage or intangible natural heritage. It holds true for everything in the registrar’s universe, which means that he / she should be well aware of this fact and give in-depth thought to the classification of every object in the collection. It also means that the registrar should cooperate closely with curators and researchers, or even manufacturers, who know more about the object and its possible categorizations than the registrar does. Category codes should always use an appropriate term for the category or division. And if the existing categories in the collection don’t have a place for this type of object: create one! Good collection management software allows and encourages this, as do a good manager, a good curator and a good registrar.

In my work as a registrar I never categorized an object as “Various”, but I did correct and relocate some existing cases that had been filed as such. The same goes for the truism category “Objects”. Obviously, everything is an object! (at least until you create the “Museum of Thoughts and Feelings” … The registrar would be in trouble there …).

I have seen cases, for example in a Latin American museum, in which part of the collection (which appears on the museum’s website) is categorized as “Objects”. Almost a year ago I sent some comments and suggestions, but so far I have received no response.

The correct title or generic name of an item is a must: I found a case in which an item was called “Armchair with two armrests” … A quick check in books reassured me that a chair that has two armrests is called an armchair … And a bit of reasoning helped me reconfirm that skateboards have wheels, because…

Registrars of museum collections can, and should, be able to open up their schemes and reasoning in order to do their job properly, efficiently and creatively, adapting to the circumstances and to the type (category) of the object that needs to be accessioned and documented in the collection. A good registrar must remain critical!
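Catch-all categories like these are easy to find automatically, so that a registrar can review and relocate them. A small Python sketch of such a check follows; the list of vague terms, the function names and the sample records are all made up for illustration, and a real collection would need its own list.

```python
# Terms that say nothing about what an object actually is (illustrative list).
VAGUE_TERMS = {"various", "miscellaneous", "object", "objects"}


def is_vague_category(category):
    """Return True if the category is empty or one of the catch-all terms."""
    cleaned = category.strip().casefold()
    return cleaned == "" or cleaned in VAGUE_TERMS


def records_needing_review(records):
    """Return the accession numbers whose category needs a registrar's attention."""
    return [number for number, category in records if is_vague_category(category)]


# Made-up sample records: (accession number, category).
sample = [
    ("2012.0034", "TOOLs / HAMmer"),
    ("2012.0035", "Various"),
    ("2012.0036", "Objects"),
]
print(records_needing_review(sample))  # ['2012.0035', '2012.0036']
```

The check only flags records; choosing the right category still takes the knowledge of registrars, curators and researchers.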


A registrar needs a flexible mind

Professional practice requires keeping up to date. You have to “think outside the box”, without confining yourself to one thought pattern or routine, especially when there are situations that require reflection and need to be addressed with a flexible mind. The registrar of the permanent collection of a museum should be especially flexible.
