What is this ‘Linked Data’ thing all about?

By Richard Light

You may have come across an enthusiast (like me!) who tells you that you should be publishing your museum collection as Linked Data. Your reaction may well have been to shrug, say “I don’t know what it is and I don’t know how to do this”, and get back to cataloguing your collection and recording your collections management work. At this stage in the game, that would probably be a wise choice.
This post tries to explain “what Linked Data is” from a cultural heritage point of view, what the possibilities are, and why it is currently really hard to do it.

The Web as a distributed database

We all know how the Web works. You find a page containing information that interests you: this usually involves using a well-known search engine. This initial page of search results contains lots of links to relevant pages, and you simply click on the links that look relevant to go to those pages. On each new page there are more links to follow. If you’re really lucky you can end up going round in circles. This is ‘browsing the Web’. It’s fine as far as it goes, for looking up and reading information, one page at a time.
However, if you want to treat these pages as data (for example, to add background information into an object catalogue record), you will find they are quite limited. You can copy and paste some (or all!) of a web page into one of your records, but you will find that you either end up with annoying HTML markup in your data along with the text, or that the markup disappears and all the text is kludged together. Either way, you can’t expect to extract data from web pages in a format which is compatible with your collections management system.
Linked Data works in the same way as web pages. The key difference is that each ‘page’ is actually a (sort of) database entry, containing structured data. You can browse from one Linked Data page to another, just as you browse web pages. The Linked Data web is, in effect, a loosely joined-up database that spans the entire Internet.

Using URLs to identify concepts

Linked Data, from our perspective, is something we could use to describe the entities that make up the cultural heritage world. These include people, places, events … and objects. A key feature of the Linked Data approach is that each concept has its own unique identifier. This is a URL, which follows exactly the same rules as the URLs which identify web pages. So this is a Linked Data identifier for a person from the Getty’s ULAN (Unified List of Artist Names) thesaurus:
Pop that URL into your browser, and you will see a slightly strange web page, which lists the facts known about this person. The page heading makes it clear that this person is John Gerald Platt – something that isn’t clear from the URL.
So far, not very exciting – but this is where the Linked Data magic comes in. Ask for the same URL in a different way, and you get real data back. I’ll gloss over the exact way you do this [1] and the technical details of the data [2], and give you a sense of how it looks. This is a fragment of the XML version of John Gerald’s data:

This fragment lists the biographical data that is available. The key point is that each biographical statement has its own Linked Data URL, for example http://vocab.getty.edu/ulan/bio/4000231223, which you can look up:

This biographical fragment contains some real data: two dates and a summary description. There are also URLs for John Gerald’s gender and place of birth, which you could track down and extract data from. You’ll notice that these URLs come from different Getty thesauri: the gender URL comes from the AAT (Art and Architecture Thesaurus) and the place of birth from the TGN (Thesaurus of Geographic Names). This is a good way to do Linked Data: use existing frameworks to express the concepts you want to make statements about, rather than inventing new ones.
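For readers who want to try the “ask for the same URL in a different way” step: one common way to do it is HTTP content negotiation, where the client names the data format it wants in an Accept header. The following is only a minimal sketch. It assumes that the Getty server honours an `Accept: application/rdf+xml` header for this URL, which this post does not spell out, and the function names `build_rdf_request` and `fetch` are made up for the example.

```python
from urllib.request import Request, urlopen

# URL of the biographical statement quoted in the post.
BIO_URL = "http://vocab.getty.edu/ulan/bio/4000231223"


def build_rdf_request(url, media_type="application/rdf+xml"):
    """Build a GET request that asks for structured data instead of HTML.

    Assumption: the server supports content negotiation for this media type.
    """
    return Request(url, headers={"Accept": media_type})


def fetch(url):
    """Send the request and return the response body as text."""
    with urlopen(build_rdf_request(url), timeout=30) as response:
        return response.read().decode("utf-8")
```

Calling `fetch(BIO_URL)` sends the request over the network. If the server behaves as assumed, the result is the data version of the page rather than the HTML one.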
The really nice thing about using someone else’s Linked Data URLs in your records is that they give you additional data ‘for free’. For example, if you use a geographical resource like Geonames [3] you get access to geolocation data for each place, which means you can publish distribution maps full of little pins at the cost of a little programming.

Publishing your collection as Linked Data

So let’s return to my original suggestion: that you publish information about your collection objects as Linked Data. There are two good reasons to do this: you stake a claim to your own material in the Linked Data world; and you provide an API for others to use when they want to access your data. I’ve had a go at doing this for U.K. museums, and a couple of them have taken up the opportunity [4].

However, as I flagged up at the start, there are also good reasons not to publish your collection as Linked Data. Three which spring to mind: I’ll bet your collections management system lacks any support to help you add Linked Data URLs to your catalogue records; your web publishing software environment lacks any means of using Linked Data to add value to your web presence; and (perhaps most importantly) we currently lack Linked Data frameworks for the concepts we really want to share information about: people, places and events.

I’ll talk about these topics in more detail in a future post: in the meantime I look forward to responding to your comments and questions.

Richard Light is a U.K.-based information scientist and software developer who has been involved in museum information systems for nearly all his career. He helped computerize the Sedgwick Museum, Cambridge back in the days of punched paper tape and mainframes, and then worked on data standards and systems with the Museum Documentation Association (now Collections Trust). Since 1991 he has been an independent cultural heritage consultant, specializing in markup languages and Linked Data. He is the Chair of Free UK Genealogy [5] and is a regular attendee at CIDOC [6] meetings: something all museum documentation folk should do!



…and the livin’ is easy…

While it seems that some of us just can’t stop working…

(Plug: Reibel’s Registration Methods has seen a major revamp that brought it to the 21st century thanks to Deb Rose van Horn)

…we have set up a creative workshop using meta-planning techniques in our garden. Starting off with neither a solution nor a problem, after 3 hours of intensive creative work and purposeful improvisation, using only the materials and tools at hand, we finally came up with this:

We are still not quite sure which problem we solved but we are somewhat proud of the solution. (Most obvious: everybody agreed that whatever problem we solve, the solution should be adjustable in height.)

Enjoy the summer!


Registrar’s Shoes – More Thoughts on Professional Footwear

Working as a registrar might require unexpected skills: like being all dressed up for the big opening and still being able to deliver a cart of desperately needed tools to the mount-maker.

Thanks to Lisa Kay Adam for the picture.

Three things happened in the last four weeks:

1. I changed offices and decided to get rid of my very first safety boots.
2. My current summer safety boots died the usual unpleasant death that awaits all my safety boots.
3. I re-read the piece about shoes at conferences by Janice Klein.

It inspired me to write a piece about a registrar’s working shoes. It’s the same problem as with shoes for conferences, only worse. As a registrar in a small museum you need to be at the top of the ladder changing a light bulb one moment, guiding a group of students the next, and shaking hands with the president of your university the moment after that.

As a registrar in a larger museum, you are not really better off: you have to walk miles in the gallery spaces, again climb ladders, and if you enter visitor spaces you should look halfway presentable.

Each task requires different clothing and it is likely that you have several working outfits in your locker. Along with them there is an army of different working shoes, from rubber boots for the annual springtime water leak in the cellar to high-heels that fit your evening dress for events. A male registrar’s arsenal might be slightly smaller, but I don’t know a single registrar who can work with just one pair of shoes.

There are some advantages of being a collections manager at a science and technology museum.

As a collection manager in a science & technology museum with the history of working conditions in its mission, I’m slightly better off. I decided a long time ago that I’m a living representation of working conditions and therefore usually wear working attire no matter what (with a few exceptions, like opening ceremonies and lectures). However, this comes with a downside:

Because I wear my safety boots almost all the time at work, they tend to die an unpleasant death within about a year to a year and a half. This is a problem because at the same time it’s incredibly hard to find safety boots in size 37 (U.S. size 6 1/2). My very first safety boots – the ones I ditched and which are still under consideration to be accessioned for our collection of working clothes – were 36 (5 1/2) because I couldn’t find safety boots my size on the market. The first two years of my career I worked in boots that were too small. In fact, according to a friend, they were the “cutest little safety boots I ever saw”. So, every time a pair of boots starts to show signs of weakness, I search frantically for new ones in my size. An exhausting race against time.

Fortunately, this time I’m spared: my niece has exactly the same shoe size and gave me the safety boots she got for her summer job. As she graduated to become an elementary school teacher last year, she doesn’t need them anymore.

Always keep your feet on the ground!

And for your amusement: A gallery of shoes that were killed in action:

Light summer safety shoe, bought 2015. The seam that tied the leather to the sole snapped and the leather ripped. Probably due to the stress imposed on this part of the shoe by standing on my toes frequently. To make matters worse, I often need the fine feeling of my toes to give the forklift truck the exactly right dose of gas when handling a delicate load. A former more sturdy all-year safety boot, I think it was the 2007/2008 one, died exactly the same way.

sole of a safety boot

The most common way my safety boots die, however, is that the sole becomes so thin that they start to leak. You usually realize this when you are standing in a puddle of water. In a dry season, you realize it when you suddenly feel every stone you walk over as if you were walking barefoot.

hiking boot without sole

This is the shoe that died the most spectacular way. These were pretty good light hiking shoes I loved to wear when there were no heavy duty jobs that require safety boots, only light work that requires a lot of walking. In the middle of an exhibit installation in 2011 parts of the sole literally fell off.

Got boots that died in a similar – or more spectacular – way? Share your photos and send them along with their story to story@museumsprojekte.de!


Build Your Own Data Logger – Investigating Your Climate Graphs

I’m certain you have all looked at a lot of climate graphs in your professional career and tried to make sense of what you saw. We now want to use Calc or Excel to get to know your gallery climate and make decisions about trigger values for an alarm system. Yes, at the moment our logger is just a dumb device that sits in the corner and registers what happens. But we can tell it to blink a warning with its LEDs when a climate value is not okay, we can give it a piezo speaker so it can sound a warning tone, or we can build another logger that is able to send warning messages via WiFi. But you might remember Aesop’s boy who cried wolf? Yep, if alarms come too frequently and for no serious reason, we tend to ignore them. That’s why we first have to understand what is usual and unusual behavior for our room climate by analyzing our graphs.

The problem with fixed trigger values

Most devices with an alarm function allow you to set up an alarm when the temperature or humidity is above or below a certain value. Good professional devices let you decide how many times a reading has to be above or below this value to trigger an alarm, avoiding alarms caused by only minor excursions or simple false readings of a sensor.
This is good for institutions with a relatively stable climate, such as that provided by an HVAC system. Here we can set an alarm if the temperature falls below 19 °C (66 °F) or rises above 22 °C (71 °F), and we can define a slightly broader range for our relative humidity, probably centered around 55%. But if you are still reading an article series that deals with building your own data logger, you probably don’t have this ideal setting.
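To make the idea of “how many times a reading has to be out of range” concrete, here is a small Python sketch of such a fixed-value alarm. It is not how any particular commercial device works. The function name `fixed_range_alarm` and the default of three readings in a row are assumptions chosen for illustration.

```python
def fixed_range_alarm(readings, low, high, min_consecutive=3):
    """Return the index at which a fixed-value alarm would fire, or None.

    The alarm fires once `min_consecutive` readings in a row lie
    outside the range from `low` to `high` (both limits count as okay).
    """
    run = 0
    for index, value in enumerate(readings):
        if value < low or value > high:
            run += 1
            if run >= min_consecutive:
                return index
        else:
            run = 0  # a reading inside the range resets the count
    return None


# Temperatures in °C with one single spike and one longer excursion.
temperatures = [20.0, 22.5, 21.0, 22.4, 22.6, 23.0, 21.5]
print(fixed_range_alarm(temperatures, low=19.0, high=22.0))
```

The single spike at the second reading is ignored. Only the run of three readings above 22 °C raises the alarm.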
More likely (museum studies students, brace yourselves, here comes a real-life graph) it will look like this:

A real-life graph with usual and unusual climate swings.

It’s not that this climate is not problematic. But there are problematic things happening and there are a lot of things happening that are just “normal” for this not-so-ideal storage room. For example, the temperature climbing from 17 °C to 23 °C (62 to 73 °F) in some up-and-down waves during May is pretty normal. A trigger warning at 22 °C (71 °F) would be pretty useless as the room has only a heating device.
On May 2nd, there is a sudden jump in relative humidity within just 35 minutes:

A sudden rise of relative humidity from under 47% to over 52% within 35 minutes.

Ironically, a standard humidity alarm would probably stop alarming as this happens, because the humidity goes from a value that is not so ideal in theory into a range that is widely regarded as ideal. But for the collections manager of a not-so-ideal setting, this occurrence is definitely outside the normal behavior of the room. Someone might have left the door open, allowing wet air to come in. You want to check what’s wrong there. But how will you know?

A warning of sudden changes

We need a more flexible warning system, one that sends us a warning when sudden changes in humidity and/or temperature take place. One simple way to do this is to subtract the previous measurement from the current measurement. We get a value that tells us something about the change in the timespan we set between our measurements.
With our knowledge from the previous article on using Calc you should now be able to write a formula that subtracts the first humidity measurement from the second (Hint: the formula is “=C2-C1”) and apply it to all values of the column with the “fill” function. It’s pretty similar in Excel, by the way.

Subtracting the previous humidity value from the current one.

We get a column with values that tell us something about change over time. It is now easy to make a diagram that lets us see which values are wide of the mark. Hint: you can hide the columns you don’t need in your diagram before you mark the columns you do need. Maybe this time we choose points instead of lines:

A diagram of changes.

While you could deduce the dramatic changes from the original graph, this new graph gives you a better overview and a handle for defining which changes you really want to be notified about. You see that everything below 1 is probably pretty normal and would produce too many warnings if you set the trigger there. Everything above 1 is probably something you would like to know about immediately, not just when the monthly climate report arrives.
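If you prefer a script to a spreadsheet, the same check can be sketched in a few lines of Python. The function names are made up for this example. Treating a drop the same way as a rise (by looking at the absolute change) is my assumption; the spreadsheet column itself keeps the sign.

```python
def changes(values):
    """Difference between each reading and the one before it (like =C2-C1)."""
    return [current - previous for previous, current in zip(values, values[1:])]


def sudden_changes(values, threshold=1.0):
    """Return (index, change) pairs whose absolute change exceeds the threshold.

    The index points at the later of the two readings involved.
    """
    return [
        (position + 1, delta)
        for position, delta in enumerate(changes(values))
        if abs(delta) > threshold
    ]


# Made-up humidity readings in % with one jump and one drop.
humidity = [46.5, 47.0, 52.5, 52.0, 50.5]
print(changes(humidity))
print(sudden_changes(humidity))
```

With a threshold of 1, only the jump from 47.0 to 52.5 and the drop from 52.0 to 50.5 are reported. The two half-point wobbles stay silent.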

In real life we have used this for fine-tuning our climate warnings at the TECHNOSEUM. There are areas with well-known climate swings and some that need closer attention. For most areas, I get a warning email when a temperature or humidity change is over 1 degree within 5 minutes. If it is over 3 degrees other colleagues responsible for that area get a warning email. This keeps me aware of a lot of changes and I can look at the graphs to decide whether to check or call a colleague, while the other colleagues stay unbothered most of the time, but can check immediately if something goes very wrong.
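The two-level idea can be written down as a tiny Python function. This is only a sketch of the principle, not the actual alarm system. The recipient labels and the function name are invented, and I assume here that the colleagues are warned in addition to the collections manager.

```python
def who_gets_warned(change, first_level=1.0, second_level=3.0):
    """Return the (hypothetical) recipient groups for one 5-minute change."""
    size = abs(change)  # a fall counts the same as a rise
    recipients = []
    if size > first_level:
        recipients.append("collections manager")
    if size > second_level:
        recipients.append("area colleagues")
    return recipients


for change in (0.5, 1.5, -3.5):
    print(change, who_gets_warned(change))
```

Small wobbles reach nobody, medium changes reach one person who can judge the graph, and only large changes wake up everybody else.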

The slow, steady, evil change

This is good, but it doesn’t warn you about another thing that creeps a collections manager out: The slow and steady change of a failed heating or a water leak. To show you what I mean, let’s take a look at another real-life graph:

A slow and steady rise in humidity.

The room has a rather stable climate at about 40% relative humidity. At about 8 p.m. humidity starts rising, slowly but steadily, until it reaches 46.7% at about 1:30 a.m. the next morning. Nothing our warning system would have warned us about, because the changes between two humidity values are minor. If we want to implement a warning system for this kind of change, we need something else. We need a warning for problematic tendencies.

How can we do this? We first need to define a timespan we want to take as the basis of our calculations. Let’s take 30 minutes. If we calculate the differences between the last 6 values and divide the result by 5, we get a value for the tendency. By now, you should be able to build the formula for this yourself. It is:
(If C is your column with humidity values.)
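For those who would rather experiment in code, here is one possible reading of that description as a Python sketch. Please take it as an assumption, not as the spreadsheet formula from this post: it sums the five differences between six consecutive readings and divides the sum by 5. The function name `tendency` and the sample values are made up.

```python
def tendency(values, window=6):
    """Mean of the consecutive differences within each window of readings.

    With window=6 there are 5 differences per window; their sum is
    divided by 5. The first result belongs to the reading at position
    window - 1, the next to the following reading, and so on.
    """
    results = []
    for end in range(window - 1, len(values)):
        chunk = values[end - window + 1 : end + 1]
        differences = [later - earlier for earlier, later in zip(chunk, chunk[1:])]
        results.append(sum(differences) / (window - 1))
    return results


# Made-up humidity values that rise steadily and then level off.
humidity = [40.0, 40.5, 41.0, 41.5, 42.0, 42.5, 42.5]
print(tendency(humidity))
```

Under this reading the value is simply the average change per measurement over the window, so a slow but persistent rise shows up even though no single step looks dramatic.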

By making a diagram out of it and comparing it with our original curve, we get an idea of what the problematic changes look like:

The tendency values against the original curve.

We can now assume that getting a warning if the tendency shows a value over 0.5 would be a good idea. But, much more than with the value for rapid change, this is highly dependent on your setup and might differ from monitored space to monitored space. There might be some less-than-ideal storage areas where you can’t use it at all, because the rise and fall of humidity and temperature is simply normal, and there’s nothing you can do about it. Let’s do a test so it becomes clear what I mean…

Bringing it all together

When we look again at the first 3 days of our scary graph above (you can download all the values here), how would our warning system react?

3 days in May…

Our first trigger warning comes on the 1st of May at about 8 a.m. when the constantly rising tendency in humidity first passes the 0.5 mark. This trigger value is met a couple of times throughout the morning, so there would be plenty of time to check and react.

First trigger warning comes in at 8:07 on May 1st.

Our next warning comes a day later at about 10 o’clock. This time it’s a warning of sudden change. We can see the sudden change warning triggering before the tendency warning follows suit 5 minutes later:

Sudden change warning and tendency warning on May 2nd.

About 1 1/2 hours later we see a rapid decrease and some more tendency warnings as humidity goes back to “normal”.

We see again a rising tendency (although not as long enduring as the one May 1st) at about 4:30 p.m. that day, the next at about 10:30 a.m. the following day, next at 1 p.m., next at 8 p.m.

7 warnings in 3 days.

In the timespan of only 3 days our tendency warnings came in 7 times. Warnings of sudden change came in 2 times. A warning for a fixed value… well, if we had defined a fixed warning for when the humidity rises above 40%, we would have gotten a constant warning starting at about 1 p.m. on May 1st – 5 hours after our tendency warning kicked in.

If this graph came from a climate-controlled storage area I would certainly want to get all 7 tendency warnings, because, seriously, this is NOT a good graph! I would probably even set my tendency warnings as low as 0.2 or 0.3. For a well-known not-so-ideal storage area, well, the warning for sudden changes will do. I won’t change German weather but I sure want to catch leaking ceiling windows or gates left open in wet weather.

I hope you had fun with this little analysis of data. I did. We might like to improve our logger on the basis of these findings…


Mercury – A Tale of the Importance of Good Documentation

It’s a strange thing. The topic of hazardous materials in collections pops up every once in a while but as human beings we tend to forget about it because we consider that – of course – we know these hazards are there, but, then again, we are rather sure we know our own collection well and that if we act according to our safety precautions we are safe.
When mercury was found in the air in one of our storage areas during a pollutant analysis, I was shocked and surprised. Of course I knew we had mercury in our collection. We own a considerable number of thermometers and mercury switches. But until this day I considered our handling instructions and other precautions safe enough. This mercury was all contained, right? Yes, it was. But we never had thought of other sources, open sources that were hidden in our collection.

Discovering open sources of mercury & lessons learned

Automatic organ containing open mercury source (sorry for the poor quality of the picture).

As we started to research our objects through the lens of “mercury” we discovered that, in fact, there were a couple of objects we never thought of. It turned out that there was an automatic organ which operated with contacts that dipped into mercury in some ceramic containers. In our medical history collection we had devices for counting thrombocytes in blood samples that operated with open mercury. However small, given that mercury evaporates at room temperature, even small outlets are an issue! There were barometers and even chronometers with open mercury sources. It was quite an effort to find out which sources we had. Even more to either remove or contain the mercury and seal and label the contaminated objects properly.
We learned quite a few lessons along the way:

  • Never assume you know everything about your collection
  • Never assume your policies and procedures cover every aspect
  • Never assume that you are safe, keep an eye on recent research

But maybe the most important lesson was about the importance of good documentation. And we learned it the hard way.

All expert knowledge at hand, but still…

Looking back, if someone had thoroughly researched the working principles of said objects, he or she would have discovered that they needed mercury to work. We don’t know if someone knew this when the objects were acquired. At least, whoever documented them didn’t mention in the documentation or the catalog entry that they contained mercury.

Mercury switch inside of the automatic organ

It’s the disconnected working processes that are the real health hazard here! When we look at the classical museum setting there are different people with different knowledge involved in the documentation process. People whose skillsets are perfect matches but all their knowledge is useless if it isn’t interlinked in the workflow:
The curator might know best that mercury was necessary to make an object work, but might not be aware that mercury is a problem. The conservator has, due to his or her education, deep knowledge about dangerous substances but not about the object and might not see the object before it is stored if it is in good condition. Even if he or she checks its condition before it goes off to storage, the mercury might be hidden inside, so the conservator isn’t aware of the danger. The collections manager has some knowledge about dangerous substances but not about the object and might not be able to spot the danger if it isn’t widely known to his/her profession (like arsenic in taxidermied specimens is). The database manager has the knowledge about how to make dangerous substances retrievable in the database and maybe even knows how to label them properly, but again, as he or she doesn’t have knowledge about the object, he or she doesn’t know there’s a problem.
Although all the experts work for the same institution, if they don’t assess the object together and bring their knowledge together, they are likely to overlook a danger and pose a health risk to colleagues, future researchers and visitors.

The importance of knowledge in cataloging

It is also obvious how dangerous it is when whoever is doing the catalog entry doesn’t have in-depth knowledge about the objects. There is a tendency in museums to think that cataloging is a task that can be done by “whoever”. Knowledge isn’t important, every intern can key in a short description and some measurements, right? Of course we all know that’s nonsense, but arguing against it is tough. It’s hard to communicate what damage it does if dates, measurements and categorizations aren’t correct. With hazardous materials the danger should be obvious: someone doing the catalog entry who doesn’t have enough knowledge to understand the working principles is likely to overlook the danger and therefore poses a threat to the lives of his or her colleagues and visitors.
If the curator can’t do the catalog entry him-/herself for a good reason (And: no, being too lazy/old/busy to learn how to do it isn’t a good reason, at least in my book!) he or she has to share his/her knowledge about the object with whoever does the catalog entry.

How to do it better

Objects containing mercury labeled according to international standards.

There are a few things that can be done to avoid unpleasant surprises:

  1. When an object is acquired, consult with everyone involved in the process. All the expert knowledge at one table will help to discover as many potential hazards as possible.
  2. If you are a one woman/man museum, make sure to reach out to experts in your area, your regional museum association or international experts via listservs and online groups to learn about the possible dangers your new acquisition contains.
  3. If the hazard is new, define safety precautions in handling and storage. If the hazard is long known, make sure your handling and storage precautions are still up to date with current research.
  4. In the database: make sure the hazardous material is named. In an ideal setting you do have a thesaurus of dangerous substances to pick from which are linked to safety precautions and correct labeling.
  5. In the database: make sure an object that contains dangerous substances is clearly distinguishable from other objects so everybody is aware that there might be special handling and storage precautions.
  6. In the storage: label dangerous substances according to international standards.
  7. In the storage: store hazardous materials according to the safety precautions. This might involve special containers or rooms with a ventilation system and handling instructions clearly visible on the container.

Live long and prosper!
Angela Kipp


Build Your Own Data Logger – We Want Fahrenheit!

Okay, so far this tutorial has been quite Europe-centric. But you might want to have your data in Fahrenheit. There are two ways of doing this: in the Arduino software or in the spreadsheet software. Because I left you with the spreadsheet software Calc in the last post, we will first do it this way. To convert Celsius into Fahrenheit, multiply the temperature in Celsius by 1.8 and then add 32.

In our spreadsheet software we add a new column and write the formula “=D1*1.8 + 32” for the first of our temperature values in D1:

The formula for converting Celsius into Fahrenheit.

Like last time we want a formula to apply for all our data so we write the first and the last cell into our address field, this time it’s “C1:C8484”:

For which cells the formula should apply.

Don’t forget to hit the enter/return key after you have typed this. Now we again choose the “fill–>down” option from our “edit” menu:

Filling the formula into all other fields of this column.

But our graph is still in Celsius? Don’t worry, we will fix that now. We double-click our diagram and then go to our “format” menu to choose “data range”.

Changing the data range

Here we can choose to add and remove columns. If we want to ditch the Celsius, we click on the column D and change the D into a C in the formula:

Changing column D to C.

We see that the column D is now displayed as column C and in the graph we have the Fahrenheit values instead of the Celsius values. If we want to add the Celsius values to the graph again, we choose the Y-values, then “add” and change the “unknown data” to the column with the Celsius values:

Add a column.

Change added column to column D.

By the way, if the color of our graph bothers us, we can change that by double-clicking the line and changing the color to whatever we like:

Changing the color of a line.

And if we want Fahrenheit right from the start in our Arduino code? Well, search for this line:

And change it to:

What? That simple? Yep, the “true” tells the library that you want the temperature values in Fahrenheit. If this value isn’t set or you write dht.readTemperature(false) it shows the value in Celsius. Easy!

Even if you change it in the software, you might like to keep the two conversion formulas in the back of your mind in case you are exchanging temperature data with partners in the U.S. or in other parts of the world:

degree Fahrenheit (°F) = degree Celsius (°C) × 1.8 + 32
degree Celsius (°C) = (degree Fahrenheit (°F) − 32) / 1.8
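If you process your logger data with a script instead of a spreadsheet, the two formulas translate directly into Python. The function names are my own choice for this sketch.

```python
def celsius_to_fahrenheit(celsius):
    """degree Fahrenheit = degree Celsius x 1.8 + 32"""
    return celsius * 1.8 + 32


def fahrenheit_to_celsius(fahrenheit):
    """degree Celsius = (degree Fahrenheit - 32) / 1.8"""
    return (fahrenheit - 32) / 1.8


# Convert a few Celsius readings and convert them back again.
for celsius in (0.0, 19.0, 22.0, 100.0):
    fahrenheit = celsius_to_fahrenheit(celsius)
    print(celsius, "°C =", round(fahrenheit, 1), "°F =",
          round(fahrenheit_to_celsius(fahrenheit), 1), "°C")
```

Rounding is only applied for display, because computers store decimal fractions like 1.8 with tiny inaccuracies.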

In the next part we will do some more awesome things with our data, so stay tuned.
Keep your climate lines straight!


Build Your Own Data Logger – Processing Data With Microsoft Excel

In the last post I recommended OpenOffice Calc by Apache (https://www.openoffice.org/product/index.html) to use as the go-to spreadsheet software, but for those of you who might have the Microsoft Office Suite anyway, here we have the same in Excel:

Instead of opening our “MyLogger.csv” (which would mess up the data pretty badly), we first create a new spreadsheet. Then we choose the “Data” panel and “from text”. We choose our file and hit “import”. We choose the option that we have “separated” values. Then we hit “next”.

Our csv file in the preview window.

The next window allows us to specify how our values are separated. In our case they are separated by a comma.


As we choose “comma” the preview window shows us how it separates the values.

After we hit next, we get the option to choose a format for our values. It is easy to mess up your data if you choose the wrong one. In most cases, you can leave it set to “standard”.

You can choose the format of the data in all the columns separately.

There is a special way to shoot yourself in the foot if you are working with a language package other than the English one. In German, the separator for decimals is the comma, not the point. With the German language package, Excel won’t recognize your decimals as decimals if you leave the “.” from the original software, and it will do all kinds of funny things with them, like dropping everything after the point or not interpreting your numbers as numbers. You have to choose the decimal separator for each column in the “Options…” field.
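A script sidesteps the decimal problem, because Python’s `float()` always expects the point as decimal separator, whatever language your operating system speaks. The sketch below assumes a file layout I made up for illustration (humidity, temperature, then six date and time fields, all separated by commas). Your “MyLogger.csv” may be ordered differently. The helper `to_decimal_comma` shows how you could write values for a German spreadsheet.

```python
import csv
import io


def read_logger_csv(text):
    """Parse comma-separated rows whose numbers use '.' as decimal mark."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        rows.append([float(field) for field in record])
    return rows


def to_decimal_comma(value, digits=1):
    """Format a number with a decimal comma, as German spreadsheets expect."""
    return f"{value:.{digits}f}".replace(".", ",")


# One made-up line in the assumed layout.
sample = "45.5,21.25,2016,5,1,8,7,0\n"
rows = read_logger_csv(sample)
print(rows[0][0], "->", to_decimal_comma(rows[0][0]))
```

If you export decimal commas this way, remember to use a different field separator in the output file, such as a semicolon, or the commas will collide.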

Tweak for decimals in foreign languages.

When you hit finish you get another dialog where you just hit “ok”. Now you have the data in your spreadsheet. We proceed much like we did in Calc. We add a new column at the beginning for our timestamp. We create the timestamp by joining our data in the right format: we put our cursor in A1 and type “=”, which indicates the beginning of a formula. Then we type in the first field “D1” and place an “&” to join it with the next piece we need, which is a “/”. We type that in quotation marks because otherwise our software will interpret the / as a division. We add the next field, which is E1, with & and so on. After 6 fields, two “/”s, a blank and two “:” we have:
=D1&"/"&E1&"/"&F1&" "&G1&":"&H1&":"&I1
When we hit enter, we should see a nice, clean timestamp in A1:

Our first timestamp.

Creating a timestamp from data in columns D to I.

Now we want this in our whole column A. We check how many rows of data we have, which in this case is 8484 rows; yours might differ. We now go to the address field, write “A1:A8484” and hit enter. This tells Excel that we want to do something in all A fields from A1 to A8484. That’s why these are marked now. Next up we go to the “start” menu and choose the “fill” option on the far right of our screen. There we choose “down”. Now all our data sets have a nice timestamp made from columns D to I.

We select all our A fields in the address field and choose the fill option.

Ready for a nice diagram? Okay, here we go. You first mark columns A to C, which hold all the data we want to see in our graph. Then we choose the diagram option from the “Insert” menu. You can either go to “recommended diagrams” or directly choose “lines”.

Choose your diagram options.

If you hit “lines” your graph will be generated on the spot, otherwise you can look at the recommendations and play around with them:

Diagram options.

Now we have a nice diagram to enhance, rename and play around with:

Climate graph ready.

Now Calc and Excel users are on the same page, so next up we can do some nice things with our data…

Read the other posts for this project:


Build Your Own Data Logger – Processing Data with OpenOffice Calc

Collecting data is nice, but not a value in itself. We collect data with our logger to actually do something with it. To process our data further I will use OpenOffice Calc by Apache (https://www.openoffice.org/product/index.html). Why Calc and not Excel? Various reasons: it’s free, it’s open source, it’s available for Windows, Linux and Mac and, most importantly, it is very user friendly for processing data. It beats Excel in many areas, at least in my opinion (I will follow up with an Excel part of this, though).

So, now we have written software that saves our climate data as “MyLogger.csv” on our SD card. Next up we copy the file from the SD card to our computer and open it with OpenOffice Calc. You should get something that looks like this:

Window when you open a csv-file directly with Calc

Yours might be in English, though. Basically the program suggests making a spreadsheet out of your comma separated values using the comma as the marker for the columns – which is exactly what we need. If you used different separators, you can adjust this in this dialogue. Once you are satisfied with how the preview looks you hit “OK”.

Your raw data spreadsheet.

While we could make a graph for temperature and humidity right there and then, it’s probably better to have our date and time in a format we can use. We could have fixed this in our software already – but nobody is perfect, we just note this for our improvements. For now, we just add another column to our spreadsheet: we mark our column A and choose “Insert”–>Column. A new column A appears on the left of our original column, which is now “B”.

Our new empty column A.

Now we will make a nice, new date and time out of the snippets we got in columns D to I. We want our timestamp to look like this: “2017/4/1 0:1:22”. To do that, we combine the data from the columns with a formula, which means we put our cursor in A1 and type “=”, which indicates the beginning of a formula. Then we type in the first field “D1” and place an “&” to join it with the next piece we need, which is a “/”. We type that in quotation marks because otherwise our software will interpret the / as a division. We add the next field, which is E1, with & and so on. After 6 fields, two “/”s, a blank and two “:” we have:
=D1&"/"&E1&"/"&F1&" "&G1&":"&H1&":"&I1
When we hit enter, we should see a nice, clean time stamp in A1:
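
For those who like to see the same idea outside a spreadsheet: joining the six fields is plain string concatenation. The following is a small sketch in plain C++ with a made-up function name; it assumes that columns D to I hold year, month, day, hour, minute and second in that order.

```cpp
#include <string>

// Join six numbers into a timestamp, the way the spreadsheet formula
// joins the cells D1 to I1 with "/", a blank and ":".
std::string makeTimestamp(int year, int month, int day,
                          int hour, int minute, int second) {
    return std::to_string(year) + "/" + std::to_string(month) + "/" +
           std::to_string(day) + " " + std::to_string(hour) + ":" +
           std::to_string(minute) + ":" + std::to_string(second);
}
```

Calling makeTimestamp(2017, 4, 1, 0, 1, 22) returns “2017/4/1 0:1:22”.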

The formula entered and executed.

We want this in our whole column A, right? But we first have to see how many rows of data we have, so we look at our last row, which in this case is row 8484; yours might be different. We now go to the address field, write “A1:A8484” and hit enter. This tells Calc that we want to do something in all A fields from A1 to A8484. That’s why these are marked now.

We select all our A fields in the address field.

Now comes the trick. From the “Edit” menu we choose the “fill” option and choose “down”. Now all our data sets have a nice time stamp made from columns D to I.

From the “edit” menu, choose “fill” and then “down”.

Ready for a nice diagram? Okay, here we go. You first mark columns A to C, which hold all the data we want to see in our graph. Then we choose the diagram option from the “Insert” menu:

Choose the “diagram…” option from the “Insert” menu.

For the diagram type we choose “lines”. In the background we already get an idea of what our graph will look like.

We choose the lines for a type.

We can now hit “finish” or make some adjustments like giving our diagram a title. As soon as we hit “finish” we have a graph that we can drag around, enhance and even cut out and paste in a new spreadsheet, just as we please.

The finished graph.

There, we have a nicely enhanced graph in an added spreadsheet.

With all the Calc knowledge we have gained, we will next tweak our data the way we want it. For example: those temperatures are in Celsius and we want them in Fahrenheit. But first, we will do the same in Microsoft Excel, for those who feel more comfortable with it…

Read the other posts for this project:


Itsy-bitsy climate engineer

Our education department is running some activities on weather and climate this summer and asked us if we could spare a logger. Of course we could… but we could also build them a special one that measures barometric pressure, too. Who doesn’t love to learn how to do a little weather forecast by looking at the barometer? But, but, don’t we need an engineer who takes care of that logger while it does its duty? There, we fixed it:

plastic spider on data logger

Our itsy-bitsy sensor engineer keeps quite a few eyes open…

This text is also available in Italian, translated by Silvia Telmon.


Build Your Own Data Logger – The Software, Telling the Logger to Log

Okay, with the arduino, shield, wiring and sensor complete, we’ve got our stuff together and can get our logger to log. To do that, we need to tell the arduino what it has to do. This is done with the arduino coding language. Now, what’s that and why do we need it?

As we have seen in the last part, an arduino itself is not very intelligent. To every ordinary person you can say “Would you be so kind and fetch me a cup of coffee?” and he or she will be able to execute that task without further ado, given he or she knows where the kitchen is and all necessary tools are available. If you want the same thing from a machine, you have to speak its language (or have a translator, which is called a “compiler”) and you have to think about the task you give it in a way as if that thing doesn’t know anything about this world. Which is, in fact, true for any machine. So, to stay in the example, to code a machine you have to say:

When hearing the command “Would you be so kind and fetch me a cup of coffee?” do the following:

1. Go to the kitchen
2. If the door is locked, open it to go into the kitchen
3. Go to cupboard
4. Open door of cupboard
5. Take out 1 cup
6. Close door of cupboard
7. Put cup under the coffee machine
8. Press first button
9. Wait until fluid has filled cup
10. Take cup in an upright position
11. Bring cup to person who spoke command

Silly, isn’t it? You would go mad with an assistant asking for such precise commands. That’s why some people find it hard to code – it’s very difficult to think in such basic steps. But, anyway, we want our logger to log, so let’s take a look at the necessary code step by step (the complete code can be copied from our Quick Start Guide):

The part that is introduced by /* and ended by */ is a comment, something that is written for humans to read, not for the arduino to understand. Think of it like spelling out certain words so the kids don’t understand. Well, we never can be sure with kids and spelling, but we can be quite sure with /* */ and arduino.

You use these comments to make sure that another human being is able to understand what you wanted to achieve with a section of code. Chances are that human being is you, because after some time you won’t remember why you coded some things that way. Comments are a part of good documentation, something we collections folk like, right?

Next up we have a couple of so called “libraries” we include in our code.

We have seen what an arduino mind needs to fetch you some coffee. Well, someone has already defined all the steps beginning with “1. Go to the kitchen…” in a library, so if you want your specific arduino assistant to be able to fetch coffee for you, you just have to write “#include <coffee.h>” at the beginning of your code and whenever you write “Would you be so kind and fetch me a cup of coffee?”, your assistant will be able to do all the necessary steps to bring you a nice, hot cup of coffee. It will also include what to do if the coffee machine is turned off, the water tank is empty, there is no coffee…

Now, I have to admit that I don’t understand all the libraries that are included here. Of some I only know that I need them, and I know that because I saw them used in some example code. I think of it like when we need a conservator – of course we have to know which specific conservator we need, but we don’t have to fully understand what she or he does. Although, of course, the better we understand what she or he does, the more effectively we can cooperate.

For our logger we have included some libraries so it:

  • understands some functions you might need from the programming language C (stdlib.h)
  • knows how to handle time, that is, knowing that there are seconds, minutes, hours, days… (Time.h)
  • knows how to read the Real Time Clock on the logger board (DS1307RTC.h)
  • knows how to communicate with I2C devices (Wire.h)
  • understands what our sensor tells it (DHT.h)
  • knows how to communicate with peripheral devices like the SD card reader (SPI.h)
  • knows how to read from and write to an SD card (SD.h).

Next up we have to define where our sensor is and what type of sensor we use. The DHT library we included is able to handle DHT11, DHT21 and DHT22, so we have to specify that we connected a DHT22 and we connected it to our pin 9. The notations behind the “//” are again comments to be read by humans, not the arduino:

Next our sensor gets a name so we can order it to do something.

To keep things simple, we called it just “dht” in small letters, but we also could have called it “Walter”, “Gretchen” or “sensor1”. It’s only important that it is named consistently and that we are careful in using upper and lower case. For an arduino “Gretchen” is something other than “gretchen”, so the program won’t run if you make a mistake here.
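
The point about upper and lower case can be tried out in a tiny example. The following is a sketch in plain C++ (the language behind the arduino language); the names and numbers are purely hypothetical and have nothing to do with the logger code.

```cpp
// Two DIFFERENT variables: the names differ only in the case of the
// first letter, and the compiler keeps them strictly apart.
const int gretchen = 9;    // hypothetical: pin of a first sensor
const int Gretchen = 10;   // hypothetical: pin of a second sensor

// Returns true because the two names stand for two separate values.
bool namesAreDistinct() {
    return gretchen != Gretchen;
}
```

If only “gretchen” had been defined, any line using “Gretchen” would stop the program from compiling with an error saying the name was not declared.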

The next line makes sure that we can communicate with the SD-Card although we used a logger shield. In the library, pin 4 is defined for a certain action, but this is already taken by the shield, so the arduino should use pin 10 instead.

So far, we have just made sure that our arduino knows what it needs to know. Next up we enter the “setup”. Think of it as your new assistant walking through the door. Before you can order her/him to do anything for you, you have to show her/him around. Where is the toilet? Where is the kitchen? Where is the coffee machine… This all will happen in the curly brackets after “void setup”.

Actually, the very first thing we do is we tell our imaginary assistant how s/he should communicate with us. Our arduino will be able to tell us what it does when it is connected to a computer using a thing called “serial communication”. It will be able to send information via the USB cable which we can read in the Serial Monitor of our arduino software. The line Serial.begin(9600) is like telling our assistant that s/he should communicate with us in English.

Next up, we tell our arduino that it should use pins 7 and 8 as outputs. This is where our two LEDs are connected, but our arduino only knows this if we tell it so. There are two possibilities for a pin: it can be an output or an input. If we define it as an output we can send signals to it that will do something with the thing that is connected to said pin. In our case, if we send that pin a “HIGH” signal it will switch the LED on, if we send it a “LOW” signal, it will turn it off.
If we define a pin as an input our arduino will “listen” to what happens on said pin instead. If the arduino receives a signal there, it can do something according to that signal. But in this case, we only need an output for our LEDs.

Next up, we do a couple of checks to see if our SD card works properly. It prints “Initializing SD card…” to the serial monitor so we can see it.
There is again a pin, pin 10, we define as an output, because our SD-Card-Reader needs this (we know this from the example code).

Now, the arduino checks if it can read the SD-Card. If it can’t read it, it sends a message to the serial monitor saying “Card failed, or not present”.
But “in the field” we won’t have a USB cable connected to a computer, only the logger itself. So we use our red LED on pin 7 to send us the same message. If the arduino doesn’t find the SD card reader or the SD card, it switches the red LED on for 5 seconds. In the arduino language this time span is given in milliseconds. So you see that we send the LED a “HIGH” signal, then wait (delay) for 5000 milliseconds until we send it a “LOW” signal to turn it off.
This is a mission for the Q-Tip: if the red light indicates that the SD card is missing or not properly inserted, you can put your SD card in the slot and press the Q-Tip, which reaches the reset button inside the case, to restart your arduino and try again.

If the arduino can read the SD-Card it will send the message “card initialized.” to the serial monitor. Next it sends “DHTxx test!”. Again, without a computer attached we have no way of seeing whether it could read the SD card, so in this case we switch the green LED on pin 8 on for 5 seconds.

With the simple statement “dht.begin();” we tell our sensor that it should start reading.

Now that our setup is finished, we can tell our assistant what s/he shall do all day long. We do this in a function that is called “loop”. This function will repeat itself forever, unless we code something that makes it stop (or the plug is pulled).

What we want to do repeatedly is to read how high the humidity and the temperature in our room are, right? To read from our sensor, we call “dht.readHumidity” for the humidity and “dht.readTemperature” for the temperature.
If we want to use these values repeatedly in our code we use a thing called “variables”. A variable is something like a bag. We can store a value in it and carry it around. In our case we call our variables “h” for humidity and “t” for temperature. Bags come in all sorts and sizes, and so do variables. You would choose your small, black handbag for a dinner invitation and your rucksack for your day trip, so you always have the bag that fits your storage needs. Our sensor readings will come in a form like 14.5 or 34.8, that is, as floating point numbers. So we choose the variable type “float” for them. There are a lot of other variable types, but for now, we just learn that “float” is the right type for our sensor values.
To sum up, the following lines of code store our sensor readings in the variables “h” and “t”. Whenever we call those variables in the following parts of the code, they will give us the stored sensor values.

But what will happen when the sensor returns something that isn’t a valid value for humidity or temperature? The next part of our code checks exactly this and reacts accordingly.

If either the value for humidity which we stored in “h” or temperature which we stored in “t” is not a number, the arduino will inform us by writing “Failed to read from DHT sensor!” on the serial monitor. The expression for something not being a valid number is called “isnan” (for IS Not A Number). Instead of writing “or” between the conditions, we have to use wording the arduino understands, which are the two upright strokes || (there are a few of these, like “and” which is &&, “greater than” which is > or “smaller than” which is < ).
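
To see how “isnan” and the operators ||, &&, > and < work together, here is a small sketch in plain C++ that can be tried on any computer. It is not the logger’s own code: the function names are made up, and the temperature limits in the second function are assumed example values, not something taken from the sketch.

```cpp
#include <cmath>

// True when BOTH readings are real numbers.
// "||" means "or": one invalid value is enough to reject the pair.
bool readingsAreValid(float h, float t) {
    if (std::isnan(h) || std::isnan(t)) {
        return false;
    }
    return true;
}

// "&&" means "and": the temperature must be above the lower limit
// AND below the upper limit (assumed example limits in degrees Celsius).
bool temperatureInExampleRange(float t) {
    return t > -40.0f && t < 80.0f;
}
```

With h = 34.8 and t = 14.5 both functions answer “true”; if one of the two readings is not a number, readingsAreValid answers “false”.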

Again, with a free standing logger we won’t see a message on our computer, so we let our red LED blink frantically if the sensor values are nonsense. There might be more elegant ways to code this, but I’m just a collections manager, not an IT professional.

Next up we print our values to the serial monitor as a check, in case we have a computer connected. By now you should be able to understand what happens:

Now, we need the time that comes from our Real Time Clock. Note that to use the Real Time Clock properly, you have to set it to the correct time first, using the example provided with the RTC library. Basically, here we say “look at the clock and remember everything you saw in a variable called ‘tm’”. We will be able to call up the specific day, month, hour, minute, second… that way later in the code.

What follows next is perhaps a little weird to explain and to look at. We want to store our data on the SD card later, in a form that each data point is separated by a comma. That way, we can use any old spreadsheet software, import the data in a form that is called “comma separated values” (CSV) and process it further. The thing is, our data are numbers. You remember how we defined our sensor readings to be floats? Yep.

What we need to process it further is characters, in other words, we need a string. To be more exact, we need a string that incorporates all the data we want to be stored when we read our sensor. We want to have something that looks like this:
“34.8, 14.5, 2017, 04, 14, 2, 45, 23,” that we can have in our spreadsheet software to process a reading of 34.8 % relative humidity, 14.5 degrees Celsius on the 14th of April 2017 at 2:45 a.m. (and 23 seconds).

To achieve this, we first open up a new bag called “dataString” to store our values in. I must admit I don’t get what line 116 really does, but it has something to do with defining the size that is available for our values.

What happens next is that we put all our values that we want to store into our “bag” called “dataString”. We do this one by one, as if we open up our bag, put the tape measure in, put the gloves in, put the lipstick in… The tricky thing is that whatever we take, we first need to convert something that is a number into a string. Hmmm… perhaps like if you want to put a fluid into your bag. You first have to put it into a container. Well, perhaps not exactly so, but along these lines.

So, we put our humidity value into a container. We call this container “stringH”. The function “dtostrf” does this with our variable h, which is, as we know a float number of our relative humidity reading. Then we put our container “stringH” into our bag “dataString”:

We said we wanted to have comma separated values in the end, so what we have to do now is to add a comma. We take our “dataString” bag and put a comma in, using “+=” as the order to do so. Here we go:

The same goes with our temperature reading:

What does our bag now contain? Something that looks like that: “34.8, 14.5, “. You can make sure by ordering your code to let you know on the serial monitor by adding this line:

We didn’t do that here. Instead we put the values for the year, month, day, hour, minute and second of the reading into our bag one by one, each separated by a comma. Note: I later discovered that I wouldn’t have had to separate them all with a comma, but we will discuss this later in the series. For now, we just know it works.
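
The whole “bag packing” can be sketched in a form that runs on an ordinary computer. This is plain C++, not the logger’s own code: where the arduino sketch uses “dtostrf” to turn a float into text, this example uses the standard function snprintf instead, and the function name buildDataString is made up. Unlike the example string in the text, the month is not padded with a leading zero here.

```cpp
#include <cstdio>
#include <string>

// Pack one reading into a line of comma separated values.
std::string buildDataString(float h, float t, int year, int month, int day,
                            int hour, int minute, int second) {
    char number[16];          // small "container" for one converted number
    std::string dataString;   // the "bag" that collects everything

    std::snprintf(number, sizeof number, "%.1f", h);  // float to text, one decimal
    dataString += number;
    dataString += ", ";                               // "+=" puts something into the bag
    std::snprintf(number, sizeof number, "%.1f", t);
    dataString += number;
    dataString += ", ";

    const int parts[] = {year, month, day, hour, minute, second};
    for (int part : parts) {
        dataString += std::to_string(part);           // whole number to text
        dataString += ", ";
    }
    dataString.pop_back();    // drop the blank after the last comma
    return dataString;
}
```

Calling buildDataString(34.8f, 14.5f, 2017, 4, 14, 2, 45, 23) returns “34.8, 14.5, 2017, 4, 14, 2, 45, 23,”.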

Whew, that’s a lot of code. Our “bag” dataString now looks like that: “34.8, 14.5, 2017, 04, 14, 2, 45, 23,”. Next up, we want to write it to our SD card. To do that, we have to open the file the string should be written to:

If the arduino finds the file called “Mylogger.csv” on the card, it opens it, writes the content of dataString to it (at the end of all other data that is already stored there) and closes the file again. Mission accomplished!

What’s great about this is that if there is no file called “Mylogger.csv” on the SD card, the arduino will automatically create it. Only in cases where there is such a file but it can’t be opened, or where the SD card is missing, do we need error handling, which informs us on the serial monitor and lights up the red LED until the next loop:
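
The “open, append, close, or report an error” pattern can be tried on a normal computer, too. The sketch below is plain C++ writing to an ordinary file, so it is only a stand-in for the logger code: on the arduino the SD library’s “SD.open” with “FILE_WRITE”, “println” and “close” play the same roles. The function name appendLine is made up for this example.

```cpp
#include <fstream>
#include <string>

// Append one line to a file; the file is created if it does not exist yet.
// Returns false when the file could not be opened. On the logger this is
// the moment to print an error message and light the red LED.
bool appendLine(const std::string& fileName, const std::string& line) {
    std::ofstream dataFile(fileName, std::ios::app);  // "app" = write at the end
    if (!dataFile) {
        return false;
    }
    dataFile << line << '\n';
    return true;  // the file is closed automatically when dataFile goes out of scope
}
```

Calling it twice with the same file name leaves two lines in the file, the second below the first.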

Finally, we have to define how long the arduino should wait between measurements. The more often you read, the more data you get, which is a plus in detail, but also needs more storage space. In our example, we wait 5 minutes between readings, which is 300000 milliseconds. For 10 minutes, set it to 600000 milliseconds and so forth.
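
If you don’t want to count zeros, the conversion from minutes to milliseconds can be left to a small helper. This is a sketch with a made-up function name, not part of the logger code. The type “unsigned long” is chosen on purpose: on an arduino Uno a plain “int” only holds values up to 32767, so 300000 would not fit into it.

```cpp
// Convert a pause given in minutes into milliseconds.
// unsigned long is used because the result quickly becomes too big
// for a 16-bit int as found on an arduino Uno.
unsigned long minutesToMillis(unsigned long minutes) {
    return minutes * 60UL * 1000UL;
}
```

minutesToMillis(5) gives 300000 and minutesToMillis(10) gives 600000, the two values mentioned above.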

That’s it, that’s the whole code. There is, of course, room for improvements, for example if you need the temperature in Fahrenheit or want to calculate the dew point. But this will be another part of the series…

Read the other posts for this project: