My Shiny Weblog!

programming, photography and lifestyle

Europeana

Europeana is the European cultural heritage aggregator. The idea behind it is simple: every library, museum or other cultural institution within the EU should provide metadata about its items. This metadata contains the identifier, description, location and type of each item. It is in XML format, validated against the Europeana Semantic Elements (ESE) schema. A collection of record tags is contained within a single metadata tag, and each record has a type of text, image, sound or video. We have a collection of folk songs whose metadata was uploaded to Europeana. A folk song in our collection could be described like this:

<?xml version='1.0' encoding='UTF-8'?>

<metadata xmlns="http://www.europeana.eu/schemas/ese/"
          xmlns:europeana="http://www.europeana.eu/schemas/ese/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:dcterms="http://purl.org/dc/terms/">

   <record>
      <dc:identifier>BA,001,2,09</dc:identifier>
      <dc:title xml:lang="en">Bulgarian folklore song</dc:title>
      <dc:title xml:lang="bg">Българска народна песен -- Снощи отидох, мамо</dc:title>
      <dc:subject xml:lang="bg">Фолклор -- България -- Кортен</dc:subject>
      <dc:description xml:lang="bg">Лазарска народна песен</dc:description>
      <dc:publisher>Magrathea Information Technologies</dc:publisher>
      <dc:type>Text</dc:type>
      <dc:format>text/pdf</dc:format>
      <dc:date>1908</dc:date>
      <dc:rights>Institute of folklore, BAS</dc:rights>
      <dc:language>bg</dc:language>
      <europeana:object>https://folk.magrathea.bg/pdf/ba_001_2_09/0</europeana:object>
      <europeana:provider>Bulgarian Academy of Sciences</europeana:provider>
      <europeana:type>TEXT</europeana:type>
      <europeana:rights>http://creativecommons.org/licenses/by/3.0/bg/</europeana:rights>
      <europeana:dataProvider>Institute of folklore, BAS</europeana:dataProvider>
      <europeana:isShownAt>https://folk.magrathea.bg/pdf/ba_001_2_09/0</europeana:isShownAt>
   </record>
</metadata>
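
For anyone who would rather generate such a file than write it by hand, here is a minimal sketch in Python using only the standard library. It is not my actual indexing script: the `build_metadata` function and the dictionary keys (`identifier`, `title`, `lang`, `type`, `url`) are made up for illustration, and only a handful of the elements from the record above are emitted. The output uses the `europeana:` prefix on `metadata` and `record` instead of a default namespace, which is the same thing as far as XML namespaces go.

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the record shown above.
ESE = "http://www.europeana.eu/schemas/ese/"
DC = "http://purl.org/dc/elements/1.1/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Ask ElementTree to use readable prefixes instead of ns0, ns1, ...
ET.register_namespace("europeana", ESE)
ET.register_namespace("dc", DC)


def build_metadata(items):
    """Build an ESE-style document from a list of dicts.

    The dict keys are hypothetical; adapt them to your own repository.
    """
    root = ET.Element(f"{{{ESE}}}metadata")
    for item in items:
        record = ET.SubElement(root, f"{{{ESE}}}record")
        ET.SubElement(record, f"{{{DC}}}identifier").text = item["identifier"]
        title = ET.SubElement(record, f"{{{DC}}}title")
        title.set(XML_LANG, item["lang"])
        title.text = item["title"]
        ET.SubElement(record, f"{{{ESE}}}object").text = item["url"]
        ET.SubElement(record, f"{{{ESE}}}type").text = item["type"]
        ET.SubElement(record, f"{{{ESE}}}isShownAt").text = item["url"]
    # Returns a str without the XML declaration.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    song = {
        "identifier": "BA,001,2,09",
        "title": "Bulgarian folklore song",
        "lang": "en",
        "type": "TEXT",
        "url": "https://folk.magrathea.bg/pdf/ba_001_2_09/0",
    }
    print(build_metadata([song]))
```

Whether the result passes the content checker still depends on the remaining required elements and their order, so treat this as a starting point only.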

Files like this should be sent for validation to the Europeana content checker. A JNLP for upload is provided here. The metadata is validated and published on the content checker site. I expected a more flexible API to be available, maybe something like Facebook's or Twitter's, that could be used for live systems integration. But that is apparently not (yet) the case. I guess I will have to rerun my indexing script and upload again whenever there is new content in our repository. Reading the documentation on the site was not a pleasant experience. There are far too many PDF and DOC files out there, with Windows screenshots and click-through guides. I think it should be far simpler, shorter and more developer-oriented.
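
Since every upload is a manual round trip, it may be worth catching the obvious mistakes locally first. The sketch below is an assumption of mine, not something Europeana provides: a hypothetical `check_records` function that only verifies that each record carries a `dc:identifier` and that its `europeana:type` is one of TEXT, IMAGE, SOUND or VIDEO. It does not replace validation against the ESE schema or the content checker itself.

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the record shown earlier in the post.
NS = {
    "europeana": "http://www.europeana.eu/schemas/ese/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
ALLOWED_TYPES = {"TEXT", "IMAGE", "SOUND", "VIDEO"}


def check_records(xml_data):
    """Return a list of human-readable problems; an empty list means none found.

    This is a rough local sanity check, not schema validation.
    """
    root = ET.fromstring(xml_data)
    problems = []
    for index, record in enumerate(root.findall("europeana:record", NS), start=1):
        identifier = record.findtext("dc:identifier", default="", namespaces=NS).strip()
        ese_type = record.findtext("europeana:type", default="", namespaces=NS).strip()
        label = identifier or f"record #{index}"
        if not identifier:
            problems.append(f"{label}: missing dc:identifier")
        if ese_type not in ALLOWED_TYPES:
            problems.append(f"{label}: unexpected europeana:type {ese_type!r}")
    return problems


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python check_ese.py songs.xml
    with open(sys.argv[1], "rb") as handle:
        for problem in check_records(handle.read()):
            print(problem)
```

Running it before each re-upload at least avoids waiting on the content checker just to learn that a record type was misspelled.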
