EAD-ODD: A solution for project-specific EAD schemes
|Title||EAD-ODD: A solution for project-specific EAD schemes|
|Publication Type||Journal Article|
|Year of Publication||In Press|
|Authors||Romary, L, Riondet, C|
|Keywords||Archival Science, EAD, Holocaust, Standards|
This article tackles the issue of integrating heterogeneous archival sources in a single data repository, namely the European Holocaust Research Infrastructure (EHRI) portal, whose aim is to support Holocaust research by providing online access to information about dispersed sources relating to the Holocaust (http://portal.ehri-project.eu). The problem at hand is to combine data coming from a network of archives in order to create an interoperable data space which can be used to search for, retrieve and disseminate content in the context of archival-based research. This scholarly purpose has specific consequences for our task. It assumes that the information made available to the researcher is as close as possible to the originating source, so that the ensuing analysis can be deemed reliable.

In the EHRI network of archives, as already observed in the case of the EU Cendari project, heterogeneity is unavoidable. The EHRI portal brings together descriptions from more than 1900 institutions. Each archive comes with a whole range of idiosyncrasies reflecting the way it was set up and has evolved over time. Cataloguing practices may also differ. Even the degree of digitization may range from the absence of a digital catalogue to the provision of a full-fledged online catalogue with all the necessary APIs for anyone to query and extract content. This situation contrasts with the international endeavour to develop and promote standards for the description of archival content as a whole. Nonetheless, in a project like EHRI, standards should play a central role.
They are necessary for many tasks related to the integration and exploitation of the aggregated content, namely:
● Comparing the content of the various sources, and thus developing quality-checking processes;
● Defining an integrated repository infrastructure where the content of the various archival sources can be reliably hosted;
● Querying and re-using content in a seamless way;
● Deploying tools that have been developed independently of the specificities of the information sources, for instance in order to visualise or mine the resulting pool of information.

The central aspect of the work described in this paper is the assessment of the role of the EAD (Encoded Archival Description) standard as the basis for achieving the tasks described above. We have worked out a strategy for defining specific customizations of EAD that can be used at various stages of the process of integrating heterogeneous sources. While doing so, we have developed a methodology based on a specification and customization method inspired by the extensive experience of the Text Encoding Initiative (TEI) community. In the TEI framework, as we show in section 1, one has

1 https://team.inria.fr/almanach/
2 Special thanks to Annelies van Nispen (NIOD) and Hector Martinez Alonso (ALMAnaCH) for their help.