20 Aug 2009

Create a custom-made book cradle to minimise risk during the digitisation process





Introduction



In 2009, as part of its long-term digitisation strategy, the British Library digitised 250 early Greek manuscripts as the first phase of an ongoing, externally funded project to digitise the entire Greek manuscript collection. The manuscripts chosen for digitisation ranged in condition from fair to good. Those in poor condition were excluded from digitisation and either sent for treatment or set aside for later conservation.



One piece of equipment that became essential to the project was a humble book cradle designed for handling manuscripts during the digitisation process. Here we provide instructions for its assembly.



The need for a book support for the digitisation process was highlighted by a risk assessment carried out at the beginning of the project, the framework of which may form part of a further article. This collection care risk assessment aimed to identify possible causes of harm to the manuscripts and to help mitigate those risks while keeping the project workflow easy. To manage these collection care requirements, the project included the work of a full-time book conservator to evaluate the risks attached to the digitisation of this collection.



It was established quite early that the main risk factors lay in the handling processes, particularly during:



  1. Transport


  2. Digitisation in studios


  3. Storage




The main focus of this assessment was to evaluate the risks to manuscripts through the process of mechanical handling during the whole project and some aspects of storage.



As handling presented the highest risk overall, it was imperative that everyone involved in handling first observed some basic preservation handling rules: keeping hands clean, ensuring that the correct handling equipment, such as trolleys and cradles, was in place, and establishing safe access routes before moving items from one place to another.



Handling during the actual digitisation process was identified as presenting the highest risk of all. It was important to consider the worst-case scenario for digitisation handling, as handling during this process is not always in the presence of the collection care staff.



To mitigate the primary risks identified during the risk assessment phase of the Greek Manuscripts Digitisation Project, it was decided to design and develop an easy-to-use, adaptable book cradle to support manuscripts while they were being digitised.



The book cradle was designed to be used only where one page at a time was being photographed or digitised. It was used with a common photographic stand, with the camera placed perpendicular to the page to be digitised. The book support is also adjustable to suit the type of spine of the book being digitised.



The resulting cradle has enabled safer and faster digitisation of many of the manuscripts so far and has also been used recently for other digitisation projects.

Book support: construction and use



The following instructions are intended to explain and show clearly and easily the construction of the book cradle. They are offered to anyone wishing to make a cost-effective cradle for use in their institution.

Book supports for digitisation.







Fig 1 Book support with holding strips



The book support is made of three components:



  1. A base formed of two boards covered with Buckram (Fig 1)


  2. Two Plastazote® supports covered with the same archival cloth (Fig 2)


  3. Strips of Velcro placed on the edges of the support (Fig 3)








Fig 2 The components of the support



1. The base is formed by two identical 3 mm boards. The boards are covered with buckram, or any suitable archival material, which joins them together and creates a central hinge of approximately 10 mm.







Fig 3 The base (component 1 above)



2. The Plastazote® supports are made from a piece of Plastazote® and a 3 mm board of the same dimensions. They are covered in such a way as to create a slit at the back of each Plastazote® support, into which the base board is inserted. To create this slit, place the piece of Plastazote® on the cover material to the left and the piece of board of the same dimensions on the right, leaving between them a gap equal to the thickness of the Plastazote® plus the thickness of the covered base board (5 mm).
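The gap described above is a simple sum, and a small helper can make it easy to recalculate for other materials. This is only a sketch: it reads the 5 mm in the text as the thickness of the covered base board (an assumption), and the function name and the 10 mm Plastazote® thickness in the example are hypothetical, since the article does not state the foam thickness.

```python
def cover_gap_mm(plastazote_mm: float, covered_base_board_mm: float = 5.0) -> float:
    """Gap to leave on the cover material between the Plastazote piece and the board.

    Assumption: the 5 mm default is the thickness of the covered base board.
    """
    if plastazote_mm <= 0 or covered_base_board_mm <= 0:
        raise ValueError("thicknesses must be positive")
    # Gap = foam thickness + thickness of the covered base board that slides into the slit
    return plastazote_mm + covered_base_board_mm


# Hypothetical example: a 10 mm sheet of Plastazote
print(cover_gap_mm(10.0))
```

Measure the covered base board after covering, as the cloth adds to the thickness of the bare 3 mm board.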







Fig 4 Diagram of template for the covering material for the Plastazote® bases



Next, secure the Plastazote® and the board bases to the cover material. Place the Plastazote® on A without gluing it but securing it with a weight. Glue the board on B aligning it with the Plastazote®.







Fig 5 Diagram of folding sequence for the Plastazote® support covering material



Glue verso 1 to the Plastazote®, do the same with 2 (head and tail) to the board and finish attaching 3.



Close B onto A placing a compensatory thickness equal to the thickness of the covered base board (the base board itself can be easily used for this) to create the slit. Place the glue on 4 and fold it over the verso of B at both sides.







Fig 6 Diagram (tail view) of the book support



Now place the strips of self adhesive Velcro (hook side of the Velcro) onto the Plastazote® supports at head and tail (short sides) of the Plastazote® bases and onto their thickness (see following diagram).







Fig 7 Diagram of strips of self adhesive Velcro (hook side)



To adjust the groove.



To adjust the groove to accommodate different sizes of the raised spine of the book it is necessary to secure the left side of the Plastazote® support at different heights.



This is achieved by placing 3 strips of self adhesive Velcro (loop side) on the verso of the left part of the base board. These strips need to be placed at intervals of 1 or 2 cm, parallel to the groove.



Cut a strip of board of the length of the base board or slightly shorter and 40 mm wide.



Adhere to the strip a new strip of self adhesive Velcro (hook side) and use the strip to support the Plastazote® base at the desired height (see following diagram and images).







Fig 8 Diagram of holding strip







Fig 9 Adjustable groove for housing different size spines







Fig 10 Pronounced spine properly housed ready for digitisation



 



Book supports with book in place.



3. The book is held in place by strips that can be made of linen tape, conservation paper, Melinex®, Mylar® or Tyvek®. These strips have Velcro (loop side) at each end, which fastens them to the base and secures the book to the book support during digitisation. The strips gently hold the left part of the book block out of the camera range and also provide an easy and fast way to change the page. Placed behind the page to be photographed, they can also help to hold the right side of the book block in place during the photographic process (as shown in the following illustration). To hide the strips, a sheet of archival paper can be placed behind the page being photographed as a blending background.







Fig 11 The book support is now ready to be used







Fig 12 The book in use







Fig 13 Book support with book in place with different opening angles



The opening angle of the book support can be changed as necessary. Foam wedges of different thicknesses can be used behind the book support to achieve a different opening angle. The opening of the book should not be more than 120 degrees, and the book should never be forced to open further than it will naturally.



Refinements can be made to the design once you have made the basic cradle; for instance, the Plastazote® bases can be bevelled at the edge close to the groove where the book spine will be placed, to follow the shape of rounded book spines. Also, where natural hollow or tight back spines need support, a rolled linen cloth can be used to fill the groove and support the book block from behind, as illustrated in the image below.







Fig 14 Natural hollow spine supported with rolled linen cloth



 

List of materials



BOOK SUPPORT



  • Board


  • Buckram cloth


  • Plastazote®


  • Strips of linen tape or Tyvek®, the length depending on the dimensions of the book support plus the space for the book block.


  • Self adhesive Velcro strips.


  • Foam wedges




Conclusion



The book support cradle was designed to reduce the handling of the books during the digitisation process. The book is secured on a non abrasive surface that keeps a suitable constant opening angle and allows the book to be positioned on the photographic table without further direct handling as the book rest itself can be moved with the book already in place.



The dimensions of the supports can be varied depending on the dimensions of the books to be digitised. More than one size should be available to the photographer/imager, and the book support needs to be bigger than the book to be digitised.



The strips, made of conservation-grade material (Tyvek® and linen tapes were the most suitable choices because of their strength and non-abrasive surface), keep the books open and reduce the risk of damage to delicate paper or parchment surfaces. The use of Velcro to secure the strips to the book support means they can be secured with a slight tension to prevent the angled opposing pages from slipping. The Velcro also makes the page turning operation quicker and safer.



The adjustable space in the centre of the book support between the covered Plastazote® bases enables the safe positioning of the spine of the books placed on the support. Different book sewing structures open in different ways during use, for example: hollow back books need space to accommodate the spine which is detached from the text block. Positioning books properly on the support enables the pages to be turned more easily and the adjustable cradle enables the dimensions of the gap to be increased or decreased to accommodate books of different thicknesses safely.



The increase of book digitisation projects has meant that the involvement of conservation/preservation departments is an essential part of successful project planning. Never before has so much emphasis been placed solely on the books as mere textual carriers. Much of the funding for these projects is awarded towards the accessibility of this textual information alone. For this reason, book conservators have a vital responsibility to contribute to these projects by supervising the safety of the physical items through the stressful process of digitisation.



Books now, more than in any other period, need to be preserved for future generations as artefacts and museum objects too. Important features of the artefacts can be lost simply because they are presently undervalued owing to pressing work schedules and other agendas, but it must be remembered that they are carriers of information on many levels, not just intellectual content.



Experience at the British Library has demonstrated that the involvement of the conservation/preservation element in digitisation projects must be factored-in at the beginning of the planning process. The early assessment of condition and risks is vital for the future conservation and safety of our irreplaceable heritage.

10 Aug 2009

The Latin American and Caribbean Cultural Heritage Archives (LACCHA, established 2008), with the Society of American Archivists (SAA, founded 1936) are to have their Roundtable Meetings this August.



The guest speaker will be Kent Norsworthy, Content Director of the Latin American Network Information Center (LANIC). His talk is titled:

Archiving the Latin American Web: Challenges and Opportunities



LAGDA seeks to preserve and facilitate access to a wide range of ministerial and presidential documents from 18 Latin American and Caribbean countries. The Archive contains copies of the Web sites of approximately 300 government ministries and presidencies. Capture of sites began on multiple dates in 2005 and 2006, and will continue with regularly scheduled captures.



Content in the Archive includes not only the full-text versions of official documents, but also original video and audio recordings of key regional leaders. Archive contents include thousands of annual and "state of the nation" reports; plans and programs; and speeches by presidents and government ministers. Content can be accessed via full-text search, or by browsing by country or by specialized sample collection, such as "Presidential Messages" or "Ministerial Documents".



LAGDA is a joint project of the University of Texas Libraries, The Nettie Lee Benson Latin American Collection, and the Latin American Network Information Center at The University of Texas at Austin. Web archiving services are provided by the Internet Archive's Archive-It service.



LAGDA Basics



• Collaborative effort



– Latin American Network Information Center - LANIC

– Benson Latin American Collection

– University of Texas Libraries



• 2003: CRL-Mellon grant, Web archiving political communications

• 2005: One of original five Archive-It Pilot Partners



Collection Focus



• Ministry and Presidency Web sites from Latin America & the Caribbean

• All major Spanish-speaking countries in the region, plus Brazil

• Sample of ministries: health, economy, defense, agriculture, etc.



Collecting Objectives #1



• Supplement Benson print collection

• Born digital, no longer printed at all

• Brief life on publishing entity’s site



Collecting Objectives #2



• Platform for research

• Numerous types of scholarly & applied research supported

• Important to recreate the “look and feel” of the original, ability to browse, etc.

• Historical record



LAGDA by the Numbers



• 280 Archive-It seeds in one collection

• 18 Latin American & Caribbean Countries

• Quarterly crawls, six to date

• Linking to LAGDA: Over 100 sites, mostly libraries



More LAGDA Numbers



• Over 24.8 million files archived to date

• 2.4 million PDF documents archived



• Largest site archived (by file): Ecuador Ministry of Industry 120,000 URLs

• Largest site archived (by size): Colombian Presidency, 20GB



Why Web Archive?



• Governments come and go . . .

• Disk drives fill up . . .

• Ideologies evolve . . .

[Screenshots: Live vs. Archived]


Challenges



• In Web archiving, Latin America is a “moving target”

• “Best Practices” in Web design = consistent Web archiving quality

• Overuse: JavaScript menus; IFrames; Redirects; Flash, https; cookies; etc.

• How to make more researchers aware LAGDA exists



Quality Control



• Systematically separate crawl issues from playback issues

• Immediate corrective action on crawl issues (fix or eliminate seed)

• Address playback issues through user interface and documentation



Quality Control #2


Proxy Mode



• Eliminates many playback problems

• Confronts some provenance issues

Web Archives and Large-Scale Data: Preliminary Techniques for Facilitating Research



History of archiving Latin America at UT Austin



• Benson Library collected gov docs in print since 1920s

• Latin America began moving to digital gov docs around 2000

• Download, print and curate

• Latin American Government Document Archive begins 2005

• Crawl entire websites, compress and curate data

• Provide access to digital content directly



Latin American Government Document Archive



LAGDA = 280 seeds, about 15 government ministries for each of 18 countries, crawled quarterly since 2005















• Files crawled and archived to date in LAGDA: 70 million

• Data archived: 5.9 TB

• Items added to collection per year: 9-10 million

• HTML pages archived per crawl: 1.6 million

• PDF documents archived per crawl: 260,000

• Monthly average pageviews on LAGDA: 2,918


Latin American Government Documents





LAGDA: challenges to data mining



• Heterogeneous corpus

• Various languages

• Data formats (HTML, Word, PDF, Other)

• Document characteristics

• Variety of sources (countries, governments, departments)



LAGDA: motivating problem



• Goal:

• Automatically attach labels to documents in a large collection based on training documents

• Challenges:

• Keyword search is ineffective due to lack of consistent words

• Training documents may cover broad subject areas



LAGDA: techniques for data mining



• Break documents into n-grams

• 1-gram {The, quick, brown, fox, jumps, over, the, lazy}

• 2-gram {The quick, quick brown, brown fox, fox jumps}

• 3-gram {The quick brown, quick brown fox…}

• Identify one or more subsets of n-grams with significantly high usage in the training documents

• Evaluate all documents in the corpus using these n-grams
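The three steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual implementation: the function names, the whitespace tokenisation, the `min_count` threshold used to decide which n-grams count as "significant", and the scoring rule (the fraction of a document's n-grams found in the selected set) are all assumptions made for the example.

```python
from collections import Counter


def ngrams(text, n):
    """Break a text into word n-grams, e.g. n=2 gives 'The quick', 'quick brown', ..."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def significant_ngrams(training_docs, n, min_count=2):
    """Select n-grams used at least min_count times across the training documents.

    Assumption: a plain frequency threshold stands in for 'significantly high usage'.
    """
    counts = Counter()
    for doc in training_docs:
        counts.update(ngrams(doc.lower(), n))
    return {gram for gram, count in counts.items() if count >= min_count}


def score(doc, selected, n):
    """Fraction of the document's n-grams that belong to the selected set."""
    grams = ngrams(doc.lower(), n)
    if not grams:
        return 0.0
    return sum(gram in selected for gram in grams) / len(grams)


print(ngrams("The quick brown fox jumps", 2))

training = ["ministry of health annual report", "ministry of health budget plan"]
selected = significant_ngrams(training, 2)
print(score("Ministry of Health speech", selected, 2))
```

A document whose score is high relative to the rest of the corpus would be a candidate for the label associated with the training set.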



LAGDA: techniques for data mining



• Use this score and others to create a composite score

• The company you keep - Examine the text and the links that point to our documents

• Natural language processing

• Named entities & Part-of-Speech tagging



LAGDA: technology for large-scale computing at TACC



• Corral data storage system (6 Petabytes)

• Longhorn High Performance Cluster

• Paradigms for distributed computing (MPI and Hadoop)

• Nodes work in parallel and combine their results

• Allows us to divide and conquer the problem

• Open source libraries (Heritrix, Tika, Lucene, OpenNLP)
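The "divide and conquer" pattern named on this slide, where nodes work in parallel and then combine their results, can be illustrated on a single machine. The sketch below is not the MPI or Hadoop setup used at TACC: it uses Python's standard `concurrent.futures` thread pool as a stand-in for cluster nodes, and the function names and the word-counting task are hypothetical choices made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """'Map' step: one worker counts the words in its share of the documents."""
    counts = Counter()
    for doc in chunk:
        counts.update(doc.lower().split())
    return counts


def split_into_chunks(docs, n_chunks):
    """Divide the corpus into n_chunks roughly equal shares."""
    return [docs[i::n_chunks] for i in range(n_chunks)]


def parallel_word_count(docs, n_workers=4):
    """Run the map step on each chunk, then combine the partial counts."""
    chunks = split_into_chunks(docs, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:  # 'Reduce' step: merge the partial results
        total.update(partial)
    return total


docs = ["plan nacional", "plan de salud", "informe de salud"]
print(parallel_word_count(docs, n_workers=2))
```

Threads in Python give little speed-up for CPU-bound work, so this shows only the shape of the computation; on a cluster each chunk would be processed on a separate node.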



LAGDA: initial results



• Traditional classification approaches are unsuccessful

• Our n-gram approach for classification based on training set outperforms traditional Bayesian Inference Classifier

• Results from our composite scores demonstrate additional improvement



“big data” and libraries: going forward



• Challenges posed by web-archived data

• Size, heterogeneity and limited metadata

• Data access that is more dynamic and flexible

• How big data can create data-driven research

• Development of use cases and research examples

• Technology at the service of social sciences, humanities and other fields whose research could benefit



About LANIC - The Latin American Network Information Center (LANIC) is affiliated with the Lozano Long Institute of Latin American Studies (LLILAS) at the University of Texas at Austin. LANIC has been live on the internet since 1992, and its mission is to facilitate access to Internet-based information to, from, or on Latin America. Its target audience includes people living in Latin America, as well as those around the world who have an interest in this region. While many of its resources are designed to facilitate research and academic endeavors, its site has also become an important gateway to Latin America for primary and secondary school teachers and students, private and public sector professionals, and just about anyone looking for information about this important region. LANIC's editorially reviewed directories contain over 12,000 unique URLs, making them one of the largest guides to Latin American content on the Internet.



One of LANIC’s initiatives is the Latin American Government Documents Archive (LAGDA). LAGDA seeks to preserve and facilitate access to a wide range of ministerial and presidential documents from 18 Latin American and Caribbean countries. The Archive contains copies of the Web sites of approximately 300 government ministries and presidencies. Content in the Archive includes not only the full-text versions of official documents, but also original video and audio recordings of key regional leaders. Archive contents include thousands of annual and "state of the nation" reports; plans and programs; and speeches by presidents and government ministers. LAGDA is a joint project of the University of Texas Libraries, The Nettie Lee Benson Latin American Collection, and the Latin American Network Information Center at The University of Texas at Austin.



nwoodward@mail.utexas.edu

http://lanic.utexas.edu/project/archives/lagda/

http://www.archive-it.org/public/collection.html?id=176