
Recently my library started a digital repository based on the DSpace platform. The first large set of data to be added to it was our collection of graduate dissertations. The first obstacle to adding this content to DSpace was digitization: approximately 1,800 theses needed to be scanned into PDF with OCR information. A portion of the digitization was outsourced to the Internet Archive. The Internet Archive is a very interesting initiative. In short, they will take any piece of information, create a persistent digital representation of it, and archive it for you. The archive itself already has a huge selection of material and makes for interesting browsing.

The second challenge was to retrieve the digitized material from the Internet Archive and load it into DSpace. Basically, the archive creates a nice page online with your objects and all associated metadata. In my case I was interested in the Dublin Core metadata (DSpace's native metadata format) and the PDFs of the theses.
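To make the metadata side of this concrete, here is a minimal sketch in Python of turning a Dublin Core record into the `dublin_core.xml` layout used by DSpace's Simple Archive Format. The sample record, its `metadata` wrapper element, and the function name `dc_to_dspace` are assumptions made for illustration; the records you actually get from the Internet Archive may be laid out differently, and this sketch leaves every qualifier as `none` rather than mapping fields to qualified DSpace elements.

```python
import xml.etree.ElementTree as ET

# Namespace of the unqualified Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample record, invented for this example.
SAMPLE_RECORD = """\
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An Example Thesis</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:date>1975</dc:date>
</metadata>
"""


def dc_to_dspace(record_xml: str) -> str:
    """Convert namespaced Dublin Core elements into DSpace <dcvalue> elements."""
    source = ET.fromstring(record_xml)
    target = ET.Element("dublin_core")
    prefix = "{" + DC_NS + "}"

    for child in source.iter():
        # Skip anything that is not in the Dublin Core namespace.
        if not child.tag.startswith(prefix):
            continue
        text = (child.text or "").strip()
        if not text:
            continue
        name = child.tag[len(prefix):]
        # Assumption: no qualifier mapping, so every value is unqualified.
        dcvalue = ET.SubElement(
            target, "dcvalue", {"element": name, "qualifier": "none"}
        )
        dcvalue.text = text

    return ET.tostring(target, encoding="unicode")


if __name__ == "__main__":
    print(dc_to_dspace(SAMPLE_RECORD))
```

In a real import you would probably want to map some fields to qualified elements (for example, an author or an issue date) before handing the result to DSpace, but that mapping depends on your repository's metadata registry.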