Digitising physical artefacts


Thank you to Belinda Battley from the Auckland University Tramping Club for her considerable assistance writing this resource, as well as providing links and reviewing the finished product.  If you or your club wants to add to this resource, or assist us building resources about any topic for the Outdoor Community, then drop us a line!

The most appropriate tool is a flatbed scanner, which is commonly found in ‘all in one’ printers. Don’t use the document feeder, though; you will need to carefully lay each item on the bed of the scanner by hand.

Using your camera is better than nothing, but it results in distorted and often misaligned images. It’s usually easier to set up a bit of a ‘production line’: scan all the documents in one go, then go back and rename the files following your naming convention.
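The renaming step of that ‘production line’ can be automated. The sketch below is one possible approach, not a prescribed method: the function name `rename_scans`, the `<prefix>_<NNN>` naming pattern and the `.tif` extension are all assumptions made for illustration, so swap in whatever convention your club has agreed on. It also assumes your scanner's file names sort alphabetically in the order you scanned them.

```python
from pathlib import Path


def rename_scans(folder, prefix, start=1, extension=".tif"):
    """Rename every scan in `folder` to `<prefix>_<NNN><extension>`.

    Files are processed in alphabetical order of their current names
    (assumed here to match the order they were scanned in).
    Returns a list of (old_name, new_name) pairs.
    """
    folder = Path(folder)
    scans = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == extension
    )
    renamed = []
    for number, scan in enumerate(scans, start=start):
        target = folder / f"{prefix}_{number:03d}{extension}"
        # Refuse to overwrite a different file that already has the new name.
        if target.exists() and target != scan:
            raise FileExistsError(f"{target.name} already exists")
        scan.rename(target)
        renamed.append((scan.name, target.name))
    return renamed
```

For example, `rename_scans("scans", "1975-newsletter")` would rename the `.tif` files in a (hypothetical) `scans` folder to `1975-newsletter_001.tif`, `1975-newsletter_002.tif` and so on, leaving other file types alone. Try it on a copy of the folder first.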

There are two main types of artefacts you might wish to digitise: photo- or image-based documents, and documents with text.

Photos, slides and other image documents should be scanned at high resolution and saved in Tagged Image File Format (TIFF), which is one of the best formats for maintaining image quality and file security. TIFF is normally saved as a lossless format, which means no image detail is thrown away when the file is saved, unlike JPEG and other ‘lossy’ formats, which discard some image data to make the file smaller. TIFF files are also not easy to alter casually, which is good for archiving, and they hold no scripts or other active content, so they are far less likely to carry a virus than, for example, a PDF.
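To make the lossless/lossy difference concrete, here is a small illustrative sketch. It does not read or write real image files: the pixel values are made up, Deflate (via Python's `zlib`) stands in for the kind of lossless compression a TIFF file can use, and rounding pixel values is only a crude stand-in for lossy compression, not how JPEG actually works. The function names are hypothetical.

```python
import zlib


def lossless_round_trip(data):
    """Compress with Deflate, then decompress; nothing is thrown away."""
    return zlib.decompress(zlib.compress(data))


def lossy_round_trip(data, step=16):
    """Crude lossy stand-in: round each value down to a multiple of `step`."""
    return bytes((value // step) * step for value in data)


# A made-up strip of 8-bit greyscale pixel values standing in for a scan.
pixels = bytes([12, 13, 13, 14, 200, 201, 203, 202, 90, 91, 91, 92])

print(lossless_round_trip(pixels) == pixels)  # True: identical to the original
print(lossy_round_trip(pixels) == pixels)     # False: fine detail has been lost
```

The lossless version can be saved and reopened any number of times and still match the original, whereas the detail discarded by a lossy save cannot be recovered later.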

For text-based documents, a major benefit of digitisation is the ability to search for things within documents. Imagine how much more useful that pile of old newsletters would be if you could find the article you wanted in the blink of an eye!

That’s where Optical Character Recognition (OCR) comes in. Basically, it’s a process that changes a picture of text into actual text, which can be searched or copied and pasted. For OCR to work, the original document needs to be of reasonably good quality; paper copies of newsletters that were made on a computer and then printed out are likely to convert well, but 50-year-old typewritten documents are less likely to. There are a number of programs that use OCR to convert a scanned image into a text document.

Currently, Google Drive offers a very easy and free way to use OCR. Upload the scanned file to your Google Drive, then right-click it and choose ‘Open with Google Docs’. You can then save the Google Doc as a PDF, which has fully searchable text!

Then you could use Google Drive’s built-in search function to search all those archives, or use the Advanced Search in Adobe Acrobat (a standard PDF viewer; press CTRL + SHIFT + F) to search through multiple PDFs on your computer.
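If you would rather search your archive with a small script, the sketch below shows the general idea. It is only an illustration and makes an assumption that goes beyond the steps above: that you have also saved each OCR'd document as a plain-text (.txt) file in a single folder. It does not read PDFs. The function name `search_archive` is hypothetical.

```python
from pathlib import Path


def search_archive(folder, phrase):
    """Find `phrase` (ignoring upper/lower case) in every .txt file in `folder`.

    Returns a list of (file name, line number, line text) tuples.
    """
    needle = phrase.casefold()
    matches = []
    for path in sorted(Path(folder).glob("*.txt")):
        # errors="replace" keeps going even if OCR left odd characters behind.
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            if needle in line.casefold():
                matches.append((path.name, line_number, line.strip()))
    return matches
```

Calling `search_archive("newsletters", "annual dinner")` on a (hypothetical) `newsletters` folder would list every line mentioning the annual dinner, along with the file it came from.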

However you digitise your physical artefacts, one point should be obvious by now: keep your originals!


Last updated: 4 June 2018

Wilderlife