Technical information

The scanned articles have been processed with optical character recognition (OCR) software to make the resulting PDF files searchable.

The scanner is a Canon CanoScan 9900F, and the resolution varies between 400 and 600 dpi depending on the quality of the original. The smaller the text, the higher the resolution used, which helps the OCR software identify the text with a minimum of errors.

The OCR program is Adobe Acrobat Capture. Unlike most OCR software, it can preserve the scanned image as is instead of converting it to an entirely new file; the searchable text sits in a layer behind the image. To simplify matters for the software and keep file sizes down, the scanned pages have been converted to black and white rather than grayscale. An exception has been made for pages that contain illustrations or photographs: there the text has been converted to black and white, but the grayscale has been retained for the page as a whole. This explains why some of the PDFs (those with images) are much bigger than others.
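
To give a rough sense of why grayscale pages are so much larger, the sketch below compares the uncompressed size of a page scanned at 1 bit per pixel (black and white) with the same page at 8 bits per pixel (grayscale). The function name `raw_page_bytes` and the page dimensions are hypothetical and chosen only for illustration. The figures ignore row padding and compression, so the sizes of the actual PDF files will differ.

```python
import math


def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed size in bytes of one scanned page.

    Hypothetical helper for illustration only: it ignores row padding
    and any compression applied when the PDF is written.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return math.ceil(width_px * height_px * bits_per_pixel / 8)


if __name__ == "__main__":
    # Assumed page size of 8 x 10 inches; the real originals vary.
    for dpi in (400, 600):
        bw = raw_page_bytes(8, 10, dpi, 1)    # black and white, 1 bit per pixel
        gray = raw_page_bytes(8, 10, dpi, 8)  # grayscale, 8 bits per pixel
        print(f"{dpi} dpi: b&w {bw:,} bytes, grayscale {gray:,} bytes")
```

Before compression, a grayscale page holds eight times as much data as the same page in black and white, and going from 400 to 600 dpi multiplies the pixel count by 2.25.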

One of the advantages of keeping the original scanned picture is that OCR-related errors are less critical, since they are not visible on the screen.

We are well aware that errors occur in the underlying text. Only words that the OCR program flagged as "suspects" have been corrected; further proofreading would have been too time-consuming. As a result, a search will not find every occurrence of a word, so we ask you to keep that in mind as you work with these files. Needless to say, we appreciate any corrections.
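
One possible way to work around uncorrected OCR errors is approximate matching on text extracted from a file. The sketch below is only an illustration under that assumption: `fuzzy_find` is a hypothetical helper built on Python's standard `difflib` module, the sample sentence is invented, and none of this is a feature of the PDF files themselves.

```python
import difflib
import re


def fuzzy_find(query, text, cutoff=0.8):
    """Return words in `text` that closely resemble `query`.

    Hypothetical helper: the cutoff of 0.8 is an assumed starting
    point, not a tuned value.
    """
    # Split into lowercase word tokens and drop duplicates, keeping order.
    words = list(dict.fromkeys(re.findall(r"\w+", text.lower())))
    return difflib.get_close_matches(query.lower(), words, n=5, cutoff=cutoff)


if __name__ == "__main__":
    # Invented sample in which the OCR has misread "l" as "1".
    sample = "The reso1ution varies between 400 and 600 dpi."
    print(fuzzy_find("resolution", sample))
```

An exact search for "resolution" would miss the misread word in the sample, while the approximate match still finds it. A lower cutoff catches more errors at the cost of more false hits.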

 

Lena Forsgren

Updated 20 October 2006