Technical

If you do not find your question listed or would like more information, please do not hesitate to contact us.

Q. What type of processing do the images undergo?
Q. What is OCR?
Q. What is the difference between OCR and indexing?
Q. How accurate is OCR?
Q. How fast is the OCR process?
Q. What is ICR (Intelligent Character Recognition)?
Q. What is OMR (Optical Mark Recognition)?
Q. Can OCR’ed text be exported and re-used in a word processor?
Q. Can I manually correct OCR errors and typos?
Q. What is the difference between COLD and imaging?
Q. How many index fields can the COLD server extract from each report?
Q. What is Document Imaging and Management?
Q. How much Server space do I need to store the documents?
Q. Do imaging systems support audit trails?
Q. How does it store the documents?
Q. How can I re-sequence pages?
Q. In which formats can I export documents?
Q. How are documents captured?




Q. What type of processing do the images undergo?
A. Once scanned, images go through extensive quality control, including straightening (deskew), de-speckling, and removal of visible page edges.

Q. What is OCR?
A. OCR stands for Optical Character Recognition, which is how a computer converts words in an unsearchable scanned image to searchable text. OCR is usually necessary in order to use full-text indexing and searches, and it should be included in an imaging system. OCR engines can generally only recognize typed or laser-printed text, not handwriting.

Q. What is the difference between OCR and indexing?
A. OCR is the process of converting scanned images to text files. Full-text indexing is the process of taking a text file and adding each word to an index file that specifies the location of every word in every document. Well-designed imaging software can make this a fast and easy procedure, providing rapid access to any word in any document.
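As a rough illustration of the indexing step, the sketch below builds a simple word index from text that OCR has already produced. The document names and contents are hypothetical, and a real imaging system will use its own index file format rather than an in-memory dictionary.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each word to a list of (document name, word position) pairs."""
    index = defaultdict(list)
    for name, text in documents.items():
        words = re.findall(r"[a-z0-9']+", text.lower())
        for position, word in enumerate(words):
            index[word].append((name, position))
    return index

# Hypothetical OCR output for two scanned documents.
documents = {
    "invoice_001": "Invoice for scanning services",
    "letter_017": "Thank you for the invoice",
}

index = build_index(documents)
print(index["invoice"])  # [('invoice_001', 0), ('letter_017', 4)]
```

Looking up a word is then a single dictionary access, which is why a full-text index gives rapid access to any word without re-reading the documents.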

Q. How accurate is OCR?
A. Accuracy on a freshly laser-printed page is typically better than 99.6%. Accuracy on faxed, dirty or degraded documents will of course be lower, but a few imaging systems have image clean-up technology that can improve OCR accuracy.
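To put the accuracy figure in perspective, the short calculation below estimates how many characters would be misread on one page. The page size of 2,000 characters is an assumption chosen for illustration only.

```python
def expected_errors(characters_per_page, accuracy):
    """Expected number of misread characters on one page."""
    return characters_per_page * (1.0 - accuracy)

# Assumption: a dense printed page holds about 2,000 characters.
errors = expected_errors(2000, 0.996)
print(round(errors))  # 8
```

Even at 99.6% accuracy, a full page can therefore still contain a handful of misread characters.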

Q. How fast is the OCR process?
A. The performance of the OCR and indexing processes is entirely dependent on factors such as the speed and configuration of the host system as well as the contents of the image. A 133 MHz Pentium generally needs about 6 seconds per page, while a 450 MHz Pentium II will take about 2-3 seconds per page.
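The per-page timings above can be turned into an hourly throughput with simple arithmetic. The sketch below assumes a single machine processing pages one after another, and uses 2.5 seconds as a midpoint of the 2-3 second range.

```python
def pages_per_hour(seconds_per_page):
    """Pages processed in one hour at a constant per-page time."""
    return 3600 / seconds_per_page

print(pages_per_hour(6))    # 600.0  (133 MHz Pentium figure)
print(pages_per_hour(2.5))  # 1440.0 (midpoint of the 450 MHz Pentium II figure)
```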

Q. What is ICR (Intelligent Character Recognition)?
A. ICR is pattern-based character recognition and is also known as Hand-Print Recognition. Handwritten text is more difficult for computers to recognize and results in higher error rates than printed text. ICR engines usually do best at recognizing constrained printing, which means block-printed letters with one letter in each box. Accurate recognition of unconstrained handwriting, especially cursive handwriting, typically requires that the ICR engine be trained to recognize each user’s style of writing.

Q. What is OMR (Optical Mark Recognition)?
A. OMR, also called Mark-Sense Recognition, is the recognition of marks commonly used on forms, such as check marks, circled choices, and filled-in bubbles. OMR can be an important part of an imaging system for organizations that process many standard forms. Scantron exam forms and customer survey cards are perhaps the best-known examples of OMR in action.
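One simple way to think about mark recognition is to count how many pixels inside a known bubble or check box are dark. The sketch below is only an illustration of that idea, not how any particular OMR engine works. The threshold values and the tiny 3x3 pixel regions are assumptions.

```python
def is_marked(region, dark_threshold=128, fill_ratio=0.4):
    """Return True if enough pixels in the region are dark.

    region is a list of rows of 0-255 greyscale values (0 = black).
    """
    pixels = [value for row in region for value in row]
    dark = sum(1 for value in pixels if value < dark_threshold)
    return dark / len(pixels) >= fill_ratio

# Hypothetical 3x3 pixel regions cut from a scanned form.
filled_bubble = [[20, 30, 25], [15, 10, 40], [200, 35, 30]]
empty_bubble = [[250, 240, 255], [245, 90, 250], [255, 250, 240]]

print(is_marked(filled_bubble))  # True
print(is_marked(empty_bubble))   # False
```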

Q. Can OCR’ed text be exported and re-used in a word processor?
A. Yes, you can usually cut and paste text between the imaging system and another Windows application, or you can export complete text files (all text pages in a document) to a directory and open them with your favorite word processor.

Q. Can I manually correct OCR errors and typos?
A. Well-designed systems allow users to correct OCR errors from within the system. However, when hundreds or thousands of pages are scanned every day, it is usually not practical to have someone clean up the text. If fuzzy logic search capabilities are available, it is not necessary to correct the text as searches will typically still find misread words.
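The idea behind fuzzy searching can be sketched with Python's standard `difflib` module, which ranks words by similarity to the search term. This stands in for whatever fuzzy logic a given imaging system actually uses. The misread words and the 0.8 cutoff are assumptions for illustration.

```python
import difflib

def fuzzy_find(query, ocr_words, cutoff=0.8):
    """Return OCR'ed words that closely resemble the query."""
    return difflib.get_close_matches(query.lower(), ocr_words, n=5, cutoff=cutoff)

# Hypothetical OCR output in which "invoice" was misread as "lnvoice".
ocr_words = ["lnvoice", "payment", "total"]

print(fuzzy_find("invoice", ocr_words))  # ['lnvoice']
```

An exact search for "invoice" would miss this document, while the fuzzy search still finds it without anyone correcting the text.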

Q. What is the difference between COLD and imaging?
A. Imaging is for scanning, compressing, storing, indexing, OCRing, searching and retrieving millions of pages of paper documents or electronic documents archived as permanent images. COLD is for archiving, indexing, searching and printing reports from huge text files generated by mainframes, mini-computers and other computer applications. COLD stores huge report files and extracted index fields on hard disk, optical cartridge or CD-ROM instead of printing all the information out on paper or storing it to microfilm.

Q. How many index fields can the COLD server extract from each report?
A. The number of index fields is usually unlimited. However, the more fields extracted from each report, the slower the extraction process will run and the larger the index files will be.
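Index field extraction can be pictured as running one pattern per field over each report page. The sketch below uses regular expressions on a made-up report layout. The field names and layout are hypothetical, and a real COLD server will have its own way of defining fields.

```python
import re

# Hypothetical report page; real mainframe reports will differ.
REPORT_PAGE = """\
ACCOUNT: 10442   DATE: 1999-03-31
CUSTOMER: ACME SUPPLIES
BALANCE: 1,250.00
"""

# One pattern per index field to extract.
FIELD_PATTERNS = {
    "account": re.compile(r"ACCOUNT:\s*(\d+)"),
    "date": re.compile(r"DATE:\s*(\d{4}-\d{2}-\d{2})"),
    "customer": re.compile(r"CUSTOMER:\s*(.+)"),
}

def extract_fields(page, patterns):
    """Apply every field pattern to the page and collect the matches."""
    fields = {}
    for name, pattern in patterns.items():
        match = pattern.search(page)
        if match:
            fields[name] = match.group(1).strip()
    return fields

fields = extract_fields(REPORT_PAGE, FIELD_PATTERNS)
print(fields)
```

Each extra field adds another search over every page and another entry in the index, which is why extraction slows down and index files grow as more fields are defined.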

Q. What is Document Imaging and Management?
A. Paper and electronic documents are at the root of our modern business environment. Document Management provides information retrieval when and where it is needed. It means being able to store, sort, index, and combine these information containers for easy retrieval.

Q. How much Server space do I need to store the documents?
A. A scanned image is stored using industry-standard Group TIFF file compression. This is white compression, which means the white is thrown out and the black is plotted in binary code. A page with a lot of writing on it, or with a grey or coloured background, therefore has less white space and produces a larger file. On average, however, you can get in the region of 25,000 A4 images on 1 GB of hard disk storage which, in plain terms, is about two and a half four-drawer filing cabinets. PDF images take about the same amount of storage space.
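The average figure above can be used for a rough sizing estimate. The sketch below treats 1 GB as 1,000,000 KB for simplicity, and the one-million-page archive is a hypothetical example. Actual file sizes will vary with page content.

```python
IMAGES_PER_GB = 25_000  # average figure quoted above for A4 pages

def storage_needed_gb(page_count, images_per_gb=IMAGES_PER_GB):
    """Estimate disk space in GB for a given number of scanned pages."""
    return page_count / images_per_gb

# Average size of one image, treating 1 GB as 1,000,000 KB.
average_kb_per_image = 1_000_000 / IMAGES_PER_GB
print(average_kb_per_image)          # 40.0

# Hypothetical archive of one million pages.
print(storage_needed_gb(1_000_000))  # 40.0
```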

Q. Do imaging systems support audit trails?
A. An imaging system’s audit trail product should record a user name, date, time, document name and action whenever a user accesses a database or document. Various levels of audit-trail logging detail and activity tracking should be available. The system should also support a viewer for sorting and filtering these logs.
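A minimal sketch of such a log is shown below: each entry records the user name, date and time, document name and action, and a helper filters and sorts the entries. The class, function and user names are hypothetical and do not correspond to any particular product.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class AuditEntry:
    user: str
    timestamp: datetime
    document: str
    action: str

log = []

def record(user, document, action, timestamp=None):
    """Append one audit entry; defaults to the current date and time."""
    entry = AuditEntry(user, timestamp or datetime.now(), document, action)
    log.append(entry)
    return entry

def entries_for(user):
    """Filter the log by user and sort the result by date and time."""
    return sorted((e for e in log if e.user == user), key=lambda e: e.timestamp)

record("asmith", "invoice_001", "view", datetime(2000, 1, 5, 9, 30))
record("bjones", "invoice_001", "print", datetime(2000, 1, 5, 9, 45))
record("asmith", "letter_017", "delete", datetime(2000, 1, 4, 16, 0))

for entry in entries_for("asmith"):
    print(entry.timestamp, entry.document, entry.action)
```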

Q. How does it store the documents?
A. Documents may or may not be stored in their native format or file type. Some systems use proprietary file structures, while others treat everything stored as a single document type such as a TIFF image. Storing documents in the format in which they were created is useful if you are going to use the documents again or want them to retain their special properties.
The document is sent to and stored on a local drive or network device, and the imaging system retains a pointer or address to where the image is held. Even if the document needs to be in several places at once, some systems can support a hierarchical search that locates the nearest match in a tree search. The storage medium determines the speed and reliability of access as well as the overall cost.

Q. How can I re-sequence pages?
A. If pages are out of order and need to be re-sequenced, a well-designed imaging system will allow “thumbnail” views of pages to be simply dragged to the required position. In the same way, individual pages can be selected and deleted, subject to appropriate security access control and privileges.

Q. In which formats can I export documents?
A. It depends on the imaging system. Common graphical formats you may need include TIFF III, TIFF

Q. How are documents captured?
A. The most common form of input to a document imaging system is scanned paper, and a document may consist of a single page or multiple pages. Scanning converts a paper document into a series of ones and zeros that faithfully represent the original: light from the page falls on a sensor area that converts the image to electronic "ones" and "zeros". Automatic document feeders built into the scanner move pages through in sequence at anywhere from 12 pages per minute to hundreds of pages per minute. Scanners are rated by speed (pages per minute), resolution (dots per inch, e.g. 200, 300 or 400), format (colour, greyscale or black and white) and page handling (double-sided or single-sided, standard or legal size).
Several processes can then be added to scanners or their interfaces to enhance the scanned image. These enhancements can include colour dropout lamps, deskew, despeckle, continuous contrast adjustment, thresholding and bar code recognition.
As each page is brought into the document management system it is indexed so that it can be found again. There are three methods of indexing. The first is like a telephone book that links your name to your home address: the house is identified by index fields such as name and street. The second is to inventory the contents of the whole house and list them in a search table of key words. The third uses a folder to group similar objects, so that, for example, all white houses are located in the same folder.
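The conversion of a scanned page to ones and zeros, and the role of a threshold, can be illustrated with a few greyscale values. The sketch below is a simplification: it assumes a fixed threshold of 128 and treats 1 as black, whereas real scanners may adjust the threshold continuously and use either convention.

```python
def binarize(grey_rows, threshold=128):
    """Convert 0-255 greyscale values to 1 (black) or 0 (white)."""
    return [[1 if value < threshold else 0 for value in row] for row in grey_rows]

# Hypothetical 3x3 greyscale sample showing a dark cross on a light page.
scan = [
    [250, 40, 245],
    [30, 35, 28],
    [240, 45, 250],
]

bitmap = binarize(scan)
for row in bitmap:
    print("".join(str(bit) for bit in row))
# 010
# 111
# 010
```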

General FAQs