A document scanning service converts paper into digital images and
then stores the resulting files on CD-ROM or in an imaging system
for later retrieval. In general, document scanning systems are
fairly inexpensive to purchase (consisting of little more than
scanners and PCs) but extremely expensive to operate, frequently
accounting for as much as 80% of the total ongoing labor cost of an
imaging system. This makes it worth considering our service, which
provides you with:
- a convenient solution
- no large investment in expensive equipment
- quality assurance
- high-performance production
- flexibility
- low cost
- no high maintenance costs
Understanding the Elements of Production Document Capture
Document scanning encompasses a complex flow of processes that
includes scanning but extends much further. In general, production
capture includes six operations, namely batch preparation, scanning,
OCR and image cleanup, indexing, QA and rescanning.
Batch Preparation
Batch preparation is an important first step in ensuring a
well-functioning document
scanning process. Key manual tasks include inspecting and
separating documents, grouping documents into like categories, and
designating the beginning and end of documents and batches.
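
As a rough illustration of how document boundaries marked during batch preparation might be used downstream, the sketch below splits a scanned page stream into documents. The `DOC-SEP` separator marker and the `split_documents` function are hypothetical names chosen for this example; the text above does not say how boundaries are actually designated.

```python
# Hypothetical marker that stands for a separator sheet inserted during
# batch preparation to mark where one document ends and the next begins.
DOC_SEPARATOR = "DOC-SEP"


def split_documents(pages):
    """Group a flat list of page identifiers into documents.

    A new document starts after each separator marker. Empty documents
    (two separators in a row) are skipped.
    """
    documents = []
    current = []
    for page in pages:
        if page == DOC_SEPARATOR:
            if current:
                documents.append(current)
                current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents
```

For example, a batch scanned as separator, two pages, separator, one page would come out as two documents of two pages and one page.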
Scanning
Scanning refers to the actual transformation of paper documents into digital
images. Effective scanning requires precise control over a
wide variety of scanners and scanner settings, including resolution,
contrast, simplex or duplex operation, advanced thresholding options,
etc. In addition, scanning usually allows for in-line extraction
of bar code information for purposes of indexing the documents for
later retrieval.
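
One way to keep control over the settings listed above is to store them as a named profile per document category. The following is a minimal sketch under that assumption; `ScanProfile`, its field names, the default values, and the category names are all illustrative and do not correspond to any particular scanner driver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanProfile:
    """Hypothetical bundle of scanner settings for one document category."""
    resolution_dpi: int = 300
    contrast: int = 50           # illustrative 0-100 scale
    duplex: bool = False         # False means simplex (one side only)
    threshold: str = "fixed"     # "fixed" or "adaptive" thresholding
    read_barcodes: bool = True   # extract bar codes in-line for indexing

    def validate(self) -> None:
        """Raise ValueError if any setting is outside its allowed range."""
        if self.resolution_dpi <= 0:
            raise ValueError("resolution must be positive")
        if not 0 <= self.contrast <= 100:
            raise ValueError("contrast must be between 0 and 100")
        if self.threshold not in ("fixed", "adaptive"):
            raise ValueError("unknown thresholding mode")


# Example profiles; the categories are assumptions for illustration.
PROFILES = {
    "invoice": ScanProfile(duplex=True),
    "legal_brief": ScanProfile(resolution_dpi=400, read_barcodes=False),
}


def profile_for(category: str) -> ScanProfile:
    """Return the profile for a category, or the defaults if none is defined."""
    return PROFILES.get(category, ScanProfile())
```

Keeping profiles immutable (`frozen=True`) means an operator cannot change a shared profile by accident in the middle of a batch.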
OCR and Image Cleanup
Optical character recognition
is frequently used in production capture systems to extract information
about a document directly from the document itself. There
are two forms of OCR: zonal and full-text. Zonal OCR
is typically used on forms, where only specific fields on the form
are of interest. Full-text OCR is used on free-form documents,
such as legal briefs, to read the entire document and then prepare
a searchable, full-text index of the document.
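
To make the idea of a searchable full-text index concrete, here is a small sketch of an inverted index built from text that OCR has already produced. It does not perform recognition itself, and `build_full_text_index` and `search` are hypothetical helper names rather than part of any OCR product.

```python
import re
from collections import defaultdict


def _words(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_full_text_index(documents):
    """Map each word to the set of document IDs that contain it.

    `documents` is a dict of document ID -> recognized text.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in _words(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every word in the query."""
    words = _words(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

A query with several words returns only the documents that contain all of them, which is the simplest useful behavior for retrieval.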
Image cleanup is a
broad term that includes various methods for cleaning up scanned
images to make them more readable. Techniques include:
- deskewing, despeckling, deshading, streak removal, and other basic
cleanup functions
- line removal and character reconstruction for use on forms
- edge enhancement, which sharpens character edges to increase OCR
accuracy
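
As an example of one of these techniques, the sketch below shows a very simple form of despeckling on a bitonal image. It assumes the image is a list of rows of 0 (white) and 1 (black) and removes any black pixel that has no black neighbor; production cleanup software typically uses more sophisticated filters, so treat this only as an illustration of the idea.

```python
def despeckle(image):
    """Remove isolated black pixels from a bitonal image.

    `image` is a list of rows, each a list of 0 (white) or 1 (black).
    A black pixel is treated as a speck when none of its eight
    neighbors is black. The input is not modified.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            has_black_neighbor = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_black_neighbor:
                cleaned[y][x] = 0
    return cleaned
```

Neighbors are always read from the original image, so removing one speck never affects the decision for another pixel.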
Indexing
Indexing consists of creating meaningful descriptive information for each
scanned document and then writing this information into a database
that will be used to retrieve the images later. In most cases,
the index information is entered by a keyboard operator based on
information on the image itself, an operation known as "key from
image." In some cases, however, the index information is extracted
automatically from the images via a recognition process -- typically
optical character recognition or bar code recognition. Some
indexing information may also be assigned automatically to all images
included in a particular batch.
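
The indexing step described above can be sketched with a small database table. The example uses Python's built-in `sqlite3` module; the table name, the column names (`doc_type`, `account_no`), and the `source` values are assumptions made for illustration, since a real index schema depends on the documents being captured.

```python
import sqlite3


def create_index(conn):
    """Create the (hypothetical) index table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS document_index (
            image_path TEXT PRIMARY KEY,
            batch_id   TEXT NOT NULL,  -- assigned automatically per batch
            doc_type   TEXT,
            account_no TEXT,
            source     TEXT NOT NULL   -- e.g. 'key_from_image', 'ocr', 'barcode'
        )
        """
    )


def index_document(conn, image_path, batch_id, fields, source):
    """Write one document's index record into the database."""
    conn.execute(
        "INSERT INTO document_index "
        "(image_path, batch_id, doc_type, account_no, source) "
        "VALUES (?, ?, ?, ?, ?)",
        (image_path, batch_id, fields.get("doc_type"),
         fields.get("account_no"), source),
    )


def find_images(conn, account_no):
    """Return the image paths indexed under the given account number."""
    rows = conn.execute(
        "SELECT image_path FROM document_index "
        "WHERE account_no = ? ORDER BY image_path",
        (account_no,),
    )
    return [row[0] for row in rows]
```

Recording where each value came from (keyed by an operator or recognized automatically) makes it easier to decide later which records deserve a closer check.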
Quality Assurance
Quality assurance entails
systematic reviews and checks to ensure that the scanned images
are readable and the indexes are accurate. It includes methods
for flagging bad images and explaining why or how images should
be rescanned, as well as correcting errors or shortcomings in indexing.
The QA step can be performed either by a QA operator or by an index
operator.
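
A QA review of this kind could be recorded roughly as follows. The sketch assumes a small fixed set of rescan reasons and a per-image result record; `RescanReason`, `QAResult`, and `rescan_queue` are hypothetical names, and the reasons listed are examples only.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class RescanReason(Enum):
    """Example reasons an operator might give when flagging a bad image."""
    SKEWED = "skewed"
    TOO_DARK = "too dark"
    PAGE_MISSING = "page missing"
    ILLEGIBLE = "illegible"


@dataclass
class QAResult:
    """Outcome of reviewing one image and its index record."""
    image_path: str
    rescan_reason: Optional[RescanReason] = None
    # Index field name -> corrected value, filled in when an index error is found.
    index_corrections: dict = field(default_factory=dict)

    @property
    def passed(self) -> bool:
        """True when the image needs no rescan and no index correction."""
        return self.rescan_reason is None and not self.index_corrections


def rescan_queue(results):
    """List (image path, reason) pairs for every image flagged for rescanning."""
    return [
        (r.image_path, r.rescan_reason.value)
        for r in results
        if r.rescan_reason is not None
    ]
```

Storing the reason alongside the flag tells the scanning operator how to rescan the page, not just that it failed review.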