Data Capture and Data Entry

We
are experts at getting information into your organization's content supply chain
rapidly, reliably and economically. In
fact, Acumen Solutions has been providing digital data capture services for almost
two decades now. Back
in the days when there was no XML and no World Wide Web, we were already busy
capturing millions of pages every year. Today,
we can capture and process millions of pages in any given week, so you can organize,
store, search and retrieve information more efficiently and effectively.

Cutting-Edge Technologies

While
materials have traditionally been digitized through data entry, we're less likely
to use keyboarding as a technique these days. We
use cutting-edge technologies that yield superior results
and markedly improved economics. They also allow us to process content at much
higher volumes, much faster. And
beyond technology, we're your best choice because of the caliber of our people,
our processes, our quality assurance regimes, our massive global infrastructure
and our size.

Data Capture: Our generally preferred method for data capture is Optical Character
Recognition (OCR), an automated process that creates digital machine-readable
versions of hardcopy textual pages. Pages that exist only as hardcopy are imaged
at an appropriate resolution, then passed through our proprietary suite of OCR
engines in a process that extracts all textual data. We
work with all kinds of source materials in all kinds of formats, including hardcopy,
microfilm, microfiche/aperture cards, TIFF/PDF/other digital images, and
slides/negatives.
We use a custom system composed of a mix of leading
commercial and proprietary tools, technologies and processes to produce
the highest possible levels of textual accuracy. To help us guarantee
these accuracy levels, we use a combination of manual and automated
processes. These include
a system in which multiple voting engines make intelligent, rules-
and syntax-based decisions to maximize initial OCR accuracy. After
OCR'd files have passed through our voting engines, our OCR editors use our proprietary
editing software to correct the files and raise the accuracy of the captured text
to the required levels.

Data
Entry: As good as our OCR technologies are, the simple fact is that some materials
are not suitable for OCR processing. We do a great deal of work for libraries and archives,
and by their very nature, these organizations often look to digitize rare historical
manuscripts. There
are several major issues with these types of materials. Rare manuscripts and historical
texts are frequently old, brittle, yellowed or otherwise deteriorated. They also
often contain tiny or antiquated fonts that OCR technologies struggle to
recognize. Further complicating matters, they can often contain valuable handwritten
notes that are integral to the collection (the Walter Reed collection we digitized
for the University of Virginia was entirely handwritten). To
capture textual data from these kinds of materials, Acumen Solutions uses industry-standard
word processing packages to manually encode required textual data. Typically,
multiple versions of the documents are produced by separate, experienced encoders
working independently. These
multiple versions then undergo a software-based 'merge and compare' process to
ensure complete data integrity; any discrepancies between the versions are automatically
flagged for review against the hardcopy and subsequent correction by editorial
staff. This editing
process continues until the document is completely verified at the agreed-upon
accuracy levels. Acumen
Solutions can deliver textual data guaranteed at accuracy levels of 99.995% or
better (The International Organization for Standardization recognizes this as
a maximum of 50 errors per million characters) and can meet the tightest turnaround
times (we turn materials around in a single hour for some clients). Nobody
is better qualified to handle large-scale data entry projects.
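
The 'merge and compare' step and the accuracy figure described above can be illustrated with a minimal sketch. This is not the production software mentioned in the text, which is not shown here. It assumes Python's standard difflib module as a stand-in for the comparison tool, and the function names (flag_discrepancies, meets_accuracy_target) are hypothetical. The only figure taken from the text is the 99.995% accuracy level, expressed as a maximum of 50 errors per million characters.

```python
from difflib import SequenceMatcher

# From the text: 99.995% accuracy = at most 50 errors per million characters.
MAX_ERRORS_PER_MILLION = 50


def flag_discrepancies(version_a: str, version_b: str) -> list[tuple[int, str, str]]:
    """Compare two independently keyed versions of the same page.

    Returns one (offset_in_a, text_in_a, text_in_b) tuple for every span
    where the two versions disagree. In the workflow described above, each
    flagged span would be checked against the hardcopy by an editor.
    """
    matcher = SequenceMatcher(None, version_a, version_b, autojunk=False)
    flagged = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            flagged.append((a_start, version_a[a_start:a_end], version_b[b_start:b_end]))
    return flagged


def meets_accuracy_target(error_count: int, total_chars: int) -> bool:
    """True if the error rate is within 50 errors per million characters.

    Integer arithmetic is used to avoid floating-point rounding at the
    threshold.
    """
    return error_count * 1_000_000 <= MAX_ERRORS_PER_MILLION * total_chars


if __name__ == "__main__":
    keyed_by_encoder_1 = "The quick brown fox"
    keyed_by_encoder_2 = "The quick brovvn fox"  # hypothetical keying error
    for offset, text_a, text_b in flag_discrepancies(keyed_by_encoder_1, keyed_by_encoder_2):
        print(f"offset {offset}: {text_a!r} vs {text_b!r} -> review against hardcopy")
    print(meets_accuracy_target(error_count=50, total_chars=1_000_000))
```

Two versions that agree everywhere produce no flags. Agreement between encoders does not prove the text is correct, since both could make the same mistake, which is why the sketch treats the accuracy check as a separate step.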