One key issue with newspaper scanning is the possibility of optical character recognition (OCR) errors. A smudge on an old newspaper might result in an OCR program resolving "Titanic" as "Titamic." The dtSearch Engine includes its own fuzzy searching algorithm, which lets web visitors adjust search fuzziness to sift through OCR and other typographical errors. For example, with a fuzziness level of 3, a search for "Titanic" would find not only "Titanic" but also "Titamic." With a fuzziness level of 4, a search for "Titanic" would find "Titanic," "Titamic" and "Titomic."
On the Fulton History site, dtSearch's fuzziness algorithm also extends to the Contegra application's highlighting of PDF hits. Adds Mr. Tryniski: "Because of the potential for OCR errors when scanning old newspapers, dtSearch's fuzzy searching is really important." The Fulton History site makes available other dtSearch search options as well, including stemming, phonic and concept / thesaurus searching.
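The idea behind fuzzy matching can be illustrated with a minimal sketch. This is not dtSearch's algorithm, which is not described here; it swaps in plain Levenshtein edit distance as a stand-in. The function names and the `max_edits` parameter are hypothetical, and `max_edits` does not correspond to dtSearch's fuzziness levels.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_find(query: str, words: list[str], max_edits: int) -> list[str]:
    """Return the words within max_edits of the query, ignoring case.
    max_edits is an illustrative tolerance, not a dtSearch setting."""
    q = query.lower()
    return [w for w in words if edit_distance(q, w.lower()) <= max_edits]


if __name__ == "__main__":
    ocr_words = ["Titanic", "Titamic", "Titomic", "Olympic"]
    print(fuzzy_find("Titanic", ocr_words, max_edits=1))
    print(fuzzy_find("Titanic", ocr_words, max_edits=2))
```

Under these assumptions, a tolerance of one edit picks up "Titamic" alongside "Titanic," and a tolerance of two also picks up "Titomic," mirroring how a higher fuzziness setting widens the set of matches.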
About Fulton History, www.fultonhistory.com
Originally a resource for searching historical newspaper records from upstate New York, Fulton History brings together an ever-expanding collection of American and now Canadian newspapers. The entire 34-million-page collection is available for the general public to search at fultonhistory.com.
About Contegra Systems, www.contegrasystems.com
Established in 1987, Contegra Systems, Inc. is a leading provider of data integration services to Fortune 500 companies and others with extensive data access requirements. The company routinely transforms substantial collections of mixed data content into robust, user-friendly Web and other electronic products. The company also routinely undertakes custom development projects involving the dtSearch Engine SDKs, applying Contegra's server-side PDF hit-highlighting application as well as customized faceted and other advanced data classification implementations.
About dtSearch, www.dtSearch.com
The Smart Choice for Text Retrieval® since 1991, the dtSearch product line has 25+ search options for instantly searching terabytes of text. Along with enterprise and developer text retrieval, the company has its own document filters, offering parsing, extraction, conversion and searching of a broad range of data formats. Supported data types encompass databases, website data, popular "Office" formats, compression formats, and emails with attachments. dtSearch products meet some of the largest-capacity text retrieval needs in the world, including developer APIs spanning multiple platforms. The products have received hundreds of excellent case studies and press reviews. (Please see dtSearch.com for these.) The company has distributors worldwide with coverage on six continents.
To view the original version on PR Newswire, visit: http://www.prnewswire.com/news-releases/fulton-historys-34-million-page-online-newspaper-archive-adds-contegras-server-side-pdf-hit-highlighter-to-dtsearch-instant-searching-for-immediate-historical-access-300231780.html