<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>File Conversion Services Blog &#187; Images</title>
	<atom:link href="http://blog.fileconversionservices.com/category/images/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.fileconversionservices.com</link>
	<description></description>
	<lastBuildDate>Fri, 08 Feb 2008 15:40:44 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Converting scanned documents to editable text</title>
		<link>http://blog.fileconversionservices.com/2007/09/17/converting-scanned-documents-to-editable-text/</link>
		<comments>http://blog.fileconversionservices.com/2007/09/17/converting-scanned-documents-to-editable-text/#comments</comments>
		<pubDate>Mon, 17 Sep 2007 21:02:35 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Conversion]]></category>
		<category><![CDATA[Images]]></category>

		<guid isPermaLink="false">http://blog.fileconversionservices.com/2007/09/17/converting-scanned-documents-to-editable-text/</guid>
		<description><![CDATA[One of the most common requests I get is to convert a scanned document into an editable text document in a format such as Microsoft Word. In this post I will go through a simple process for getting this done that you can do for yourself without [...]]]></description>
			<content:encoded><![CDATA[<p>One of the most common requests I get is to convert a scanned document into an editable text document in a format such as Microsoft Word. In this post I will go through a simple process for getting this done that you can do for yourself without spending any money.</p>
<p>When you scan a document, the default output is an image of the scanned page. Some of the more recent scanners let you specify your settings so that the scanner reads the document straight into a Word or Excel file, but sometimes you don&#8217;t have that option. What you need is a way to convert that scanned image into a text file, and the technology that does this is called &#8220;Optical Character Recognition&#8221;, or OCR for short.</p>
<p>The OCR program takes your scanned image and attempts to read it, giving you its best estimate of the original in text format. You then save the output and check it against the original for errors.</p>
<p>Most OCR programs work pretty well, and the accuracy of the output, in my experience, depends on two main factors:</p>
<ol>
<li>The clarity of the scanned image &#8211; As long as the original scanned image is clear and of a high resolution, the output will be good. The higher the resolution, the better the match.</li>
<li>Font size &#8211; The other factor that comes into play is the size of the font. The larger the font on your image, the higher the accuracy will be.</li>
</ol>
<p>There are several different OCR programs.  I will list a few for you:</p>
<ol>
<li><strong><a href="http://www.simpleocr.com/" title="Simple OCR" target="_blank">Simple OCR</a></strong> &#8211; I have used this one before and would recommend it for simple jobs. It is a <strong>free</strong> program with a good OCR engine.</li>
<li><strong><a href="http://www.amazon.com/gp/product/B000B7VBF0?ie=UTF8&amp;tag=thcopa-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=B000B7VBF0">ABBYY FineReader 8.0 Professional</a><img src="http://www.assoc-amazon.com/e/ir?t=thcopa-20&amp;l=as2&amp;o=1&amp;a=B000B7VBF0" style="border: medium none  ! important; margin: 0px ! important" border="0" height="1" width="1" /></strong> &#8211; This is my <strong>personal favorite</strong>, and the one that I use most often. ABBYY FineReader has a powerful OCR engine and a lot of features and options for working with different files. I find it to have very good accuracy even with documents that aren&#8217;t very clear. It&#8217;s also good with PDF files.</li>
<li><strong><a href="http://www.amazon.com/gp/product/B000F7EV1W?ie=UTF8&amp;tag=thcopa-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=B000F7EV1W">Readiris Pro 11</a><img src="http://www.assoc-amazon.com/e/ir?t=thcopa-20&amp;l=as2&amp;o=1&amp;a=B000F7EV1W" style="border: medium none  ! important; margin: 0px ! important" border="0" height="1" width="1" /></strong> &#8211; This is another popular OCR program that I recommend. It also has great features.</li>
<li>Other popular options that I have <strong>not</strong> used include <strong><a href="http://www.amazon.com/gp/product/B000AMPJPY?ie=UTF8&amp;tag=thcopa-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=B000AMPJPY">ScanSoft OmniPage 15 OCR</a><img src="http://www.assoc-amazon.com/e/ir?t=thcopa-20&amp;l=as2&amp;o=1&amp;a=B000AMPJPY" style="border: medium none  ! important; margin: 0px ! important" border="0" height="1" width="1" /></strong> and <strong><a href="http://www.amazon.com/gp/product/B000067VPA?ie=UTF8&amp;tag=thcopa-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=B000067VPA">TextBridge Pro 11</a><img src="http://www.assoc-amazon.com/e/ir?t=thcopa-20&amp;l=as2&amp;o=1&amp;a=B000067VPA" style="border: medium none  ! important; margin: 0px ! important" border="0" height="1" width="1" /></strong>.</li>
</ol>
<p>A glance at these programs will show you that most of them are somewhat costly to purchase, so you would need to decide whether you do enough conversion to justify the cost.</p>
<p>And remember, if you just want someone to do the task for you, <a href="http://www.fileconversionservices.com/pages/contact-us.php" title="Contact Us">contact us</a>. It&#8217;s what we do.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.fileconversionservices.com/2007/09/17/converting-scanned-documents-to-editable-text/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
	</channel>
</rss>
