pdf confusion

Kingson

Kind Benefactor
Super Member
Registered
Joined
Apr 17, 2010
Messages
163
Reaction score
6
Location
Colorado, USA
Hi Friends, I have many pages that were scanned and saved as PDF files. However, I can't seem to convert them to Word. A program will convert the page, but the result comes up either blank or as just another picture of the document that can't be edited. I would greatly appreciate any advice. Thank you!
 

BlackBriar

Bricoleur
Super Member
Registered
Joined
Jul 10, 2009
Messages
1,295
Reaction score
203
Location
South
I think the problem is that what you have are PDF images, not text documents. You will need OCR software to convert the images (what you scanned) into text. You may have to rescan the pages, since I don't know if there's OCR software for PDFs. And you might want to wait until someone a bit more knowledgeable than I am can help.

http://www.makeuseof.com/tag/top-5-free-ocr-software-tools-to-convert-your-images-into-text-nb/

Edit: http://freeocr.net/ is free and will convert PDFs into editable text, according to both links. Good luck! :)
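
For anyone who wants to check whether a PDF is only a picture of the page, here is a rough sketch in Python. It assumes that a scanned PDF embeds image objects but declares no fonts, which is a heuristic and not a rule. The function name looks_image_only is made up for this example, and the check can miss markers that a newer PDF stores in compressed form.

```python
import re


def looks_image_only(pdf_bytes):
    """Rough guess: True if the PDF embeds images but declares no fonts.

    Heuristic only. It reads the raw bytes, so it cannot see markers
    that are stored inside compressed object streams.
    """
    # Text pages normally reference at least one font resource.
    has_font = re.search(rb"/Font\b", pdf_bytes) is not None
    # Scanned pages are stored as image objects.
    has_image = re.search(rb"/Subtype\s*/Image\b", pdf_bytes) is not None
    return has_image and not has_font
```

To try it on a real file, read the file in binary mode (for example open("scan.pdf", "rb").read(), where scan.pdf is a placeholder name) and pass the bytes in. A True result suggests the file needs OCR before it can become editable text.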
 

Deleted member 42

You need to process the .PDFs using OCR software. If you have a scanner, you may have a program like TextBridge or OmniPage that came with it for performing the OCR. With software like that, you can OCR the PDFs quite easily.
 

Kingson

OK, I get it. I've confirmed that the scanner used did not have OCR software installed, and therein lies the problem. Thank you Cuppa and Medievalist for your responses.
 

Deleted member 42

You can run the OCR on the .pdfs yourself, after the fact, if YOU have OCR software on your computer.
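
As one way to do the OCR after the fact, here is a minimal sketch in Python. It assumes two free command-line tools are installed: pdftoppm (part of Poppler) to turn each PDF page into an image, and tesseract to read the text from each image. Neither tool is mentioned elsewhere in this thread, and the function names are made up for this example.

```python
import shutil
import subprocess
from pathlib import Path


def rasterize_cmd(pdf_path, prefix, dpi=300):
    """Command that renders each PDF page to <prefix>-<n>.png with pdftoppm."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(prefix)]


def ocr_cmd(image_path, lang="eng"):
    """Command that prints the recognized text of one image to stdout."""
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_pdf(pdf_path, work_dir, lang="eng"):
    """OCR every page of a scanned PDF and return the text as one string."""
    for tool in ("pdftoppm", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} is not installed or not on PATH")
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    # Step 1: one PNG per page, written into work_dir.
    subprocess.run(rasterize_cmd(pdf_path, work_dir / "page"), check=True)
    # Step 2: OCR the page images in page order (use an empty work_dir,
    # since any older page-*.png files in it would be picked up too).
    pages = []
    for image in sorted(work_dir.glob("page-*.png")):
        result = subprocess.run(
            ocr_cmd(image, lang), check=True, capture_output=True, text=True
        )
        pages.append(result.stdout)
    return "\n".join(pages)
```

Calling ocr_pdf("scan.pdf", "ocr_work") would return the recognized text, where scan.pdf and ocr_work are placeholder names. The result is plain text, so it can be pasted into Word and formatted there.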
 

ComicBent

Super Member
Registered
Joined
Nov 7, 2005
Messages
347
Reaction score
28
Location
Tennessee
Still a problem

You can convert those PDF "image" files with the right software, as others have said above. I have up-to-date OmniPage software, and I have done it.

However, you are going to have some major cleanup to do afterward.

What happens is that the PDF pages get converted to Word, but the paragraph styles are really crazy. It will all look about the same as the original (if you are lucky), but you will have strange things like unexpected character spacing in the style of a paragraph.

I have always had to select the paragraphs and apply a new style to them, just to get rid of all the bizarre formatting elements.
 

benbradley

It's a doggy dog world
Super Member
Registered
Joined
Dec 5, 2006
Messages
20,322
Reaction score
3,513
Location
Transcending Canines
It may be best to change settings on the OCR to convert to "plain text" and then add/apply all needed formatting/spacing/corrections. I've done very little OCR, just enough to see it's like the dancing bear.
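
If you go the plain-text route, part of the cleanup can be scripted before any formatting is applied. Here is a small sketch in Python, assuming the OCR output has a hard line break at the end of every scanned line and blank lines between paragraphs. The function name clean_ocr_text is made up for this example.

```python
import re


def clean_ocr_text(raw):
    """Turn hard-wrapped OCR output into one line per paragraph."""
    text = raw.replace("\r\n", "\n")
    # Rejoin words split across a line break: "docu-\nment" -> "document".
    # Caveat: this also joins real compounds such as "well-\nknown".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    paragraphs = []
    # Blank lines separate paragraphs.
    for block in re.split(r"\n\s*\n", text):
        # Collapse runs of whitespace, including single newlines, to one space.
        line = " ".join(block.split())
        if line:
            paragraphs.append(line)
    return "\n\n".join(paragraphs)
```

The output has one clean line per paragraph, which makes it easier to paste into Word and apply a single style. Misread characters still need to be fixed by hand.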
 

Xvee

I Need Monkeys
Super Member
Registered
Joined
Apr 30, 2010
Messages
356
Reaction score
231
Location
Long Island
Google and download a free program called Calibre and give it a try. It's an e-book management tool that can convert almost any format to any other format.