Extract Text from Images and PDFs with Google Drive


There are many reasons you might need to copy the text from an image file or PDF. Google Docs can help with that if you don’t have access to an application like Adobe Acrobat Pro or a dedicated OCR conversion tool. Here is how.

FIRST – Upload the image file or PDF to your Google Drive account and select it.

Here I am using a PDF shared by Richard Byrne on his blog Free Technology for Teachers.


NEXT – With the file selected in your Google Drive (not opened in preview mode), click the three vertical dots that represent the “More Actions” menu. Choose “Open with” and select “Google Docs.”


FINALLY – Google Drive’s optical character recognition (OCR) jumps into action. Google Drive scans the file and uses its magic algorithms to convert the file into a Google Document. It’s not perfect but it can certainly help in a pinch.
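If you have many files to convert, the same upload-and-convert step can be scripted. The sketch below is not part of the walkthrough above. It assumes the Google Drive API v3 with the google-api-python-client package, and an already authorized `service` object that you create separately. The function names `build_ocr_metadata` and `ocr_with_drive` are made up for illustration. The idea is that asking Drive to store an uploaded image or PDF as a Google Doc triggers the same conversion as “Open with” > “Google Docs.”

```python
from pathlib import Path

# MIME type that tells Drive to store the upload as a Google Doc.
GOOGLE_DOC_MIME = "application/vnd.google-apps.document"


def build_ocr_metadata(source_path: str) -> dict:
    """Build the file metadata for a converted upload (hypothetical helper)."""
    return {"name": Path(source_path).stem, "mimeType": GOOGLE_DOC_MIME}


def ocr_with_drive(service, source_path: str, source_mime: str) -> str:
    """Upload a file, let Drive convert it, and return the text.

    `service` is assumed to be an authorized Drive v3 client, e.g. from
    googleapiclient.discovery.build("drive", "v3", credentials=...).
    """
    # Imported here so build_ocr_metadata works without the client library.
    from googleapiclient.http import MediaFileUpload

    media = MediaFileUpload(source_path, mimetype=source_mime)
    created = service.files().create(
        body=build_ocr_metadata(source_path),
        media_body=media,
        fields="id",
    ).execute()

    # Export the converted Google Doc as plain text (returned as bytes).
    data = service.files().export(
        fileId=created["id"], mimeType="text/plain"
    ).execute()
    return data.decode("utf-8")
```

With a working `service`, a call such as `ocr_with_drive(service, "handout.pdf", "application/pdf")` should return the extracted text. Expect the same imperfections as the manual method.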

 


The conversion will be most accurate if:

  • the image is high resolution
  • the line height is at least 10 pixels (larger is even better)
  • the text is horizontal and reads left-to-right (you can use Google Drawings to rotate the image if needed)
  • the text uses common fonts like Arial and Times New Roman (in my experience, sans-serif fonts like Arial work best)
  • the image is sharp and free of blurring

There are limitations. The file size cannot be over 2 MB. If you are working with a PDF, only the first 10 pages are scanned and converted.
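A quick check before uploading can save a failed conversion. Here is a minimal sketch that tests a file against the two limits mentioned above. The function name `check_limits` is made up for illustration, and treating 2 MB as 2 × 1024 × 1024 bytes is an assumption. The page count has to come from you or from a PDF library of your choice, since the standard library cannot read it.

```python
from typing import List, Optional

MAX_BYTES = 2 * 1024 * 1024  # 2 MB limit (assumed to mean 2 * 1024 * 1024 bytes)
MAX_PDF_PAGES = 10           # only the first 10 PDF pages are converted


def check_limits(size_bytes: int, pdf_pages: Optional[int] = None) -> List[str]:
    """Return a list of warnings; an empty list means the file fits the limits."""
    warnings = []
    if size_bytes > MAX_BYTES:
        warnings.append(
            f"File is {size_bytes} bytes; the limit is {MAX_BYTES} bytes."
        )
    # pdf_pages is None for images, where the page limit does not apply.
    if pdf_pages is not None and pdf_pages > MAX_PDF_PAGES:
        warnings.append(
            f"PDF has {pdf_pages} pages; only the first {MAX_PDF_PAGES} are converted."
        )
    return warnings
```

For a real file you could pass `os.path.getsize(path)` as `size_bytes`. If a PDF is too long, splitting it into chunks of 10 pages or fewer before uploading is one way around the limit.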

Some of the formatting may carry over into the converted text but don’t be surprised if you need to do some clean-up. Still, this process can be a big time-saver when the clock is ticking.

 

 
