A reader asked me earlier how to obtain a PDF of a patent that has been OCR’d (made text searchable). I thought I’d create a list of the sources I remember off the top of my head:
- Patent Tools will OCR for $5: http://www.pattools.com/Patent-to-PDF.html
- The Patent Supply Company, for as little as $0.49 per patent: http://www.patentsupply.com/price
- Per a comment below, you can do it yourself via Adobe Acrobat. There may be free tools that OCR PDFs too.
- Another comment notes: “The EPO publishes text-searchable PDFs of recent (but not all) European patents and patent applications.” http://www.epo.org/patents/patent-information/european-patent-documents/publication-server.html
If there are others (and I am sure there are), please let me know and I’ll update the list.
Acrobat Standard V7.0 and later have a feature called “Recognize text using OCR” under the “Document” menu. It lets you OCR your own PDFs, including patents. It is a little slow, and you cannot view the recognized text itself; it is only available for searching or for cutting and pasting. It seems to perform OCR without an interactive spellchecker, and since you can’t view the text, you can’t see errors. But it works nicely when the full text is not otherwise available.
I don’t know what OCR means, but aren’t those patents on Google Patents?
Mau:
OCR means Optical Character Recognition. See http://en.wikipedia.org/wiki/Optical_character_recognition
A PDF that has been OCR’d can be searched for keywords (e.g., where in the document does the word “flange” occur).
Also, the PDF patent copies you can download from Google Patents are not OCR’d.
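To make the keyword search idea concrete, here is a minimal Python sketch. It assumes the text layer of an OCR’d PDF has already been extracted into one string per page; the function name `find_keyword` and the sample page texts are made up for illustration and do not come from any real patent or tool.

```python
import re


def find_keyword(pages, keyword):
    """Return (page_number, count) pairs for pages that contain the keyword.

    `pages` is a list of strings, one per page (assumed to be the text
    layer of an OCR'd PDF). Matching is whole-word and case-insensitive.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for number, text in enumerate(pages, start=1):
        count = len(pattern.findall(text))
        if count:
            hits.append((number, count))
    return hits


# Hypothetical page texts standing in for an OCR'd patent.
sample_pages = [
    "A pipe coupling comprising a first flange and a second flange.",
    "The gasket is seated between the two members.",
    "Each Flange includes a plurality of bolt holes.",
]

print(find_keyword(sample_pages, "flange"))  # [(1, 2), (3, 1)]
```

A scanned PDF with no text layer gives a search like this nothing to work with, which is why the OCR step matters. Note too that OCR mistakes (say, “fIange” with a capital I) would be silently missed by an exact match.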
The EPO publishes text-searchable PDFs of recent (but not all) European patents and patent applications.
http://www.epo.org/patents/patent-information/european-patent-documents/publication-server.html
Many of these will have been produced by OCR from the applicant’s original typescript. Increasingly, however, applicants are filing European applications electronically in machine-readable format, so that OCR is unnecessary.