PDF questions
Does anyone know of a way to unlock a public domain PDF file in order to convert it to a Txt file? Is there something like an optical-scan program that can do this? For some reason, Google Books no longer makes Txt versions of PD books available on their site.
They call me Threadkiller.
My Catalog Page
Are they specially locked PDFs, then? My Acrobat Reader has the option to File | Save as Other | 'Text' or 'Word or Excel Online'. I don't know how the latter works - never tried it - but saving as Text and then opening in Open Office works fine.
Ruth
My LV catalogue page | RuthieG's CataBlog of recordings | Tweet: @RuthGolding
I have a feeling that what you will need is OCR software, to convert the scanned images into text. Most of the Google Books offerings are straight image scans of the pages, so the text is not embedded within them.
I haven't used OCR software in over 10 years, so I don't have any idea of what is available, open source, or purchasable.
Boomcoach
My Catalog Page
My current Solo project A Spoiler of Men by Richard Marsh
One role needed to complete the Dramatic Reading of The Leader by Murray Leinster, help us finish this project!
Well, I don't know. Here is a PDF image scan of a poetry magazine, downloaded, opened in Acrobat Reader and "saved as other" text. Looks OK to me. I use Acrobat Reader DC 2015 release.
https://librivox.org/uploads/ruthieg/scan.zip
Ruth
My LV catalogue page | RuthieG's CataBlog of recordings | Tweet: @RuthGolding
BT, if you are looking for a file for your latest group project:
archive.org has a microform download here:
https://archive.org/details/cihm_992063
one of the files appears to be a text file - it should be possible to extract word counts from there?
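In case it helps, here is a rough sketch of how a word count could be pulled from a downloaded text file using only the Python standard library. The function names and the file name `book.txt` are made up for illustration, and the definition of a "word" (runs of letters or digits, with internal apostrophes kept) is just one assumption among several possible ones. OCR errors in the text will skew the numbers a little.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters/digits, optionally joined by
# apostrophes (so "cat's" counts as one word). Other definitions are possible.
WORD_PATTERN = re.compile(r"[^\W_]+(?:'[^\W_]+)*")


def count_words(text):
    """Return (total number of words, Counter of lowercased words)."""
    words = WORD_PATTERN.findall(text.lower())
    return len(words), Counter(words)


def count_words_in_file(path, encoding="utf-8"):
    """Read a plain-text file and count its words.

    Undecodable bytes are replaced rather than raising an error, since
    OCR text files sometimes contain stray characters.
    """
    with open(path, encoding=encoding, errors="replace") as handle:
        return count_words(handle.read())
```

Something like `total, freq = count_words_in_file("book.txt")` would then give the overall count in `total`, and `freq.most_common(10)` would list the ten most frequent words.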
Cheers, Ava.
Resident witch of LibriVox, channelling
Granny Weatherwax: "I ain't Nice."
--
AvailleAudio.com