How to make a Project Gutenberg eBook

Piotrek81
Posts: 4682
Joined: November 3rd, 2011, 2:02 pm
Location: Goat City, Poland

Post by Piotrek81 »

Of course there's a University library here, just as there is a whole network of Raczyński Library branches :wink: I know that for a fact: I'm subscribed to about 10 of them :mrgreen: The problem is that what I need is not just any copy of a book, but an edition old enough to be legally photocopied and scanned and be PG-admissible. I somehow doubt that they would be willing to lend me a 1920 edition for 3 months... :roll: I'd have to work on the spot instead.
Want to hear some PREPARATION TIPS before you press "record"? Listen to THIS and THIS
NinaBrown
Posts: 549
Joined: December 22nd, 2011, 6:17 pm
Location: Rockville IN

Post by NinaBrown »

Piotrek81 wrote: I somehow doubt that they would be willing to lend me a 1920 edition for 3 months... :roll: I'd have to work on the spot instead.
I see your point, but I suppose it depends how rare the edition is? One way to find out :-)
-nina-
gypsygirl
Posts: 8618
Joined: June 12th, 2006, 6:00 pm
Location: British expat in Waco, TX

Post by gypsygirl »

Piotrek81 wrote: I somehow doubt that they would be willing to lend me a 1920 edition for 3 months... :roll: I'd have to work on the spot instead.
If there are scanners available for use in the library you could get away with borrowing it for just a couple of hours...
Karen S.
TriciaG
LibriVox Admin Team
Posts: 60576
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

Pulling up a long-sleeping thread:

I have a Word (well, LibreOffice, but I can convert it to .doc or .docx) document of a book, Maud and Other Poems by Tennyson.

There might be slight OCR errors - substitutions that are real words ("boot" for "book" as a rough example), and periods that should be commas - that sort of thing. I've cleaned up all the more egregious OCR errors.

I'd like to see this get on PG, but I really don't want to do the final proofreading and conversion to html. Would anyone want to take it over from me? It registers at about 15,000 words.

Oh, and I submitted the copyright clearance about an hour ago.
Serial novel: The Wandering Jew
Medieval England meets Civil War Americans: Centuries Apart
Humor: My Lady Nicotine
TriciaG
LibriVox Admin Team
Posts: 60576
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

Never mind. I did the final proofing myself. The book was short enough. :)

I guess you're required to submit the plain text file. I don't like the generated HTML, but short of submitting both a text file and an HTML file myself, I guess I didn't have much choice. :?

It's here: http://www.gutenberg.org/ebooks/56913
tovarisch
Posts: 2936
Joined: February 24th, 2013, 7:14 am
Location: New Hampshire, USA

Post by tovarisch »

Pretty cool.

I'm going to have to contact PG soon about the typos in the book I'm reading now (and how many!).

For some reason there is a strange symbol my Chrome shows in the last line of Maud.I.9 (it's a diamond with a question mark in it, right between the "yes!" and the "-but"):
Peace in her vineyard--yes!�-but a company forges the wine.
I guess Chrome does not know what to display...
tovarisch
  • reality prompts me to scale down my reading, sorry to say
    to PLers: do correct my pronunciation please
TriciaG
LibriVox Admin Team
Posts: 60576
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

It was an en dash, but their checker program should have flagged it for me to replace with a regular dash. (It did flag an en dash in another spot; or was it this one? But I fixed the spot it flagged.)
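A rough way to catch stray characters like that en dash before submitting is to list every non-ASCII character in the text with its position and Unicode name. This is only a sketch in Python, not PG's checker; the function name `find_non_ascii` and the sample text are made up for illustration.

```python
import unicodedata


def find_non_ascii(text):
    """Yield (line number, column, character, Unicode name) for each non-ASCII character."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                yield line_no, col, ch, unicodedata.name(ch, "UNKNOWN")


# Hypothetical sample: an en dash (U+2013) hiding next to a hyphen, plus an accented e.
sample = "Peace in her vineyard--yes!\u2013-but a company forges the wine.\nAgav\u00e8"

for line_no, col, ch, name in find_non_ascii(sample):
    print(f"line {line_no}, column {col}: {ch!r} {name}")
```

Anything it reports can then be checked by eye: accented letters that belong in the book stay, and stray dashes get replaced.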

Does "Agavè" show the accent in yours? If not, you need to use Unicode to show the text file rather than... ASCII?
tovarisch
Posts: 2936
Joined: February 24th, 2013, 7:14 am
Location: New Hampshire, USA

Post by tovarisch »

No, it does not show the accent. There is a 'small e with grave' in extended ASCII (in the Latin-1 code page it is code 0xE8: è), so there should be no need for Unicode.
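For anyone curious why a browser shows the diamond with a question mark, here is a small Python sketch. It assumes, purely for illustration, that a file was saved in a single-byte code page (Latin-1 or Windows-1252) and then read as UTF-8; nothing here confirms that this is what actually happened to the PG file.

```python
word = "Agav\u00e8"        # "Agavè"
en_dash = "\u2013"

# The same character is stored differently depending on the encoding.
latin1_bytes = word.encode("latin-1")   # è is the single byte 0xE8
utf8_bytes = word.encode("utf-8")       # è is the two bytes 0xC3 0xA8
cp1252_dash = en_dash.encode("cp1252")  # en dash is the single byte 0x96

# Single-byte data read as UTF-8: the lone byte is invalid, so the reader
# substitutes U+FFFD, which is drawn as a diamond with a question mark.
garbled_word = latin1_bytes.decode("utf-8", errors="replace")
garbled_dash = cp1252_dash.decode("utf-8", errors="replace")

# The opposite mismatch (UTF-8 data read as Latin-1) gives two odd letters instead.
double_letters = utf8_bytes.decode("latin-1")

print(latin1_bytes, utf8_bytes, cp1252_dash)
print(garbled_word, garbled_dash, double_letters)
```

So a single-byte è is a perfectly good character in Latin-1, but it only displays correctly if the program reading the file is told, or guesses, that encoding.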
TriciaG
LibriVox Admin Team
Posts: 60576
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

I can't get back into the submission page where one selects what character set one is using, so I can't tell what the options are/were. I do know that I couldn't choose the basic ASCII character set due to the accented characters, but perhaps I didn't have to go so far as Unicode. I don't know much about all that - just enough to be dangerous when given the toys to play with. :lol:

EDIT: The file listing on the PG page says the text file is UTF-8. If that's the encoding, it should have picked up the two accented e characters. That's apparently a mistake with their encoding. :(

The HTML format renders everything correctly.
mightyfelix
LibriVox Admin Team
Posts: 11099
Joined: August 7th, 2016, 6:39 pm

Post by mightyfelix »

This is an old thread, and maybe not the best place to ask my question. But be that as it may, can anyone tell me how to go about reporting an error to Gutenberg? I'm prereading through Doctor Dolittle's Post Office currently, in preparation for a DR, and there's a spot where they have changed Jip's name to Jim.
TriciaG
LibriVox Admin Team
Posts: 60576
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

errata2019 (at) pglaf.org

Errata within eBooks. To report an error within an eBook (such as a missing word), please be sure to include the eBook number or specific filename or download link you used. Error reports are easiest to handle when they clearly indicate the context in which an error was found. Note that since Project Gutenberg includes many old titles, it is common for unusual spelling or arcane word uses to be used. If possible, check a printed source to verify whether an error exists, before reporting it. Messages to the errata list generate an automatic response that your report was received.

Any errata/bug/typo report is welcome! There is additional guidance in the FAQ on how to prepare errata reports so they are easiest for the Project Gutenberg team to handle. Start with FAQ #R.26 on how to report typos.
mightyfelix
LibriVox Admin Team
Posts: 11099
Joined: August 7th, 2016, 6:39 pm

Post by mightyfelix »

Thank you! If I were on my home computer, it would have been easier to find this info myself, but I appreciate your getting it for me. :)
LikeManyWaters
Posts: 787
Joined: January 15th, 2018, 2:50 pm
Location: Arizona

Post by LikeManyWaters »

Hmmm... I have a handful of errors/typos highlighted in most of the ebooks I have downloaded from Gutenberg. I suppose I should report them, but not sure if I have the time. Most are obvious, and I just highlight and remember to read it the right way when I read them for LV. Only one or two books have ever been VERY error-ridden.

On a related note, I have bought an old book that I couldn't find anywhere online and want to scan it. My plan is to use my phone; I believe you two (Tricia & Devorah) have done that before. Any TIPS would be much appreciated! 8-) PS - The book is in great condition, and has several beautiful illustrations, so hoping not to damage the book in the process.

Not sure whether I will try for getting it on PG or Archive. Archive's submission page kind of intimidated me. But either will be a learning curve.
April
mightyfelix
LibriVox Admin Team
Posts: 11099
Joined: August 7th, 2016, 6:39 pm

Post by mightyfelix »

LikeManyWaters wrote: April 10th, 2019, 2:25 pm On a related note, I have bought an old book that I couldn't find anywhere online and want to scan it. My plan is to use my phone; I believe you two (Tricia & Devorah) have done that before. Any TIPS would be much appreciated! 8-) PS - The book is in great condition, and has several beautiful illustrations, so hoping not to damage the book in the process.

Not sure whether I will try for getting it on PG or Archive. Archive's submission page kind of intimidated me. But either will be a learning curve.
Using my phone was much easier, I think, than it would have been to scan each page. Quicker, too. One thing I found that did help was a cool crop feature in the photo editor I used to process the pages after I took the pictures. This may not be much help, because I don't remember specifics, such as the name of the photo editor, but the feature may be more or less standard. It's like a perspective crop tool, I guess you would say. Rather than cropping the image in a perfect rectangle, it allows you to place the corners on the corners of your pages, which might give you kind of a funny quadrilateral, if your picture wasn't taken perfectly from above. Then it will basically squish and stretch that into a good rectangle, making it all straight and even.
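For the curious, the "squish and stretch" in a perspective crop is a projective transform (a homography) worked out from the four corners you place. The Python sketch below is not the photo editor's actual code; the function names and the corner coordinates are invented for illustration. It only computes the mapping for individual points. A real tool would also resample every pixel through it.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("corners are degenerate (for example, three on one line)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 8 coefficients mapping four src corners onto four dst corners."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs)


def warp_point(h, x, y):
    """Map one point from the photo into the straightened page."""
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical page corners as they appear in a slightly tilted photo
# (top-left, top-right, bottom-right, bottom-left), in pixels.
photo_corners = [(120, 80), (980, 130), (1010, 1400), (90, 1350)]
# The tidy rectangle we want the page to become.
page_corners = [(0, 0), (800, 0), (800, 1200), (0, 1200)]

h = homography(photo_corners, page_corners)
for corner in photo_corners:
    print(corner, "->", warp_point(h, *corner))
```

Each photo corner lands on the matching rectangle corner (up to rounding), and everything in between is stretched accordingly, which is why the text comes out straight.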

For Archive, you'll take all those images (single pages, not double-page spreads) and combine them, in page order, into a single PDF, then upload the PDF. The Archive uploader says that you could instead put all your images into a zip folder and upload that, but it didn't work for me. Maybe it wasn't named right, I don't know. But the PDF worked very well for me and wasn't too hard.
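Whichever route you take, getting the pages into the right order is the fiddly part, because plain alphabetical sorting puts "page10" before "page2". Here is a rough Python sketch that sorts page images by the numbers in their names and bundles them into a zip with zero-padded names. The function names, folder layout and file names are all hypothetical, and whether Archive accepts a zip built this way, or what it expects the zip to be called, is not something this sketch settles.

```python
import re
import zipfile
from pathlib import Path


def natural_key(path):
    """Sort key that orders 'page2' before 'page10'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", path.name)]


def bundle_pages(image_dir, zip_path, suffixes=(".jpg", ".jpeg", ".png")):
    """Zip the page images in page order; return the original names in that order."""
    pages = sorted(
        (p for p in Path(image_dir).iterdir() if p.suffix.lower() in suffixes),
        key=natural_key,
    )
    # Images are already compressed, so store them without recompressing.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_STORED) as archive:
        for number, page in enumerate(pages, start=1):
            # Zero-padded names (0001.jpg, 0002.jpg, ...) sort correctly anywhere.
            archive.write(page, arcname=f"{number:04d}{page.suffix.lower()}")
    return [p.name for p in pages]
```

Calling `bundle_pages("scans", "mybook.zip")` (both names made up) returns the page order it used, so you can check it before uploading anything.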

I've never submitted a book to Gutenberg. I think they may choose not to take one on, if it's something extremely niche or obscure (since it takes a lot of time and manpower for them to create an ebook), but I really don't know. Archive, on the other hand, will take whatever you have, and then it's all an automated process, as far as I know.

Sorry if that was all more info than you needed. I've only submitted books to archive twice now, and only one of them was something I actually scanned (or rather, photographed) myself. The other one came from a library that had it on microfiche, and I was able to save the pages from the fiche reader onto my USB. But anyway, let me know if there is something I might be able to help you with, and I'll do what I can!
LikeManyWaters
Posts: 787
Joined: January 15th, 2018, 2:50 pm
Location: Arizona

Post by LikeManyWaters »

I meant to say THANKS before now, oops! :D

My husband says he will help me set up something like this DIY book scanner. Thought it might be nice to share... so if you have some scrap plexiglass... it doesn’t open the book all the way, so less spine stress.

[Image: DIY book scanner]
April