Not sure how to pose these two questions but here goes.
Let's say I've got my assigned reading section in a group project. I've downloaded the text from which I've been instructed to read, and only that text. There are several file formats: PDF, Kindle, and so on.
Question 1. Which file format do I use?
Question 2. When I am running the Audacity software, is it running in the background on my main computer screen while I read from that? Or am I running Audacity on one machine and reading from another? Like, can I run the software on my laptop as I read from my tablet screen? Because it seems like I should also be watching to make sure that the software is running properly as I'm recording.
Reading screen and file format
-
- Posts: 707
- Joined: July 14th, 2007, 5:18 pm
- Location: In the urban wild
Hi, there--
Most of these options are viable!
I've noticed that the pdf format often comes through a lot cleaner than the other versions for documents from Internet Archive, so I tend to use that if I'm not just reading from the online version itself. I can't remember what I use for Gutenberg.
I read on my computer screen and have the Audacity program to the left and the document to the right, each taking up half my screen. This is highly customizable, though--I know some people read from a tablet instead of the computer screen. The issue is--as you've noted--you might want to keep an eye on Audacity to make sure that you've hit record, that your mic isn't muted, that the background noise hasn't taken over the whole waveform, and that you aren't constantly hitting the limit where clipping happens. This is why I organize my space the way I do. The downsides of my way are that both windows are smaller (I'm on a laptop) and I get mouse clicks when I have to adjust the text. I don't think I'd get as many of those noises if I were scrolling on a tablet.
Hope this helps. Phil Chenevert has a thread going somewhere about how people set up their spaces and how they record. I just can't find it right at this moment...
Thanks,
Stephanie
--Stephanie
*******************
Current solo:
Life among the Piutes
Native American history--Come read about removal plans, education, and laws:
Annual Report of the Commissioner of Indian Affairs, December 1837
-
- LibriVox Admin Team
- Posts: 60799
- Joined: June 15th, 2008, 10:30 pm
- Location: Toronto, ON (but Minnesotan to age 32)
(1) Whichever you prefer.
(2) Whichever you prefer.
As long as you got the text from the link in the project, any format of it is OK to use.
Yes, it's good to have your eye on the Audacity window as you're recording, to make sure you're not clipping or even that your mic is turned on! (Some people have "recorded" a whole section only to discover that, for whatever reason, it didn't actually record! Seeing the waveforms scroll by would prevent this.) But it's not required.
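If you do record without watching the window, one option is to check the file afterwards. The following is only a rough sketch, assuming you export a 16-bit WAV before converting to MP3; the function names and the `silence_peak` threshold are made up for illustration and are not part of Audacity or any LibriVox tool.

```python
import array
import wave

FULL_SCALE = 32767  # largest positive value a 16-bit sample can hold


def read_samples(wav_file):
    """Return the 16-bit samples of a WAV file (a path or an open binary file)."""
    with wave.open(wav_file, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM files")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    return samples


def check_levels(samples, silence_peak=50):
    """Summarise peak level, clipped samples, and whether anything was recorded.

    silence_peak is an arbitrary, hypothetical threshold: a peak below it
    suggests the mic was muted or nothing was captured.
    """
    peak = max((abs(s) for s in samples), default=0)
    clipped = sum(1 for s in samples if abs(s) >= FULL_SCALE)
    return {
        "peak": peak,
        "clipped_samples": clipped,
        "looks_silent": peak < silence_peak,
    }
```

Calling `check_levels(read_samples("section02.wav"))` (the file name is just an example) gives a quick summary. A non-zero `clipped_samples` count or a true `looks_silent` value means the recording deserves a listen before you go any further.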
School fiction: David Blaize
America Exploration: The First Four Voyages of Amerigo Vespucci
Serial novel: The Wandering Jew
Medieval England meets Civil War Americans: Centuries Apart
-
- LibriVox Admin Team
- Posts: 11140
- Joined: August 7th, 2016, 6:39 pm
I will offer one caveat to this. There's one text format that I think readers should avoid like the plague, if you have a text from archive.org, and that is the "Full Text" link. This is a plain-text format that is easy to copy and paste, and so it may be tempting if you like to mark up your text before reading. However, this is a computer-generated text file with no human oversight or correction, so it's always riddled with errors, some of which are clearly ridiculous, and others which are rather hard to catch!
Aside from this caveat, I totally agree!
-
- Posts: 34
- Joined: August 4th, 2009, 1:29 pm
mightyfelix wrote: ↑February 1st, 2023, 1:46 pm
I will offer one caveat to this. There's one text format that I think readers should avoid like the plague, if you have a text from archive.org, and that is the "Full Text" link. This is a plain-text format that is easy to copy and paste, and so it may be tempting if you like to mark up your text before reading. However, this is a computer-generated text file with no human oversight or correction, so it's always riddled with errors, some of which are clearly ridiculous, and others which are rather hard to catch!
For sure. Once I found the word "fire" changed to "tire" in the computer-generated text file. That kind of error can silently change the meaning of a sentence!
(By the way, this process is called "Optical Character Recognition" or OCR. Some OCR software is better than others, but I've found a lot are still confused by specks and ink spots, and by similar characters like capital I, lowercase l, and numeral 1. And sometimes words will be smudged or partially printed in a book, which is virtually impossible for OCR software to read but (usually) can be read by a human with a brain.)
-
- Posts: 1254
- Joined: October 22nd, 2021, 10:55 pm
- Location: Melbourne with kangaroos
1. I use software called Okular to read from a pdf.
2. In Reaper [my DAW] I have a delay of about 2 seconds set for when I hit record. That gives me time to click on the pdf of the text and read from it. Thus I have the pdf filling my whole screen. I am not looking at the waves or the DAW. When I want to stop I press the space bar. If I have included any blank space at the start or end I easily delete it with 1 select and click [that is set in my Reaper template as well]. Note that I use a desktop computer with 1 monitor. I do not use any other device.
But I probably wouldn't suggest method [2] for people who have never recorded. It's a more advanced way of doing it. It requires a sense of timing so that you don't start reading "too early", before the delay ends. But after a while you do it at the right time without thinking about it. I remember where I am up to each time; I don't in any way mark the pdf text.
Fan of all 80s pop music except Meatloaf.
mightyfelix wrote: ↑February 1st, 2023, 1:46 pm
I will offer one caveat to this. There's one text format that I think readers should avoid like the plague, if you have a text from archive.org, and that is the "Full Text" link. This is a plain-text format that is easy to copy and paste, and so it may be tempting if you like to mark up your text before reading. However, this is a computer-generated text file with no human oversight or correction, so it's always riddled with errors, some of which are clearly ridiculous, and others which are rather hard to catch!
Aside from this caveat, I totally agree!
I'm contributing to a project that specifies reading from a 1-up image text on archive.org (the project page link goes direct to the image). For each of the various chapters, I've taken its text and formatted it into a document, which I then proofread and correct before reading. (The OCR is probably 85% good, but there are lots of little problems. See the later paragraph.) This works OK for me because I tend to do shorter sections/chapters, and each chapter gets its own document. I also include the intro and outro for the section/chapter in the document, so I don't have to refer to the project page while recording. Even the solo project I've completed, and the one I have in progress, were laid out this way.
Yes, it's a lot of work, but it allows me to arrange windows on my screen similar to what stepheather described. I have Audacity open in the upper left of the screen, Checker open in the lower left corner, and the document open and covering the right half. I also have a File Explorer-type window (I use a flavor of Linux, not Windows) open in the center of the screen behind the other windows. This allows me to drag the exported .MP3 to Checker when I'm finalizing the file for upload.
The most consistent problem in this particular OCR transcription is the substitution of a space for the apostrophe in a possessive (e.g. Paul s instead of Paul's). It almost always misreads Rome (most often as Eome) and Roman (as Eoman or Koman). The ligature Æ (in Aegean) has been read as simply E, and as 1.E in the worst case.
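For anyone who proofreads OCR text the same way, recurring errors like these can be flagged automatically before the read-through. This is only a rough sketch: the pattern list is built from the examples above, the function name is invented for illustration, and it will both miss errors and raise false alarms, so it is no substitute for proofreading against the page image.

```python
import re

# Illustrative patterns based on the errors described above; not exhaustive.
SUSPECT_PATTERNS = [
    # A lone "s" after a word, as in "Paul s", often hides a lost apostrophe.
    (re.compile(r"\b[A-Za-z]+ s\b"), "possible missing apostrophe"),
    # Known misreadings of Rome and Roman in this particular transcription.
    (re.compile(r"\b(?:Eome|Eoman|Koman)\b"), "possible misread Rome/Roman"),
]


def find_suspects(text):
    """Return (line number, matched text, reason) for each suspicious spot."""
    suspects = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for pattern, reason in SUSPECT_PATTERNS:
            for match in pattern.finditer(line):
                suspects.append((line_no, match.group(0), reason))
    return suspects


if __name__ == "__main__":
    sample = "Paul s voyage to Eome\nThe Koman fleet sailed."
    for line_no, found, reason in find_suspects(sample):
        print(f"line {line_no}: {found!r} ({reason})")
```

The idea is to get a list of places to look at, not to correct anything automatically; every flagged spot still needs checking against the scanned page.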
John R Moore [rlc77jrm]
Albertville, AL
What to do about text errors
"I will offer one caveat to this. There's one text format that I think readers should avoid like the plague, if you have a text from archive.org, and that is the "Full Text" link. This is a plain-text format that is easy to copy and paste, and so it may be tempting if you like to mark up your text before reading. However, this is a computer-generated text file with no human oversight or correction, so it's always riddled with errors, some of which are clearly ridiculous, and others which are rather hard to catch!"
mightyfelix
I just started on my first reading project (Section 2 of The Inside of the Cup by Winston Churchill). I downloaded the Plain Text UTF-8 file from the Project Gutenberg site link provided by LibriVox. While reading the chapter I noticed some oddities. I'm going to proofread the text and make notes of questionable words, phrases, etc. Should I just record the text exactly as given, or should I record an edited version? Is there a definitive text available (like a Norton Critical Edition) to check against? Should I submit my proof notes to the project coordinator? In view of concerns about the public domain status of the text, I want to be careful. I'd appreciate some advice.
Thanks,
txphred
txphred wrote: ↑March 2nd, 2023, 3:14 pm
What to do about text errors
"I will offer one caveat to this. There's one text format that I think readers should avoid like the plague, if you have a text from archive.org, and that is the "Full Text" link. This is a plain-text format that is easy to copy and paste, and so it may be tempting if you like to mark up your text before reading. However, this is a computer-generated text file with no human oversight or correction, so it's always riddled with errors, some of which are clearly ridiculous, and others which are rather hard to catch!"
mightyfelix
I just started on my first reading project (Section 2 of The Inside of the Cup by Winston Churchill). I downloaded the Plain Text UTF-8 file from the Project Gutenberg site link provided by LibriVox. While reading the chapter I noticed some oddities. I'm going to proofread the text and make notes of questionable words, phrases, etc. Should I just record the text exactly as given, or should I record an edited version? Is there a definitive text available (like a Norton Critical Edition) to check against? Should I submit my proof notes to the project coordinator? In view of concerns about the public domain status of the text, I want to be careful. I'd appreciate some advice.
Thanks,
txphred
You can try downloading one of the other formats rather than the "Plain Text UTF-8" and see if they have the same issues, but be sure you download them from that same page that's linked from LibriVox.
If the issues are still there, then the project/book coordinator (often abbreviated BC) will be the best person to ask how to handle it on that particular project.
I'll be out for a bit on this last weekend of April, but still checking in as I get the chance. I will try to follow up on Monday, with anything I can't do on the go.
-
- LibriVox Admin Team
- Posts: 60799
- Joined: June 15th, 2008, 10:30 pm
- Location: Toronto, ON (but Minnesotan to age 32)
Some typos may get through the PG proofreaders, but it isn't overly common.
If you can find a scan of the text (at Internet Archive or HathiTrust or somewhere else), you can compare the texts there. But do consult with the BC as well.
If they are legitimate errors, they can be reported to PG. Here's their page explaining the process: https://www.gutenberg.org/help/errata.html
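If you have both versions as plain text (for a scan, that means some transcription of it, which may carry its own errors), the comparison can be partly automated. Below is a minimal sketch using Python's standard difflib module; the function name is hypothetical, and the "fire"/"tire" sample borrows the error mentioned earlier in this thread.

```python
import difflib


def word_differences(reference, candidate):
    """List (reference words, candidate words) wherever the two texts differ."""
    ref_words = reference.split()
    cand_words = candidate.split()
    matcher = difflib.SequenceMatcher(a=ref_words, b=cand_words, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (" ".join(ref_words[i1:i2]), " ".join(cand_words[j1:j2]))
            )
    return differences


if __name__ == "__main__":
    for ref, cand in word_differences("the fire burned low", "the tire burned low"):
        print(f"reference: {ref!r}  other text: {cand!r}")
```

Comparing word by word rather than line by line keeps differences in line wrapping between the two files from showing up as changes. Each reported difference still needs a human decision about which reading is right.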
School fiction: David Blaize
America Exploration: The First Four Voyages of Amerigo Vespucci
Serial novel: The Wandering Jew
Medieval England meets Civil War Americans: Centuries Apart
I just finished comparing the Gutenberg Plain Text UTF-8 version of The Inside of the Cup by Winston Churchill with a pdf scan of the original published book. There weren't a great number of errors in the text file, but the book's meaning and tone were altered. The pdf I used was a scan of the edition published c. 1913 by Grosset & Dunlap, New York. It is part of the University of California Libraries collection, and it's the same as a couple of others I found. I'll write up a note and send it to the BC.
txphred
aka: Fred
My method for making sure I'm recording is to watch the Audacity window while I am recording the disclaimer and section title, and then I click on the text to bring it to the front and continue reading without making any changes to Audacity. This way it just keeps running in the background. I'm recording on a laptop with a fairly small screen, but if I had more space I would opt for the side-by-side approach, as it would be nice to be able to see the recording happening throughout the reading.