WordCount app for counting words in a chapter

tony123
Posts: 1485
Joined: July 22nd, 2011, 4:34 pm
Location: Albuquerque, New Mexico, USA

Post by tony123 »

Isana,

I've tried it now with a George Gibbs novel, The Golden Bough, and it was amazing!

Thank you! :D

Tony
Isana
Posts: 273
Joined: December 2nd, 2013, 12:46 pm
Location: USA

Post by Isana »

Thanks for the feedback, Tony. I'm glad it worked! Whew.
Isana
Posts: 273
Joined: December 2nd, 2013, 12:46 pm
Location: USA

Post by Isana »

Hi guys. Here are some tips for using the app.

First, it is important to work from the plain-text version of the Project Gutenberg (PG) book, not the HTML version.

Once you have the text version of the book, look in the body of the text for the chapter titles, not in the table of contents.

In many cases where there are both a numeric title (e.g. Chapter I) and a descriptive title (e.g. From Paris to St. Petersburg), the two titles are separated by a blank line in the book text, like so:

CHAPTER I.

FROM PARIS TO ST. PETERSBURG.

In these cases, one can simply use the "Roman numerals" method of specifying chapter titles.

However, in some PG book texts, there is no line separating the two titles, like so (PG book 48613):

                  CHAPTER I
OLD PANAMA, AGAMEMNON, AND THE GENIAL PICAROON

In these cases, the "Roman numerals" method will not work. However, one can specify the chapter titles using the "Descriptive titles" method. Simply enter both titles as a single line, separated by a single space (no more, no less), on the app's input form. For example, this is how one would enter the first five chapters of this book on the app's input form:

CHAPTER I OLD PANAMA, AGAMEMNON, AND THE GENIAL PICAROON
CHAPTER II THE FIGHTING WHALE AND CHINAMEN IN THE CHICKEN COOP
CHAPTER III THROUGH A TROPICAL QUARANTINE
CHAPTER IV A FORCED MARCH ACROSS THE DESERT OF ATACAMA
CHAPTER V AREQUIPA THE CITY OF CHURCHES

Note that the input form is not case-sensitive, so one could have entered the above titles in all lowercase and it still would have worked. However, it's important to include any punctuation exactly as it appears in the book text.
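
To illustrate the matching rules described above (case ignored, punctuation kept, two title lines joined by exactly one space), here is a minimal Python sketch. It is not the app's actual code: `find_title` is a hypothetical helper, and the real app may match titles differently.

```python
def find_title(lines: list[str], title: str) -> int:
    """Return the index of the line where `title` starts, or -1 if not found.

    Hypothetical helper, assuming the rules from the post:
    - comparison ignores case,
    - punctuation must match exactly,
    - a numeric title and a descriptive title on two consecutive
      lines are joined by exactly one space.
    """
    wanted = title.casefold()
    for i, line in enumerate(lines):
        first = line.strip().casefold()
        # Case 1: the whole title sits on a single line.
        if first == wanted:
            return i
        # Case 2: the title is split across two consecutive non-blank lines.
        if first and i + 1 < len(lines):
            second = lines[i + 1].strip().casefold()
            if second and f"{first} {second}" == wanted:
                return i
    return -1


book_lines = [
    "",
    "                  CHAPTER I",
    "OLD PANAMA, AGAMEMNON, AND THE GENIAL PICAROON",
    "",
    "Some chapter text.",
]
position = find_title(
    book_lines, "chapter i old panama, agamemnon, and the genial picaroon"
)
```

In this sketch, a title entered without its commas, or with two spaces between the numeric and descriptive parts, would not be found.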

That is all for now. Ask if you have questions. Go forth and read something. Later.
Isana
Posts: 273
Joined: December 2nd, 2013, 12:46 pm
Location: USA

Post by Isana »

Me again, this time with an update and a quiz.

I've updated the app so that the PG source text is now displayed below the chapter input forms.

Here's a recent book from PG. See if you can get the app to work:
http://www.gutenberg.org/ebooks/48674

Hint: There are 12 chapters, and you can flag the end of the last chapter with the word "index".

(To Kangaroo692: I haven't forgotten your bug report. I still plan to work on it when I get the chance. I know how to do it, but I'm taking my time with it since it involves modifying the core counting routine, which scares me.)
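
For readers curious how an end flag such as "index" could bound the last chapter, here is a rough Python sketch. It is only an assumption about the general idea, not the app's core counting routine: `count_chapter_words` is a hypothetical function, it expects each title and the end flag to appear on a line of their own, and it counts words by splitting on whitespace.

```python
def count_chapter_words(text: str, titles: list[str], end_marker: str) -> dict[str, int]:
    """Count whitespace-separated words between consecutive chapter titles.

    Hypothetical sketch: titles and the end marker are matched against
    whole lines, ignoring case and surrounding whitespace.
    """
    lines = text.splitlines()
    lowered = [line.strip().casefold() for line in lines]

    starts = []
    pos = 0
    for title in titles:
        # list.index raises ValueError if a title is missing from the text.
        idx = lowered.index(title.casefold(), pos)
        starts.append(idx)
        pos = idx + 1

    # The end marker closes the last chapter (e.g. the line "INDEX").
    end = lowered.index(end_marker.casefold(), pos)
    bounds = starts + [end]

    counts = {}
    for title, begin, stop in zip(titles, bounds, bounds[1:]):
        body = lines[begin + 1:stop]  # exclude the title line itself
        counts[title] = sum(len(line.split()) for line in body)
    return counts


sample = (
    "CHAPTER I\n\nOne two three.\nFour five.\n\n"
    "CHAPTER II\n\nSix seven.\n\nINDEX\nApple, 1\n"
)
result = count_chapter_words(sample, ["Chapter I", "Chapter II"], "index")
```

Because the sketch takes the first matching line, it would be fooled by a table of contents that repeats the chapter titles, which is one reason to pick titles from the body of the text.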
Kangaroo692
Posts: 1939
Joined: August 21st, 2014, 9:34 am
Location: Probably the holodeck :)

Post by Kangaroo692 »

No hurry. :)

P.S. - That's a wonderful update! It's so helpful when entering descriptive titles. A Hundred Years Hence worked perfectly.
Isana
Posts: 273
Joined: December 2nd, 2013, 12:46 pm
Location: USA

Post by Isana »

Thanks, Kangaroo! I actually still have to fix the embedded text because sometimes the PG text is not encoded in UTF-8. I'll probably work on it this weekend. Good enough for government work for now. :shock:
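
One common way to cope with text that is not always UTF-8 is to try a short list of encodings in order. The Python sketch below shows that idea only; `decode_pg_text` is a hypothetical helper, and the choice of fallback encodings (Windows-1252, then Latin-1) is an assumption, not something the app is known to do.

```python
def decode_pg_text(raw: bytes) -> str:
    """Decode book bytes, trying UTF-8 first and falling back if that fails.

    Hypothetical sketch. "utf-8-sig" also strips a leading byte-order mark.
    Latin-1 maps every byte to a character, so the last step cannot fail.
    """
    for encoding in ("utf-8-sig", "cp1252"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return raw.decode("latin-1")


utf8_text = decode_pg_text("café".encode("utf-8"))
latin_text = decode_pg_text("café".encode("latin-1"))
```

A fallback like this can still guess wrong for other legacy encodings, so reading the character set declared in the book's header, when present, would be more reliable.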
Kangaroo692
Posts: 1939
Joined: August 21st, 2014, 9:34 am
Location: Probably the holodeck :)

Post by Kangaroo692 »

Hi. I've used WordCount recently (on multiple computers, W7 and W8, with Google Chrome, and with different ebooks) and every time I've gotten this error:
Error downloading ebook with ID 43286.
Thank you for your work on this program.
Isana
Posts: 273
Joined: December 2nd, 2013, 12:46 pm
Location: USA

Post by Isana »

Kangaroo692 wrote:Hi. I've used WordCount recently (on multiple computers, W7 and W8, with Google Chrome, and with different ebooks) and every time I've gotten this error:
Error downloading ebook with ID 43286.
Thank you for your work on this program.
Hello Kangaroo692! Thank you for reporting this. I will look into it later today. (I am currently finishing a long-overdue recording that I claimed. :D )
Isana
Posts: 273
Joined: December 2nd, 2013, 12:46 pm
Location: USA

Post by Isana »

Kangaroo692 wrote:Hi. I've used WordCount recently (on multiple computers, W7 and W8, with Google Chrome, and with different ebooks) and every time I've gotten this error:
Error downloading ebook with ID 43286.
Thank you for your work on this program.
Hi Kangaroo692 and everyone.

I have taken a look at this, and it seems it's a problem with the ibiblio FTP server, where PG ebooks are stored. From my logs, this problem started around May 7. I will try to use a mirror to circumvent this problem. I will post again here when it is fixed. Thank you for your patience.

To those who are active on and have some pull on Project Gutenberg, it would help me if you could keep an ear out there for any signs of recent problems on the ibiblio server related to this, and to give us news if you hear of any. Thank you. :)
Kangaroo692
Posts: 1939
Joined: August 21st, 2014, 9:34 am
Location: Probably the holodeck :)

Post by Kangaroo692 »

Isana wrote:To those who are active on and have some pull on Project Gutenberg, it would help me if you could keep an ear out there for any signs of recent problems on the ibiblio server related to this, and to give us news if you hear of any. Thank you. :)
I've pointed Don (DACSoft) to this thread. He volunteers at PG.
DACSoft
Posts: 1981
Joined: August 17th, 2013, 8:51 am
Location: Connecticut, US

Post by DACSoft »

Kangaroo692 wrote:
Isana wrote:To those who are active on and have some pull on Project Gutenberg, it would help me if you could keep an ear out there for any signs of recent problems on the ibiblio server related to this, and to give us news if you hear of any. Thank you. :)
I've pointed Don (DACSoft) to this thread. He volunteers at PG.
Well, thanks for the promotion, Kangaroo :), but I'm actually a volunteer at Distributed Proofreaders, which supplies most (but not all) of the books that PG publishes. I'm not a PG admin or a whitewasher (WWer, a member of the team that checks submissions before posting them), nor am I directly involved in PG itself.

However, just as DP is a partner of PG, so is the site IBiblio. According to PG's "Partners, Affiliates and Resources" page: "IBiblio is our main eBook distribution site, holds our Web pages, and offers a variety of supporting services."

Browsing the main page of their (IBiblio's) site, I find some tweets beginning May 7, about maintenance/upgrade to the load balancing system, and May 20-21 maintenance to their mySQL structure (mySQL2 and mySQL3). These may be affecting the Word Count program.

Isana, I'd suggest contacting PG's general help (help2015 AT pglaf DOT org) or the Webmaster (webmaster2015 AT pglaf DOT org), since they don't provide contacts for their developers or database people. You probably have to go through PG, but you may want to review the information at IBiblio (www DOT ibiblio DOT org).

I hope this helps.

ETA: By the way, I've tried the WC program, and like it a lot! :thumbs:

Don
Don (DACSoft)
Bringing the Baseball Joe series to audio!

In Progress:
The Arrival of Jimpson; Baseball Joe in the World Series
Next up:
Two College Friends; Baseball Joe Around the World
Isana
Posts: 273
Joined: December 2nd, 2013, 12:46 pm
Location: USA

Post by Isana »

Hi everyone. I have changed the FTP server to PG's Seattle mirror. It should be working again now. Thanks again to Kangaroo692 for reporting the error and for the help.

Don, thank you for the help and detailed information. I shall try to review the ibiblio pages and contact them if they don't resolve it soon. I should probably also follow them on Twitter if they tweet about these things. :thumbs:
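
For anyone wondering how a download could survive one server going down, here is a small Python sketch that tries a list of mirrors in order. It is an illustration under assumptions, not the app's code: `download_with_fallback`, the `fetch` callback, and the host names in the example are all hypothetical.

```python
from typing import Callable, Iterable


def download_with_fallback(
    ebook_id: int,
    mirrors: Iterable[str],
    fetch: Callable[[str, int], bytes],
) -> bytes:
    """Try each mirror in turn and return the first successful download.

    `fetch(mirror, ebook_id)` is a caller-supplied function (hypothetical)
    that returns the book bytes or raises OSError on a network failure.
    """
    errors = []
    for mirror in mirrors:
        try:
            return fetch(mirror, ebook_id)
        except OSError as exc:
            # Remember why this mirror failed, then move on to the next one.
            errors.append(f"{mirror}: {exc}")
    raise RuntimeError(
        f"Error downloading ebook with ID {ebook_id}. Tried: " + "; ".join(errors)
    )


def fake_fetch(mirror: str, ebook_id: int) -> bytes:
    """Stand-in for a real download; the first host always fails."""
    if mirror == "primary.example":
        raise ConnectionError("timed out")
    return b"book text"


data = download_with_fallback(345, ["primary.example", "mirror.example"], fake_fetch)
```

Passing the download function in as a parameter keeps the retry logic separate from the transport, so the same loop would work for FTP or HTTP.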
kayray
Posts: 11828
Joined: September 26th, 2005, 9:10 am
Location: Union City, California

Post by kayray »

Hi Isana,

I can't get Wordcount to work on Dracula, ID 345 ( http://www.gutenberg.org/ebooks/345 )
The result is "Error downloading ebook with ID 345."

Have tested a few other books at random and they don't work either.

You've got an expired security certificate. Could that be the problem?
Kara
http://kayray.org/
--------
"Mary wished to say something very sensible into her Zoom H2 Handy Recorder, but knew not how." -- Jane Austen (& Kara)
Isana
Posts: 273
Joined: December 2nd, 2013, 12:46 pm
Location: USA

Post by Isana »

Hello everyone.

I apologize that I had to take down the WordCount app for the better part of last year. I took it offline to fix a problem, but life got in the way and I wasn't able to give it my attention. 2016 was a doozy.

Anyway, I am pleased to announce that the WordCount app is back up, with bugs squelched until the next one pops up:

https://karikarito.com/wordcount

As usual, let me know if you need help with a specific eBook.

Additionally, my retroLV site (a catalog of LibriVox audiobooks that makes use of Internet Archive data) is also back up after bug fixes:

https://karikarito.com/retrolv

Thank you.
kathrinee
Posts: 8397
Joined: May 14th, 2012, 5:09 am
Location: in the sun

Post by kathrinee »

Thank you for letting us know! :D
Kathrine