Word Frequency
Jamesaritchie
04-09-2005, 01:10 AM
Okay, per the mention of determining word frequency in a document, I looked around and found a macro for Word that does the job perfectly, as far as I can tell. I've been using it for a couple of hours, and have no complaints at all.
It's the second macro on this page: http://wordtips.vitalnews.com/Pages/T1510_Generating_a_Count_of_Word_Occurrences.html
Steve 211
04-09-2005, 12:06 PM
Thanks, James. I've been looking for a site just like that. Now off to learn some new tricks.
SRHowen
04-09-2005, 05:36 PM
James, I sent you a private message about a program that someone wrote that easily counts all words in a document and tells you how many times each is used without your having to put in the words at all. It lists them for you as well as has some other functions--
You have to save your doc as .rtf or plain text but it works great.
If anyone wants the link send me a message and I'll point you in the right direction.
Jamesaritchie
04-09-2005, 07:50 PM
James, I sent you a private message about a program that someone wrote that easily counts all words in a document and tells you how many times each is used without your having to put in the words at all. It lists them for you as well as has some other functions--
You have to save your doc as .rtf or plain text but it works great.
If anyone wants the link send me a message and I'll point you in the right direction.
Thanks. I appreciate that.
This macro does a pretty good job. If you put "word" into the dialogue box, it gives you a word list in alphabetical order. The first twelve words in a short story I tested were:
173 a
1 abducted
25 about
2 absence
5 accident
1 acre
5 acres
5 across
1 acting
1 add
2 added
1 affairs
If you type FREQ into the dialogue box, it gives you a list of all words by frequency. In this case, the top dozen words were:
288 the
173 a
171 and
160 i
137 to
124 was
112 it
92 of
69 in
62 but
60 that
60 he
If you know how to use macros, or just take a couple of minutes to learn, it also lets you put a button on the Word toolbar. Click the button and it gives you the list from inside Word without the need to save the file or do anything at all outside of Word. It counts the frequency of all words in the open document, and generates the list in a pop-up Word document. Very handy.
It also tells you how many different words the story has. In this case, a 6,000 word short story contained 1,331 different words. Just about right, in this case.
The only tweaking it needs is to change the maxword line in the macro from 9000 to 90000, if you want to run it on a really long novel, and to either add or delete from a line of excluded words inside the macro. All very evident as you enter the macro.
You also don't have to record a macro, of course. Since this is a finished macro, you just click on "create" inside the macro dialogue, then copy and paste from the website.
It also tells me I have some editing to do on this particular story. I definitely overused three or four words.
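For anyone who wants to see the idea outside of Word, here is a rough Python sketch of the same kind of count. It is not the macro's code (that is VBA, on the linked page); the function names and the sample sentence are made up for illustration. It shows the alphabetical list, the frequency list, the optional excluded-word list, and the count of different words.

```python
import re
from collections import Counter


def word_frequencies(text, excluded=()):
    """Count each word in text, ignoring case and skipping excluded words."""
    # A word here is a run of letters, optionally with inner apostrophes (don't).
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    skip = {w.lower() for w in excluded}
    return Counter(w for w in words if w not in skip)


def alphabetical(counts):
    """Return (word, count) pairs in alphabetical order."""
    return sorted(counts.items())


def by_frequency(counts):
    """Return (word, count) pairs, most frequent first, ties alphabetical."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


sample = "The cat sat on the mat and the dog sat too."
counts = word_frequencies(sample)
for word, n in by_frequency(counts)[:3]:
    print(n, word)
print(len(counts), "different words")
```

Passing something like `excluded=["the", "a"]` plays the role of the excluded-words line in the macro; leaving it empty counts everything, short words included.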
AnneMarble
04-09-2005, 08:34 PM
It's the second macro on this page: http://wordtips.vitalnews.com/Pages/T1510_Generating_a_Count_of_Word_Occurrences.html
Yay, I got it to work!
Back in the days of 5-1/4 floppies, I used to have a shareware grammar program that not only provided a list of the most frequently used words but also told you if you used the same word close to another usage of the same word. The grammar tips it provided were often silly, but I liked the way it kept an eye on word proximity.
I miss that program, but this will at least let me keep an eye on how often I use "then" and "that" and how often my characters glare and turn. ;)
I know this is the novel-writing thread, but one warning on that macro: If you're writing nonfiction and use footnotes, it won't help; it definitely won't count material in footnotes, and may (I haven't tested it, but there's a bad assumption in the code that is broken by footnotes) choke entirely. Thus, it's not an all-purpose tool.
sGreer
04-10-2005, 02:08 AM
Now if only there were such an option for the Macintosh?
Jamesaritchie
04-10-2005, 03:57 AM
I know this is the novel-writing thread, but one warning on that macro: If you're writing nonfiction and use footnotes, it won't help; it definitely won't count material in footnotes, and may (I haven't tested it, but there's a bad assumption in the code that is broken by footnotes) choke entirely. Thus, it's not an all-purpose tool.
No, it won't count frequency in footnotes. This is no problem for me, since I seldom use footnotes, and don't need a frequency count in them when I do. (At least, not one that includes the count with the master document count.)
I'm not a programmer by any stretch. My sole experience with programming lies in reading a lengthy article about six months ago on how to write macros in Visual Basic so I could write a couple for Word. But I did worry about this macro choking, so I tested it on four articles I have that contain footnotes, and it's performed fine. It just doesn't count frequency in the footnotes.
I thought about messing with the code, but realized counting footnotes would be a huge disadvantage for me unless the macro counted them as a separate document, and I have no idea how to make this happen.
AnneMarble
04-10-2005, 04:17 AM
Now if only there were such an option for the Macintosh?
There's a web site that does it:
http://textalyser.net/
Unlike the macro, the web site actually lets you exclude words under a certain length.
I think there are other, similar sites that do this because there seems to be a popular Java program that can count words, list the frequency, etc.
Jamesaritchie
04-10-2005, 05:38 AM
There's a web site that does it:
http://textalyser.net/
Unlike the macro, the web site actually lets you exclude words under a certain length.
I think there are other, similar sites that do this because there seems to be a popular Java program that can count words, list the frequency, etc.
The macro also lets you exclude words, but I've found the short words, a, I, it, and, but, to, can, got, etc., are the ones I need the most help eliminating. Excluding words under a certain length would defeat the entire purpose of the macro for me. These are the ones that kill my writing. The first thing I did with the macro was delete the string of excluded words.
One of the words I worry most about is "that," and it has only four letters. But "a," "I," "but," "and," "he," "she," and "got" are the real killers.
There are many websites where you can enter text and get a count, but they all have the same problems where I'm concerned: you have to go to the website, you have to cut and paste, and you don't get to keep the word list inside of Word unless you do another cut and paste.
Vomaxx
04-10-2005, 06:11 AM
I can see that programs like this could be useful, but I think they could drive us crazy, too. I would say, use with caution.
My worry in this regard is not so much how often I use a word in a book but whether I have used the same word too often on one page, or even in successive sentences--a fault (well, it's often a fault) I am prone to, unfortunately. The best use of these programs might be page by page.
AnneMarble
04-10-2005, 06:40 AM
I can see that programs like this could be useful, but I think they could drive us crazy, too. I would say, use with caution.
Yup. I used to go through and rewrite sentences to delete as many "that's" as I could. Then I realized that (whoops, there's another one) the sentences often sucked without the "that." :(
Jamesaritchie
04-10-2005, 07:37 AM
Yup. I used to go through and rewrite sentences to delete as many "that's" as I could. Then I realized that (whoops, there's another one) the sentences often sucked without the "that." :(
Different styles. I think "that" is the most overused word in third person fiction, and any sentence where it can be eliminated becomes a much better sentence. If "that" isn't grammatically required, get rid of it, says I.
It may be because I'm so conscious of the word, but unnecessary "thats" jump out at me like cold sores on a model's lips, and it doesn't take many of them to make me stop reading. I think it's the first word a good editor does his best to eliminate. More often than not, "that" is simply an unnecessary word.
"Got" is a similar word, as in "I got up, and got dressed. Then I got a cup of coffee." For me, that's three gots too many. "Got" has its uses, but more often than not, it's overused and a horrid little thing. A weak verb, if ever there was one.
"And" and "but" can also be overused. But in the case of these conjunctions, too few can make the reading as bad as can too many. Sometimes these programs say much when they tell you a word hasn't been used enough.
In first person fiction, "I" is a horribly overused word, and the primary reason editors warn against newbies writing in first person. Use "I" too many times, and first person becomes unreadable. I can almost always look at the number of "I's" in a story and tell you whether or not it will be rejected on this basis alone.
I think the value of word frequency programs is mostly in the eye of the individual writer, but they save me an awful lot of time and energy in the final draft stage.
SRHowen
04-10-2005, 08:11 AM
"That" and "just" are my two worst words; "almost" is right behind them.
As to first person, I do it well, but I see an awful lot done not very well. Many new writers feel it is easy to write because they can put themselves into the main character's shoes.
Doesn't work well most times.
I do a word search for uses of "I," and if the sentence can be said without it, the sentence gets reworded.
Shawn
azbikergirl
04-10-2005, 10:55 AM
My text editor, NoteTab, has a frequency counter with its text statistics feature. The Word one would be nice so I don't have to copy/paste my story into the text editor to get the frequency.
I am a programmer, and I've been trying to put together a tool that would count sentence length, to get a snapshot of the rhythm. The output would be something like
17,12,22,5,10,19,11,8,16,15,1,23
and the purpose is to search for chunks like
17, 12, 22, 5, 6, 6, 6, 6, 7, 5, 16...
which could sound really choppy (of course, sometimes we want to do that for effect, but generally, paragraphs sound better when sentence lengths vary).
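A minimal Python sketch of that idea, not the tool described above: it splits on end punctuation (a crude assumption that will trip on abbreviations and dialogue), counts the words in each sentence, and flags runs of short sentences. The thresholds `max_len` and `min_run` are invented defaults to tune by ear.

```python
import re


def sentence_lengths(text):
    """Return the number of words in each sentence of text."""
    # Split after ., ! or ? followed by whitespace (rough heuristic).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [len(s.split()) for s in sentences if s]


def choppy_runs(lengths, max_len=7, min_run=4):
    """Return (start_index, run_length) for each run of short sentences."""
    runs = []
    start = None
    # A sentinel longer than max_len closes any run still open at the end.
    for i, n in enumerate(list(lengths) + [max_len + 1]):
        if n <= max_len:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - start))
            start = None
    return runs


print(choppy_runs([17, 12, 22, 5, 6, 6, 6, 6, 7, 5, 16]))
```

On the second list above this reports one run of seven short sentences starting at the fourth sentence; the first list, with its varied lengths, reports nothing.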
<snip>
I am a programmer, and I've been trying to put together a tool that would count sentence length, to get a snapshot of the rhythm.
Rhythm is so vital in writing and I think it's often overlooked. You can have plain prose, or wordy, descriptive, packed-full writing - as long as it has a good rhythm, I believe it will work. Fascinating that you would be trying to 'detect' it with software. My feeling is that the pace depends more on syllables than on sentence length.
Let me know how it goes!
Okay, per the mention of determining word frequency in a document, I looked around and found a macro for Word that does the job perfectly...
This is a FASCINATING thread! Is there another one that discusses word frequency, too? I did a search but didn't turn it up...
I'm really interested to know if there's a "rule" or "rule of thumb" for how many times a word should appear in a document (a percentage, I'm guessing).
I understand that works need rhythm, style, voice, etc. to truly succeed, but now I'd like to use word frequency count as a tool to make my writing better. Can you offer any general guidelines?
Julie Worth
01-12-2006, 12:10 AM
I'm really interested to know if there's a "rule" or "rule of thumb" for how many times a word should appear in a document (a percentage, I'm guessing).
I'm wondering too. I ran my latest thriller (90,000 words, computer count) through a frequency counter, and found just over 8000 unique words. So is that good or bad?
Word gives me this:
Flesch Reading ease--79
Flesch Grade level--4.6
I've read that the grade level should be in the 4-6 range, but I haven't a clue as to the vocabulary.
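For reference, the two scores come from the published Flesch and Flesch-Kincaid formulas, sketched below in Python. The function names are made up, and the counts are passed in by hand because counting syllables automatically is guesswork; Word's own counts may differ a little from yours.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical passage: 100 words, 10 sentences, 150 syllables.
print(round(flesch_reading_ease(100, 10, 150), 1))
print(round(flesch_kincaid_grade(100, 10, 150), 1))
```

Both scores depend only on average sentence length and average syllables per word, so neither says anything about how many different words a book uses.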
blacbird
01-12-2006, 12:47 AM
Speaking of overused words, in rough drafts, I have a lot of trouble with "some", for some reason.
I keep a short but slowly expanding list of such repetitive weaknesses, "that" being another, and when I start editing, the first thing I do is search through the document and check every occurrence of any of them, excising as many as I can.
caw.
scarletpeaches
01-12-2006, 12:54 AM
My overused words come at the start of dialogue. "So..." and "Well..."
'That' is a frequent flyer as well. And 'then'.
If I want to count how many times I use these words, I just go into search/replace and, as an example, replace 'so' with 'so' - it then tells me "Word has made X-number of replacements" and my document has not been altered in any way!
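The same count can be sketched in a few lines of Python, for text saved outside Word. This is only an illustration with a made-up function name. It matches whole words, which matters for a word like 'so': in Word, a plain replace will also hit the 'so' inside 'also' unless "Find whole words only" is ticked.

```python
import re


def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word in text."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


sample = "So, I also said so. Then that."
print(count_word(sample, "so"))
```

Here 'so' is counted twice ("So" and "so"), and "also" is left alone.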
DamaNegra
01-12-2006, 03:27 AM
Somewhere on the Tech Help board (I think it's on a sticky) there are macros for Word that can count the frequency of words, highlight all the passives you use and can also count the frequency of phrases. I haven't used them, but they sound very helpful.
jen.nifer
01-12-2006, 03:36 AM
Yes, if Margaret Atwood used such a tool, she'd have found that she seems to have an unhealthy obsession with the word furtively in The Blind Assassin.
I'm still enjoying the book though.
scarletpeaches
01-12-2006, 03:49 AM
I could never get past the first ten pages of any of her books without falling into a coma...
jen.nifer
01-12-2006, 05:29 AM
Hehe. She writes some good stuff, but I have to wade through a fair bit of waffling to get to it, unfortunately. If only she could condense. And she uses far too many adjectives, if you ask me. Generally, I try to finish any books that I start despite being very aware of the "life's too short to read a bad book" saying.
DamaNegra
01-12-2006, 05:34 AM
Here (http://www.absolutewrite.com/forums/showthread.php?t=11653) it is.
Word Frequency Counter:
http://absolutewrite.com/forums/showthread.php?t=11152
This program counts all instances of each word in a document. This can be useful to determine if you are using some words too frequently. In addition, it also tracks the shortest distance (in words) between any two instances of a word. (For instance, "I said that he said" would return 3.) This can help determine if you are using words in too close proximity.
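The shortest-distance idea can be sketched in Python as follows. This is not the linked program's code; it assumes "distance" means the difference in word positions, which agrees with the "I said that he said" example, and the function name is invented.

```python
import re


def closest_repeats(text):
    """Map each repeated word to the smallest gap (in words) between two uses."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    last_seen = {}  # word -> position of its most recent use
    shortest = {}   # word -> smallest gap found so far
    for i, w in enumerate(words):
        if w in last_seen:
            gap = i - last_seen[w]
            if w not in shortest or gap < shortest[w]:
                shortest[w] = gap
        last_seen[w] = i
    return shortest


print(closest_repeats("I said that he said"))
```

Only consecutive uses need comparing, since the closest pair of any word is always two uses that sit next to each other in the sequence of that word's uses. Words that appear once are left out of the result.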
Phrase Frequency Counter:
http://absolutewrite.com/forums/showthread.php?t=11380
This program is similar to the Word Frequency Counter, but it counts only those phrases you specify. If you have a tendency to use cliches or over-worked phrases, this can help identify them.