Zipf's Law

Zipf's Law (see for example http://www.parc.xerox.com/istl/groups/iea/papers/ranking/ranking.html, http://www.kornai.com/MatLing/statling.html) was formulated in the 1940s by the Harvard linguist George Kingsley Zipf (1902-1950) as an empirical generalisation. It states that the frequency of the n-th most frequent word in a language is proportional to 1/n, which is often abbreviated loosely to 'frequency 1/n'.

So the most frequent two words account for 150% of the language?

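  • ... ignoring boundary cases, obviously. Strictly, the frequency is proportional to 1/n rather than equal to it, so the shares are scaled to sum to 100%.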
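
To make the answer to the 150% question concrete, here is a minimal sketch in Python. It assumes an idealised language with a fixed vocabulary in which frequency is exactly proportional to 1/n; the function name zipf_shares and the vocabulary size of 50,000 are illustrative choices, not figures from Zipf or from the pages linked above.

```python
def zipf_shares(vocab_size):
    """Return each rank's share of running text under an idealised Zipf law."""
    # The harmonic number 1 + 1/2 + ... + 1/N is the constant that makes
    # the shares sum to 1 (that is, to 100% of the text).
    harmonic = sum(1.0 / rank for rank in range(1, vocab_size + 1))
    return [1.0 / (rank * harmonic) for rank in range(1, vocab_size + 1)]


# Hypothetical vocabulary size, chosen only for illustration.
shares = zipf_shares(50_000)
print(f"rank 1: {shares[0]:.1%}, rank 2: {shares[1]:.1%}")
print(f"top two together: {shares[0] + shares[1]:.1%}")
```

Under these assumptions the second word is still exactly half as frequent as the first, but the two together come to well under 100% of the text (somewhere around an eighth of it for this vocabulary size).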


Zipf made the further assumption that the shorter a word is, the more common it is; this ties in to the more general empirical observation that 'smaller' events are commoner than 'larger' events. (See http://www.parc.xerox.com/istl/groups/iea/papers/ranking/ranking.html for other laws expressing this.) This observation is also referred to loosely as 'Zipf's Law', but is not what people outside linguistics understand by it.
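
One rough way to check the length observation against a text is to compare the average length of the most frequent words with that of the rest. The sketch below is only an illustration: the helper name length_by_frequency, the cut-off top and the sample sentence are all invented here, and a sample this small cannot confirm or refute anything about a real language.

```python
import re
from collections import Counter


def mean_length(words):
    """Average number of letters per word, or 0.0 for an empty list."""
    return sum(len(w) for w in words) / len(words) if words else 0.0


def length_by_frequency(text, top=2):
    """Return (mean length of the `top` most frequent words, mean length of the rest)."""
    words = re.findall(r"[a-z']+", text.lower())
    # most_common() lists distinct words from most to least frequent.
    ranked = [word for word, _ in Counter(words).most_common()]
    return mean_length(ranked[:top]), mean_length(ranked[top:])


# Invented toy sample; a real test would need a large corpus.
sample = ("the cat and the dog saw the bird and the cat ran to the garden "
          "while the dog and the bird waited")
frequent, rest = length_by_frequency(sample, top=2)
print(f"frequent words: {frequent:.2f} letters, other words: {rest:.2f} letters")
```

The comparison is by distinct word, not by running text, so a single long but common word in the top group would show up clearly in the first average.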



However, this is only a generalisation, and every language has common polysyllabic terms, because
they are useful. It doesn't mean a long term is somehow "doomed". (And as Talen says, 'If you want
Vorlin, you know where to find it.')

  • How common do you mean? Just "everyday", or "extremely high frequency, top 250 words" sort of thing? If the former, then yes, but that is not really the key issue. If the latter, I will bother to check for English. --And


Created by MichaelHelsem. Last Modification: Wednesday 18 of December, 2002 15:28:49 GMT by MichaelHelsem.