Even languages that we have yet to decode, such as the language in the Voynich Manuscript, seem to follow this rule.
Humans often take pride in the complexity and unpredictability of how we use language. However, a strange phenomenon known as Zipf’s Law has challenged this notion, revealing that the arrangement and frequency of words in most languages around the world adhere to a fixed mathematical law, even though the reasons behind it remain a mystery.
The frequency of words appearing in language follows a power law.
More than 80 years ago, linguist George Kingsley Zipf discovered that the frequency of words in language follows a power law: a word’s frequency is roughly inversely proportional to its rank in the frequency list. Specifically, the most common word in a language, such as “the” in English, appears about twice as often as the second most common word, about three times as often as the third most common word, and so on down the ranking.
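Written as a formula, the relationship described above can be sketched as follows. The symbols are notation introduced here for illustration and do not come from the article: r is a word's rank (1 for the most common word), f(r) is how often that word occurs, C is a constant that depends on the text, and s is an exponent that equals 1 in the idealized case described above.

```latex
f(r) \approx \frac{C}{r^{s}}, \qquad s \approx 1
\quad\Longrightarrow\quad
\frac{f(1)}{f(2)} \approx 2, \qquad \frac{f(1)}{f(3)} \approx 3
```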
This phenomenon is not limited to English; it has appeared in every language studied so far, from Hindi and Mandarin to Spanish. Surprisingly, even undeciphered writing, such as the text of the Voynich Manuscript and some ancient texts, appears to comply with Zipf’s Law. Individual works such as Charles Darwin’s On the Origin of Species and even Shakespeare’s Hamlet are no exceptions.
Why does language adhere to this law?
Language is not completely random; it follows underlying rules.
The existence of Zipf’s Law raises many significant questions. One hypothesis, proposed by George Zipf himself, is that the pattern reflects a balance between effort and efficiency: speakers and writers favor a small set of common words to minimize their own effort, while listeners and readers rely on less common, more specific words for clarity. As a result, language evolves in a way that optimizes information transmission.
Another idea is that common words benefit from a “snowball effect”: the more often a word is used, the more likely it is to be used again. However, no explanation has been universally accepted.
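The snowball idea can be illustrated with a small simulation. The article does not specify a model, so what follows is only one possible sketch, loosely in the spirit of Herbert Simon's reuse process: at each step the "writer" either coins a new word or repeats a word picked at random from everything written so far, so words that are already frequent are more likely to be picked again. The function name `simulate_snowball` and the parameter `new_word_prob` are illustrative assumptions.

```python
import random
from collections import Counter


def simulate_snowball(steps, new_word_prob=0.1, seed=0):
    """Generate `steps` word tokens and return a Counter of word frequencies.

    Words are represented by integers. This is a toy model, not a claim
    about how real language is produced.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    rng = random.Random(seed)
    text = [0]          # the text starts with a single word
    next_word = 1       # label for the next newly coined word
    for _ in range(steps - 1):
        if rng.random() < new_word_prob:
            # Coin a brand-new word.
            text.append(next_word)
            next_word += 1
        else:
            # Repeat an earlier token chosen uniformly from the text, so a
            # word is reused in proportion to how often it already appears.
            text.append(rng.choice(text))
    return Counter(text)


# Example: inspect the ten most frequent "words" of a simulated text.
#   counts = simulate_snowball(50_000)
#   print(counts.most_common(10))
```

With a small `new_word_prob`, a few early words tend to accumulate most of the occurrences while many words appear only once or twice, which is the kind of skew the snowball hypothesis appeals to.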
Exploring language through the lens of Zipf’s Law
Although linguists and mathematicians have yet to uncover the deeper reasons, Zipf’s Law opens up a new perspective on how language operates. This also highlights the strange logic of communication, showing that language is not entirely random but adheres to underlying rules.
You can even test the law on texts of your own. Run a novel or a long article through a word-frequency counter, and you will see how closely the word counts conform to it.
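A minimal sketch of such a check, using only the Python standard library. The function names `rank_frequency` and `zipf_slope` are illustrative, and the tokenizer (lowercased runs of letters) is a simplifying assumption; real texts may need more careful word splitting. The idea is to rank words by frequency and then fit a straight line to log(count) against log(rank): under the idealized form of the law the slope is close to -1.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Treat every run of letters as a word (a simplification).
    words = re.findall(r"[^\W\d_]+", text.lower())
    ranked = Counter(words).most_common()
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


def zipf_slope(table):
    """Least-squares slope of log(count) against log(rank)."""
    if len(table) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank, _, _ in table]
    ys = [math.log(count) for _, _, count in table]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Example (the file name is hypothetical):
#   with open("novel.txt", encoding="utf-8") as handle:
#       table = rank_frequency(handle.read())
#   for rank, word, count in table[:10]:
#       print(rank, word, count, rank * count)
#   print("slope:", zipf_slope(table))
```

If the text follows the law, the product of rank and count in the printed table stays roughly constant for the top words. Short texts and the long tail of words that occur only once can pull the fitted slope away from -1, so a long text gives a fairer test.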
While we do not fully understand the reasons behind it, Zipf’s Law remains a fascinating testament to the bond between mathematics and language, raising significant questions about how language forms and evolves within human culture.