Numbers are everywhere. The image at the beginning of this article, for instance, contains lots of numbers, and two men have shown that a higher percentage of those numbers than you might expect will start with "1".
How is this possible? In 1881, the Canadian-American astronomer Simon Newcomb made an interesting observation. He noted that the pages in a book of logarithm tables starting with the number "1" were more worn than the other pages.
In 1938, the American electrical engineer Frank Benford tested Newcomb's notion on over 20 different data sets, which included the surface areas of 335 rivers, 3,259 U.S. populations, 104 physical constants, 1,800 molecular weights, 418 death rates and even 308 numbers contained within an issue of Reader's Digest. He published his findings in a paper entitled "The Law of Anomalous Numbers."
Benford's Law, which is also known as the Newcomb-Benford Law, describes the frequency distribution of the leading digits of numbers within sets of numbers. It states that the digit "1" appears as the first digit, or leading significant digit, about 30.1% of the time, the digit "2" appears as the first digit 17.6% of the time, and so on down to the digit "9", which appears as the first digit only 4.6% of the time.
If each digit from "1" to "9" had an equal chance of appearing as the first digit of a number, each would appear 11.1% of the time.
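The percentages quoted above come from a formula the article does not spell out: the expected share of numbers with leading digit d is log10(1 + 1/d). As a minimal sketch (the function name benford_probability is chosen here only for illustration), the full table can be computed like this:

```python
import math

def benford_probability(d: int) -> float:
    """Expected share of numbers whose leading digit is d (1 through 9)."""
    return math.log10(1 + 1 / d)

# Print the expected share for every possible leading digit,
# next to the 11.1% each digit would get if all were equally likely.
for d in range(1, 10):
    print(f"{d}: Benford {benford_probability(d):.1%}  vs  uniform {1 / 9:.1%}")
```

The nine shares add up to 100%, and only the digits "1", "2" and "3" come out above the uniform 11.1%.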
Benford's Law applies only to sets of numbers that span multiple orders of magnitude and that are not dimensionless, so the numerical values of the data depend on the units. If you remember from high school math, the order of magnitude of a number is its logarithm, usually in base ten, so each step up in order of magnitude is one more power of ten:
1 = 1 x 10^0
10 = 1 x 10^1
100 = 1 x 10^2
1000 = 1 x 10^3, etc.
So, to what data sets might Benford's Law apply? It applies to street addresses, Fibonacci numbers, electricity bills, stock prices, factorials, home prices, population numbers, death rates, powers of 2, the lengths of rivers, and both physical and mathematical constants.
Fibonacci numbers start with "0" and "1", and each later number is the sum of the two numbers before it:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144 ...
0 + 1 = 1, 1 + 1 = 2, 2 + 1 = 3, 3 + 2 = 5, 5 + 3 = 8
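To see the claim about Fibonacci numbers in action, the following sketch generates the sequence and tallies the leading digits. The helper names (fibonacci, leading_digit, leading_digit_shares) and the choice of 1,000 terms are assumptions made for this illustration, not something taken from Benford's paper.

```python
from collections import Counter

def fibonacci(n: int) -> list[int]:
    """Return the first n Fibonacci numbers, starting at 0."""
    a, b = 0, 1
    out = []
    for _ in range(n):
        out.append(a)
        a, b = b, a + b
    return out

def leading_digit(x: int) -> int:
    """First significant digit of a nonzero integer."""
    return int(str(abs(x))[0])

def leading_digit_shares(values) -> dict[int, float]:
    """Share of values starting with each digit 1..9 (zeros are skipped)."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}

# 0 plus the next 1,000 Fibonacci numbers; the 0 has no leading digit.
shares = leading_digit_shares(fibonacci(1001))
for d, share in shares.items():
    print(f"{d}: {share:.1%}")
```

The same leading_digit_shares helper could be pointed at factorials or powers of 2 to check the other sequences on the list.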
Factorials, written as n!, are the product of all positive integers that are less than or equal to n. For example,
5! = 5 x 4 x 3 x 2 x 1 = 120.
Physical constants include c, the speed of light in a vacuum, or 186,282 miles per second (299,792 km per second); h, the Planck constant, or 6.62607015 × 10^-34 J·s; G, the Newtonian constant of gravitation, or 6.674 × 10^-11 m^3 kg^-1 s^-2; and e, the elementary charge, or 1.602176634 × 10^-19 coulombs.
On a logarithmic scale, the probability of a digit being the leading digit is proportional to the width of that digit's interval on the scale. The interval between the log of ".1" and the log of ".2" is much wider than the interval between the log of ".8" and the log of ".9". The same is true, with exactly the same widths, for "1" and "2", "10" and "20", and "100" and "200" compared with "8" and "9", "80" and "90", and "800" and "900".
Therefore, any random number is much more likely to fall into the wider intervals than into the narrower ones. Remarkably, this works whether we use a logarithmic scale having base 10 or any other base greater than base 2.
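The interval argument can be checked numerically. The sketch below, with function names invented for this example, measures the width of an interval on a log scale and then applies the same idea to other bases, where the probability of leading digit d becomes the base-b logarithm of 1 + 1/d.

```python
import math

def log_width(low: float, high: float, base: float = 10) -> float:
    """Width of the interval [low, high] on a logarithmic scale."""
    return math.log(high, base) - math.log(low, base)

def first_digit_probabilities(base: int) -> dict[int, float]:
    """Leading-digit probabilities for numbers written in the given base."""
    return {d: math.log(1 + 1 / d, base) for d in range(1, base)}

# The "1 to 2" interval has the same width in every decade...
for low, high in [(0.1, 0.2), (1, 2), (10, 20), (100, 200)]:
    print(f"{low} to {high}: {log_width(low, high):.4f}")

# ...and is much wider than the "8 to 9" interval.
for low, high in [(0.8, 0.9), (8, 9), (80, 90), (800, 900)]:
    print(f"{low} to {high}: {log_width(low, high):.4f}")

# Other bases: in base 2 the only possible leading digit is 1.
for base in (2, 8, 16):
    print(base, first_digit_probabilities(base))
```

In base 2 the dictionary has a single entry, which is why the law says nothing interesting there.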
Numbers expressed in base 2 are called binary numbers. They are always expressed beginning with the number "1", and they include only "1s" and "0s". You probably recognize these as the kind of numbers used by computers.
Data sets that conform to Benford's Law
While stock market prices conform very accurately to Benford's Law, sets of numbers within a single order of magnitude do not. For example, the heights of adults, which will have a starting digit of "4", "5", "6" or possibly even "7", and IQ scores, which range from 70 to 130 or above, are unlikely to conform to Benford's Law.
However, if we look at the heights of the 58 tallest structures in the world, "1" is overwhelmingly the most common leading digit regardless of the unit of measurement. That means it doesn't matter if we measure structure heights in meters, yards, feet, inches, or centimeters, "1" still wins.
When a data set is scale invariant, that is, it is independent of the units that the data are expressed in, the distribution of first digits always conforms to Benford's Law.
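Scale invariance can be illustrated with a small experiment. The sketch below does not use the real structure heights, which are not reproduced in this article; as a stand-in it assumes a made-up data set of "lengths in meters" built from powers of 2 (one of the sequences listed earlier as following Benford's Law) and converts it to other units with standard conversion factors.

```python
import math

# Conversion factors from meters; the data set itself is hypothetical.
UNIT_FACTORS = {
    "meters": 1.0,
    "yards": 1.09361,
    "feet": 3.28084,
    "inches": 39.3701,
    "centimeters": 100.0,
}

def leading_digit(x: float) -> int:
    """First significant digit of a positive number, read off scientific notation."""
    return int(f"{x:.15e}"[0])

def share_of_ones(values) -> float:
    """Fraction of values whose leading digit is 1."""
    digits = [leading_digit(v) for v in values]
    return digits.count(1) / len(digits)

# Stand-in data: 2, 4, 8, ... treated as lengths in meters.
lengths_in_meters = [2.0 ** k for k in range(1, 501)]

shares = {
    unit: share_of_ones([v * factor for v in lengths_in_meters])
    for unit, factor in UNIT_FACTORS.items()
}
for unit, share in shares.items():
    print(f"{unit}: {share:.1%} start with 1 (Benford predicts {math.log10(2):.1%})")
```

Whatever the unit, the share of leading "1"s stays close to the Benford value, which is the behavior the paragraph above describes.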
If we look at the populations of the 150 most populated countries in the world, guess which leading digit wins again: the number "1".
Uses for Benford's Law
In 1972, the economist Hal Varian, who is now Google's chief economist, suggested using Benford's Law to detect fraud in economic data. It is rumored that the Internal Revenue Service (IRS) uses Benford's Law to analyze income tax returns, and Benford's Law has even been admitted as evidence in criminal cases on the federal, state, and local levels.
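How might such a check look in practice? The article does not say what method Varian, the IRS or the courts actually use, so the following is only a sketch under stated assumptions: it compares observed leading digits with the Benford expectation using a chi-square goodness-of-fit statistic, one common choice for this kind of comparison. Both sample data sets are invented for the illustration.

```python
import math
from collections import Counter

# 5% critical value of the chi-square distribution with 8 degrees of freedom.
CHI2_CRITICAL_5PCT_DF8 = 15.507

def leading_digit(x: int) -> int:
    """First significant digit of a nonzero integer."""
    return int(str(abs(x))[0])

def benford_chi_square(values) -> float:
    """Chi-square statistic of observed leading digits against Benford's Law."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    n = len(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat

# Invented examples: powers of 2 follow the law, while the integers
# 100..999 give every leading digit the same share.
conforming = [2 ** k for k in range(1, 501)]
suspicious = list(range(100, 1000))

for name, data in [("conforming", conforming), ("suspicious", suspicious)]:
    stat = benford_chi_square(data)
    flag = "deviates from" if stat > CHI2_CRITICAL_5PCT_DF8 else "is consistent with"
    print(f"{name}: chi-square = {stat:.1f}, {flag} Benford's Law")
```

A large statistic only says that the digits do not look Benford-like. As the earlier section notes, data confined to a single order of magnitude fails the test for entirely innocent reasons, so a flag is a prompt for a closer look, not proof of fraud.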
After the introduction of the Euro in 2002, Benford's Law was applied to the prices of goods which had to be converted from various currencies into the Euro. While the first digits of the new prices in Euros conformed to Benford's Law, the second and third digits did not, and this confirmed that there was a difficulty in converting prices to the new currency.
Recently, researchers have applied Benford's Law to the daily number of COVID-19 cases and deaths reported by several countries. While numbers from the U.S., Brazil, India, Peru, South Africa, Colombia, Mexico, Spain, Argentina, Chile, the UK, France, Saudi Arabia, China, the Philippines, Belgium, Pakistan, and Italy all conformed to Benford's Law, numbers from Russia and Iran did not.
Physics World also reported that an analysis of the 2020 election votes using Benford's Law found no irregularities in the vote counts from Fulton County, GA, Allegheny County, PA, Milwaukee, WI, and Chicago, IL.
If you want to confirm Benford's Law for yourself, in the image at the top of the screen, try comparing the numbers having the leading digit "1" with those having other leading digits. Guess who wins.