Benford's Law
Maths Exploration
Candidate Name: Alex Evat Lineekela Hawala
Candidate Number: 0015
School Number: 001179
Windhoek International School
Word count: 1179 words
Rationale
I wrote the Math Exploration on Benford's Law because I wanted to find a statistical concept that could be applied to different aspects of mathematics. The topic was also chosen to explore what kinds of data sets the law can be applied to, and where it can be used in the real world. I used the Fibonacci sequence as a test case because I wanted to find another mathematical concept to which Benford's Law could be applied. I then used data from the Namibia Statistics Agency so that I could test the Law on a set of real-world data. As for its use in the real world, the case of the Arizona Treasury manager is an example from financial forensics that was useful in my investigation. In this investigation I was able to use mathematical concepts such as logarithms and statistics.
Introduction
Statistics is a branch of mathematics that uses numerical data to identify trends and patterns. These trends and patterns can be used to make predictions and to solve problems. The aspect of statistics discussed in this mathematical investigation is Benford's Law, stated by Frank Benford in 1938. Benford's Law concerns the frequency of the first digit of the numbers in many sets of data. It states that, in such a set of data, numbers whose first digit is 1 occur most often. In this mathematical investigation I will first explain the concept of Benford's Law. Afterwards, I will use data from the Namibia Statistics Agency on Buildings Completed, and the Fibonacci sequence, to test Benford's Law. I will then provide an example of how the law is used in the real world in financial forensics, particularly in the detection of fraud. From what I have discovered in the investigation I will draw a conclusion regarding Benford's Law.

Background
Benford's Law was first stated by Simon Newcomb in 1881, but was popularized by Frank Benford, who restated the law in 1938. Benford stated the Law after using data sets from numerous sources, including the surface areas of rivers, death rates, and telephone numbers. Benford found that numbers beginning with the digit 1 occurred around 30% of the time, while numbers beginning with the digit 2 occurred around 17% of the time. As the first digit increased, its frequency decreased, so numbers starting with the digit 9 occurred much less often than numbers starting with any other digit. This contradicts the expectation that each digit from 1 to 9 has an equal chance of being the first digit of a number in a set of data. Figure 1 represents this distribution.

Figure 1
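Simon Newcomb had calculated this distribution with the formula:

$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \ldots, 9$$

where $P(d)$ is the probability that the first digit of a number is $d$.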
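Assuming the standard logarithmic form of the law, P(d) = log10(1 + 1/d), the expected first-digit frequencies can be tabulated with a short Python sketch. The function name `benford_probability` is chosen here for illustration and is not part of the original investigation.

```python
import math

def benford_probability(d):
    """Expected share of numbers whose first digit is d (1 to 9)."""
    if d not in range(1, 10):
        raise ValueError("first digit must be between 1 and 9")
    return math.log10(1 + 1 / d)

# Expected frequency for each possible first digit.
expected = {d: benford_probability(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.1%}")
```

The values fall from about 30.1% for a first digit of 1 to about 4.6% for a first digit of 9, and the nine probabilities sum to 1.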
Benford's Law also has uses in the real world, not only as a statistical phenomenon but also as a method of detecting fraud in financial forensics. For example, an Arizona Treasury manager was accused of committing cheque fraud in 1993. Figure 2 is a list of the transactions made by the manager. Under scrutiny one notices that most of the amounts begin with an 8 or a 9. This is the manager's first mistake, as his list of transactions therefore correlates poorly with Benford's Law. Most of the values in the data are just below US$ 100 000, which acts as a threshold for the data. This was probably because the perpetrator avoided any transactions above that value, as they would have required a human signature instead of the automated transfer through the Treasury's computer system. People tend to assume that every first digit occurs about equally often, rather than expecting some digits to occur more frequently than others. However, the distribution does not work in this manner. Figure 3 displays the frequency of the values in the data set.

Figure 2
As shown in the example, Benford's Law can detect a fraudulent distribution of data in financial statements. However, in the analysis I will investigate whether it also applies to other sets of data.

Analysis
The Fibonacci sequence will be used as an example of how Benford's Law applies to a mathematical sequence. The Fibonacci sequence begins with the number zero, followed by one. The third term is the sum of the two previous terms, 0 and 1, which equals 1. The following terms are formed in the same way, making the sequence: 0, 1, 1, 2, 3, 5, 8, 13, ... The Fibonacci sequence was used to test Benford's Law in naturally occurring sequences by calculating the first 200 numbers in the sequence using Wolfram Alpha, a computational knowledge engine that can be found on the internet. I then took the numbers from the sequence and counted how many of them began with each of the digits 1-9 respectively. I calculated the frequencies using my Graphic Display Calculator. Table 1 displays the frequencies of the first digits in the data set.

First digit  Number of occurrences  Frequency
1            60                     30.0%
2            36                     18.0%
3            25                     12.5%
4            18                     9.0%
5            17                     8.5%
6            12                     6.0%
7            11                     5.5%
8            12                     6.0%
9            9                      4.5%
Total        200                    100%
Table 1

Figure 3 plots the frequencies against the graph of $P(d) = \log_{10}(1 + 1/d)$ to examine the correlation between the two.

Figure 3
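The tally in Table 1 can be reproduced with a short Python sketch. This is an illustration rather than the method used in the investigation: it generates the sequence directly instead of taking it from Wolfram Alpha, it starts the sequence at 1, 1 as the appendix does, and the function names are hypothetical.

```python
from collections import Counter

def fibonacci(n):
    """Return the first n Fibonacci numbers, starting 1, 1, 2, ..."""
    terms, a, b = [], 1, 1
    for _ in range(n):
        terms.append(a)
        a, b = b, a + b
    return terms

def first_digit_counts(values):
    """Count how many values start with each digit 1 to 9."""
    tally = Counter(int(str(v)[0]) for v in values)
    return {d: tally.get(d, 0) for d in range(1, 10)}

counts = first_digit_counts(fibonacci(200))
total = sum(counts.values())

# Print digit, number of occurrences, and relative frequency.
for d, c in counts.items():
    print(f"{d}: {c:3d}  {c / total:.1%}")
```

Because Python integers have arbitrary precision, the first digit of every term is read exactly, with no rounding of the large terms. Run on the first 200 terms, the sketch counts 60 terms beginning with 1 and 36 beginning with 2, in line with Table 1.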
The data from the first 200 numbers of the Fibonacci sequence followed Benford's Law very closely, which supports its usefulness in describing the distribution of first digits in natural mathematical sequences. I then continued the investigation with a set of data that had not been tested against Benford's Law. The data in question are from the Namibia Statistics Agency (NSA), in a report named Monthly Building Report: January 2015. The report contains the indices on Buildings Completed in Windhoek, Swakopmund, Walvis Bay, and Ongwediva in Namibia for the period January 2010 to December 2014. It is not disclosed what values were used to calculate the indices. The report also contains a composite index calculated from the four towns. The values used in this section of the mathematical investigation are those of the composite index.
The composite index values are shown in Table 2.

Month  2010   2011   2012   2013   2014
Jan    47.9   63.6   31.6   102.9  69.7
Feb    123.5  68.4   162.7  119.5  145.1
Mar    83.7   85.8   61.2   176.8  96.2
Apr    85     179.5  221.6  124.5  61
May    99.5   76.3   154.6  159.6  135.5
Jun    125.6  160.7  259.1  139.7  104.7
Jul    165.3  125.7  120.1  87.5   316.4
Aug    72.7   171.6  122.1  137.3  153.7
Sep    120.7  237.7  120.1  70.2   134.8
Oct    97.4   53.2   253.9  108.4  277.2
Nov    95.2   102.8  131.3  90.6   108.8
Dec    83.6   100.1  83.9   80.3   107.5
Table 2

The tally was performed once again, this time on the composite indices. Table 3 shows the frequency of each of the digits 1-9 as the first digit of the indices.

First digit  Number of occurrences  Frequency
1            31                     51.7%
2            5                      8.3%
3            1                      1.7%
4            1                      1.7%
5            2                      3.3%
6            5                      8.3%
7            3                      5.0%
8            7                      11.7%
9            5                      8.3%
Total        60                     100%

Table 3
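One way to put a number on how closely each tally follows Benford's Law is Pearson's chi-square statistic. This goes beyond the visual comparison used in the investigation and is offered only as a sketch: the counts are copied from Table 1 and Table 3, the expected counts assume the form P(d) = log10(1 + 1/d), and the function name is hypothetical.

```python
import math

# Observed first-digit counts for digits 1 to 9, copied from Table 1 and Table 3.
fibonacci_counts = [60, 36, 25, 18, 17, 12, 11, 12, 9]
building_index_counts = [31, 5, 1, 1, 2, 5, 3, 7, 5]

def benford_chi_square(observed):
    """Pearson chi-square statistic of observed counts against Benford's Law."""
    n = sum(observed)
    statistic = 0.0
    for d, count in enumerate(observed, start=1):
        expected = n * math.log10(1 + 1 / d)  # expected count for first digit d
        statistic += (count - expected) ** 2 / expected
    return statistic

print(f"Fibonacci:      {benford_chi_square(fibonacci_counts):.2f}")
print(f"Building index: {benford_chi_square(building_index_counts):.2f}")
```

A larger statistic means a poorer fit. With nine digit classes there are 8 degrees of freedom, for which the usual 5% critical value is about 15.51; the Fibonacci tally falls far below that value and the building index tally well above it. With only 60 index values, several expected counts are below 5, so the chi-square approximation is rough for that data set.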
Figure 4 displays the frequencies against the graph of $P(d) = \log_{10}(1 + 1/d)$ to test the relationship between the two.

Figure 4

It was found that the values from the NSA did not follow Benford's Law as closely as the values from the Fibonacci sequence. This could be because the values were shaped by human intervention: they were not naturally recorded, and were compiled from different underlying values, and therefore did not correlate with the Law. This may imply that Benford's Law only works with data sets that contain naturally occurring or naturally recorded numbers. In addition, the data used in the investigation were given an index threshold of 300, which creates a statistical bias by neglecting any values over 300.

Conclusion
In conclusion, I found that Benford's Law does not apply to every set of data, particularly data over which humans had a great influence, such as the Namibia Statistics Agency Building Indices. In that case a maximum threshold had been set, which created a bias in the data set. Another possible cause is that the values in the Building Indices were compiled from different underlying values, the sources of which were not disclosed in the report. However, Benford's Law was found to apply to a data set composed of a natural sequence, the Fibonacci sequence. In addition, as the example of the Arizona Treasury manager's fraud case in 1993 shows, Benford's Law can also be used to detect fraud in financial data, because human interference leaves a detectable pattern in the first digits.
List of Sources
Benford, Frank. 1938. "The law of anomalous numbers." Proceedings of the American Philosophical Society 551-572.
Namibia Statistics Agency. 2015. "http://www.nsa.org.na/files/downloads/187_Building%20Plans.pdf." Namibia Statistics Agency. February 16. http://www.nsa.org.na/files/downloads/187_Building%20Plans.pdf.
Newcomb, Simon. 1881. "Note on the frequency of use of the different digits in natural numbers." American Journal of Mathematics 39-40.
Nigrini, Mark J. 1999. I've Got Your Number. May 1. http://www.journalofaccountancy.com/issues/1999/may/nigrini.
Weisstein, Eric W. n.d. "Benford's Law" -- from Wolfram MathWorld. http://mathworld.wolfram.com/BenfordsLaw.html.
Wolfram Alpha. 2015. first 200 fibonacci numbers - Wolfram|Alpha. February 20. http://www.wolframalpha.com/input/?i=first+200+fibonacci+numbers.
Appendix

Position  Number in sequence
1         1
2         1
3         2
4         3
5         5
6         8
7         13
8         21
9         34
10        55
11        89
12        144
13        233
14        377
15        610
16        987
17        1597
18        2584
19        4181
20        6765
21        10946
22        17711
23        28657
24        46368
25        75025
26        121393
27        196418
28        317811
29        514229
30        832040
31        1346269
32        2178309
33        3524578
34        5702887
35        9227465
36        14930352
37        24157817
38        39088169
39        63245986
40        102334155
41        165580141
42        267914296
43        433494437
44        701408733
45        1134903170
46        1836311903
47        2971215073
48        4807526976
49        7778742049
50        12586269025
51        20365011074
52        32951280099
53        53316291173
54        86267571272
55        139583862445
56        225851433717
57        365435296162
58        591286729879
59        956722026041
60        1548008755920
61        2504730781961
62        4052739537881
63        6557470319842
64        10610209857723
65        17167680177565
66        27777890035288
67        44945570212853
68        72723460248141
69        117669030460994
70        190392490709135
71        308061521170129
72        498454011879264
73        806515533049393
74        1304969544928660
75        2111485077978050
76        3416454622906710
77        5527939700884760
78        8944394323791460
79        14472334024676200
80        23416728348467700
81        37889062373143900
82        61305790721611600
83        99194853094755500
84        160500643816367000
85        259695496911123000
86        420196140727490000
87        679891637638612000
88        1100087778366100000
89        1779979416004710000
90        2880067194370820000
91        4660046610375530000
92        7540113804746350000
93        12200160415121900000
94        19740274219868200000
95        31940434634990100000
96        51680708854858300000
97        83621143489848400000
98        135301852344707000000
99        218922995834555000000
100       354224848179262000000
101       573147844013817000000
102       927372692193079000000
103       1500520536206900000000
104       2427893228399980000000
105       3928413764606870000000
106       6356306993006850000000
107       10284720757613700000000
108       16641027750620600000000
109       26925748508234300000000
110       43566776258854900000000
111       70492524767089100000000
112       114059301025944000000000
113       184551825793033000000000
114       298611126818977000000000
115       483162952612010000000000
116       781774079430987000000000
117       1264937032043000000000000
118       2046711111473990000000000
119       3311648143516980000000000
120       5358359254990970000000000
121       8670007398507950000000000
122       14028366653498900000000000
123       22698374052006900000000000
124       36726740705505800000000000
125       59425114757512700000000000
126       96151855463018400000000000
127       155576970220531000000000000
128       251728825683550000000000000
129       407305795904081000000000000
130       659034621587630000000000000
131       1066340417491710000000000000
132       1725375039079340000000000000
133       2791715456571050000000000000
134       4517090495650390000000000000
135       7308805952221450000000000000
136       11825896447871800000000000000
137       19134702400093300000000000000
138       30960598847965100000000000000
139       50095301248058400000000000000
140       81055900096023500000000000000
141       131151201344082000000000000000
142       212207101440105000000000000000
143       343358302784187000000000000000
144       555565404224293000000000000000
145       898923707008480000000000000000
146       1454489111232770000000000000000
147       2353412818241250000000000000000
148       3807901929474030000000000000000
149       6161314747715280000000000000000
150       9969216677189300000000000000000
151       16130531424904600000000000000000
152       26099748102093900000000000000000
153       42230279526998500000000000000000
154       68330027629092400000000000000000
155       110560307156091000000000000000000
156       178890334785183000000000000000000
157       289450641941274000000000000000000
158       468340976726457000000000000000000
159       757791618667731000000000000000000
160       1226132595394190000000000000000000
161       1983924214061920000000000000000000
162       3210056809456110000000000000000000
163       5193981023518030000000000000000000
164       8404037832974140000000000000000000
165       13598018856492200000000000000000000
166       22002056689466300000000000000000000
167       35600075545958500000000000000000000
168       57602132235424800000000000000000000
169       93202207781383200000000000000000000
170       150804340016808000000000000000000000
171       244006547798191000000000000000000000
172       394810887814999000000000000000000000
173       638817435613191000000000000000000000
174       1033628323428190000000000000000000000
175       1672445759041380000000000000000000000
176       2706074082469570000000000000000000000
177       4378519841510950000000000000000000000
178       7084593923980520000000000000000000000
179       11463113765491500000000000000000000000
180       18547707689472000000000000000000000000
181       30010821454963500000000000000000000000
182       48558529144435400000000000000000000000
183       78569350599398900000000000000000000000
184       127127879743834000000000000000000000000
185       205697230343233000000000000000000000000
186       332825110087068000000000000000000000000
187       538522340430301000000000000000000000000
188       871347450517368000000000000000000000000
189       1409869790947670000000000000000000000000
190       2281217241465040000000000000000000000000
191       3691087032412710000000000000000000000000
192       5972304273877740000000000000000000000000
193       9663391306290450000000000000000000000000
194       15635695580168200000000000000000000000000
195       25299086886458700000000000000000000000000
196       40934782466626800000000000000000000000000
197       66233869353085500000000000000000000000000
198       107168651819712000000000000000000000000000
199       173402521172798000000000000000000000000000
200       280571172992510000000000000000000000000000