Please read the first post of this series before this one
Learning From Numbers To Generate New Knowledge - Part 1
2. Role of statistics in different activities
Statistics is not a subject like physics, chemistry or biology. A physicist solves a problem in physics using his knowledge of physics. A chemist solves a problem in chemistry using his knowledge of chemistry, and so on. But there is no problem in statistics which we solve by using our knowledge of statistics alone. Essentially, a statistician helps in solving problems posed by others, arising in their fields of study. All investigations in science or other activities start with formulating a problem, generating relevant data, processing it, and extracting information to throw light on the problem posed. All of these steps need special skills in which a statistician is trained.
2.1 Scientific research
“Scientific laws are not advanced by the principle of authority or justified by
faith or medieval philosophy; statistics is the only court of appeal to new
knowledge.”
- P.C.Mahalanobis
A scientist proposes a theory to explain some natural phenomenon, and an experiment is needed to verify the theory. How should the experiment be designed to get the maximum information from the data generated, so as to estimate the accuracy of the theory? If the accuracy is not within acceptable limits, can the data generated from the experiment enable us to suggest improvements in the proposed theory, or to propose a new theory? The new theory can then be tested by further experimentation. These questions can be answered with statistical help, using the design of experiments developed by R.A. Fisher. Emphasizing the need to consult a statistician before the experiment is conducted, Fisher said:
“You get 10 times more information from a carefully designed experiment. To consult a
statistician after the experiment is finished is often to merely ask him to conduct a
postmortem examination. He can only say what the experiment died of”.
Statistics supports the scientist in two ways: through the collection of relevant data by optimally designed experiments, and through appropriate data analysis to test hypotheses based on the proposed theory and to provide clues for improving the theory or for possible alternatives. This gives the scientist full play for his creative imagination to discover new phenomena or suggest improvements in the proposed theory. Science advances through the following endless process:
... - Theory - Experiment - Statistical assessment of experimental results - New theory - ...
2.2 Statistics as an investigative technology
“Statistics is the technology of finding the invisible and measuring the immeasurable”.
2.2.1 Measure the immeasurable
For instance, narcissism, a personality disorder, is hard to measure directly. However, we can measure a large number of other characteristics of a person which are affected by this disorder. Statistical methodology enables us to treat narcissism as a latent variable, connect it to the measurable characteristics through a structural equation model, and estimate it.
2.2.2 Classification or discrimination
There was a policy in the US military, known as “Don’t ask, don’t tell”, under which a person being recruited to the army was not asked about his homosexuality and did not disclose it. However, a sample of urine of the person can be obtained and tested for the amounts of androgen and estrogen. It is seen from the two-dimensional chart of the measurements obtained from sets of individuals whose sexual orientation was known that the homosexual and heterosexual persons are in two different regions, separated by a line, apart from a few exceptions. By plotting the point for any particular individual, his sexual orientation can be inferred with a high degree of success based on the region in which his measurements fall.
This method, known in statistics as discriminant analysis, was developed by R.A. Fisher and perfected by various authors, and has been a powerful tool in such problems. For instance, it can be used in medical diagnosis to determine, from a number of diagnostic tests, which of several possible diseases a patient is suffering from, in detecting whether a currency note is forged, and in numerous other situations.
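To make the idea of a separating line concrete, here is a minimal sketch of Fisher's linear discriminant for two groups measured on two characteristics, written in plain Python. The banknote measurements, the group names and the function names are all hypothetical and chosen only for illustration; they are not data from the lecture.

```python
from statistics import mean


def fisher_discriminant(group_a, group_b):
    """Return (w, c): the direction w and cut-off c of Fisher's
    linear discriminant for two groups of 2-D points."""

    def centroid(points):
        return (mean(p[0] for p in points), mean(p[1] for p in points))

    def scatter(points, m):
        # Within-group sums of squares and cross-products.
        sxx = sum((x - m[0]) ** 2 for x, _ in points)
        syy = sum((y - m[1]) ** 2 for _, y in points)
        sxy = sum((x - m[0]) * (y - m[1]) for x, y in points)
        return sxx, sxy, syy

    ma, mb = centroid(group_a), centroid(group_b)
    axx, axy, ayy = scatter(group_a, ma)
    bxx, bxy, byy = scatter(group_b, mb)
    # Pooled within-group scatter matrix S = [[sxx, sxy], [sxy, syy]].
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # w = S^{-1} (ma - mb), using the closed-form inverse of a 2x2 matrix.
    w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)
    # Cut-off: projection of the midpoint between the two centroids.
    c = 0.5 * (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1]))
    return w, c


def classify(point, w, c, label_a="genuine", label_b="forged"):
    # Points projecting above the cut-off fall on group A's side of the line.
    score = w[0] * point[0] + w[1] * point[1]
    return label_a if score > c else label_b


# Hypothetical measurements (two characteristics per banknote).
genuine = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
forged = [(4.0, 5.0), (4.5, 4.8), (5.0, 5.5), (4.2, 5.2)]

w, c = fisher_discriminant(genuine, forged)
```

A new note is then assigned to a group with `classify((1.1, 2.0), w, c)`. The line `w[0]*x + w[1]*y = c` plays the role of the separating line on the two-dimensional chart; real applications would also estimate how often the rule misclassifies.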
2.3 Birth order and eminence
Scholarly interest in the relationship between birth order and extraordinary achievement can be traced to 1874, when Francis Galton published English Men of Science: Their Nature and Nurture. This book chronicled the lives of 180 eminent men from various fields. Galton was able to collect birth order data from 99 of his subjects, revealing that 48% of them were first-born sons or only sons. The percentages of the second and third born were very low. Interest in birth order and eminence has continued, and countless studies have confirmed Galton’s conclusion that the eminence achieved by a person, or his intelligence, depends on his birth order, the first born being more intelligent than the second, the second more intelligent than the third, and so on. The table gives results of intelligence tests conducted on children from families of different sizes, indicating the birth order effect on intelligence.
It would be of interest to investigate the causes of the birth order effect. It is believed that the first born gets more parental attention than the later born, and has a chance of growing up in the company of adults and learning from them. The second born similarly has more opportunities than the third, and so on.
2.4 Common breeding ground of eels
This is an example to show how learning from numbers led to an important discovery. In the early years of the last century, Johannes Schmidt, a scientist at the Carlsberg Laboratory, found that the numbers of vertebrae and fin rays of the same species of fish caught from different localities, often even from different parts of the same lake, varied considerably. With eels, however, in which the variation in the number of vertebrae is large, Schmidt found sensibly the same mean and the same standard deviation in samples drawn from all over Europe, from Iceland, from the Azores and from the Nile river, which are widely separated regions, about 1000 miles apart. He inferred that the eels of all these different river systems came from a common breeding ground in the ocean, which was discovered 50 years later in one of the expeditions of the research vessel “Dana”. Statistical theory was unknown when Schmidt made this discovery. Simple computations of the mean and standard deviation were the only tools used.
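The kind of comparison described above needs nothing more than a mean and a standard deviation per locality. The sketch below shows that computation in Python. The vertebra counts are invented for illustration and are not Schmidt's measurements, and `summarize` and `spread_of_means` are hypothetical helper names.

```python
from statistics import mean, stdev

# Hypothetical vertebra counts per locality (NOT Schmidt's data).
samples = {
    "Iceland": [114, 115, 116, 113, 115, 114, 116, 115],
    "Azores": [115, 114, 116, 115, 113, 116, 114, 116],
    "Nile": [113, 115, 116, 114, 115, 116, 115, 113],
}


def summarize(samples):
    """Return {locality: (mean, sample standard deviation)}."""
    return {place: (mean(counts), stdev(counts))
            for place, counts in samples.items()}


def spread_of_means(summary):
    """Largest difference between any two locality means."""
    means = [m for m, _ in summary.values()]
    return max(means) - min(means)


summary = summarize(samples)
for place, (m, s) in summary.items():
    print(f"{place}: mean = {m:.2f}, sd = {s:.2f}")
```

If the spread of the means is small compared with the standard deviation within each locality, the samples are consistent with a single common population, which is the pattern Schmidt observed for eels. A formal comparison would use a test such as analysis of variance, which is not shown here.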
2.5 Mournful numbers
Newspapers, magazines and other news media continuously make us aware of the good and deleterious effects of our dietary, exercise, smoking and drinking habits, and of the stress in our profession and other daily activities. The following table gives information on the number of days lost or gained in one’s life due to various causes. The numbers may not be appropriate for specific individuals. However, they provide useful guidelines in making individual decisions.
2.6 The importance of being left handed
T.A. Davis, a professor at the Indian Statistical Institute, made several studies on coconut trees, which can be classified as left-handed or right-handed depending on the direction of the foliar spiral. By doing experiments he found that spirality is not genetically inherited and that left-handed trees yield 10% more coconuts than right-handed trees, a conclusion of economic importance. A recommendation was made to the Government of the state of Kerala to grow only the “leftists” to increase the production of nuts.
2.7 Chronobiology and appropriate time to take Vitamin C
Chronobiology is the study of changes in body chemistry during the day. Measurements made on the human body at different times of the day reveal some interesting facts. We are 1 cm taller in the morning than at the time we go to bed. The cortisol level is about 16mg/100 in the morning and it drops to 6mg/100 at bed time. The high cortisol level in the morning wakes you up and makes you more alert. Teachers want to teach in the morning because students are more attentive then, due to the high cortisol level. It was found that vitamin C is better absorbed if taken after a meal.
The examples given above show how numbers, whether generated through experiments or through normal transactions, provide us with knowledge or information to make optimal decisions in all our activities.
2.8 Facts before theory
“It is a capital mistake to theorize before one has data. Insensibly, one begins to
twist facts to suit theories instead of theories to suit facts.”
- Sherlock Holmes
Without good information, you won’t see things as they really are; you will see them
as you think they are.
“Aristotle maintained that women have fewer teeth than men; although he was
married twice, it never occurred to him to verify his statement by examining his
wives’ mouths.”
- Bertrand Russell
2.9 Computational stylistics
The total number of words in all the known works of Shakespeare is 884647, of which 31534 are distinct. Using a statistical method proposed by R.A. Fisher, it is estimated that Shakespeare probably knew about 35000 more words which he did not use in his writings. The total number of words Shakespeare knew is therefore about 66000, out of about 100000 words in the English language in his time. The question arises whether Shakespeare wrote all the plays attributed to him or whether he had co-authors. Statistical methods known as computational stylistics provide answers to questions of this kind. Comparing the styles in terms of rhetorical devices, polysyllabic words and metrical habits, the following possibilities have been mentioned in the book “Shakespeare, Co-Author” by Brian Vickers.
George Peele wrote a third of Titus Andronicus; Thomas Middleton, two-fifths of Timon of Athens; George Wilkins, two of the five acts of Pericles; and John Fletcher, more than half of Henry VIII and The Two Noble Kinsmen.
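The vocabulary estimate quoted in this section belongs to the family of “unseen species” problems: from how often each observed word occurs, estimate how many new words would turn up in further text. As a rough illustration only, the sketch below implements the Good-Toulmin estimator, a related nonparametric method and not necessarily the exact procedure the lecture refers to. The function name and the toy sentence are assumptions made for the example.

```python
from collections import Counter


def good_toulmin(tokens, t=1.0):
    """Estimate how many NEW distinct words would appear in a further
    sample t times as large as the observed one (intended for 0 < t <= 1).

    Uses the Good-Toulmin series: sum over k of (-1)^(k+1) * t^k * n_k,
    where n_k is the number of distinct words seen exactly k times.
    """
    word_counts = Counter(tokens)                    # word -> occurrences
    freq_of_freq = Counter(word_counts.values())     # k -> n_k
    return sum((-1) ** (k + 1) * t ** k * n_k
               for k, n_k in freq_of_freq.items())


# Toy corpus: six words occur once, two words ("to", "be") occur twice.
tokens = "to be or not to be that is the question".split()
estimate = good_toulmin(tokens)   # n_1 - n_2 = 6 - 2
```

Words seen only once carry most of the information: a text full of such words suggests a large unused vocabulary. Extrapolating far beyond the observed sample, as any estimate of a writer's total vocabulary must, requires additional modelling assumptions that this sketch does not include.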
End of Part 2
I will be posting the concluding part of the lecture in my next post - Archana




