Descriptive statistics is an important branch of statistics that covers the methods required for collecting, organizing, and summarizing data. Its objective is to produce summarized data sets. It differs from inductive (inferential) statistics, where data is used to gain knowledge about the population, and, unlike inferential statistics, it is not built on the theory of probability.

The collection of data in statistics follows certain steps:

1. Before the collection starts: the team collecting the data should be clear about its aims, goals, target, and the methods to be applied.

2. Main collection: the team carries out the main part of the data collection.

3. Post collection: the data is sorted and presented.

The steps before and after the collection are just as important as the main collection process, since poor sampling or poor handling of the data in these steps can harm the output of the whole data collection.

There are three ways to classify the collection of data in statistics:

1. Quantitative and qualitative methods in statistics data collection

2. Sample and census data collection method in statistics

3. Primary and secondary data collection methods in statistics

Quantitative data collection in statistics is a method based on data sampling. For instance, if we need numerical data about a population, we use quantitative data collection.

This method includes:

1. Experiment on the raw data

2. Observation

3. Getting result

4. Sorting and analysis

Qualitative data collection in statistics is mainly used to understand the findings of quantitative analysis more deeply. It improves the quality of the results found by the quantitative method.

This method includes:

1. Fewer rules

2. The team relies on face-to-face interviews

3. Findings are confirmed through several data collection methods

4. The observation is not confined to any specific area of interest; rather, it can be used to identify general patterns.

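Sample data and census data are collected by the sample method and the census method, respectively. A census covers every unit of the population, while a sample covers only a selected subset of it.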
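The difference between the two methods can be sketched in a few lines of Python. The population below is made-up data generated only for illustration, and the sample size of 50 is an arbitrary choice, not something prescribed by the text.

```python
import random
from statistics import mean

# Hypothetical population: ages of 1,000 people (made-up data).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.randint(18, 80) for _ in range(1000)]

# Census method: measure every unit in the population.
census_mean = mean(population)

# Sample method: measure only a randomly chosen subset (here 50 units).
sample = rng.sample(population, 50)
sample_mean = mean(sample)

print(f"census mean: {census_mean:.2f}")
print(f"sample mean: {sample_mean:.2f} (from {len(sample)} of {len(population)} units)")
```

The sample mean will usually be close to, but not equal to, the census mean; the gap is the price paid for measuring far fewer units.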

Primary data is raw data collected first-hand through basic methods such as interviews and questionnaires.

Secondary data is data obtained from primary data that has already been collected, and it is used for further analysis. This was a brief description of data collection in statistics.

For certain research studies, we need to collect data in order to reach conclusions. We must ensure that the data collected is accurate, as incorrect data leads to misleading results and wrong conclusions. Sometimes the data is collected from primary sources by the researchers themselves, according to the needs of the research, though this method of data collection is costly and more time consuming.

There are different methods of data collection. A data collection method refers to the way or mode of gathering data. The important data collection methods are (a) observation, (b) interviewing, (c) mail survey, (d) experimentation, (e) simulation, and (f) projective technique.

There are two methods of collecting data:

Quantitative method of collecting data: This method relies on random sampling and structured data collection. It produces results that are easy to summarize, compare, and generalize.

Different ways to collect quantitative data are:

1. Experiments and clinical trials.

2. Observing and recording well-defined events.

3. Questionnaires, and face-to-face and telephone interviews.

Interviews come in different types. In face-to-face structured interviews, a standard set of questions is asked. Telephone interviews, on the other hand, are less time consuming and less expensive. Computer Aided Personal Interviewing is a modern technique for conducting interviews that avoids the time and money spent on transportation. However, this method can be expensive to set up and requires interviewers who are skilled at operating a computer.

Now we look at the questionnaires of different types:

1. Paper-pencil questionnaire: This type of questionnaire can be sent to a large number of people and costs very little, so it saves the researchers time and money. People tend to be more honest and truthful when answering these questionnaires, so a truer picture emerges, but many recipients are casual about them and do not fill in the forms, which are then simply wasted.

2. Web-based questionnaire: This is a newer means of Internet-based research. The respondent receives an e-mail with a link to a site that contains a form to be filled in. This type of research is easy and fast to conduct, but it has the drawback that people who do not have access to a computer and the internet cannot be included.

Qualitative collection of data: This type of collecting method is characterized by the following attributes:

1. In-depth interviews, conducted individually, which help to collect detailed information about the candidates.

2. Observation: the research is conducted and, through observation, detailed information and qualitative output are collected.

3. Reviewing documents, from which qualitative data can be collected.

Basic terms in statistics include the arithmetic mean, median, mode, geometric mean, and harmonic mean.
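As a quick illustration of these five measures, the sketch below uses Python's standard `statistics` module (Python 3.8 or later is assumed for `geometric_mean`). The data values are made up for the example.

```python
from statistics import mean, median, mode, geometric_mean, harmonic_mean

data = [2, 4, 4, 8, 16]  # made-up positive values

print("arithmetic mean:", mean(data))           # sum of values / count
print("median:", median(data))                  # middle value after sorting
print("mode:", mode(data))                      # most frequent value
print("geometric mean:", geometric_mean(data))  # n-th root of the product
print("harmonic mean:", harmonic_mean(data))    # count / sum of reciprocals
```

For positive data the harmonic mean is never larger than the geometric mean, which in turn is never larger than the arithmetic mean.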

The basic formulas for the mean are explained below:

Mean of raw data:

If there are n quantities $x_1, x_2, x_3, \ldots, x_n$, then the mean is

$$\text{Mean} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x_i}{n}.$$
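A minimal sketch of the raw-data mean in Python follows. The function name `mean_raw` and the sample marks are hypothetical, chosen only for this example.

```python
def mean_raw(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


marks = [12, 15, 11, 18, 14]  # made-up observations x_1 ... x_5
print(mean_raw(marks))  # (12 + 15 + 11 + 18 + 14) / 5 = 70 / 5 = 14.0
```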

Mean for a frequency distribution (direct method):

$$\text{Mean} = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + \cdots + f_n x_n}{\sum f_i} = \frac{\sum f_i x_i}{\sum f_i},$$

where $x$ is the variable, $f_i$ stands for the frequency of $x_i$, and $\sum f_i$ stands for the total frequency.
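The direct method can be sketched as follows. The function name `mean_frequency` and the frequency table are assumptions made for illustration.

```python
def mean_frequency(values, freqs):
    """Direct method: sum(f_i * x_i) / sum(f_i)."""
    if len(values) != len(freqs):
        raise ValueError("each value needs exactly one frequency")
    total = sum(freqs)
    if total == 0:
        raise ValueError("the total frequency must be positive")
    return sum(f * x for x, f in zip(values, freqs)) / total


# Made-up frequency table: value x_i occurs f_i times.
x = [1, 2, 3, 4, 5]
f = [2, 3, 5, 6, 4]

# sum(f_i * x_i) = 2 + 6 + 15 + 24 + 20 = 67, and sum(f_i) = 20
print(mean_frequency(x, f))  # 67 / 20 = 3.35
```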

Mean by shortcut method:

$$\text{Mean} = a + \frac{\sum f_i d_i}{\sum f_i}, \qquad d_i = x_i - a,$$

where $a$ is the assumed mean.
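The shortcut method works with deviations from an assumed mean, which keeps the numbers small when calculating by hand. Here is a sketch under the same made-up frequency table as one might use for the direct method; the function name `mean_shortcut` is hypothetical.

```python
def mean_shortcut(values, freqs, a):
    """Shortcut method: a + sum(f_i * d_i) / sum(f_i), with d_i = x_i - a."""
    if len(values) != len(freqs):
        raise ValueError("each value needs exactly one frequency")
    total = sum(freqs)
    if total == 0:
        raise ValueError("the total frequency must be positive")
    deviations = [x - a for x in values]  # d_i = x_i - a
    return a + sum(f * d for f, d in zip(freqs, deviations)) / total


x = [1, 2, 3, 4, 5]  # made-up values
f = [2, 3, 5, 6, 4]  # made-up frequencies, total 20

# With a = 3: d = [-2, -1, 0, 1, 2], sum(f_i * d_i) = -4 - 3 + 0 + 6 + 8 = 7
print(mean_shortcut(x, f, a=3))  # 3 + 7 / 20, about 3.35
```

Any choice of the assumed mean gives the same result up to floating-point rounding; a value near the centre of the data simply makes the deviations smaller.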

A box plot (also known as a box and whisker plot) in statistics is a method of representing numerical data graphically. To draw a box and whisker plot, we first arrange the data in increasing order. Then we find the median of the given numbers. If the count of numbers is odd, the middle value after sorting is the median. If the count is even, the mean of the two middle values is the median. A box plot consists of 5 parts:

1. Smallest observation (minimum)

2. Lower quartile or Q1

3. Median

4. Upper quartile or Q3

5. Largest observation (maximum)
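The five parts can be computed with a short Python sketch. The median follows the odd/even rule described above. For the quartiles, this sketch takes the median of the lower and upper halves of the sorted data, which is one common convention among several and is an assumption here, not a rule stated in the text. The function names and the data are hypothetical.

```python
def median(sorted_values):
    """Middle value for an odd count, mean of the two middle values for an even count."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(values):
    """Return the minimum, Q1, median, Q3, and maximum of the data."""
    data = sorted(values)  # the data must be in increasing order first
    if len(data) < 2:
        raise ValueError("need at least two observations")
    half = len(data) // 2
    lower = data[:half]
    # For an odd count the median itself is left out of both halves.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return {
        "min": data[0],
        "q1": median(lower),
        "median": median(data),
        "q3": median(upper),
        "max": data[-1],
    }


scores = [7, 1, 15, 3, 8, 4, 12, 6, 10]  # made-up, unsorted observations
print(five_number_summary(scores))
```

Other quartile conventions, such as interpolating between neighbouring values, can give slightly different Q1 and Q3 for the same data.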

Box and whisker plots are mainly used to describe differences between populations without assuming any particular statistical distribution. Box plots are non-parametric, and the lower quartile and the upper quartile are the 25th and 75th percentiles of the data. The box therefore spans the middle 50% of the data, and the median is the 50th percentile.

The ends of the whiskers can be used to represent any of the following:

1. The minimum and the maximum of the whole data.

2. One standard deviation above and below the mean of the data.

3. A chosen pair of percentiles of the data.

Values that fall outside the whiskers, known as outliers, are plotted individually as a dot, circle, or star.
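The text does not say how far a value must lie from the box to count as an outlier. A widely used convention, assumed here, places fences 1.5 times the interquartile range beyond the quartiles. The sketch below uses `statistics.quantiles` with the `"inclusive"` method for the quartiles; the function name `find_outliers` and the data are hypothetical.

```python
from statistics import quantiles


def find_outliers(values, k=1.5):
    """Split values into (inside, outliers) using fences at k * IQR.

    k = 1.5 is a common convention, not something fixed by the text.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return inside, outliers


readings = [2, 4, 4, 5, 6, 7, 8, 9, 30]  # made-up data with one extreme value
inside, outliers = find_outliers(readings)
print("whiskers reach from", min(inside), "to", max(inside))
print("plotted as individual points:", outliers)
```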

There are two common types of box and whisker plots:

1. Variable width box and whisker plots.

2. Notched box and whisker plots.

Variable width box and whisker plots show the size of each group of data by making the width of the box proportional to the size of the group. Sometimes the width of the box is made proportional to the square root of the size of the group.

A notched box and whisker plot narrows the box around its median, and the notches give a rough guide to the difference between medians. The width of the notches is proportional to the interquartile range (IQR) and inversely proportional to the square root of the size of the data, n:

$$\text{median} \pm 1.58 \cdot \frac{\text{IQR}}{\sqrt{n}}$$
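The notch limits can be worked out with a small Python sketch. It assumes the notch is centred on the median and uses `statistics.quantiles` with the `"inclusive"` method for the quartiles, which is one convention among several; the function name `notch_interval` and the data are hypothetical.

```python
from math import sqrt
from statistics import median, quantiles


def notch_interval(values):
    """Return (lower, upper) notch limits: median -/+ 1.58 * IQR / sqrt(n)."""
    n = len(values)
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    half_width = 1.58 * (q3 - q1) / sqrt(n)
    m = median(values)
    return m - half_width, m + half_width


group = [2, 4, 4, 5, 6, 7, 8, 9, 10]  # made-up data: median 6, Q1 4, Q3 8, n 9
low, high = notch_interval(group)
print(f"notch from {low:.2f} to {high:.2f}")
```

When the notches of two boxes do not overlap, this is commonly read as a sign that the two medians differ.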

Advantages of box plots

1. They take up less space.

2. They are most useful for comparing data sets.

3. The width and the length of the box plots can be decided by the user.