Probability and Statistics


Probability and statistics are two separate academic disciplines that are usually studied together. Statistical analysis often uses probability distributions, while probability theory is the more mathematical of the two. However, some topics in statistics are independent of probability.

Statistics is the study of the collection, organization, analysis, interpretation and presentation of data. Descriptive statistics involves methods of organizing, picturing and summarizing information from data. Inferential statistics involves methods of using information from a sample to draw conclusions about the population.

Probability is a measure of how likely it is that an event will happen or that a statement is true. The higher the probability, the more likely the event is to happen. Many events cannot be predicted with total certainty.
The value of a probability always lies in the range 0 to 1.

Probability Topics

Given below are the important topics in probability.

Conditional Independence:
Two random variables A and B are conditionally independent given C if
P(A, B | C) = P(A | C) $\times$ P(B | C)

Here P(A | C) denotes the probability of A given C.
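
The definition can be checked numerically. The sketch below is only an illustration: the binary events and all the probabilities in it are made-up assumptions, chosen so that A and B are independent once C is known. The helper names `bern` and `conditionally_independent` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical model over binary A, B, C (all numbers are illustrative).
p_c = {0: Fraction(1, 2), 1: Fraction(1, 2)}          # P(C = c)
p_a_given_c = {0: Fraction(1, 4), 1: Fraction(3, 4)}  # P(A = 1 | C = c)
p_b_given_c = {0: Fraction(1, 3), 1: Fraction(2, 3)}  # P(B = 1 | C = c)

def bern(p, value):
    """Probability that a 0/1 variable with P(1) = p takes the given value."""
    return p if value == 1 else 1 - p

# Joint distribution P(A = a, B = b, C = c), keyed by (a, b, c).
joint = {
    (a, b, c): p_c[c] * bern(p_a_given_c[c], a) * bern(p_b_given_c[c], b)
    for a, b, c in product((0, 1), repeat=3)
}

def conditionally_independent(joint):
    """Check P(A, B | C) = P(A | C) * P(B | C) for every a, b, c."""
    for c in (0, 1):
        pc = sum(p for (_, _, cc), p in joint.items() if cc == c)
        for a, b in product((0, 1), repeat=2):
            p_ab = joint[(a, b, c)] / pc
            p_a = sum(joint[(a, bb, c)] for bb in (0, 1)) / pc
            p_b = sum(joint[(aa, b, c)] for aa in (0, 1)) / pc
            if p_ab != p_a * p_b:
                return False
    return True

print(conditionally_independent(joint))  # True
```

In this made-up model A and B are not independent when C is ignored: P(A = 1, B = 1) is 7/24, while P(A = 1) $\times$ P(B = 1) is 1/4.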

Compound Event: An event that combines two or more simple events.

Conditional Probability:
Probability that one event will occur given that another has happened.

Event: An event is a subset of the sample space.

Simple Event: An event that cannot be broken down any further.

Probability Experiment: Action for which an outcome or measurement is obtained.

Kurtosis: A measure of the peakedness, and more precisely of the heaviness of the tails, of the probability distribution of a real valued random variable.

Sample: A subset of a population.

Sample Space: Set of all possible outcomes in an experiment.

Mutually Exclusive: Two events are mutually exclusive when the occurrence of one prevents the other from occurring during the same trial.

Exhaustive Events: Events are said to be exhaustive when at least one of them must occur.

Outcome: A possible result of a probability experiment is called an outcome.

Random Variable: A variable whose value is subject to variations due to chance.

Impossible Event: An event which has zero probability of occurring.

Probability Mass Function:
Probability mass function is a function that gives the probability that a discrete random variable is exactly equal to some value.

Pairwise Independent: A collection of random variables is pairwise independent if any two of them are independent.

Skewness: Measure of the extent to which a probability distribution of a real valued random variable leans to one side of the mean.

Joint Probability:
Probability that both events A and B will occur.
P(A and B) = P(A) $\times$ P(B | A)
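
As a minimal sketch of this multiplication rule, assume two cards are drawn without replacement from a standard 52-card deck. The scenario is an illustration chosen here, with A the event that the first card is an ace and B the event that the second card is an ace.

```python
from fractions import Fraction

# Illustrative assumption: two cards drawn without replacement from 52 cards.
p_a = Fraction(4, 52)          # P(A): first card is an ace
p_b_given_a = Fraction(3, 51)  # P(B | A): second is an ace, given the first was

# Multiplication rule: P(A and B) = P(A) * P(B | A)
p_a_and_b = p_a * p_b_given_a
print(p_a_and_b)  # 1/221
```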

Relative Frequency: An estimate of probability determined by observation or experiment.
P(E) = $\frac{\text{Number of favorable outcomes}}{\text{Total number of trials}}$
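
The relative frequency formula can be sketched in a few lines. The list of rolls below is made-up data for illustration only, and `relative_frequency` is a hypothetical helper name.

```python
# Hypothetical record of 20 die rolls (made-up data, not from an experiment).
rolls = [3, 6, 1, 4, 2, 5, 6, 2, 3, 4, 1, 6, 5, 2, 4, 3, 6, 1, 2, 5]

def relative_frequency(trials, is_favorable):
    """Number of favorable trials divided by the total number of trials."""
    favorable = sum(1 for outcome in trials if is_favorable(outcome))
    return favorable / len(trials)

# Estimate P(even) from the recorded rolls: 11 of the 20 rolls are even.
p_even = relative_frequency(rolls, lambda x: x % 2 == 0)
print(p_even)  # 0.55
```

With more trials, an estimate of this kind would be expected to move closer to the theoretical value of 0.5 for a fair die.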

Probability Problems

Given below are some of the example problems in probability.

Solved Examples

Question 1: When a die is rolled, what is the probability of obtaining an even number?
When a die is rolled, the possible outcomes are 1, 2, 3, 4, 5 and 6.

The probability of each outcome is:

P(1) = $\frac{\text{Number of ways to roll a one}}{\text{Total number of sides}}$ = $\frac{1}{6}$

P(2) = $\frac{\text{Number of ways to roll a two}}{\text{Total number of sides}}$ = $\frac{1}{6}$

P(3) = $\frac{\text{Number of ways to roll a three}}{\text{Total number of sides}}$ = $\frac{1}{6}$

P(4) = $\frac{\text{Number of ways to roll a four}}{\text{Total number of sides}}$ = $\frac{1}{6}$

P(5) = $\frac{\text{Number of ways to roll a five}}{\text{Total number of sides}}$ = $\frac{1}{6}$

P(6) =  $\frac{\text{Number of ways to roll a six}}{\text{Total number of sides}}$ = $\frac{1}{6}$

P(Even) = $\frac{\text{Number of ways to roll an even number}}{\text{Total number of sides}}$ = $\frac{3}{6}$ = $\frac{1}{2}$
Therefore, the probability of obtaining an even number is $\frac{1}{2}$.
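
The same count can be written as a short sketch that lists the sample space and the favorable outcomes. It assumes a fair six-sided die, so every outcome is equally likely.

```python
from fractions import Fraction

sample_space = [1, 2, 3, 4, 5, 6]                # outcomes of one roll
even = [x for x in sample_space if x % 2 == 0]   # favorable outcomes: 2, 4, 6

# Equally likely outcomes: P(E) = favorable outcomes / total outcomes
p_even = Fraction(len(even), len(sample_space))
print(p_even)  # 1/2
```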

Question 2: Probability that a student takes mathematics and statistics is 0.09 and probability that a student takes statistics is 0.72. What is the probability that a student takes mathematics given that the student is taking statistics?
We need to use conditional probability.
Let A be the event that a student takes mathematics.
Let B be the event that a student takes statistics.

P(A | B) = $\frac{P(A \cap B)}{P(B)}$

P(Mathematics | Statistics) = $\frac{P(\text{Mathematics and Statistics})}{P(\text{Statistics})}$

= $\frac{0.09}{0.72}$

= 0.125
Therefore, the probability that a student takes mathematics given that the student is taking statistics is 0.125.
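
The calculation can be reproduced in a few lines. The variable names are illustrative; the two input values come from the question.

```python
p_math_and_stats = 0.09  # P(A and B), given in the question
p_stats = 0.72           # P(B), given in the question

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_math_given_stats = p_math_and_stats / p_stats
print(round(p_math_given_stats, 3))  # 0.125
```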

Statistics Topics

Given below are some of the terms commonly used in statistics:

ANOVA (Analysis of Variance): Collection of statistical models, and their associated procedures, used to analyze the differences between group means.

Average: Sum of a list of numbers divided by the size of the list.

Bar Chart:
A chart with rectangular bars whose lengths are proportional to the values they represent. It shows comparisons among categories.

Bernoulli Trial: An experiment whose outcome is random and can be either of the two possible outcomes, success and failure.

Binomial distribution: A discrete probability distribution of the number of successes in a sequence of n independent trials each of which yields success with probability p and failure with probability 1 - p.
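
For reference, the probability of exactly k successes in a binomial distribution is the standard formula $P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}$. The sketch below evaluates it for illustrative values (10 tosses of a fair coin, exactly 3 heads); `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) when X counts successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Illustrative values: 10 fair coin tosses, exactly 3 heads.
prob = binomial_pmf(3, 10, 0.5)
print(prob)  # 0.1171875
```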

Correlation: Measures the degree to which two variables vary together.

Census: Procedure of systematically acquiring and recording information about the members of a given population.

Contingency Table:
A type of table in matrix form that displays the frequency distribution of the variables.

Deviation: Difference between the value of an observation and the mean of the population.

Dot plot: A statistical chart consisting of data points plotted on a simple scale representing quantitative values.

Data collection: A process of preparing and collecting data.

Estimator: Rule for calculating an estimate of a given quantity based on observed data.

Expected Value: Measure of central tendency for a random variable, equal to the probability-weighted average of the values it can take.
$E(X) = \sum x \cdot P(x)$
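
As a small worked instance of this sum, assume a fair six-sided die, so each face has probability 1/6. The die is an illustration chosen here.

```python
from fractions import Fraction

# Probability mass function of a fair die: each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Expected value: sum of x * P(x) over all possible values x.
expected_value = sum(x * p for x, p in pmf.items())
print(expected_value)  # 7/2
```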

Forecasting: A process of making statements about events whose outcomes have not been observed.

Goodness of Fit: For a statistical model, it describes how well it fits a set of observations. Summarizes the discrepancy between observed values and the expected values.

Grouped Data: A raw data set organized by constructing a table showing the frequency distribution.

Histogram: A graphical representation of the distribution of data. It gives an estimate of the probability distribution of a continuous variable.

Inference: A process of deriving logical conclusions from known premises.

Likert Scale:
A psychometric scale commonly used in research that employs questionnaires.

Lorenz Curve: Graphical representation of the cumulative distribution function representing proportion of the distribution.

Multiple Correlation: Measure of how well a given variable can be predicted using a linear function of a set of other variables.

Non Parametric Test: A test that does not rely on the assumption that the data are drawn from a given probability distribution.

Outlier: An observation that is numerically distant from the rest of the data.

Ogive: A continuous frequency curve that represents the cumulative frequencies for the classes in a frequency distribution.

Pie Chart: A circular chart divided into sectors illustrating numerical proportion.

Random Sample: Subset of individuals chosen at random from a larger set.

Regression: A statistical technique for estimating the relationships among variables.

Scatter Plot: A mathematical diagram using Cartesian coordinates to display values for two variables for a set of data.

Time Series: Sequence of data points measured typically at successive points in time spaced at uniform time intervals.

Variance: Measure of how far a set of numbers is spread out from its mean.

Z Statistic: The number of standard deviations by which an observation lies above the mean. It is negative when the observation lies below the mean.
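
Variance and the Z statistic can be illustrated together with Python's standard `statistics` module. The data set below is made up for illustration and is treated as a whole population, which is why the population functions `pvariance` and `pstdev` are used.

```python
from statistics import mean, pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations, treated as a population

mu = mean(data)             # mean of the data: 5
variance = pvariance(data)  # mean squared deviation from mu: 4
sigma = pstdev(data)        # standard deviation, the square root of the variance: 2

# Z statistic of the observation 9: how many standard deviations above the mean.
z = (9 - mu) / sigma
print(z)  # 2.0
```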

Statistics Problems

Given below are some of the problems in statistics.

Solved Examples

Question 1: From the table given below, construct a less than cumulative frequency table.

 Marks       Frequency
  0 - 10          6
 10 - 20          8
 20 - 30         10
 30 - 40         12
 40 - 50         15
 50 - 60         17
 60 - 70          4
 70 - 80          6
 80 - 90         15
 90 - 100        20

For each class, add its frequency to the frequencies of all the classes below it. The running total is the number of observations that are less than the upper limit of that class.
The less than cumulative frequency table for the given data is constructed below.

 Marks       Frequency      Less than cumulative frequency
  0 - 10          6                       6
 10 - 20          8                      14
 20 - 30         10                      24
 30 - 40         12                      36
 40 - 50         15                      51
 50 - 60         17                      68
 60 - 70          4                      72
 70 - 80          6                      78
 80 - 90         15                      93
 90 - 100        20                     113
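
The cumulative column is a running total of the frequencies, which can be sketched with `itertools.accumulate` from the Python standard library. The variable names are illustrative; the class intervals and frequencies are those of the question.

```python
from itertools import accumulate

classes = ["0 - 10", "10 - 20", "20 - 30", "30 - 40", "40 - 50",
           "50 - 60", "60 - 70", "70 - 80", "80 - 90", "90 - 100"]
frequencies = [6, 8, 10, 12, 15, 17, 4, 6, 15, 20]

# Running total of the class frequencies, starting from the lowest class.
less_than_cf = list(accumulate(frequencies))

for marks, f, cf in zip(classes, frequencies, less_than_cf):
    print(f"{marks:>8}  {f:>9}  {cf:>4}")
```

The last cumulative value, 113, equals the total number of observations.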

Question 2: Temperatures were recorded in a city for a period of one month and the data is given below. Find the range, mean and mode.
20.7, 20.6, 18, 25, 19, 20.6, 21, 17, 24, 30.2, 34, 29, 37, 35, 32, 18, 22, 24, 28, 20.6, 24, 31, 36, 19, 20.6

Solution: Range = Highest temperature - Lowest temperature
= 37 - 17
= 20

Mean = $\frac{\sum_{i=1}^{n}x_{i}}{n}$, where n = 25 is the number of readings listed
= $\frac{(20.7 + 20.6 + 18 + 25 + 19 + 20.6 + 21 + 17 + 24 + 30.2 + 34 + 29 + 37 + 35 + 32 + 18 + 22 + 24 + 28 + 20.6 + 24 + 31 + 36 + 19 + 20.6)}{25}$
= $\frac{626.3}{25}$
= 25.052

Mode = Value that occurs most often in a data set = 20.6
Therefore, range = 20, mean = 25.052 and mode = 20.6.
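
These results can be checked with Python's standard `statistics` module. The list below holds the 25 readings from the question; the variable names are illustrative.

```python
from statistics import mean, mode

temperatures = [20.7, 20.6, 18, 25, 19, 20.6, 21, 17, 24, 30.2, 34, 29, 37,
                35, 32, 18, 22, 24, 28, 20.6, 24, 31, 36, 19, 20.6]

data_range = max(temperatures) - min(temperatures)  # highest minus lowest
data_mean = mean(temperatures)                      # sum of readings / number of readings
data_mode = mode(temperatures)                      # most frequent reading

print(len(temperatures))    # 25
print(data_range)           # 20
print(round(data_mean, 3))  # 25.052
print(data_mode)            # 20.6
```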