Probability and statistics are two separate academic disciplines that are usually studied together. Statistical analysis relies on probability distributions, while probability theory is the more mathematical of the two. However, some topics in statistics are independent of probability.

**Conditional Independence:**

Two random variables A and B are conditionally independent given C if

$P(A, B \mid C) = P(A \mid C) \times P(B \mid C)$
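The definition can be checked numerically on a small example. The sketch below builds a hypothetical joint distribution over three binary variables (the probability values and the helper name `is_conditionally_independent` are illustrative assumptions, not taken from the text) and tests whether the product rule above holds for every combination of values.

```python
from itertools import product

# Hypothetical distributions, chosen so that A and B are independent once C is known.
p_c = {0: 0.4, 1: 0.6}
p_a_given_c = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}  # p_a_given_c[c][a]
p_b_given_c = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.2, 1: 0.8}}  # p_b_given_c[c][b]

# Joint distribution P(A, B, C) keyed by (a, b, c).
joint = {
    (a, b, c): p_c[c] * p_a_given_c[c][a] * p_b_given_c[c][b]
    for a, b, c in product((0, 1), repeat=3)
}


def is_conditionally_independent(joint, tol=1e-12):
    """Return True if P(A, B | C) == P(A | C) * P(B | C) for all values."""
    for c in (0, 1):
        # Marginal P(C = c).
        pc = sum(p for (_, _, cc), p in joint.items() if cc == c)
        for a, b in product((0, 1), repeat=2):
            p_ab = joint[(a, b, c)] / pc
            p_a = sum(joint[(a, bb, c)] for bb in (0, 1)) / pc
            p_b = sum(joint[(aa, b, c)] for aa in (0, 1)) / pc
            if abs(p_ab - p_a * p_b) > tol:
                return False
    return True


print(is_conditionally_independent(joint))
```

A distribution in which B always copies A would fail the same check, since knowing A then fixes B regardless of C.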

**Compound Event:** An event that combines two or more simple events.

**Conditional Probability:** Probability that one event will occur given that another has happened.

**Event:** An event is a subset of the sample space.

**Simple Event:** An event that cannot be broken down any further.

**Probability Experiment:** An action for which an outcome or measurement is obtained.

**Kurtosis:** A measure of the tail heaviness (often loosely described as peakedness) of the probability distribution of a real-valued random variable.

**Sample:** A subset of a population.

**Sample Space:** Set of all possible outcomes in an experiment.

**Mutually Exclusive:** The occurrence of one event in an experiment prevents the other event from occurring during the same trial.

**Exhaustive:** Events are said to be exhaustive when at least one of them must occur.

**Outcome:** A possible result of a probability experiment.

**Random Variable:** A variable whose value is subject to variations due to chance.

**Impossible Event:** An event which has zero probability of occurring.

**Probability Mass Function:** A function that gives the probability that a discrete random variable is exactly equal to some value.

**Pairwise Independent:** A pairwise independent collection of random variables is a set of random variables, any two of which are independent.

**Skewness:** Measure of the extent to which the probability distribution of a real-valued random variable leans to one side of the mean.

**Joint Probability:** Probability that both events A and B will occur.

$P(A \text{ and } B) = P(A) \times P(B \mid A)$

**Relative Frequency:** Probability estimated from observation or experiments.

P(E) = $\frac{\text{Number of favorable outcomes}}{\text{Total number of trials}}$
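As a rough illustration of the relative frequency formula, the sketch below simulates rolling a fair die and counts how often an even number appears. The function name `relative_frequency` and the fixed seed are assumptions made for the example; the estimate should settle near 0.5 as the number of trials grows.

```python
import random


def relative_frequency(trials, seed=0):
    """Estimate P(even) as favorable outcomes divided by total trials."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same result
    favorable = sum(1 for _ in range(trials) if rng.randint(1, 6) % 2 == 0)
    return favorable / trials


for n in (100, 10_000, 100_000):
    print(n, relative_frequency(n))
```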

Given below are some example problems in probability.

### Solved Examples

**Question 1:** When a die is rolled, what is the probability of obtaining an even number?

**Solution:**

When a fair die is rolled, the probability of each face is as follows.

P(1) = $\frac{\text{Number of ways to roll a one}}{\text{Total number of sides}}$ = $\frac{1}{6}$

P(2) = $\frac{\text{Number of ways to roll a two}}{\text{Total number of sides}}$ = $\frac{1}{6}$

P(3) = $\frac{\text{Number of ways to roll a three}}{\text{Total number of sides}}$ = $\frac{1}{6}$

P(4) = $\frac{\text{Number of ways to roll a four}}{\text{Total number of sides}}$ = $\frac{1}{6}$

P(5) = $\frac{\text{Number of ways to roll a five}}{\text{Total number of sides}}$ = $\frac{1}{6}$

P(6) = $\frac{\text{Number of ways to roll a six}}{\text{Total number of sides}}$ = $\frac{1}{6}$

P(Even) = $\frac{\text{Number of ways to roll an even number}}{\text{Total number of sides}}$ = $\frac{3}{6}$

The probability of obtaining an even number is $\frac{1}{2}$.
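The same count can be done by enumerating the sample space. This is a minimal sketch assuming a fair six-sided die; the variable names are illustrative, and `fractions.Fraction` is used so the result stays an exact fraction.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
sample_space = [1, 2, 3, 4, 5, 6]

# The event "an even number is rolled" is a subset of the sample space.
even = [face for face in sample_space if face % 2 == 0]

p_even = Fraction(len(even), len(sample_space))
print(p_even)  # 1/2
```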

**Question 2:** The probability that a student takes mathematics and statistics is 0.09, and the probability that a student takes statistics is 0.72. What is the probability that a student takes mathematics given that the student is taking statistics?

**Solution:**

Let A be the event that a student takes the mathematics course.

Let B be the event that a student takes the statistics course.

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

$P(\text{Mathematics} \mid \text{Statistics}) = \frac{P(\text{Mathematics and Statistics})}{P(\text{Statistics})}$

= $\frac{0.09}{0.72}$

= 0.125

Therefore, the probability that a student takes mathematics given that the student is taking statistics is 0.125.
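The calculation above can be reproduced in a few lines. This is a sketch only; the function name `conditional_probability` is an assumption made for the example.

```python
def conditional_probability(p_a_and_b, p_b):
    """Return P(A | B) = P(A and B) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive for P(A | B) to be defined")
    return p_a_and_b / p_b


# P(Mathematics and Statistics) = 0.09, P(Statistics) = 0.72
p_math_given_stats = conditional_probability(0.09, 0.72)
print(round(p_math_given_stats, 3))  # 0.125
```

Multiplying the result back by P(Statistics) recovers the joint probability 0.09, which is the multiplication rule for joint probability.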

**ANOVA (Analysis of Variance):** Collection of statistical models, and their associated procedures, used to analyze the differences between group means.

**Average:** Sum of a list of numbers divided by the size of the list.

**Bar Chart:** A chart with rectangular bars whose lengths are proportional to the values they represent. It shows comparisons among categories.

**Bernoulli Trial:** An experiment whose outcome is random and can be either of two possible outcomes, success and failure.

**Binomial Distribution:** A discrete probability distribution of the number of successes in a sequence of n independent trials, each of which yields success with probability p and failure with probability 1 - p.
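For reference, the probability of exactly k successes under this distribution is $P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}$. The sketch below evaluates that formula directly; the function name `binomial_pmf` and the coin-toss numbers are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Example: probability of exactly 5 heads in 10 tosses of a fair coin.
print(binomial_pmf(5, 10, 0.5))  # 0.24609375

# The probabilities over all possible k add up to 1 (up to rounding).
print(sum(binomial_pmf(k, 10, 0.5) for k in range(11)))
```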

**Correlation:** Measures the degree to which two variables vary together.

**Census:** Procedure of systematically acquiring and recording information about the members of a given population.

**Contingency Table:** A type of table in matrix form that displays the frequency distribution of the variables.

**Deviation:** Difference between the value of an observation and the mean of the population.

**Dot Plot:** A statistical chart consisting of data points plotted on a simple scale representing quantitative values.

**Data Collection:** A process of preparing and collecting data.

**Estimator:** Rule for calculating an estimate of a given quantity based on observed data.

**Expected Value:** The probability-weighted average of the values a random variable can take; a measure of central tendency for the variable.

$E(X) = \sum x \, P(x)$
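As a small worked case of the expected value formula, the sketch below sums x times P(x) over the faces of a fair die. The function name `expected_value` and the dictionary representation of the probability mass function are assumptions made for the example.

```python
from fractions import Fraction


def expected_value(pmf):
    """Sum of x * P(x) over every value x in the probability mass function."""
    return sum(x * p for x, p in pmf.items())


# Probability mass function of a fair six-sided die: each face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expected_value(fair_die))  # 7/2
```

The result, 3.5, is not itself a possible roll; the expected value is a long-run average rather than a typical outcome.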

**Forecasting:** A process of making statements about events whose outcomes have not been observed.

**Goodness of Fit:** Describes how well a statistical model fits a set of observations. It summarizes the discrepancy between the observed values and the values expected under the model.

**Grouped Data:** A raw data set organized by constructing a table showing the frequency distribution.

**Histogram:** A graphical representation of the distribution of data, giving an estimate of the probability distribution of a continuous variable.

**Inference:** A process of deriving logical conclusions from known premises.

**Likert Scale:** A psychometric scale commonly used in research that employs questionnaires.

**Lorenz Curve:** Graphical representation of the cumulative distribution of a quantity such as income or wealth, showing the proportion of the total held by each cumulative share of the population.

**Multiple Correlation:** Measure of how well a given variable can be predicted using a linear function of a set of other variables.

**Non-Parametric Test:** A test that does not rely on the assumption that the data are drawn from a given probability distribution.

**Outlier:** An observation that is numerically distant from the rest of the data.

**Ogive:** A graph that represents the cumulative frequencies for the classes in a frequency distribution; it is a continuous frequency curve.

**Pie Chart:** A circular chart divided into sectors illustrating numerical proportion.

**Random Sample:** Subset of individuals chosen at random from a larger set.

**Regression:** A statistical technique for estimating the relationships among variables.

**Scatter Plot:** A mathematical diagram using Cartesian coordinates to display values for two variables for a set of data.

**Time Series:** Sequence of data points, typically measured at successive points in time spaced at uniform intervals.

**Variance:** Measure of how far a set of numbers is spread out from its mean.

**Z Statistic:** The number of standard deviations by which an observation lies above the mean (negative if it lies below).
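In symbols, the Z statistic of an observation x is $z = \frac{x - \mu}{\sigma}$, where $\mu$ is the mean and $\sigma$ the standard deviation. The sketch below applies that formula; the function name `z_score` and the numbers used are hypothetical.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations by which x lies above the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


# Hypothetical example: a score of 85 when the mean is 70 and the standard deviation is 10.
print(z_score(85, 70, 10))  # 1.5

# An observation below the mean gives a negative value.
print(z_score(60, 70, 10))  # -1.0
```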

Given below are some example problems in statistics.

### Solved Examples

**Question 1:** From the table given below, construct a less-than cumulative frequency table.

| Marks | 0 - 10 | 10 - 20 | 20 - 30 | 30 - 40 | 40 - 50 | 50 - 60 | 60 - 70 | 70 - 80 | 80 - 90 | 90 - 100 |
|---|---|---|---|---|---|---|---|---|---|---|
| Frequency | 6 | 8 | 10 | 12 | 15 | 17 | 4 | 6 | 15 | 20 |

**Solution:**

Less than cumulative frequency table for the given data is constructed below.

| Marks | Frequency | Less than cumulative frequency |
|---|---|---|
| 0 - 10 | 6 | 6 |
| 10 - 20 | 8 | 14 |
| 20 - 30 | 10 | 24 |
| 30 - 40 | 12 | 36 |
| 40 - 50 | 15 | 51 |
| 50 - 60 | 17 | 68 |
| 60 - 70 | 4 | 72 |
| 70 - 80 | 6 | 78 |
| 80 - 90 | 15 | 93 |
| 90 - 100 | 20 | 113 |
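Each entry in the cumulative column is the running total of the frequencies up to and including that class. A minimal sketch of that running total, using the frequencies from the question (the variable names are illustrative), might look like this:

```python
from itertools import accumulate

classes = [
    "0 - 10", "10 - 20", "20 - 30", "30 - 40", "40 - 50",
    "50 - 60", "60 - 70", "70 - 80", "80 - 90", "90 - 100",
]
frequencies = [6, 8, 10, 12, 15, 17, 4, 6, 15, 20]

# accumulate() yields the running sum: 6, 6 + 8, 6 + 8 + 10, and so on.
cumulative = list(accumulate(frequencies))

for marks, freq, cum in zip(classes, frequencies, cumulative):
    print(f"{marks:>8} | {freq:>2} | {cum:>3}")
```

The last cumulative value equals the total number of observations, which is a quick check on the arithmetic.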

**Question 2:** Temperatures were recorded in a city for a period of one month and the data is given below. Find the range, mean and mode.

**Solution:**

Range = Highest temperature - Lowest temperature

= 37 - 17

= 20

Mean = $\frac{\sum_{i=1}^{n}x_{i}}{n}$, where n = 25 is the number of readings listed

= $\frac{(20.7 + 20.6 + 18 + 25 + 19 + 20.6 + 21 + 17 + 24 + 30.2 + 34 + 29 + 37 + 35 + 32 + 18 + 22 + 24 + 28 + 20.6 + 24 + 31 + 36 + 19 + 20.6)}{25}$

= 25.052

Mode = Value that occurs most often in a data set = 20.6

Therefore, range = 20, mean = 25.052 and mode = 20.6.
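These three results can be cross-checked with a short script. This is a sketch that assumes the data set consists of the 25 readings listed in the mean calculation; the variable names are illustrative.

```python
from statistics import mode

# The 25 readings that appear in the mean calculation.
temperatures = [
    20.7, 20.6, 18, 25, 19, 20.6, 21, 17, 24, 30.2,
    34, 29, 37, 35, 32, 18, 22, 24, 28, 20.6,
    24, 31, 36, 19, 20.6,
]

temp_range = max(temperatures) - min(temperatures)
average = sum(temperatures) / len(temperatures)
most_common = mode(temperatures)  # value that occurs most often

print("range:", temp_range)
print("mean:", round(average, 3))
print("mode:", most_common)
```

Here 20.6 appears four times, more often than any other reading, which is why it is the mode.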