Statistics for Machine Learning
Second Internal Question Bank
5 mark questions:
1. Describe the difference between simple linear regression and multiple
linear regression.
2. Differentiate between logistic regression and polynomial regression.
3. Explain any 2 types of tree-based models.
4. Explain the terminology used in decision trees.
5. Describe the difference between linear regression and logistic
regression.
6. Explain logistic regression.
7. Explain data reduction and its techniques.
8. Explain the data cleaning techniques.
9. Explain the process of data transformation.
10. Explain any 2 data visualization techniques.
11. Explain polynomial regression.
12. Describe the terminologies used in decision trees.
13. Find the regression coefficient of Y on X for the following data:
Advertising Expenses (X)    Sales Revenue (Y)
(in $1000s)                 (in $1000s)
10                          25
15                          30
20                          40
25                          45
30                          55
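As a way to check a hand calculation for question 13, the sketch below computes the least-squares regression coefficient of Y on X, b_yx = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)². The helper name `slope_y_on_x` and the variable names are illustrative assumptions, not part of the question bank.

```python
# Illustrative sketch: least-squares regression coefficient of Y on X.
# The helper name `slope_y_on_x` is hypothetical, chosen for this example.

def slope_y_on_x(xs, ys):
    """Return b_yx = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx

advertising = [10, 15, 20, 25, 30]  # X, in $1000s
sales = [25, 30, 40, 45, 55]        # Y, in $1000s

b_yx = slope_y_on_x(advertising, sales)
print(f"b_yx = {b_yx:.2f}")  # prints: b_yx = 1.50
```

Here x̄ = 20 and ȳ = 39, so Σ(x - x̄)(y - ȳ) = 375 and Σ(x - x̄)² = 250, which gives b_yx = 1.5.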
10 mark questions:
1. The heights of 20 students, in centimetres, are given below:
143, 160, 154, 159, 172, 165, 162, 171, 146, 165, 176, 145, 165,
185, 175, 186, 160, 158, 167, 172
Make a five-figure summary and use it to draw a box plot for the
above data.
Note: (Five-figure summary: minimum, the three quartiles Q1, Q2 (median)
and Q3, and maximum; also report the interquartile range.)
2. Explain any 2 data cleaning techniques.
3. Explain data reduction and its techniques.
4. Explain the process of data transformation.
5. Find the regression equation of X on Y for the following data:
X    10   12   16   11   15   14   20   22
Y    15   18   23   14   20   17   25   28
6. Find a linear regression equation for the following data:
City          No. of families (in lakhs): X    Sale of automobiles (in '000): Y
Belagavi      70                               25.2
Bangalore     75                               28.6
Hubli         80                               30.2
Kalaburagi    60                               22.3
Mangalore     90                               35.4
Estimate the sales for the year 2020 for the city Belagavi, which is
estimated to have 100 lakh families, assuming that the same relationship
holds true.
7. Explain the common preprocessing techniques.
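The numerical 10 mark questions (1, 5 and 6) can be checked with a short script such as the sketch below. It uses only the Python standard library. The function names `five_number_summary` and `fit_line` are illustrative assumptions, and the quartiles use the "median of each half" convention; other textbook or software conventions can give slightly different Q1 and Q3. The data for question 6 are read as five (city, X, Y) rows: Belagavi, Bangalore, Hubli, Kalaburagi, Mangalore.

```python
from statistics import median

def five_number_summary(values):
    """Min, Q1, median, Q3, max using the median-of-halves convention.
    Assumes at least two values; for odd n the middle value is left
    out of both halves."""
    data = sorted(values)
    half = len(data) // 2
    lower, upper = data[:half], data[-half:]
    return data[0], median(lower), median(data), median(upper), data[-1]

def fit_line(predictor, response):
    """Least-squares line response = a + b * predictor; returns (a, b)."""
    n = len(predictor)
    mean_p = sum(predictor) / n
    mean_r = sum(response) / n
    s_pr = sum((p - mean_p) * (r - mean_r) for p, r in zip(predictor, response))
    s_pp = sum((p - mean_p) ** 2 for p in predictor)
    b = s_pr / s_pp
    return mean_r - b * mean_p, b

# Question 1: five-figure summary and interquartile range
heights = [143, 160, 154, 159, 172, 165, 162, 171, 146, 165,
           176, 145, 165, 185, 175, 186, 160, 158, 167, 172]
lo, q1, med, q3, hi = five_number_summary(heights)
iqr = q3 - q1
print("min, Q1, median, Q3, max:", lo, q1, med, q3, hi, "IQR:", iqr)

# Question 5: regression of X on Y, so Y is the predictor
x5 = [10, 12, 16, 11, 15, 14, 20, 22]
y5 = [15, 18, 23, 14, 20, 17, 25, 28]
a_xy, b_xy = fit_line(y5, x5)
print(f"X = {a_xy:.4f} + {b_xy:.4f} * Y")

# Question 6: regression of automobile sales (Y) on families (X)
families = [70, 75, 80, 60, 90]          # in lakhs
autos = [25.2, 28.6, 30.2, 22.3, 35.4]   # in '000
a_yx, b_yx = fit_line(families, autos)
estimate = a_yx + b_yx * 100             # 100 lakh families
print(f"Y = {a_yx:.3f} + {b_yx:.3f} * X, estimate at X = 100: {estimate:.3f}")
```

Under these assumptions the summary is min 143, Q1 158.5, median 165, Q3 172, max 186, with an interquartile range of 13.5. The fitted lines are approximately X = -1.512 + 0.826 Y for question 5 and Y = -4.885 + 0.443 X for question 6, giving an estimate of about 39.4 thousand automobiles for 100 lakh families.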