SML - II Internal Question Bank

The document contains a question bank for a statistics course focused on machine learning, with questions categorized into 5-mark and 10-mark sections. Topics include various regression types, decision trees, data cleaning, transformation, and visualization techniques. Additionally, it includes practical exercises involving data analysis and regression calculations.

Uploaded by

vividvortex07
Copyright
© All Rights Reserved

Statistics for Machine Learning
Second Internal Question Bank
5 mark questions:
1. Describe the difference between simple linear regression and multiple
linear regression.
2. Differentiate between logistic regression and polynomial regression.
3. Explain any 2 types of tree-based models.
4. Explain the terminology used in decision trees.
5. Describe the difference between linear regression and logistic
regression.
6. Explain logistic regression.
7. Explain data reduction and its techniques.
8. Explain the data cleaning techniques.

9. Explain the process of data transformation.
10. Explain any 2 data visualization techniques.
11. Explain polynomial regression.
12. Describe the terminologies used in decision trees.
13. Find the regression coefficient of Y on X for the following data:

Advertising Expenses (X)    Sales Revenue (Y)
(in $1000s)                 (in $1000s)
10                          25
15                          30
20                          40
25                          45
30                          55
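
For question 13, the regression coefficient of Y on X is the least-squares slope b_yx = S_xy / S_xx, where S_xy is the sum of products of deviations from the means and S_xx is the sum of squared deviations of X. The following is a minimal sketch for checking a hand calculation, not a model answer. The function and variable names are illustrative assumptions, not part of the question bank.

```python
# Data copied from question 13 (both columns in $1000s).
advertising = [10, 15, 20, 25, 30]   # X
sales = [25, 30, 40, 45, 55]         # Y


def regression_y_on_x(xs, ys):
    """Return (slope, intercept) of the least-squares line Y = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations and sum of squared deviations of X.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx                   # regression coefficient b_yx
    intercept = mean_y - slope * mean_x   # line passes through the means
    return slope, intercept


b_yx, a = regression_y_on_x(advertising, sales)
print(f"b_yx = {b_yx:.4f}, intercept = {a:.4f}")
```

For this data the means are 20 and 39, S_xy = 375 and S_xx = 250, so the coefficient works out to 1.5.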
10 mark questions:
1. The heights of 20 students, in centimetres, are given below:
143, 160, 154, 159, 172, 165, 162, 171, 146, 165, 176, 145, 165,
185, 175, 186, 160, 158, 167, 172
Make a five-figure summary and use it to draw a box plot for the
above data.
Note: (Five-figure summary: the quartiles (Q1, median, Q3), Min and Max;
the interquartile range is Q3 - Q1.)

2. Explain any 2 data cleaning techniques.
3. Explain data reduction and its techniques.
4. Explain the process of data transformation.
5. Find the regression equation of X on Y for the following data:

X    10   12   16   11   15   14   20   22
Y    15   18   23   14   20   17   25   28

6. Find a linear regression equation for the following data:

City          No. of Families     Sale of automobiles
              (in lakhs): X       (in '000): Y
Belagavi      70                  25.2
Bangalore     75                  28.6
Hubli         80                  30.2
Kalaburagi    60                  22.3
Mangalore     90                  35.4

Estimate the sales for the year 2020 for the city Belagavi, which is
estimated to have 100 lakh families, assuming that the same relationship
holds true.

7. Explain the common preprocessing techniques.
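
The numerical questions in the 10 mark list (1, 5 and 6) can be checked with a short script. The sketch below rests on two assumptions. First, quartiles are taken as the medians of the lower and upper halves of the sorted data, which is one common textbook convention; other conventions give slightly different quartiles. Second, each regression line is fitted by ordinary least squares. All function and variable names are illustrative and not part of the question bank.

```python
from statistics import median


def five_figure_summary(values):
    """Min, Q1, median, Q3, max using the median-of-halves convention.

    Assumes at least two values; for an odd count the middle value
    is left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    lower, upper = data[:half], data[-half:]
    return {
        "min": data[0],
        "q1": median(lower),
        "median": median(data),
        "q3": median(upper),
        "max": data[-1],
    }


def least_squares(predictor, response):
    """Return (slope, intercept) of response = intercept + slope * predictor."""
    n = len(predictor)
    mean_p = sum(predictor) / n
    mean_r = sum(response) / n
    s_pr = sum((p - mean_p) * (r - mean_r) for p, r in zip(predictor, response))
    s_pp = sum((p - mean_p) ** 2 for p in predictor)
    slope = s_pr / s_pp
    return slope, mean_r - slope * mean_p


# Question 1: heights in centimetres.
heights = [143, 160, 154, 159, 172, 165, 162, 171, 146, 165,
           176, 145, 165, 185, 175, 186, 160, 158, 167, 172]
summary = five_figure_summary(heights)
iqr = summary["q3"] - summary["q1"]
print(summary, "IQR =", iqr)

# Question 5: regression of X on Y, so Y is the predictor and X the response.
x5 = [10, 12, 16, 11, 15, 14, 20, 22]
y5 = [15, 18, 23, 14, 20, 17, 25, 28]
b_xy, a_xy = least_squares(y5, x5)
print(f"X = {a_xy:.4f} + {b_xy:.4f} * Y")

# Question 6: families (lakhs) as X, automobile sales ('000) as Y.
families = [70, 75, 80, 60, 90]
autos = [25.2, 28.6, 30.2, 22.3, 35.4]
b6, a6 = least_squares(families, autos)
estimate = a6 + b6 * 100   # predicted sales ('000) at 100 lakh families
print(f"Y = {a6:.4f} + {b6:.4f} * X, estimate at X = 100: {estimate:.3f}")
```

Under the median-of-halves convention the heights give Q1 = 158.5, median = 165 and Q3 = 172, so the box spans an interquartile range of 13.5 with whiskers reaching 143 and 186. For question 6 the estimate at X = 100 is an extrapolation beyond the observed range of 60 to 90, which is why the question adds the condition that the same relationship holds.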
