Statistics For Data Science

Uploaded by

mok01012005
Copyright
© All Rights Reserved

Section 1: Descriptive Statistics (1–20)

1. The mean of a dataset is:
a) The middle value
b) The sum of all values divided by the number of values
c) The most frequent value
d) The difference between maximum and minimum
Answer: b
2. The median is:
a) The average of all values
b) The middle value in a sorted dataset
c) The most frequent value
d) The range
Answer: b
3. The mode of a dataset is:
a) The smallest value
b) The most frequently occurring value
c) The average
d) The middle value
Answer: b
4. Range of a dataset is:
a) Maximum – Minimum
b) Mean ± SD
c) Median – Mode
d) Q3 – Q1
Answer: a
5. Variance measures:
a) Central tendency
b) Spread of the data
c) Skewness
d) Kurtosis
Answer: b
6. Standard deviation is:
a) Square root of variance
b) Square of variance
c) Mean of squared differences
d) Difference between max and min
Answer: a
7. Which measure is resistant to outliers?
a) Mean
b) Median
c) Standard Deviation
d) Variance
Answer: b
8. For skewed data, the best measure of central tendency is:
a) Mean
b) Median
c) Mode
d) Range
Answer: b
9. Coefficient of variation is defined as:
a) SD / Mean
b) Variance / Mean
c) SD × Mean
d) Mean / SD
Answer: a
10. Which is most affected by extreme values?
a) Mean
b) Median
c) IQR
d) Mode
Answer: a
11. A dataset has values [2, 4, 6, 8, 100]. The median is:
a) 6
b) 8
c) 20
d) 100
Answer: a
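Worked example: questions 7, 10, and 11 all turn on how one extreme value moves the mean while leaving the median alone. A minimal Python sketch using the dataset from question 11 (the variable names are illustrative, and the standard-library statistics module is an assumed tool, not something the quiz prescribes):

```python
from statistics import mean, median

# Dataset from question 11; 100 is the extreme value.
data = [2, 4, 6, 8, 100]

data_mean = mean(data)      # pulled toward the extreme value
data_median = median(data)  # middle value of the sorted data

print(f"mean = {data_mean}, median = {data_median}")
```

The mean comes out at 24, far from four of the five values, while the median stays at 6.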
12. Variance units are:
a) Same as data
b) Square of data units
c) Inverse of data units
d) Dimensionless
Answer: b
13. Standard deviation units are:
a) Same as data
b) Square of data units
c) Inverse of data units
d) Dimensionless
Answer: a
14. Which of the following is a measure of spread?
a) Mean
b) Median
c) Standard deviation
d) Mode
Answer: c
15. The sum of deviations from mean is always:
a) Positive
b) Negative
c) Zero
d) Cannot be determined
Answer: c
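Worked example: the facts in questions 5, 6, 9, and 15 can be checked on a small dataset. The sketch below uses a hypothetical dataset and assumes the population form of the variance (dividing by n); the quiz does not say whether the population or sample form is intended.

```python
from statistics import fmean, pvariance, pstdev

# Hypothetical dataset chosen so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = fmean(data)                        # arithmetic mean
deviations = [x - mu for x in data]
sum_of_deviations = sum(deviations)     # positive and negative deviations cancel

variance = pvariance(data)              # mean of squared deviations (population form)
std_dev = pstdev(data)                  # square root of the variance
coeff_of_variation = std_dev / mu       # SD / Mean, a unitless ratio

print(sum_of_deviations, variance, std_dev, coeff_of_variation)
```

Here the mean is 5, the deviations sum to zero, the variance is 4 (in squared units), the standard deviation is 2 (in the data's own units), and the coefficient of variation is 0.4.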
16. A dataset has two peaks; it is called:
a) Unimodal
b) Bimodal
c) Trimodal
d) Uniform
Answer: b
17. Quartiles divide the dataset into:
a) 2 equal parts
b) 3 equal parts
c) 4 equal parts
d) 5 equal parts
Answer: c
18. The second quartile (Q2) is the same as:
a) Mean
b) Median
c) Mode
d) Standard deviation
Answer: b
19. Skewness affects which measure the most?
a) Mean
b) Median
c) Mode
d) IQR
Answer: a
20. The 90th percentile indicates:
a) 10% of data is below this value
b) 90% of data is below this value
c) Median value
d) Maximum value
Answer: b
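Worked example: a rough check of the percentile idea in question 20, assuming hypothetical data (the integers 1 through 99) and the default cut-point convention of Python's statistics.quantiles. Other percentile conventions give slightly different cut points on small datasets.

```python
from statistics import quantiles

# Hypothetical data: the integers 1 through 99.
data = list(range(1, 100))

# n=10 returns the nine cut points between deciles; the last one is the 90th percentile.
deciles = quantiles(data, n=10)
p90 = deciles[-1]

# Fraction of observations at or below the 90th percentile.
share_at_or_below = sum(x <= p90 for x in data) / len(data)

print(p90, round(share_at_or_below, 3))
```

About 90% of the observations fall at or below the reported value, which is what the 90th percentile is meant to convey.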

Section 2: Interquartile Range (IQR) (21–40)


21. IQR is calculated as:
a) Q1 – Q3
b) Q3 – Q1
c) Max – Min
d) Mean ± SD
Answer: b
22. Q1 represents:
a) 10th percentile
b) 25th percentile
c) 50th percentile
d) 75th percentile
Answer: b
23. Q3 represents:
a) 10th percentile
b) 25th percentile
c) 50th percentile
d) 75th percentile
Answer: d
24. IQR measures:
a) Central tendency
b) Spread of middle 50% of data
c) Total variability
d) Skewness
Answer: b
25. Outliers can be detected using IQR if a value is:
a) Below Q1 – 1.5×IQR or above Q3 + 1.5×IQR
b) Between Q1 and Q3
c) Equal to median
d) None
Answer: a
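Worked example: the 1.5×IQR rule from question 25 can be sketched as a small function. The function name iqr_outliers, the parameter k, and the sample data are hypothetical, and the quartiles come from the default convention of statistics.quantiles, which is only one of several in common use.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)   # default 'exclusive' convention
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical data with one suspiciously large value.
sample = [1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 40]
flagged = iqr_outliers(sample)

print(flagged)
```

For this sample Q1 = 4 and Q3 = 10, so the IQR is 6 and the fences sit at -5 and 19; only 40 falls outside them.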
26. IQR is:
a) Sensitive to outliers
b) Resistant to outliers
c) Same as variance
d) Same as SD
Answer: b
27. For dataset [5, 7, 8, 12, 15, 18, 20], Q1 = ?
a) 7
b) 8
c) 12
d) 15
Answer: a
28. For the same dataset, Q3 = ?
a) 12
b) 15
c) 18
d) 20
Answer: c
29. IQR of [5, 7, 8, 12, 15, 18, 20] is:
a) 8
b) 11
c) 12
d) 15
Answer: b
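Worked example: quartiles for small datasets depend on the convention used, so it helps to compute them explicitly. The sketch below assumes Python's statistics.quantiles and compares its two built-in conventions on the dataset from questions 27 to 29.

```python
from statistics import quantiles

data = [5, 7, 8, 12, 15, 18, 20]  # dataset from questions 27-29

# 'exclusive' (the default) places the cut points at positions p * (n + 1).
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

# 'inclusive' treats the minimum and maximum as the 0th and 100th percentiles.
q1_inc, _, q3_inc = quantiles(data, n=4, method="inclusive")
iqr_inc = q3_inc - q1_inc

print(q1, q2, q3, iqr)
print(q1_inc, q3_inc, iqr_inc)
```

The exclusive convention reproduces Q1 = 7 and Q3 = 18 from questions 27 and 28, so Q3 – Q1 = 11 under that convention. The inclusive convention gives Q1 = 7.5 and Q3 = 16.5, for an IQR of 9.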
30. Boxplots visually show:
a) Central tendency
b) Spread and outliers
c) Skewness only
d) Correlation
Answer: b
31. Extreme values in IQR analysis are called:
a) Quartiles
b) Outliers
c) Mean deviations
d) Skew points
Answer: b
32. IQR is more useful than range when:
a) Data is symmetric
b) Data contains outliers
c) Data is categorical
d) Data is discrete
Answer: b
33. Q2 – Q1 = ?
a) Lower quartile range
b) IQR
c) Upper quartile range
d) Median deviation
Answer: a
34. IQR helps in:
a) Standardization
b) Normalization
c) Detecting outliers
d) Regression analysis
Answer: c
35. A small IQR indicates:
a) High variability
b) Low variability
c) Presence of outliers
d) Skewness
Answer: b
36. A large IQR indicates:
a) Low variability
b) High variability
c) Symmetry
d) Normal distribution
Answer: b
37. Whiskers in boxplot usually represent:
a) IQR
b) 1.5×IQR
c) SD
d) Mean ± SD
Answer: b
38. Median line in boxplot represents:
a) Q1
b) Q2
c) Q3
d) Mean
Answer: b
39. Which is a resistant measure of spread?
a) SD
b) IQR
c) Variance
d) Mean
Answer: b
40. Interquartile range is used in:
a) Z-score calculation
b) Outlier detection
c) Correlation calculation
d) Regression modeling
Answer: b

Section 3: Gaussian (Normal) Distribution (41–60)


41. Gaussian distribution is also called:
a) Uniform distribution
b) Normal distribution
c) Binomial distribution
d) Poisson distribution
Answer: b
42. In normal distribution:
a) Mean = Median = Mode
b) Mean > Median > Mode
c) Mean < Median < Mode
d) Mean ≠ Median ≠ Mode
Answer: a
43. Standard deviation determines:
a) Center
b) Spread
c) Skewness
d) Kurtosis
Answer: b
44. In a normal distribution, ~68% of data lies within:
a) ±1 SD
b) ±2 SD
c) ±3 SD
d) ±0.5 SD
Answer: a
45. ~95% of data lies within:
a) ±1 SD
b) ±2 SD
c) ±3 SD
d) ±4 SD
Answer: b
46. ~99.7% of data lies within:
a) ±1 SD
b) ±2 SD
c) ±3 SD
d) ±4 SD
Answer: c
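Worked example: the percentages in questions 44 to 46 (the empirical rule) can be recovered from the normal cumulative distribution function. A minimal sketch, assuming the standard library's statistics.NormalDist; the helper name share_within is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Probability that a normal value lies within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within = {k: share_within(k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within ±{k} SD: {p:.4f}")
```

The three shares round to about 0.683, 0.954, and 0.997, matching the approximate 68%, 95%, and 99.7% figures.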
47. Z-score formula is:
a) (X – Mean)/SD
b) (Mean – X)/Variance
c) SD/Mean
d) Mean/SD
Answer: a
48. Z-score indicates:
a) Distance from mean in SD units
b) Median value
c) Mode
d) Quartile position
Answer: a
49. Standard normal distribution has:
a) Mean = 0, SD = 1
b) Mean = 1, SD = 0
c) Mean = 0, SD = 0
d) Mean = 1, SD = 1
Answer: a
50. Negative Z-score means:
a) Value above mean
b) Value below mean
c) Value = mean
d) Outlier
Answer: b
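Worked example: the Z-score formula from question 47 applied to every value of a hypothetical dataset. The function name z_scores is illustrative, and the population standard deviation is assumed because the quiz does not specify which form to use.

```python
from statistics import fmean, pstdev

def z_scores(values):
    """Return (x - mean) / SD for every x, using the population SD."""
    mu = fmean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical; mean 5, population SD 2
z = z_scores(data)

print(z)
print(fmean(z), pstdev(z))
```

The smallest value, 2, gets a Z-score of -1.5 (below the mean) and the largest, 9, gets 2.0 (above the mean). The standardized values themselves have mean 0 and standard deviation 1.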
51. Normal distribution is:
a) Symmetric
b) Skewed
c) Uniform
d) Bimodal
Answer: a
52. Bell-shaped curve represents:
a) Normal distribution
b) Uniform distribution
c) Skewed distribution
d) Exponential distribution
Answer: a
53. Skewness of perfect normal distribution is:
a) 0
b) 1
c) -1
d) Undefined
Answer: a
54. Kurtosis measures:
a) Spread
b) Peakedness
c) Skewness
d) Mean
Answer: b
55. High kurtosis indicates:
a) Flat distribution
b) Heavy tails
c) Symmetry
d) Outliers absent
Answer: b
56. Low kurtosis indicates:
a) Flat distribution
b) Heavy tails
c) Normal distribution
d) Positive skew
Answer: a
57. Empirical rule applies to:
a) Normal distribution
b) Uniform distribution
c) Exponential distribution
d) Binomial distribution
Answer: a
58. Standardizing data converts it to:
a) Z-scores
b) Quartiles
c) Mean ± SD
d) Percentiles
Answer: a
59. Probability density function is used in:
a) Regression
b) Normal distribution
c) Clustering
d) Classification
Answer: b
60. Area under normal curve equals:
a) 0
b) 0.5
c) 1
d) SD
Answer: c
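Worked example: questions 59 and 60 can be checked numerically by integrating the normal probability density function. The sketch below uses a hand-written trapezoid rule over the range -8 to 8 SDs; the function name trapezoid_area, the integration limits, and the step count are all assumptions made for illustration.

```python
from statistics import NormalDist

pdf = NormalDist(mu=0, sigma=1).pdf

def trapezoid_area(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

total_area = trapezoid_area(pdf, -8.0, 8.0)   # whole curve, up to negligible tails
right_half = trapezoid_area(pdf, 0.0, 8.0)    # area to the right of the mean

print(round(total_area, 6), round(right_half, 6))
```

The total area comes out at 1 to within rounding, and the half to the right of the mean at 0.5, consistent with the curve's symmetry.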

Section 4: Skewness (61–80)


61. Skewness measures:
a) Central tendency
b) Spread
c) Symmetry of distribution
d) Correlation
Answer: c
62. Positive skew means:
a) Tail on left
b) Tail on right
c) Symmetric
d) Uniform
Answer: b
63. Negative skew means:
a) Tail on left
b) Tail on right
c) Symmetric
d) Uniform
Answer: a
64. For positive skew:
a) Mean > Median > Mode
b) Mean < Median < Mode
c) Mean = Median = Mode
d) Mode > Median > Mean
Answer: a
65. For negative skew:
a) Mean > Median > Mode
b) Mean < Median < Mode
c) Mean = Median = Mode
d) Mode > Median > Mean
Answer: b (option d states the same ordering)
66. Skewness of normal distribution is:
a) 0
b) 1
c) -1
d) Undefined
Answer: a
67. High skewness indicates:
a) Symmetry
b) Extreme values
c) Low variability
d) Normality
Answer: b
68. Median is more robust to skew than:
a) Mean
b) Mode
c) SD
d) IQR
Answer: a
69. Log transformation helps to:
a) Reduce positive skew
b) Reduce negative skew
c) Increase spread
d) Normalize SD
Answer: a
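Worked example: the effect of a log transformation on positive skew, using a moment-based skewness (mean cubed deviation divided by SD cubed). The dataset is hypothetical and deliberately extreme, the function name skewness is illustrative, and the population form of the moments is an assumption.

```python
import math

def skewness(values):
    """Moment-based skewness: mean cubed deviation divided by SD cubed."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n   # population variance
    m3 = sum((x - mu) ** 3 for x in values) / n   # mean cubed deviation
    return m3 / m2 ** 1.5

raw = [1, 10, 100, 1000, 10000]            # long right tail
logged = [math.log10(x) for x in raw]      # evenly spaced after the transform

skew_raw = skewness(raw)
skew_logged = skewness(logged)

print(round(skew_raw, 3), round(skew_logged, 3))
```

The raw values are strongly right-skewed (skewness above 1), while their base-10 logarithms are evenly spaced and have essentially zero skewness. Real data rarely becomes perfectly symmetric, but the direction of the effect is the same.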
70. Boxplot shows skew by:
a) Quartiles and whiskers
b) Mean only
c) SD only
d) Z-score
Answer: a
71. Skewed data affects:
a) Mean
b) Median
c) Mode
d) All
Answer: d
72. Symmetric distribution skewness is:
a) 0
b) Positive
c) Negative
d) Undefined
Answer: a
73. Right-skewed data has:
a) Long right tail
b) Long left tail
c) No tail
d) Uniform tail
Answer: a
74. Left-skewed data has:
a) Long right tail
b) Long left tail
c) Symmetric tail
d) Uniform
Answer: b
75. Skewness formula uses:
a) Cubed deviations from mean
b) Squared deviations
c) Absolute deviations
d) Quartiles
Answer: a
76. Pearson’s skewness coefficient uses:
a) Mean, Median, SD
b) Median only
c) Quartiles
d) Variance only
Answer: a
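Worked example: one common form of Pearson's coefficient, often called the median or second skewness coefficient, is 3 × (Mean – Median) / SD. The quiz names only the three ingredients, so the factor of 3, the use of the population SD, and the function name below are assumptions.

```python
from statistics import mean, median, pstdev

def pearson_median_skewness(values):
    """3 * (mean - median) / SD, with the population SD assumed."""
    return 3 * (mean(values) - median(values)) / pstdev(values)

right_skewed = [2, 4, 6, 8, 100]   # dataset from question 11
coefficient = pearson_median_skewness(right_skewed)

print(round(coefficient, 3))
```

The coefficient is positive (about 1.42) because the mean, 24, sits well above the median, 6. This is the same relationship questions 64 and 92 describe for positive skew.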
77. High positive skew may indicate:
a) Outliers on right
b) Outliers on left
c) Symmetry
d) Low SD
Answer: a
78. High negative skew may indicate:
a) Outliers on left
b) Outliers on right
c) Symmetry
d) Normal distribution
Answer: a
79. Transformations to reduce skew include:
a) Log
b) Square root
c) Cube root
d) All of the above
Answer: d
80. Skewness affects which measure the most?
a) Mean
b) Median
c) Mode
d) IQR
Answer: a

Section 5: Mixed / Applied Statistics (81–100)


81. Variance formula uses:
a) Squared deviations from mean
b) Cubed deviations from mean
c) Absolute deviations
d) Quartiles
Answer: a
82. Median splits the dataset into:
a) 25%-75%
b) 50%-50%
c) 33%-67%
d) 40%-60%
Answer: b
83. Outliers inflate:
a) Median
b) IQR
c) Range
d) Mode
Answer: c
84. Empirical rule applies to:
a) Normal distribution
b) Skewed data
c) Uniform data
d) Categorical data
Answer: a
85. Boxplot whiskers typically extend to:
a) Q1 – 1.5×IQR and Q3 + 1.5×IQR
b) Min and Max
c) Mean ± SD
d) Median ± SD
Answer: a
86. Z-score > 3 may indicate:
a) Outlier
b) Median
c) Mode
d) Quartile
Answer: a
87. Standardizing data results in:
a) Mean 0, SD 1
b) Mean 1, SD 0
c) Median 0, IQR 1
d) Mean = SD
Answer: a
88. Skewness > 1 indicates:
a) High positive skew
b) Moderate positive skew
c) Negative skew
d) Symmetry
Answer: a
89. Skewness < –1 indicates:
a) High negative skew
b) Moderate negative skew
c) Positive skew
d) Symmetry
Answer: a
90. SD measures:
a) Spread around mean
b) Spread around median
c) Spread around mode
d) Quartile deviation
Answer: a
91. Quartile deviation =
a) (Q3 – Q1)/2
b) Q3 – Q1
c) Mean – Median
d) SD/2
Answer: a
92. Mean > Median indicates:
a) Positive skew
b) Negative skew
c) Symmetric
d) Uniform
Answer: a
93. Median > Mean indicates:
a) Positive skew
b) Negative skew
c) Symmetric
d) Uniform
Answer: b
94. Z-score formula standardizes:
a) Continuous data
b) Categorical data
c) Ordinal data
d) Nominal data
Answer: a
95. Outliers are extreme values:
a) True
b) False
Answer: a
96. Boxplot central line shows:
a) Mean
b) Median
c) Mode
d) Quartile deviation
Answer: b
97. Gaussian curve is:
a) Symmetric
b) Skewed right
c) Skewed left
d) Uniform
Answer: a
98. IQR is a:
a) Robust measure of spread
b) Sensitive measure of spread
c) Measure of central tendency
d) Measure of skew
Answer: a
99. Skewness formula involves:
a) Cubed deviations / SD³
b) Squared deviations / SD²
c) Absolute deviations / SD
d) Quartile deviations / SD
Answer: a
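For reference, the "cubed deviations / SD³" description can be written out as a formula. The population (divide-by-n) form is assumed here, and the symbols g_1, x_i, x̄, s, and n are notation introduced for this note rather than taken from the quiz.

```latex
g_1 = \frac{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^3}{s^3},
\qquad
s = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2}
```

Because the deviations are cubed, values far above the mean push g_1 positive and values far below it push g_1 negative. Dividing by s³ makes the result unitless.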
100. In a normal distribution, ~34% of data lies between:
a) Mean and +1 SD
b) Mean and –1 SD
c) Median and Q1
d) Median and Q3
Answer: a (by symmetry, the same share lies between the mean and –1 SD)
