2.3 Organizing Quantitative Data
Grouping quantitative data:
To organize quantitative data, we first group the observations into classes (also known as categories or bins).
Three grouping methods: single-value grouping | limit grouping | cutpoint grouping
(1) Single-value grouping:
1. Groups the quantitative data using classes in which each class represents a single possible value.
2. Is particularly suitable for discrete data in which there are only a small number of distinct values.
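A minimal sketch of single-value grouping, assuming a small made-up data set (`tv_sets` and the helper name `single_value_grouping` are illustrative, not from the notes). Each distinct value becomes its own class:

```python
from collections import Counter

# Hypothetical sample: number of TV sets in each of 12 households.
tv_sets = [1, 2, 1, 3, 2, 0, 1, 2, 4, 1, 2, 1]

def single_value_grouping(data):
    """Return (value, frequency, relative frequency) for each distinct value."""
    counts = Counter(data)
    n = len(data)
    return [(value, counts[value], counts[value] / n) for value in sorted(counts)]

table = single_value_grouping(tv_sets)
print("value  frequency  relative frequency")
for value, freq, rel in table:
    print(f"{value:>5}  {freq:>9}  {rel:>18.3f}")
```

With only five distinct values the table stays short, which is why this method fits discrete data with few distinct values.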
(2) Limit grouping: uses class limits.
Lower class limit: The smallest value that could go in a class.
Upper class limit: The largest value that could go in a class.
Class width: The difference between the lower limit of a class and the lower limit of the next-higher class.
Class mark: The average of the two class limits of a class.
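Guidelines:
- The basis for choosing the classes should be appropriate.
- Each observation must belong to exactly one class.
- All classes should have the same width.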
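The limit-grouping terms and guidelines above can be sketched as follows. This is an illustration under stated assumptions: the data are made up, they are whole numbers (so an upper limit is one less than the next lower limit), and the helper names `limit_classes` and `limit_grouping` are hypothetical.

```python
# Hypothetical sample: days to maturity for 12 short-term investments.
days = [70, 64, 99, 55, 64, 89, 87, 65, 62, 38, 67, 70]

def limit_classes(first_lower, width, count):
    """Build (lower limit, upper limit) pairs of equal width for whole-number data."""
    # For whole-number data the upper limit is one less than the next lower limit.
    return [(first_lower + i * width, first_lower + (i + 1) * width - 1)
            for i in range(count)]

def limit_grouping(data, classes):
    """Count the observations in each class; both limits are inclusive."""
    freqs = [0] * len(classes)
    for x in data:
        for i, (lower, upper) in enumerate(classes):
            if lower <= x <= upper:
                freqs[i] += 1
                break
        else:
            # Enforces the guideline that every observation belongs to a class.
            raise ValueError(f"{x} does not belong to any class")
    return freqs

classes = limit_classes(first_lower=30, width=10, count=7)   # 30-39, ..., 90-99
freqs = limit_grouping(days, classes)
marks = [(lower + upper) / 2 for lower, upper in classes]    # class marks

for (lower, upper), f, m in zip(classes, freqs, marks):
    print(f"{lower}-{upper}  frequency={f}  mark={m}")
```

Note that the class width here is 10 (difference of consecutive lower limits), not 9 (upper limit minus lower limit of the same class).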
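(3) Cutpoint grouping: uses class cutpoints (for decimal, i.e. float, data?)
Lower cutpoint: plays the same role as the lower class limit above.
Upper cutpoint: plays the role of the upper class limit above, except that it is the smallest value of the next-higher class, so it does not itself belong to the class.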
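Rounding error (roundoff error): because the computed relative frequencies are kept to only a finite number of digits, their sum may differ slightly from 1 (for example, it may come out slightly less than 1).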
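A small sketch of the rounding error, assuming made-up class frequencies and relative frequencies rounded to three decimal places (the helper name `rounded_relative` is hypothetical):

```python
def rounded_relative(frequencies, digits=3):
    """Relative frequencies rounded to a fixed number of decimal places."""
    n = sum(frequencies)
    return [round(f / n, digits) for f in frequencies]

low = rounded_relative([1, 1, 1])    # each class is 1/3, rounded down to 0.333
high = rounded_relative([1, 1, 4])   # 1/6 and 4/6 both round up

print(low, sum(low))     # the sum falls slightly below 1
print(high, sum(high))   # the sum falls slightly above 1
```

The exact relative frequencies always sum to 1; only the rounded values drift.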
Lower class cutpoint: The smallest value that could go in a class.
Upper class cutpoint: The smallest value that could go in the next-higher class (equivalent to the lower cutpoint of the next-higher class).
Class width: The difference between the cutpoints of a class. (On a number line, the cutpoints are the break points between classes.)
Class midpoint: The average of the two cutpoints of a class.
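A minimal sketch of cutpoint grouping, assuming made-up decimal data and cutpoints (`weights`, `cutpoints`, and `cutpoint_grouping` are illustrative names). A value equal to a cutpoint goes into the higher class, matching the definition of the upper cutpoint:

```python
from bisect import bisect_right

# Hypothetical sample: weights in pounds, recorded to one decimal place.
weights = [129.2, 185.3, 218.1, 182.5, 142.8, 155.2, 170.0, 151.3, 187.5, 145.6]

# Class i runs from cutpoints[i] up to, but not including, cutpoints[i + 1].
cutpoints = [120, 140, 160, 180, 200, 220]

def cutpoint_grouping(data, cutpoints):
    """Count the observations in each class 'lower cutpoint to under upper cutpoint'."""
    freqs = [0] * (len(cutpoints) - 1)
    for x in data:
        if not cutpoints[0] <= x < cutpoints[-1]:
            raise ValueError(f"{x} lies outside all classes")
        # bisect_right sends a value equal to a cutpoint into the higher class.
        freqs[bisect_right(cutpoints, x) - 1] += 1
    return freqs

freqs = cutpoint_grouping(weights, cutpoints)
widths = [upper - lower for lower, upper in zip(cutpoints, cutpoints[1:])]
midpoints = [(lower + upper) / 2 for lower, upper in zip(cutpoints, cutpoints[1:])]

for lower, upper, f, m in zip(cutpoints, cutpoints[1:], freqs, midpoints):
    print(f"{lower}-under {upper}  frequency={f}  midpoint={m}")
```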
<Histograms>: like a bar chart, but the bars in a histogram are positioned so that they touch each other.
Note: Some statisticians and technologies use class marks or class midpoints centered under the bars.
Features of the graphs:
1. The frequency histogram and the relative-frequency histogram of a data set have the same shape. (All relative-frequency histograms share the same vertical scale, with a minimum of 0 and a maximum of 1, which makes direct comparison easy.)
2. With single-value grouping, label each bar with the single value of its class.
3. With cutpoint grouping, label the bars with the cutpoints (with limit grouping, the class limits).
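A rough sketch of why the two histograms have the same shape, assuming made-up cutpoints and frequencies and drawing the bars sideways as text (the helper `text_histogram` is hypothetical). Every relative frequency is the frequency divided by the same n, so the bar heights are proportional:

```python
cutpoints = [120, 140, 160, 180, 200, 220]   # hypothetical class cutpoints
frequencies = [1, 4, 1, 3, 1]                # hypothetical class frequencies
n = sum(frequencies)
relative = [f / n for f in frequencies]      # relative frequencies, each between 0 and 1

def text_histogram(labels, heights, scale):
    """One line per class; the bar length is round(height * scale)."""
    return [f"{label:>5} | " + "#" * round(h * scale)
            for label, h in zip(labels, heights)]

# Bars are labeled with the lower cutpoint of each class.
freq_bars = text_histogram(cutpoints[:-1], frequencies, scale=1)
rel_bars = text_histogram(cutpoints[:-1], relative, scale=n)

print("\n".join(freq_bars))
print(freq_bars == rel_bars)   # the two pictures coincide once rescaled by n
```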
<Dotplots>: similar to histograms (suited to data with many single decimal values; easy to construct and use).
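A minimal text dotplot, assuming a small made-up set of whole-number scores (the helper `dotplot` is hypothetical). Each observation becomes one dot stacked above its value on the horizontal axis:

```python
from collections import Counter

# Hypothetical sample: quiz scores for 10 students.
scores = [7, 8, 8, 9, 7, 8, 10, 8, 9, 6]

def dotplot(data):
    """Return the lines of a text dotplot, top row first, axis last."""
    counts = Counter(data)
    values = list(range(min(data), max(data) + 1))
    tallest = max(counts.values())
    lines = []
    for level in range(tallest, 0, -1):
        # Draw a dot in this row for every value observed at least `level` times.
        row = "".join(" o " if counts[v] >= level else "   " for v in values)
        lines.append(row.rstrip())
    lines.append("".join(f"{v:^3}" for v in values).rstrip())   # horizontal axis
    return lines

lines = dotplot(scores)
print("\n".join(lines))
```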
<Stem-and-Leaf>: an abstract version of a histogram (for decimal data?)
Example: for 40.000, 41.000, 40.009, 40.789, use a 5-column stem-and-leaf diagram.
Drawback: can be awkward with data containing many digits.
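A minimal sketch of a stem-and-leaf diagram, assuming made-up two-digit whole numbers so that the tens digit is the stem and the units digit is the leaf (the helper `stem_and_leaf` is hypothetical). Data with many digits would first have to be rounded or truncated, which is where the awkwardness noted above comes in:

```python
from collections import defaultdict

# Hypothetical sample: days to maturity for 12 short-term investments.
days = [70, 64, 99, 55, 64, 89, 87, 65, 62, 38, 67, 70]

def stem_and_leaf(data):
    """Map each stem (tens and above) to its sorted leaves (units digits)."""
    stems = defaultdict(list)
    for x in sorted(data):
        stems[x // 10].append(x % 10)
    # Keep empty stems so that gaps in the data remain visible.
    return {stem: stems.get(stem, []) for stem in range(min(stems), max(stems) + 1)}

diagram = stem_and_leaf(days)
for stem, leaves in diagram.items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
```

Read sideways, the rows of leaves have the same outline as a histogram with classes 30-39, 40-49, and so on.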