92 Chapter 1 • Analyzing One-Variable Data

also use a bar chart to compare the distribution of a categorical variable in two or more groups.
• Use frequencies or relative frequencies (percentages or proportions) to describe the distribution of a categorical variable.
• Beware of graphs that mislead the eye. Look at the scales to see if they have been distorted to create a particular impression. Avoid making graphs that replace the bars of a bar chart with pictures whose height and width both change.

Quantitative Variables
• You can use a dotplot, stemplot, or histogram to show the distribution of a quantitative variable. A dotplot displays individual values on a number line. Stemplots separate each observation into a stem and a one-digit leaf. Histograms plot the frequencies (counts) or relative frequencies (percentages or proportions) of values in equal-width intervals.
• Histograms are appropriate for quantitative data; bar charts are appropriate for categorical data.
• Be sure to use relative frequencies when comparing data sets of different sizes.
• When examining any graph of quantitative data, describe the overall pattern and any clear departures from that pattern, in context. Shape, center, and variability describe the overall pattern of the distribution of a quantitative variable. Outliers are observations that lie outside the overall pattern of a distribution. Always look for outliers and try to explain them.
• Some distributions have simple shapes, such as roughly symmetric, skewed to the left, or skewed to the right.
The number of peaks is another important characteristic of shape.
• When comparing distributions of quantitative data, be sure to compare shape, outliers, center, and variability in context.

Numerical Summaries for Quantitative Data
• The median and the mean measure the center of a distribution in different ways. The median is the midpoint of the distribution, the number such that about half the observations are smaller and half are larger. The mean is the average of the observations: $\bar{x} = \frac{\sum x_i}{n}$
• The simplest measure of variability for a distribution of quantitative data is the range, which is the distance from the maximum value to the minimum value: range = maximum – minimum.
• When you use the mean to describe the center of a distribution, measure variability using the standard deviation. The standard deviation gives the typical distance of the values in a distribution from the mean. In symbols, the sample standard deviation is given by $s_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$. The standard deviation $s_x$ is zero when there is no variability and gets larger as variability from the mean increases.
• When you use the median to describe the center of a distribution, measure its variability using the interquartile range (IQR). The first quartile Q1 has about one-fourth of the observations below it, and the third quartile Q3 has about three-fourths of the observations below it. The IQR measures variability in the middle half of the distribution: IQR = Q3 – Q1.
• The median is a resistant measure of center because it tends not to be affected by extreme values. The mean is not resistant. Among measures of variability, the IQR is resistant, but the standard deviation and range are not.
• The mean and standard deviation are good measures for roughly symmetric distributions with no outliers.
The median and IQR are better measures for skewed distributions or distributions with outliers.
• According to the 1.5 × IQR rule, an observation is an outlier if it is smaller than Q1 – 1.5 × IQR or larger than Q3 + 1.5 × IQR.
• Boxplots are based on the five-number summary of a distribution, consisting of the minimum, Q1, the median, Q3, and the maximum. The box shows the variability in the middle half of the distribution. The median is marked within the box. Lines extend from the box to the smallest and the largest observations that are not outliers. Outliers are plotted with special symbols. Boxplots are especially useful for comparing distributions.

© Bedford, Freeman & Worth Publishers. For review purposes only. Do not distribute.
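
The numerical summaries reviewed above can be checked on a small data set. The sketch below is illustrative only: the sample values are made up, the function names are hypothetical, and the quartiles are computed as the medians of the lower and upper halves of the ordered data (leaving out the overall median when the number of observations is odd). That quartile convention is an assumption, since this page says only that Q1 has about one-fourth of the observations below it; software often uses slightly different rules.

```python
from statistics import mean, median, stdev


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data, excluding the overall median when n is odd.
    Requires at least two observations.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]          # observations left of the median position
    upper = data[half + n % 2:]  # observations right of the median position
    return data[0], median(lower), median(data), median(upper), data[-1]


def outliers_by_iqr_rule(values):
    """Return the observations flagged by the 1.5 x IQR rule."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]


# Made-up sample with one unusually large value.
sample = [2, 4, 4, 5, 6, 7, 8, 9, 30]

print("mean:", mean(sample))
print("sample standard deviation:", stdev(sample))  # divides by n - 1
print("range:", max(sample) - min(sample))
print("five-number summary:", five_number_summary(sample))  # (2, 4.0, 6, 8.5, 30)
print("outliers:", outliers_by_iqr_rule(sample))            # [30]
```

Here Q1 = 4 and Q3 = 8.5, so IQR = 4.5 and the upper cutoff is 8.5 + 1.5 × 4.5 = 15.25; the value 30 lies beyond it and is flagged. Removing 30 changes the mean and standard deviation noticeably but moves the median only from 6 to 5.5, which illustrates why the median and IQR are called resistant.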
                                
   126   127   128   129   130   131   132   133   134   135   136