What can we say about the mean and the median when the distribution is symmetric but not bell shaped?

Measures of Shape

What is a measure of shape?

Measures of shape describe the distribution (or pattern) of the data within a dataset.

The distribution shape of quantitative data can be described because there is a logical order to the values, and the 'low' and 'high' end values on the x-axis of the histogram can be identified.

The distribution shape of qualitative data cannot be described as the data are not numeric.


What are the shapes of a dataset?

A distribution of data item values may be symmetrical or asymmetrical. Two common examples of symmetry and asymmetry are the 'normal distribution' and the 'skewed distribution'.

In a symmetrical distribution the two sides of the distribution are a mirror image of each other.

A normal distribution is a true symmetric distribution of observed values.

When a histogram is constructed on values that are normally distributed, the columns form a symmetrical bell shape. This is why this distribution is also known as a 'normal curve' or 'bell curve'.

The following graph is an example of a normal distribution:


If represented as a 'normal curve' (or bell curve) the graph would take the following shape (where µ = mean, and σ = standard deviation):


Key features of the normal distribution:

  • symmetrical shape
  • mode, median and mean are the same and are together in the centre of the curve
  • there can only be one mode (i.e. there is only one value which is most frequently observed)
  • most of the data are clustered around the centre, while the more extreme values on either side of the centre become rarer as the distance from the centre increases (i.e. about 68% of values lie within one standard deviation (σ) of the mean; about 95% of the values lie within two standard deviations; and about 99.7% lie within three standard deviations. This is known as the empirical rule or the 3-sigma rule.)
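
The percentages in the empirical rule can be checked numerically. The sketch below assumes an exactly normal distribution and uses the standard relationship between the normal distribution and the error function; the helper name `normal_within` is hypothetical and chosen only for this illustration.

```python
import math

def normal_within(k: float) -> float:
    """Probability that a normally distributed value lies within
    k standard deviations of the mean (exact normal model assumed)."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_within(k):.4f}")
```

Real data are rarely exactly normal, so observed proportions will usually differ somewhat from these values.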

    In an asymmetrical distribution the two sides will not be mirror images of each other.

    Skewness is the tendency for the values to be more frequent around the high or low ends of the x-axis.

    When a histogram is constructed for skewed data it is possible to identify skewness by looking at the shape of the distribution.

    For example:

    A distribution is said to be positively skewed when the tail on the right side of the histogram is longer than the left side. Most of the values tend to cluster toward the left side of the x-axis (i.e. the smaller values) with increasingly fewer values at the right side of the x-axis (i.e. the larger values).

    A distribution is said to be negatively skewed when the tail on the left side of the histogram is longer than the right side. Most of the values tend to cluster toward the right side of the x-axis (i.e. the larger values), with increasingly fewer values on the left side of the x-axis (i.e. the smaller values).

    Key features of the skewed distribution:

  • asymmetrical shape
  • mean and median have different values and do not both lie at the centre of the curve
  • there can be more than one mode
  • the distribution of the data tends towards the high or low end of the dataset

    What are the other possible distribution shapes?

    Other distributions include uni-modal, bi-modal, or multimodal.

    A uni-modal distribution occurs if there is only one 'peak' (a highest point) in the distribution, as seen in the previous histograms. This means there is one mode (a value that occurs more frequently than any other) for the data item (variable).

    The distribution shape of the data in the histogram below is bi-modal because there are two modes (two values that occur more frequently than any other) for the data item (variable).
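
As a rough illustration of counting modes, the sketch below uses `statistics.multimode` from the Python standard library (Python 3.8 or later), which returns every value tied for the highest frequency. Both data lists are hypothetical and were made up only for this example.

```python
from statistics import multimode

# Hypothetical values, not taken from the text.
one_peak = [1, 2, 3, 3, 3, 4, 5]
two_peaks = [1, 2, 2, 2, 3, 4, 5, 5, 5, 6]

def describe_modes(values):
    """Return the list of modes and a label based on how many there are."""
    modes = multimode(values)
    labels = {1: "uni-modal", 2: "bi-modal"}
    return modes, labels.get(len(modes), "multimodal")

print(describe_modes(one_peak))
print(describe_modes(two_peaks))
```

This only detects exact ties for the most frequent value. A histogram can still look bi-modal when its two peaks have different heights, which this simple check would not report.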


    Why are measures of shape useful?

    The shape of the distribution can assist with identifying other descriptive statistics, such as which measure of central tendency is appropriate to use.

    If the data are normally distributed, the mean, median and mode are all equal, and therefore are all appropriate measures of central tendency.

    If data are skewed, the median may be a more appropriate measure of central tendency.
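
One way to see why the median may be preferred for skewed data is to add a single extreme value and compare how far each measure moves. The values below are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

# Hypothetical values chosen only for illustration; they are not from the text.
base = [2, 3, 3, 4, 5]
with_extreme = base + [40]  # one extreme value creates a long right tail

mean_shift = mean(with_extreme) - mean(base)
median_shift = median(with_extreme) - median(base)

print(f"mean:   {mean(base)} -> {mean(with_extreme)}")
print(f"median: {median(base)} -> {median(with_extreme)}")
```

In this sketch the mean moves much further than the median, which stays close to where most of the values lie.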



    Consider the following data set.

    4; 5; 6; 6; 6; 7; 7; 7; 7; 7; 7; 8; 8; 8; 9; 10

    This data set can be represented by the following histogram. Each interval has width one, and each value is located in the middle of an interval.

    Figure \(\PageIndex{1}\)

    The histogram displays a symmetrical distribution of data. A distribution is symmetrical if a vertical line can be drawn at some point in the histogram such that the shape to the left and the right of the vertical line are mirror images of each other. The mean, the median, and the mode are each seven for these data. In a perfectly symmetrical distribution, the mean and the median are the same. This example has one mode (unimodal), and the mode is the same as the mean and median. In a symmetrical distribution that has two modes (bimodal), the two modes would be different from the mean and median.
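
The claims about this data set can be checked with a short sketch. It uses the Python standard library, and the helper `is_mirror_symmetric` is a hypothetical name for one possible way to express the "mirror image about a vertical line" idea in terms of value counts.

```python
from collections import Counter
from statistics import mean, median, mode

data = [4, 5, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 9, 10]

def is_mirror_symmetric(values, centre):
    """True if every value v occurs as often as its reflection 2*centre - v."""
    counts = Counter(values)
    return all(counts.get(v, 0) == counts.get(2 * centre - v, 0)
               for v in list(counts))

print(mean(data), median(data), mode(data))
print(is_mirror_symmetric(data, centre=7))
```

The check works on raw values, so it suits data like these where each value sits in the middle of a width-one interval.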

    The histogram for the data: 4; 5; 6; 6; 6; 7; 7; 7; 7; 8 is not symmetrical. The right-hand side seems "chopped off" compared to the left side. A distribution of this type is called skewed to the left because it is pulled out to the left.

    Figure \(\PageIndex{2}\)

    The mean is 6.3, the median is 6.5, and the mode is seven. Notice that the mean is less than the median, and they are both less than the mode. The mean and the median both reflect the skewing, but the mean reflects it more so.

    The histogram for the data: 6; 7; 7; 7; 7; 8; 8; 8; 9; 10, is also not symmetrical. It is skewed to the right.

    Figure \(\PageIndex{3}\)

    The mean is 7.7, the median is 7.5, and the mode is seven. Of the three statistics, the mean is the largest, while the mode is the smallest. Again, the mean reflects the skewing the most.

    Generally, if the distribution of data is skewed to the left, the mean is less than the median, which is often less than the mode. If the distribution of data is skewed to the right, the mode is often less than the median, which is less than the mean.
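
The direction of skew can also be summarised with a single number. The sketch below computes the moment coefficient of skewness, a common measure that the text itself does not define, so treat it as an added assumption. A negative value suggests a left skew, a positive value a right skew, and a value near zero suggests symmetry.

```python
from statistics import mean

def moment_skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5, using population moments."""
    n = len(values)
    centre = mean(values)
    m2 = sum((x - centre) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - centre) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# The three data sets discussed in the text.
symmetric = [4, 5, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 9, 10]
left_skewed = [4, 5, 6, 6, 6, 7, 7, 7, 7, 8]
right_skewed = [6, 7, 7, 7, 7, 8, 8, 8, 9, 10]

for name, values in [("symmetric", symmetric),
                     ("left skewed", left_skewed),
                     ("right skewed", right_skewed)]:
    print(f"{name}: skewness = {moment_skewness(values):.3f}")
```

The two skewed data sets are mirror images of each other, so their coefficients have the same size and opposite signs.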

    Skewness and symmetry become important when we discuss probability distributions in later chapters.

    Example \(\PageIndex{1}\)

    Statistics are used to compare and sometimes identify authors. The following lists show a simple random sample that compares the letter counts for three authors.

    • Terry: 7; 9; 3; 3; 3; 4; 1; 3; 2; 2
    • Davis: 3; 3; 3; 4; 1; 4; 3; 2; 3; 1
    • Maris: 2; 3; 4; 4; 4; 6; 6; 6; 8; 3
    1. Make a dot plot for the three authors and compare the shapes.
    2. Calculate the mean for each.
    3. Calculate the median for each.
    4. Describe any pattern you notice between the shape and the measures of center.

    Solution

    1. Figure \(\PageIndex{4}\): Terry’s distribution has a right (positive) skew.

      Figure \(\PageIndex{5}\): Davis’ distribution has a left (negative) skew.

      Figure \(\PageIndex{6}\): Maris’ distribution is symmetrically shaped.
    2. Terry’s mean is 3.7, Davis’ mean is 2.7, Maris’ mean is 4.6.
    3. Terry’s median is three, Davis’ median is three, and Maris’ median is four.
    4. It appears that the median is always closest to the high point (the mode), while the mean tends to be farther out on the tail. In a symmetrical distribution, the mean and the median are both centrally located close to the high point of the distribution.
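
The means and medians in this solution can be reproduced with a short sketch using the Python standard library. The dictionary name `letter_counts` is an assumption made for this illustration.

```python
from statistics import mean, median

# Letter counts for the three authors, as listed in the example.
letter_counts = {
    "Terry": [7, 9, 3, 3, 3, 4, 1, 3, 2, 2],
    "Davis": [3, 3, 3, 4, 1, 4, 3, 2, 3, 1],
    "Maris": [2, 3, 4, 4, 4, 6, 6, 6, 8, 3],
}

# Map each author to a (mean, median) pair.
summary = {name: (mean(counts), median(counts))
           for name, counts in letter_counts.items()}

for name, (avg, mid) in summary.items():
    print(f"{name}: mean = {avg}, median = {mid}")
```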

    Exercise \(\PageIndex{1}\)

    Discuss the mean, median, and mode for each of the following problems. Is there a pattern between the shape and measure of the center?

    a.

    Figure \(\PageIndex{7}\).

    b.

    The Ages at Which Former U.S. Presidents Died
    4 | 6 9
    5 | 3 6 7 7 7 8
    6 | 0 0 3 3 4 4 5 6 7 7 7 8
    7 | 0 1 1 2 3 4 7 8 8 9
    8 | 0 1 3 5 8
    9 | 0 0 3 3
    Key: 8|0 means 80.

    c.

    Figure \(\PageIndex{1}\)

    Looking at the distribution of data can reveal a lot about the relationship between the mean, the median, and the mode. There are three types of distributions. A left (or negative) skewed distribution has a shape like Figure \(\PageIndex{2}\). A right (or positive) skewed distribution has a shape like Figure \(\PageIndex{3}\). A symmetrical distribution looks like Figure \(\PageIndex{1}\).

    Use the following information to answer the next three exercises: State whether the data are symmetrical, skewed to the left, or skewed to the right.

    Exercise 2.7.2

    1; 1; 1; 2; 2; 2; 2; 3; 3; 3; 3; 3; 3; 3; 3; 4; 4; 4; 5; 5

    Answer

    The data are symmetrical. The median is 3 and the mean is 2.85. They are close, and the mode lies close to the middle of the data, so the data are symmetrical.

    Exercise 2.7.3

    16; 17; 19; 22; 22; 22; 22; 22; 23

    Exercise 2.7.4

    87; 87; 87; 87; 87; 88; 89; 89; 90; 91

    Answer

    The data are skewed right. The median is 87.5 and the mean is 88.2. Even though they are close, the mode lies to the left of the middle of the data, and there are many more instances of 87 than any other number, so the data are skewed right.

    Exercise 2.7.5

    When the data are skewed left, what is the typical relationship between the mean and median?

    Exercise 2.7.6

    When the data are symmetrical, what is the typical relationship between the mean and median?

    Answer

    When the data are symmetrical, the mean and median are close or the same.

    Exercise 2.7.7

    What word describes a distribution that has two modes?

    Exercise 2.7.8

    Describe the shape of this distribution.

    Figure \(\PageIndex{9}\)

    Answer

    The distribution is skewed right because it looks pulled out to the right.

    Exercise 2.7.9

    Describe the relationship between the mode and the median of this distribution.

    Figure \(\PageIndex{10}\)

    Exercise 2.7.10

    Describe the relationship between the mean and the median of this distribution.

    Figure \(\PageIndex{11}\)

    Answer

    The mean is 4.1 and is slightly greater than the median, which is four.

    Exercise 2.7.11

    Describe the shape of this distribution.

    Figure \(\PageIndex{12}\)

    Exercise 2.7.12

    Describe the relationship between the mode and the median of this distribution.

    Figure \(\PageIndex{13}\)

    Answer

    The mode and the median are the same. In this case, they are both five.

    Exercise 2.7.13

    Are the mean and the median the exact same in this distribution? Why or why not?

    Figure \(\PageIndex{14}\)

    Exercise 2.7.14

    Describe the shape of this distribution.

    Figure \(\PageIndex{15}\)

    Answer

    The distribution is skewed left because it looks pulled out to the left.

    Exercise 2.7.15

    Describe the relationship between the mode and the median of this distribution.

    Figure \(\PageIndex{16}\)

    Exercise 2.7.16

    Describe the relationship between the mean and the median of this distribution.

    Figure \(\PageIndex{17}\)

    Answer

    The mean and the median are both six.

    Exercise 2.7.17

    The mean and median for the data are the same.

    3; 4; 5; 5; 6; 6; 6; 6; 7; 7; 7; 7; 7; 7; 7

    Are the data perfectly symmetrical? Why or why not?

    Exercise 2.7.18

    Which is the greatest, the mean, the mode, or the median of the data set?

    11; 11; 12; 12; 12; 12; 13; 15; 17; 22; 22; 22

    Answer

    The mode is 12, the median is 12.5, and the mean is 15.1. The mean is the largest.

    Exercise 2.7.19

    Which is the least, the mean, the mode, or the median of the data set?

    56; 56; 56; 58; 59; 60; 62; 64; 64; 65; 67

    Exercise 2.7.20

    Of the three measures, which tends to reflect skewing the most, the mean, the mode, or the median? Why?

    Answer

    The mean tends to reflect skewing the most because it is affected the most by outliers.

    Exercise 2.7.21

    In a perfectly symmetrical distribution, when would the mode be different from the mean and median?