Running speed and ability are known to be correlated with both physical sex and a person's general level of athleticism. In the sample dataset, there are several variables relating to this question:
Let's use the Compare Means procedure to summarize the relationship between running ability, athletics, and gender.
First, we will summarize the mile times without the grouping variables using the mean, standard deviation, sample size, minimum, and maximum.

Running the Procedure

Using the Compare Means Dialog Window
Using Syntax

MEANS TABLES=MileMinDur
  /CELLS=MEAN COUNT STDDEV MIN MAX.

Output

The Compare Means procedure will report two tables: the Case Processing Summary, which contains information about the number of valid cases that the statistics are based on, and the Report table, which contains the descriptive statistics themselves. The average mile time overall was 8 minutes, 9 seconds, with a standard deviation of about 2 minutes. The fastest mile time was about 5 minutes; the slowest was about 14 minutes.
Now let's look at how the mile times vary with respect to whether or not someone is an athlete. Note that Compare Means with one layer produces results that are similar to using the Split File technique with the Descriptives procedure. The major difference between using Compare Means and viewing the Descriptives with Split File enabled is that Compare Means does not treat missing values as an additional category; it simply drops those cases from the analysis. Compare Means is limited to listwise exclusion: there must be valid values on each of the dependent and independent variables for a given table.

Running the Procedure

Using the Compare Means Dialog Window

If you are continuing the example from the first section, you will only need to do step 3.
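For comparison, the Split File alternative mentioned above could be run roughly as follows. This is a sketch rather than part of the walkthrough: it assumes the same MileMinDur and Athlete variables, and it re-sorts the active dataset by Athlete as a side effect.

```spss
* Sort so that the cases in each Athlete group are contiguous.
SORT CASES BY Athlete.

* Report Descriptives separately for each Athlete group.
SPLIT FILE LAYERED BY Athlete.
DESCRIPTIVES VARIABLES=MileMinDur
  /STATISTICS=MEAN STDDEV MIN MAX.

* Turn Split File off again so later procedures use all cases together.
SPLIT FILE OFF.
```

Because Split File treats missing values of Athlete as an additional category, this output may contain an extra group that the Compare Means report leaves out.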
Using Syntax

MEANS TABLES=MileMinDur BY Athlete
  /CELLS=MEAN COUNT STDDEV MIN MAX.

Output

The Case Processing Summary table shows how many cases had nonmissing values for both the mile time and the athlete indicator variable. The Report table has the descriptive statistics with respect to each group, as well as the overall average mile time of the valid cases (n = 392). From this table, there are several observations we can make about the relationship between mile time and athletics in the sample:
Let's modify the one-layer analysis to report mile times by athlete status and gender together. Recall that there are two levels for Gender (Male and Female), and two levels for Athlete (Non-athlete and Athlete). This means that there are four possible factor level combinations:
When we run Compare Means with two layers, we will be able to simultaneously view the averages with respect to each possible factor combination. As mentioned before, Compare Means is limited to listwise exclusion, so a two-layer analysis requires that cases not have missing values for the dependent variable and all independent variables.

Running the Procedure

Using the Compare Means Dialog Window

If you are continuing the example from the previous section, you will only need to do step 4.
Note: Be careful that you put each factor on its own separate layer. It is easy to accidentally list two factor variables in the Independent List area for the first layer. (If more than one factor is listed on the first layer, it will produce multiple single-layer reports.) Your Independent List area should look like this:
Using Syntax

MEANS TABLES=MileMinDur BY Athlete BY Gender
  /CELLS=MEAN COUNT STDDEV MIN MAX.

Output

The Case Processing Summary table shows how many cases had nonmissing values for mile time, the athlete indicator, and gender. The Report table has the descriptive statistics with respect to each combination of the factors, as well as the total sample overall. Notice that because of listwise exclusion, there are now only 383 valid cases, whereas the single-layer report of mile time by athlete included 392 cases. Using this table, we can expand upon several observations we made from the single-layer table:
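The earlier note about keeping each factor on its own layer has a direct counterpart in syntax. As a brief sketch using the same variables as above, the two commands below differ only in whether a second BY keyword separates the factors:

```spss
* Both factors in the same layer: two separate single-layer reports.
MEANS TABLES=MileMinDur BY Athlete Gender
  /CELLS=MEAN COUNT STDDEV MIN MAX.

* One factor per layer: a single report crossing Athlete with Gender.
MEANS TABLES=MileMinDur BY Athlete BY Gender
  /CELLS=MEAN COUNT STDDEV MIN MAX.
```

If the output contains two Report tables rather than one table with nested rows, the most likely cause is that both factors ended up in the same layer.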