Group-Level Descriptive Statistics in SAS
In SAS, the NWAY option of PROC SUMMARY restricts the output to summary statistics at the group level, covering specific categories rather than the entire dataset.
This technique is especially useful when you want to aggregate data based on specific classifications.
Example: Implementing NWAY in PROC SUMMARY
For this illustration, we will use the built-in sashelp.Fish
dataset, which contains various measurements for 159 fish caught in a Finnish lake.
To begin, let’s check the first ten observations of the dataset for context:
/* View the first 10 observations from the Fish dataset */
proc print data=sashelp.Fish (obs=10);
run;
Calculating Group-Level Statistics Without NWAY
Next, we can execute PROC SUMMARY
to calculate descriptive statistics for the Weight
variable, organized by the Species
variable. Here’s the code:
/* Calculate descriptive statistics for Weight, grouped by Species */
proc summary data=sashelp.Fish;
var Weight;
class Species;
output out=summaryWeight;
run;
/* Print the output dataset */
proc print data=summaryWeight;
run;
Interpreting the Output
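The output dataset contains multiple rows. Because no statistics were requested explicitly, PROC SUMMARY writes its default set (N, MIN, MAX, MEAN, and STD), one row per statistic. The columns are:
- _TYPE_: Indicates whether the row summarizes the entire dataset or a specific group. A value of 0 means all observations were used; a value of 1 means the row belongs to a single Species.
- _FREQ_: The number of observations in the group, including any with a missing Weight.
- _STAT_: The name of the descriptive statistic.
- Weight: The calculated value of that statistic.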
For example, the first five rows (those with a _TYPE_ of 0) display statistics for the entire dataset:
- Nonmissing Weight values (N): 158
- Minimum weight: 0
- Maximum weight: 1,650
- Mean weight: 398.70
- Standard deviation: 359.09
Subsequent rows present the summary statistics for each individual Species, such as Bream and Parkki.
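Because _TYPE_ marks which rows summarize the whole dataset, the overall rows can be isolated with a WHERE= data set option. The following is a minimal sketch, assuming the summaryWeight dataset created by the previous step is still available in the WORK library:

```sas
/* Print only the overall (all-species) rows, where _TYPE_ = 0 */
proc print data=summaryWeight(where=(_TYPE_ = 0));
run;
```

Changing the condition to _TYPE_ = 1 would list only the per-species rows instead.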
Applying the NWAY Option
To restrict the output to the group-level statistics only, add the NWAY option to the PROC SUMMARY statement.
NWAY keeps only the rows with the highest _TYPE_ value, which is 1 here because there is a single CLASS variable, and omits the rows for the entire dataset.
Here’s how to implement it:
/* Calculate descriptive statistics for Weight, grouped by Species, using the NWAY option */
proc summary data=sashelp.Fish nway;
var Weight;
class Species;
output out=summaryWeight;
run;
/* Print the output dataset */
proc print data=summaryWeight;
run;
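With a single CLASS variable, as in the Fish example, the group rows have a _TYPE_ of 1. With two CLASS variables, PROC SUMMARY by default also produces rows for each variable on its own (_TYPE_ 1 and 2) and for their combination (_TYPE_ 3), and NWAY keeps only the combination rows. The sketch below illustrates this under the assumption that the sashelp.cars dataset, with its Origin, Type, and MSRP variables, is available; it is not part of the original example, and the summaryMSRP name is chosen here for illustration only:

```sas
/* Hypothetical extension: two CLASS variables from sashelp.cars */
proc summary data=sashelp.cars nway;
    var MSRP;
    class Origin Type;
    output out=summaryMSRP;
run;

/* Every remaining row has _TYPE_ = 3 (Origin and Type combined) */
proc print data=summaryMSRP;
run;
```

Removing NWAY from the PROC SUMMARY statement and rerunning the step should bring back the rows with _TYPE_ values of 0, 1, and 2.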
Observing the Result
With the NWAY option, the summary statistics for the overall dataset no longer appear.
Instead, the output shows only the statistics for each individual species, giving a clearer view of the data aggregated by category.
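The output above still holds one row per statistic for each species. If one row per species is preferred, the statistics can be named explicitly in the OUTPUT statement. This is a sketch of one possible layout, not the only one; the dataset name speciesWeight and the variable names nWeight, meanWeight, and stdWeight are assumptions chosen for illustration:

```sas
/* One row per Species, with each requested statistic in its own column */
proc summary data=sashelp.Fish nway;
    var Weight;
    class Species;
    output out=speciesWeight
           n=nWeight          /* count of nonmissing Weight values */
           mean=meanWeight    /* mean Weight */
           std=stdWeight;     /* standard deviation of Weight */
run;

/* Print the one-row-per-species result */
proc print data=speciesWeight;
run;
```

When statistics are named this way, the output dataset has no _STAT_ column; _TYPE_ and _FREQ_ are still included.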
Conclusion
The NWAY option in PROC SUMMARY gives you more precise control over the output of summary statistics by limiting it to group-level results.
This keeps the output dataset free of rows for the overall dataset when only the per-category statistics are needed.
The examples above can serve as a starting point for group-level summaries in your own SAS analyses.