Normality with PROC UNIVARIATE in SAS
In SAS, you can use PROC UNIVARIATE with the NORMAL option to run several normality tests on a variable in your dataset.
The basic syntax for this procedure is:
proc univariate data=my_data normal;
var my_variable;
run;
Example: Using PROC UNIVARIATE for Normality Testing
Consider a dataset containing information about different basketball players. Let’s create the dataset and perform normality tests on the points variable.
/* Create dataset */
data my_data;
input team $ points rebounds;
datalines;
A 12 8
A 12 8
A 12 8
A 23 9
A 20 12
A 14 7
A 14 7
B 20 2
B 20 5
B 29 4
B 14 7
B 20 2
B 20 2
B 20 5
;
run;
/* View dataset */
proc print data=my_data;
run;
To conduct normality tests on the points variable, we can execute the following code:
proc univariate data=my_data normal;
var points;
run;
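PROC UNIVARIATE prints several tables for each variable. If only the normality tests are of interest, one option is to place an ODS SELECT statement before the procedure. The sketch below assumes the ODS table name TestsForNormality, which is the name PROC UNIVARIATE uses for this table; the rest is the same code as above.

```sas
/* Show only the Tests for Normality table for the next procedure step */
ods select TestsForNormality;
proc univariate data=my_data normal;
var points;
run;
```

The ODS SELECT statement applies only to the procedure step that follows it, so later procedures print their full output again.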
Interpreting the Output
The output will include various tables, with one titled Tests for Normality that presents the results of the normality tests.
When the NORMAL option is specified, SAS performs four common normality tests and displays their statistics along with the corresponding p-values:
- Shapiro-Wilk Test: W = .867, p = .0383
- Kolmogorov-Smirnov Test: D = .237, p = .0318
- Cramer-von Mises Test: W-Sq = .152, p = .0200
- Anderson-Darling Test: A-Sq = .847, p = .0223
The hypotheses for these tests are as follows:
- Null Hypothesis (H0): The data are normally distributed.
- Alternative Hypothesis (HA): The data are not normally distributed.
Since the p-value for each test is below .05, we reject the null hypothesis in every case at the .05 significance level.
This gives us sufficient evidence to conclude that the points variable does not follow a normal distribution.
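If the test statistics and p-values are needed for further processing rather than just for reading, they can be written to a dataset with ODS OUTPUT. This is a minimal sketch: the output dataset name norm_tests is a hypothetical name chosen for illustration, and TestsForNormality is the ODS table name assumed for the Tests for Normality table.

```sas
/* Save the Tests for Normality table to a dataset (norm_tests is a hypothetical name) */
ods output TestsForNormality=norm_tests;
proc univariate data=my_data normal;
var points;
run;

/* View the saved test statistics and p-values */
proc print data=norm_tests;
run;
```

Each row of the saved dataset corresponds to one of the four tests, which makes it easier to compare p-values against a chosen significance level in later steps.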
Visualizing the Distribution
You can also create a histogram of the points variable, overlaying a normal curve for visual assessment:
proc univariate data=my_data;
var points;
histogram points / normal;
run;
From the histogram, it becomes evident that the distribution of points does not closely align with the normal curve, reinforcing the findings from our normality tests.
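A normal Q-Q plot is another common visual check and may be easier to read than a histogram with a sample this small. The sketch below uses the QQPLOT statement of PROC UNIVARIATE; the mu=est and sigma=est settings ask SAS to draw the reference line from the sample mean and standard deviation, and NOPRINT is an optional choice here to suppress the summary tables.

```sas
/* Normal Q-Q plot with a reference line based on the estimated mean and standard deviation */
proc univariate data=my_data noprint;
qqplot points / normal(mu=est sigma=est);
run;
```

Points that fall close to the reference line are consistent with normality, while systematic curvature or stair-step patterns suggest a departure from it.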
Conclusion
Using PROC UNIVARIATE with the NORMAL option in SAS lets you formally test for normality, while plotting statements such as HISTOGRAM provide a visual check of the distribution of your data.