PROC RANK in SAS to Calculate Rankings
PROC RANK in SAS is a versatile procedure that ranks one or more numeric variables in a dataset.
Below are four common methods of using PROC RANK.
Method 1: Rank One Variable
To rank a single variable, you can use the following syntax:
proc rank data=original_data out=ranked_data;
var var1;
ranks var1_rank;
run;
Method 2: Rank One Variable by Group
To rank a variable separately within each level of another variable, add a by statement:
proc rank data=original_data out=ranked_data;
var var1;
by var2;
ranks var1_rank;
run;
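By-group processing expects the input to be sorted by the by variable, so an unsorted dataset would first need a PROC SORT step. A minimal sketch, assuming original_data is not yet sorted by var2 (sorted_data is a hypothetical name for the intermediate dataset):

```sas
/* Sort by the grouping variable first (sorted_data is a hypothetical name) */
proc sort data=original_data out=sorted_data;
    by var2;
run;

/* Rank var1 separately within each level of var2 */
proc rank data=sorted_data out=ranked_data;
    var var1;
    by var2;
    ranks var1_rank;
run;
```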
Method 3: Rank One Variable into Percentiles
To assign observations to percentile groups, use the groups= option, which splits the data into the specified number of groups (for example, groups=4 for quartiles):
proc rank data=original_data groups=4 out=ranked_data;
var var1;
ranks var1_rank;
run;
Method 4: Rank Multiple Variables
You can also rank multiple variables at once by listing them in the var statement and giving one matching name per variable in the ranks statement:
proc rank data=original_data out=ranked_data;
var var1 var2;
ranks var1_rank var2_rank;
run;
Example Dataset
Let’s illustrate these methods with the following dataset representing basketball teams and their statistics:
/* Create dataset */
data original_data;
input team $ points rebounds;
datalines;
A 25 10
A 18 4
A 18 7
A 24 8
B 27 9
B 33 13
B 31 11
B 30 16
;
run;
/* View dataset */
proc print data=original_data;
run;
Example 1: Rank One Variable
Here is how to create a new variable named points_rank, which ranks the points values across all rows of the dataset:
/* Rank points across all observations */
proc rank data=original_data out=ranked_data;
var points;
ranks points_rank;
run;
/* View ranks */
proc print data=ranked_data;
run;
By default, PROC RANK ranks in ascending order: the smallest points value receives rank 1 and the largest receives the highest rank.
Tied values receive the average of the ranks they span; here the two rows with 18 points occupy ranks 1 and 2, so both receive a rank of 1.5.
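Averaging is only the default treatment of ties. As a sketch, the ties= option of PROC RANK accepts low, high, mean, and dense; the output dataset names ranked_low and ranked_dense below are hypothetical:

```sas
/* ties=low: tied values all receive the smallest of the ranks they span */
proc rank data=original_data ties=low out=ranked_low;
    var points;
    ranks points_rank;
run;

/* ties=dense: tied values share a rank and the next distinct value
   receives the next integer, so no rank numbers are skipped */
proc rank data=original_data ties=dense out=ranked_dense;
    var points;
    ranks points_rank;
run;
```

With the example data, ties=low should give both rows with 18 points a rank of 1 rather than 1.5.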
Add the descending option to reverse the order, so that the row with the most points receives rank 1:
/* Rank points across all observations in descending order */
proc rank data=original_data descending out=ranked_data;
var points;
ranks points_rank;
run;
/* View ranks */
proc print data=ranked_data;
run;
Example 2: Rank One Variable by Group
To rank points separately within each team, add a by statement. The example dataset is already sorted by team, which by-group processing requires:
/* Rank points scored, grouped by team */
proc rank data=original_data out=ranked_data;
var points;
by team;
ranks points_rank;
run;
/* View ranks */
proc print data=ranked_data;
run;
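Options on the PROC RANK statement can be combined with by-group ranking. A minimal sketch, assuming you want the highest points value on each team to receive rank 1 (the output name ranked_by_team is hypothetical):

```sas
/* Rank points within each team, largest value first */
proc rank data=original_data descending out=ranked_by_team;
    var points;
    by team;
    ranks points_rank;
run;

/* View ranks */
proc print data=ranked_by_team;
run;
```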
Example 3: Rank One Variable into Percentiles
You can assign the points values to quartiles using the groups= option:
/* Rank points into quartiles */
proc rank data=original_data groups=4 out=ranked_data;
var points;
ranks points_rank;
run;
/* View ranks */
proc print data=ranked_data;
run;
In this output, rows in the lowest quartile receive a value of 0, those in the next quartile receive 1, and so on up to 3. To divide the data into deciles instead, use groups=10.
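Because group values start at 0, reports often shift them to start at 1. The following sketch does that in a DATA step; the names ranked_quartiles, points_q0, and points_quartile are hypothetical:

```sas
/* Assign 0-based quartile groups */
proc rank data=original_data groups=4 out=ranked_quartiles;
    var points;
    ranks points_q0;
run;

/* Shift to 1-based quartiles (1 = lowest, 4 = highest) */
data ranked_quartiles;
    set ranked_quartiles;
    points_quartile = points_q0 + 1;
    drop points_q0;
run;
```

The SAS documentation describes the group value as floor(rank*k/(n+1)), where k is the groups= value and n is the number of nonmissing observations, so tied values always fall into the same group and group sizes can be uneven.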
Example 4: Rank Multiple Variables
To rank both points and rebounds simultaneously, you can use the following code:
/* Rank both points and rebounds */
proc rank data=original_data out=ranked_data;
var points rebounds;
ranks points_rank rebounds_rank;
run;
/* View ranks */
proc print data=ranked_data;
run;
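The ranks statement is what keeps the original values intact. As a sketch of the alternative, omitting ranks makes PROC RANK write the ranks into the points and rebounds columns themselves (the output name ranked_in_place is hypothetical):

```sas
/* No RANKS statement: points and rebounds in the output hold ranks,
   not the original values */
proc rank data=original_data out=ranked_in_place;
    var points rebounds;
run;

/* View ranks */
proc print data=ranked_in_place;
run;
```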
Conclusion
PROC RANK is an effective procedure in SAS for ranking numeric variables, whether you need to rank a single variable, rank by group, categorize into percentile groups, or rank multiple variables at once.
Combined with options such as descending and groups=, these methods cover most ranking tasks in data analysis and reporting.