What is Proc Tabulate in SAS


1. Basic form:

Here, the CLASS statement tells SAS which variables are categorical.
The TABLE statement declares up to three dimensions, in the order page dimension, row dimension, column dimension. Dimensions are separated by commas (",").
Note: if only one dimension is declared, it is the column dimension; if two are declared, they are the row dimension and the column dimension.
Tip: define the row and column dimensions first, and add the page dimension only once the results look right. This makes the work go faster.
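The original listing for the basic form is not available, so here is a minimal sketch of what it might look like. It assumes the SASHELP.BWEIGHT data set and its variables Married and Boy; the choice of variables is mine, for illustration only.

```sas
/* One dimension: Boy becomes the column dimension */
proc tabulate data=sashelp.bweight;
  class boy;
  table boy;
run;

/* Two dimensions: Married is the row dimension, Boy the column dimension */
proc tabulate data=sashelp.bweight;
  class married boy;
  table married, boy;
run;
```

With no statistic requested, the cells for CLASS variables contain counts (N).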

Example: using the SASHELP.BWEIGHT data set (infant birth weights along with various characteristics of the mother), show the number of newborns by race, mother's marital status, and sex.
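A possible version of this example is sketched below. It assumes that race, marital status, and sex are stored in the SASHELP.BWEIGHT variables Black, Married, and Boy, and it assumes a layout with race as the page dimension; the original code may have arranged the dimensions differently.

```sas
proc tabulate data=sashelp.bweight;
  class black married boy;      /* all three are categorical */
  /* page = Black, row = Married, column = Boy */
  table black, married, boy;
run;
```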

The results are as follows:

MISSING option
Some observations may have missing values for variables listed in the CLASS statement. By default, SAS leaves these observations out of the table entirely. If you add the MISSING option to the PROC TABULATE statement, SAS keeps them and reports missing as a category of its own.
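To see the effect, here is a small sketch. The data set WORK.BW_MISS is hypothetical: it is a copy of part of SASHELP.BWEIGHT in which some values of Married are set to missing on purpose, purely for illustration.

```sas
/* Hypothetical test data: blank out Married in every 10th row */
data work.bw_miss;
  set sashelp.bweight(obs=1000);
  if mod(_n_, 10) = 0 then married = .;
run;

/* MISSING keeps those rows and shows missing as its own category */
proc tabulate data=work.bw_miss missing;
  class married boy;
  table married, boy;
run;
```

Running the same step without MISSING drops those observations from the counts.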

2. Add descriptive statistics in PROC TABULATE

As mentioned above, the CLASS statement tells SAS which variables are categorical. There must be another statement that tells SAS which variables are continuous (analysis) variables: the VAR statement.
Basic form:
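The original listing is not available; a minimal sketch, assuming SASHELP.BWEIGHT with Married as the class variable and Weight as the analysis variable, might be:

```sas
proc tabulate data=sashelp.bweight;
  class married;          /* categorical variable */
  var weight;             /* continuous (analysis) variable */
  table married, weight;  /* no statistic given, so SAS reports SUM */
run;
```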

In addition, every variable used in the TABLE statement must be listed in either the CLASS or the VAR statement.
The dimensions in the TABLE statement can contain keywords as well as variables. Here are some commonly used keywords:

ALL: adds a total as a new page / row / column
N: number of non-missing values
NMISS: number of missing values
PCTN: percentage of the observation count that falls in the group
PCTSUM: percentage of the total sum that the group accounts for
STDDEV: standard deviation

Now that you know these keywords, you need to use them in the TABLE statement. There are three operations within a dimension: concatenation, crossing, and grouping.
Concatenation: list variables or keywords one after another, separated by spaces.

Crossing: join variables or keywords with an asterisk (*).

Grouping: enclose variables or keywords in parentheses ().
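The three operations can be sketched as follows. The variables (Married, Boy, Weight from SASHELP.BWEIGHT) are my own choice for illustration; one PROC TABULATE step may contain several TABLE statements, each producing its own table.

```sas
proc tabulate data=sashelp.bweight;
  class married boy;
  var weight;

  /* Concatenation: Married and Boy appear side by side */
  table married boy;

  /* Crossing: Boy is nested under each value of Married */
  table married*boy;

  /* Grouping: the parentheses apply the crossing to every item inside */
  table married, (boy all)*weight*(n mean);
run;
```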

Example: report the mean birth weight of babies by sex and by the mother's marital status. The code is as follows:
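The original code is not available. One way to write it, assuming the SASHELP.BWEIGHT variables Married, Boy, and Weight and putting the statistic in the column dimension, is:

```sas
proc tabulate data=sashelp.bweight;
  class married boy;
  var weight;
  /* rows: marital status; columns: sex, crossed with mean birth weight */
  table married, boy*weight*mean;
run;
```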

The results are as follows:

3. Fine-tuning: Control the appearance of the PROC TABULATE output

(1) FORMAT= option: controls the format of the values in the table cells, e.g. FORMAT=COMMA7.1;

Note: the COMMA format writes numbers with comma separators, .1 means one decimal place, and 7 is the total width of the output: 3,411.2 occupies 7 characters in all. If the width is too small for a value, SAS produces a different result. If you are interested you can give it a try; I will not go into it here.
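As a sketch, the FORMAT= option goes on the PROC TABULATE statement and applies to every cell in the table. The table layout below is an assumption carried over from the mean birth weight example.

```sas
proc tabulate data=sashelp.bweight format=comma7.1;
  class married boy;
  var weight;
  /* every cell is written with commas, 1 decimal place, width 7 */
  table married, boy*weight*mean;
run;
```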

(2) BOX= and MISSTEXT= options: both fill in empty cells, and both are written at the end of the TABLE statement, separated from the dimensions by a slash (/).
The BOX= option fills the empty upper left corner of the table with text, e.g.

Effect picture:

Similarly, the MISSTEXT= option tells SAS what text to put in empty data cells, e.g. MISSTEXT="None".
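Both options can be combined in one TABLE statement. In this sketch the box text "Mean birth weight" is my own wording, not taken from the original post.

```sas
proc tabulate data=sashelp.bweight format=comma7.1;
  class married boy;
  var weight;
  table married, boy*weight*mean
        / box='Mean birth weight'   /* text for the upper left corner */
          misstext='None';          /* text for cells with no data */
run;
```

In SASHELP.BWEIGHT every combination of Married and Boy is likely to have data, so MISSTEXT= may not visibly change anything here; it matters when some combination of class values has no observations.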

(3) Change the headers (variable names and variable values).
1) Variable values: use PROC FORMAT; I will not repeat that here.
2) Variable names (the variables declared in the CLASS statement):
Write ="text" directly after the variable name in the TABLE statement, for example Married="Married or not". To remove a header, e.g. the one for MEAN, set it to an empty string: MEAN="".
Note, however, that when statistic keywords (MEAN, SUM, etc.) are in the row dimension, declaring MEAN="" leaves an empty cell in place instead of removing it. In this case, add ROW=FLOAT at the end of the TABLE statement (it must be separated by /) to get rid of the empty header.
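Both situations can be sketched as follows. The label texts are illustrative, and the layouts are assumptions rather than the original code. The first step puts MEAN in the row dimension and therefore needs ROW=FLOAT; the second puts MEAN in the column dimension.

```sas
/* MEAN in the row dimension: ROW=FLOAT removes the blank header cell */
proc tabulate data=sashelp.bweight format=comma7.1;
  class married boy;
  var weight;
  table married='Married or not'*weight='Birth weight'*mean='', boy='Boy'
        / row=float;
run;

/* MEAN in the column dimension: the blank header simply disappears */
proc tabulate data=sashelp.bweight format=comma7.1;
  class married boy;
  var weight;
  table married='Married or not', boy='Boy'*weight='Birth weight'*mean='';
run;
```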

In this example the statistic MEAN is in the column dimension, so / ROW=FLOAT is not needed. The result is as follows:

Much clearer, isn't it?

(4) Set different formats for individual table cells.
Basic form: add *FORMAT= after the variable name in the TABLE statement.
This time we add the mother's weight gain during pregnancy (momwtgain) to the VAR statement and display it with 2 decimal places.
The code is as follows:
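The original listing is missing; a sketch of how it might look, assuming the same layout as before and a width of 6 for the new column (the width is my choice), is:

```sas
proc tabulate data=sashelp.bweight format=comma7.1;
  class married boy;
  var weight momwtgain;
  /* Weight uses the table-wide COMMA7.1; MomWtGain overrides it with 6.2 */
  table married,
        boy*(weight*mean momwtgain*mean*format=6.2);
run;
```

A format crossed with a variable in the TABLE statement takes precedence over the FORMAT= option on the PROC TABULATE statement for those cells only.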

The results are as follows:

That is the end of my notes on PROC TABULATE. I have been lazy, and many days passed before I finished writing them up. I will review them later.

The next post continues with statistical reports: SAS learning_5: Statistical report generation (2) - PROC REPORT. Stay tuned ~