SAS PROGRAMMING HANDOUT #3

STATISTICAL PROCEDURES

This handout describes several of the basic procedures in SAS. By default, a procedure uses the most recently created data set, but you can specify the data set to use in any procedure with the DATA= option.

PROC CONTENTS DATA=ONE;                    Gives information about the data set ONE.

PROC PRINT DATA=ONE;                       Prints the variables X and Y from the
   VAR X Y;                                data set ONE.

PROC MEANS DATA=ONE;                       Computes the mean and standard deviation
   VAR X Y;                                of X and Y and puts them in a data set
   OUTPUT OUT=TWO MEAN=MX MY STD=SX SY;    called TWO.
PROC PRINT DATA=TWO;
RUN;

DATA FOUR;                                 Attaches the single observation in TWO
   IF _N_=1 THEN SET TWO;                  (MX, MY, SX, SY) to every observation
   SET ONE;                                of ONE.
PROC PRINT DATA=FOUR;
RUN;

PROC SORT DATA=ONE OUT=THREE;              Sorts ONE by X and saves the sorted copy
   BY X;                                   as THREE.

Can also do:
   BY X Y;                                 Sorts by X, then by Y within X.
   BY DESCENDING X;                        Sorts by X in descending order.

PROC SORT DATA=ONE;                        Computes the mean and standard deviation
   BY X;                                   of Y for each level of X and puts them
PROC MEANS DATA=ONE;                       in the data set BB. The data must first
   BY X;                                   be sorted by X.
   VAR Y;
   OUTPUT OUT=BB MEAN=MY STD=SY;

PROC FREQ DATA=ONE;                        Creates a two-way table of frequency
   TABLES A*B / CHISQ OUT=EE;              counts for A and B (2x2 when each has
                                           two levels), does a chi-squared test,
                                           and puts the counts in the data set EE.

PROC UNIVARIATE DATA=ONE PLOT NORMAL;      Computes several statistics for Y:
   VAR Y;                                  percentiles, box plot, stem-and-leaf
   OUTPUT OUT=CC MEDIAN=MED P5=P5;         plot, and a test of normality. Puts the
                                           median and 5th percentile in CC.

PROC CORR DATA=ONE OUTP=ZZ;                Computes the correlation between X and Y
   VAR X Y;                                and puts the correlations in ZZ.

PROC REG DATA=ONE;                         Performs the linear regression y=a+bx,
   MODEL Y=X / P;                          computes the predicted values and
   OUTPUT OUT=DD P=P R=R;                  residuals, and puts them in DD.

You are strongly encouraged to use the SAS online help to learn more about the syntax of these basic procedures.

NOTE: Good SAS programming requires proper indentation.
DATA and PROC statements start in column 1; everything else starts in column 2 or higher.
Always specify the data set in a PROC.
Include comments, and use sensible data set and variable names.
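
As an illustration of the note on programming style, here is a minimal sketch of a short program laid out that way. The data set name SCORES, the variables GROUP, HOURS and SCORE, and the data values are all made up for this example; they do not come from the handout's data sets.

```sas
/* Hypothetical example data: exam scores by study group */
DATA SCORES;
   INPUT GROUP $ HOURS SCORE;
   DATALINES;
A 2 65
A 4 78
B 3 70
B 6 88
;
RUN;

/* Mean and standard deviation of SCORE; the data set is named explicitly */
PROC MEANS DATA=SCORES MEAN STD;
   VAR SCORE;
RUN;
```

DATA and PROC start in column 1, the statements inside each step are indented, each step has a comment, and every PROC names its data set with DATA= rather than relying on the most recently created one.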