Most junior research students did not have any experience with data manipulation and analysis with SAS and the Wharton Research Data Services (WRDS), which motivates me about this tutorial series.
This post explains how to use those handy macros on WRDS when you use remote submission, following the previous one on how to encode your WRDS password and use remote submission to run your code on the WRDS cloud.
Before everything, just make sure that this
autoexec.sas is located in the home folder on your WRDS cloud.
* The library name definitions below are used by SAS; * Assign default libref for WRDS (Wharton Research Data Services); %include '/wrds/lib/utility/wrdslib.sas'; options sasautos=('/wrds/wrdsmacros/', SASAUTOS) MAUTOSOURCE;
This code runs automatically when you’ve connected to the WRDS cloud. The first line assigns the default library references for you to use, e.g.
comp for Compustat. The second line makes available the macros. A list of these handy macros is available at the WRDS documentation.
If you don’t have this SAS code in the home folder, simply create one there or you can choose to include these two lines of code in your remotely submitted code.
Let’s say we want to winsorize a dataset by using the macro provided by WRDS (full code). Below is an example of winsorizing Total Assets
AT of Compustat sample by fiscal year from 1980 to 2018.
%let wrds=wrds-cloud.wharton.upenn.edu 4016; options comamid=TCP remote=WRDS; signon username=my_username password="ENCODEDPASSWORD"; rsubmit; /* Create a dataset in the work directory */ data work.funda(keep=gvkey fyear at); set comp.funda; if 1980 <= fyear <= 2018; /* Generic filter */ if indfmt='INDL' and datafmt='STD' and popsrc='D' and consol='C'; run; /* Invoke the macro */ /* The documentation is available at: https://wrds-www.wharton.upenn.edu/pages/support/research-wrds/macros/wrds-macros-winsorize/ */ %WINSORIZE(INSET=funda,OUTSET=funda_w,SORTVAR=fyear,VARS=at,PERC1=1,TRIM=0); /* Before the winsorization */ proc means data=work.funda; by fyear; var at; output out=funda_before_win min= mean= max= / autoname; run; /* After the winsorization */ proc means data=work.funda_w; by fyear; var at; output out=funda_after_win min= mean= max= / autoname; run; proc print data=funda_before_win; proc print data=funda_after_win; run; endrsubmit; signoff;
Invoking the macro is as simple as a single line:
However, one thing to note about this particular winsorization macro by WRDS is that a variable named
a is used in line 57 and 59. So if the
INSET has a variable named
a as well, there’ll be possible data integrity issue. Hence, I prefer to use another version described in my other post Winsorization in SAS.
Use macros from a third party
I tend to collect and store all useful macros on my personal server, hence I don’t need to worry about a loss of or changes to the macros. To use these macros, simply include them before invoking.
filename winsor url "https://adrian-gao.com/utils/winsor.sas"; %include winsor;
Then, I can simply call
winsor as below.
%let winsVars = tac inv_at_l drev drevadj ppe roa; %winsor(dsetin=work.funda, dsetout=work.funda_wins, byvar=fyear, vars=&winsVars, type=winsor, pctl=1 99);