Pad Character Variables with Leading Zeros - SAS Tutorial (3)

Feb 29, 2020
WRDS/SAS Tutorial
WRDS, SAS, STATA

When working with WRDS data, we sometimes want to add leading zeros to certain variables, such as CIK. The CIK (Central Index Key) is a 10-digit number used by the SEC as an entity identifier.

Say there is a CIK field in a dataset. If it is stored as a numeric variable, most software will drop the leading zeros. As a result, the imported dataset may look like:

cik      file_type  date
1000229  8-K        2011-09-30
100591   8-K        2006-05-11
100826   8-K        2009-06-30
93542    8-K        2007-01-25
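To make the loss of zeros concrete, here is a minimal sketch in SAS. The dataset name demo and the variables cik_text and cik_numeric are made up for illustration; the two values are the zero-padded forms of CIKs from the table above.

```sas
/* Hypothetical input: CIKs as zero-padded text, as they might appear in a raw file */
data demo;
    input cik_text :$10.;
    /* Converting the text to a number discards the leading zeros */
    cik_numeric = input(cik_text, 10.);
    datalines;
0001000229
0000093542
;
run;

proc print data=demo noobs;
run;
```

Here cik_numeric holds 1000229 and 93542, which is the situation the rest of this post fixes.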

To convert the numerically stored CIK to a 10-digit character variable, we need to pad it with leading zeros. This can be done with the following SAS code:

data want; set have;
    cik_str = put(cik_numeric, z10.); 
run;

The generated cik_str is a character variable of length 10 (the width of the z10. format), and the dataset becomes:

cik_str     file_type  date
0001000229  8-K        2011-09-30
0000100591  8-K        2006-05-11
0000100826  8-K        2009-06-30
0000093542  8-K        2007-01-25
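For anyone who wants to reproduce this end to end, the sketch below builds a small have dataset from the four rows shown earlier and applies the same put() call. The datalines input and the length statement are my own additions for illustration, not part of the original code. Placing the length statement before set simply makes cik_str the first column in the output.

```sas
/* Sample data mirroring the table above; CIK is read as a number */
data have;
    input cik_numeric file_type $ date :yymmdd10.;
    format date yymmdd10.;
    datalines;
1000229 8-K 2011-09-30
100591 8-K 2006-05-11
100826 8-K 2009-06-30
93542 8-K 2007-01-25
;
run;

data want;
    /* Declared before SET so that cik_str comes first in the output */
    length cik_str $10;
    set have;
    /* The Zw. format writes the number in w columns, padded with leading zeros */
    cik_str = put(cik_numeric, z10.);
    drop cik_numeric;
run;

proc print data=want noobs;
run;
```

One caveat worth keeping in mind: z10. assumes the value fits in 10 digits, which holds for CIKs but should be checked before reusing the pattern on other identifiers.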

By the way, in Stata we can use the following method to convert a numeric variable to a string with leading zeros (assuming a fixed length of 6 digits; for a 10-digit CIK the format would be "%010.0f"):

gen newstring  = string(oldvar,"%06.0f")
