Convert Between Numeric and Character Variables
Converting between numeric and character variables is one of the most frequently encountered issues when processing datasets. This article explains how to do this conversion correctly and efficiently.
Numeric to Character
Assume there's an imported dataset named filings, in which cik is stored as a numeric variable. Because cik values have different numbers of digits, the natural way to convert the numeric cik into a character variable is to pad it with leading zeros. For example, cik (Central Index Key) is itself a 10-digit number used by the SEC.
In SAS, converting a numeric variable to a string with leading zeros (assuming a 10-digit fixed length) is done via the PUT() function with the z10. format:
data filings(drop=cik);
    set filings;
    cik_char = put(cik, z10.);
run;
PUT() function also works in
The resulting cik_char variable has format and informat $10.
In Stata, converting a numeric variable to a string with leading zeros (assuming a 6-digit fixed length) can be achieved via the string() function:
gen char_var = string(num_var,"%06.0f")
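As a rough illustration of the padding, the sketch below builds a small toy dataset and applies the conversion to it. The data values and the name char_var2 are hypothetical; the tostring command is shown only as an alternative route to the same result, not as something the original workflow requires.

```stata
* Hypothetical toy data: identifiers with different numbers of digits
clear
input long num_var
42
1234
123456
end

* Pad every value to a fixed width of 6 with leading zeros
gen char_var = string(num_var, "%06.0f")

* tostring performs the same conversion as a command (char_var2 is a made-up name)
tostring num_var, generate(char_var2) format(%06.0f)

* Both approaches should give identical strings, e.g. 42 becomes "000042"
assert char_var == char_var2
list
```

For identifiers wider than about 7 digits, it is probably safer to store the numeric variable as long or double, since the default float type cannot represent every large integer exactly.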
Character to Numeric
In SAS, converting a character variable to a numeric one uses the INPUT() function:
var_numeric = input(var_char, best12.);
In Stata, this conversion can be done via either the real() function or the destring command:
gen num_var = real(char_var)
The real() function works on a single variable. The destring command can convert all character variables into numeric ones in one go.
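The difference between the two approaches can be sketched on a toy dataset. The variable names and values below are hypothetical and chosen only to show one call converting a single variable and the other converting everything in memory.

```stata
* Hypothetical toy data: two string variables holding numeric text
clear
input str6 char_var str4 other_var
"000042" "7"
"001234" "15"
"123456" "230"
end

* real() converts one variable at a time into a new numeric variable
gen num_var = real(char_var)

* destring without a variable list tries every string variable in memory
destring, replace

* char_var is now numeric too, so the two results can be compared directly
assert num_var == char_var
describe
```

Note that leading zeros are dropped once the values become numeric, so "000042" is stored as 42.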
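If a character variable has non-numeric characters in it, then it will not be converted. In such a case, you may choose to use the encode command, but note that it in fact generates categories: each distinct string is assigned an integer code, and the string itself is kept only as a value label rather than being parsed as a number.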
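To make the distinction concrete, here is a minimal sketch on hypothetical data in which one value is not numeric. The names num_forced and cat_var are made up for illustration, and the force option of destring is included only as one possible alternative when non-numeric entries should simply become missing.

```stata
* Hypothetical toy data: one entry is not a number
clear
input str8 char_var
"10"
"20"
"n/a"
end

* destring reports non-numeric characters and leaves char_var as a string
destring char_var, replace

* With force, non-numeric entries are converted to missing instead
destring char_var, generate(num_forced) force

* encode assigns codes in alphabetical order: "10" -> 1, "20" -> 2, "n/a" -> 3
encode char_var, generate(cat_var)

* nolabel shows the underlying codes rather than the value labels
list, nolabel
```

Here cat_var equals 1 for the string "10", not 10, which is why encode is better suited to genuinely categorical text than to numbers stored as strings.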
A more detailed explanation with examples is available at stats.idre.ucla.edu