Convert Between Numeric and Character Variables
Converting between numeric and character variables is one of the most frequently encountered issues when processing datasets. This article explains how to do this conversion correctly and efficiently.
Numeric to Character
Assume there's an imported dataset named filings, where cik is stored as a numeric variable, as shown below:
cik  file_type  date 

1000229  8K  20110930 
100591  8K  20060511 
100826  8K  20090630 
93542  8K  20070125 
Because cik values have different numbers of digits, the natural way to convert the numeric cik into a character variable is to pad it with leading zeros. For example, a cik (Central Index Key) is a 10-digit number used by the SEC.
In SAS, converting a numeric variable to a string with leading zeros (assuming a fixed length of 10 digits) is done with the PUT() function and the Zw.d format (here, z10.).
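A minimal sketch of this conversion, assuming the dataset and variable names used in this article (filings, cik, cik_char). The explicit LENGTH statement is optional and is included here only to make the resulting width obvious:

```sas
/* Sketch: pad cik with leading zeros to a fixed width of 10. */
data filings;
    set filings;
    length cik_char $10;          /* character variable, 10 bytes */
    cik_char = put(cik, z10.);    /* Zw.d format writes leading zeros */
run;
```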

Tip
The PUT() function also works in PROC SQL.
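For illustration, the same conversion could be written in PROC SQL roughly as follows. The output table name filings_char is a hypothetical name chosen for this sketch:

```sas
/* Sketch: the same zero-padding inside PROC SQL. */
/* filings_char is a hypothetical output table name. */
proc sql;
    create table filings_char as
    select put(cik, z10.) as cik_char,
           file_type,
           date
    from filings;
quit;
```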
The generated cik_char variable is a 10-character string ($10.), and the dataset becomes:
cik_char  file_type  date 

0001000229  8K  20110930 
0000100591  8K  20060511 
0000100826  8K  20090630 
0000093542  8K  20070125 
In Stata, converting a numeric variable to a string with leading zeros (assuming a fixed length of 6 digits) can be done with the string() function and a zero-padded display format such as %06.0f.
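A minimal sketch in Stata, assuming the numeric variable is named cik and the new string variable is named cik_char as elsewhere in this article. Note that the width in the format is a minimum, so a value with more than 6 digits is written in full rather than truncated:

```stata
* Sketch: write cik as a string, zero-padded to at least 6 digits.
generate cik_char = string(cik, "%06.0f")
```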

Character to Numeric
In SAS, converting a character variable to a numeric one uses the INPUT() function.
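A minimal sketch of the reverse conversion, assuming the padded string cik_char from the previous section. The target name cik_num is a hypothetical name used only for this example:

```sas
/* Sketch: read the 10-character string back as a number. */
/* cik_num is a hypothetical name for the new numeric variable. */
data filings;
    set filings;
    cik_num = input(cik_char, 10.);   /* w. informat, width 10 */
run;
```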

In Stata, this conversion can be done with either the real() function or the destring command.
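A minimal sketch of the real() approach, assuming the string variable is named cik_char and using a hypothetical target name cik_num. Storing the result as double is a choice made for this sketch so that 10-digit identifiers are not rounded, which could happen with Stata's default float storage type:

```stata
* Sketch: parse the string cik_char into a new numeric variable.
* cik_num is a hypothetical name. real() returns missing (.) for
* strings that cannot be read as a number.
generate double cik_num = real(cik_char)
```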

The real() function works on a single variable. The destring command can convert all character variables to numeric in one go.
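The two usages of destring can be sketched as follows, again assuming a string variable named cik_char and a hypothetical new variable cik_num. The two commands are alternatives, shown together only for comparison:

```stata
* Sketch 1: convert one string variable into a new numeric variable.
destring cik_char, generate(cik_num)

* Sketch 2: with no variable list, destring is applied to all
* variables; string variables holding only numbers are converted
* in place.
destring, replace
```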

Warning
If a character variable contains non-numeric characters, it will not be converted. In that case, you may choose to use the encode command, although it actually generates categories (integer codes with value labels) rather than parsing the text as numbers.
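As a rough illustration of encode, assuming the file_type variable from the example dataset (whose values, such as 8K, contain a letter) and a hypothetical new variable name file_type_code:

```stata
* Sketch: map each distinct string value to an integer category.
* file_type_code is a hypothetical name. The original strings are
* kept as value labels on the new numeric variable.
encode file_type, generate(file_type_code)
```

The resulting integers identify categories only, so they should not be treated as the numeric content of the original strings.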
A more detailed explanation with examples is available at stats.idre.ucla.edu