Mingze Gao

Convert Between Numeric and Character Variables


Converting between numeric and character variables is one of the most frequently encountered issues when processing datasets. This article explains how to do this conversion correctly and efficiently.

Numeric to Character

Assume there's an imported dataset named filings, where cik is stored as a numeric variable as shown below:

cik      file_type  date
1000229  8-K        2011-09-30
100591   8-K        2006-05-11
100826   8-K        2009-06-30
93542    8-K        2007-01-25

Because cik values vary in length, the natural way to convert the numeric cik into a character variable is to pad it with leading zeros to a fixed width. The CIK (Central Index Key) is a 10-digit number used by the SEC.

In SAS, converting a numeric variable to a string with leading zeros (assuming a fixed length of 10 digits) is done via the PUT() function with the z10. format:

data filings(drop=cik);
    set filings;
    cik_char = put(cik, z10.);
run;

The generated cik_char is a character variable of length 10 (the width of the z10. format), and the dataset becomes:

cik_char    file_type  date
0001000229  8-K        2011-09-30
0000100591  8-K        2006-05-11
0000100826  8-K        2009-06-30
0000093542  8-K        2007-01-25
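To try the conversion end to end, the following is a minimal sketch that builds the four sample rows with datalines and then applies PUT(). The output dataset name filings_char and the yymmdd10. informat for date are assumptions made for illustration; the original data may store the date differently.

```sas
/* Hypothetical sample data mirroring the filings table above */
data filings;
    input cik file_type $ date :yymmdd10.;
    format date yymmdd10.;
    datalines;
1000229 8-K 2011-09-30
100591 8-K 2006-05-11
100826 8-K 2009-06-30
93542 8-K 2007-01-25
;
run;

/* Pad cik to 10 characters with leading zeros, then drop the numeric original.
   filings_char is an assumed name, used so the input dataset is left untouched. */
data filings_char(drop=cik);
    set filings;
    cik_char = put(cik, z10.);
run;

/* Inspect the result */
proc print data=filings_char noobs;
run;
```

Writing to a new dataset rather than overwriting filings makes it easy to compare the two versions while checking the padding.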

In Stata, converting a numeric variable to a string with leading zeros (assuming a fixed length of 6 digits) can be achieved via the string() function:

gen char_var = string(num_var,"%06.0f")

Character to Numeric

In SAS, converting a character variable to a numeric one uses the INPUT() function:

var_numeric = input(var_char, best12.);
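Since a SAS variable cannot change type in place, the converted value has to go into a new variable. Below is a minimal sketch under assumed names (filings_char, filings_num, cik_char), with one deliberately non-numeric value to show what happens to unreadable input. The ?? modifier is optional; it only suppresses the invalid-data messages in the log.

```sas
/* Hypothetical input: cik stored as text, with one non-numeric value */
data filings_char;
    input cik_char $10.;
    datalines;
0001000229
0000100591
N/A
;
run;

/* Read the text into a new numeric variable and drop the character original */
data filings_num(drop=cik_char);
    set filings_char;
    /* ?? suppresses the invalid-data note and keeps _ERROR_ at 0;
       values that cannot be read as numbers become missing (.) */
    cik = input(cik_char, ?? best12.);
run;
```

Counting the missing values in cik afterwards is a quick way to see how many rows failed to convert.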

In Stata, this conversion can be done via either the real() function or the destring command.

gen num_var = real(char_var)

The real() function converts one string expression at a time, whereas the destring command, when given no variable list, converts all string variables to numeric in one go:

destring, replace

A more detailed explanation with examples is available at stats.idre.ucla.edu