Large-scale data processing and analysis are not new challenges for the U.S. Census Bureau, but the number of statistical programming languages and tools available for such work has expanded in recent years. We evaluate how statistical programming languages perform on a common data management task within the Census Bureau's high-performance computing cluster. Specifically, we develop Python, SAS, Stata, and R scripts that merge the person, household, and geographic microdata from the full-count 1990 Census microdata files. We then use the merged data to perform basic analyses, such as counting the number of individuals per household and calculating the average household size for every county in the U.S. We compare the language implementations of these scripts by their runtime on each task. We find wide variation in runtime between languages, and the speed of a given language depends most heavily on the file format of the input data.
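To make the benchmarked task concrete, the following is a minimal Python sketch of the merge-and-aggregate workflow described above. It is not the authors' script: the toy records, the field names (`person_id`, `household_id`, `geo_id`, `county`), and the function names are all hypothetical, and it uses only the standard library on in-memory data rather than the actual 1990 microdata files or any particular file format.

```python
from collections import defaultdict

# Hypothetical toy records; field names are illustrative assumptions,
# not the layout of the 1990 Census microdata files.
persons = [
    {"person_id": 1, "household_id": "H1"},
    {"person_id": 2, "household_id": "H1"},
    {"person_id": 3, "household_id": "H1"},
    {"person_id": 4, "household_id": "H2"},
    {"person_id": 5, "household_id": "H2"},
    {"person_id": 6, "household_id": "H3"},
]
households = [
    {"household_id": "H1", "geo_id": "G1"},
    {"household_id": "H2", "geo_id": "G1"},
    {"household_id": "H3", "geo_id": "G2"},
]
geography = [
    {"geo_id": "G1", "county": "001"},
    {"geo_id": "G2", "county": "003"},
]


def merge_records(persons, households, geography):
    """Attach household and geographic fields to each person record."""
    hh_by_id = {h["household_id"]: h for h in households}
    geo_by_id = {g["geo_id"]: g for g in geography}
    merged = []
    for person in persons:
        hh = hh_by_id[person["household_id"]]
        geo = geo_by_id[hh["geo_id"]]
        merged.append({**person, **hh, **geo})
    return merged


def household_sizes(merged):
    """Count persons per (county, household) pair."""
    sizes = defaultdict(int)
    for row in merged:
        sizes[(row["county"], row["household_id"])] += 1
    return dict(sizes)


def mean_household_size_by_county(sizes):
    """Average the household sizes within each county."""
    person_totals = defaultdict(int)
    household_totals = defaultdict(int)
    for (county, _household_id), n_persons in sizes.items():
        person_totals[county] += n_persons
        household_totals[county] += 1
    return {c: person_totals[c] / household_totals[c] for c in person_totals}


merged = merge_records(persons, households, geography)
sizes = household_sizes(merged)
means = mean_household_size_by_county(sizes)
```

Because this sketch starts from person records, households with no persons never enter the counts; whether such households should contribute to a county average is a design choice the abstract does not specify.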