Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching

von Hippel, Paul T.; Hunter, David J.; Drown, McKalie

doi:10.15195/v4.a26

Sociological Science, November 15, 2017
DOI 10.15195/v4.a26

Abstract

Researchers often estimate income statistics from summaries that report the number of incomes in bins such as $0 to 10,000, $10,001 to 20,000, …, $200,000+. Some analysts assign incomes to bin midpoints, but this treats income as discrete. Other analysts fit a continuous parametric distribution, but the distribution may not fit well. We fit nonparametric continuous distributions that reproduce the bin counts perfectly by interpolating the cumulative distribution function (CDF). We also show how both midpoints and interpolated CDFs can be constrained to reproduce the mean of income when it is known. We evaluate the methods in estimating the Gini coefficients of all 3,221 U.S. counties. Fitting parametric distributions is very slow. Fitting interpolated CDFs is much faster and slightly more accurate. Both interpolated CDFs and midpoints give dramatically better estimates if constrained to match a known mean. We have implemented interpolated CDFs in the “binsmooth” package for R. We have implemented the midpoint method in the “rpme” command for Stata. Both implementations can be constrained to match a known mean.
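To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not the code of "binsmooth" or "rpme", and the function names are hypothetical. It assumes the simplest possible interpolant, a piecewise-linear CDF (that is, a uniform density within each bin), and it assumes the known mean is matched by choosing the upper bound of the open-ended top bin; the article's own interpolants and constraints may differ.

```python
def top_edge_for_mean(lower_edges, counts, known_mean):
    """Upper bound of the open top bin that makes a piecewise-uniform
    density reproduce known_mean (hypothetical helper, illustration only).

    lower_edges[k] is the lower bound of bin k; the last bin is open-ended.
    """
    total = sum(counts)
    shares = [c / total for c in counts]
    if shares[-1] == 0:
        raise ValueError("top bin is empty; its bound cannot be identified")
    # Contribution of the closed bins: share times bin midpoint.
    closed_mean = sum(
        p * (lo + hi) / 2
        for p, lo, hi in zip(shares[:-1], lower_edges[:-1], lower_edges[1:])
    )
    # Mean the top bin must have for the overall mean to match.
    top_mean = (known_mean - closed_mean) / shares[-1]
    # A uniform density on [top_lo, top_hi] has mean (top_lo + top_hi) / 2.
    top_hi = 2 * top_mean - lower_edges[-1]
    if top_hi <= lower_edges[-1]:
        raise ValueError("known mean is too low for the top bin")
    return top_hi


def gini_piecewise_uniform(edges, counts):
    """Gini coefficient of a density that is uniform within each bin.

    edges has len(counts) + 1 entries, so every bin is bounded.
    Uses G = E|X - Y| / (2 * mean) for independent draws X and Y.
    """
    total = sum(counts)
    shares = [c / total for c in counts]
    widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    mids = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
    mean = sum(p * m for p, m in zip(shares, mids))
    # Both draws in the same bin: E|X - Y| for a uniform is width / 3.
    within = sum(p * p * w / 3 for p, w in zip(shares, widths))
    # Draws in different bins: bins do not overlap, so E|X - Y| is the
    # difference of the bin means; each unordered pair counts twice.
    between = sum(
        2 * shares[i] * shares[j] * (mids[j] - mids[i])
        for i in range(len(shares))
        for j in range(i + 1, len(shares))
    )
    return (within + between) / (2 * mean)


# Usage: two bins with made-up counts, $0 to 10 and $10+, known mean 12.
lower_edges = [0, 10]
counts = [50, 50]
top = top_edge_for_mean(lower_edges, counts, known_mean=12)
gini = gini_piecewise_uniform(lower_edges + [top], counts)
```

The same constraint applies to the midpoint method in an analogous way: instead of solving for the top bin's upper bound, one would assign the top bin the single value that makes the weighted average of bin values equal the known mean (`top_mean` in the sketch above).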

This work is licensed under a Creative Commons Attribution 4.0 International License.

Paul T. von Hippel: Lyndon B. Johnson School of Public Affairs, University of Texas at Austin
Email: paulvonhippel.utaustin@gmail.com

David J. Hunter: Department of Mathematics and Computer Science, Westmont College
Email: dhunter@westmont.edu

McKalie Drown: Department of Mathematics and Computer Science, Westmont College
Email: mdrown@westmont.edu

Acknowledgements: Drown is grateful for support from a Tensor Grant of the Mathematical Association of America.

Citation: von Hippel, Paul T., David J. Hunter, and McKalie Drown. 2017. “Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching.” Sociological Science 4: 641-655.
Received: September 23, 2017
Accepted: October 8, 2017
Editors: Jesper Sørensen, Stephen Morgan
DOI: 10.15195/v4.a26

Keywords: Gini, Grouped Data, Income Brackets, Inequality
