For Good Measure



  As stated, the overwhelming majority of inequality data has been based on household surveys. The Nordic countries, however, stand apart because they rely on a well-developed system of registers that allows statistical offices to obtain information on personal income (and sometimes wealth) from various personal records, which are then combined into household files. While administrative records yield more precise information on people’s economic resources, and make it possible to link these resources for the same individual and sometimes across generations, these registers are far from perfect. An important downside is that they may only imperfectly match people belonging to the same household, and may record members of the same household as separate households (e.g., students living away from the parental home for part of the year).

  The distinction between survey-based and record-based methods is, moreover, becoming increasingly blurred, as several statistical offices in advanced countries have come to rely on mixed methods of data collection, whereby some of the information required by the survey is retrieved from administrative records (in most cases with the prior consent of the person being interviewed), or information from administrative records is used to identify groups of individuals that should be oversampled in the survey (as done by the Survey of Consumer Finances in the United States). While these mixed methods of data collection have proved effective in delivering higher-quality information, their use is sometimes limited by statistical laws and administrative constraints. Obviously, the quality of statistical information provided by administrative registers depends on the quality of the registers (e.g., on how widespread tax evasion is), on the capacity of various administrations to link their records, and so on.

  An additional challenge for data on the distribution of household economic resources is the difficulty of reconciling the totals from micro-data (i.e., consumption, income, and wealth totals from household surveys and administrative records) with those available through macro-data (i.e., totals for the supposedly same variables in the System of National Accounts). For most countries in the world, totals for household income and consumption from surveys do not match the equivalent totals from national accounts. These differences can be very large in some countries, as illustrated by Table 3.3 for a sample of Latin American countries.25 Also, discrepancies are not limited to levels of different types of household economic resources but extend, more importantly, to their growth rates (Deaton, 2005). Gaps between macro- and micro-statistics have been widening in many countries. The causes of this pattern are not all well understood, and some of the discrepancy is probably due to the same problem of under-coverage and under-reporting of top incomes mentioned above. Its very existence nonetheless casts doubt on efforts to disentangle the relation between GDP growth and income distribution on the basis of metrics that not only rely on different definitions of household income but may also suffer from a series of measurement errors themselves. Diverging trends in income and consumption growth between household surveys and national accounts have led to the creation of the OECD-Eurostat Expert Group on Integrating Disparities in National Accounts in Europe. The US Census Bureau and others in the United States are also trying to address this challenge.

  One of the most important limitations of household surveys is that they underrepresent the rich and the poor, and under-report incomes at both ends of the distribution. Much of the current attention of researchers and statisticians has focused on the top end of the distribution. While this issue will be taken up in more detail in a later section of this chapter, it should be stressed here that, as emphasized by Deaton (2005), there can be no general supposition that estimated inequality will be biased either up or down in the case of “selective under-sampling.”26 Issues of noncoverage, underrepresentation, and under-reporting of the richest households become particularly relevant whenever much of the action concerning changes in the distribution is taking place at the top (as has been the case in many countries over the last decades), and are particularly problematic in very unequal societies, characterized by income and wealth being highly concentrated in the hands of a small number of families.

  The potential for mismeasurement is, however, not limited to the top end of the distribution but extends to the bottom end, as discussed in Atkinson (2016). Many poor people may not be adequately covered by existing measures, due to lack of a permanent address (e.g., the homeless), because they live in collective living quarters (e.g., slum-dwellers), or because they are recent arrivals in the country (e.g., refugees). Because of the undeclared and sometimes illegal nature of their activities, very poor people may also be unwilling to fully declare their income when asked in surveys. Low-income people often report levels of consumption expenditures well in excess of their declared income, a factor that underscores the importance of joint analysis of income, consumption, and wealth to assess, for instance, whether the poor are “eating up” their assets.

  While problems of underrepresentation and under-reporting exist at both ends of the distribution, for inequality measures it is particularly relevant to correct the data for the missing rich, a topic that is discussed in the next section.

  Table 3.3. Ratio of Mean Income in Household Survey to Mean Household Final Consumption Expenditure per Capita in National Accounts, Selected Latin American Countries

  Source: Bourguignon, F. (2015b), “Appraising income inequality databases in Latin America,” in Ferreira, F.H.G. and N. Lustig (eds.), “Appraising cross-national income inequality databases,” special issue, Journal of Economic Inequality, Vol. 13(4), pp. 557–578. StatLink 2 http://dx.doi.org/10.1787/888933839544.

  The “Missing Rich” in Household Surveys

  Whether they collect data on income, consumption, or wealth, there is reason to believe that household surveys do not capture the rich well. How do we know that very high incomes are not captured in household surveys? Why is this issue important? What are its causes? What can be done to address the problem? Here I present a synthesis of the factors that give rise to the “missing rich” problem in household surveys, and review the approaches that have been proposed to deal with the problem.27

  By inspection, one can observe that the top incomes as measured by surveys are at most close to the earnings of a well-paid manager; additionally, capital incomes as measured by surveys are a tiny fraction of what national accounts identify as the amounts accrued to the household sector.28 The fact that rich individuals are largely missing and that their income is frequently under-reported in household surveys may explain in part the worrisome result that, especially in middle- and low-income countries, survey-based measures of per capita household income (or some of its components) or consumption frequently show levels substantially lower than the per capita household income or consumption from either national accounts29 or tax records.30 The missing rich problem may also explain why there are striking discrepancies in inequality levels and trends, depending on the source of the data (e.g., surveys versus tax records) (see Alvaredo and Londoño-Velez, 2013; Alvaredo et al., 2015; and Belfield et al., 2015). If the rich are missing, the survey-based distributions of income, consumption, or wealth, and the concomitant inequality measures, should be viewed with caution: actual inequality may be considerably different from that recorded by survey estimates.31 As discussed below, however, correcting the information for the rich that are missing will not necessarily result in higher inequality.

  The most obvious reason why the rich, especially the ultra-rich, are missing in household surveys is that there are very few of them in the target population; thus, the probability of including one of these individuals in a survey (sample) is rather low. As discussed in Lustig (forthcoming) there are, essentially, five additional factors embedded in the data collection process that may give rise to the missing rich problem in household surveys: (1) frame or noncoverage error; (2) unit nonresponse; (3) item (income) nonresponse; (4) under-reporting; and (5) top coding and trimming. Surveys may suffer from one or any combination of problems 1–5, and any one of them can potentially result in an underestimation of the income share of the top income fractile. In addition, as mentioned above, even if there is full coverage and response, no under-reporting, and no top coding or trimming, rich individuals may not appear in household surveys due to sparseness: i.e., there is no density mass at all points of the upper tail of the true distribution’s support, especially for extreme values.32 Sparseness or low frequency of observations at the top will frequently result in an underestimation of the income share of rich individuals but, on occasion, the income share may be overestimated.

  In the presence of any of the sampling and nonsampling problems described above, survey-based inequality measures will be biased. The direction of the bias can be positive or negative, because correcting for the missing rich problem affects both what happens at the top of the distribution and the mean (Deaton, 2005).33 Even if there are no errors in the achieved sample that lead to biased inequality estimates, sparseness in the upper tail can result in volatile inequality estimates. If the rich are selected in the sample with a very low frequency, the survey-based inequality measures will frequently be below the true inequality measure and above it on occasion (Higgins, Lustig, and Vigorito, 2017).
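  The sparseness argument can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population (9,990 non-rich households and 10 very rich ones), the income values, the sample size, and the function names `top_share` and `simulate` are all hypothetical and are not taken from the studies cited above. It draws repeated simple random samples with no nonresponse or under-reporting and compares the sample-based top 1% income share with the population value.

```python
import random


def top_share(incomes, top_fraction=0.01):
    """Share of total income received by the richest `top_fraction` of units."""
    ordered = sorted(incomes, reverse=True)
    k = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:k]) / sum(ordered)


def simulate(n_draws=1000, sample_size=100, seed=0):
    """Compare sample-based top shares with the population value."""
    rng = random.Random(seed)
    # Hypothetical population: many non-rich households, very few rich ones.
    population = [rng.uniform(10, 100) for _ in range(9_990)] + [10_000.0] * 10
    truth = top_share(population)
    # Simple random samples without replacement; no other survey error.
    estimates = [
        top_share(rng.sample(population, sample_size)) for _ in range(n_draws)
    ]
    below = sum(e < truth for e in estimates) / n_draws
    above = sum(e > truth for e in estimates) / n_draws
    return truth, below, above


truth, below, above = simulate()
print(f"population top 1% share: {truth:.3f}")
print(f"samples below the population value: {below:.1%}")
print(f"samples above the population value: {above:.1%}")
```

  In this setup most samples contain no rich household at all and understate the top share, while the few samples that happen to include one overstate it, which is the volatility described in the text.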

  As described in Lustig (forthcoming), a variety of approaches have been proposed in statistics and in the literature on the measurement of inequality to address the missing rich problem.34 In terms of the data sources used, these can be classified into three broad groups: alternate data (i.e., relying on alternative data such as tax records instead of surveys); within-survey corrections (i.e., correcting top incomes in surveys using parametric and nonparametric methods); and survey-cum-external data (i.e., correcting survey data or inequality estimates by combining surveys, administrative data, and national accounts using parametric and nonparametric methods).

  A key distinction among existing methods is whether they correct the data by replacing incomes at the top by a parametric distribution (e.g., Pareto) or using external information (e.g., tax records); or by changing the weights of the “rich” and “nonrich” population, i.e., reweighting or poststratification. The first approach assumes that the population shares of top incomes (the rich) and the rest (the nonrich) in the achieved sample survey are correct, and that the problem lies in that the incomes captured at the top are incorrect. This can occur either because the incomes in the survey are under-reported or because the individuals captured by the survey are not really representative of the rich (due to undercoverage, underrepresentation, top coding, and/or sparseness). The second approach assumes that the population weights for the rich and nonrich in the sample are incorrect: one must “add people” at the top either by increasing the weights of rich individuals in the survey or generating the upper tail through some parametric or nonparametric method. Under the replacing and reweighting approaches, there exist a variety of methods. Table 3.4, drawn from Lustig (forthcoming), presents a summary of the correction approaches and refers the reader to a sample of their applications.
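  The difference between the replacing and reweighting approaches can be sketched in a few lines. This is a minimal illustration, not any of the specific methods summarized in Table 3.4: the toy incomes, the threshold, the Pareto shape `alpha`, the reweighting `factor`, and all function names are hypothetical assumptions. In practice the tail parameters or the corrected weights would be estimated from the survey itself or from external sources such as tax records.

```python
def weighted_gini(incomes, weights):
    """Gini coefficient from the weighted Lorenz curve (trapezoid rule)."""
    pairs = sorted(zip(incomes, weights))
    total_w = sum(w for _, w in pairs)
    total_y = sum(y * w for y, w in pairs)
    cum_share, area = 0.0, 0.0
    for y, w in pairs:
        prev = cum_share
        cum_share += y * w / total_y
        area += (w / total_w) * (prev + cum_share)
    return 1.0 - area


def replace_top(incomes, threshold, alpha):
    """Replacing: keep the weights, swap incomes at or above `threshold`
    for quantiles of a Pareto tail with assumed shape `alpha`."""
    tail = sorted(
        (i for i, y in enumerate(incomes) if y >= threshold),
        key=lambda i: incomes[i],
    )
    corrected = list(incomes)
    for j, i in enumerate(tail, start=1):
        p = (j - 0.5) / len(tail)  # midpoint plotting position in the tail
        corrected[i] = threshold * (1.0 - p) ** (-1.0 / alpha)
    return corrected


def reweight_top(incomes, weights, threshold, factor):
    """Reweighting: keep the incomes, scale up the weights of the rich and
    scale down the rest so that the total weight is unchanged."""
    total = sum(weights)
    rich = sum(w for y, w in zip(incomes, weights) if y >= threshold)
    if rich * factor >= total:
        raise ValueError("factor leaves no weight for the non-rich")
    rest_scale = (total - rich * factor) / (total - rich)
    return [
        w * factor if y >= threshold else w * rest_scale
        for y, w in zip(incomes, weights)
    ]


# Toy survey: ten households with equal weights (hypothetical values).
incomes = [10, 20, 30, 40, 50, 60, 80, 100, 150, 300]
weights = [1.0] * len(incomes)

gini_survey = weighted_gini(incomes, weights)
gini_replaced = weighted_gini(replace_top(incomes, 150, alpha=1.5), weights)
new_weights = reweight_top(incomes, weights, 150, factor=2.0)
gini_reweighted = weighted_gini(incomes, new_weights)
print(gini_survey, gini_replaced, gini_reweighted)
```

  Whether either correction raises or lowers the measured Gini depends on the data and on the assumed parameters, in line with the point that correcting for the missing rich does not necessarily result in higher inequality.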

  Broadening the Indicators of Households’ Economic Well-Being

  There is a long-standing discussion, among economists and statisticians, about the best metric to describe people’s economic well-being. One perspective, articulated by Stiglitz, Sen, and Fitoussi (2009), is that, ideally, one would like to focus on the distribution of consumption possibilities across people, socio-economic groups, and generations. While income flows and wealth holdings are an important gauge for assessing power relations within a community, a narrower economic view is that what really matters for people’s economic well-being is what they are potentially able to consume over time—including across generations.

  Consumption possibilities are determined not only by current earned income but also by accumulated wealth and by the ability to borrow against existing wealth or future savings. Wealth is an important indicator of the sustainability of observed consumption: for a given income, consumption can be raised by running down assets or by increasing debt. Similarly, savings and additions to assets reduce consumption for a given level of income. In addition to earned income flows and wealth, consumption possibilities are determined by transfers between households (e.g., gifts, remittances, and inheritance) and within them (e.g., from income earners to other members).

  Table 3.4. Approaches to Address the Missing Rich Problem in Household Surveys

  Note: The “mapping” of studies to methods under the “References” column should be viewed as an approximation because studies frequently apply more than one method.

  Source: Lustig, N. (forthcoming), “The missing rich in household surveys: Causes and correction methods,” CEQ Working Paper, No. 75, Commitment to Equity Institute, Tulane University, Table 1. StatLink 2 http://dx.doi.org/10.1787/888933839563.

  Consumption possibilities are also determined by state action. Subtracting direct taxes (e.g., personal income and wealth taxes) and social security contributions paid by workers, and adding current transfers provided by governments and nonprofit institutions (e.g., cash transfers to the poor or to people unable to work) to earned and unearned income yields disposable income. Disposable income at any point in time, however, does not capture consumption possibilities accurately. A better indicator of the latter is final consumption expenditures, equal to disposable income plus consumption financed by borrowing or by drawing down assets and less saving. In practice, however, measured final consumption expenditures do not capture consumption possibilities accurately either. For example, the benefits from consumer durables other than housing are typically recorded when expenditures are incurred, rather than over the longer period when these benefits are provided. In some instances, to avoid distortionary spikes in consumption expenditures, spending on consumer durables other than housing is not included at all. Additional limitations occur with the exclusion of specific types of difficult-to-measure flows (such as imputed rents, i.e., the income that accrues to property owners from the dwellings that they own; or the value of goods produced by households for own consumption, which are important in countries with large subsistence farming).
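  The accounting relations described above can be written out explicitly. The following sketch simply restates the two definitions in the text; the class name, field names, and the example figures are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class HouseholdAccounts:
    """Annual flows for one household (hypothetical field names)."""
    earned_and_unearned_income: float
    direct_taxes: float
    social_security_contributions: float
    current_transfers_received: float
    borrowing_for_consumption: float
    assets_drawn_down: float
    saving: float

    def disposable_income(self) -> float:
        # Income minus direct taxes and contributions, plus current transfers.
        return (
            self.earned_and_unearned_income
            - self.direct_taxes
            - self.social_security_contributions
            + self.current_transfers_received
        )

    def final_consumption_expenditure(self) -> float:
        # Disposable income plus consumption financed by borrowing or by
        # drawing down assets, less saving.
        return (
            self.disposable_income()
            + self.borrowing_for_consumption
            + self.assets_drawn_down
            - self.saving
        )


# A household that consumes more than its disposable income by selling assets.
household = HouseholdAccounts(
    earned_and_unearned_income=1000.0,
    direct_taxes=150.0,
    social_security_contributions=50.0,
    current_transfers_received=100.0,
    borrowing_for_consumption=0.0,
    assets_drawn_down=200.0,
    saving=0.0,
)
print(household.disposable_income())
print(household.final_consumption_expenditure())
```

  A household of this kind would show consumption above its income in survey data, which is why income, consumption, and wealth need to be analyzed jointly.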

  In this section, however, we would like to draw attention to two elements that are typically excluded from the conceptual definitions of household income and expenditures that are conventionally used in analysis of economic inequalities: free in-kind services (especially, education and health care) provided to households by governments and nonprofit institutions; and consumption taxes and subsidies.35

  Social Transfers in Kind

  In addition to earned income and cash transfers, households receive benefits in kind such as education, health care, and social housing that governments provide to households for free (or at highly subsidized prices) and whose provision is financed out of taxes (and often user fees or other forms of direct payments made by the user of such services). Including these in-kind benefits in measures of household income and consumption is important, for example, so that reductions in direct taxes offset by lower provision of these government services do not appear as improvements in people’s economic welfare simply because the concomitant reduction in public services has not been recorded. Adding the value of those services (also called “social transfers in kind”) to household income and consumption provides, in theory, a better measure of households’ consumption possibilities. However, there is no consensus on how to make these imputations; there are also concerns that such imputations may lead to metrics that are further away from what people actually experience (UNECE, 2011).

  Valuing social transfers in kind raises both conceptual and measurement challenges. Decisions are needed in terms of the range of services to be considered (ideally, all types of individualized services provided by governments and nonprofit institutions, excluding public goods such as defense or law and order); the monetary valuation of the services provided; and their allocation to various beneficiaries.36

  In practice, the most frequently used approach is to value in-kind transfers at the production costs incurred by the government in producing them (Lustig, 2018a). For education, the method most commonly used consists of attributing a value to an individual who attends public school, using values equal to the per-beneficiary input costs obtained from administrative data, and adding this value to the household’s income. For example, average government expenditure per primary school student obtained from administrative data is allocated to households based on how many children are reported attending public school at the primary level (the same method applies to other levels of schooling). Information on whether school-age children are attending public or private school, or whether they are in school at all, may not be collected in income and consumption surveys, so that general allocation based on the age of children may fail to identify the true beneficiaries or allocate to them a benefit that they never received.
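  The production-cost method for education can be sketched as follows. The per-student costs, the household record, and the function name are hypothetical; real applications would take per-student spending by level from administrative data and attendance from the survey.

```python
# Hypothetical average government spending per public school student, by level.
COST_PER_STUDENT = {"primary": 800.0, "secondary": 1200.0}


def education_transfer(public_school_children, cost_per_student):
    """In-kind education transfer: per-student cost times the number of
    children reported as attending public school at each level."""
    return sum(
        cost_per_student[level] * count
        for level, count in public_school_children.items()
    )


# Hypothetical survey record: two children in public primary school and one
# in public secondary school.
household = {
    "income": 10_000.0,
    "public_school_children": {"primary": 2, "secondary": 1},
}

transfer = education_transfer(household["public_school_children"], COST_PER_STUDENT)
extended_income = household["income"] + transfer
print(transfer, extended_income)
```

  If the survey records only children's ages and not public school attendance, the counts passed to the function would have to be assumed, which is the source of the misallocation risk noted above.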

  Imputation to individual service users is even more complex in the case of health care. In this case, the allocation of benefits is done following either the “actual consumption approach” or the “insurance value approach.” As described in Higgins and Lustig (2018), the first approach allocates the value of public services to the individuals who are actually using the service. The second approach assigns the same per capita spending to everybody sharing the same characteristic such as age or gender, irrespective of their actual use of these services, based on the principle that all people with the same demographic characteristics are entitled to these public benefits. The reliance on one approach over the other depends, often, on data availability, but the choice, besides leading to very different empirical results, raises conceptual problems. To impute the value received from public health services on the basis of actual consumption, the household survey must provide information about the use of health services, and distinguish between public care (which is usually received from the public health system or paid for by public health insurance schemes) and private care. In the absence of information about whether the care received was subsidized by government, a survey may ask about whether the patient is covered by private insurance. Patients who received health care and report having private health insurance are considered to have received private care, and thus received no in-kind transfer, while patients who report not having private health insurance are considered to have received public care. Ideally, the survey should also contain one or more questions about the type of service received (for more details, see Higgins and Lustig, 2018). Attributing health care services to users also implies making sick people “richer” than they would otherwise have been, while also raising the issue of whether allowance should be made for their higher needs, which are ignored by the equivalence scales typically used in analysis.
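  The two allocation rules for health care can be contrasted in a short sketch. The people, age groups, visit counts, and spending totals below are hypothetical, and valuing every visit within a group at the same cost is a simplifying assumption rather than a feature of the approaches described in Higgins and Lustig (2018).

```python
# Hypothetical individuals with an age group and a count of public care visits.
people = [
    {"id": "A", "age_group": "child", "public_visits": 0},
    {"id": "B", "age_group": "child", "public_visits": 2},
    {"id": "C", "age_group": "elderly", "public_visits": 6},
    {"id": "D", "age_group": "elderly", "public_visits": 0},
]
# Hypothetical public health spending attributed to each age group.
spending_by_group = {"child": 200.0, "elderly": 900.0}


def actual_consumption(people, spending_by_group):
    """Allocate each group's spending to actual users, in proportion to visits."""
    visits = {}
    for p in people:
        visits[p["age_group"]] = visits.get(p["age_group"], 0) + p["public_visits"]
    benefits = {}
    for p in people:
        group_visits = visits[p["age_group"]]
        per_visit = spending_by_group[p["age_group"]] / group_visits if group_visits else 0.0
        benefits[p["id"]] = per_visit * p["public_visits"]
    return benefits


def insurance_value(people, spending_by_group):
    """Allocate each group's spending equally to everyone in the group,
    irrespective of actual use."""
    size = {}
    for p in people:
        size[p["age_group"]] = size.get(p["age_group"], 0) + 1
    return {
        p["id"]: spending_by_group[p["age_group"]] / size[p["age_group"]]
        for p in people
    }


by_use = actual_consumption(people, spending_by_group)
by_entitlement = insurance_value(people, spending_by_group)
print(by_use)
print(by_entitlement)
```

  Both rules distribute the same total, but the first concentrates it on those who were sick enough to use care, while the second spreads it over everyone with the same demographic characteristics.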