
Big Data in the National Accounts
Experience in the United States
Brent Moulton
Advisory Expert Group on National Accounts
Washington, DC
9 September 2014
www.bea.gov

What are big data?
▪ Wikipedia: “Any collection of data sets so large and complex that it becomes difficult to process using… traditional data processing applications.”
▪ IBM: “Every day we create 2.5 quintillion bytes of data… This data comes from everywhere… This is big data.”
▪ Forbes: “12 big data definitions: what’s yours?”
  § #11: “The belief that the more data you have, the more insights and answers will arise automatically from the pool”
  § #12: “A new attitude… that combining data from multiple sources could lead to better decisions.”

Big data and official statistics
▪ Statistical agencies as producers of big data
  § Consistency in format and presentation
  § Catalogued in common, machine-readable format
  § Accessible in bulk
  § Desirable to make government data available on a single platform
▪ Big data as source data for national accounts
  § Administrative data, especially micro-data
  § Data from private sources
  § Web scraping

Concerns about using big data
▪ Do the concepts match those needed for national accounts?
▪ How representative are the data?
  § Selection biases
▪ Is it possible to fill the gaps in coverage?
▪ Do the data provide consistent time series and classifications?
▪ How timely are the data?
▪ How cost effective?

Defined-benefit pension funds
▪ For the SNA’s new treatment of defined-benefit pensions, BEA found it useful to work with administrative micro-data filed by pension funds
  § “Form 5500” data from Pension Benefit Guaranty Corporation
  § ~45,000 records per year covering 98% of private pension funds
  § BEA had to edit the data to remove errors and anomalies

Private source data for early estimates
▪ For the “advance” GDP estimate (released about 30 days after the end of the quarter), official monthly/quarterly indicators are not always available
▪ Examples of private source data used by BEA:
  § Ward’s/J.D. Power/Polk (auto sales/price/registrations)
  § American Petroleum Institute (oil drilling)
  § Air Transport Association of America (airlines)
  § Variety magazine (motion picture admissions)
  § Smith Travel Research (hotels and motels)
  § Investment Company Institute (mutual fund sales)

Health care satellite account
▪ Schultze Commission (At What Price?, 2002) recommended that health care price indexes should be based on the cost of treating a specific diagnosis
▪ BEA is preparing a health care satellite account (http://www.bea.gov/national/health_care_satellite_account.htm)
  § One approach uses insurance claims data for several million insured individuals
  § Claims grouped in disease episodes
  § Allows comparison of change in cost for treating particular diseases

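The disease-episode approach can be illustrated with a minimal sketch. It assumes claims have already been grouped into episodes, each carrying a disease label, a year, and a total cost. The function names, the record layout, and the sample figures are all hypothetical and are not BEA's actual method or data; the sketch only shows how the average cost of treating a disease can be compared across two years.

```python
from collections import defaultdict


def cost_per_episode(episodes):
    """Average treatment cost for each (disease, year) pair.

    `episodes` is an iterable of (disease, year, cost) tuples; this record
    layout is an assumption made for illustration.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (disease, year) -> [sum, count]
    for disease, year, cost in episodes:
        entry = totals[(disease, year)]
        entry[0] += cost
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def disease_cost_index(episodes, base_year, year):
    """Cost per episode in `year` relative to `base_year` (base = 100)."""
    avg = cost_per_episode(episodes)
    in_base = {d for d, y in avg if y == base_year}
    in_year = {d for d, y in avg if y == year}
    # Only diseases observed in both years can be compared.
    return {
        d: 100.0 * avg[(d, year)] / avg[(d, base_year)]
        for d in sorted(in_base & in_year)
    }


# Hypothetical episode records, not real claims data.
episodes = [
    ("diabetes", 2010, 1000.0),
    ("diabetes", 2010, 1200.0),
    ("diabetes", 2011, 1210.0),
    ("asthma", 2010, 500.0),
    ("asthma", 2011, 550.0),
]
index = disease_cost_index(episodes, base_year=2010, year=2011)
```

An index built this way tracks the cost of treating a condition rather than the price of an individual service, which is the distinction the Schultze Commission recommendation draws.
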
Local area tracking system
▪ Used by BEA’s regional accounts staff for independent data on regional economies
▪ Used to vet official statistics before publishing
▪ Types of data
  § Employment data: largest employers, principal industries, recent layoffs
  § Natural events affecting the economy
  § Local real estate and financial trends
▪ Automated using web scraping methods
  § Identifying key word searches
  § Archiving relevant articles

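The two automated steps, key word matching and archiving, can be sketched as follows. This is a minimal illustration that assumes articles have already been downloaded into dictionaries with `region`, `title`, and `body` fields. The keyword list, field names, and function names are hypothetical and do not describe BEA's actual system.

```python
import re

# Hypothetical key words of the kind a regional analyst might track.
KEYWORDS = ["layoff", "plant closing", "hurricane", "foreclosure"]


def matching_keywords(text, keywords=KEYWORDS):
    """Return the key words that appear in `text`, ignoring case.

    The pattern anchors only the start of the word, so "layoff" also
    matches "layoffs".
    """
    found = []
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword)
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(keyword)
    return found


def archive_relevant(articles, keywords=KEYWORDS):
    """Keep only the articles that match at least one key word."""
    archive = []
    for article in articles:
        hits = matching_keywords(article["title"] + " " + article["body"], keywords)
        if hits:
            archive.append(
                {"region": article["region"], "title": article["title"], "keywords": hits}
            )
    return archive


# Hypothetical articles, for illustration only.
articles = [
    {"region": "Ohio", "title": "Automaker announces layoffs",
     "body": "The plant will cut 500 jobs."},
    {"region": "Iowa", "title": "Local bakery wins award",
     "body": "Judges praised the bread."},
]
archived = archive_relevant(articles)
```

Storing the matched key words alongside each archived article makes it easier for an analyst to see why an item was flagged when vetting a regional estimate.
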
BEA research on depreciation
▪ Identifying depreciation in the presence of obsolescence is a long-standing issue
▪ BEA research on motor vehicle depreciation proposes to address this problem using data on “build dates,” which can differ from model years
▪ Data scraping: VIN-level data from decodethis. combined with auction data from NADA and data from other auto websites
▪ Goal is improved estimates of depreciation

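Why the choice of date matters can be seen in a small sketch. It assumes a geometric depreciation pattern, price_used = price_new × (1 − δ)^age, which is a common simplification and not necessarily the specification used in the BEA research. The vehicle, its prices, and the convention that a model year starts on 1 January are all hypothetical.

```python
from datetime import date


def age_in_years(start, sale):
    """Age at sale, in years, measured from `start`."""
    return (sale - start).days / 365.25


def implied_geometric_rate(price_new, price_used, age_years):
    """Annual rate d solving price_used = price_new * (1 - d) ** age_years."""
    return 1.0 - (price_used / price_new) ** (1.0 / age_years)


# Hypothetical vehicle: built in March 2010 but sold as a 2011 model.
build_date = date(2010, 3, 1)
model_year_start = date(2011, 1, 1)  # assumed convention for model-year age
sale_date = date(2013, 3, 1)

rate_build = implied_geometric_rate(
    25000.0, 14000.0, age_in_years(build_date, sale_date)
)
rate_model = implied_geometric_rate(
    25000.0, 14000.0, age_in_years(model_year_start, sale_date)
)
```

For the same observed price decline, dating the vehicle from its model year makes it look younger than it is, so the implied annual rate is higher than the rate measured from the build date.
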
Conclusions
▪ Big data will become increasingly important
▪ Priority to improving data quality, filling gaps, and keeping up with changing economy
▪ Big data especially useful for research projects
▪ Big data may allow for more timely or higher frequency estimates
▪ Attention must continue to be paid to traditional data quality issues