What search engines tell us about the economy

Article from the Guardian Weekly.

Big data remains in its infancy – one of the main challenges of using the data is its short history.

If you really want to know how the economy is doing now, just Google it. At least that’s the goal of a growing number of researchers who are turning to big data in hopes of unlocking the secrets of the economy at the speed of the internet. The movement – dubbed « nowcasting » – is piquing the interest of policymakers frustrated by the lag in official government statistics as they make decisions where timing is everything.

Want to figure out where prices are headed in 86 countries on a given day? A project at the Massachusetts Institute of Technology tracks them. How many Americans will file for unemployment benefits in one week? Economists at the University of Michigan are tapping Twitter to estimate the number of new applicants. Are more young men finding jobs? Google suggests the incidence of searches for adult entertainment can provide a clue.

« Statistics serve us really well and are completely essential as benchmarks for where the economy is – or more precisely, has been, » said Matthew Shapiro, an economist at the University of Michigan working on the Twitter project. « But we don’t have a lot of indicators that tell us what’s happening right now, particularly when the economy is changing direction. »

In fact, there’s a running joke that economics is like driving a car by looking through the rear-view mirror. Case in point: the government announced in May that the economy shrank – two months ago. The first official reading of where things stand now won’t be ready until July.

The delay is a natural consequence of the government’s meticulous method of collecting data, which still relies heavily on phone conversations with families and businesses. Though its numbers are considered the gold standard, the aftermath of the great recession has shown the data can come too late for policymakers at crucial moments in the recovery.

The contraction at the start of the year, for example, came as lawmakers in Washington debated whether to extend benefits for the long-term unemployed. That measure languished, in part because officials believed the recovery was strengthening.

The slowdown also coincided with the phase-out of the Federal Reserve’s trillion-dollar stimulus programme – a move that was supposed to signal the central bank’s confidence in the economy. Federal Reserve chair Janet Yellen acknowledged in a recent speech that it probably « overdid the optimism » early in the year.

« You’re not just missing accuracy, but you’re missing a turning point, » said Keith Hall, a senior research fellow at the Mercatus Centre at George Mason University and former head of the Bureau of Labour Statistics (BLS).

In the midst of the recession, Google’s chief economist, Hal Varian, released a paper showing how to use the company’s search data to measure auto sales and consumer spending, among other things. Now researchers both inside and outside of government are using it to estimate everything from unemployment to mortgage delinquencies.

The Bank of Israel and the Bank of England incorporate Google analytics into some of their forecasts. Last month Varian gave the keynote address at a workshop on big data convened by the European Central Bank. He also recently briefed top White House officials on how to use Google’s data. « We don’t have any better ways to predict the future, » Varian said. « What we’re working on is predicting the present. »

The models are based on connections between key search terms and related economic indicators. Google found that rising unemployment was linked not only to phrases such as « companies that are hiring ». It was also closely correlated with searches for new technology (« free apps »), entertainment (« guitar scales beginner ») and adult content. The company said its data can improve the accuracy of standard estimates of economic data for the current month by as much as 10%.

At the University of Michigan, Shapiro and his colleagues scoured more than 19bn tweets over two years for references to unemployment, hunting for phrases such as « axed », « pink slip » and « downsized ». They indexed the findings and compared them to the government’s weekly tally of people applying for unemployment benefits for the first time.
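The general idea of counting phrases, indexing the counts and comparing them with an official series can be sketched in a few lines of Python. This is only an illustration, not the Michigan team's method: the sample tweets, the weekly claims figures, the base-100 scaling and the helper names (`weekly_mentions`, `to_index`, `pearson`) are all assumptions made up for the example. Only the three phrases come from the article.

```python
from collections import Counter
from math import sqrt

# Phrases quoted in the article; a real study would use a far larger list.
JOB_LOSS_PHRASES = ("axed", "pink slip", "downsized")


def weekly_mentions(tweets):
    """Count tweets per week that contain at least one job-loss phrase.

    `tweets` is an iterable of (week_number, text) pairs. Matching is a
    naive substring test, so "taxed" would also match "axed"; real work
    would need more careful text handling.
    """
    counts = Counter()
    for week, text in tweets:
        lowered = text.lower()
        if any(phrase in lowered for phrase in JOB_LOSS_PHRASES):
            counts[week] += 1
    return counts


def to_index(counts, weeks, base=100.0):
    """Scale weekly counts so that the first week equals `base`."""
    first = counts[weeks[0]]
    return [base * counts[w] / first for w in weeks]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical sample data, invented purely for illustration.
tweets = [
    (1, "Just got axed today"),
    (1, "Lovely weather"),
    (1, "Got my pink slip"),
    (2, "Downsized after ten years"),
    (2, "pink slip blues"),
    (2, "AXED again"),
    (2, "coffee time"),
    (3, "Nothing to report"),
    (3, "I was downsized"),
]
weeks = [1, 2, 3]
official_claims = [300, 340, 280]  # made-up weekly claims, in thousands

index = to_index(weekly_mentions(tweets), weeks)
r = pearson(index, official_claims)
print("Tweet index:", index)
print("Correlation with official claims:", round(r, 3))
```

With real data, a high correlation over a long period would suggest the tweet index tracks the official tally; three invented weeks prove nothing beyond showing the mechanics.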

Their results are remarkably similar – and where they do diverge, the Twitter index may be more reliable. Computer malfunctions and the US government shutdown last year distorted the official numbers, while the trends in Shapiro’s index held firm.

« If this is how people are going to communicate, it behoves us to try to figure out how to do the measurement that way, » he said.

There is also another, more mundane, reason economists are experimenting with new types of data: federal budget cuts mean the government is producing fewer statistics. The BLS slashed its budget by $30m, or 5%, last year in response to federal spending cuts. It no longer produces its survey of mass layoffs and may trim its quarterly census of employment and wages, which provides an important benchmark for monthly job estimates.

« The quality and quantity of some BLS data will likely be diminished, as fewer resources are available to collect and review data or to perform data analysis, » the agency said on its website.

Still, big data remains in its infancy. In fact, one of the main challenges of using the data is its short history: Twitter was created in 2006; the US government began calculating gross domestic product during the Great Depression.

There are also concerns that people who use the internet or social media sites do not represent the broader public, thus skewing the results. Though economists who work with big data say they adjust for those factors, those techniques are still a work in progress.

« More data is not always better, » said Jasper McMahon, co-founder of Now-Casting Economics, which does not use social media or search trends in its calculations. « You can be blinded by having access to masses and masses of data. But that exposes you to masses and masses of noise. »

Others raise a more fundamental issue: ultimately, using big data is just a faster way of calculating what the government already measures. Some argue what’s needed instead is an overhaul of how we measure – and judge – the world around us to include intangibles such as happiness, education and health.

« There’s not really a one-size-fits-all number that’s going to describe the experience of everybody or even most people, » said Zachary Karabell, an economist and author of The Leading Indicators. « I don’t think we should create that fiction. »


Ylan Q Mui

This article appeared in Guardian Weekly, which incorporates material from the Washington Post.