Big Data Mining In Email Receipts Pays Off With Human Analysis Added

Posted on Sep 13, 2017

The search for big data knows no limits. As Apple was releasing its latest iPhone, alternative big data mining providers were scraping social media reaction to feed into trading algorithms (the stock sold off after the product announcement). Now new analysis from Macquarie examines the value of email receipt big data mining, pointing to the ability to predict Amazon’s retail sales to various degrees of success before the quarterly earnings announcement. But the key is not purely technical: applying the human touch makes the difference.

Big Data Mining learns a lot by studying consumer emails

For Macquarie, diving into the big data pool was an eye-opening experience. What they ultimately found was that quantitative investment methods built on alternative data work best when combined with fundamental analysis, rather than when used to replace human analysis.

For their study, Macquarie worked with email receipt data provided by Quandl, a platform for financial data. The company scans numerous niche information sources, including millions of email inboxes where consumers' receipts can be found. This provides insight into transaction-level information such as product description, taxes paid and shipping accounts, all drawn from artificial intelligence scans of those email accounts. The data is updated weekly and covers a wide variety of e-commerce platforms.

From Macquarie’s standpoint, transforming the raw transaction data into tradable information was no small task, but the exercise proved meaningful.
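To make the idea of turning raw transactions into something comparable with reported sales more concrete, here is a minimal sketch that rolls individual receipts up into quarterly spend per merchant. It is only an illustration, not Macquarie's or Quandl's actual method: the record layout, the sample values and the function names `quarter_of` and `quarterly_spend` are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical receipt records: (merchant, order_date, order_total_usd).
# The real dataset's fields and values are not described at this level.
receipts = [
    ("Amazon", date(2017, 1, 15), 42.50),
    ("Amazon", date(2017, 2, 3), 19.99),
    ("Amazon", date(2017, 4, 20), 105.00),
    ("Walmart", date(2017, 1, 9), 60.00),
]


def quarter_of(d):
    """Return a (year, quarter) pair for a calendar date."""
    return (d.year, (d.month - 1) // 3 + 1)


def quarterly_spend(records):
    """Sum receipt totals per (merchant, year, quarter)."""
    totals = defaultdict(float)
    for merchant, order_date, amount in records:
        year, quarter = quarter_of(order_date)
        totals[(merchant, year, quarter)] += amount
    return dict(totals)


panel_spend = quarterly_spend(receipts)
for key in sorted(panel_spend):
    print(key, round(panel_spend[key], 2))
```

A quarterly series like this could then be set against a company's reported revenue, which is where the fundamental analysis the article emphasizes would come in.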

Massive computer power required to mine massive vertical niches

In a September 11 report titled “Big is beautiful: How email receipt data can help predict company sales,” Macquarie describes the challenges it learned firsthand.

While the data they used covers only three listed companies – Amazon, Walmart and H&M – the absolute size of the dataset made it a challenge to conduct “even the simplest queries” using standard database tools.

Macquarie then turned to Amazon Redshift, a data warehouse solution that makes standard SQL analysis quick and cost-effective. This became their preferred solution, allowing analytical processing to be done with simple syntax or with slight modifications to standard SQL queries.

