Speaker: Anurag Bhardwaj



Large-Scale Multimodal Automated Document Categorization in eCommerce


We present a novel framework for large-scale multimodal automated document categorization in eCommerce. Unlike existing techniques for document categorization, our proposed framework integrates signals from multiple modalities such as text and images. First, state-of-the-art classification systems are built for each modality of signals. For text classification, we employ two classifiers: one based on the traditional Bag-of-Words (BoW) word representation and one based on the recently proposed word vector embedding (Word2Vec) representation. These systems utilize both product titles and product breadcrumbs present on eCommerce product pages, or documents. For image classification, an 8-layer Convolutional Neural Network (CNN) is trained on the primary thumbnail found on each product page. To combine the results from all classifiers, a majority-voting-based classifier fusion strategy is proposed. To illustrate the efficacy of the proposed framework, we conduct experiments on a large dataset of eCommerce categories spread across 100+ merchants, leading to a higher degree of heterogeneity in the form of different styles and qualities of product titles and images. Our experimental results demonstrate superior performance of our system on a large number of categories, thereby showcasing the good generalization capabilities of our automated categorization system.
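
As a rough illustration of the majority-voting fusion step, the following minimal Python sketch combines one predicted category per classifier. The function name `majority_vote`, the classifier keys (`bow`, `word2vec`, `cnn`), and the tie-breaking rule (fall back to one designated classifier when no label wins outright) are assumptions made for this example; the abstract does not specify how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions, fallback="bow"):
    """Fuse per-classifier category predictions by majority vote.

    predictions: dict mapping a classifier name to its predicted category.
    fallback: classifier whose prediction is used when the top vote count
        is tied (an assumption; the abstract does not describe tie-breaking).
    """
    counts = Counter(predictions.values())
    top_count = max(counts.values())
    # Collect every label that reaches the highest vote count.
    winners = [label for label, count in counts.items() if count == top_count]
    if len(winners) == 1:
        return winners[0]
    # No single winner (e.g. all classifiers disagree): use the fallback.
    return predictions[fallback]


if __name__ == "__main__":
    # Hypothetical predictions for one product page.
    votes = {"bow": "Shoes", "word2vec": "Shoes", "cnn": "Sandals"}
    print(majority_vote(votes))
```

With the three classifiers described above, two agreeing classifiers always decide the category; the fallback only matters when all three disagree.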


Anurag Bhardwaj currently leads data science efforts at Quad Analytix, where he focuses on large-scale product classification, large-scale smart extraction, and various other machine-learning techniques. Previously, he worked on image understanding at eBay Research Labs. Anurag received his PhD and MS from the State University of New York at Buffalo and holds a BTech in computer engineering from the National Institute of Technology, Kurukshetra, India.