Speaker "Anusua Trivedi" Details

 

Topic

Transfer Learning NLP: Machine Reading Comprehension for Question Answering

Abstract

Modern machine learning models, especially deep neural networks, often benefit significantly from transfer learning. In computer vision, deep convolutional neural networks trained on a large image classification dataset such as ImageNet have proved useful for initializing models on other vision tasks, such as object detection. But how can we leverage transfer learning for text?
 
Question answering (QA) is a long-standing challenge in NLP, and the community has introduced several paradigms and datasets for the task over the past few years. These paradigms differ from each other in the type of questions and answers and in the size of the training data, which ranges from a few hundred to millions of examples. For human beings, reading comprehension is a basic task, performed daily. As early as elementary school, we can read an article and answer questions about its key ideas and details. But for AI, full reading comprehension is still an elusive goal. Therefore, building machines that can perform machine reading comprehension (MRC) is of great interest. Recently, several researchers have explored various approaches to the MRC transfer learning problem. Their work has been a key step towards developing scalable QA solutions that extend MRC to a wider range of domains.
 
In this session, we present a comprehensive survey of the existing text transfer learning literature in the research community. We explore popular Machine Reading Comprehension (MRC) algorithms. We evaluate and compare the performance of transfer learning approaches for creating a QA system for a book corpus using pretrained MRC models. In our evaluation scenario, the Document-QA model outperforms other transfer learning approaches such as the BiDAF, ReasoNet and R-NET models. We also compare the performance of fine-tuning approaches for creating a QA corpus for this book using a couple of these pretrained MRC models. In our evaluation scenario, the OpenNMT model outperforms the SynNet model.

Profile

Anusua Trivedi is a Data Scientist on Microsoft’s Cloud AI Platform Team. She works on developing advanced deep learning models and AI solutions. She is an advanced trainer and conducts hands-on deep learning labs. Prior to joining Microsoft, Anusua held positions with UT Austin and the University of Utah. Anusua is a frequent speaker at machine learning and AI conferences.