Speaker "Anjali Shah" Details

Topic

Abstractive Summarization of Industry Specific Long Documents

Abstract

Recent advances in abstractive text summarization have concentrated on developing baselines with news article datasets. News articles tend to be narratively threaded, with shorter spans of text between dependent key concepts. Many downstream domain-specific summarization tasks, such as summarizing legal documents, involve longer spans of text between dependent key concepts. To address the needs of longer, domain-specific documents, we introduce a sequence-to-sequence model architecture that combines the encoder of one state-of-the-art model with the decoder of another. We test the performance of our proposed encoder-decoder model on the CNN/Daily Mail dataset to establish a baseline for comparison with recent state-of-the-art models. We then finetune the decoder on the legal-domain BillSum dataset and report the results of our experimental runs. We use the evaluation methods from SummEval, an evaluation toolkit designed specifically for summarization tasks. Our model finetuned on BillSum outperforms our CNN/Daily Mail baseline model on many important metrics for abstractive summarization.
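
To make the evaluation step more concrete, here is a minimal sketch of a unigram-overlap F1 score in the spirit of ROUGE-1, a metric commonly reported for summarization. It is not the SummEval implementation, and the abstract does not state which metrics the talk reports; the function name `rouge1_f1`, the whitespace tokenization, and the example sentences are assumptions made purely for illustration.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary.

    Illustrative only: lowercases and splits on whitespace, with no
    stemming or stopword handling (both are assumptions of this sketch).
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical sentences, not drawn from BillSum or CNN/Daily Mail.
    reference = "the bill amends the tax code"
    candidate = "the bill changes the tax code"
    print(round(rouge1_f1(reference, candidate), 3))
```

A score like this only measures word overlap, which is one reason summarization work typically reports several complementary metrics rather than a single number.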
Who is this presentation for?
Machine learning and deep learning scientists who work with text data
Prerequisite knowledge:
Some understanding of neural network architectures used in deep learning
What you'll learn?
How recent advances in natural language processing using deep learning are helping us gain insights from text data.

Profile

Anjali is a Senior Data Scientist at IBM's Technology Garage, helping clients across many industries (healthcare, financial services, and telecommunications) along their AI and hybrid cloud journey. Her expertise in applying cutting-edge technology to analyze structured and unstructured data has helped her clients convert data into actionable business insights. Her early career in software engineering focused on managing complex projects with strict deadlines and delivering multiple technology solutions. She has spoken at various data science and AI conferences. Prior to joining IBM, she delivered 80+ lectures as an Assistant Professor in Health Information Management. She has a Ph.D. in Biomedical Informatics and Applied Statistics, and Master’s and Bachelor’s degrees in Computer Science.