Speaker "Jie Wang" Details



Semantic SentenceRank to Facilitate Reading for Understanding


In this talk, we present Semantic SentenceRank (SSR), which ranks the sentences in a single document according to their relative importance, and demonstrate how to use it to facilitate reading for understanding. SSR begins with Semantic PhraseRank (SPR), which scores phrases and words by running an article-structure-biased (ASB) PageRank algorithm on a weighted semantic phrase-word graph built from the given document. SPR then scores each sentence by summing the scores of the phrases and words it contains after Softplus elevation. Next, SSR builds a semantic sentence graph based on Word Mover's Distances, scores its nodes using a modified ASB PageRank, and combines these node scores with the SPR scores to score sentences. Finally, SSR ranks sentences based on sentence scores and topic diversity, using affinity propagation for subtopic clustering, which dynamically determines an appropriate number of topics during clustering. We show that on the SummBank benchmarks, SSR significantly outperforms each individual human judge and produces almost the same ranking of sentences as the combined ranking of all judges. We then demo a text mining tool that uses SSR and other text mining techniques to assist reading. In particular, the tool highlights blocks of sentences in different colors in descending order of importance, where the sentences in a block need not be consecutive in the document. This allows the user to read the most important block first, then the next most important block interleaved with the previously read blocks in their original order, and to continue in this fashion until the entire document or a chosen layer of blocks has been read.
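The SPR scoring step described in the abstract can be illustrated with a small sketch. It is not the speaker's implementation: the abstract does not specify how the article-structure bias is computed, how the phrase-word graph is weighted, or exactly where Softplus is applied. The sketch assumes the bias is a plain teleport (personalization) vector in PageRank, uses a hand-made toy graph, and applies Softplus to each term score before summing. The names `biased_pagerank`, `sentence_score`, `weights`, and `bias` are hypothetical. The sentence graph, the Word Mover's Distance step, and the clustering step are not covered.

```python
import math


def biased_pagerank(weights, bias, damping=0.85, iters=100, tol=1e-10):
    """PageRank by power iteration with a non-uniform teleport vector.

    weights[u][v] is the non-negative weight of edge u -> v.
    bias[u] is a non-negative preference for node u (assumed here to stand
    in for an article-structure bias); it is normalized to sum to 1.
    """
    nodes = list(bias)
    total_bias = sum(bias.values())
    teleport = {u: bias[u] / total_bias for u in nodes}
    out_sum = {u: sum(weights.get(u, {}).values()) for u in nodes}
    rank = dict(teleport)
    for _ in range(iters):
        # Rank held by nodes with no outgoing edges is spread via teleport.
        dangling = sum(rank[u] for u in nodes if out_sum[u] == 0)
        new = {u: (1 - damping + damping * dangling) * teleport[u] for u in nodes}
        for u in nodes:
            if out_sum[u] == 0:
                continue
            for v, w in weights[u].items():
                new[v] += damping * rank[u] * w / out_sum[u]
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank


def softplus(x):
    """Softplus: log(1 + e^x), a smooth and strictly increasing function."""
    return math.log1p(math.exp(x))


def sentence_score(terms, term_scores):
    """Sum of Softplus-elevated scores of the terms in a sentence (assumed form)."""
    return sum(softplus(term_scores[t]) for t in terms if t in term_scores)


# Toy phrase-word graph (hypothetical data, not taken from the talk).
weights = {
    "ranking": {"sentence": 2.0, "graph": 1.0},
    "sentence": {"ranking": 2.0, "graph": 1.0},
    "graph": {"ranking": 1.0, "sentence": 1.0},
    "demo": {"graph": 1.0},
}
# Assumed bias: terms appearing in prominent positions get a larger weight.
bias = {"ranking": 2.0, "sentence": 2.0, "graph": 1.0, "demo": 1.0}

scores = biased_pagerank(weights, bias)
s1 = sentence_score(["ranking", "sentence"], scores)
s2 = sentence_score(["demo"], scores)
```

In this toy example the sentence containing two well-connected, high-bias terms (`s1`) scores higher than the sentence containing a single peripheral term (`s2`). Because Softplus maps every score to a value above log 2, a sentence's total under this assumed form also grows with the number of scored terms it contains.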


Dr. Jie Wang is a Professor of Computer Science at the University of Massachusetts Lowell and an adjunct data scientist at the VA Hospital in Massachusetts. He chaired the department from 2007 to 2016. He received a PhD in Computer Science from Boston University in 1990, and an MS in Computer Science and a BS in Computational Mathematics from Sun Yat-sen University in 1984 and 1982, respectively. He has about 30 years of teaching and research experience at the university level and has worked as a network security consultant for a national bank. His research interests include data modeling and applications, text mining and learning, text automation systems, machine learning, algorithms and combinatorial optimization, medical computation, network security, and computational complexity theory. He has published over 180 journal and conference papers, 12 books, and 4 edited books. His research has been funded by the National Science Foundation, IBM, Intel, and a few startup companies. He is active in professional service, including chairing conference program committees, organizing workshops, and serving as a journal editor and as the editor-in-chief of a book series on mathematical and interdisciplinary modeling. He has graduated 18 PhD students and is currently directing 7 PhD students.