Speaker "Deepak Paramanand" Details

 

Topic

Less Processing More Analysing

Abstract

Our customers spend more time processing text documents than analyzing their content. On average, they analyze 10,000 documents (equivalent to 40,000 pages, 10 million words, or 1 million sentences) over a period of 6 months.
Using Natural Language Processing (e.g. NLTK in Python), we will:

1. Categorize these documents as Legal, Finance, or Environmental, based on the frequency of words in each document.
2. Provide a list of 5- to 20-word sentences, with their frequency by document.
3. Reduce the 1 million sentences to 100,000 within 7 days by removing redundant sentences.
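
The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the actual implementation: it uses the standard library in place of NLTK (the two helper functions stand in for NLTK's word and sentence tokenizers), the keyword sets in `CATEGORY_KEYWORDS` are made-up examples, and the Jaccard overlap threshold of 0.8 used to decide that two sentences are redundant is an assumption.

```python
import re
from collections import Counter

# Hypothetical seed vocabularies, for illustration only.
CATEGORY_KEYWORDS = {
    "Legal": {"contract", "clause", "liability", "court", "plaintiff"},
    "Finance": {"revenue", "invoice", "loan", "interest", "audit"},
    "Environmental": {"emissions", "pollution", "waste", "habitat", "carbon"},
}


def tokenize(text):
    """Lowercase word tokens (simple stand-in for an NLTK word tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Naive splitter on ., ! and ? (stand-in for an NLTK sentence tokenizer)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def categorize(text):
    """Step 1: choose the category whose keywords occur most often."""
    counts = Counter(tokenize(text))
    scores = {
        category: sum(counts[word] for word in words)
        for category, words in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Uncategorized"


def sentence_frequencies(text, min_words=5, max_words=20):
    """Step 2: count how often each 5- to 20-word sentence occurs in a document."""
    freq = Counter()
    for sentence in split_sentences(text):
        words = tokenize(sentence)
        if min_words <= len(words) <= max_words:
            freq[" ".join(words)] += 1
    return freq


def reduce_redundancy(sentences, threshold=0.8):
    """Step 3: keep a sentence only if its word set differs enough
    (Jaccard similarity below the threshold) from every sentence kept so far."""
    kept, kept_sets = [], []
    for sentence in sentences:
        words = set(tokenize(sentence))
        if not words:
            continue
        if any(len(words & k) / len(words | k) >= threshold for k in kept_sets):
            continue
        kept.append(sentence)
        kept_sets.append(words)
    return kept
```

Note that the pairwise comparison in `reduce_redundancy` grows quadratically with the number of sentences, so at the scale of 1 million sentences a real system would likely need an approximate method such as hashing or clustering.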

Further, if our customers don't like our categorization or reduction, they can use the default categorization and teach the software to look for specific words or phrases.
The more they teach the software, the better it will get. We intend to use TensorFlow for this.
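
To make the idea of "teaching" concrete, here is a minimal sketch in plain Python. It is deliberately not a TensorFlow model: it is a simple weighted phrase counter standing in for whatever learned model is eventually used. The class name `TeachableCategorizer`, its methods, and the weighting scheme are all hypothetical.

```python
from collections import Counter, defaultdict


class TeachableCategorizer:
    """Hypothetical stand-in for a learned model: users add weighted phrases."""

    def __init__(self):
        # category -> phrase -> accumulated weight
        self.weights = defaultdict(Counter)

    def teach(self, category, phrase, weight=1.0):
        """Record that a word or phrase is evidence for a category."""
        self.weights[category][phrase.lower()] += weight

    def score(self, text):
        """Sum weight * occurrences for every taught phrase (substring match)."""
        lowered = text.lower()
        return {
            category: sum(w * lowered.count(p) for p, w in phrases.items())
            for category, phrases in self.weights.items()
        }

    def categorize(self, text, default="Uncategorized"):
        """Return the highest-scoring category, or the default if nothing matches."""
        scores = self.score(text)
        if not scores or max(scores.values()) <= 0:
            return default
        return max(scores, key=scores.get)
```

Each call to `teach` shifts later results, which is the behavior the abstract describes: the more phrases a customer supplies, the more closely the categorization follows their own judgment.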

Ultimately, customers spend more time analyzing the content and less time processing the data.

Profile

Product Manager, AI, NLP & Blockchain.