Speaker: Anirudh Koul



How advances in deep learning can empower the blind community


Motivated by the goal of making technology more accessible, we’ll explore how deep learning can enrich image understanding and, in turn, enable the blind community to experience and interact with the physical world more holistically than has ever been possible. The intersection of vision and language is a ripe area of research and, fueled by advances in deep learning, is shaping the future of artificial intelligence. After tracing how computer vision has evolved and outlining cutting-edge research in this area, we’ll cover object recognition, image captioning, visual question answering, and emotion recognition. Using a 152-layer neural network, we first discuss the successes and pitfalls of object recognition. Going beyond object classification, we then attempt to understand objects in context (as well as their relationships) and describe them in a sentence. We conclude by examining the exciting area of visual question answering, which enables blind users to get answers to questions about their surroundings. We also briefly cover Microsoft’s Cognitive Services, a set of machine-learning APIs for vision, speech, facial, and emotion recognition that make it straightforward for developers to integrate state-of-the-art image understanding into their own applications. By the end of the session, you’ll have developed intuition about what works and what doesn’t, understand the practical limitations that arise during development, and know how to use these techniques in your own applications.


Anirudh Koul is a Senior Data Scientist at Microsoft. He brings a decade of production-oriented applied research experience on petabyte-scale social media datasets. An entrepreneur at heart and driven by innovation, he has been running a mini startup team within Microsoft, prototyping ideas that use deep learning techniques for social good. He has worked on a variety of machine learning, NLP, deep learning, computer vision, and scalability projects at Yahoo, Microsoft, and Carnegie Mellon University. A regular at hackathons, he has won close to three dozen awards, including top-3 finishes for three consecutive years in the world’s largest private hackathon, which has 16,000 participants. He has also been invited to showcase some of his recent work at a White House AI event, HBO, and National Geographic, as well as to the Prime Minister of Canada. You can reach him at @anirudhkoul.