Speaker: David Talby



Building accurate longitudinal patient records using Spark NLP


Building databases that track real patients’ stories over time is essential for medical research, drug development, epidemiology, population health, and chronic disease management. Doing this well has traditionally presented three key challenges. First, much of the relevant information, such as patient demographics, comorbidities, history, and social determinants of health, is available only in free-text documents and notes. Second, the data points about each patient contain gaps and conflicts that must be resolved. Third, most analyses are useful only with a large number of both patients and variables, which in turn means that building these databases manually is often impractical.
This session describes these challenges in the context of real-world projects and use cases. We’ll then cover how recent advances in natural language processing (NLP) and transfer learning have raised the accuracy and scale that can be achieved. We’ll share results and benchmarks from applying these techniques with Spark NLP for Healthcare, along with best practices and lessons learned from early adopters of the technology.


David Talby is chief technology officer at John Snow Labs, helping healthcare and life science companies put AI to good use. David is the creator of Spark NLP – the world’s most widely used natural language processing library in the enterprise. He has extensive experience building and running web-scale software platforms and teams: in startups, at Microsoft’s Bing in the US and Europe, and at Amazon, where he worked on scaling financial systems in Seattle and the UK. David holds a PhD in computer science and master’s degrees in both computer science and business administration.