
Speaker "Matei Negulescu" Details

Topic

Building Genomic Data processing and machine learning workflows using Spark

Abstract

At Epinomics, we are advancing epigenetic research to drive personalized medicine through epigenomic data analysis. Our goal is to provide the community with an analysis resource that promotes high-quality, replicable, and interpretable results. We work with academic and commercial users to bring their genomic sequencing data and metadata into our system. From the sequenced genome we extract epigenetic features known as "chromatin accessibility", which is indicative of the instrumental epigenetic changes responsible for differential gene expression and disease development. We have built a Spark-based pipeline that retrieves chromatin accessibility data from the epigenome, finds overlapping accessibility using GraphX, clusters the data, and runs machine-learning algorithms on it. In this talk we will provide a primer on epigenomics, describe how we built a Spark-based data pipeline focused on parallel bioinformatic analysis, and show how we use machine-learning models to derive insights for building an epigenomic map that can help accelerate research into personalized immunotherapies.

Profile

Matei is a computer scientist specializing in Human-Computer Interaction. His love of user-centric products started during his PhD in Canada and continued at companies such as SAP and Google. At Epinomics, Matei is working on a new line of scientific management and epigenomic analytics products.