Speaker: Sanghamitra Deb



"From Rocket Science to Data Science"


Data Scientist has been regarded as the sexiest job of the twenty-first century. As data in every industry keeps growing, the need to organize, explore, analyze, predict and summarize is insatiable. Data Science is creating new paradigms in data-driven business decisions. As the field emerges from its infancy, a wide range of skill sets is becoming an integral part of being a Data Scientist. In this talk I will discuss the different data-driven roles and the expertise required to be successful in them. I will highlight some of the unique challenges and rewards of working in a young and dynamic field.


Data Scientist: Accenture Tech Laboratory, May 2014-Present. Data modeling and machine learning for research and client projects. Other Duties:
Data Scientist: Gild, Nov 2013-March 2014. Data modeling and machine learning.
Postdoctoral Fellow: Argonne National Laboratory, June 2011-Oct 2013.
Research Associate: Lawrence Berkeley National Laboratory, August 2010-June 2011.
Ph.D., Physics: Drexel University, August 2005-June 2010.

Core Skills: Extensive experience with the Python machine learning stack: scipy, pandas, numpy, ipython, matplotlib, seaborn, vincent, scikit-learn, statsmodels, nltk, astroML, fuzzy-wuzzy, flask-rest api.
Expertise in Machine Learning and Data Modeling: time series, topic modeling, semantic modeling, data profiling & mapping, n-grams, NLP.
Data Munging: Experience with database technologies such as MySQL and MongoDB and data transformation tools such as Trifacta. Prototyping in Apache Spark using Python on the Hadoop ecosystem (e.g. BlueData) and DataBricks Cloud notebooks. Code deployment in EC2. Currently creating a back-end integration with

New Expertise Development: Data modeling in R. Working with visualization tools (d3, Tableau). Credited the courses Introduction to Databases by Prof. Jennifer Widom, Introduction to Data Science by Prof. Bill Howe, and Machine Learning by Prof. Andrew Ng. Attended PyData 2014, Graphlab 2014, MLConf 2014. Attended several Spark training sessions.

Hobby Projects:
Jobs Jam: Applied NLP on unstructured data to create an application to help middle range job seekers such as veterans.
Kaggle competitions: Finished the Amazon challenge using one-hot encoding and logistic regression, the Titanic problem using and the digit recognition challenge using random forest to achieve 96.37% accuracy.
San Francisco crime data: Statistical analysis of open crime data for the city of San Francisco.
DS Interview process: Created an IPython-based presentation covering the different aspects of interviewing for a Data Scientist position.
Speaking on “Becoming A DataScientist” at GITPRO.
Organizer of Women in Data Science. Arranging meetups and for local female data scientists.
@sangha_deb 5105016863 (cell) Github: Google Scholar: