Research
Our lab works on advancing machine learning and artificial intelligence and on using them to transform health care. Below we describe our three broad focus areas and list representative papers for each. For more papers, see our publications section.
Clinical Prediction
These are exciting times for the practice of medicine. The rapid adoption of electronic health records has created a wealth of new data about patients, which is a goldmine for improving our understanding of human health. Our lab develops algorithms that use this data to make better clinical predictions in areas like antibiotic resistance, multiple myeloma, Parkinson’s disease, and other chronic illnesses. We also work on fairness and interpretability so that these clinical predictions are accurate, useful, and equitable.
- I. Chen, F. Johansson, D. Sontag. Why is My Classifier Discriminatory? 32nd International Conference on Neural Information Processing Systems (NeurIPS), Dec. 2018.
- N. Razavian, J. Marcus, D. Sontag. Multi-task Prediction of Disease Onsets from Longitudinal Lab Tests. Proceedings of the 1st Machine Learning for Healthcare Conference (MLHC), 2016. [code]
- X. Wang, D. Sontag, F. Wang. Unsupervised Learning of Disease Progression Models. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Aug. 2014. [Slides]
Probabilistic and Causal Inference
Probabilistic inference is one of the cornerstones of machine learning. Whether for parameter inference at training time or for answering queries at test time, we build new inference algorithms for undirected and directed graphical models, along with tools to analyze their efficacy. We also work on probabilistic inference in deep generative models, developing new inference networks that learn to amortize approximate variational inference. In many instances, the quantity of interest within a Bayesian network is causal. To that end, our lab develops novel methods for answering causal queries that work effectively with high-dimensional data.
- M. Oberst, D. Sontag. Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models. Thirty-Sixth International Conference on Machine Learning (ICML), 2019. [code]
- H. Lang, D. Sontag, A. Vijayaraghavan. Optimality of Approximate Inference Algorithms on Stable Instances. Twenty-First International Conference on Artificial Intelligence and Statistics (AISTATS), 2018.
- C. Louizos, U. Shalit, J. Mooij, D. Sontag, R. S. Zemel, M. Welling. Causal Effect Inference with Deep Latent-Variable Models. 31st International Conference on Neural Information Processing Systems (NeurIPS), 2017. [code]
- R. Krishnan, U. Shalit, D. Sontag. Structured Inference Networks for Nonlinear State Space Models, Thirty-First AAAI Conference on Artificial Intelligence, Feb. 2017. [code]
- F. Johansson, U. Shalit, D. Sontag. Learning Representations for Counterfactual Inference. 33rd International Conference on Machine Learning (ICML), June 2016. [code]
- D. Sontag, T. Meltzer, A. Globerson, Y. Weiss, T. Jaakkola. Tightening LP Relaxations for MAP using Message Passing. Uncertainty in Artificial Intelligence (UAI) 24, July 2008. [code]
Medical Knowledge and Extraction
Today’s electronic health records are predominantly a place for recording a patient’s health data. We aim to develop the foundation for the next generation of intelligent electronic health records, in which machine learning and artificial intelligence are built in to help with medical diagnosis, automatically trigger clinical decision support, personalize treatment suggestions, autonomously retrieve relevant past medical history, make documentation faster and of higher quality, and predict adverse events before they happen. A major challenge is the need for robust machine learning algorithms that are safe and interpretable, can learn from little labeled training data, understand natural language, and generalize well across medical settings and institutions.
- M. Rotmensch, Y. Halpern, A. Tlimat, S. Horng, D. Sontag. Learning a Health Knowledge Graph from Electronic Medical Records, Nature Scientific Reports, July 2017. [Supplement]
- S. Blecker, S. Katz, L. Horwitz, G. Kuperman, H. Park, A. Gold, D. Sontag. Comparison of Approaches for Heart Failure Case Identification From Electronic Health Record Data. Journal of the American Medical Association (JAMA) Cardiology, Oct. 2016.
- Y. Kim, Y. Jernite, D. Sontag, A.M. Rush. Character-Aware Neural Language Models. Thirtieth AAAI Conference on Artificial Intelligence. 2016. [code]
- Y. Halpern, S. Horng, Y. Choi, D. Sontag. Electronic Medical Record Phenotyping using the Anchor and Learn Framework. Journal of the American Medical Informatics Association (JAMIA), April 2016. [Supplement] [code]
- Y. Jernite, Y. Halpern, S. Horng, D. Sontag. Predicting Chief Complaints at Triage Time in the Emergency Department. NeurIPS 2013 Workshop on Machine Learning for Clinical Data Analysis and Healthcare, Dec. 2013.