My research is in Natural Language Processing and Machine Learning, with an emphasis on applications in health.
Working in health naturally motivates the methodological problems I have worked on. These include model interpretability, learning with limited supervision from diverse sources, human-in-the-loop/hybrid systems, and the trustworthiness of model outputs. For more details, see recent publications here.
On the applications side, one thread of my research concerns developing language technologies to automate (or semi-automate) biomedical evidence synthesis. Here is an episode of the NLP Highlights podcast in which I discuss this work, here is a (brief) talk I gave at SciNLP 2020, and here is an article written for a lay audience about the effort. Elsewhere, I have worked on models for processing notes in Electronic Health Records.
Eric Lehman, Sarthak Jain, Karl Pichotta, Yoav Goldberg and Byron C. Wallace. Does BERT Pretrained on Clinical Notes Reveal Sensitive Data? NAACL; 2021.
Xiongyi Zhang, Jan-Willem van de Meent and Byron C. Wallace. Disentangling Representations of Text by Masking Transformers. EMNLP; 2021.
Sarthak Jain, Sarah Wiegreffe, Yuval Pinter, and Byron C. Wallace. Learning to Faithfully Rationalize by Construction. ACL; 2020.
03/18/2021 Student Paper Award @ AMIA Summits
Our paper, led by my PhD student Ben Nye, received the best student-led paper award at the AMIA (Virtual) Summits.
01/20/2021 Lutron Award
I received the 2021 Joel and Ruth Spira Excellence in Teaching Award for the Khoury College of Computer Sciences.
08/15/2019 NIH/NLM R01 Renewed
The NIH has renewed the R01 grant that supports our work on RobotReviewer and related methods!
06/24/2019 NSF Grant
The NSF has awarded Jan-Willem van de Meent and me a grant to study disentangled representations for text!