
Dr. Sarah Chen

Associate Professor of Computer Science

Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology

Large Language Models
Natural Language Processing
AI Alignment
Reasoning & Interpretability
Human-AI Interaction

About

I am an Associate Professor in the Department of Electrical Engineering and Computer Science at MIT, and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL).

My research sits at the intersection of natural language processing and machine learning, with a particular focus on building language systems that reason reliably, explain their decisions, and align with human intentions.

Before joining MIT, I was a research scientist at Google DeepMind. I completed my PhD at Stanford University under the supervision of Prof. Christopher Manning.

Education

PhD in Natural Language Processing

Stanford University, Computer Science

2013–2018

Stanford, CA

Thesis: Structured Representations for Compositional Semantics

BSc (Honours) in Computer Science & Mathematics

University of Toronto, Computer Science

2009–2013

Toronto, Canada

Experience

Associate Professor of Computer Science

Massachusetts Institute of Technology, EECS / CSAIL

2021–Present

Cambridge, MA

Research Scientist

Google DeepMind, Language Research

2018–2021

Mountain View, CA

Publications

Citations: 3,450
h-index: 42
i10-index: 58

Featured

Interpretable Attention Heads as Concept Detectors

Wei Zhang, Sarah Chen, Yann LeCun

ACL 2024 · Conference Paper · 45 citations

Human Preference Alignment Without Reinforcement Learning

Sarah Chen, Amanda Liu, David Silver

Nature MI 2023 · Journal Article · 312 citations

Efficient Fine-tuning of Multilingual Models for Low-Resource Languages

Sarah Chen, Priya Sharma, Carlos Rodriguez

EMNLP 2023 · Conference Paper · 78 citations

A Survey of Hallucination in Large Language Models

Wei Zhang, Sarah Chen

CSUR 2023 · Journal Article · 520 citations

Grants & Funding

Active Grants

Foundations of Faithful Reasoning in Language Models

National Science Foundation (NSF)
PI
$750,000 · 2023–2026

Developing training methods and evaluation frameworks for improving logical consistency in large language models.

Human-Aligned NLP Systems

DARPA
Co-PI
$1,200,000 · 2022–2025

Multi-institution project on building NLP systems that align with human values and intentions.

Awards & Honors

MIT Technology Review Innovators Under 35

MIT Technology Review, 2023

Recognized for pioneering work on faithful reasoning in AI systems.

Best Paper Award

NeurIPS 2024

NSF CAREER Award

National Science Foundation, 2022

Early-career faculty award for research on interpretable language models.

Lab Members

Current Members

Wei Zhang
Postdoctoral Researcher

Mechanistic interpretability of transformer models

Amanda Liu
PhD Student

Preference learning and RLHF alternatives

Carlos Rodriguez
PhD Student

Multilingual NLP and cross-lingual transfer

Yuki Tanaka
Master's Student

LLM evaluation benchmarks

Open Positions

PhD Student in LLM Reasoning

We are looking for 2 PhD students interested in improving the reasoning capabilities of large language models. A strong background in NLP or ML is required.

Requirements

MSc or equivalent in CS/ML/NLP. Strong programming skills in Python/PyTorch.

Apply by December 15, 2025
Postdoctoral Researcher — AI Alignment

Postdoctoral position on our DARPA-funded project on human-aligned NLP systems.

Requirements

PhD in NLP, ML, or a related field, with publications in top venues.

Apply by June 30, 2025

Courses

Current Courses

6.8610 · Advanced Natural Language Processing
Fall 2024

Graduate seminar covering modern approaches to NLP, including large language models, in-context learning, and alignment techniques.

Past Courses

6.3900 · Machine Learning
Spring 2024

Introduction to machine learning concepts, supervised and unsupervised methods, with a focus on deep learning fundamentals.

Announcements

Pinned

NeurIPS 2024 Oral Presentation

Sep 15, 2024

Excited to share that our paper 'Scaling Faithful Reasoning in Large Language Models' has been accepted as an oral presentation at NeurIPS 2024!


Looking for PhD Students — Fall 2025

Oct 1, 2024

I am recruiting 2 PhD students to start Fall 2025. Research areas: LLM reasoning, interpretability, and alignment. Please apply through the MIT EECS admissions portal.


Media & Press

MIT Technology Review

The Researchers Making AI Think More Clearly

Article
Jul 20, 2024

Feature article on our group's work on faithful reasoning in language models.

Lex Fridman Podcast

AI Alignment: Where Are We Now?

Podcast
Mar 15, 2024

Conversation about the current state of AI alignment research and practical approaches.


Contact

Department of Electrical Engineering and Computer Science

Massachusetts Institute of Technology