Scaling Faithful Reasoning in Large Language Models
Sarah Chen, Wei Zhang, Aditya Ramesh, Christopher Manning
I am an Associate Professor in the Department of Electrical Engineering and Computer Science at MIT, and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL).
My research sits at the intersection of natural language processing and machine learning, with a particular focus on building language systems that reason reliably, explain their decisions, and align with human intentions.
Before joining MIT, I was a research scientist at Google DeepMind. I completed my PhD at Stanford University under the supervision of Prof. Christopher Manning.
PhD in Natural Language Processing
Stanford University, Computer Science
2013 - 2018
Stanford, CA
Thesis: Structured Representations for Compositional Semantics
BSc (Honours) in Computer Science & Mathematics
University of Toronto, Computer Science
2009 - 2013
Toronto, Canada
Associate Professor in Computer Science
Massachusetts Institute of Technology, EECS / CSAIL
2021 - Present
Cambridge, MA
Research Scientist
Google DeepMind, Language Research
2018 - 2021
Mountain View, CA
Wei Zhang, Sarah Chen, Yann LeCun
Sarah Chen, Amanda Liu, David Silver
Sarah Chen, Priya Sharma, Carlos Rodriguez
Wei Zhang, Sarah Chen
Foundations of Faithful Reasoning in Language Models
Developing training methods and evaluation frameworks for improving logical consistency in large language models.
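One way to picture the evaluation side of this project is a toy logical-consistency check: a model that answers "yes" to a statement should answer "no" to its negation. The sketch below is a hypothetical illustration only, not the project's actual framework; the `stub` model and the `consistency_rate` helper are invented for this example, with the stub standing in for a real LLM call.

```python
def consistency_rate(model_answer, pairs):
    """Fraction of (statement, negation) pairs that the model answers
    with opposite yes/no labels -- a toy logical-consistency metric."""
    consistent = 0
    for statement, negation in pairs:
        if model_answer(statement) != model_answer(negation):
            consistent += 1
    return consistent / len(pairs)

# Stub "model": answers "no" iff the claim contains the word "not".
# A real evaluation would query an actual language model here.
stub = lambda q: "no" if " not " in f" {q} " else "yes"

pairs = [
    ("2 is even", "2 is not even"),
    ("Paris is in France", "Paris is not in France"),
]
print(consistency_rate(stub, pairs))  # 1.0 for this trivially consistent stub
```

In a real framework the statement/negation pairs would be generated systematically and the metric aggregated across domains, but the core contract is the same: consistency is measured over paired prompts, not single answers.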
Human-Aligned NLP Systems
Multi-institution project on building NLP systems that align with human values and intentions.
MIT Technology Review Innovators Under 35
Recognized for pioneering work on faithful reasoning in AI systems.
Best Paper Award
NSF CAREER Award
Early-career faculty award for research on interpretable language models.
Mechanistic interpretability of transformer models
Preference learning and RLHF alternatives
Multilingual NLP and cross-lingual transfer
LLM evaluation benchmarks
Postdoctoral position on our DARPA-funded project developing human-aligned NLP systems.
Requirements
PhD in NLP, ML, or a related field, with publications in top venues.
Graduate seminar covering modern approaches to NLP, including large language models, in-context learning, and alignment techniques.
Introduction to machine learning: supervised and unsupervised methods, with a focus on deep learning fundamentals.
Excited to share that our paper 'Scaling Faithful Reasoning in Large Language Models' has been accepted as an oral presentation at NeurIPS 2024!
I am recruiting 2 PhD students to start Fall 2025. Research areas: LLM reasoning, interpretability, and alignment. Please apply through the MIT EECS admissions portal.
MIT Technology Review
The Researchers Making AI Think More Clearly
Feature article on our group's work on faithful reasoning in language models.
Lex Fridman Podcast
AI Alignment: Where Are We Now?
Conversation about the current state of AI alignment research and practical approaches.
Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology