Yanda Chen
I am a Member of Technical Staff (Research Scientist) on the Alignment Science team at Anthropic.
I work on natural language processing, AI safety, and machine learning.
Previously, I did my PhD in Computer Science at Columbia University, where I was very fortunate
to be co-advised by Prof. Kathy McKeown, Prof. He He, and Prof. Zhou Yu.
I received my bachelor's degree in Computer Science from Columbia University in April 2021.
Email / CV / Semantic Scholar / Twitter / Github
Research
My current research interests lie in two directions: i) Explainability: building
explainable deep learning systems and understanding how LLMs behave, and ii)
Reliability: improving the calibration and reducing the sensitivity of LLMs.
Below are my publications.
Parallel Structures in Pre-training Data Yield In-Context Learning
Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He
ACL, 2024
paper / code
We find that the in-context learning (ICL) ability of language models emerges from parallel
structures in the pre-training data: pairs of phrases following similar templates in the same
context window. Specifically, we show that removing parallel structures from the pre-training
data reduces LMs' ICL accuracy by 51% (vs. 2% for a random ablation). This drop persists
even when we exclude common patterns such as n-gram repetitions and long-range dependencies.
Towards Consistent Natural-Language Explanations via Explanation-Consistency Finetuning
Yanda Chen, Chandan Singh, Xiaodong Liu, Simiao Zuo, Bin Yu, He He, Jianfeng Gao
arXiv preprint, 2024
paper / code
We propose explanation-consistency finetuning (EC-finetuning), which adapts
LLMs to generate more consistent natural-language explanations on related examples by
finetuning them on synthetic data that is carefully constructed to contain consistent
explanations. EC-finetuning improves explanation consistency by 10.0% on four finetuning
datasets, and by 4.5% on seven out-of-distribution datasets.
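A minimal sketch of how such a finetuning set might be assembled (the helper names
generate_related_inputs and explain_consistently are hypothetical stand-ins for the LLM
calls that produce the synthetic data, not the paper's actual code):

```python
# Hypothetical sketch of EC-finetuning data construction (illustrative only).
# For each seed example, generate related inputs and pair them with explanations
# written to be consistent with the seed's explanation, then finetune on the result.

def build_ec_finetuning_data(seed_examples, generate_related_inputs, explain_consistently):
    """seed_examples: list of (input_text, answer, explanation) triples."""
    finetuning_data = []
    for inp, ans, expl in seed_examples:
        finetuning_data.append({"input": inp, "target": f"{ans} because {expl}"})
        # Synthesize related examples whose explanations follow the same reasoning.
        for rel_inp in generate_related_inputs(inp):
            rel_ans, rel_expl = explain_consistently(inp, expl, rel_inp)
            finetuning_data.append({"input": rel_inp, "target": f"{rel_ans} because {rel_expl}"})
    return finetuning_data
```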
Do Models Explain Themselves? Counterfactual Simulatability of
Natural Language Explanations
Yanda Chen, Ruiqi Zhong, Narutatsu Ri, Chen Zhao, He He, Jacob
Steinhardt, Zhou Yu, Kathleen McKeown
ICML (Spotlight), 2024
paper / code
We propose to evaluate the counterfactual simulatability of natural
language explanations: whether an explanation can enable humans to precisely
infer the model's outputs on diverse counterfactuals of the explained input. We
implement two metrics, precision and generality, and find that i)
LLMs' explanations have low precision, and ii) precision does not
correlate with plausibility.
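As a rough illustration of the precision metric under one possible formalization (an
assumption for illustration, not the paper's exact definition): precision is the fraction
of counterfactuals on which a simulator's guess, formed from the explanation, matches the
model's actual output.

```python
# Sketch of a counterfactual-simulatability precision score (assumed formalization).
# simulator_guess(explanation, counterfactual) returns a predicted label, or None when
# the explanation gives no basis for a guess; model_predict is the explained model.

def counterfactual_precision(explanation, counterfactuals, simulator_guess, model_predict):
    guessed, correct = 0, 0
    for cf in counterfactuals:
        guess = simulator_guess(explanation, cf)
        if guess is None:          # explanation is uninformative for this counterfactual
            continue
        guessed += 1
        correct += int(guess == model_predict(cf))
    return correct / guessed if guessed else float("nan")
```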
On the Relation between Sensitivity and Accuracy in In-context
Learning
Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He
EMNLP Findings, 2023
paper / code / poster
We find that label bias obscures true ICL sensitivity and that ICL sensitivity is
strongly and negatively correlated with accuracy. Motivated by our study, we
propose SenSel, a few-shot selective prediction method based on ICL
sensitivity.
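A minimal sketch of sensitivity-based abstention in the spirit of SenSel (the perturbation
and threshold below are assumptions, not the paper's exact procedure): measure how often the
prediction changes under perturbed prompts and abstain when that rate is too high.

```python
# Sketch of sensitivity-based selective prediction (illustrative, not the exact SenSel method).
import random
from collections import Counter

def sensitivity_select(demos, query, predict_fn, n_perturb=8, threshold=0.3):
    """predict_fn(demos, query) -> label. Abstain (return None) if predictions are
    unstable across perturbed prompts (here: random demonstration orderings)."""
    preds = []
    for _ in range(n_perturb):
        shuffled = random.sample(demos, len(demos))   # one simple prompt perturbation
        preds.append(predict_fn(shuffled, query))
    label, count = Counter(preds).most_common(1)[0]
    sensitivity = 1.0 - count / len(preds)            # disagreement with the majority label
    return label if sensitivity <= threshold else None
```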
In-context Learning Distillation: Transferring Few-shot Learning
Ability of Pre-trained Language Models
Yukun Huang, Yanda Chen, Zhou Yu, Kathleen McKeown
arXiv preprint, 2022
paper
We propose in-context learning distillation, which transfers in-context learning
(ICL) ability from large language models to small language models by augmenting
in-context tuning with teacher-student distillation. Experiments on LAMA and
CrossFit show that in-context learning distillation improves the ICL ability of
small language models.
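One plausible form of the objective, sketched with standard soft-label distillation in
PyTorch (a simplification; the paper's actual loss may differ): the student is trained on
in-context-tuning prompts with a mix of the ground-truth language-modeling loss and a KL
term toward the teacher's output distribution.

```python
# Illustrative in-context learning distillation loss (assumed form, not the paper's exact objective).
import torch
import torch.nn.functional as F

def icl_distillation_loss(student_logits, teacher_logits, target_ids, alpha=0.5, temperature=2.0):
    """student_logits, teacher_logits: [batch, seq, vocab] on the same in-context prompt;
    target_ids: [batch, seq] ground-truth tokens for the in-context tuning loss."""
    ce = F.cross_entropy(student_logits.flatten(0, 1), target_ids.flatten())
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * ce + (1 - alpha) * kd
```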
Meta-learning via Language Model In-context Tuning
Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
ACL, 2022
paper / code / slides
We propose a novel few-shot meta-learning method called in-context
tuning, where training examples are used as prefix in-context
demonstrations for task adaptation. We show that in-context tuning outperforms
MAML in terms of accuracy and eliminates several well-known oversensitivity
artifacts of few-shot language model prompting.
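A minimal sketch of the setup (the prompt format, model, and loss details below are
assumptions for illustration): each training instance packs a few demonstrations and a
query into one sequence, and the LM is finetuned with the usual language-modeling loss so
that it learns to use the demonstrations.

```python
# Illustrative in-context tuning step (not the paper's exact code).
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def build_prompt(demos, query_input):
    # Serialize K demonstrations followed by the query input.
    lines = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    lines.append(f"Input: {query_input}\nOutput:")
    return "\n\n".join(lines)

def in_context_tuning_step(task_examples, k=4):
    # One gradient step: sample K demonstrations and a held-out query from the same task.
    demos = random.sample(task_examples, k)
    query_x, query_y = random.choice([e for e in task_examples if e not in demos])
    text = build_prompt(demos, query_x) + " " + query_y
    batch = tokenizer(text, return_tensors="pt")
    # Language-modeling loss over the whole sequence (a simplification; the paper
    # may restrict the loss to the target tokens).
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```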
Cross-language Sentence Selection via Data Augmentation and
Rationale Training
Yanda Chen, Chris Kedzie, Suraj Nair, Petra Galuscakova, Rui Zhang,
Douglas Oard, Kathleen McKeown
ACL, 2021
paper / code / talk / slides
We propose a data augmentation strategy and a rationale training strategy for
cross-lingual sentence selection in low-resource settings where no labeled
relevance judgment is available for training. Our methods achieve
state-of-the-art results on three language pairs.
Improved Synthetic Training for Reading Comprehension
Yanda Chen, Md Arafat Sultan, Vittorio Castelli
arXiv preprint, 2020
paper
We propose two novel synthetic training strategies: targeted synthetic
pre-training (a method to select useful synthetic examples that target the weaknesses of
existing models) and synthetic knowledge distillation. The two techniques, when
combined, yield QA models that are simultaneously smaller, faster, and more
accurate.
Detecting and Reducing Bias in a High Stakes Domain
Ruiqi Zhong, Yanda Chen, Desmond Patton, Charlotte Selous, Kathy
McKeown
EMNLP, 2019
paper / code / poster
We propose a framework to systematically detect and reduce the language bias of deep
learning models in the high-stakes context of gang intervention.
Internships
Microsoft Research, Summer 2023, Mentors: Chandan Singh, Xiaodong Liu
AWS AI, Summer 2021, Mentor: He He
IBM Research, Summer 2020, Mentors: Arafat Sultan, Vittorio Castelli
Awards
Avanessians Doctoral Fellowships for Engineering Thought Leaders and Innovators in Data Science. 2023.
Mudd Doctoral Fellowship, Columbia SEAS. 2021.
Honorable Mention, CRA Undergraduate Research Awards. 2021.
Theodore R. Bashkow Research Award, Columbia Computer Science Dept. 2021.
Teaching
Natural Language Processing, Spring 2022 & Spring 2021
Analysis of Algorithms, Spring 2021 & Spring 2020