Ph.D. Student @ UW CSE
I am a Ph.D. student in Computer Science & Engineering at the University of Washington, working with Ranjay Krishna and Alex Ratner on tackling challenges in today’s large-scale machine learning environment.
Previously, I received my B.S. and M.S. from National Taiwan University, where I was fortunate to work with Hsuan-Tien Lin. Prior to joining UW, I spent a wonderful time visiting Carnegie Mellon University and the University of California, Los Angeles, where I worked with Pradeep Ravikumar and Cho-Jui Hsieh.
My research goal is to democratize AI development by making both data and model scaling more efficient and effective in today’s large-scale environment. My work spans four complementary areas that tackle different aspects of data and model scaling challenges. On the data side, I study (1) how to efficiently curate large datasets and (2) how to effectively align model behavior through data. On the model side, I tackle (3) how to efficiently deploy large models and (4) how to effectively adapt large models to downstream applications.
The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better.
Scott Geng, Cheng-Yu Hsieh, Vivek Ramanujan, Matthew Wallingford, Chun-Liang Li, Pang Wei Koh, Ranjay Krishna.
NeurIPS 2024.
DataComp-LM: In search of the next generation of training sets for language models.
DataComp-LM Team.
NeurIPS 2024.
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps.
Yung-Sung Chuang, Linlu Qiu, Cheng-Yu Hsieh, Ranjay Krishna, Yoon Kim, James Glass.
EMNLP 2024.
Is C4 Dataset Enough for Pruning? An Investigation of Calibration Data for LLM Pruning.
Abhinav Bandari, Lu Yin, Cheng-Yu Hsieh, Ajay Jaiswal, Tianlong Chen, Li Shen, Ranjay Krishna, Shiwei Liu.
EMNLP 2024.
The Hard Positive Truth about Vision-Language Compositionality.
Amita Kamath, Cheng-Yu Hsieh, Kai-Wei Chang, Ranjay Krishna.
ECCV 2024.
Found in the middle: Calibrating Positional Attention Bias Improves Long Context Utilization.
Cheng-Yu Hsieh, Yung-Sung Chuang, Chun-Liang Li, Zifeng Wang, Long Le, Abhishek Kumar, James R. Glass, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna*, Tomas Pfister*.
ACL Findings 2024.
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity.
Lu Yin, You Wu, Zhenyu Zhang, Cheng-Yu Hsieh, Yaqing Wang, Yiling Jia, Gen Li, Ajay Jaiswal, Mykola Pechenizkiy, Yi Liang, Michael Bendersky, Zhangyang Wang, Shiwei Liu.
ICML 2024.
SugarCrepe: Fixing Hackable Benchmarks for Vision-Language Compositionality.
Cheng-Yu Hsieh*, Jieyu Zhang*, Zixian Ma, Aniruddha Kembhavi, Ranjay Krishna.
NeurIPS 2023.
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models.
Cheng-Yu Hsieh, Si-An Chen, Chun-Liang Li, Yasuhisa Fujii, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna, Tomas Pfister.
Technical Report. 2023.
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes.
Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alexander Ratner, Ranjay Krishna, Chen-Yu Lee, Tomas Pfister.
ACL Findings 2023.
Nemo: Guiding and Contextualizing Weak Supervision for Interactive Data Programming.
Cheng-Yu Hsieh, Jieyu Zhang, and Alexander Ratner.
VLDB 2023.
A Survey on Programmatic Weak Supervision.
Jieyu Zhang*, Cheng-Yu Hsieh*, Yue Yu*, Chao Zhang, and Alexander Ratner.
Technical Report. 2022.
Understanding Programmatic Weak Supervision via Source-aware Influence Function.
Jieyu Zhang*, Haonan Wang*, Cheng-Yu Hsieh, and Alexander Ratner.
NeurIPS 2022.
Evaluations and Methods for Explanation through Robustness Analysis.
Cheng-Yu Hsieh, Chih-Kuan Yeh, Xuanqing Liu, Pradeep Ravikumar, Seungyeon Kim, Sanjiv Kumar, and Cho-Jui Hsieh.
ICLR 2021.
On the (In)fidelity and Sensitivity of Explanations.
Chih-Kuan Yeh, Cheng-Yu Hsieh, Arun Sai Suggala, David Inouye, and Pradeep Ravikumar.
NeurIPS 2019.
A Pseudo-Label Method for Coarse-to-Fine Multi-Label Learning with Limited Supervision.
Cheng-Yu Hsieh, Miao Xu, Gang Niu, Hsuan-Tien Lin, and Masashi Sugiyama.
Learning from Limited Labeled Data @ ICLR 2019.
A Deep Model with Local Surrogate Loss for General Cost-sensitive Multi-label Learning.
Cheng-Yu Hsieh, Yi-An Lin, and Hsuan-Tien Lin.
AAAI 2018.
Automatic Bridge Bidding using Deep Reinforcement Learning.
Chih-Kuan Yeh, Cheng-Yu Hsieh, and Hsuan-Tien Lin.
IEEE Transactions on Games 2018.
Research Intern, Apple Machine Learning Research
Mentors: Hadi Pouransari and Pavan Kumar Anasosalu Vasu
Spring 2024 - Present
Student Researcher, Google Cloud AI Research
Mentors: Chen-Yu Lee and Chun-Liang Li
Summer 2022 - Winter 2024
Research Intern, RIKEN AIP
Mentor: Masashi Sugiyama
April 2018 - July 2018