Assistant Professor
Kim Jaechul Graduate School of AI
KAIST
Seoul Campus Building 9 Suite 202
minjoon@lklab.io (KAIST)
seominjoon@gmail.com (personal)
[CV]
[Google Scholar]
[GitHub]
[Twitter]
[LinkedIn]
I am an Assistant Professor at KAIST AI, where I am the Director of the Language & Knowledge Lab.
I am interested in how language (and multimodal) intelligence is induced and how knowledge is acquired, which often means studying the learning dynamics of (multimodal) language models and developing novel methods based on that understanding.
I obtained my PhD in Computer Science at the University of Washington, where I was fortunate to be advised by
Hannaneh Hajishirzi and
Ali Farhadi
and supported by a Facebook Fellowship and an AI2 Key Scientific Challenges Award.
Before that, I obtained my BS in Electrical Engineering & Computer Science at the University of California, Berkeley.
[Prospective students: please read before you email me!]
Please see my Semantic Scholar or Google Scholar profile for the full list of publications.
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Hoyeon Chang, Jinho Park, Seonghyeon Ye, Sohee Yang, Youngkyung Seo, Du-Seong Chang, Minjoon Seo
NeurIPS 2024
Aligning to Thousands of Preferences via System Message Generalization
Seongyun Lee, Sue Hyun Park, Seungone Kim, Minjoon Seo
NeurIPS 2024
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Seungone Kim, Juyoung Suk, Shayne Longpre, Bill Yuchen Lin, Jamin Shin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo
EMNLP 2024
Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Utilization
Miyoung Ko, Sue Hyun Park, Joonsuk Park, Minjoon Seo
EMNLP 2024
Exploring the Practicality of Generative Retrieval on Dynamic Corpora
Soyoung Yoon, Chaeeun Kim, Hyunji Lee, Joel Jang, Sohee Yang, Minjoon Seo
EMNLP 2024
On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning
Geewook Kim, Minjoon Seo
EMNLP 2024
Rethinking the Role of Proxy Rewards in Language Model Alignment
Sungdong Kim, Minjoon Seo
EMNLP 2024
Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards
Hyeonbin Hwang, Doyoung Kim, Seungone Kim, Seonghyeon Ye, Minjoon Seo
EMNLP 2024 Findings
Semiparametric Token-Sequence Co-Supervision
Hyunji Lee, Doyoung Kim, Jihoon Jun, Se June Joo, Joel Jang, Kyoung-Woon On, Minjoon Seo
ACL 2024
LangBridge: Multilingual Reasoning Without Multilingual Supervision
Dongkeun Yoon, Joel Jang, Sungdong Kim, Seungone Kim, Sheikh Shafayat, Minjoon Seo
ACL 2024
Aligning Large Language Models by On-Policy Self-Judgment
Sangkyu Lee, Sungdong Kim, Ashkan Yousefpour, Minjoon Seo, Kang Min Yoo, Youngjae Yu
ACL 2024
Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation
Seongyun Lee, Seungone Kim, Sue Hyun Park, Geewook Kim, Minjoon Seo
ACL 2024 Findings
REPLUG: Retrieval-Augmented Black-Box Language Models
Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Rich James, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih
NAACL 2024
Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision
Seongyun Lee, Sue Hyun Park, Yongrae Jo, Minjoon Seo
NAACL 2024
KTRL+F: Knowledge-Augmented In-Document Search
Hanseok Oh, Haebin Shin, Miyoung Ko, Hyunji Lee, Minjoon Seo
NAACL 2024
How Well Do Large Language Models Truly Ground?
Hyunji Lee, Se June Joo, Chaeeun Kim, Joel Jang, Doyoung Kim, Kyoung-Woon On, Minjoon Seo
NAACL 2024
Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis
Sohee Yang, Jonghyeon Kim, Joel Jang, Seonghyeon Ye, Hyunji Lee, Minjoon Seo
TACL 2024
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
spotlight
Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, Minjoon Seo
ICLR 2024
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
Seungone Kim, Jamin Shin, Yejin Cho, Joel Jang, Shayne Longpre, Hwaran Lee, Sangdoo Yun, Seongjin Shin, Sungdong Kim, James Thorne, Minjoon Seo
ICLR 2024
SuRe: Improving Open-domain Question Answering of LLMs via Summarized Retrieval
Jaehyung Kim, Jaehyun Nam, Sangwoo Mo, Jongjin Park, Sang-Woo Lee, Minjoon Seo, Jung-Woo Ha, Jinwoo Shin
ICLR 2024
Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following
Seonghyeon Ye, Hyeonbin Hwang, Sohee Yang, Hyeongu Yun, Yireun Kim, Minjoon Seo
AAAI 2024
A Bayesian Approach To Analysing Training Data Attribution In Deep Learning
Elisa Nguyen, Minjoon Seo, Seong Joon Oh
NeurIPS 2023
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
Seungone Kim, Se June Joo, Doyoung Kim, Joel Jang, Seonghyeon Ye, Jamin Shin, Minjoon Seo
EMNLP 2023
Aligning Large Language Models through Synthetic Feedback
Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, Kang Min Yoo, Minjoon Seo
EMNLP 2023
An Integrated Search System for Korea Weather Data
Jinkyung Jo, Dayeon Ki, Soyoung Yoon, Minjoon Seo
EMNLP 2023 Industry Track
Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt
Seonghyeon Ye, Joel Jang, Doyoung Kim, Yongrae Jo, Minjoon Seo
EMNLP 2023 Findings
Gradient Ascent Post-training Enhances Language Model Generalization
Dongkeun Yoon*, Joel Jang*, Sungdong Kim, Minjoon Seo
ACL 2023 (short)
Towards standardizing Korean Grammatical Error Correction: Datasets and Annotation
Soyoung Yoon, Sungjoon Park, Gyuwan Kim, Junhee Cho, Kihyo Park, Gyu Tae Kim, Minjoon Seo, Alice Oh
ACL 2023
[paper]
Knowledge Unlearning for Mitigating Privacy Risks in Language Models
Joel Jang, Dongkeun Yoon, Sohee Yang, Sungmin Cha, Moontae Lee, Lajanugen Logeswaran, Minjoon Seo
ACL 2023
[paper]
Nonparametric Decoding for Generative Retrieval
Hyunji Lee, Jaeyoung Kim, Hoyeon Chang, Hanseok Oh, Sohee Yang, Vlad Karpukhin, Yi Lu, Minjoon Seo
ACL 2023 Findings
[paper]
Fixed Input Parameterization for Efficient Prompting
Eunbi Choi, Yongrae Jo, Joel Jang, Joonwon Jang, Minjoon Seo
ACL 2023 Findings
[paper]
Comparing and Contrasting Claims on Contentious Issues
Miyoung Ko, Ingyu Seong, Hwaran Lee, Joonsuk Park, Minsuk Chang, Minjoon Seo
ACL 2023 Findings
[paper]
Two Examples are Better than One: Context Regularization for Gradient-based Prompt Tuning
Hyeonmin Ha, Soyoung Jung, Jinsol Park, Minjoon Seo, Seung-won Hwang, Byung-Gon Chun
ACL 2023 Findings
Exploring the Benefits of Training Expert Language Models over Instruction Tuning
Joel Jang, Seungone Kim, Seonghyeon Ye, Doyoung Kim, Lajanugen Logeswaran, Moontae Lee, Kyungjae Lee, Minjoon Seo
ICML 2023
[paper]
Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
Seonghyeon Ye, Doyoung Kim, Joel Jang, Joongbo Shin, Minjoon Seo
ICLR 2023
[paper]
Generative Multi-hop Retrieval
Hyunji Lee, Sohee Yang, Hanseok Oh, Minjoon Seo
EMNLP 2022
[paper]
TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models
Joel Jang, Seonghyeon Ye, Changho Lee, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, Minjoon Seo
EMNLP 2022
[paper]
Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts
Joel Jang, Seonghyeon Ye, Minjoon Seo
NeurIPS 2022 Workshop on Transfer Learning for NLP (TL4NLP)
[paper]
EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records
Gyubok Lee, Hyeonji Hwang, Seongsu Bae, Yeonsu Kwon, Woncheol Shin, Seongjun Yang, Minjoon Seo, Jong-Yeup Kim, Edward Choi
NeurIPS 2022 Datasets and Benchmarks
[paper]
A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction
Wonseok Hwang, Dongjun Lee, Kyoungyeon Cho, Hanuhl Lee, Minjoon Seo
NeurIPS 2022 Datasets and Benchmarks
[paper]
Is Retriever Merely an Approximator of Reader?
Sohee Yang, Minjoon Seo
ACL 2022 Workshop on Semiparametric Methods in NLP (Spa-NLP)
[paper]
Towards Continual Knowledge Learning of Language Models
Joel Jang, Seonghyeon Ye, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, Stanley Jungkyu Choi, Minjoon Seo
ICLR 2022
[paper]
ViSeRet: A simple yet effective approach to moment retrieval via fine-grained video segmentation
Aiden Lee, Hanseok Oh, Minjoon Seo
ICCV 2021 Workshop on Closing the Loop between Vision and Language (CLVL)
1st place at ICCV 2021 VALUE Challenge Retrieval Track
[paper]
Cost-effective End-to-end Information Extraction for Semi-structured Document Images
Wonseok Hwang, Hyunji Lee, Jinyeong Yim, Geewook Kim, Minjoon Seo
EMNLP 2021 (short)
[paper]
Spatial Dependency Parsing for Semi-Structured Document Information Extraction
Wonseok Hwang, Jinyeong Yim, Seunghyun Park, Sohee Yang, Minjoon Seo
ACL 2021 Findings
[paper]
Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering
Sohee Yang, Minjoon Seo
NAACL 2021 (short)
1st place at NeurIPS 2020 EfficientQA Challenge 500MB Track Human Eval
[paper]
[code]
Syntactic Question Abstraction and Retrieval for Data-Scarce Semantic Parsing
Wonseok Hwang, Jinyeong Yim, Seunghyun Park, Minjoon Seo
AKBC 2020
[paper]
Contextualized Sparse Representations for Real-Time Open-Domain Question Answering
Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang
ACL 2020 (short)
[paper]
[code]
A Comprehensive Exploration on WikiSQL with Table-Aware Word Contextualization
Wonseok Hwang, Jinyeong Yim, Seunghyun Park, Minjoon Seo
NeurIPS 2019 KR2ML Workshop
[paper]
[code]
Mixture Content Selection for Diverse Sequence Generation
Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
EMNLP 2019
[paper]
[code]
[tweet]
[blog]
Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index
Minjoon Seo*, Jinhyuk Lee*, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi
ACL 2019
[paper]
[code]
[demo]
[slides.pdf]
[slides.pptx]
[tweet]
Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension
Minjoon Seo, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi
EMNLP 2018 (short)
[paper]
[code]
[slides]
[website]
Neural Speed Reading via Skim-RNN
Minjoon Seo*, Sewon Min*, Ali Farhadi, Hannaneh Hajishirzi
ICLR 2018
[paper]
[poster]
[slides]
Zero-Shot Relation Extraction via Reading Comprehension
Omer Levy, Minjoon Seo, Eunsol Choi, Luke Zettlemoyer
CoNLL 2017
[paper]
[website]
Question Answering through Transfer Learning from Large Fine-Grained Supervision Data
Sewon Min, Minjoon Seo, Hannaneh Hajishirzi
ACL 2017 (short)
[paper]
Are You Smarter than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension
Aniruddha Kembhavi, Minjoon Seo, Dustin Schwenk, Jonghyun Choi, Ali Farhadi, Hannaneh Hajishirzi
CVPR 2017
[paper]
Bidirectional Attention Flow for Machine Comprehension
Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi
ICLR 2017
[paper]
[code]
[website]
[demo]
Query-Reduction Networks for Question Answering
Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi
ICLR 2017
[paper]
[code]
A Diagram is Worth a Dozen Images
Aniruddha Kembhavi, Mike Salvato*, Eric Kolve*, Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi
ECCV 2016
[paper]
[dqa-net code]
Solving Geometry Problems: Combining Text and Diagram Interpretation
Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi, Oren Etzioni, Clint Malcolm
EMNLP 2015
[paper]
[code]
[slides]
[website]
[demo]
BiliCam: Using Mobile Phones to Monitor Newborn Jaundice
best paper nomination
Lilian de Greef, Mayank Goel, Minjoon Seo, Eric Larson, James Stout, James Taylor, Shwetak Patel
UbiComp 2014
[paper]
[website]
Diagram Understanding in Geometry Questions
Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi, Oren Etzioni
AAAI 2014
[paper]
[code]
[slides]
[website]