Minjoon Seo (서민준)

Assistant Professor
Kim Jaechul Graduate School of AI
KAIST
Seoul Campus Building 9 Suite 202

minjoon@lklab.io (KAIST)
seominjoon@gmail.com (personal)
[CV] [Google Scholar] [GitHub] [Twitter] [LinkedIn]

Minjoon Seo, 2024

I am an Assistant Professor at KAIST AI, where I am the Director of the Language & Knowledge Lab. I am interested in how language (and multimodal) intelligence is induced and how knowledge is acquired, which often means studying the learning dynamics of (multimodal) language models and developing novel methods based on that understanding.

I obtained my PhD in Computer Science at the University of Washington, where I was fortunate to be advised by Hannaneh Hajishirzi and Ali Farhadi and supported by a Facebook Fellowship and an AI2 Key Scientific Challenges Award. Before that, I obtained my BS in Electrical Engineering & Computer Science at the University of California, Berkeley.

[Prospective students: please read before you email me!]


Selected Papers

Please see my Semantic Scholar or Google Scholar profiles for the full list.

2024

  • How Do Large Language Models Acquire Factual Knowledge During Pretraining?
    Hoyeon Chang, Jinho Park, Seonghyeon Ye, Sohee Yang, Youngkyung Seo, Du-Seong Chang, Minjoon Seo
    NeurIPS 2024

  • Aligning to Thousands of Preferences via System Message Generalization
    Seongyun Lee, Sue Hyun Park, Seungone Kim, Minjoon Seo
    NeurIPS 2024

  • Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
    Seungone Kim, Juyoung Suk, Shayne Longpre, Bill Yuchen Lin, Jamin Shin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo
    EMNLP 2024

  • Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Utilization
    Miyoung Ko, Sue Hyun Park, Joonsuk Park, Minjoon Seo
    EMNLP 2024

  • Exploring the Practicality of Generative Retrieval on Dynamic Corpora
    Soyoung Yoon, Chaeeun Kim, Hyunji Lee, Joel Jang, Sohee Yang, Minjoon Seo
    EMNLP 2024

  • On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning
    Geewook Kim, Minjoon Seo
    EMNLP 2024

  • Rethinking the Role of Proxy Rewards in Language Model Alignment
    Sungdong Kim, Minjoon Seo
    EMNLP 2024

  • Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards
    Hyeonbin Hwang, Doyoung Kim, Seungone Kim, Seonghyeon Ye, Minjoon Seo
    EMNLP 2024 Findings

  • Semiparametric Token-Sequence Co-Supervision
    Hyunji Lee, Doyoung Kim, Jihoon Jun, Se June Joo, Joel Jang, Kyoung-Woon On, Minjoon Seo
    ACL 2024

  • LangBridge: Multilingual Reasoning Without Multilingual Supervision
    Dongkeun Yoon, Joel Jang, Sungdong Kim, Seungone Kim, Sheikh Shafayat, Minjoon Seo
    ACL 2024

  • Aligning Large Language Models by On-Policy Self-Judgment
    Sangkyu Lee, Sungdong Kim, Ashkan Yousefpour, Minjoon Seo, Kang Min Yoo, Youngjae Yu
    ACL 2024

  • Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation
    Seongyun Lee, Seungone Kim, Sue Hyun Park, Geewook Kim, Minjoon Seo
    ACL 2024 Findings

  • REPLUG: Retrieval-Augmented Black-Box Language Models
    Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Rich James, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih
    NAACL 2024

  • Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision
    Seongyun Lee, Sue Hyun Park, Yongrae Jo, Minjoon Seo
    NAACL 2024

  • KTRL+F: Knowledge-Augmented In-Document Search
    Hanseok Oh, Haebin Shin, Miyoung Ko, Hyunji Lee, Minjoon Seo
    NAACL 2024

  • How Well Do Large Language Models Truly Ground?
    Hyunji Lee, Sejune Joo, Chaeeun Kim, Joel Jang, Doyoung Kim, Kyoung-Woon On, Minjoon Seo
    NAACL 2024

  • Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis
    Sohee Yang, Jonghyeon Kim, Joel Jang, Seonghyeon Ye, Hyunji Lee, Minjoon Seo
    TACL 2024

  • FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
    Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, Minjoon Seo
    ICLR 2024 (spotlight)

  • Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
    Seungone Kim, Jamin Shin, Yejin Cho, Joel Jang, Shayne Longpre, Hwaran Lee, Sangdoo Yun, Seongjin Shin, Sungdong Kim, James Thorne, Minjoon Seo
    ICLR 2024

  • SuRe: Improving Open-domain Question Answering of LLMs via Summarized Retrieval
    Jaehyung Kim, Jaehyun Nam, Sangwoo Mo, Jongjin Park, Sang-Woo Lee, Minjoon Seo, Jung-Woo Ha, Jinwoo Shin
    ICLR 2024

  • Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following
    Seonghyeon Ye, Hyeonbin Hwang, Sohee Yang, Hyeongu Yun, Yireun Kim, Minjoon Seo
    AAAI 2024

2023

  • A Bayesian Approach To Analysing Training Data Attribution In Deep Learning
    Elisa Nguyen, Minjoon Seo, Seong Joon Oh
    NeurIPS 2023

  • The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
    Seungone Kim, Se June Joo, Doyoung Kim, Joel Jang, Seonghyeon Ye, Jamin Shin, Minjoon Seo
    EMNLP 2023

  • Aligning Large Language Models through Synthetic Feedback
    Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, Kang Min Yoo, Minjoon Seo
    EMNLP 2023

  • An Integrated Search System for Korea Weather Data
    Jinkyung Jo, Dayeon Ki, Soyoung Yoon, Minjoon Seo
    EMNLP 2023 Industry Track

  • Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt
    Seonghyeon Ye, Joel Jang, Doyoung Kim, Yongrae Jo, Minjoon Seo
    EMNLP 2023 Findings

  • Gradient Ascent Post-training Enhances Language Model Generalization
    Dongkeun Yoon*, Joel Jang*, Sungdong Kim, Minjoon Seo
    ACL 2023 (short)

  • Towards standardizing Korean Grammatical Error Correction: Datasets and Annotation
    Soyoung Yoon, Sungjoon Park, Gyuwan Kim, Junhee Cho, Kihyo Park, Gyu Tae Kim, Minjoon Seo, Alice Oh
    ACL 2023
    [paper]

  • Knowledge Unlearning for Mitigating Privacy Risks in Language Models
    Joel Jang, Dongkeun Yoon, Sohee Yang, Sungmin Cha, Moontae Lee, Lajanugen Logeswaran, Minjoon Seo
    ACL 2023
    [paper]

  • Nonparametric Decoding for Generative Retrieval
    Hyunji Lee, Jaeyoung Kim, Hoyeon Chang, Hanseok Oh, Sohee Yang, Vlad Karpukhin, Yi Lu, Minjoon Seo
    ACL 2023 Findings
    [paper]

  • Fixed Input Parameterization for Efficient Prompting
    Eunbi Choi, Yongrae Jo, Joel Jang, Joonwon Jang, Minjoon Seo
    ACL 2023 Findings
    [paper]

  • Comparing and Contrasting Claims on Contentious Issues
    Miyoung Ko, Ingyu Seong, Hwaran Lee, Joonsuk Park, Minsuk Chang, Minjoon Seo
    ACL 2023 Findings
    [paper]

  • Two Examples are Better than One: Context Regularization for Gradient-based Prompt Tuning
    Hyeonmin Ha, Soyoung Jung, Jinsol Park, Minjoon Seo, Seung-won Hwang, Byung-Gon Chun
    ACL 2023 Findings

  • Exploring the Benefits of Training Expert Language Models over Instruction Tuning
    Joel Jang, Seungone Kim, Seonghyeon Ye, Doyoung Kim, Lajanugen Logeswaran, Moontae Lee, Kyungjae Lee, Minjoon Seo
    ICML 2023
    [paper]

  • Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
    Seonghyeon Ye, Doyoung Kim, Joel Jang, Joongbo Shin, Minjoon Seo
    ICLR 2023
    [paper]

2022

  • Generative Multi-hop Retrieval
    Hyunji Lee, Sohee Yang, Hanseok Oh, Minjoon Seo
    EMNLP 2022
    [paper]

  • TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models
    Joel Jang, Seonghyeon Ye, Changho Lee, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, Minjoon Seo
    EMNLP 2022
    [paper]

  • Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts
    Joel Jang, Seonghyeon Ye, Minjoon Seo
    NeurIPS 2022 Workshop on Transfer Learning for NLP (TL4NLP)
    [paper]

  • EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records
    Gyubok Lee, Hyeonji Hwang, Seongsu Bae, Yeonsu Kwon, Woncheol Shin, Seongjun Yang, Minjoon Seo, Jong-Yeup Kim, Edward Choi
    NeurIPS 2022 Datasets and Benchmarks
    [paper]

  • A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction
    Wonseok Hwang, Dongjun Lee, Kyoungyeon Cho, Hanuhl Lee, Minjoon Seo
    NeurIPS 2022 Datasets and Benchmarks
    [paper]

  • Is Retriever Merely an Approximator of Reader?
    Sohee Yang, Minjoon Seo
    ACL 2022 Workshop on Semiparametric Methods in NLP (Spa-NLP)
    [paper]

  • Towards Continual Knowledge Learning of Language Models
    Joel Jang, Seonghyeon Ye, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, Stanley Jungkyu Choi, Minjoon Seo
    ICLR 2022
    [paper]

2021

  • ViSeRet: A simple yet effective approach to moment retrieval via fine-grained video segmentation
    Aiden Lee, Hanseok Oh, Minjoon Seo
    ICCV 2021 Workshop on Closing the Loop between Vision and Language (CLVL)
    1st place at ICCV 2021 VALUE Challenge Retrieval Track
    [paper]

  • Cost-effective End-to-end Information Extraction for Semi-structured Document Images
    Wonseok Hwang, Hyunji Lee, Jinyeong Yim, Geewook Kim, Minjoon Seo
    EMNLP 2021 (short)
    [paper]

  • Spatial Dependency Parsing for Semi-Structured Document Information Extraction
    Wonseok Hwang, Jinyeong Yim, Seunghyun Park, Sohee Yang, Minjoon Seo
    ACL 2021 Findings
    [paper]

  • Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering
    Sohee Yang, Minjoon Seo
    NAACL 2021 (short)
    1st place at NeurIPS 2020 EfficientQA Challenge 500MB Track Human Eval
    [paper] [code]

2020

  • Syntactic Question Abstraction and Retrieval for Data-Scarce Semantic Parsing
    Wonseok Hwang, Jinyeong Yim, Seunghyun Park, Minjoon Seo
    AKBC 2020
    [paper]

  • Contextualized Sparse Representations for Real-Time Open-Domain Question Answering
    Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang
    ACL 2020 (short)
    [paper] [code]

2019

  • A Comprehensive Exploration on WikiSQL with Table-Aware Word Contextualization
    Wonseok Hwang, Jinyeong Yim, Seunghyun Park, Minjoon Seo
    NeurIPS 2019 KR2ML Workshop
    [paper] [code]

  • Mixture Content Selection for Diverse Sequence Generation
    Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
    EMNLP 2019
    [paper] [code] [tweet] [blog]

  • Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index
    Minjoon Seo*, Jinhyuk Lee*, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi
    ACL 2019
    [paper] [code] [demo] [slides.pdf] [slides.pptx] [tweet]

2018

  • Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension
    Minjoon Seo, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi
    EMNLP 2018 (short)
    [paper] [code] [slides] [website]

  • Neural Speed Reading via Skim-RNN
    Minjoon Seo*, Sewon Min*, Ali Farhadi, Hannaneh Hajishirzi
    ICLR 2018
    [paper] [poster] [slides]

2017

  • Zero-Shot Relation Extraction via Reading Comprehension
    Omer Levy, Minjoon Seo, Eunsol Choi, Luke Zettlemoyer
    CoNLL 2017
    [paper] [website]

  • Question Answering through Transfer Learning from Large Fine-Grained Supervision Data
    Sewon Min, Minjoon Seo, Hannaneh Hajishirzi
    ACL 2017 (short)
    [paper]

  • Are You Smarter than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension
    Aniruddha Kembhavi, Minjoon Seo, Dustin Schwenk, Jonghyun Choi, Ali Farhadi, Hannaneh Hajishirzi
    CVPR 2017
    [paper]

  • Bidirectional Attention Flow for Machine Comprehension
    Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi
    ICLR 2017
    [paper] [code] [website] [demo]

  • Query-Reduction Networks for Question Answering
    Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi
    ICLR 2017
    [paper] [code]

2016 and earlier

  • A Diagram is Worth a Dozen Images
    Aniruddha Kembhavi, Mike Salvato*, Eric Kolve*, Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi
    ECCV 2016
    [paper] [dqa-net code]

  • Solving Geometry Problems: Combining Text and Diagram Interpretation
    Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi, Oren Etzioni, Clint Malcolm
    EMNLP 2015
    [paper] [code] [slides] [website] [demo]

  • BiliCam: Using Mobile Phones to Monitor Newborn Jaundice
    Lilian de Greef, Mayank Goel, Minjoon Seo, Eric Larson, James Stout, James Taylor, Shwetak Patel
    UbiComp 2014 (best paper nomination)
    [paper] [website]

  • Diagram Understanding in Geometry Questions
    Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi, Oren Etzioni
    AAAI 2014
    [paper] [code] [slides] [website]

* denotes equal contribution.