Minjoon Seo (서민준)

Assistant Professor
Kim Jaechul Graduate School of AI
Seoul Campus Building 9 Suite 202

Chief Scientist
Twelve Labs

seominjoon@gmail.com (permanent)
[CV] [Google Scholar] [GitHub] [Twitter] [LinkedIn]

I am an Assistant Professor at KAIST AI, where I am the Director of the Language & Knowledge Lab, and the Chief Scientist of Twelve Labs. I am interested in how world knowledge can be encoded (e.g., nonparametric memory and large language models) and accessed (e.g., chat interfaces and instruction fine-tuning), and how new knowledge can be discovered (e.g., reasoning and entailment).

I obtained my PhD in Computer Science at the University of Washington, where I was fortunate to be advised by Hannaneh Hajishirzi and Ali Farhadi and supported by the Facebook Fellowship and the AI2 Key Scientific Challenges Award. Before that, I obtained my BS in Electrical Engineering & Computer Science at the University of California, Berkeley.

[Prospective students: please read before you email me!]

Selected Papers

Please see my Semantic Scholar or Google Scholar profiles for the full list.


  • Semiparametric Token-Sequence Co-Supervision
    Hyunji Lee, Doyoung Kim, Jihoon Jun, Se June Joo, Joel Jang, Kyoung-Woon On, Minjoon Seo
    ACL 2024

  • LangBridge: Multilingual Reasoning Without Multilingual Supervision
    Dongkeun Yoon, Joel Jang, Sungdong Kim, Seungone Kim, Sheikh Shafayat, Minjoon Seo
    ACL 2024

  • Aligning Large Language Models by On-Policy Self-Judgment
    Sangkyu Lee, Sungdong Kim, Ashkan Yousefpour, Minjoon Seo, Kang Min Yoo, Youngjae Yu
    ACL 2024

  • Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation
    Seongyun Lee, Seungone Kim, Sue Hyun Park, Geewook Kim, Minjoon Seo
    ACL 2024 Findings

  • REPLUG: Retrieval-Augmented Black-Box Language Models
    Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Rich James, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih
    NAACL 2024

  • Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision
    Seongyun Lee, Sue Hyun Park, Yongrae Jo, Minjoon Seo
    NAACL 2024

  • KTRL+F: Knowledge-Augmented In-Document Search
    Hanseok Oh, Haebin Shin, Miyoung Ko, Hyunji Lee, Minjoon Seo
    NAACL 2024

  • How Well Do Large Language Models Truly Ground?
    Hyunji Lee, Sejune Joo, Chaeeun Kim, Joel Jang, Doyoung Kim, Kyoung-Woon On, Minjoon Seo
    NAACL 2024

  • Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis
    Sohee Yang, Jonghyeon Kim, Joel Jang, Seonghyeon Ye, Hyunji Lee, Minjoon Seo
    TACL 2024

  • FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
    Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, Minjoon Seo
    ICLR 2024 (spotlight)

  • Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
    Seungone Kim, Jamin Shin, Yejin Cho, Joel Jang, Shayne Longpre, Hwaran Lee, Sangdoo Yun, Seongjin Shin, Sungdong Kim, James Thorne, Minjoon Seo
    ICLR 2024

  • SuRe: Improving Open-domain Question Answering of LLMs via Summarized Retrieval
    Jaehyung Kim, Jaehyun Nam, Sangwoo Mo, Jongjin Park, Sang-Woo Lee, Minjoon Seo, Jung-Woo Ha, Jinwoo Shin
    ICLR 2024

  • Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following
    Seonghyeon Ye, Hyeonbin Hwang, Sohee Yang, Hyeongu Yun, Yireun Kim, Minjoon Seo
    AAAI 2024


  • The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
    Seungone Kim, Se June Joo, Doyoung Kim, Joel Jang, Seonghyeon Ye, Jamin Shin, Minjoon Seo
    EMNLP 2023

  • Aligning Large Language Models through Synthetic Feedback
    Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, Kang Min Yoo, Minjoon Seo
    EMNLP 2023

  • An Integrated Search System for Korea Weather Data
    Jinkyung Jo, Dayeon Ki, Soyoung Yoon, Minjoon Seo
    EMNLP 2023 Industry Track

  • Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt
    Seonghyeon Ye, Joel Jang, Doyoung Kim, Yongrae Jo, Minjoon Seo
    EMNLP 2023 Findings

  • Gradient Ascent Post-training Enhances Language Model Generalization
    Dongkeun Yoon*, Joel Jang*, Sungdong Kim, Minjoon Seo
    ACL 2023 (short)

  • Towards standardizing Korean Grammatical Error Correction: Datasets and Annotation
    Soyoung Yoon, Sungjoon Park, Gyuwan Kim, Junhee Cho, Kihyo Park, Gyu Tae Kim, Minjoon Seo, Alice Oh
    ACL 2023

  • Knowledge Unlearning for Mitigating Privacy Risks in Language Models
    Joel Jang, Dongkeun Yoon, Sohee Yang, Sungmin Cha, Moontae Lee, Lajanugen Logeswaran, Minjoon Seo
    ACL 2023

  • Nonparametric Decoding for Generative Retrieval
    Hyunji Lee, Jaeyoung Kim, Hoyeon Chang, Hanseok Oh, Sohee Yang, Vlad Karpukhin, Yi Lu, Minjoon Seo
    ACL 2023 Findings

  • Fixed Input Parameterization for Efficient Prompting
    Eunbi Choi, Yongrae Jo, Joel Jang, Joonwon Jang, Minjoon Seo
    ACL 2023 Findings

  • Comparing and Contrasting Claims on Contentious Issues
    Miyoung Ko, Ingyu Seong, Hwaran Lee, Joonsuk Park, Minsuk Chang, Minjoon Seo
    ACL 2023 Findings

  • Two Examples are Better than One: Context Regularization for Gradient-based Prompt Tuning
    Hyeonmin Ha, Soyoung Jung, Jinsol Park, Minjoon Seo, Seung-won Hwang, Byung-Gon Chun
    ACL 2023 Findings

  • Exploring the Benefits of Training Expert Language Models over Instruction Tuning
    Joel Jang, Seungone Kim, Seonghyeon Ye, Doyoung Kim, Lajanugen Logeswaran, Moontae Lee, Kyungjae Lee, Minjoon Seo
    ICML 2023

  • Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
    Seonghyeon Ye, Doyoung Kim, Joel Jang, Joongbo Shin, Minjoon Seo
    ICLR 2023


  • Generative Multi-hop Retrieval
    Hyunji Lee, Sohee Yang, Hanseok Oh, Minjoon Seo
    EMNLP 2022

  • TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models
    Joel Jang, Seonghyeon Ye, Changho Lee, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, Minjoon Seo
    EMNLP 2022

  • Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts
    Joel Jang, Seonghyeon Ye, Minjoon Seo
    NeurIPS 2022 Workshop on Transfer Learning for NLP (TL4NLP)

  • EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records
    Gyubok Lee, Hyeonji Hwang, Seongsu Bae, Yeonsu Kwon, Woncheol Shin, Seongjun Yang, Minjoon Seo, Jong-Yeup Kim, Edward Choi
    NeurIPS 2022 Datasets and Benchmarks

  • A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction
    Wonseok Hwang, Dongjun Lee, Kyoungyeon Cho, Hanuhl Lee, Minjoon Seo
    NeurIPS 2022 Datasets and Benchmarks

  • Is Retriever Merely an Approximator of Reader?
    Sohee Yang, Minjoon Seo
    ACL 2022 Workshop on Semiparametric Methods in NLP (Spa-NLP)

  • Towards Continual Knowledge Learning of Language Models
    Joel Jang, Seonghyeon Ye, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, Stanley Jungkyu Choi, Minjoon Seo
    ICLR 2022


  • ViSeRet: A simple yet effective approach to moment retrieval via fine-grained video segmentation
    Aiden Lee, Hanseok Oh, Minjoon Seo
    ICCV 2021 Workshop on Closing the Loop between Vision and Language (CLVL)
    1st place at ICCV 2021 VALUE Challenge Retrieval Track

  • Cost-effective End-to-end Information Extraction for Semi-structured Document Images
    Wonseok Hwang, Hyunji Lee, Jinyeong Yim, Geewook Kim, Minjoon Seo
    EMNLP 2021 (short)

  • Spatial Dependency Parsing for Semi-Structured Document Information Extraction
    Wonseok Hwang, Jinyeong Yim, Seunghyun Park, Sohee Yang, Minjoon Seo
    ACL 2021 Findings

  • Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering
    Sohee Yang, Minjoon Seo
    NAACL 2021 (short)
    1st place at NeurIPS 2020 EfficientQA Challenge 500MB Track Human Eval
    [paper] [code]


  • Syntactic Question Abstraction and Retrieval for Data-Scarce Semantic Parsing
    Wonseok Hwang, Jinyeong Yim, Seunghyun Park, Minjoon Seo
    AKBC 2020

  • Contextualized Sparse Representations for Real-Time Open-Domain Question Answering
    Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang
    ACL 2020 (short)
    [paper] [code]


  • A Comprehensive Exploration on WikiSQL with Table-Aware Word Contextualization
    Wonseok Hwang, Jinyeong Yim, Seunghyun Park, Minjoon Seo
    NeurIPS 2019 KR2ML Workshop
    [paper] [code]

  • Mixture Content Selection for Diverse Sequence Generation
    Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
    EMNLP 2019
    [paper] [code] [tweet] [blog]

  • Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index
    Minjoon Seo*, Jinhyuk Lee*, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi
    ACL 2019
    [paper] [code] [demo] [slides.pdf] [slides.pptx] [tweet]


  • Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension
    Minjoon Seo, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi
    EMNLP 2018 (short)
    [paper] [code] [slides] [website]

  • Neural Speed Reading via Skim-RNN
    Minjoon Seo*, Sewon Min*, Ali Farhadi, Hannaneh Hajishirzi
    ICLR 2018
    [paper] [poster] [slides]


  • Zero-Shot Relation Extraction via Reading Comprehension
    Omer Levy, Minjoon Seo, Eunsol Choi, Luke Zettlemoyer
    CoNLL 2017
    [paper] [website]

  • Question Answering through Transfer Learning from Large Fine-Grained Supervision Data
    Sewon Min, Minjoon Seo, Hannaneh Hajishirzi
    ACL 2017 (short)

  • Are You Smarter than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension
    Aniruddha Kembhavi, Minjoon Seo, Dustin Schwenk, Jonghyun Choi, Ali Farhadi, Hannaneh Hajishirzi
    CVPR 2017

  • Bidirectional Attention Flow for Machine Comprehension
    Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi
    ICLR 2017
    [paper] [code] [website] [demo]

  • Query-Reduction Networks for Question Answering
    Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi
    ICLR 2017
    [paper] [code]


  • A Diagram is Worth a Dozen Images
    Aniruddha Kembhavi, Mike Salvato*, Eric Kolve*, Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi
    ECCV 2016
    [paper] [dqa-net code]

  • Solving Geometry Problems: Combining Text and Diagram Interpretation
    Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi, Oren Etzioni, Clint Malcolm
    EMNLP 2015
    [paper] [code] [slides] [website] [demo]

  • BiliCam: Using Mobile Phones to Monitor Newborn Jaundice
    Lilian de Greef, Mayank Goel, Minjoon Seo, Eric Larson, James Stout, James Taylor, Shwetak Patel
    UbiComp 2014 (best paper nomination)
    [paper] [website]

  • Diagram Understanding in Geometry Questions
    Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi, Oren Etzioni
    AAAI 2014
    [paper] [code] [slides] [website]

* denotes equal contribution.