CV
Taehun Cha is a Ph.D. candidate in the Department of Mathematics at Korea University. His main research area is Natural Language Processing, with a focus on mathematically analyzing the current success of pre-trained language models (PLMs) and large language models (LLMs). He is also interested in causal inference and sequential decision making.
Basics
Name | Taehun Cha
Email | cth127@korea.ac.kr
URL | https://cth127.github.io/
Education
- 2022.03 - Present: Ph.D. candidate, Department of Mathematics, Korea University
- 2020.03 - 2022.02
- 2012.03 - 2019.08
Work
- 2023.07 - 2023.08
PhD Student Research Intern
Korea Telecom
Researched the hallucination problem in large language models (LLMs). Built an automated pipeline to construct a hallucination dataset using ChatGPT and a reward model to train LLMs with RL. Selected as an outstanding intern.
Publications
- 2025.02.25
ABC3: Active Bayesian Causal Inference with Cohn Criteria in Randomized Experiments
In The 39th Annual AAAI Conference on Artificial Intelligence (AAAI 2025)
We introduce ABC3, a highly efficient active learning method for randomized experiments with several theoretical properties.
- 2024.11.16
Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts
In Findings of the Association for Computational Linguistics: EMNLP 2024 (EMNLP 2024 Findings)
We show that pre-trained language models return distinguishable generation probability and uncertainty distributions for unfaithfully hallucinated texts, regardless of their size and structure.
- 2024.08.15
Evaluating Extrapolation Ability of Large Language Model in Chemical Domain
In Language + Molecules Workshop at ACL 2024 (Lang+Moles@ACL 2024)
We show that an LLM can extrapolate to new, unseen materials by utilizing the chemical knowledge it learned through massive pre-training.
- 2024.03.22
SentenceLDA: Discriminative and Robust Document Representation with Sentence Level Topic Model
In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2024)
We propose SentenceLDA, a sentence-level topic model that combines modern SentenceBERT with classical LDA to extend the semantic unit from words to sentences.
- 2022.10.17
Noun-MWP: Math Word Problems Meet Noun Answers
In Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022)
We introduce a new type of problem for math word problem (MWP) solvers, named Noun-MWPs, whose answers are non-numerical strings containing a noun from the problem text.
Awards
- 2024.12.15
Concordia Contest @ NeurIPS 2024 - First Place out of 197 participants
Cooperative AI Foundation and Google DeepMind
Designed and developed an LLM-based AI agent that maximizes expected reward while maintaining a cooperative stance.
- 2023.12.08
AI Grand Challenge - 7th Place
Ministry of Science and ICT
Designed and led the development of an open-domain, multi-hop, multi-modal, document-based report-generating system. (team leader)
- 2023.09.01
AI Grand Challenge Open Track - 2nd Place
Ministry of Science and ICT
Developed an open-domain, multi-hop, multi-modal document QA system and achieved 2nd place out of 12 teams.
- 2022.09.01
Korean AI Competition - 4th Place
Ministry of Science and ICT
Developed an automatic speech recognition (ASR) model for Korean and achieved 4th place out of 103 teams. (team leader)
- 2021.09.01
AI Grand Challenge
Ministry of Science and ICT
Developed an NLP model based on KLUE-RoBERTa to solve elementary-school math word problems and was selected for follow-up research. (team leader)
- 2020.09.31
HAAFOR NLP Challenge 2020 - 3rd Place
HAAFOR
Achieved 3rd place (70.27% accuracy) on a text-order prediction task with an ALBERT model.
Certificates
Financial Risk Manager | Global Association of Risk Professionals | 2022-04-30