About Me
Nice to meet you! I am a first-year MSCS (Master of Science in Computer Science) student at the Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign. My research interests include Natural Language Processing, Large Language Models, and Text Mining.
I am very fortunate to be advised by Prof. Jiawei Han from the Data Mining Group at UIUC. Previously, I was advised by Prof. Zhiyuan Liu from the Natural Language Processing Lab at Tsinghua University.
You can find my CV here: Runchu’s Curriculum Vitae.
Education
- M.S. in Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign, 2026 (expected)
- B.S. in Weiyang College, Tsinghua University, 2024
Research Experience
DebugBench: Evaluating Debugging Capability of Large Language Models
- Timeline: Sept 2023 - Feb 2024
- Institute: Natural Language Processing Lab, Tsinghua University
- Adviser: Zhiyuan Liu
- Constructed a comprehensive debugging benchmark with source data from LeetCode and bug implantation with GPT-4
- Evaluated the debugging capabilities of different models across bug types and scenarios
- Analyzed the relationship between code generation and debugging capabilities
Enhancing Gene Embedding with Knowledge from LLMs
- Timeline: June 2023 - Sept 2023
- Institute: Paul G. Allen School of Computer Science & Engineering, University of Washington
- Adviser: Sheng Wang
- Enhanced gene graph embeddings via feature fusion with text embedding vectors from LLMs and retrieved literature
- Designed a dynamic literature retrieval framework with LLMs
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
- Timeline: Mar 2023 - June 2023
- Institute: Natural Language Processing Lab, Tsinghua University
- Adviser: Zhiyuan Liu
- Proposed a prototype to improve LLM tool-calling efficiency based on Tree of Thoughts
Exploring Format Consistency for Instruction Tuning
- Timeline: Jan 2023 - June 2023
- Institute: Natural Language Processing Lab, Tsinghua University
- Adviser: Zhiyuan Liu
- Evaluated the effect of instruction format consistency across three major multi-task datasets
- Designed an LLM-based pipeline for converting between different instruction styles across diverse datasets
- Proposed a denoising technique that ensures the quality of instruction conversion via multi-sampling and perplexity probing