I’m a linguistics graduate interested in how computational methods can deepen our analysis of natural language. My experience ranges from managing Austronesian lexical datasets under Dr. Alexander D. Smith to leading prompt engineering in an LLM annotation project. I have also written for the National Library of Singapore's BiblioAsia. I am keenly interested in teaching models to distinguish pragmatic intent in interactional discourse, especially in low-resource languages.
In collaboration with PhD students Jing Liu and Siqi Xie, I developed a prompt engineering framework to improve Gemini 2.5 Flash-Lite's annotation of English child-adult dialogues taken from the CHILDES corpus.
This work represents the first stage of a larger pipeline. Later stages, including supervised fine-tuning (SFT) and additional model evaluation, will be carried out by the research team.
Project repository:
https://github.com/wuthree00/LLM-ChildTalk-prompt-engineering
My primary research interests are in applying natural language processing to the documentation of low-resource languages. I am curious about how techniques like data augmentation and optimised segmentation can address the underlying data scarcity in Austronesian languages. I am also interested in improving how computational models interpret pragmatics in conversation, especially cross-culturally.
Prompt Engineering: Full-cycle development, including prompt design, iterative refinement, and rubric-based evaluation of model output
Data: Managing sparse lexical data (Austronesian languages, e.g. Lebo' Vo' Kenyah) · IPA transcription · Corpus annotation (FLEx, SayMore)
Programming: Python (currently completing the University of Helsinki MOOC)
Research & Writing: Archival research · Post-editing · Editorial writing for BiblioAsia