Ruth Wee

I’m a linguistics graduate working at the intersection of language and AI. My work focuses on evaluating AI-generated outputs, prompt engineering, and structuring language data for computational use. I currently work as an AI Knowledge Engineer, where I test and refine internally developed systems for legal applications by comparing model outputs against real-world documents, identifying errors, and suggesting improvements. I also have experience in Austronesian language documentation under Dr. Alexander D. Smith and in editorial writing for the National Library’s BiblioAsia. I’m particularly interested in how linguistic structure and analysis can be translated into working NLP systems, especially for modelling pragmatic meaning in real-world interaction.


Current Work

AI Knowledge Engineer

I evaluate system outputs in applied settings by comparing generated content against real-world use cases, identifying error patterns, and suggesting changes that improve performance and usability.


LLM Prompt Engineering & Annotation (Child-Adult Dialogues)

In collaboration with PhD students Jing Liu and Siqi Xie, I developed a prompt engineering framework for annotating child-adult dialogues from the CHILDES corpus.

This work represents the first stage of a larger pipeline. Later stages, including supervised fine-tuning (SFT) and additional model evaluation, will be carried out by the research team.

Project repository:
https://github.com/wuthree00/LLM-ChildTalk-prompt-engineering


Research Interests

My research interests centre on applying NLP to the documentation of low-resource languages. I am curious about how techniques like data augmentation and optimised segmentation can address the underlying data scarcity in Austronesian languages. I am also interested in improving how computational models interpret pragmatics in human conversation, especially with cross-cultural nuances in mind.


Technical Skills & Tools

Prompt Engineering: Designing, testing, and refining prompt frameworks, including structured evaluation of LLM outputs

Data: Managing sparse lexical data (Austronesian languages, e.g. Lebo' Vo' Kenyah) · IPA transcription · Corpus annotation (FLEx, SayMore)

Programming: Python (currently completing the University of Helsinki's MOOC)

Research & Writing: Archival research · Post-editing · Editorial writing for BiblioAsia


Writing & Editorial Work


Contact

GitHub · LinkedIn · Email: ruthwee00 [at] gmail [dot] com
