In Natural Language Processing (NLP), understanding the contextual subtleties that modulate a word's meaning is a significant challenge.
Exploring how to use grammar constraints to improve LLM output for Bible translation.
Exploring language similarity metrics to improve machine translation for low-resource languages.
Using statistical methods to detect linguistic anomalies in translations.
A straightforward machine translation pipeline designed for low-resource languages.
Exploring how semantic similarity between source verses can be used to suggest and disambiguate translations from multiple possible options.
Exploring three promising approaches to tokenizing low-resource languages: adding new tokens to LLMs, cipher-based methods, and logits warping.