Exploring language similarity metrics to improve machine translation for low-resource languages.
A straightforward machine translation pipeline designed for low-resource languages.
Exploring three promising approaches to tokenizing low-resource languages: adding new tokens to LLMs, cipher-based methods, and logits warping.
Exploring alternative approaches to machine translation for low-resource languages, focusing on in-context learning and LLM-based predictions rather than traditional transfer learning methods.