A method to improve translation precision using valid tokens from sample translation pairs, focusing on back-translation and token coverage.
Exploring three promising approaches to tokenizing low-resource languages: adding new tokens to LLMs, cipher-based methods, and logits warping.
Exploring alternative approaches to machine translation for low-resource languages, focusing on in-context learning and LLM-based predictions rather than traditional transfer learning methods.
If you are working with Bible translation data, you know how challenging it can be to deal with different types of data sources, formats, and structures.
This article argues for mapping semantic similarity between translations rather than relying on problematic token-to-token alignments.
Exploring the two fundamental user interfaces for AI interaction: Chatbots and Contextual Suggestions, and how they can be combined effectively.
Exploring the possibility of using VS Code as a base for an AI-native Bible translation app, inspired by successful forks like Cursor.
How AI alignment in Meta's Llama 2 prevents basic religious text queries from being answered accurately.
In Natural Language Processing (NLP), understanding the contextual subtleties that modulate a word's meaning is a significant challenge.
A simple example of how to declare a namespace in XQuery so you can quickly and easily run XPath and XQuery on namespace elements.