How and Where to Translate? The Impact of Translation Strategies in Cross-lingual LLM Prompting
Aman Gupta, Yingying Zhuang, Zhou Yu, Ziji Zhang, Anurag Beniwal
View Original →Abstract
Despite advances in the multilingual capabilities of Large Language Models (LLMs), their performance varies substantially across different languages and tasks. In multilingual retrieval-augmented generation (RAG)-based systems, knowledge bases (KB) are often shared from high-resource languages (such as English) to low-resource ones, resulting in retrieved information from the KB being in a different language than the rest of the context. In such scenarios, two common practices are pre-translation to create a mono-lingual prompt and cross-lingual prompting for direct inference. However, the impact of these choices remains unclear. In this paper, we systematically evaluate the impact of different prompt translation strategies for classification tasks with RAG-enhanced LLMs in multilingual systems. Experimental results show that an optimized prompting strategy can significantly improve knowledge sharing across languages, therefore improve the performance on the downstream classification task. The findings advocate for a broader utilization of multilingual resource sharing and cross-lingual prompt optimization for non-English languages, especially the low-resource ones.
Related Articles
Impacts of National Cultures on Managerial Decisions of Engaging in Core Earnings Management
This study investigates the impact of Hofstede's cultural dimensions on abnormal core earnings management in multiple national cultural contexts. We employ an Ordinary Least Squares (OLS) regression...
EVOR: Evolving Retrieval for Code Generation
Recently the retrieval-augmented generation (RAG) has been successfully applied in code generation. However, existing pipelines for retrieval-augmented code generation (RACG) employ static knowledge...
AR-RAG: Autoregressive Retrieval Augmentation for Image Generation
We introduce Autoregressive Retrieval Augmentation (AR-RAG), a novel paradigm that enhances image generation by autoregressively incorporating knearest neighbor retrievals at the patch level. Unlike...