Multilingual Translation into Indian Languages based on Domain Specific Lexicons
Domain-specific neural machine translation (NMT) systems (e.g., in educational applications) are socially significant, with the potential to make information accessible to a diverse set of users in multilingual societies. It is desirable that such NMT systems be lexically constrained and draw from domain-specific dictionaries. Because words are polysemous, a dictionary may offer multiple candidate translations for a source word or phrase. The onus is then on the machine translation model and the ecosystem to choose the contextually most appropriate candidate. Prior work has largely ignored this problem and focused on the single-candidate setting, where the target word or phrase is replaced by a single constraint. The IP includes a lexically constrained, human-in-the-loop NMT system that can disambiguate between multiple candidate translations derived from dictionaries. We also have indigenous and accurate human-in-the-loop approaches to derive standardized and consistent technical terminology in multiple Indian languages.
Prof. G. Ramakrishnan, Computer Science and Engineering