Tokenisation in rule-based machine translation [PDF]
Tokenisation is a process in which text is converted into a form where each item is separated from the rest of the text. Words, for example, are such items, and they must be separated from punctuation marks and diacritics. The most convenient way to do ...
Hurskainen, Arvi
core
Minimization Strategies for Maximally Parallel Multiset Rewriting Systems [PDF]
Maximally parallel multiset rewriting systems (MPMRS) give a convenient way to express relations between unstructured objects. The functioning of various computational devices may be expressed in terms of MPMRS (e.g., register machines and many variants ...
Alhazov, Artiom, Verlan, Sergey
core +3 more sources
An efficient execution method for rule-based machine translation [PDF]
A rule based system is an effective way to implement a machine translation system because of its extensibility and maintainability. However, it is disadvantageous in processing efficiency. In a rule based machine translation system, the grammar consists of a lot of rewriting rules.
openaire +2 more sources
ARABIC-MALAY MACHINE TRANSLATION USING RULE-BASED APPROACH [PDF]
Arabic machine translation has been taking place in machine translation projects in recent years. This study concentrates on the translation of Arabic text to its equivalent in Malay language. The problem of this research is the syntactic and morphological differences between Arabic and Malay adjective sentences.
Mohd Juzaiddin Ab Aziz+1 more
openaire +2 more sources
A North Saami to South Saami Machine Translation Prototype
The paper describes a rule-based machine translation (MT) system from North to South Saami. The system is designed for a workflow where North Saami functions as pivot language in translation from Norwegian or Swedish. We envisage manual translation from
Lene Antonsen+2 more
doaj +1 more source
Automatic Methods and Neural Networks in Arabic Texts Diacritization: A Comprehensive Survey
Arabic diacritics are signs used in Arabic orthography to represent essential morphophonological and syntactic information. It is a common practice to leave out those diacritics in written Arabic. Most Arabic electronic texts lack such diacritics.
Manar M. Almanea
doaj +1 more source
Comparing rule-based and data-driven approaches to Spanish-to-Basque machine translation [PDF]
In this paper, we compare the rule-based and data-driven approaches in the context of Spanish-to-Basque Machine Translation. The rule-based system we consider has been developed specifically for Spanish-to-Basque machine translation, and is tuned to ...
Labaka, Gorka+3 more
core
Retrosynthetic reaction prediction using neural sequence-to-sequence models
We describe a fully data driven model that learns to perform a retrosynthetic reaction prediction task, which is treated as a sequence-to-sequence mapping problem.
Gomes, Joseph+9 more
core +2 more sources
Why Catalan-Spanish Neural Machine Translation? Analysis, comparison and combination with standard Rule and Phrase-based technologies [PDF]
Catalan and Spanish are two related languages, given that both derive from Latin. They share similarities at several linguistic levels, including morphology, syntax and semantics. This makes them particularly interesting for the MT task.
Ruiz Costa-Jussà, Marta
core +1 more source
Social context prevents heat hormetic effects against mutagens during fish development
This study shows that sublethal heat stress protects fish embryos against ultraviolet radiation, a concept known as ‘hormesis’. However, chemical stress transmission between fish embryos negates this protective effect. By providing evidence for the mechanistic molecular basis of heat stress hormesis and interindividual stress communication, this study ...
Lauric Feugere+5 more
wiley +1 more source