Machine Translation Introduction

Which one is human translation?

Moses is an implementation of the statistical (or data-driven) approach to machine translation (MT). This is the dominant approach in the field at the moment, and is employed by the online translation systems deployed by the likes of Google and Microsoft.

Mojžíš je implementace statistické (nebo řízené daty) přístupu k strojového překladu (MT). To je převládajícím přístupem v oblasti v současné době, a je zaměstnán pro on-line překladatelských systémů nasazených likes Google a Microsoft.

Moses je implementace statistického (nebo daty řízeného) přístupu k strojovému překladu (MT). V současné době jde o převažující přístup v rámci strojového překladu, který je použit online překladovými systémy nasazenými Googlem a Microsoftem.

Mojžíš je provádění statistické (nebo aktivovaný) přístup na strojový překlad (mt). To je dominantní přístup v oblasti v tuto chvíli, a zaměstnává on - line překlad systémů uskutečněné takové, Google a Microsoft.



Translation I

Translation is the transfer of a text from a source language into a target language. Interpreting is the oral translation of spoken language.

Translation II

Context is crucial for translation. —Maimonides (12th century)

Each word is an element drawn from a complex language system, and its relations to other segments of that system differ from language to language. Each meaning (sense) is an element of a complex system of segments into which a speaker divides reality. —Werner Winter (1923–2010)

Which properties of a source should be preserved?

[Image: J. Levý]

Translation (Levý)



What should a good translator know (Levý):

Levý on machine translation and artistic translation

“Machine Translation’s goal is to fragment a sentence into the simplest comparable elements; artistic translation’s goal is the opposite: transferring the highest units.”

Types of translations according to Roman Jakobson


Example of hard words


“Dammit I’m Mad”

Dammit I’m mad. Evil is a deed as I live. God, am I reviled? I rise, my bed on a sun, I melt. To be not one man emanating is sad. I piss. Alas, it is so late. Who stops to help? Man, it is hot. I’m in it. I tell. I am not a devil. I level “Mad Dog”. Ah, say burning is, as a deified gulp, In my halo of a mired rum tin. I erase many men. Oh, to be man, a sin. Is evil in a clam? In a trap? No. It is open. On it I was stuck. Rats peed on hope. Elsewhere dips a web. Be still if I fill its ebb. Ew, a spider… eh? We sleep. Oh no! Deep, stark cuts saw it in one position. Part animal, can I live? Sin is a name. Both, one… my names are in it. Murder? I’m a fool. A hymn I plug, deified as a sign in ruby ash, A Goddam level I lived at. On mail let it in. I’m it. Oh, sit in ample hot spots. Oh wet! A loss it is alas (sip). I’d assign it a name. Name not one bottle minus an ode by me: “Sir, I deliver. I’m a dog” Evil is a deed as I live. Dammit I’m mad.

Linguistic relativity

The limits of my language mean the limits of my world. —Ludwig Wittgenstein

If Aristotle had spoken Chinese or Dakota he would have arrived at a totally different logic. —Fritz Mauthner

Wikipedia on Linguistic relativity


Where does linguistic relativity belong?

Sapir-Whorf hypothesis

Machine Translation—definition

A discipline of computational linguistics dealing with the design, implementation and application of automatic systems (software) for translating texts with minimal human intervention.

For example, translating with the help of an electronic dictionary does not belong to the field of machine translation.

Machine translation

We consider only technical / specialized texts:

Nuances at different linguistic levels in literary texts are beyond the scope of current MT systems.

Machine translation: issues

In practice, the output of MT is always revised. We distinguish pre-editing (adjusting the source text before translation) and post-editing (correcting the MT output).
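Post-editing effort is often approximated by counting how many word-level edits separate the raw MT output from its revised version. The sketch below is a minimal illustration of that idea using the standard Levenshtein distance over words; the function names and the example sentences are invented for this illustration and do not come from any particular MT toolkit.

```python
def word_edit_distance(mt_output: str, post_edited: str) -> int:
    """Minimum number of word insertions, deletions and substitutions
    needed to turn mt_output into post_edited (Levenshtein distance)."""
    source = mt_output.split()
    target = post_edited.split()
    # previous[j] = distance between the processed source prefix and target[:j]
    previous = list(range(len(target) + 1))
    for i, src_word in enumerate(source, start=1):
        current = [i]
        for j, tgt_word in enumerate(target, start=1):
            cost = 0 if src_word == tgt_word else 1
            current.append(min(
                previous[j] + 1,         # delete src_word
                current[j - 1] + 1,      # insert tgt_word
                previous[j - 1] + cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


def edit_rate(mt_output: str, post_edited: str) -> float:
    """Edits per word of the post-edited text (0.0 means no changes)."""
    reference_length = len(post_edited.split())
    if reference_length == 0:
        return 0.0 if not mt_output.split() else float("inf")
    return word_edit_distance(mt_output, post_edited) / reference_length


# Hypothetical raw output and its post-edited version (invented example).
raw = "This is dominant approach in field"
revised = "This is the dominant approach in the field"
print(word_edit_distance(raw, revised), edit_rate(raw, revised))
```

A lower rate suggests less revision work, although a count of edits says nothing about how severe each error was.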

MT systems make different types of errors.

These mistakes are characteristic of human translators:

Errors in meaning are characteristic of computers:

Taxonomy of MT errors

Costa, Ângela, et al. “A linguistically motivated taxonomy for Machine Translation error analysis.” Machine Translation 29.2 (2015): 127-161.

Lexical choice

The choice of an appropriate translation equivalent:

Word order I

Word order

Free word order

Word order rule:

The more morphologically rich a language is, the freer its word order.

Katka snědla kousek koláče.

Free word order in Czech

How do their meanings differ?
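The possible word orders of the example sentence "Katka snědla kousek koláče." ("Katka ate a piece of cake.") can be listed mechanically. The following sketch assumes, for illustration only, that "kousek koláče" stays together as one constituent; it only enumerates the orders and makes no claim about how natural each one is or how its emphasis differs.

```python
from itertools import permutations

# Constituents of "Katka snědla kousek koláče." ("Katka ate a piece of cake.").
# Keeping "kousek koláče" together as one unit is an assumption of this sketch.
CONSTITUENTS = ("Katka", "snědla", "kousek koláče")


def word_orders(constituents):
    """Return every ordering of the constituents as a capitalised sentence."""
    sentences = []
    for order in permutations(constituents):
        text = " ".join(order)
        sentences.append(text[0].upper() + text[1:] + ".")
    return sentences


for sentence in word_orders(CONSTITUENTS):
    print(sentence)
```

With three constituents the sketch yields six orderings. An English gloss of the same three constituents allows far fewer orders, which is what makes such sentences a challenge for word-order handling in MT.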

Direct methods for improving MT quality

Basic terms

Classification based on approach

Vauquois’s triangle

Interaction with a user

Direction and arity



Systems of Machine Translation I

Systems of Machine Translation II

Conferences, workshops, institutions



History of MT

Motivations for MT

In 1947 RAM could store 100 numbers and $a + b$ took 1/8 s!

Early MT beliefs

Warren Weaver: When I look at an article in Russian, I say: This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.

First impulses

In 1950 Weaver sent a memorandum to 200 addressees in which he outlined some problems of MT.

Early interest in MT arose at several institutions: first at the University of London (Andrew D. Booth), and soon after at MIT, the University of Washington, the University of California, Harvard, …

Topics and first exchanges of experience

Turing test: using language as humans do is a sufficient operational test for intelligence.

Georgetown experiment

The first working prototype of MT.

Progress in the 1950s

The 1960s: disappointment over poor results

Progress in the 1960s

ALPAC report, 1966









Overly optimistic prognoses

So-called “hype”. The situation is similar today with artificial intelligence (Watson, Go, robotics).

Evolution of MT

Machine Translation nowadays I

Machine Translation nowadays II

Machine Translation nowadays III

Motivation in 21st century