
  • Number of slides: 29

Postgraduate Diploma in Translation
Introduction to Machine Translation IV: The Translator’s Workstation
March 2005, Intro to MT IV

Recap: MT Methods
  • MT
      • Direct MT
      • Rule-Based MT
          • Transfer
          • Interlingua
      • Data-Driven MT
          • EBMT
          • SMT

Different Styles of MT
  • FAMT: fully automatic machine translation
      • FAHQMT
      • FALQMT
  • MAHT: machine aided human translation
  • HAMT: human aided machine translation

The Proper Place of Men and Machines in Language Translation
  • Martin Kay, 1980 [1997]
  • Machine translation is an excellent research vehicle but stands no chance of filling actual needs for translators.
  • The answer is to develop cooperative man-machine systems.
  • Start with word processing and add translation-specific enhancements to approach the goal of automatic translation.
  • Be modest: be humble.

The Translator’s Workstation: Origins & Development
  • Main idea of the Translator’s Workstation (TW) is attributed to Martin Kay, author of “The Proper Place of Men and Machines in Language Translation” (1980)
  • Basic ingredients include
      • Glossaries
      • Multilingual termbanks
      • Translation Memories (TM)
  • Built on a word processing environment
  • Progressive automation of dictionary lookup and access to TM

Standard Word Processing Environment includes
  • Spell Check
  • Grammar Check
  • Thesaurus
  • Word Counting
  • Archiving and retrieval of documents

Translation-Oriented Editing
  • Basic idea: add a certain level of linguistic awareness to editing functions.
  • Translation-oriented word substitution
      • e.g. replace “purchase” with “buy”
      • system:
          • purchasing → buying
          • purchased → bought
      • e.g. replace “brouillard” with “brume”
      • system:
          • brouillard épais → brume épaisse
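The substitution idea on this slide can be sketched as follows. This is only a minimal illustration, assuming a small hand-written table of inflected forms; the names `FORMS` and `substitute` are hypothetical and do not come from any system described in the slides.

```python
import re

# Hypothetical inflection table: each form of the source word maps to the
# matching form of the replacement word. A real workstation would derive
# these from a morphological analyser, not a hand-written dictionary.
FORMS = {
    "purchase": "buy",
    "purchases": "buys",
    "purchasing": "buying",
    "purchased": "bought",
}


def substitute(text: str, forms: dict[str, str]) -> str:
    """Replace every inflected form listed in `forms`, keeping an initial capital."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, forms)) + r")\b", re.IGNORECASE
    )

    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = forms[word.lower()]
        # Keep an initial capital if the original word had one.
        return replacement.capitalize() if word[0].isupper() else replacement

    return pattern.sub(swap, text)


print(substitute("Purchasing was slow, so we purchased less.", FORMS))
```

A table like this only covers the word being replaced. The French example also changes the adjective (épais becomes épaisse), which needs gender agreement across words and is beyond this sketch.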

Integration with Desktop Publishing
Translation of Captions

Mark Up Languages
  • Markup is anything added to the content of the document that describes the text.
  • Formatting instructions: typeface, fonts, paragraphs, bulleted lists. HTML
  • More abstract levels of content description. XML

TMX
  • TMX (Translation Memory eXchange) is the vendor-neutral open XML standard for the exchange of Translation Memory
  • The purpose of TMX is to allow easier exchange of translation memory data between tools and/or translation vendors
  • http://www.lisa.org/tmx/specification.html
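To make the format concrete, here is a small sketch that reads translation units from a TMX document with Python's standard `xml.etree.ElementTree`. The sample document is hand-written for illustration (its header values and sentences are assumptions), and `read_tmx` is a hypothetical helper name, not part of any TMX tool.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written TMX document; header values are illustrative only.
TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Remove the paper tray.</seg></tuv>
      <tuv xml:lang="fr"><seg>Retirez le bac à papier.</seg></tuv>
    </tu>
  </body>
</tmx>
"""

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx(xml_text: str) -> list[dict[str, str]]:
    """Return one {language: segment} dictionary per translation unit (<tu>)."""
    root = ET.fromstring(xml_text)
    units = []
    for tu in root.iter("tu"):
        units.append({tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.findall("tuv")})
    return units


for unit in read_tmx(TMX_SAMPLE):
    print(unit)
```

Each `<tu>` element holds one translation unit, with one `<tuv>` per language and the text itself inside `<seg>`.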

Access to Lexical Resources
  • Online Dictionaries
      • On-screen version of traditional printed dictionary
      • Exploitation of hypertext links
      • Editing facilities; cf. the French Assistant system from Lernout & Hauspie
  • Term banks
      • Gazetteers
      • Encyclopaedic knowledge
      • World Wide Web

Commercially Available Systems
  • Typically designed for non-linguists ...
  • ... as an extension of a familiar word processing environment

A Typical MAHT
  • Separate windows for source and target text
  • Source text initially shown in target window, to be overwritten by translation
  • User highlights a portion of text to be machine translated. Draft translation is then pasted in, ready for post-editing.
  • User decides what will be translated by machine, and can develop a modus operandi.

Interactive Translation
  • Most systems allow the user a choice of interactive translation, in which the system stops and asks the translator to make choices. Can be annoying. Machine may keep asking the same question. Difficult to resolve this problem in the general case.

Translation Memory
  • First proposed in 1970s, but not generally available until 1990s.
  • Database of previous translations
  • Sentence by sentence translation
  • If exact match for new sentence is found, it is pasted in.
  • If not, TM may highlight those parts of the new sentence which differ from the stored one.
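The lookup behaviour described here (paste in an exact match, otherwise show what differs from the closest stored sentence) can be sketched with Python's standard `difflib`. This is an illustration under stated assumptions, not how any particular TM product works: the `MEMORY` contents, the `lookup` name and the 0.6 cutoff are all hypothetical.

```python
import difflib

# Hypothetical in-memory translation memory: source sentence -> stored translation.
MEMORY = {
    "Remove the paper tray.": "Retirez le bac à papier.",
}


def lookup(sentence, memory, cutoff=0.6):
    """Return (translation, differing_words).

    An exact match gives an empty list of differences. Otherwise the closest
    stored sentence above `cutoff` is used, and the words of the new sentence
    that are not in the stored one are listed for highlighting.
    """
    if sentence in memory:
        return memory[sentence], []
    close = difflib.get_close_matches(sentence, list(memory), n=1, cutoff=cutoff)
    if not close:
        return None, []
    stored = close[0]
    old_words, new_words = stored.split(), sentence.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words)
    # Only report material present in the new sentence (replacements and insertions).
    changed = [
        " ".join(new_words[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag in ("replace", "insert")
    ]
    return memory[stored], changed


print(lookup("Remove the paper tray.", MEMORY))
print(lookup("Remove the empty paper tray.", MEMORY))
```

In the second call the stored translation is still offered as a draft, and the word "empty" is flagged as the part the translator needs to deal with.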

Translation Memory: Highlighting Differences

Translation Memory
  • Keys to success are
      • Efficient storage of sentences
      • Efficient matching scheme
  • Most current commercial systems are based on character string similarity

Similarity between sentences
  1. When the paper tray is empty, remove it and refill it with paper of the appropriate size.
  2. When the tray is empty, remove it and fill it with the appropriate paper.
  3. When the bulb remains unlit, remove it and replace with a new bulb.
  4. You have to remove the paper tray in order to refill it when it is empty.
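As a rough illustration of character string similarity, the sketch below scores sentences 2 to 4 against sentence 1. It uses `difflib.SequenceMatcher.ratio()` from the Python standard library as a stand-in measure; this is an assumption for the example, since the slides do not say which measure commercial systems use. The names `similarity` and `rank` are hypothetical.

```python
import difflib

SENTENCES = [
    "When the paper tray is empty, remove it and refill it with paper of the appropriate size",
    "When the tray is empty, remove it and fill it with the appropriate paper.",
    "When the bulb remains unlit, remove it and replace with a new bulb",
    "You have to remove the paper tray in order to refill it when it is empty.",
]


def similarity(a: str, b: str) -> float:
    """Character-based similarity between 0.0 and 1.0 (case-insensitive)."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank(query: str, candidates: list[str]) -> list[tuple[float, str]]:
    """Return (score, sentence) pairs, most similar first."""
    return sorted(((similarity(query, c), c) for c in candidates), reverse=True)


for score, sentence in rank(SENTENCES[0], SENTENCES[1:]):
    print(f"{score:.2f}  {sentence}")
```

A purely character-based score rewards shared wording rather than shared meaning. Sentence 4 says much the same as sentence 1 but in a different order, so it scores well below sentence 2.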

Other Corpus Based Resources
  • Concordance: a list of occurrences of a word (called the keyword, e.g. here ‘sin’) taken from a corpus, with the keyword displayed in the centre of the page and shown in the contexts in which it occurs
      • Monolingual
      • Bilingual
  • Other corpus tools
      • Word sense profilers: WASPS

Monolingual Concordance Example
 1  hed it off. * * * ‘What a curious feeling!’ said Alice; ‘I must b
 1  against herself, for this curious child was very fond of pretendi
 2             ‘Curiouser and curiouser!’ cried Alice (she was so muc
 2  Eaglet, and several other curious creatures. Alice led the way,
 4   -- and yet – it’s rather curious, you know, this sort of life!
 6   eir heads. She felt very curious to know what it was all about,
 6   out a cat! It’s the most curious thing I ever saw in my life!’ S
 7   ht into it. ‘That’s very curious!’ she thought. ‘But everything’
 7  hought. ‘But everything’s curious today. I think I may as well g
 8  Alice thought this a very curious thing, and she went nearer to w
 8  she had never seen such a curious croquet-ground in her life; it
 8   seen, when she noticed a curious appearance in the air: it puzz
 9  next, and so on.’ ‘What a curious plan!’ exclaimed Alice. ‘That’s
10   : ‘and I do so like that curious song about the whiting!’ ‘Oh,
10  th, and said ‘That’s very curious.’ ‘It’s all about as curious a
10   ous.’ ‘It’s all about as curious as it can be,’ said the Gryphon
11   moment Alice felt a very curious sensation, which puzzled her a
11  er the list, feeling very curious to see what the next witness wo
12  ad!’ ‘Oh, I’ve had such a curious dream!’ said Alice, and she tol
12   her, and said, ‘It was a curious dream, dear, certainly: but no
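A keyword-in-context listing of this kind is straightforward to produce. The sketch below is a minimal version, assuming plain substring matching (so ‘Curiouser’ also counts as a hit for ‘curious’) and a fixed context width; the function name `kwic` and the short sample text are made up for the example.

```python
import re


def kwic(text: str, keyword: str, width: int = 25) -> list[str]:
    """Return one line per hit, with `width` characters of context on each side.

    The left context is right-aligned so that the keyword starts in the
    same column on every line.
    """
    text = " ".join(text.split())  # collapse line breaks and repeated spaces
    lines = []
    for m in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append(f"{left:>{width}}{m.group(0)}{right}")
    return lines


# Short illustrative sample, not the full corpus behind the slide.
SAMPLE = "What a curious feeling! said Alice. Curiouser and curiouser! cried Alice."

for line in kwic(SAMPLE, "curious"):
    print(line)
```

Matching on whole words only, or on lemmas, would be an alternative design; substring matching is used here because it is the simplest to show.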

Bilingual Concordance

Bilingual Concordance
  1. Original text: Ainsi, quand il aperçut POUR la première fois mon avion [...]
     Translation: The first time he saw my aeroplane, for instance [...]
  2. Original text: Alors elle avait forcé sa toux POUR lui infliger quand même des remords.
     Translation: Then she forced her cough a little more SO THAT he should suffer from remorse just the same.
  3. Original text: -Approche-toi que je te voie mieux, lui dit le roi qui était tout fier d’être enfin roi POUR quelqu’un.
     Translation: “Approach, so that I may see you better,” said the king, who felt consumingly proud of being at last a king OVER somebody.
  4. Original text: Car, POUR les vaniteux, les autres hommes sont des admirateurs.
     Translation: For, TO conceited men, all other men are admirers.
  5. Original text: C’est comme POUR la fleur.
     Translation: It is just as it is WITH the flower.
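The source side of a bilingual concordance can be sketched as a search over sentence pairs that are already aligned. In the example below, the pairs come from the slide except for the last one, which is a hypothetical filler added to show a non-matching pair; the name `bilingual_concordance` is also an assumption.

```python
import re

# Aligned (original, translation) sentence pairs.
PAIRS = [
    ("Car, pour les vaniteux, les autres hommes sont des admirateurs.",
     "For, to conceited men, all other men are admirers."),
    ("C'est comme pour la fleur.",
     "It is just as it is with the flower."),
    # Hypothetical filler pair without the keyword.
    ("Bonjour.", "Good morning."),
]


def bilingual_concordance(pairs, keyword):
    """Return the pairs whose original contains `keyword` as a whole word.

    The keyword is upper-cased in the original; the translation is returned
    unchanged.
    """
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    hits = []
    for source, target in pairs:
        if pattern.search(source):
            marked = pattern.sub(lambda m: m.group(0).upper(), source)
            hits.append((marked, target))
    return hits


for source, target in bilingual_concordance(PAIRS, "pour"):
    print(source)
    print("    " + target)
```

Marking the equivalent on the translation side (TO, WITH, OVER in the slide) would need word alignment between the two languages, which this sketch does not attempt.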

WASPS
  • A Semi-Automatic Lexicographer's Workbench for Writing Word Sense ProfileS
  • Adam Kilgarriff, David Tugwell et al., ESRC 1999-2002
  • Remit was to explore the synergy between the lexicographer's task of identifying and describing word senses, and the computational task of word sense disambiguation (WSD).

Summary
  • Translator’s workstation represents the most cost effective facility for the professional translator working in a large organisation.
  • Range of integrated services that are relevant to translation.
  • Translator remains in control