Speech and Language Technologies in the Next Generation Localisation CSET
Prof. Andy Way, School of Computing, DCU

Overview of Presentation
  • Speech & Language Technologies in the NGL CSET
  • Facilitating Optimal Multilingual NGL Applications
  • Key Research Challenges
  • Novel Research Tracks
  • Typical LSP’s Translation Process
  • Key Integration Challenges
  • Concluding Remarks

ILT - Integrated Language Technologies
Prof. Andy Way, ILT Area Coordinator
Diagram labels: Next Generation Localisation, Personalised Localisation, Digital Content Management, Enterprise Localisation, Unified Model, Systems Framework

ILT: Facilitating Optimal Multilingual NGL Applications
Diagram labels:
  • Text Input (e.g. bulk localisation)
  • Speech Input (e.g. personalisation)
  • Text Processing
  • Machine Translation
  • Speech Technologies
  • Text Output
  • Speech Output

Machine Translation: Significance
  • For our industrial partners, the volume of material needing translation is increasing, while budgets remain the same
  • In the EU, there are now 23 official languages (506 language pairs), and the number is expanding …
  • In the US, there is huge investment in translation between Arabic, Chinese and Urdu, and English …
  • Automation is the only option (especially for PL) …
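The figure of 506 language pairs is consistent with counting each translation direction separately among 23 languages. A quick check, assuming ordered (source, target) pairs are what the slide counts; the function name is only illustrative:

```python
def directed_language_pairs(n_languages: int) -> int:
    """Number of ordered (source, target) pairs among n distinct languages."""
    # Each language can be the source for every other language.
    return n_languages * (n_languages - 1)


print(directed_language_pairs(23))  # 23 * 22 = 506
```

Each additional official language adds two new directions per existing language, so the number of pairs grows roughly with the square of the number of languages.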

MT: Key Research Challenges
  • Enhanced Translation Quality
  • Faster Translation Times
  • Scalability
  • Other Modalities (Speech, SMS etc.)

The State-of-the-Art
  • Source:
  • Reference: The two sides highlighted the role of the World Trade Organization (WTO)
  • Baseline: The two sides on the role of the WTO

Improving the State-of-the-Art
  • Source:
  • Reference: The two sides highlighted the role of the World Trade Organization (WTO)
  • Baseline: The two sides on the role of the WTO
  • Our System: The two sides reaffirmed the role of the WTO
  • Our MT systems have knowledge of syntax:
      • Parts of speech (nouns, verbs etc.)
      • Roles in sentences (subject, object etc.)
  • Result: better translation quality

The State-of-the-Art
  • Source:
  • Reference: Mahmoud Abbas: The wall and settlements will not bring Israel security
  • Baseline: Mahmoud Abbas, the wall and settlements will provide security to Israel
  • Our System: Mahmoud Abbas, the wall and settlements will not provide security for Israel

Improving the State-of-the-Art
  • Source:
  • Reference: Mahmoud Abbas: The wall and settlements will not bring Israel security
  • Baseline: Mahmoud Abbas, the wall and settlements will provide security to Israel
  • Our System: Mahmoud Abbas, the wall and settlements will not provide security for Israel
  • Result: better translation quality (especially where end-users are concerned)
  • DCU Arabic-English system ranked first at international MT evaluation in Oct. 2007

MT Novel Research: Handling Different Types of Text
  • Translating patent applications, or doctors’ prescriptions, or visa applications: different tasks, as the content is different …
  • So is the form …
  • Build different MT systems for each different task, using our industrial partners’ documentation

Text Processing: Significance and Challenges
If texts are automatically annotated with:
  • syntactic information (e.g. subject, object), today’s MT systems can learn the syntax required for improved output quality and improved processing of multilingual queries (DCM)
  • text-type and genre information, this helps our MT systems disambiguate text and improve translation quality
  • localisation information (e.g. Andy Way), then the workflows of our industrial partners (currently done manually) can be significantly improved (cf. LOC)

Speech Technology: Significance
  • Speech interfaces for eyes-busy, hands-busy scenarios (access)
  • Speech recognition and synthesis systems which can deal with:
      • a potentially unlimited vocabulary (volume & scalability)
      • multiple (and non-native) speakers
      • multiple languages
  • and can be tightly integrated with MT (localisation & personalisation)

Speech Technology: Challenges
  • them ore its nows them ore it goes?
  • themo rei tsn ow sthe mo reitg o es?
  • themoreitsnows themoreitgoes
  • demoreisnows demoregoes
  • the more it snows the more it goes…
  • “rules” and vocabulary of system
  • performance of (native) speaker
  • linguistic competence of native speaker
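The ambiguity on this slide can be reproduced with a toy word segmenter. The sketch below is only an illustration, not the recognition engine described in the talk: it enumerates every way to split an unsegmented character stream into words from a small vocabulary. The vocabulary and the function name are hypothetical and were chosen to reproduce the readings shown on the slide.

```python
# Hypothetical toy vocabulary; a real system would have a far larger one.
VOCAB = frozenset({"the", "them", "more", "ore", "it", "its", "snows", "nows", "goes"})


def segmentations(stream: str, vocab: frozenset = VOCAB) -> list[list[str]]:
    """Return every way of splitting `stream` into words found in `vocab`."""
    if not stream:
        return [[]]  # exactly one way to segment the empty string
    results = []
    for end in range(1, len(stream) + 1):
        word = stream[:end]
        if word in vocab:
            # Keep this word and segment the remainder recursively.
            for rest in segmentations(stream[end:], vocab):
                results.append([word] + rest)
    return results


for words in segmentations("themoreitsnows"):
    print(" ".join(words))
```

With this vocabulary the stream "themoreitsnows" has four valid segmentations, including both "the more it snows" and "them ore its nows". The vocabulary alone cannot choose between them, which is where additional linguistic knowledge is needed.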

Speech Technology: Innovations
  • Robust & Novel Speech Recognition Engine which integrates explicit linguistic knowledge
  • themoreitsnows themoreitgoes
  • demoreisnows demoregoes
  • the more it snows the more it goes…
  • them ore its nows them ore it goes?
  • themo rei tsn ow sthe mo reitg o es?
  • performance of (native) speaker
  • linguistic competence of native speaker
  • “rules” and vocabulary of system

Innovations: Speech Recognition & MT
  • Robust & Novel Speech Recognition Engine which integrates explicit linguistic knowledge
  • Tight coupling with MT Engines
  • Jemehreschneit destomehres geht
  • themoreitsnows themoreitgoes
  • detverkarhavarite nstormhurmån
  • the more it snows the more it goes…
  • them ore its nows them ore it goes?
  • themo rei tsn ow sthe mo reitg o es?
  • linguistic competence of native speaker
  • “rules” and vocabulary of system

Innovations: MT & Speech Synthesis
  • Robust & Novel Speech Synthesis Engine which integrates explicit linguistic knowledge
  • Tight coupling with MT Engines
  • Jemehreschneit destomehres geht
  • themoreitsnows themoreitgoes
  • detverkarhavarite nstormhurmån

Typical LSP’s Translation Process
Requirement: minimal disruption of this process
  • Incoming documents (segmented)
  • Step 1: Translation Memory
  • Partially translated documents, with confidence rating for segments
  • Step 2: Post-editing & translation
      • TM match score > 70 %: cheap
      • 50 % < TM match score < 70 %: medium
      • TM match score < 50 %: expensive
  • Step 3: Documents Validation & Finalization
Resources shown: Translation Memory DB; In-house Translators & Machine Translation; Freelance Translators
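The routing in Step 2 can be sketched as a small function that assigns each incoming segment a cost class from its best TM match score. This is a minimal sketch under stated assumptions: the 50 % and 70 % thresholds are read from the slide, the treatment of scores exactly on a boundary is an assumption, and `difflib.SequenceMatcher` stands in for whatever fuzzy-match measure a real TM tool uses. All function names are hypothetical.

```python
from difflib import SequenceMatcher


def tm_match_score(segment: str, tm_source: str) -> float:
    """Fuzzy-match score in percent (0-100) between a segment and a TM entry.

    Assumption: character-level similarity from difflib is used here only as a
    stand-in for a commercial TM tool's own match score.
    """
    return 100.0 * SequenceMatcher(None, segment, tm_source).ratio()


def cost_class(score: float) -> str:
    """Map a TM match score to the three cost classes named on the slide."""
    if score > 70:
        return "cheap"
    if score >= 50:  # assumption: boundary values fall into the middle class
        return "medium"
    return "expensive"


def route(segment: str, memory: list[str]) -> tuple[float, str]:
    """Return the best TM match score for `segment` and its cost class."""
    best = max((tm_match_score(segment, src) for src in memory), default=0.0)
    return best, cost_class(best)


memory = ["Click the Save button.", "Open the File menu."]
print(route("Click the Save button.", memory))       # exact match
print(route("Completely unrelated sentence", memory))
```

A segment with no usable match falls through to the most expensive class, which is the case where MT output is most likely to be of use.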

Key Integration Challenges
  • Use MT to automatically upgrade some TM matches to a cheaper cost class, cf. Dynamic Translation Memory [Bicici and Dymetman, 2008]
  • Linking MT automatic evaluation metrics with post-editing cost
  • Ensuring that MT omissions are highlighted
  • Enforcing customer terminology
  • Deal with markup, tags …
  • Produce true-cased translations
  • Integrate into pre-existing workflows!
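One common way to approach the markup item above is to hide inline tags behind placeholders before a segment is sent to MT and to restore them afterwards. The sketch below illustrates that idea only; it is not the approach taken in the project. The placeholder format `__TAGn__` and the function names are hypothetical, and a real MT engine may reorder or corrupt placeholders, which this sketch does not handle.

```python
import re

# Matches simple inline tags such as <b>, </b> or <ph id="1"/>.
TAG_PATTERN = re.compile(r"<[^<>]+>")
PLACEHOLDER_PATTERN = re.compile(r"__TAG(\d+)__")


def mask_tags(segment: str) -> tuple[str, list[str]]:
    """Replace each tag with a numbered placeholder; return text and tag list."""
    tags: list[str] = []

    def _stash(match: re.Match) -> str:
        tags.append(match.group(0))
        return f"__TAG{len(tags) - 1}__"

    return TAG_PATTERN.sub(_stash, segment), tags


def unmask_tags(translated: str, tags: list[str]) -> str:
    """Put the original tags back in place of their placeholders."""
    return PLACEHOLDER_PATTERN.sub(lambda m: tags[int(m.group(1))], translated)


masked, tags = mask_tags("Click <b>Save</b> now")
print(masked)                     # text with placeholders, ready for MT
print(unmask_tags(masked, tags))  # original markup restored
```

In a workflow, the masked text would be translated and `unmask_tags` applied to the MT output, so the translator receives a segment with its markup intact.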

Concluding Remarks
  • For ILT, ramp-up is almost complete: more than 30 new researchers, in addition to pre-existing PIs, postdoctoral researchers and PhD students
  • Large interest from industrial partners, both large and small
  • Input from LOC, DCM and SF
  • Significant role in CNGL demonstrators
  • Research tools
  • Industrial prototypes
  • Well placed to succeed in going ‘beyond TMs’ …

Speech & Language Technologies in the NGL CSET
Thanks for listening! Questions?
http://www.cngl.ie
away@computing.dcu.ie