e557b5adb8e58223ef3ef22c2c6f3547.ppt
- Number of slides: 22
China Patent Information For Western Users Huabing Liu liuhuabing@cnipr.com Intellectual Property Publishing House, SIPO
Main Topics § Chinese patent information: new challenge to us § Language barriers faced by western users § About machine translation § Our efforts
Looking at Chinese Patents
§ Essential Patent Information Resource
  Type               Published Amount (approximate)
  Invention Patent   1,000
  Utility Model      950,000
  Design             700,000
  Granted Patent     340,000
  Total              3,000
• Third largest country of patent filings in 2005
§ Fast Increasing Rate
  Year            2004      2005      2006      2007
  Filing Amount   130,133   173,327   210,490   149,455 (Jan.-Aug.)
• Highest growth rate in the world
§ Improving Quality
• Service application is increasing sharply
• Hotspots: Automobile, Electronics, Natural Medicine
What happens in China? § 2007 • Shanghai Stock Index has increased more than twofold in 2007 • 17th National Congress of CPC § Attaches greater importance to IP Protection and Technology Innovation • GDP is predicted to rank third in the world. It grew by 11.5% in the first half of 2007. § 2008 • China Patent Law is revised for the 3rd time. • Beijing 2008 Olympic Games
Your Needs on Chinese Patent Information § Patent Filing and Litigation in China • Booming economy creates wider IP attention and leads to more IP lawsuits • Patent search is necessary when you prepare to apply for a patent or litigate in China § Patent Examination • More than 0.5 million patents and 0.9 million utility models (in Chinese language only) • Annual increase of more than 0.1 million patents and 0.15 million utility models (in Chinese language only) § Technical/Competitor Watch • Domestic patent filings are growing fast
§ Everything is getting better. We should all be prepared for the challenge from China!
However, There Are Problems with Patent Data... § Poor Quality of English Data § Shortage of Effective Search Tools § Lack of Machine Translation
Condition of English Data § What is in English: • Bibliographic Data (invention patent, utility model) • Abstract (invention patent only) • Legal Status (invention patent, utility model, design) § What is not in English • Abstract of Utility Model • Claims • Description § What is missing or wrong • Applicant/Inventor: mistranslated or missing • Title: missing • Abstract: missing or of poor quality • Priority item: inefficient search
Poor Translation Sample § Missed Information § Translation Mistake
Searching System Contrast
C-E Patent MT is not So Successful
Patent Information Asymmetry [Diagram: patent sources EP, US, JP, WO, CN, KR, others; Chinese User vs. Western User]
How to Cross Language Barriers —Commercial Vendors’ Efforts § Improve quality of English data § Develop powerful patent search tools with more effective search entries § Provide Chinese patent research service to western users § Machine translation • Might be a “mission impossible”, but it is up to us to make it possible.
MT: Which Approach is More Intelligent? § Classical approach • Rule-based MT • Example-based MT § Potential approach • Statistics-based MT § Mixed approach • Hybrid MT • MT+TM § Human value-added approach • HAMT • MAHT
A Prototype of RBMT for C-E Patent Translation [Diagram components: Patent Input; Pre-processing; Morphology Analysis; Part-of-speech Tagging; Phrase Analysis; Syntax Analysis; Grammar Analysis; Structure Selection; Conversion; Terminology Selection; Format conversion; Output. Resources: Rule & Knowledge Base; IPC-driven Dictionary (Dictionary 1, Dictionary 2, Dictionary 3, Dictionary 4, ...)]
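To illustrate what an "IPC-driven dictionary" could mean in the terminology selection step, here is a minimal sketch. It assumes that the translation of a term is chosen by the IPC subclass of the patent, with a general dictionary as fallback. The function name `select_term`, the dictionary layout, and the sample entries are hypothetical and are not taken from the prototype on this slide.

```python
# Hypothetical toy data: translations of one Chinese term that differ by field.
GENERAL_DICT = {"载体": "carrier"}
IPC_DICTS = {
    "C12N": {"载体": "vector"},   # genetic engineering
    "B01J": {"载体": "support"},  # catalysis
}


def select_term(term: str, ipc_code: str) -> str:
    """Pick a translation using the IPC subclass (first four characters)."""
    subclass = ipc_code.replace(" ", "")[:4].upper()
    domain_dict = IPC_DICTS.get(subclass, {})
    if term in domain_dict:
        return domain_dict[term]
    # Fall back to the general dictionary; leave unknown terms untranslated.
    return GENERAL_DICT.get(term, term)


if __name__ == "__main__":
    for code in ("C12N 15/63", "B01J 23/00", "H04L 1/00"):
        print(code, select_term("载体", code))
```

Keying on the subclass keeps the lookup simple; a real system might instead match the longest available IPC prefix.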
Barriers of C-E MT § Chinese Language: Ambiguous Grammar • Lack of tense, voice and part-of-speech identifiers • Variable expression methods • Contains highly complex logical structures § Problems of Morphology • Word Segmentation • Part of Speech • Terminology § Easily affected by small errors • Punctuation mistakes • Wrongly written characters Even a minor error can result in poor translation
Is MT Possible? § Pros: the quality is improving. • Updated rules and knowledge • Richer terminology • Pre-processing § Cons: the machine is not smart enough. • Poor syntax analysis result • Insufficient term amount • Ambiguous grammar § Conclusion • At this stage, we cannot overcome all the barriers in C-E MT, but human-aided MT can help us.
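The word segmentation problem can be made concrete with a small sketch. It uses dictionary-based maximum matching, a textbook technique chosen here only for illustration; it is not claimed to be the method used in the system described in these slides. The vocabulary and the sample phrase are toy assumptions.

```python
# Toy vocabulary (hypothetical); single characters are always allowed as a fallback.
VOCAB = {"研究", "研究生", "生命", "起源"}


def forward_max_match(text, vocab, max_len=3):
    """Scan left to right, always taking the longest dictionary word."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            word = text[i:i + size]
            if size == 1 or word in vocab:
                words.append(word)
                i += size
                break
    return words


def backward_max_match(text, vocab, max_len=3):
    """Scan right to left, always taking the longest dictionary word."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            word = text[j - size:j]
            if size == 1 or word in vocab:
                words.insert(0, word)
                j -= size
                break
    return words


if __name__ == "__main__":
    phrase = "研究生命起源"  # "study the origin of life"
    print(forward_max_match(phrase, VOCAB))
    print(backward_max_match(phrase, VOCAB))
```

On this phrase the forward scan greedily picks 研究生 ("graduate student") and leaves 命 stranded, while the backward scan recovers 研究 / 生命 / 起源. Two reasonable heuristics disagree on the same input, which is the kind of ambiguity that propagates into later analysis steps.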
Blueprint of HAMT [Flowchart: Patent Input → Special element tagging → Morphology & syntax analysis → Term extraction → Translation Forecasting. Yes → MT → Post-editing (Manual) → Patent Output. No → Pre-processing (Human and machine)]
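The routing decision in this blueprint can be sketched in a few lines. The slide does not say how translation forecasting works, so this sketch assumes a stand-in rule: the share of extracted terms found in the dictionary must reach a threshold. It also assumes that documents sent to pre-processing rejoin the MT path afterwards. The names `coverage` and `plan_route`, and the 0.8 threshold, are hypothetical.

```python
ANALYSIS_STEPS = [
    "special element tagging",
    "morphology & syntax analysis",
    "term extraction",
    "translation forecasting",
]


def coverage(terms, dictionary):
    """Share of extracted terms found in the dictionary (1.0 if no terms)."""
    if not terms:
        return 1.0
    return sum(term in dictionary for term in terms) / len(terms)


def plan_route(terms, dictionary, threshold=0.8):
    """Return the ordered steps a document would go through."""
    steps = list(ANALYSIS_STEPS)
    if coverage(terms, dictionary) < threshold:
        # Forecast says "No": add human and machine pre-processing first.
        steps.append("pre-processing (human and machine)")
    steps += ["MT", "post-editing (manual)", "patent output"]
    return steps


if __name__ == "__main__":
    known_terms = {"电池", "电极"}
    print(plan_route(["电池", "电极"], known_terms))
    print(plan_route(["电池", "隔膜"], known_terms))
```

The threshold is where the cost and quality trade-off shows up: a higher value sends more documents to human pre-processing.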
Cost vs Quality: Where is the Balance? The Key is: how to set up the quality standards?
Our Efforts on Chinese Patent Information - BJ. Zhongxian Tuofang § C-E Patent Machine Translation Project § Aim • Make C-E patent MT useful § Our work • Chinese linguists and NLP experts • Cooperation with the Chinese Academy of Sciences • Three years of R&D • 3.5 M IPC-driven C-E dictionary • Large-scale syntax rule base tailored for Chinese patents § Achievement • MT can demonstrate high readability levels across certain technology fields § Next step • More terms from patents are urgently needed • To judge how far we have progressed, we need your suggestions and evaluation
Our Efforts on Chinese Patent Information § Optimize English data • Supplementing Legal Status/Designs data • Translation of missing items (such as utility model abstracts) • Error Correction § Provide integrated Chinese patent web service in English • C-PAT search system (Claims and Specification are also searchable) • Patent Translation (manual translation, MT, HAMT) • Patent research service by Chinese experts (specialized in Chinese patents)
Thank You!