
  • Number of slides: 19

CLAIMS: CLassification Automated InforMation System
Computer-Assisted Categorisation of Patent Documents in the International Patent Classification
Patrick Fiévet, CLAIMS Project Manager, WIPO & Caspar J. Fall, CLAIMS Consultant, ELCA
ICIC'03, Nîmes, 22 October 2003
© WIPO – 2003 PF & CJF

Agenda
§ Introduction to CLAIMS project (PF)
§ Computer-assisted categorization prototypes (CJF)
§ CLAIMS Categorizer perspectives (PF)

1. Introduction to CLAIMS Project

1.1 CLAIMS Context
World Intellectual Property Organization (WIPO)
International Patent Classification (IPC)
Classification Automated Information System (CLAIMS)

1.2 CLAIMS Project Objectives
IT support for the promotion of the IPC
• IPC reform and revision support
• IPC tutorials
• Translation and natural language search in the IPC
• IPC categorization assistance to patent offices

2. Computer-assisted Categorization

2.1 Objectives
§ Develop a solution for predicting International Patent Classification (IPC) codes
§ Facilitate accurate classification in small and medium patent offices
§ Support for documents in multiple languages
§ Categorization assistance tool
§ Open questions
» Depth of computer-assisted categorization
» What accuracy?

2.1 Key issues
§ Survey of automated categorization research
§ Patent categorization
§ The IPC is a hierarchical classification
» 120 classes, 628 subclasses, 69,000 groups
» Patents have secondary IPC codes
§ The categories are modified over time
§ Vocabulary very diverse and technical
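
The hierarchy mentioned on this slide can be illustrated by splitting a single IPC symbol into its levels. The following is a minimal sketch, not part of the CLAIMS tools: the function name `split_ipc` and the regular expression are assumptions made for illustration, and the pattern is a simplification of the real symbol syntax. It relies only on the standard layout of an IPC symbol such as `A01B 1/02` (section, class, subclass, main group, subgroup).

```python
import re

# Simplified pattern for a full IPC symbol such as "A01B 1/02".
# Assumption: section letter A-H, two-digit class, subclass letter,
# main group number, "/" and a subgroup number.
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def split_ipc(symbol):
    """Return the hierarchy levels of one IPC symbol as a dict."""
    match = IPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised IPC symbol: {symbol!r}")
    section, cls, sub, main, subgroup = match.groups()
    subclass = section + cls + sub
    return {
        "section": section,                          # e.g. "A"
        "class": section + cls,                      # e.g. "A01"
        "subclass": subclass,                        # e.g. "A01B"
        "main_group": f"{subclass} {int(main)}/00",  # e.g. "A01B 1/00"
        "group": f"{subclass} {int(main)}/{subgroup}",
    }


levels = split_ipc("A01B 1/02")
print(levels["subclass"], "|", levels["main_group"])
```

Truncating a symbol in this way is what makes it possible to train and evaluate a categorizer at a chosen depth (class, subclass, or main group) from the same labelled documents.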

2.1 Patent categorization approach
§ Machine-learning method to recognize categories
» Statistical distribution of words
§ Establish training data
» Training documents with good IPC codes
» 210,000 to 830,000 documents
Advantages
• No need for keywords
• Easy to train the tools
• Can support many languages
Disadvantages
• Never absolute certainty in the results
• Difficult to have reliable full automation
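
To make "statistical distribution of words" concrete, here is a minimal sketch of a word-count classifier. The slides do not name the algorithm used in the prototype, so a multinomial naive Bayes model with add-one smoothing is used here purely as a stand-in. The function names (`tokenize`, `train`, `rank`) and the toy training documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def train(docs):
    """Collect per-category word counts from (text, ipc_code) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for text, label in docs:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def rank(model, text):
    """Return all categories, most probable first (naive Bayes scores)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    tokens = [t for t in tokenize(text) if t in vocab]  # skip unseen words
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # category prior
        for tok in tokens:
            # Add-one smoothing keeps unseen word/category pairs non-zero.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return sorted(scores, key=scores.get, reverse=True)


# Toy training data, invented for illustration only.
toy_docs = [
    ("plough soil tillage blade", "A01B"),
    ("seed planting soil drill", "A01C"),
    ("engine piston cylinder combustion", "F02B"),
]
model = train(toy_docs)
print(rank(model, "soil plough blade"))
```

The output is a ranked list rather than a single answer, which matches the disadvantage listed on the slide: the method never gives absolute certainty, so a human can review the top few suggestions.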

2.2 Prototype
§ Custom development
§ State-of-the-art algorithm
§ Language independent
§ Measure categorization success
§ Compare the predictions with other manually classified documents
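
Comparing predictions with manually classified documents can be sketched as a top-k accuracy measure: a prediction counts as a success if the manually assigned code appears among the first k suggestions. This is one common way to score a ranked categorizer, and the slide does not state which measure the prototype used. The function name `top_k_accuracy` and the sample codes are assumptions for illustration.

```python
def top_k_accuracy(predictions, reference, k=1):
    """Fraction of documents whose manual code is in the top k predictions.

    predictions: one ranked list of IPC codes per document, best first.
    reference:   the manually assigned main IPC code for each document.
    """
    if len(predictions) != len(reference):
        raise ValueError("predictions and reference must have equal length")
    if not reference:
        return 0.0
    hits = sum(
        1 for ranked, manual in zip(predictions, reference)
        if manual in ranked[:k]
    )
    return hits / len(reference)


# Invented sample: three held-out documents with ranked suggestions.
predicted = [["A01B", "A01C"], ["F02B", "F02M"], ["H04L", "G06F"]]
manual = ["A01B", "F02M", "G06N"]
print(top_k_accuracy(predicted, manual, k=1))
print(top_k_accuracy(predicted, manual, k=2))
```

Scoring with k greater than 1 reflects the assistance use case, where a user picks from a short list of suggestions rather than accepting a single automatic answer.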

2.2 Prototype results

2.2 Improving accuracy with category refining
[Diagram: Scenario 1 and Scenario 2, with steps labeled "direct", "validate" and "refine"]

2.3 Conclusions
§ It works well!
§ Useful user assistance
§ Direct categorization at subclass level possible
§ IPC codes can be refined accurately to main group level
§ To get accurate results, one needs:
» Large datasets
» Good category coverage
» Accurate IPC codes
§ Read the proceedings for more details
§ Demonstration available after the presentation

3. IPCCAT

3.1 CLAIMS Categorizer Perspectives
1. Implementation: IPCCAT
2. Training sets for IPC categorization: English, French, Spanish, Russian, German, and possibly Chinese
3. IPC data sets improvement & Categorizer retraining

3.2 CLAIMS Categorizer Perspectives
4. Improve integration of the IPC Categorizer with other CLAIMS tools
5. CLAIMS policy for distribution of data sets in various languages

3.2 Access to IPCCAT for PCT
Login: IBGST 01
Password: clobterib

Questions / Answers
Patrick Fiévet: patrick.fievet@wipo.in

CLAIMS: CLassification Automated InforMation System
Thank you for your attention