  • Number of slides: 20

“Find What I Mean, Not What I Say”
Mike Moran, IBM Distinguished Engineer
November 2007
© 2007 IBM Corporation

Information Management software | Enterprise Content Management
Why do companies use search?
Business benefit and value of search:
  • Increase productivity: enable employees to more quickly find information needed to complete their business activities
  • Achieve greater insight: analyze free-form, text-based information for insight into customer behavior and business performance
  • Decrease costs: empower customers and partners to support themselves and perform their own research
  • Increase revenue: ensure customers can easily find products and services, driving higher sales and increasing customer retention

How does IBM OmniFind meet those needs?
  • OmniFind Yahoo! Edition: basic, no-charge search
  • OmniFind Enterprise Edition: scalable and secure enterprise search for corporate intranets
  • OmniFind Discovery Edition: search for self-service and eCommerce
  • Insight solutions with OmniFind content analytics

Why is search so difficult?
  • It is harder to think of words than to make choices
  • Choosing the same words as the author is not easy
  • Words are ambiguous
[Figure text: “1 to 10 of 10 zillion”]

The classic search model
  • Task: “I need to tell Pat.” (possible error: misconception)
  • Information need: “How do I contact Pat?” (possible error: mistranslation)
  • Verbal form: “What’s Pat’s phone number?” (possible error: misformulation)
  • Query: “Pat phone” (possible error: ambiguity)
  • Search engine

Sometimes your word is used too often
Searching for “neon” finds signs and cars

Sometimes your word isn’t used at all
Searching for “Pat phone” finds nothing
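Both keyword failures (a word used too often, and a word not used at all) can be reproduced with a toy matcher. This is a minimal sketch, assuming a made-up three-document collection and a hypothetical `keyword_search` helper; neither comes from the presentation.

```python
import re

# Made-up documents, for illustration only.
DOCS = {
    "sign": "Neon sign repair and installation",
    "car": "Used Dodge Neon for sale, low mileage",
    "directory": "Patricia Smith, telephone 555-0100",
}

def tokens(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_search(query, docs=DOCS):
    """Return the ids of documents that contain every query word."""
    wanted = tokens(query)
    return sorted(doc_id for doc_id, text in docs.items()
                  if wanted <= tokens(text))

print(keyword_search("neon"))       # matches both the sign and the car
print(keyword_search("Pat phone"))  # no document uses those exact words
```

The directory entry is the answer the searcher wants, but it says "Patricia" and "telephone", so an exact-word match never reaches it.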

Analytics bridge unstructured and structured data
Text analysis: unstructured information → structured information
  • Unstructured information (text, chat, email, audio, video): high-value, most current, fastest growing, BUT buried in huge volumes (noise), implicit semantics, inefficient search
  • Structured information (indices, DBs, KBs): explicit semantics, efficient search, focused content, BUT slow growing, narrow coverage, less current/relevant

Find what I mean, not what I say
SEARCH: Going rate for leasing a billboard near Triborough Bridge
  • Query annotations: Rate for (Rate, Billboard); Located in (Bronx)
  • Result: “…We were offered $250,000/year in 2001 for an outdoor sign in Hunts Point overlooking the Bruckner expressway.…”
  • Result annotations: Rate for (Rate, Billboard); Located in (Bronx)
No keywords in common, but a good answer

Without semantic search, it’s not a pretty picture
SEARCH: Going rate for leasing a billboard near Triborough Bridge
  • Query annotations: Rate for (Rate, Billboard); Located in (Bronx)
  • Result: “…Simon and Garfunkel's "The 59th Street Bridge Song" was rated highly by the Billboard magazine in the 60's…”
  • Result annotations: Song Title; Located in (Queens); Magazine
Common keywords, bad semantic match
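The contrast between the two billboard results can be sketched by comparing annotations instead of words. The fact tuples below are hand-written stand-ins for text analysis output, and the relation names and the `semantic_score` helper are assumptions made for this illustration, not the product's data model.

```python
# Hand-written annotations standing in for real text analysis output.
QUERY = {("rate_for", "billboard"), ("located_in", "Bronx")}

DOCS = {
    # "...offered $250,000/year ... for an outdoor sign in Hunts Point..."
    "hunts_point_offer": {("rate_for", "billboard"), ("located_in", "Bronx")},
    # "...'The 59th Street Bridge Song' was rated highly by the Billboard magazine..."
    "bridge_song": {("song_title", "The 59th Street Bridge Song"),
                    ("located_in", "Queens"),
                    ("magazine", "Billboard")},
}

def semantic_score(query_facts, doc_facts):
    """Fraction of the query's facts that the document also asserts."""
    return len(query_facts & doc_facts) / len(query_facts)

ranked = sorted(DOCS, key=lambda d: semantic_score(QUERY, DOCS[d]), reverse=True)
print(ranked)
```

The Hunts Point passage shares no keywords with the query yet matches every annotated fact, while the song passage shares "rate", "Billboard" and "Bridge" as words but none of the facts.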

News example
  • Search: “Bush trip to Middle East”
Annotation layers for the sentence “President Bush visits shrine in Israel”:
  • Syntactic Annotator: NP, VP, PP
  • Named Entity Annotator: Title (President), Person (Bush), Country (Israel), Gov Official
  • Relationship Annotator: Located At (Arg 1: Entity, Arg 2: Location)

Financial services example
  • Search: “Fred Center’s title”
  • Search: “head of Center Micros”
Annotation layers for the sentence “Fred Center is the CEO of Center Micros”:
  • Parser: NP, VP, PP
  • Named Entity: Organization (Center Micros)
  • Relationship: CeoOf (Arg 1: Person, Arg 2: Org)

Law enforcement example
  • Search: “Neon car”
  • Search: “Higgins’ car”
Annotation layers for the sentence “A Neon was driven by Timothy Higgins”:
  • Syntactic Annotator: NP, VP, PP
  • Named Entity Annotator: Car (Neon), Person (Timothy Higgins)
  • Relationship Annotator: Driven By (Arg 1: Car, Arg 2: Person)
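The layering in these three examples, where a relationship annotator builds on named entities, can be sketched in a few lines. This is a toy illustration under stated assumptions: the name lists, the single "was driven by" pattern, and all function names are hypothetical, and a real annotator would rely on proper parsing instead of a regular expression.

```python
import re

# Toy name lists standing in for a real named entity annotator (hypothetical data).
CARS = ["Neon"]
PEOPLE = ["Timothy Higgins"]

def named_entities(sentence):
    """Return (kind, text) pairs for every known name found in the sentence."""
    entities = []
    for kind, names in (("Car", CARS), ("Person", PEOPLE)):
        for name in names:
            if name in sentence:
                entities.append((kind, name))
    return entities

def driven_by(sentence, entities):
    """Return ("DrivenBy", car, person) for each '<car> was driven by <person>' match."""
    cars = [text for kind, text in entities if kind == "Car"]
    people = [text for kind, text in entities if kind == "Person"]
    relations = []
    for car in cars:
        for person in people:
            pattern = re.escape(car) + r"\s+was\s+driven\s+by\s+" + re.escape(person)
            if re.search(pattern, sentence):
                relations.append(("DrivenBy", car, person))
    return relations

def cars_driven_by(surname, relations):
    """Answer a query such as "Higgins' car" from the extracted relations."""
    return [car for _, car, person in relations if surname in person.split()]

sentence = "A Neon was driven by Timothy Higgins"
relations = driven_by(sentence, named_entities(sentence))
print(relations)
print(cars_driven_by("Higgins", relations))
```

Once the relation is stored, a search for “Higgins’ car” can be answered from the relation itself, even though the word "car" never appears in the sentence.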

How does semantic search find a phone number?

When you search for “IBM phone number”
Expanded query, with synonyms:
@xmlf2::'ibm <.or>phone <#phonenumber/> "phone nbr" "telephone number" <.or>number <#phonenumber/> "phone nbr" "telephone number" '
[Figure: Results]
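The idea behind the expanded query, each term widened into a group of alternatives, can be sketched generically. The synonym table and the `expand_query` helper are hypothetical, and the output uses a plain `OR` notation chosen for readability; it is not the OmniFind query syntax shown on the slide, and it omits the phone-number annotation.

```python
# Hypothetical synonym table; the phrases mirror those shown on the slide.
SYNONYMS = {
    "phone": ['"phone nbr"', '"telephone number"'],
    "number": ['"phone nbr"', '"telephone number"'],
}

def expand_query(query, synonyms=SYNONYMS):
    """Replace each term that has synonyms with an OR group of alternatives."""
    parts = []
    for term in query.lower().split():
        alternatives = [term] + synonyms.get(term, [])
        if len(alternatives) == 1:
            parts.append(term)
        else:
            parts.append("(" + " OR ".join(alternatives) + ")")
    return " ".join(parts)

print(expand_query("IBM phone number"))
```

Terms without an entry in the table, such as "ibm", pass through unchanged.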

Customers need a platform, not just samples
  • To create domain-specific knowledge, create a new annotator or modify one already shipped
  • Or configure any regular expression with no coding
  • And it needs to work in many natural languages
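A regular-expression annotator driven purely by configuration might look like the sketch below. The rule table, the US-style phone pattern, and the `annotate` function are assumptions for illustration; the presentation does not show how the product's rules are actually configured.

```python
import re

# Hypothetical rule table: annotation name -> regular expression.
# Adding a rule here needs no change to the code below.
RULES = {
    "phonenumber": r"\b\d{3}-\d{3}-\d{4}\b",  # assumes US-style numbers
}

def annotate(text, rules=RULES):
    """Return (annotation_name, matched_text, start, end) for every rule match."""
    spans = []
    for name, pattern in rules.items():
        for match in re.finditer(pattern, text):
            spans.append((name, match.group(0), match.start(), match.end()))
    return spans

text = "Call Pat at 914-555-0123 before noon."
print(annotate(text))
```

Because each annotation carries its character offsets, a search index can record that a phone number occurs in the document and where, which is what lets a query ask for the concept and not a literal string.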

Customers need an open, extensible framework
  • Text analysis is a complex, multi-step process
  • No one vendor can satisfy every need you’ll have in text analysis
  • That’s why you need an open framework
OmniFind Enterprise Edition with UIMA: Text → Identify Language → Parse Words → Categorize → Annotate → Search Index

UIMA is an open standard framework
  • IBM has submitted the Unstructured Information Management Architecture (UIMA) specification to the Organization for the Advancement of Structured Information Standards (OASIS)
  • The UIMA source code has been contributed to the Apache Software Foundation, and an Apache Incubator project has been established to foster collaborative, consensus-based development of new software based on UIMA

Support for UIMA and OmniFind
  • Deliver content to platform for analysis
  • Provide components that perform text analysis
  • Provide applications that leverage text analysis and enhanced search

Read all about it
The search marketing best seller:
  • “Buy this book, read it, and then read it again.” -- Chris Sherman, Search Engine Watch
  • “Indispensable guide” -- Kirkus Reports
  • Updated every printing
Internet Marketing:
  • “Act now and read it” -- Bryan Eisenberg
  • “Great book” -- Robert Scoble
  • “Bravo” -- Search Engine Watch
For more information about the books, and for the free Biznology newsletter and blog: www.mikemoran.com