
e489c3e3571f9b06f580c5e0d385872a.ppt
- Number of slides: 20
“Find What I Mean, Not What I Say”
Mike Moran, IBM Distinguished Engineer
November 2007
© 2007 IBM Corporation
Information Management software | Enterprise Content Management
Why do companies use search?
Business Benefit: Value of Search
§ Increase productivity: Enable employees to find the information needed to complete their business activities more quickly
§ Achieve greater insight: Analyze free-form, text-based information for insight into customer behavior and business performance
§ Decrease costs: Empower customers and partners to support themselves and perform their own research
§ Increase revenue: Ensure customers can easily find products and services, driving higher sales and increasing customer retention
How does IBM OmniFind meet those needs?
§ OmniFind Yahoo! Edition: Basic, No-Charge Search
§ OmniFind Enterprise Edition: Scalable and Secure Enterprise Search for Corporate Intranets
§ OmniFind Discovery Edition: Search for Self-Service and e-Commerce
§ Insight Solutions with OmniFind Content Analytics
Why is search so difficult?
§ It is harder to think of words than to make choices
§ Choosing the same words as the author is not easy
§ Words are ambiguous
[Graphic: results 1 to 10 of 10 zillion]
The classic search model
§ Task (Misconception): “I need to tell Pat.”
§ Information Need (Mistranslation): “How do I contact Pat?”
§ Verbal form (Misformulation): “What’s Pat’s phone number?”
§ Query (Ambiguity): “Pat phone”
§ Search Engine
Sometimes your word is used too often
Searching for “neon” finds signs and cars
Sometimes your word isn’t used at all
Searching for “Pat phone” finds nothing
Analytics bridge unstructured and structured data
Unstructured Information (Text, Chat, Email, Audio, Video)
§ High-value
§ Most current
§ Fastest growing
...BUT...
§ Buried in huge volumes (noise)
§ Implicit semantics
§ Inefficient search
Text Analysis
Structured Information (Indices, DBs, KBs)
§ Explicit semantics
§ Efficient search
§ Focused content
...BUT...
§ Slow growing
§ Narrow coverage
§ Less current/relevant
Find what I mean, not what I say
§ SEARCH: Going rate for leasing a billboard near Triborough Bridge
§ Query annotations: Rate for, Rate, Billboard, Located in, Bronx
§ Passage: “…We were offered $250,000/year in 2001 for an outdoor sign in Hunts Point overlooking the Bruckner expressway.…”
§ Passage annotations: Rate for, Rate, Billboard, Located in, Bronx
§ No keywords in common, but a good answer
Without semantic search, it’s not a pretty picture
§ SEARCH: Going rate for leasing a billboard near Triborough Bridge
§ Query annotations: Rate for, Rate, Billboard, Located in, Bronx
§ Passage: “…Simon and Garfunkel's "The 59th Street Bridge Song" was rated highly by the Billboard magazine in the 60's…”
§ Passage annotations: Song Title, Queens, Magazine
§ Common keywords, bad semantic match
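The contrast between the two billboard slides can be sketched in a few lines of Python. This is an illustration only, not how OmniFind scores results: the concept sets are hand-written from the slide labels, and the stopword list and both scoring functions are hypothetical.

```python
# Toy comparison of keyword overlap and concept overlap.
# The concept sets below are hand-written from the slide labels (an assumption);
# a real system would produce them with text analysis.

QUERY = "Going rate for leasing a billboard near Triborough Bridge"

PASSAGES = {
    "sign": "We were offered $250,000/year in 2001 for an outdoor sign "
            "in Hunts Point overlooking the Bruckner expressway.",
    "song": "Simon and Garfunkel's \"The 59th Street Bridge Song\" was rated "
            "highly by the Billboard magazine in the 60's",
}

QUERY_CONCEPTS = {"Rate", "Billboard", "Bronx"}
PASSAGE_CONCEPTS = {
    "sign": {"Rate", "Billboard", "Bronx"},
    "song": {"Song Title", "Queens", "Magazine"},
}

# Hypothetical stopword list, just large enough for this example.
STOPWORDS = {"a", "the", "for", "in", "by", "was", "we", "were", "near", "and"}


def keywords(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = {w.strip('.,"') for w in text.lower().split()}
    return words - STOPWORDS


def keyword_overlap(query, passage):
    """Count the words shared by the query and the passage."""
    return len(keywords(query) & keywords(passage))


def concept_overlap(query_concepts, passage_concepts):
    """Count the annotated concepts shared by the query and the passage."""
    return len(query_concepts & passage_concepts)


for name, passage in PASSAGES.items():
    print(name,
          keyword_overlap(QUERY, passage),
          concept_overlap(QUERY_CONCEPTS, PASSAGE_CONCEPTS[name]))
```

With these assumed annotations, the passage about the song wins on shared words while the passage about the outdoor sign wins on shared concepts, which is the point the two slides make.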
News example
§ Search: “Bush trip to Middle East”
§ Sentence: “President Bush visits shrine in Israel”
§ Syntactic Annotator: NP, VP, PP
§ Named Entity Annotator: Title, Person, Country, Gov Official
§ Relationship Annotator: Located At (Arg 1: Entity, Arg 2: Location)
Financial services example
§ Search: “Fred Center’s title”
§ Search: “head of Center Micros”
§ Sentence: “Fred Center is the CEO of Center Micros”
§ Parser: NP, VP, PP
§ Named Entity: Person, Organization
§ Relationship: CeoOf (Arg 1: Person, Arg 2: Org)
Law enforcement example
§ Search: Neon car
§ Search: “Higgins’ car”
§ Sentence: “A Neon was driven by Timothy Higgins”
§ Syntactic Annotator: NP, VP, PP
§ Named Entity Annotator: Car, Person
§ Relationship Annotator: Driven By (Arg 1: Car, Arg 2: Person)
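The layering in the three examples (entities first, then a relationship between them) can be sketched as follows. This is a minimal illustration in plain Python, not the OmniFind or UIMA API: the gazetteers, the `Entity` and `Relation` types, and the "driven by" rule are all hypothetical stand-ins for real annotators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str
    kind: str  # e.g. "Car" or "Person"


@dataclass(frozen=True)
class Relation:
    name: str
    arg1: Entity
    arg2: Entity


# Toy gazetteers standing in for a named entity annotator (hypothetical data).
CARS = {"Neon"}
PEOPLE = {"Timothy Higgins"}


def find_entities(sentence):
    """Return every known car or person name that occurs in the sentence."""
    found = [Entity(name, "Car") for name in CARS if name in sentence]
    found += [Entity(name, "Person") for name in PEOPLE if name in sentence]
    return found


def find_driven_by(sentence):
    """Emit DrivenBy(Car, Person) when a car precedes 'driven by' and a person follows it."""
    marker_pos = sentence.find("driven by")
    if marker_pos < 0:
        return []
    entities = find_entities(sentence)
    cars = [e for e in entities
            if e.kind == "Car" and sentence.find(e.text) < marker_pos]
    people = [e for e in entities
              if e.kind == "Person" and sentence.find(e.text) > marker_pos]
    return [Relation("DrivenBy", car, person) for car in cars for person in people]


def cars_driven_by(surname, relations):
    """Answer a query such as "Higgins' car" from the extracted relations."""
    return [r.arg1.text for r in relations
            if r.name == "DrivenBy" and surname in r.arg2.text]


relations = find_driven_by("A Neon was driven by Timothy Higgins")
print(cars_driven_by("Higgins", relations))
```

Because the relation is stored with typed arguments, a search for a person's car can be answered from the relation itself, and a sentence about a neon sign produces no Car entity to confuse it.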
How does semantic search find a phone number?
When you search for “IBM phone number”
Expanded Query: @xmlf2::'ibm <.or>phone <#phonenumber/> "phone nbr" "telephone number" <.or>number <#phonenumber/> "phone nbr" "telephone number"'
Callouts: Synonyms, Results
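The synonym part of that expansion can be sketched generically. The code below is an assumption-laden illustration: the synonym table is copied from the phrases on the slide, the function names are hypothetical, and the output uses a generic boolean `OR` syntax rather than the OmniFind query syntax shown above. It also leaves out the `<#phonenumber/>` annotation alternative.

```python
# Hypothetical synonym table, using the phrases shown on the slide.
SYNONYMS = {
    "phone": ["phone nbr", "telephone number"],
    "number": ["phone nbr", "telephone number"],
}


def expand_query(query, synonyms):
    """Return one list of alternatives per query term (the term itself comes first)."""
    return [[term] + synonyms.get(term, []) for term in query.lower().split()]


def render(groups):
    """Render the alternatives as a generic boolean query; multi-word phrases are quoted."""
    parts = []
    for alternatives in groups:
        quoted = [f'"{a}"' if " " in a else a for a in alternatives]
        if len(quoted) == 1:
            parts.append(quoted[0])
        else:
            parts.append("(" + " OR ".join(quoted) + ")")
    return " ".join(parts)


print(render(expand_query("IBM phone number", SYNONYMS)))
```

Terms without synonyms pass through unchanged, so only "phone" and "number" become OR groups in this example.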
Customers need a platform, not just samples
§ To create domain-specific knowledge, create a new annotator or modify one already shipped
§ Or configure any regular expression with no coding
§ And it needs to work in many natural languages
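The idea of a regular expression configured as an annotator can be sketched like this. It is an illustration only: the `RULES` table, the `annotate` function, and the US-style phone pattern are assumptions, and the product's actual configuration mechanism is not shown in the slides.

```python
import re

# Hypothetical rule table: annotation type -> regular expression.
# Adding a rule means adding a line here, with no new code.
RULES = {
    "phonenumber": r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b",  # assumed US-style pattern
}


def annotate(text, rules):
    """Return (type, start, end, covered_text) for every match of every rule."""
    annotations = []
    for kind, pattern in rules.items():
        for match in re.finditer(pattern, text):
            annotations.append((kind, match.start(), match.end(), match.group()))
    return annotations


for annotation in annotate("Call Pat at 914-555-0100 today.", RULES):
    print(annotation)
```

Each annotation records its type and character span, so a search for the `phonenumber` type can find the document even when the word "phone" never appears in it.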
Customers need an open, extensible framework
§ Text analysis is a complex, multi-step process
§ No one vendor can satisfy every need you’ll have in text analysis
§ That’s why you need an open framework
Pipeline (OmniFind Enterprise Edition, UIMA): Text -> Identify Language -> Parse Words -> Categorize -> Annotate -> Search Index
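The multi-step process on this slide can be sketched as a list of interchangeable stages. This is plain Python written for illustration, not UIMA code: every stage below is a deliberately naive, hypothetical stand-in, and the point is only that stages from different sources can be swapped in or out of one pipeline.

```python
import re
from collections import defaultdict


def identify_language(doc):
    """Naive guess: call the text English if it contains a common English word."""
    hints = {"the", "and", "of", "is"}
    words = set(re.findall(r"[a-z]+", doc["text"].lower()))
    doc["language"] = "en" if words & hints else "unknown"
    return doc


def parse_words(doc):
    """Split the text into lowercase word tokens."""
    doc["tokens"] = re.findall(r"\w+", doc["text"].lower())
    return doc


def categorize(doc):
    """Toy category rule based on a single keyword."""
    doc["category"] = "contact" if "phone" in doc["tokens"] else "general"
    return doc


def annotate(doc):
    """Attach phone number annotations found by an assumed pattern."""
    pattern = r"\b\d{3}-\d{3}-\d{4}\b"
    doc["annotations"] = [("phonenumber", m.group())
                          for m in re.finditer(pattern, doc["text"])]
    return doc


# Any stage can be replaced or extended without touching the others.
PIPELINE = [identify_language, parse_words, categorize, annotate]


def run_pipeline(text, stages=PIPELINE):
    """Pass one document through every stage in order."""
    doc = {"text": text}
    for stage in stages:
        doc = stage(doc)
    return doc


def build_index(docs):
    """Index both plain tokens and annotation types (prefixed with '#')."""
    index = defaultdict(set)
    for doc_id, doc in enumerate(docs):
        for token in doc["tokens"]:
            index[token].add(doc_id)
        for kind, _ in doc["annotations"]:
            index["#" + kind].add(doc_id)
    return index


docs = [run_pipeline("The phone number of Pat is 914-555-0100"),
        run_pipeline("Neon signs")]
index = build_index(docs)
print(sorted(index["#phonenumber"]), sorted(index["neon"]))
```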
UIMA is an open standard framework
§ IBM has submitted the Unstructured Information Management Architecture (UIMA) specification to the Organization for the Advancement of Structured Information Standards (OASIS)
§ The UIMA source code has been contributed to the Apache Software Foundation, and an Apache Incubator project has been established to foster collaborative, consensus-based development of new software based on UIMA
Support for UIMA and OmniFind
§ Deliver content to platform for analysis
§ Provide components that perform text analysis
§ Provide applications that leverage text analysis and enhanced search
Read all about it
The search marketing best seller
§ “Buy this book, read it, and then read it again.” -- Chris Sherman, Search Engine Watch
§ “Indispensable guide” -- Kirkus Reports
§ Updated every printing
Internet Marketing
§ “Act now and read it” -- Bryan Eisenberg
§ “Great book” -- Robert Scoble
§ “Bravo” -- Search Engine Watch
For more information about the books, and for the free Biznology newsletter and blog: www.mikemoran.com