- Number of slides: 26

An Expertise Finder Application Built on Enterprise Search
Robert Joachim, rjoachim@mitre.org
MITRE Corporation, McLean, VA
Gilbane San Francisco 2008, June 19, 2008
MITRE Corporation: Fortune Magazine “100 best companies to work for” (2002-2008); Computerworld “100 best places to work in IT” (2005-2008)
A National Resource Working in the Public Interest
Approved for Public Release. © 2008 The MITRE Corporation. All rights reserved.

Overview
- About MITRE, our Intranet, our enterprise search architecture
- MITRE Expertise Finder implementation
  – Expertise finding models (APQC)
  – MITRE expertise finding history
  – This product
    - Interface details
    - How it's built / How it works
  – System validation / Usage metrics
  – Nearer and longer term enhancements
  – Conclusion / Recommendations
- Background
  – Sources / Resources
    - Other recent ‘real world’ expertise finder implementations
  – Commercial (COTS) software for expertise finding
  – Community finding prototype example
  – Use of social bookmarks at MITRE

About MITRE
- About MITRE
  – Not-for-profit; operates 3 Federally Funded Research and Development Centers (FFRDCs), for DoD, FAA, and IRS
  – Application of expertise in systems engineering, information technology, operational concepts, and enterprise modernization
  – 6000 employees located at Bedford, MA, McLean, VA, plus other domestic and international sites; 65% of staff have Masters or Ph.D. degrees
- Our role
  – Problem solving / rapid response for our sponsors
  – ‘Reachback’ into the corporation for knowledge is key
- Long-standing history of information sharing practices
  – Embedded and reinforced in our corporate culture

MITRE Intranet & Enterprise Search
- MITRE Intranet is called the ‘MII’
  – MITRE Information Intranet
  – Early adoption of web technology & web search
  – Our intranet consists of multiple content repositories on various platforms, including
    - Intranet content server & multiple distributed content servers
    - Microsoft SharePoint for team site management and collaboration
    - Oracle Portal & Oracle application servers
    - Listserv lists for collaborative communication
- Google Enterprise is MITRE’s intranet search engine
  – The expertise finder application described here is based on Google enterprise search

MITRE Google architecture
[Diagram: content repositories are indexed by a Google Search Appliance (GSA 5005, 2.2 M URLs total), which serves the application interfaces.]
Content repositories:
- MITRE Intranet server & SharePoint document libraries: 400 K URLs
- Web-enabled file system + distributed MITRE webservers (40): 1.4 M URLs
- MITRE List messages: 450 K URLs
- Database crawls
- XML feeds
Application interfaces (MITRE intranet search & ‘focused search’ interfaces):
- Intranet
- Expertise Finder
- Email List Search
- Social Bookmarks
- Technical Exchange Meeting Search

MITRE Expertise Finding History
- Current system is based, in part, on earlier MITRE research and prototype work
  – MITRE staff: Maybury, House, D’Amore
- First developments of this system, based on Google enterprise search results, were also prototypes
  – Then released as a pilot project to collect user feedback
  – Subsequently productized
  – Then enhanced over multiple releases
    - Additional functional search and display features
    - Additional content resources for expertise identification
- Architectural focus: created using
  – Service-oriented componentized architecture
    - Loosely coupled building-block pieces that can be swapped in/out if necessary (“What if we were to replace our enterprise search system, our staff directory system”, etc.)
  – Extensible
    - Can be extended to other content repositories or could be used for alternate ‘finding’ applications

Expertise finding systems: characterization
APQC (formerly American Productivity & Quality Center) characterization of expertise finding systems:
- APQC ‘Model 1’: linking knowledge seekers with knowledge providers
  – No a priori designation of ‘experts’
  – This is our approach
- APQC ‘Model 2’: assigned discipline managers are responsible for knowing levels of expertise in their area
- APQC ‘Model 3’: designated ‘validated’ experts

MITRE Expertise Finder: What it looks like
[Screenshots: Main view; Organizational view]
- Helps answer the question: “Who at MITRE knows about topic X?”
- Results are based on Google relevancy ranking, in conjunction with author/owner attribution & document counts

Expertise Finder – Results details: Main view
[Screenshot of the main results view, with callouts:]
- Source options
- Email contact options
- Display options
- Content ‘evidence’ (with title, links to object & repository, ‘keywords in context’, object date)
- Person, with job title and link to phonebook

Expertise Finder – Results details: Organizational view and content display by organization
[Screenshot of the organizational (bubble) view]
- Bubble size and position indicate contributors and contributions by corporate center or division
- Clicking on a single bubble displays people and content from that organization

How it works
[Diagram: the user query goes to Expertise Finder (Java-based application on Oracle Application Server), which sends an HTTP query to MII Google (Google Search Appliance), receives XML results, performs an LDAP staff / organization lookup, and returns Expertise Finder results (staff and organizations) as HTML.]
Steps:
1. Based on keyword query, retrieves ranked results set from MII Google search (XML output)
2. Identifies author / owner attributes
3. Performs LDAP lookup for full staff name and organization
4. Returns results set, ranked by contributions, with hits ‘evidence’ and keyword context
Key MITRE Web-based repositories contributing to Expertise Finder, with author / owner attribution:
- Web-enabled file system (Employeeshare transfer folders, ‘about-me’ resumes): Standard User Identifier (SUI) in folder path
- CommunityShare (MS SharePoint): MS Office Property Author Field
- MITRE blogs: SUI (email name)
- MITRE Sourceforge: HTML Author meta tag
- MITRE List Messages: SUI (email name)
- MITRE Technical Exchange Meetings: Meeting Point of contact
- MITRE Institute Courses (future): Course instructor
- Onomi social bookmarks (future): Bookmark contributor
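
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the MITRE implementation (which the slide describes as a Java application): the XML layout, the `find_experts` function, the sample URLs, and the in-memory `DIRECTORY` standing in for LDAP are all assumptions made for the example. The tie-break by best search rank is also an assumption; the slides only say results are ranked by contributions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, simplified search output (step 1). The real appliance XML differs.
SAMPLE_XML = """
<results>
  <hit rank="1" url="http://intranet.example/a.doc" title="Biometrics overview" author="jdoe"/>
  <hit rank="2" url="http://intranet.example/b.html" title="Iris scanning notes" author="asmith"/>
  <hit rank="3" url="http://intranet.example/c.ppt" title="Biometrics briefing" author="jdoe"/>
  <hit rank="4" url="http://intranet.example/d.pdf" title="Unattributed report"/>
</results>
"""

# Stand-in for the LDAP directory (step 3): user id -> (full name, organization).
DIRECTORY = {
    "jdoe": ("Jane Doe", "Center A"),
    "asmith": ("Alan Smith", "Center B"),
}


def find_experts(xml_text, directory):
    """Group search hits by attributed author and rank people by contribution count."""
    evidence = defaultdict(list)
    for hit in ET.fromstring(xml_text.strip()).iter("hit"):
        author = hit.get("author")  # step 2: author / owner attribute
        if author in directory:     # hits without a known author are skipped
            evidence[author].append(
                (int(hit.get("rank")), hit.get("title"), hit.get("url"))
            )

    # Step 4: most contributions first; ties broken by best search rank (assumption).
    ranked = sorted(
        evidence.items(),
        key=lambda item: (-len(item[1]), min(rank for rank, _, _ in item[1])),
    )
    return [
        {
            "name": directory[uid][0],
            "organization": directory[uid][1],
            "evidence": [(title, url) for _, title, url in sorted(hits)],
        }
        for uid, hits in ranked
    ]


if __name__ == "__main__":
    for person in find_experts(SAMPLE_XML, DIRECTORY):
        print(person["name"], person["organization"], len(person["evidence"]))
```

Keeping the search call, the attribution step, and the directory lookup as separate inputs mirrors the loosely coupled design described earlier: any one of them could be replaced without touching the ranking logic.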

Resource and staff attribution details
(From step 2, previous slide)

| Resource | Author / owner attribution | Person metadata quality |
| Web-enabled file system (Employeeshare transfer folders, ‘about-me’ resumes) | Standard User Identifier (SUI = email name) in folder path | Excellent |
| CommunityShare (MS SharePoint) | MS Office Property Author Field | Varies |
| MITRE blogs | Standard user identifier (email name) | Excellent |
| MITRE Sourceforge (Software projects) | HTML Author meta tag | Excellent |
| MITRE List Messages | Standard user identifier (email name) | Excellent |
| MITRE Technical Exchange Meetings | Meeting Point of contact | Excellent |
| MITRE Institute Courses (future resource) | Course instructor | Excellent |
| Social bookmarks (Onomi) (future resource) | Bookmark contributor | Excellent |
| HTML pages from distributed webservers | HTML Author meta tag | Varies |
| MS Office documents from distributed webservers | MS Office Property Author Field | Varies |
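
Two of the attribution rules in the table can be illustrated with a short sketch. The folder layout matched by `SUI_IN_PATH`, the example URLs, and the function names are hypothetical; the slides state only that the SUI appears in the folder path and that HTML pages carry an Author meta tag.

```python
import re
from html.parser import HTMLParser

# Hypothetical folder layout: .../employeeshare/<SUI>/...
SUI_IN_PATH = re.compile(r"/employeeshare/([a-z0-9]+)/", re.IGNORECASE)


def sui_from_path(url):
    """Return the user identifier embedded in a folder path, or None if absent."""
    match = SUI_IN_PATH.search(url)
    return match.group(1).lower() if match else None


class _AuthorMeta(HTMLParser):
    """Collects the content of the first <meta name="author"> tag."""

    def __init__(self):
        super().__init__()
        self.author = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta" and self.author is None:
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "author":
                # Empty or whitespace-only content counts as missing metadata.
                self.author = (attributes.get("content") or "").strip() or None


def author_from_html(html_text):
    """Return the HTML Author meta tag value, or None if the page has none."""
    parser = _AuthorMeta()
    parser.feed(html_text)
    return parser.author


if __name__ == "__main__":
    print(sui_from_path("http://files.example/employeeshare/jdoe/transfer/notes.doc"))
    print(author_from_html('<head><meta name="author" content="asmith"></head>'))
```

The path rule always yields an identifier when the document sits in a personal folder, whereas the meta tag rule depends on authors filling in the tag, which is consistent with the "Excellent" versus "Varies" quality ratings in the table.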

This system architecture: Advantages / Disadvantages
- Advantages
  – Uses full-text indexed content (concepts are ‘fluid’, especially for new technologies, products, projects)
  – Uses the same online content contributors share in the course of their day-to-day work
  – No requirement for users to maintain a registry of expertise
  – Incentivizes staff to share content online in open repositories
  – Incentivizes staff to use correct metadata, especially authorship metadata
- Disadvantages
  – Results are only as good as the quality of authorship metadata used to associate information objects with staff
  – Results are dependent on the underlying search system relevancy ranking
    - However, we also force specific repository results by sending multiple parallel queries to multiple content repositories
  – Users could ‘game’ the system (by arbitrarily putting large numbers of documents online)
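
The idea of sending parallel queries to several repositories, so that each one is represented regardless of overall relevancy ranking, might look something like the sketch below. The repository names, the `FAKE_INDEX` data, and the `search_repository` stub are placeholders for illustration; a real version would issue one restricted search request per repository.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder content standing in for per-repository search back ends.
FAKE_INDEX = {
    "documents": ["doc about biometrics", "doc about ontologies"],
    "lists": ["list message on biometrics"],
    "sharepoint": ["team page on fisma"],
}


def search_repository(repository, query):
    """Stub search restricted to one repository (simple substring match)."""
    return [item for item in FAKE_INDEX[repository] if query.lower() in item.lower()]


def search_all(query, repositories):
    """Query every repository concurrently and keep the results separate."""
    with ThreadPoolExecutor(max_workers=len(repositories)) as pool:
        futures = {
            repo: pool.submit(search_repository, repo, query) for repo in repositories
        }
        # Collect per-repository results so no repository is crowded out.
        return {repo: future.result() for repo, future in futures.items()}


if __name__ == "__main__":
    print(search_all("biometrics", ["documents", "lists", "sharepoint"]))
```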

Query characteristics: observations
- Best performing queries
  – Specific (‘term specificity’): products, programs, projects, standards; query terms that are ‘good discriminators’
  – Query examples:
    - Products: Cognos, AppWorx
    - Projects: Next-generation airspace
    - Standards/Compliance: IEEE-1061, fisma
    - Topics: ontologies, second life, biometrics
- Worst performing queries
  – Extremely general terms, whether single or multiple words
  – Query examples:
    - ‘Engineering’: in a corporation where a majority of staff function in some engineering capacity and are performing engineering-related tasks
    - ‘Software’: in a corporation where a significant portion of our work focuses on some aspect of software engineering
  – But consider: these very general queries may not perform well in general full-text retrieval anyway

Results evaluation/validation methods
- Validation methods
  – Informal: send in a query based on a topic where the user knows a set of experts/specialists, and see how many come back in results
    - Many staff take this on themselves, as a check to see if they are included in results
  – Informal: when a user sends an email to the contacts identified, follow up with an informal email probe to that user
    - “Was this system helpful in identifying knowledgeable staff?”
    - “Did you get an answer?”
  – Metrics: continued usage by staff
    - Query metrics have held steady over time, paralleling general search query metrics

Usage metrics and query analysis
- Usage metrics tracking by
  – Basic usage
    - From Google query logs: Expertise Finder interface queries are coded with a specific parameter for identification in query logs
    - From internal WebTrends web analytic reports (user visits, page views)
  – Query analysis
    - We can identify specific queries sent to the system by analyzing query log data
- Monthly usage metrics
  – General MITRE Google queries: 60 K - 75 K queries per month
  – Expertise Finder queries: 2 K - 4 K queries per month
  – Average usage ratio of general search to the expertise finder application: ~25 : 1
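
As a rough illustration of the log-based tracking described above, the sketch below separates Expertise Finder queries from general queries by a marker parameter and checks the stated ratio against the monthly ranges. The parameter name `app`, its value `expertise`, and the log line format are hypothetical; the slides say only that a specific parameter is used. Treating the two query counts as disjoint is also an assumption.

```python
from urllib.parse import urlsplit, parse_qs

MARKER_PARAM = "app"        # hypothetical parameter name
MARKER_VALUE = "expertise"  # hypothetical parameter value


def count_queries(log_urls):
    """Return (general, expertise_finder) query counts from logged request URLs."""
    general = finder = 0
    for url in log_urls:
        params = parse_qs(urlsplit(url).query)
        if MARKER_VALUE in params.get(MARKER_PARAM, []):
            finder += 1
        else:
            general += 1
    return general, finder


def ratio_bounds(general_low, general_high, finder_low, finder_high):
    """Smallest and largest general-to-finder ratios implied by two monthly ranges."""
    return general_low / finder_high, general_high / finder_low


if __name__ == "__main__":
    logs = [
        "/search?q=cognos",
        "/search?q=fisma&app=expertise",
        "/search?q=second+life",
        "/search?q=biometrics",
    ]
    print(count_queries(logs))
    print(ratio_bounds(60_000, 75_000, 2_000, 4_000))
```

With the figures from the slide, the implied ratio lies between 15 : 1 and 37.5 : 1, and the ratio of the range midpoints (67.5 K to 3 K) is 22.5 : 1, so the reported average of roughly 25 : 1 is consistent with the monthly ranges.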

Expertise Finder enhancements
- New enhancement in development: ‘Presence awareness’ identification using a Microsoft Office ActiveX Web service
  – Shows staff availability online, free/busy status, access to Office Communicator chat, other tools
- System enhancements under consideration
  – Limit by date/date range (to find staff based on their most recent contributions)
  – Limit by more detailed level of contributing repositories
    - Beyond just ‘Documents and Webpages’ and ‘Lists’
    - E.g., SharePoint, Technical Exchange Meetings, Social Bookmarks
  – Permit user to limit and/or sort by staff classification/role, e.g.,
    - ‘AC’ technical; ‘PRO’ professional level support; ‘PSS’ administrative support

Longer term: where we may be going
- Exploration of social networking for expertise finding
  – Use of staff profiles based on social networking models (similar to MySpace, Facebook, LinkedIn)
  – Use social network connections as an additional dimension in expertise finding
- Hybrid approach: base expertise finding on
  – Text from document content
    - As we are doing; there will continue to be value in identifying expertise from content objects
  – and staff profiles, which may be
    - User-generated, auto-generated, or a combination
  – Let the user decide, per query, how to focus the results based on these resources

Conclusion/recommendations: expertise finding implementation
If you are considering expertise finding implementation
- Consider which APQC model fits your organization’s environment and requirements
- If implementing a software application
  – Evaluate metadata quality for staff attribution
  – Decisions
    - Staff identification based on registry vs. content, or hybrid
    - Build vs. buy
    - Build in conjunction with enterprise search
  – Use of service-oriented architecture
    - For swap-in/swap-out of code base, directory resources, and content resources
    - For future feature enhancements

Background info

Sources/Resources
- Google Scholar results: MITRE expertise finding research. MITRE authors: M. Maybury, D. House, R. D’Amore
  – http://scholar.google.com/scholar?q=Maybury,+House,+D%E2%80%99Amore
- Ackerman, Mark and McDonald, David. “Just Talk to Me: A Field Study of Expertise Location.” Proceedings of the 1998 ACM conference on computer supported cooperative work, Seattle, November 14-18, 1998
  – http://portal.acm.org/citation.cfm?id=289506
- Hughes, Gareth and Crowder, Richard. “Experiences in designing highly adaptable expertise finder systems.” Proceedings of the DETC 03, Chicago, September 2-6, 2003
  – http://eprints.ecs.soton.ac.uk/8206/
- Expertise Locator Systems: Finding the Answers. APQC Publications: 2003
  – http://www.apqc.org/portal/apqc/ksn?paf_gear_id=contentgearhome&paf_dm=full&pageselect=detail&docid=123338

Sources/Resources (continued)
- Maybury, Mark. Expert Finding Systems. MITRE Corporation: MITRE Technical Report, MTR 06 B 00040, September 2006
  – http://www.mitre.org/work/tech_papers_06/06_1115/06_1115.pdf
- Maybury, Mark. “Discovering Distributed Expertise.” AAAI Fall Symposium Series, Regarding the “Intelligence” in Distributed Intelligent Systems, November 9, 2007
  – http://www.mitre.org/work/tech_papers_07/07_0730/07_0730.pdf
- Damianos, Laurie, et al. Onomi: Social Bookmarking on a Corporate Intranet. MITRE Corporation, May 2006
  – http://www.mitre.org/work/tech_papers_06/06_0352/06_0352.pdf
- Author contact: rjoachim@mitre.org

Sources/Resources: Expertise finding recent real-world implementations
- Presented or cited at Enterprise Search Summit, New York, May 20-21, 2008
  – “Mining additional value from enterprise search”, Trent Parkhill, Haley & Aldrich
    - Based on search product: Coveo
  – “Search connections in context”, Oz Benamram, Morrison & Foerster
    - Based on search product: Recommind
  – Google, Inc. internal expertise finder
    - Based on search product: Google enterprise
- Montague Institute, January 17, 2008
  – “Enterprise mashups for expertise location”, Qin Zhu, HP Labs
    - Based on search product: Inktomi

Commercial software products for enterprise expertise finding: examples
- Enterprise search with expertise finding components
  – Autonomy IDOL
  – FAST (partnering with AskMe)
  – Endeca
  – Recommind
  – Microsoft SharePoint Enterprise search (with MOSS 07 “Knowledge Network”)
- Dedicated/specialized systems
  – TACIT ActiveNet
  – AskMe
  – Triviumsoft SEE-K

Community Finding (Prototype)
- Based on MITRE Expertise Finder code
- Identifies MITRE online communities (email lists & SharePoint communities)
- Uses search results of
  – List messages
  – Documents associated with a community
  – Community descriptions

“Onomi” Social Bookmarks: Tagging/Sharing of Web Resources by MITRE staff
Onomi (rhymes with ‘Taxonomy’), based on open-source tool Scuttle
  – Lets MITRE staff bookmark content resources and tag content of interest with topical terms
  – Helps me “find this again”
  – Builds communities of interest
  – And contributes to expertise finding: users tagging/contributing are ‘experts’ in the content/topics bookmarked and tagged