
Taxonomy Strategies LLC
4 Myths about Taxonomies
ITIMG – Industrial Technical Information Managers Group Meeting
Newport Beach, CA
April 11, 2005
Copyright 2005 Taxonomy Strategies LLC. All rights reserved.

Who I am
Over 25 years in the business of organized information
• Founder & Principal, Taxonomy Strategies
• Director, Solutions Architecture, Interwoven
• VP, Infoware, Metacode Technologies
• Program Manager, Getty Foundation
• Manager, Pricewaterhouse
• Assistant Director for Technical Services, Hampshire College
• Chief, Technical Services, Paul Weiss Rifkind Wharton & Garrison
Metadata & taxonomies community leadership
• President, American Society for Information Science & Technology
• Trustee, Dublin Core Metadata Initiative
• Co-Founder, Networked Knowledge Organization Systems/Services
• Adviser, National Research Council Computer Science and Telecommunications Board
• Reviewer, National Science Foundation Division of Information and Intelligent Systems
TAXONOMY STRATEGIES LLC The business of organized information

Recent & current projects
Government
• Commodity Futures Trading Commission
• Defense Intelligence Agency
• ERIC
• Federal Aviation Administration
• Federal Reserve Bank of Atlanta
• Forest Service
• GSA Office of Citizen Services (www.firstgov.gov)
• Head Start
• Infocomm Development Authority of Singapore
• NASA (nasataxonomy.jpl.nasa.gov)
• Small Business Administration
• Social Security Administration
• USDA Economic Research Service
• USDA e-Government Program (www.usda.gov)
Commercial
• Allstate Insurance
• Blue Shield of California
• Debevoise & Plimpton
• Halliburton
• Hewlett Packard
• Motorola
• PeopleSoft
• PricewaterhouseCoopers
• Siderean Software
• Sprint
• Time Inc.
Commercial subcontracts
• Agency.com – Top financial services
• Critical Mass – Fortune 50 retailer
• Deloitte Consulting – Big credit card
• Gistics/OTB – Direct selling giant
NGOs
• CEN
• IDEAlliance
• IMF
• OCLC

What I do
Organize Stuff

Agenda
• Myth #1: The Web has changed everything
• Myth #2: Taxonomies are monolithic hierarchies
• Myth #3: Literary warrant
• Myth #4: Knowledge workers

Finding information should not be about “Feeling Lucky”

Something is wrong with this picture
• “…search is so fundamental that people should have been focusing on it all along. The reality of the situation is that there was a great assumption that search was actually working just fine.” (Harley Manning, Research Director)

Why doesn’t search work?
• For search engines to work, they need better stuff to work on! Otherwise it’s garbage in … and garbage out.
• Correctly matching content with questions (regardless of the technology) requires better content to work on.

How to fix search … add metadata to search on
• “Adding metadata to unstructured content allows it to be managed like structured content. Applications that use structured content work better.”
• “Enriching content with structured metadata is critical for supporting search and personalized content delivery.”
• “Content that has been adequately tagged with metadata can be leveraged in usage tracking, personalization and improved searching.”

What is metadata? Another view of Dublin Core
• Subject metadata – What & Why: Subject, Description, Coverage
• Use metadata – How can it be used: Rights & Permissions
• Asset metadata – Who, Where & When: Title, Creator, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language
• Relational metadata – Links between and to: Relation
Better resource description = Better navigation & discovery
[Diagram axis labels: Difficult to Generate; Functionality]

Dublin Core is a little more complicated
Elements: 1. Identifier 2. Title 3. Creator 4. Contributor 5. Publisher 6. Subject 7. Description 8. Coverage 9. Format 10. Type 11. Date 12. Relation 13. Source 14. Rights 15. Language
Refinements: Abstract, Access rights, Alternative, Audience, Available, Bibliographic citation, Conforms to, Created, Date accepted, Date copyrighted, Date submitted, Education level, Extent, Has format, Has part, Has version, Is format of, Is part of, Is referenced by, Is replaced by, Is required by, Issued, Is version of, License, Mediator, Medium, Modified, Provenance, References, Replaces, Requires, Rights holder, Spatial, Table of contents, Temporal, Valid
Encodings: Box, DCMIType, DDC, IMT, ISO 3166, ISO 639-2, LCC, LCSH, MESH, Period, Point, RFC 1766, RFC 3066, TGN, UDC, URI, W3CDTF
Types: Collection, Dataset, Event, Image, Interactive Resource, Moving Image, Physical Object, Service, Software, Sound, Still Image, Text

Metadata is a data model – A scheme for e-Forms
Element | Namespace | Source | Purpose
Identifier | dc:identifier | System supplied | Basic accountability
Registrar | dc:creator | LDAP validated | Accountability & maintenance
Form Name | dc:title | User | Text search, results display
Form Number | dcterms:alternative | User | Text search, results display
Revision Date | dcterms:modified | User | Filter or rank search results
Agency | dc:publisher | FIPS 95-2 | Key index to retrieve & aggregate assets
Form Type | dc:type | Form Type vocabulary | Browse or group search results
Industry Code | us:naics | NAICS codes | Browse or group search results
Jurisdiction | dc:coverage | FIPS 5-2 | Browse or group search results
Purpose | us:feabrm | FEA Business Ref Model | Browse or group search results
... | … | ... |
[The label "Subject" also appears next to the Form Type row; its position in the original table is not recoverable.]
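
A scheme like this can be treated as data and used to check records before they are published. The following is a minimal sketch under stated assumptions, not the presenter's implementation: the class name `ElementSpec`, the function `missing_terms`, and the sample record are all hypothetical, and only a subset of the slide's rows is included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementSpec:
    label: str    # element name shown to form managers
    term: str     # namespace-qualified term, as written on the slide
    source: str   # where the value comes from
    purpose: str  # what the value is used for


# A subset of the rows on the slide, copied as they appear there.
EFORM_SCHEME = [
    ElementSpec("Identifier", "dc:identifier", "System supplied", "Basic accountability"),
    ElementSpec("Registrar", "dc:creator", "LDAP validated", "Accountability & maintenance"),
    ElementSpec("Form Name", "dc:title", "User", "Text search, results display"),
    ElementSpec("Form Number", "dcterms:alternative", "User", "Text search, results display"),
    ElementSpec("Revision Date", "dcterms:modified", "User", "Filter or rank search results"),
    ElementSpec("Industry Code", "us:naics", "NAICS codes", "Browse or group search results"),
    ElementSpec("Jurisdiction", "dc:coverage", "FIPS 5-2", "Browse or group search results"),
]


def missing_terms(record):
    """Return the scheme terms that the record omits or leaves empty."""
    return [spec.term for spec in EFORM_SCHEME if not record.get(spec.term)]


# Hypothetical, partially filled record (values invented for illustration).
draft = {
    "dc:identifier": "form-0001",
    "dc:title": "Example form",
    "dcterms:modified": "2005-04-11",
}
print(missing_terms(draft))
```

Keeping the scheme in one list means the same rows can drive both validation and the search interface (which elements to filter on, which to browse by).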

How is Dublin Core used in corporate environments?
[Chart not reproduced. Base: 20 corporate information managers]
Source: CEN/ISSS Workshop on Dublin Core – Guidance information for the deployment of Dublin Core metadata in Corporate Environments

Dublin Core framework for corporate use
• Not just 15 elements
• A framework to enable cross-resource exploration and use
Dublin Core is a framework for “integration metadata” at BellSouth

Agenda
• Myth #1: The Web has changed everything
• Myth #2: Taxonomies are monolithic hierarchies
• Myth #3: Literary warrant
• Myth #4: Knowledge workers

What is a taxonomy? Systematics view
Hierarchical classification of things into a tree structure
Linnaeus
• Kingdom: Animalia
• Phylum: Chordata
• Class: Mammalia
• Order: Carnivora
• Family: Canidae
• Genus: Canis
• Species: C. familiaris
UNSPSC
• Segment: 44 Office Equipment and Accessories and Supplies
• Family: .12 Office Supplies
• Class: .17 Writing Instruments
• Commodity: .05 Mechanical pencils, .06 Wooden pencils, .07 Colored pencils

Taxonomic metadata – e-Forms example
Metadata elements (column headings) and their taxonomies (values):
• Agency: 0001 Legislative, 1000 Judicial, 1100 Executive Office of Pres, 0003 Exec Depts, 1200 Agriculture, 1300 Commerce, 9700 Defense, 9100 Education, 8900 Energy, 7500 HHS, 7000 DHS, 8600 HUD, 1400 Interior, 1500 Justice, 1600 Labor, 1900 State, 6900 Transport, 2000 Treasury, 3600 Veterans, Ind Agencies, Intl Orgs
• Form Type: Application, Approval, Claim, Information request, Information submission, Instructions, Legal filing, Payment, Procurement, Renewal, Reservation, Service request, Test, Other input, Other transaction
• Industry Impact: 00 Generic, 11 Agriculture, 21 Mining, 22 Utilities, 23 Construct, 31-33 Manuf, 42 Wholesale, 44-45 Retail, 48-49 Trans, 51 Info, 52 Finance, 54 Profession, 55 Mgmt, 56 Support, 61 Education, 62 Health Care, 71 Arts, 72 Hospitality, 81 Other Services, 92 Public Admin
• Jurisdiction: Federal, State +, Local +, Other +
• BRM Impact: Citizen Srvcs, Social Srvs, Defense, Disasters, Econ Dev, Education, Energy, Env Mgmt, Law Enf, Judicial, Correctional, Health, Security, Income Sec, Intelligence, Intl Affairs, Nat Resour, Transport, Workforce, Science, Delivery, Support, Management
• Keyword Topic: Agriculture & food, Commerce, Communications, Education, Energy, Env pro, Foreign rels, Govt, Health & safety, Housing & comm dev, Labor, Law, Named grps, National def, Nat resources, Recreation, Sci & tech, Social pgms, Transport
• Audience: All, General, Citizen, Business, Govt, Employee, Native American, Nonresident, Tourist, Special group

The power of taxonomy facets
• 4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4)
  § Easier to maintain
  § Can be easier to navigate
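
The arithmetic behind the facet claim can be checked directly. This is an illustrative sketch only: the facet names and values are placeholders invented for the example, not taken from any real taxonomy.

```python
from itertools import product
from math import prod

# Four hypothetical facets with ten placeholder values each.
facets = {
    f"facet_{i}": [f"f{i}_value_{j}" for j in range(10)]
    for i in range(1, 5)
}

# Terms someone has to maintain: 10 + 10 + 10 + 10.
terms_to_maintain = sum(len(values) for values in facets.values())

# Distinct combinations a piece of content can be assigned: 10 * 10 * 10 * 10.
distinct_combinations = prod(len(values) for values in facets.values())

# Enumerating the combinations confirms the product.
enumerated = sum(1 for _ in product(*facets.values()))

print(terms_to_maintain, distinct_combinations, enumerated)
```

Forty maintained terms discriminate as finely as a single tree with 10,000 end points, which is the maintenance argument the slide makes.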

Taxonomic metadata example: Form SS-4. Employer Identification Number (EIN)
Facet | Values
Agency | IRS
Content Type | Information Submission
Industry Impact | Generic
Jurisdiction | Federal
Programs & Services | Support Delivery of Services/General Government/Taxation Management
Keyword Topic | Commerce/Employment taxes
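
Once a form carries facet values like these, faceted retrieval is a simple filter. The sketch below is an assumption-laden illustration: the facet-to-value pairing for Form SS-4 follows the slide as it appears to read, the second record is entirely hypothetical, and the helper names `matches` and `select` are invented here.

```python
def matches(value, wanted):
    """True if value equals wanted or sits below it in a '/'-separated path."""
    return value == wanted or value.startswith(wanted + "/")


def select(records, **criteria):
    """Return the records whose facet values satisfy every criterion."""
    return [
        record
        for record in records
        if all(matches(record.get(facet, ""), wanted) for facet, wanted in criteria.items())
    ]


records = [
    {
        "form": "SS-4",
        "agency": "IRS",
        "content_type": "Information Submission",
        "industry_impact": "Generic",
        "jurisdiction": "Federal",
        "programs_services": "Support Delivery of Services/General Government/Taxation Management",
        "keyword_topic": "Commerce/Employment taxes",
    },
    {
        # Hypothetical second form, invented to make the filter visible.
        "form": "EXAMPLE-1",
        "agency": "SBA",
        "content_type": "Application",
        "industry_impact": "Generic",
        "jurisdiction": "Federal",
        "programs_services": "Services for Citizens/Economic Development",
        "keyword_topic": "Commerce/Small business",
    },
]

print([r["form"] for r in select(records, keyword_topic="Commerce")])
print([r["form"] for r in select(records, agency="IRS", jurisdiction="Federal")])
```

Matching on path prefixes lets a broad term such as "Commerce" retrieve everything filed under its narrower terms.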

Methods used to create & maintain metadata
[Chart not reproduced. Base: 20 corporate information managers]
Source: CEN/ISSS Workshop on Dublin Core – Guidance information for the deployment of Dublin Core metadata in Corporate Environments

Agenda
• Myth #1: The Web has changed everything
• Myth #2: Taxonomies are monolithic hierarchies
• Myth #3: Literary warrant
• Myth #4: Knowledge workers

Literary warrant
• The “literature” on which a controlled vocabulary is based.
• The “official names” of people, organizations, events, places, and things as given in published sources.
Type of Entity | Authoritative Sources
Author names | Title page
Places | US Board on Geographic Names, National Geospatial-Intelligence Agency, ISO 3166, UN Statistics Division
Subjects | Existing literature

Why vocabulary differences are necessary
• Terminology is needed before “literature” establishes warrant.
• Categories are needed for internal purposes such as sorting, analysis, and other ad hoc groupings.
• Organizations, places, and other entities change over time.

Folksonomies: Emergent topics

Some vocabulary differences are necessary: Grouping
ISO 3166-1 | UN Code | Internal Code | Name | Official Name
AUT | 40 | 122 | Austria | Republic of Austria
BEL | 56 | 124 | Belgium | Kingdom of Belgium
DNK | 208 | 128 | Denmark | Kingdom of Denmark
FRA | 250 | 132 | France | French Republic
DEU | 276 | 134 | Germany | Federal Republic of Germany
SMR | 674 | 135 | San Marino | Republic of San Marino
ITA | 380 | 136 | Italy | Italian Republic
LUX | 442 | 137 | Luxembourg | Grand Duchy of Luxembourg
… | … | … | |
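
A crosswalk like this is easiest to use when every code system indexes the same records. The following is a small sketch using the rows shown on the slide; the type name `Country` and the index names are hypothetical, and nothing here reflects how the organization in question actually stores its codes.

```python
from typing import NamedTuple


class Country(NamedTuple):
    iso3: str           # ISO 3166-1 three-letter code
    un_code: int        # UN numeric code
    internal_code: int  # the organization's own grouping code
    name: str
    official_name: str


# Rows as listed on the slide.
COUNTRIES = [
    Country("AUT", 40, 122, "Austria", "Republic of Austria"),
    Country("BEL", 56, 124, "Belgium", "Kingdom of Belgium"),
    Country("DNK", 208, 128, "Denmark", "Kingdom of Denmark"),
    Country("FRA", 250, 132, "France", "French Republic"),
    Country("DEU", 276, 134, "Germany", "Federal Republic of Germany"),
    Country("SMR", 674, 135, "San Marino", "Republic of San Marino"),
    Country("ITA", 380, 136, "Italy", "Italian Republic"),
    Country("LUX", 442, 137, "Luxembourg", "Grand Duchy of Luxembourg"),
]

# One index per code system, all pointing at the same records.
BY_ISO3 = {c.iso3: c for c in COUNTRIES}
BY_UN = {c.un_code: c for c in COUNTRIES}
BY_INTERNAL = {c.internal_code: c for c in COUNTRIES}

# Sorting by the internal code reproduces the organization's own grouping,
# which places San Marino next to Italy rather than in alphabetical order.
internal_order = [c.name for c in sorted(COUNTRIES, key=lambda c: c.internal_code)]

print(BY_ISO3["DEU"].internal_code, BY_UN[674].name)
print(internal_order)
```

The external codes stay available for exchange with other systems while the internal code carries the local grouping.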

Some vocabulary differences are necessary: Entities change over time
Name | Part of | Effective Dates | Entity Type
Serbia and Montenegro | Europe | 2003- | Independent state
Serbia and Montenegro | Federal Republic of Yugoslavia | 1991-2003 | Republic
Yugoslavia | Europe | 1929-1991 | Independent state
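
Entities that change over time can be modeled as dated versions of a name. This is a sketch under assumptions: the three rows follow one plausible reading of the slide's flattened table (in particular, which value belongs in which column of the middle row is uncertain), and the names `EntityVersion` and `versions_in` are invented for the example.

```python
from typing import NamedTuple, Optional


class EntityVersion(NamedTuple):
    name: str
    part_of: str
    start: int           # first effective year
    end: Optional[int]   # last effective year, or None if still current
    entity_type: str


# Rows as read from the slide; the column assignment is an assumption.
VERSIONS = [
    EntityVersion("Serbia and Montenegro", "Europe", 2003, None, "Independent state"),
    EntityVersion("Serbia and Montenegro", "Federal Republic of Yugoslavia", 1991, 2003, "Republic"),
    EntityVersion("Yugoslavia", "Europe", 1929, 1991, "Independent state"),
]


def versions_in(name, year):
    """Return every version of the named entity in effect during the given year.

    Date ranges on the slide share boundary years (1991-2003 and 2003-),
    so a boundary year can legitimately return more than one version.
    """
    return [
        v
        for v in VERSIONS
        if v.name == name and v.start <= year and (v.end is None or year <= v.end)
    ]


print(versions_in("Serbia and Montenegro", 1995))
print(versions_in("Serbia and Montenegro", 2004))
```

Returning a list rather than a single record leaves the boundary-year decision to the caller instead of hiding it in the lookup.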

Sources for 7 common taxonomies
Taxonomy | Definition | Potential Sources
[missing] | Organizational structure. | FIPS 95-2, U.S. Government Manual, Your organizational structure, etc.
Content Type | Structured list of the various types of content being managed or used. | DC Types, AGLS Document Type, AAT Information Forms, Records management policy, etc.
Industry | Broad market categories such as lines of business, life events, or industry codes. | FIPS 66, SIC, NAICS, etc.
Location | Place of operations or constituencies. | FIPS 5-2, FIPS 55-3, ISO 3166, UN Statistics Div, US Postal Service, etc.
[missing] | Functions and processes performed to accomplish mission and goals. | FEA Business Reference Model, Enterprise Ontology, AAT Functions, etc.
Topic | Business topics relevant to your mission and goals. | Federal Register Thesaurus, NAL Agricultural Thesaurus, LCSH, etc.
Audience | Subset of constituents to whom a piece of content is directed or intended to be used. | GEM, ERIC Thesaurus, IEEE LOM, etc.
Products and Services | Names of products/programs & services. | ERP system, Your products and services, etc.

How is Dublin Core extended?
[Chart not reproduced. Base: 20 corporate information managers]
Source: CEN/ISSS Workshop on Dublin Core – Guidance information for the deployment of Dublin Core metadata in Corporate Environments

Business process document types: Local document type lists are commonly invented
Oil & gas services company document types
• analysis, appraisals, assessments, forecasts, predictions
• agendas, plans, designs, schedules, workflow
• applications, proposals, requests, requirements
• permits, consents, approvals, rejections, certificates
• work orders, correspondence
• auditing, compliance, testing, inspections, operations reports
• lessons learned, after-action reviews, meeting minutes, FAQs
• policies, procedures, training manuals, standards, best practices
• research notes, journal articles
• newsletters, bulletins, press releases
• ads, brochures, data sheets, technical notes, case studies, price lists
• checklists, templates, forms, logos, branding
• software, database forms

What controlled vocabularies are being used?
[Chart not reproduced. Base: 20 corporate information managers. Recovered chart label: Language Codes]
Source: CEN/ISSS Workshop on Dublin Core – Guidance information for the deployment of Dublin Core metadata in Corporate Environments

Agenda
• Myth #1: The Web has changed everything
• Myth #2: Taxonomies are monolithic hierarchies
• Myth #3: Literary warrant
• Myth #4: Knowledge workers

Knowledge workers spend up to 2.5 hours each day looking for information … but find what they are looking for only 40% of the time. (Kit Sims Taylor)

Knowledge workers spend more time re-creating existing content than creating new content
[Chart values: 26% (re-creating existing content) vs. 9% (creating new content)] (Kit Sims Taylor)

High cost of not finding information
• “The amount of time wasted in futile searching for vital information is enormous, leading to staggering costs …” (Sue Feldman)

High cost of poor classification
• Poor classification costs a 10,000 user organization $10M each year, about $1,000 per employee. (Jakob Nielsen, useit.com)

Opportunities and challenges
• 80% of enterprise data is unstructured.
• Outputs from back office systems are documents: queries & reports.
• Avoiding unnecessary re-creation of content.
• Enabling decision-making transparency.
• Promulgating policies & guidelines.
• Managing intellectual property.
• Supporting products & services throughout their life cycle: development, marketing, sales & support.

Productivity, loyalty, and revenue have provided the ROI

Intranet has provided the best ROI
• Intranet
• Web/online customer sales
• Web dev infrastructure
• Web/online business sales
• Middleware to link Web to ERP
• Extranet/supply chain
• ebilling/payment systems
• Wireless Web access
• e-marketplace/portal
• None

Taxonomy Strategies LLC
Joseph A. Busch
+ 415-377-7912
jbusch@taxonomystrategies.com
http://ww.taxonomystrategies.com
April 11, 2005
Copyright 2005 Taxonomy Strategies LLC. All rights reserved.