  • Number of slides: 47

Taxonomy Strategies LLC
Tagging, Interfaces & Content Organization Infrastructures
Joseph A Busch, Principal
November 1, 2006
Copyright 2006 Taxonomy Strategies LLC. All rights reserved.

Who I am
• Over 25 years in the business of organized information
  - Founder & Principal, Taxonomy Strategies
  - Director, Solutions Architecture, Interwoven
  - VP, Infoware, Metacode Technologies
  - Program Manager, Getty Foundation
  - Manager, Pricewaterhouse
  - Assistant Director for Technical Services, Hampshire College
  - Chief, Technical Services, Paul Weiss Rifkind Wharton & Garrison
• Metadata & taxonomies community leadership
  - President, American Society for Information Science & Technology
  - Trustee, Dublin Core Metadata Initiative
  - Co-Founder, Networked Knowledge Organization Systems/Services
  - Adviser, National Research Council Computer Science and Telecommunications Board
  - Reviewer, National Science Foundation Division of Information and Intelligent Systems
TAXONOMY STRATEGIES LLC The business of organized information

Recent & current projects
• Government
• Commercial
• Not-for-Profit

What I do
Organize Stuff

For us, taxonomy work includes:
• Metadata specification defines the properties needed to describe content so that it can be found & used.
• Vocabularies are collections of terms that are used to specify some of the metadata properties.
  - Some vocabularies are big and hierarchical, some are small and flat.
• An application profile specifies what metadata & vocabularies are required, and then represents them formally.

Seven phases of taxonomy and metadata design
1 Identify Objectives: Conduct interviews
2 Inventory Content: ID sources, spider assets & extract metadata
3 Specify Metadata: Define fields & purpose
4 Model Content: Define content chunks & XML DTDs
5 Specify Vocabularies: Compile controlled vocabularies
6 Specify Procedures: Develop workflow, rules & procedures
7 Train Staff: Develop materials & train staff

Use metadata to support core purposes
• Metadata can be used to provide enough information for any user, tool, or program to find out everything needed to find and apply any piece of content.
  - Subject metadata (What, Where & Why): Subject, Type, Coverage
  - Use metadata (When & How): Date, Language, Rights
  - Asset metadata (Who): Identifier, Creator, Title, Description, Publisher, Format, Contributor
  - Relational metadata (Links between and to): Source, Relation
[Figure axes: Complexity; Enabled Functionality]
http://dublincore.org/documents/dces/
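The grouping above can be sketched in a few lines of Python. This is only an illustration: the element names are the fifteen Dublin Core elements listed on the slide, while the function names `group_record` and `missing_elements` and the flat-dictionary record shape are assumptions made for the example.

```python
# The four metadata groups and their Dublin Core elements, as listed on the slide.
METADATA_GROUPS = {
    "subject": ["subject", "type", "coverage"],
    "use": ["date", "language", "rights"],
    "asset": ["identifier", "creator", "title", "description",
              "publisher", "format", "contributor"],
    "relational": ["source", "relation"],
}


def group_record(record):
    """Split a flat metadata record into the four groups (hypothetical helper).

    Keys that are not Dublin Core elements are ignored.
    """
    grouped = {group: {} for group in METADATA_GROUPS}
    for group, elements in METADATA_GROUPS.items():
        for element in elements:
            if element in record:
                grouped[group][element] = record[element]
    return grouped


def missing_elements(record):
    """Return the Dublin Core elements the record has not filled in yet."""
    return sorted(
        element
        for elements in METADATA_GROUPS.values()
        for element in elements
        if element not in record
    )
```

Calling `missing_elements` on a partially tagged record gives a tagging interface a simple list of fields still to be filled in.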

Use metadata to support core purposes: benefits
• Better navigation & discovery
• More efficient editorial process

Agenda
• Tagging
• Interface
• Content Organization

Tagging Overview
• Tagging is better than relying on the words that happen to occur in a piece of content.
• All tagging is useful
  - End user tagging
  - Tagging by librarians
  - Automated tagging by OS and algorithms
• Content should be tagged throughout its lifecycle, each time the content is handled and used, so that it either accrues value or its significance is diminished.

MS Office: File Properties
How many people fill this in?

Flickr: Organize
How many people click on this?

Four Tagging Rules
• Use specific terms: Apply the most specific terms when tagging content. Specific terms can always be generalized, but generic terms cannot be specialized.
• Use multiple terms: Use as many terms as necessary to describe What the content is about & Why it is important.
• Use appropriate terms: Only fill in the facets & values that make sense. Not all facets apply to all content.
• Consider how content will be used: Anticipate how the content will be searched for in the future, & how to make it easy to find. Remember that search engines can only operate on explicit information.
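The first rule can be illustrated with a small sketch: given a broader-term mapping, a specific tag can be expanded to its more general ancestors automatically, whereas nothing can recover the specific term from a generic one. The `BROADER` mapping and the `generalize` function are hypothetical, and the three-level hierarchy is an assumed example.

```python
# Hypothetical broader-term (BT) mapping: each term points to its parent.
BROADER = {
    "LCD Monitors": "Monitors",
    "Monitors": "Peripherals",
}


def generalize(term, broader=BROADER):
    """Return the term followed by each of its broader terms, most specific first."""
    chain = [term]
    while chain[-1] in broader:
        chain.append(broader[chain[-1]])
    return chain
```

Content tagged "LCD Monitors" can therefore also be found under "Monitors" and "Peripherals", while content tagged only "Peripherals" carries no information about which kind of peripheral it describes.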

Agenda
• Tagging
• Interface
• Content Organization

Requirements for a tagging interface
• Automated form fill-in (automatically fills in known data)
• Tagging precedents (see tags already assigned by others)
• Controlled vocabularies, e.g., with pull-down list
• Multi-valued tags
• Geo-tagging
• Group tagging
• Clean-up tag tools, e.g., alpha list
• Batch editing
• Share/Don’t share (Public/Private)
• Identified owner (who can be emailed)
• Almost immediate feedback, e.g., tag cloud

Form fill-in: Automatically filled-in known data

Form fill-in: Automatically filled-in known data
• Manual form fill-in w/ check boxes, pull-down lists, etc.
• Auto keyword & summarization

Form fill-in: Automatically filled-in known data
• Auto-categorization
• Rules & pattern matching
• Parse & lookup (recognize names)

Tagging precedents: See tags assigned by others

Multi-valued group tagging

Group geo-tagging

Group geo-tagging

Clean up tag tools: Alpha list

Batch edit

Share or don’t share tagging

Bulk Tagging
• ID collection of related content items by pattern or context
• Then, apply same attributes to all content items
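A minimal sketch of the two steps above, assuming content items are dictionaries with a `name` key and that "pattern" means a shell-style filename pattern. Both are assumptions for illustration, and `bulk_tag` is a hypothetical helper, not part of any product shown in the deck.

```python
from fnmatch import fnmatchcase


def bulk_tag(items, pattern, attributes):
    """Apply the same attributes to every item whose name matches the pattern.

    Returns the number of items that were tagged.
    """
    tagged = 0
    for item in items:
        # Step 1: identify related items by pattern (case-sensitive match).
        if fnmatchcase(item["name"], pattern):
            # Step 2: apply the same attributes to each matching item.
            item.setdefault("tags", {}).update(attributes)
            tagged += 1
    return tagged
```

For example, `bulk_tag(items, "invoice-*.pdf", {"doc_type": "Invoices"})` would tag every matching invoice in one pass and leave other items untouched.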

Tag a folder
• Drag & drop content items into folder
• Then, content items inherit properties of folder
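Folder inheritance can be sketched as a dictionary merge. The slide does not say what happens when an item already carries a value for an inherited property; the sketch below assumes the item's own value wins, which is only one possible policy. The function name `drop_into_folder` is hypothetical.

```python
def drop_into_folder(item_tags, folder_tags):
    """Return the item's tags after it is dropped into a tagged folder.

    Assumption: properties already set on the item override inherited ones.
    Neither input dictionary is modified.
    """
    merged = dict(folder_tags)   # start from the folder's properties
    merged.update(item_tags)     # keep anything the item already specified
    return merged
```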

Workflow
• Approve & improve mindset
Create Content -> Add Metadata -> Review & Improve -> Publish -> Review & Improve

Interactive rewards
• Almost instantaneous exposure of tags in simple user interfaces on the web provides positive reinforcement for user tagging that simply did not exist before.
• For example,
  - Most popular
  - Tag clouds
  - Alerts

Most popular
• Another example is a "most emailed" list, e.g., from the NY Times.

Tag cloud
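A tag cloud is cheap to compute, which is part of why it can give almost immediate feedback. The sketch below counts how often each tag is used and scales the count linearly to a font size. The function name, the size range, and the linear scaling are assumptions for illustration; real tag clouds often use logarithmic scaling.

```python
from collections import Counter


def tag_cloud(tag_lists, min_size=10, max_size=32):
    """Map each tag to a font size proportional to how often it is used.

    tag_lists is an iterable of tag lists, one list per content item.
    """
    counts = Counter(tag for tags in tag_lists for tag in tags)
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts match
    return {
        tag: min_size + (count - lowest) * (max_size - min_size) // span
        for tag, count in counts.items()
    }
```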

Alerts
• New (content selected by date)
• Subscriptions (content selected by tags)
• Interest (content selected by other people)
• Individual (content selected for you by other people)
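The first two alert types reduce to simple filters over tagged content. The sketch below assumes each content item is a dictionary with `date` and `tags` keys; that shape and the function names are hypothetical.

```python
from datetime import date


def new_items(items, since):
    """'New' alert: content selected by date (on or after `since`)."""
    return [item for item in items if item["date"] >= since]


def subscribed_items(items, subscribed_tags):
    """'Subscriptions' alert: content carrying at least one subscribed tag."""
    wanted = set(subscribed_tags)
    return [item for item in items if wanted & set(item["tags"])]


# Illustrative data only.
EXAMPLE_ITEMS = [
    {"title": "Q3 forecast", "date": date(2006, 10, 2), "tags": ["Finance"]},
    {"title": "Safety memo", "date": date(2006, 11, 1), "tags": ["Legal", "Environment"]},
]
```

The "Interest" and "Individual" alert types depend on what other people select, so they would need usage data beyond the tags themselves.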

Agenda
• Tagging
• Interface
• Content Organization

Content organization models: The Information Architect
• Saul Wurman’s 5 ways to categorize things
  - By location (spatially)
  - By alphabet (alphabetically)
  - By time (chronologically)
  - By category (subject)
  - By hierarchy (BT/NT, etc.)
Richard Saul Wurman. Information Architects (1996)

Content organization models: The Records Manager
• Archives & business records
  - By function (business purpose)
  - By genre (document type)
Brands & Varieties, Events, Ingredients, Locations, Nutrients, Organizations
Functions: Accounting, Administration, Environment, Finance, Human Resources, Legal, Marketing & Sales, Plant Operations, Projects, Public Relations, Research & Development, Tax, Treasury
Doc Types: Account Listings, Acquisitions, Cash Disbursements, Cash Receipts, Contract Accounting Records, Credit Advices, Credit Card Charges, Donations, Employee Expense Reports, Invoices, Petty Cash Records, Permits & Licenses, Plans & Forecasts, Royalty Payments, Sales Receipts

Content organization models: The Product Manager
• Management (for general business operational purposes)
  - By products and services
Systems, Peripherals, Services, Support, My Account
Handhelds, Monitors, Printers, Projectors, TVs
CRT Monitors, LCD Monitors
All-in-One & Photo Printers, B/W & Multifunction Laser Printers, Color Laser Printers, Ink & Printer Accessories
LCD TVs, Plasma TVs
Parts, All Electronics & Accessories, Desktop Accessories, Notebook Accessories, Digital Photography, Handhelds, Memory, Monitors, MP3 Players, Networking, Power, Printers & Ink, Projectors, Software & Games, Storage & Drives, TVs & Home Theater

Content organization models: Marketer
• Marketing & sales
  - By psychosocial profiles such as lifestyle stages, personas, etc.
  - By industry
  - By location
Audience: Age Group, Aisles, Business, Consumer, Financial Risk, Service Standard
Intention: Inquiry, Research, Support, Upgrade
Lifecycle: Pre-Sales, Early Life, Purchase Experience & Sales Process, Set Up / Installation, Billing Experience, Support, Retain & Renew
Industry: Construction & Building, Field Services, Finance & Insurance, Financial Services, Government, Healthcare, Higher Education, Hospitality Services, Insurance, K-12 Education, Manufacturing, Professional Services, Real Estate, Retail, Transportation & Distribution
Location: Regions, ZIP Code

Content organization models: Editor
• Editorial
  - By content lifecycle
Social Aspects of Digital Libraries: Final Workshop Report (Nov 1996) http://is.gseis.ucla.edu/research/dl/UCLA_DL_Report.doc

Faceted taxonomy theory & practice
• How many terms are needed to provide sufficient granularity?
  - Not as many as you think
• Post-coordinate indexing allows several simple controlled vocabularies to be combined, rather than using a single large pre-coordinated vocabulary.
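Post-coordination means the combination happens at search time: each item is tagged independently in each small vocabulary, and a query intersects the facets it cares about. A minimal sketch follows, assuming items store their tags in a `facets` dictionary; the `search` function and the sample data are illustrative assumptions.

```python
def search(items, **facets):
    """Return items matching every requested facet value (post-coordinate query)."""
    return [
        item
        for item in items
        if all(
            value in item["facets"].get(facet, ())
            for facet, value in facets.items()
        )
    ]


# Illustrative data only: each facet is a small, independent vocabulary.
EXAMPLE_ITEMS = [
    {"title": "Sun safety for schools",
     "facets": {"audience": ["Kids", "Students"], "health": ["Sun Protection"]}},
    {"title": "Ozone monitoring methods",
     "facets": {"audience": ["Researchers & Scientists"], "substance": ["Ozone"]}},
]
```

No term such as "Sun protection for kids" has to exist in any vocabulary; `search(EXAMPLE_ITEMS, audience="Kids", health="Sun Protection")` builds that combination on demand.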

The power of faceted taxonomy
• 4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4)
  - Easier to maintain
  - Easier to tag by content authors
  - Can be easier to navigate
Audience: Advocacy, Contractors & Grantees, Environmental Professionals, Federal Facilities, General Public, Industry, Kids, Researchers & Scientists, Small Business, Students
Health: Advisory, Exposure, Food Safety, Health Assessment, Health Effect, Health Risk, Occupational Health, Pesticide Effects, Sun Protection, Toxicity
Industry: Agriculture & Cattle, Automobile Repair, Chemical, Dry Cleaning, Electronics & Computer, Energy, Extractive Industries, Food Processing, Leather Tanning & Finishing, Metal Finishing
Substance: Allergen, Biological Contaminant, Carcinogen, Chemical, Explosive, Liquid Waste, Microorganism, Ozone, Pesticide, Radioactive Waste

Impact on collection size by increasing number of terms per facet
# Docs/Category   # Facets   # Terms/Facet   Max Collection Size   # Post-coord combos
20                4          10              200,000               10,000
20                4          20              3,200,000             160,000
20                4          30              16,200,000            810,000
20                4          40              51,200,000            2,560,000
20                4          50              125,000,000           6,250,000

Impact on collection size by increasing number of facets
# Docs/Category   # Terms/Facet   # Facets   Max Collection Size   # Post-coord combos
20                10              4          200,000               10,000
20                10              5          2,000,000             100,000
20                10              6          20,000,000            1,000,000
20                10              7          200,000,000           10,000,000
20                10              8          2,000,000,000         100,000,000
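The figures in both tables follow from two small formulas: the number of post-coordinate combinations is the number of terms per facet raised to the number of facets, and the maximum collection size is that number times the documents per category. The sketch below assumes every facet has the same number of terms, as the tables do; the function names are hypothetical.

```python
def post_coord_combos(facets, terms_per_facet):
    """Number of distinct tag combinations when one term is chosen per facet."""
    return terms_per_facet ** facets


def max_collection_size(docs_per_category, facets, terms_per_facet):
    """Largest collection that keeps each combination at the given size."""
    return docs_per_category * post_coord_combos(facets, terms_per_facet)


# Vary terms per facet (4 facets), then vary the number of facets (10 terms each).
for terms in (10, 20, 30, 40, 50):
    print(4, terms, post_coord_combos(4, terms), max_collection_size(20, 4, terms))
for facets in (4, 5, 6, 7, 8):
    print(facets, 10, post_coord_combos(facets, 10), max_collection_size(20, facets, 10))
```

Adding a facet multiplies capacity by the size of the new vocabulary, while adding terms to existing facets grows it polynomially, which is why a handful of small vocabularies goes a long way.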

Sources for 7 common taxonomies
• Organizational structure. Potential sources: FIPS 95-2, U.S. Government Manual, your organizational structure, etc.
• Content Type: Structured list of the various types of content being managed or used. Potential sources: DC Types, AGLS Document Type, AAT Information Forms, records management policy, etc.
• Industry: Broad market categories such as lines of business, life events, or industry codes. Potential sources: FIPS 66, SIC, NAICS, etc.
• Location: Place of operations or constituencies. Potential sources: FIPS 5-2, FIPS 55-3, ISO 3166, UN Statistics Div, US Postal Service, etc.
• Functions and processes performed to accomplish mission and goals. Potential sources: FEA Business Reference Model, Enterprise Ontology, AAT Functions, etc.
• Topic: Business topics relevant to your mission and goals. Potential sources: Federal Register Thesaurus, NAL Agricultural Thesaurus, LCSH, etc.
• Audience: Subset of constituents to whom a piece of content is directed or intended to be used. Potential sources: GEM, ERIC Thesaurus, IEEE LOM, etc.
• Products and Services: Names of products/programs & services. Potential sources: ERP system, your products and services, etc.

Faceted tagging
• How well can end users (content authors) do this?
  - Incentives help, such as almost instantaneous feedback (AIF)
  - Importance of workflow (new slide?)
    - Tagging & re-tagging throughout content life cycle
    - Show graphic of content lifecycle (from UCLA NSF workshop?)
  - Approve & improve mindset
  - Test & improve

Summary
• There are lessons to be learned from web tagging about how to get good metadata in document and content management applications.
• Document and content management system tagging must be simple, and it must make relevant work products easier to find almost instantaneously.

Taxonomy Strategies LLC
Questions?
Joseph A. Busch
415-377-7912, jbusch@taxonomystrategies.com
November 1, 2006
Copyright 2006 Taxonomy Strategies LLC. All rights reserved.

Tagging Overview
• Tagging, any kind of tagging, is better than relying on the words that happen to occur in a piece of content. End user tagging is useful, so is tagging by librarians, as are tags automatically assigned by operating systems and language processing algorithms. Content should be tagged throughout its lifecycle, each time the content is handled and used, so that it either accrues value or its significance is diminished.
• Almost instantaneous exposure of tags in simple user interfaces on the web provides positive reinforcement for user tagging that simply did not exist before. It should not be surprising that a good user interface improves usability.
• As content users flock to websites that help to organize the content on the web, advertisements and value-added content services follow. The bottleneck in the semantic web has been a shortage of tagged content. The end user tagging revolution may begin to address this shortcoming.
• There are lessons to be learned from web tagging about how to get good metadata in document and content management applications. Document and content management system tagging must be simple, and it must make relevant work products easier to find almost instantaneously.