- Number of slides: 50
Expanding Big Data Science: Forward & Backward
C. Randall (Randy) Howard, Ph.D., PMP
Big Data Scientist, Thought Leader, Systems Innovation Analyst, Solutions Architect
Sr. Data Scientist, Novetta Solutions
Adjunct Professor, Mason’s Volgenau School of Engineering
choward@gmu.edu
http://www.crhphdconsulting.net/
May 20, 2014
April 4, 2013 Technology Trends, Big Data and Data-Driven Decisions
C. Randall (Randy) Howard, Ph.D., PMP
§ Senior Data Scientist, Novetta Solutions
§ Adjunct Professor, Volgenau School of Engineering, GMU
  o Big Data Overview
  o Systems Analysis & Design Determining Needs in Big Data
  o Big Data, Small Details & Time (Metadata)
§ 2013 Teaching Excellence Award Nominee
§ Co-Organizer of Big Data Lecture Series, EIT Award Nominee
§ Member, Data Science Working Groups & Sub-teams
§ International Author & Speaker
§ 30 years IT & systems engineering, architecture, troubleshooting, change & innovation
§ Ph.D., Information Technology, GMU
§ BS, MS: Information Systems, VCU
Agenda
§ Context: What is Big Data All About?
§ Forward: Considering Multiple Perspectives
§ Backward: Refactor/Repurpose Legacy Approaches
Context: What is Big Data Science All About?
Context of Material
§ How was the big data collected?
  o Empirical Observations & Applications
  o Critical Thinking
§ Where is it stored?
  o Case Studies
  o Feverishly Codifying
  o Move from Rescuing to Preventing
§ What are the results?
  o Clarifying and Connecting Disparate, Contentious Pieces
  o Still Working…
My Positions on Big Data
§ Big Data Science
  o Big Data: Problem & Opportunity Space
  o Data Science: Potential Solution Discipline
  o Big Data Science: “Applying Data Science to Big Data”
§ Technology “Reboot” CAN Usher in New Generation of Capabilities
  o Big Data Today
  o New “Big Data” Tomorrow
§ Must Clarify Business Value
§ Have To Think Horizontally & Corporately
§ But, I am a professor…
§ Heresy Now? Genius Tomorrow?
IT Disasters & Dilemmas: Possible w/ Big Data? [IT-Failures]
Disasters
• FBI’s Trilogy / Virtual Case File*: $170M, scrapped
• UK Inland Revenue*: $3.5B, software errors
• NSA Trailblazer*: $1.2B, over-budget, ineffective boondoggle, 7-yr
• Ford’s Purchasing System*: $400M, abandoned
• Obama Care?
Dilemmas
• Economic Winter (Do more w/ Less)
• What is it? Exactly?
My Big Concern!! [Aiken] [Gartner]
§ Technology Trigger: A potential technology breakthrough kicks things off. Early proof-of-concept stories and media interest trigger significant publicity. Often no usable products exist and commercial viability is unproven.
§ Peak of Inflated Expectations: Early publicity produces a number of success stories, often accompanied by scores of failures. Some companies take action; many do not.
§ Trough of Disillusionment: Interest wanes as experiments and implementations fail to deliver. Producers of the technology shake out or fail. Investments continue only if the surviving providers improve their products to the satisfaction of early adopters.
§ Slope of Enlightenment: More instances of how the technology can benefit the enterprise start to crystallize and become more widely understood. Second- and third-generation products appear from technology providers. More enterprises fund pilots; conservative companies remain cautious.
§ Plateau of Productivity: Mainstream adoption starts to take off. Criteria for assessing provider viability are more clearly defined. The technology’s broad market applicability and relevance are clearly paying off.
§ Curve of Complacency: Early successes satisfy stakeholders that the problem or opportunity is handled, and it is time to move on to the next issue. Meanwhile the Plateau of Productivity that is achieved is much lower. [crh] Dr. C. Randall Howard, PMP (Not a position of Gartner or Dr. Aiken, yet)
Big Data & Data Science “1-Page Summary”
§ Big Data “V”s[IBM]:
  o Volume (How much in total)
  o Variety (How many sources)
  o Velocity (How fast does it come in)
  o Veracity, Variability, Complexity, etc.[various]
§ Increases in “Hard” Data Science[various]
  o Math, Science, Analytics
  o Data-Driven Organizations
  o Creating data products
  o Looking to the future
§ “Soft” Data Science? (Hold on)
[Figure, notional depiction: Data V’s Creation & Collection Capabilities (Sensors, Social Media, Mobile Data) vs. Processing & Analytical Capabilities over Time; capability gaps due to surges in data collections] [Conway]
Soft Data Science [crh]
(Changing Term to Tacit Data Science, but that’s another talk)
§ Hardening the “Soft”
  • Automate “Hard-to-Automate”
  • Predictable
  • To-be Performed by Many
§ Shrink the Capability Gap
[Figure, notional depiction: Data V’s Creation & Collection Capabilities vs. Processing & Analytical Capabilities over Time, w/ “Hard” & “Soft” Data Science and w/ “Hard” Data Science Alone; “Soft Head Start”]
• Backlogs increase exponentially
• Signals become noise
• “Action” windows lost / missed
• We become bottlenecks to partners
• Notoriety to date
• Performed by a few
• Bottlenecked by a few?
Big Data Science Value Parameters
§ Increased Actionable Intelligence
§ Trends Noticed / Confirmed
§ Leverage Unstructured
§ Faster Knowledge / Awareness / Ability to Search Data
§ Flexibility / Extensibility of Data Utilization
§ New, More Adaptable HW/SW Acquisition Models
§ More TBD
Other Big Data Considerations
§ Capabilities Have Their Own Separate ROI’s
§ Process Data w/in Acceptable Tolerances:
  o Time
  o Errors
  o Accuracy
  o Reliability
  o Etc.
§ Accountability: Find Critical Intelligence & Make Time Windows
§ Thus, Big Data Is “Having more data than you can process and manage within acceptable tolerances (e.g. time, quality, cost)”[crh]
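The tolerance-based definition above can be sketched as a simple check. This is only an illustration, not part of the original talk: the class names, field names, and the choice of three tolerances (time, quality, cost) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Tolerances:
    """Hypothetical acceptable limits for a data-processing workload."""
    max_hours: float        # acceptable processing time
    max_error_rate: float   # acceptable fraction of bad results (quality)
    max_cost: float         # acceptable spend


@dataclass
class Workload:
    """Hypothetical measured behavior of the same workload."""
    hours: float
    error_rate: float
    cost: float


def exceeded_tolerances(w: Workload, t: Tolerances) -> list[str]:
    """Return the names of the tolerances the workload exceeds."""
    exceeded = []
    if w.hours > t.max_hours:
        exceeded.append("time")
    if w.error_rate > t.max_error_rate:
        exceeded.append("quality")
    if w.cost > t.max_cost:
        exceeded.append("cost")
    return exceeded


def is_big_data(w: Workload, t: Tolerances) -> bool:
    """Under the slide's definition, data is 'big' once any tolerance is exceeded."""
    return bool(exceeded_tolerances(w, t))
```

Under this reading, the same data set can be "big" for one organization and not for another, because the tolerances, not the raw volume, decide the outcome.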
Forward: Considering Multiple Perspectives
Engineering Objective
• Integrate, and automate, vital stakeholders' perspectives throughout the engineering life-cycle
Enabling Mechanisms
• Wicked Problems
• Learning Organizations
• Changing Culture
• Education
BDLS: A Broader Look
Big Data Science
§ Each channel is difficult
§ Each complements the other
§ Complexities are compounded exponentially in cross-sections
Multiple Perspectives in Publications
§ Multi-disciplinary[Gartner-ERDS] teams[Patil]: a “broad sample of the population” & involves “teams that frequently partner w/ diverse roles in an organization… to gather, organize, & make use of their data”[EMC-DS]
§ “Wetware[Gleichauf]” (vs. HW & SW): “People, their skillsets, corporate policies, & organizational structures that define our analytic communities”
§ Soft Skills[Gartner-ERDS]:
  o Communication
  o Collaboration
  o Leadership
  o Creativity
  o Discipline
  o Passion
§ Data Scientist can be invaluable…unique combination of
Data Science Teams
§ Data Science Teams[Patil]
  o Small-team members should sit close to each other
  o Mix of skill-sets, some experts, some not
  o Train people to fish
  o Functional areas must stay in regular contact and communication.
§ Impediments
  o Measuring Performance: Rewarding & Disciplining Teams vs. Individuals
  o Sharing Intellectual Property w/ Integrated Product Teams (esp. cross-vendor)
  o “Expert Teams”??
§ “Expert Teams”
  o May find Big Data Science trivial
  o Typically
    • Have more control over their environment
    • Don’t need to have the masses engaged
But …
Life-Cycle Service Orchestration ? Legal Review Life Cycle Acquisition (FAR) OODA Loop
Classroom Exercise Findings
Wicked Problems
Wicked Problems Tip-off Words[Nixon]
Networked Joint Integrated Shared Multi-organizational Interoperable Coalition Cross-organizational Virtual Community Combined
Big Data is a Wicked Problem!
Wicked Problems[Nixon]
§ Requires Multiple Stakeholders’ Perspectives
§ Key Driver: Social Complexity from Integrated Networks
§ Traditional linear solution styles are not well suited
§ Needs focus on:
  o Social Aspects
  o Gaining Shared Understanding
  o Try Things
  o Let Solution Emerge From Cycle of Adaptation
§ Thus[crh],
  o Multiple Perspectives Involves Collaboration
  o Collaboration Technologies MUST BE INNOVATED
Sample Collaboration Innovation[InnovationGames]
Sample Collaboration Innovation[InnovationGames] http://innovationgames.com/
Learning Organizations
Learning Organization [Senge]
§ Peter Senge (http://www.infed.org/thinkers/senge.htm)
  o Studied how adaptive capabilities developed
  o The Fifth Discipline (1990): ‘Learning Organization' (LO)
§ Basic Learning Organization Disciplines:
  o Systems Thinking
  o Personal Mastery
  o Mental Models
  o Building Shared Vision
  o Team Learning
Learning Organizations’ Disciplines
Discipline: Explanation
§ Systems Thinking:
  • Cannot understand the parts until you understand the whole[Aiken]
  • Balance
    • Theory w/ Data[Barbara’]
    • Ideas w/ Tools[Sagan]
§ System Maps: Diagrams that show key elements of systems and how they connect. You may have heard them called Landscape or Ecosystem
§ Personal Mastery: Clarify & deepen our personal vision…of seeing reality objectively OR Know yourself
§ Mental Models: Carry on ‘learningful’ conversations that:
  • Expose our internal pictures of the world & hold them up to scrutiny
  • Balance inquiry and advocacy, where people share their thoughts.
  OR Express Yourself
§ Building Shared Vision: Capacity to hold a shared picture of the future we seek to create
Changing Culture
Culture Obstacles[econBD]
Changing Culture
§ Examples:
  o Hard-drives
  o Management Visibility of Data Processing
  o Target’s former CEO?
§ Leadership needs to foster a culture of:
  o Increased curiosity about data
  o Rewarding experimentation
  o Counting “Assists”
§ Need ‘democratization’, or open-access, of data[Patil]
  o Or Horizontal Orientation / Governance of Data[crh]
§ Not trivial - Sharing data exposes risks of:
  o Misinterpretation
  o Loss of “credit” associated with results from the data
Education
Education
§ Establish a new baseline of knowledge to advance
§ Mason’s Big Data Lecture Series Purpose:
  o Separate Hype from Reality
  o Have marquee experts expose what in Big Data:
    • Is really working and making a difference?
    • Shows promise?
    • Has failed?
    • Needs another try?
    • Are the impediments?
  o Convey that the daunting challenge is feasible, but still a challenge
Big Data Adoption [IBM-Analytics]
Learning Revolution [Robinson]
§ Big Data Science is a REVOLUTION that starts (& continues) w/ LEARNING
  o Requires new skills
  o New leadership models
§ http://www.ted.com/talks/sir_ken_robinson_bring_on_the_revolution.html
Backward: Refactor / Repurpose Legacy Approaches
Engineering Objective
• Re-invigorate, introduce to leverage or adjust features of existing approaches to address current needs
Enabling Mechanisms
• “Business 101” Principles
• Structured Systems Analysis
• Enterprise Architecture
• Strategic Execution
What is Legacy?
§ What “brought us here”
  o Business Basics (e.g., Planning, ROI)
  o Structured Systems Analysis (e.g., Waterfall methodology, CMMI)
§ Yes,
  o Very Cumbersome
  o Have Failed too
  But…
  o Developed by Very Smart People
  o For Very Similar Issues
  o Been “Tested”
  So…
  o Re-invent the Wheel?
§ To leverage:
  o Consider Context: Intent & Issues
  o Re-calibrate / Re-factor For Today
  o Come Back to “Common Sense”, What Works
§ Examples:
  o Meeting Management
  o Scaled Agile
Enterprise Architecture
§ “Process of translating business vision and strategy into effective enterprise change by creating, communicating and improving the key requirements, principles and models that describe the enterprise's future state and enable its evolution.”[Gartner-EA]
§ Short: Simple Structure & Alignment of Technical & Business Capabilities
So…
§ Take “Business Back to IT”[crh]
§ Maintain Line-of-Sight to Value[crh]
§ Focus on the Mission and Mission Capabilities!
Capability Dependencies Hierarchy
Example: Tool x requires staff time for training & learning
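A capability dependency hierarchy like the example above can be sketched as a small dependency graph. The capability names below are hypothetical, chosen only to mirror the "Tool x requires staff time for training & learning" example; the ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical capabilities: each key depends on the capabilities in its set.
dependencies = {
    "use_tool_x": {"staff_training"},   # the tool is only useful once staff are trained
    "staff_training": {"staff_time"},   # training requires staff time to be set aside
    "staff_time": set(),                # no further prerequisites in this sketch
}


def build_order(deps: dict[str, set[str]]) -> list[str]:
    """Return capabilities in an order where prerequisites come first."""
    return list(TopologicalSorter(deps).static_order())
```

Calling `build_order(dependencies)` lists staff time first and the tool last, which makes the hidden prerequisites of a tool purchase explicit. A circular dependency would raise `graphlib.CycleError` rather than produce an order.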
Strategic Planning Survey[Bain]
§ 14-year Compilation of:
  o 11 Surveys
  o 8,504 respondents
2006: 88%, 3.93
Strategy to Tactics Line-of-Sight[crh]
§ Establish Enterprise-wide Decision Criteria
§ Convey & Carry Commander’s Intent to Execution Levels
Engineering “Risky Art” Landscape
• Most impactful, hardest to tame, most ignored
• Least concrete, hardest to sell / prove
• Needs the most “innovation attention”
A Big Data Systems Analysis & Engineering “Success” Story
Lots of ways to do this. Lots of requirements. Lots of ways to get requirements across lots of different stakeholders
Users
Big Data Lecture Series Session 4: Solving the Risk Equation, Fall 2012
Big Data Systems Analysis & Engineering “So-What”
Wrapup
Big Data Science Postulates[crh]
§ If Big Data Science is not a technology problem, then let’s focus on the PROBLEM: the non-technology side, or the human-side.
§ We must perfect the blending of disciplines to educate & train on Big Data Science (vs. perfecting specific disciplines)
§ Doing what you are doing will not get you out of the fix you are in since it got you in the fix in the first place: innovate and improve!
§ Our Big Data Science, Analytics & Intelligence is an ENVIRONMENT and a SYSTEM, not an APP
Big Data / Data Science Postulates (cont’d)
Final Quiz: Where do we start? LEARNING!
One last time… How did we do?
References
§ [1000v] URL: http://www.1000ventures.com/design_elements/selfmade/quaity_cost-4components_6x4.png
§ [Aiken] Dr. Peter Aiken, Data Blueprint, 2012-2013
§ [arcweb] http://www.arcweb.com/events/arc-orlando-forum/pages/analytics-for-industry.aspx
§ [asq] URL: http://asq.org/learn-about-quality/cost-of-quality/overview/read-more.html
§ [Bain] http://www.bain.com/management_tools_and_trends_2007.pdf
§ [Barbara’] Dr. Daniel Barbara’, George Mason University, 2012 Big Data Lecture Series
§ [Batni] Carlo Batini, Cinzia Cappiello, Chiara Francalanci, and Andrea Maurino. 2009. Methodologies for data quality assessment and improvement. ACM Comput. Surv. 41, 3, Article 16 (July 2009), 52 pages. DOI=10.1145/1541880.1541883 http://doi.acm.org/10.1145/1541880.1541883
§ [Conway] http://www.drewconway.com/zia/?p=2378
§ [coq] URL: http://costofquality.org/wp-content/uploads/2011/02/Cost-of-Quality.jpg
§ [crh] Dr. C. Randall Howard, PMP, crhPhDConsulting.net
§ [Crosby] http://www.philipcrosby.com/25years/crosby.html
§ [ct-bdtech] http://cloudtimes.org/2013/06/13/big-data-techniques-for-analyzing-large-data-sets-infographic/
§ [dddm] http://www.clrn.org/elar/dddm.cfm
§ [DTIC] http://www.dtic.mil/doctrine/new_pubs/
§ [econBD] http://www.economistinsights.com/analysis/evolving-role-data-decision-making, August 12th 2013
§ [EMC-DS] http://www.emc.com/collateral/about/news/emc-data-science-study-wp.pdf
§ [Forbes] http://www.forbes.com/sites/christopherfrank/2012/03/25/improving-decision-making-in-the-world-of-big-data/
§ [FSAM/BAH] http://www.fsam.gov/about-federal-segment-architecture-methodology.php
§ [Gartner-EA] http://www.gartner.com/technology/it-glossary/enterprise-architecture.jsp
§ [Gartner-ERDS] "Emerging Role of the Data Scientist and the Art of Data Science", Gartner, 20 March 2012, ID: G00227058, Douglas Laney, Lisa Kart
§ [Gartner-HC] http://www.gartner.com/newsroom/id/1763814
§ [gayatri-patele-bay] http://www.slideshare.net/AsterData/gayatri-patele-bay
§ [Gleichauf] See Bob Gleichauf’s article: http://www.iqt.org/technology-portfolio/on-our-radar/Big_Data_Advanced_Analytics.pdf
§ [IBM-usingBD] ftp://ftp.software.ibm.com/software/tw/Using_Big_Data_for_Smarter_Decision-Making_v.pdf
§ [IBM] http://www.ibm.com/developerworks/data/library/dmmag/DMMag_2011_Issue2/BigData/index.html?cmp=dw&cpb=dwinf&ct=dwnew&cr=dwnen&ccy=zz&csr=051211
§ [IBM-Analytics] http://www-935.ibm.com/services/multimedia/Analytics_The_real_world_use_of_big_data_in_Financial_services_Mai_2013.pdf
§ [Infocus] http://infocus.emc.com/robert_abate/the-business-case-for-big-data-part-1/
§ [Infostory] http://infostory.com/2012/03/28/data-information-knowledge-web/
§ [InnovationGames] http://innovationgames.com/
§ [IT-Failures] http://it-project-failures.blogspot.com, http://it.slashdot.org/submission, http://www.sfgate.com
§ [Lwanga] The Job of the Information/Data Quality Professional (2010) Lwanga, Walenta, Talburt (IAIDQ Publication)
§ [Madnick] Stuart E. Madnick, Richard Y. Wang, Yang W. Lee, and Hongwei Zhu. 2009. Overview and Framework for Data and Information Quality Research. J. Data and Information Quality 1, 1, Article 2 (June 2009), 22 pages. DOI=10.1145/1515693.1516680 http://doi.acm.org/10.1145/1515693.1516680
§ [Mason-BDLS] George Mason University Volgenau School of Engineering Big Data Lecture Series, 2011-2012
§ [MIT] http://lean.mit.edu/downloads/2010-theses/view-category.html
§ [Nixon] steven.d.nixon@gmail.com - 08/29/2011, Mason Big Data Lecture Series 2011
§ [Nonaka, Hirotaka, Knowledge-Creating Company] Nonaka, Ikujiro, and Hirotaka Takeuchi. The knowledge-creating company: How Japanese companies create the dynamics of innovation. Oxford University Press, USA, 1995.
§ [O’Reily] https://docs.google.com/present/view?hl=en_US&id=0AXaXKp9bt6OXZGd4YzlnYmRfNThjMmo4dm5yaA from What is data science? O'Reilly Radar
§ [p36] http://information-retrieval.info/taipale/papers/p36-popp.pdf
§ [Patil] Patil, D. J., Building Data Science Teams, 2011
§ [RG] http://www.riskglossary.com/link/risk_metric_and_risk_measure.htm
§ [Robinson] http://www.ted.com/talks/sir_ken_robinson_bring_on_the_revolution.html
§ [Sagan] Dr. Philip Sagan, Infiniti, 2012 Big Data Lecture Series
§ [Senge] http://www.infed.org/thinkers/senge.htm
§ [Talburt] Dr. John Talburt, 2012 Big Data Lecture Series
§ [Tandem] http://www.tandemlabs.com/documents/CPSA2008.pdf
Backup Slides
J. C. R. Licklider's Man-Computer Symbiosis[Aiken]
Best approaches combine manual and automated reconciliation!
Humans Generally Better
• Sense low level stimuli
• Detect stimuli in noisy background
• Recognize constant patterns in varying situations
• Sense unusual and unexpected events
• Remember principles and strategies
• Retrieve pertinent details without a priori connection
• Draw upon experience and adapt decision to situation
• Select alternatives if original approach fails
• Reason inductively; generalize from observations
• Act in unanticipated emergencies and novel situations
• Apply principles to solve varied problems
• Make subjective evaluations
• Develop new solutions
• Concentrate on important tasks when overload occurs
• Adapt physical response to changes in situation
Machines Generally Better
• Sense stimuli outside human's range
• Count or measure physical quantities
• Store quantities of coded information accurately
• Monitor prespecified events, especially infrequent
• Make rapid and consistent responses to input signals
• Recall quantities of detailed information accurately
• Retrieve pertinent details without a priori connection
• Process quantitative data in prespecified ways
• Perform repetitive preprogrammed actions reliably
• Exert great, highly controlled physical force
• Perform several activities simultaneously
• Maintain operations under heavy operation load
• Maintain performance over extended periods of time