
  • Number of slides: 58

CPT-S 415 Big Data
Yinghui Wu
EME B45

CPT-S 415 Big Data
Visualization and Navigation (Information Visualization)
• Information Visualization
• Graph visualization
  – Graph drawing and graph visualization
  – Graph layout
Courtesy of Ivan Herman

“Close the loop”
Data quality -> knowledge quality (potential new area)
Data interpretation, visual analytics, graph visualization

http://www.matthiasdittrich.com/projekte/narratives/visualisation/

Information Visualization

Linking data to humans
• Collecting information is no longer a problem, but extracting value from information collections has become progressively more difficult.
• Visualization links the human eye and the computer, helping to identify patterns and to extract insights from large amounts of information.
• Visualization technology shows considerable promise for increasing the value of large-scale collections of information.
• Visualization can be classified as scientific visualization, software visualization, and information visualization.

Visualization Classification
• Scientific visualization helps in understanding physical phenomena in data (Nielson, 1991)
  – A mathematical model plays an essential role
  – Isosurfaces, volume rendering, and glyphs are commonly used techniques
  – Isosurfaces depict the distribution of certain attributes
  – Volume rendering allows viewers to see the entire volume of 3-D data in a single image (Nielson, 1991)
  – Glyphs provide a way to display multiple attributes through combinations of various visual cues (Chernoff, 1973)

Visualization Classification
• Software visualization helps people understand and use computer software effectively (Stasko et al., 1998)
  – Program visualization helps programmers manage complex software (Baecker & Price, 1998)
  – Visualizing the source code (Baecker & Marcus, 1990), data structures, and the changes made to the software (Eick et al., 1992)
  – Algorithm animation is used to motivate and support the learning of computational algorithms
  – http://www.algomation.com/algorithm/quicksort-visualization

What is Information Visualization?
• Information visualization helps users identify patterns, correlations, or clusters
  – Structured information
    • Graphical representation to reveal patterns
    • Integration with various data mining techniques (Thealing et al., 2002; Johnston, 2002)
  – Unstructured information
    • Need to identify variables and construct visualizable structures
• The depiction of information using spatial or graphical representations, to facilitate comparison, pattern recognition, change detection, and other cognitive skills by making use of the visual system.

Information Visualization
• Problem
  – HUGE datasets: how to understand them?
• Solution
  – Take better advantage of the human perceptual system
  – Convert information into a graphical representation
• Issues
  – How to convert abstract information into graphical form?
  – Do visualizations do a better job than other methods?

Goals of Information Visualization
• Make large datasets coherent (present huge amounts of information compactly)
• Present information from various viewpoints
• Present information at several levels of detail (from overviews to fine structure)
• Support visual comparisons
• Tell stories about the data

“Sci vis” versus “Info vis”
• Scientific visualization: specifically concerned with data that has a well-defined representation in 2D or 3D space (e.g., from a simulation mesh or scanner).
*Adapted from The ParaView Tutorial, Moreland

Information visualization
• Information visualization: concerned with data that does not have a well-defined representation in 2D or 3D space (i.e., “abstract data”).

Data Attributes
• Data attributes: Infovis has more data types than numerical values

Data Type  | Attribute Domain | Operations         | Examples
nominal    | Unordered set    | Comparison (=)     | Text, references, syntax elements
ordinal    | Ordered set      | Ordering (=, <, >) | Ratings (e.g., bad, average, good)
discrete   | Integer          | Integer arithmetic | Lines of code
continuous | Real             | Real arithmetic    | Code metrics

Info Viz vs Sci Viz

               | Scivis                  | Infovis
Data Domain    | spatial, compact        | non-spatial, abstract
Attribute Type | numerical               | any data type
Data Points    | Samples over the domain | Tuples of attributes
Cells          | Support interpolation   | Describe relations
Interpolation  | Piecewise continuous    | Can be nonexistent

Information representation in InfoViz

Information Representation
• Shneiderman (1996) proposed seven types of representation methods:
  – 1-D, 2-D, 3-D
  – Multidimensional
  – Tree
  – Network
  – Temporal approaches

1-D
TileBars (Hearst, 1995)

2-D & 3-D
• 2D: represent information as two-dimensional visual objects
  – Visualization systems based on the self-organizing map (SOM) (Kohonen, 1995)
  – Help users deal with the large number of categories created for mass textual data
• 3D: represent information as three-dimensional visual objects
  – The WebBook system folds web pages into three-dimensional books (Card et al., 1996)
  – 3-D version of a tree or network
    • 3-D hyperbolic tree to visualize large-scale hierarchical relationships (Munzner, 2000)
  – http://www.start.umd.edu/gtd/globe/index.html

Multidimensional
• Represent information as multidimensional objects and project them into a three-dimensional or two-dimensional space
  – A dimensionality reduction algorithm is used
    • Multidimensional scaling (MDS)
    • Hierarchical clustering
    • K-means algorithms
    • Principal component analysis
  – Examples
    • SPIRE system (Wise et al., 1995)
    • VxInsight system (Boyack et al., 2002)
    • Glyph representation has been used in various social visualization techniques (Donath, 2002) to describe human behavior during computer-mediated communication (CMC)

Table Visualization
Simple list; does not support analysis or insight

Table Visualization
Aided by:
• Sorting
• Bar Graph
• Evolution Icons
Dense Pixel Display:
• Bar Graph
• Table Lens

Tree
• Represent hierarchical relationships
  – Challenge: the number of nodes grows exponentially
    • Different layout algorithms have been applied
  – Examples
    • Tree-Map allocates space according to attributes of nodes (Johnson & Shneiderman, 1991)
    • Cone Tree system uses a 3-D visual structure to pack more nodes on the screen (Robertson et al., 1991)
    • Hyperbolic Tree projects subtrees onto a hyperbolic plane and maps that plane to the display (Lamping et al., 1995)

Tree Visualization
• The TreeMap (Johnson & Shneiderman, ’91)
• Idea: show a hierarchy as a 2D layout
  – Fill up the space with rectangles representing objects
  – Size on screen indicates the relative size of the underlying objects
• Treemap method: visualize the tree structure in a way that uses virtually every pixel of the display space to convey information
• Every subtree is represented by a rectangle that is partitioned into smaller rectangles, which correspond to its children
• The position of the slicing lines determines the relative sizes of the child rectangles
• For every child, repeat the slicing recursively, swapping the slicing direction from vertical to horizontal or conversely
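
The recursive slicing described on this slide can be sketched in a few lines of Python. This is a minimal illustration of the slice-and-dice idea, not the authors' implementation; the `Node` class, the function names, and the example tree are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical tree node: leaves carry a size, inner nodes carry children.
    name: str
    size: float = 0.0
    children: List["Node"] = field(default_factory=list)


def total_size(node):
    """Size of a leaf, or the summed size of all leaves below an inner node."""
    if not node.children:
        return node.size
    return sum(total_size(c) for c in node.children)


def slice_and_dice(node, x, y, w, h, vertical=True, out=None):
    """Assign each node a rectangle (x, y, width, height).

    The rectangle of a node is cut by parallel slicing lines so that each
    child gets a share proportional to its size; the slicing direction
    flips at every level of the tree.
    """
    if out is None:
        out = {}
    out[node.name] = (x, y, w, h)
    total = total_size(node)
    if not node.children or total == 0:
        return out
    offset = 0.0  # fraction of the parent rectangle already handed out
    for child in node.children:
        frac = total_size(child) / total
        if vertical:   # vertical slicing lines: children sit side by side
            slice_and_dice(child, x + offset * w, y, frac * w, h, False, out)
        else:          # horizontal slicing lines: children are stacked
            slice_and_dice(child, x, y + offset * h, w, frac * h, True, out)
        offset += frac
    return out


# Assumed example: leaf "a" (size 1) and subtree "b" with leaves of size 1 and 2.
tree = Node("root", children=[
    Node("a", size=1),
    Node("b", children=[Node("b1", size=1), Node("b2", size=2)]),
])
rects = slice_and_dice(tree, 0.0, 0.0, 4.0, 2.0)
```

In this sketch the area of every rectangle is proportional to the size of the subtree it represents, which is what lets the layout fill the whole display area.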

Tree Visualization: Examples
• Treemap

Tree Visualization
Ball-and-stick visualization: use the position and appearance of the glyphs
Rooted-Tree Layout, Radial-Tree Layout

Graphs and Networks
• Represent complex relationships that a simple tree structure cannot capture
  – Citations among academic papers (C. Chen & Paul, 2001; Mackinlay et al., 1995)
  – Documents linked by the internet (Andrews, 1995)
  – The spring-embedder model (Eades, 1984) and its variants (Davidson & Harel, 1996; Fruchterman & Reingold, 1991) have become the most popular drawing algorithms.

Examples
Network visualization (Vizster)

Temporal
• Represent information based on temporal order
  – Location and animation are commonly used visual variables to reveal the temporal aspect of information
  – Examples
    • Perspective Wall lists objects along the x-axis based on time sequence and presents attributes along the y-axis (Robertson et al., 1993)
    • In the VxInsight system (Boyack et al., 2002), the landscape changes as time changes.

Examples
Geo data mapping
Demo: Cyber Attacks http://map.norsecorp.com/#/

Additional Examples
• http://map.norsecorp.com/v1/
• NY Times words, numbers
• Visual Complexity (from book by Manuel Lima)
• 50 examples (from June 2009, somewhat dated)
• D3 Gallery

Visualization components
• User-Interface Interaction: Color, Size, Texture, Proximity, Annotation, Interactivity
  – Immediate interaction not only allows direct manipulation of the visual objects displayed but also allows users to select what is displayed (Card et al., 1999)
  – Shneiderman (1996) summarizes six types of interface functionality: overview, zoom, filtering, details on demand, relate, history
• Information Analytics
  – Indexing
    • Extract the semantics of information
  – Analysis
    • Clustering, classification

Visualization pipeline
• Acquire -> Parse -> Filter -> Mine -> Represent -> Refine -> Interact
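
As a rough illustration of how the pipeline stages chain together, the sketch below runs a tiny in-memory dataset through one function per stage. The data, the function names, and the text bar chart are all assumptions made for the example; "interact" is modeled simply as re-running the pipeline with a different filter value.

```python
import csv
import io
from collections import Counter

# Hypothetical raw input standing in for a file or web request.
RAW = "city,weather\nPullman,rain\nPullman,sun\nPullman,rain\nSeattle,rain\n"


def acquire():
    """Acquire: obtain the raw data."""
    return RAW


def parse(text):
    """Parse: give the raw text a structure (a list of row dictionaries)."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows, city):
    """Filter: keep only the rows of interest."""
    return [r for r in rows if r["city"] == city]


def mine(rows):
    """Mine: derive a summary, here a count per weather label."""
    return Counter(r["weather"] for r in rows)


def represent(counts):
    """Represent: choose a basic visual form, here one text bar per label."""
    return [f"{label:<6}{'#' * n}" for label, n in counts.items()]


def refine(counts):
    """Refine: improve the representation, here by ordering bars by count."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{label:<6}{'#' * n}" for label, n in ordered]


def interact(city):
    """Interact: the user picks a city and the pipeline is re-run."""
    return refine(mine(filter_rows(parse(acquire()), city)))


counts = mine(filter_rows(parse(acquire()), "Pullman"))
bars = interact("Pullman")
```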

Visualization software
• Host language (C/C++/Java/Python) plus OpenGL
• Stat/math package with graphics
  – R
  – MATLAB
• Special-purpose info viz software
  – Earth mapping, biological network visualization, etc.
• Browser-enabled graphics/info viz packages
  – Google Charts
  – Processing / Processing.js
  – D3
  – Java + Flash (becoming rarer)

Graph Drawing and Graph Visualization

Information Visualization vs. Graph Drawing
• Graph Drawing
  – Old topic, many books, etc.
  – May have other goals than visualization
    • E.g., VLSI design
• Graph Visualization
  – Size is the key issue
  – Usability requires nodes to be discernible
  – Navigation is considered

Graph Visualization
Hierarchical graph of the evolution of the UNIX operating system

Graph Visualization
The Call Graph
Three concentric rings show containment: (1) Files (2) Classes (3) Methods
The curved lines indicate function calls

Graph visualization
• Circle chart

When is Graph Visualization Applicable?
• Ask the question: is there an inherent relation among the data elements to be visualized?
  – If YES: the data can be represented by nodes of a graph, with edges representing the relations.
  – If NO: the data elements are “unstructured” and the goal is to use visualization to analyze and discover relationships among the data.
Source: Herman, Graph Visualization and Navigation in Information Visualization: a Survey

Traditional Graph Drawing
• Optimization based on a set of criteria (mathematical aesthetics)
  – Minimize edge crossings
  – Minimize area
  – Maximize smallest angle
  – Maximize symmetry
  – Doing all at once is hard
• Often unsuitable for interactive visualization
  – Many optimizations are NP-hard
  – Approximation algorithms are very complex
    • Precompute the layout, or compute it once at the beginning of an application, then support interaction
Slide adapted from Jeff Heer
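
To make one of these aesthetic criteria concrete, the following sketch counts edge crossings in a given straight-line drawing. It only evaluates a layout and does not optimize it; the function names and the two example drawings of the complete graph on four nodes are assumptions made for illustration.

```python
from itertools import combinations


def orientation(p, q, r):
    """Signed area test: > 0 if p, q, r turn counter-clockwise, < 0 if clockwise."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a single interior point."""
    return (orientation(a, b, c) * orientation(a, b, d) < 0
            and orientation(c, d, a) * orientation(c, d, b) < 0)


def count_crossings(pos, edges):
    """Count crossing pairs of edges for node positions `pos` (name -> (x, y))."""
    count = 0
    for (u, v), (s, t) in combinations(edges, 2):
        if len({u, v, s, t}) < 4:
            continue  # edges that share an endpoint are not counted
        if segments_cross(pos[u], pos[v], pos[s], pos[t]):
            count += 1
    return count


# Complete graph on four nodes, drawn two ways (assumed example).
k4 = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")]
square = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
planar = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0.75, 0.25)}

crossings_square = count_crossings(square, k4)   # the two diagonals cross
crossings_planar = count_crossings(planar, k4)   # "d" moved inside triangle a, b, c
```

Evaluating a drawing like this is cheap (quadratic in the number of edges); the hard part noted on the slide is searching for node positions that minimize the count.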

Traditional Graph Drawing
• Poly-line graphs (includes bends)
• Orthogonal drawing
• Planar, straight-line drawing
• Upward drawing of DAGs

Layout Approaches
• Tree-ify the graph, then use tree layout
• Hierarchical graph layout
• Radial graph layout
• Optimization-based techniques
  – Includes spring-embedding / force-directed layout
• Adjacency matrices
• Structurally-independent layout
• On-demand revealing of subgraphs
• Distortion-based views
  – Hyperbolic browser
• (this list is not meant to be exhaustive)

Tree-based graph layout
• Select a tree structure out of the graph
  – Breadth-first-search tree
  – Minimum spanning tree
  – Other domain-specific structures
• Use a tree layout algorithm
• Benefits
  – Fast, supports interaction and refinement
• Drawbacks
  – Limited range of layouts
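
A minimal sketch of the first option, extracting a breadth-first-search tree from a graph, might look as follows. The adjacency-list representation, the function names, and the small example graph are assumptions; the edges left out of the tree are reported separately, since a tree layout would typically draw them afterwards or on demand.

```python
from collections import deque


def bfs_tree(adj, root):
    """Return (parent, depth) maps of the BFS spanning tree rooted at `root`.

    `adj` maps each node to a list of its neighbors (undirected graph).
    """
    parent = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:        # first visit decides the tree edge
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
    return parent, depth


def non_tree_edges(adj, parent):
    """Edges of the graph that were not selected for the tree."""
    tree = {frozenset((c, p)) for c, p in parent.items() if p is not None}
    every = {frozenset((u, v)) for u in adj for v in adj[u]}
    return every - tree


# Assumed example graph with two cycles.
adj = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c"],
}
parent, depth = bfs_tree(adj, "a")
extra = non_tree_edges(adj, parent)
```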

Tree-ify the graph

Traditional Tree Layouts
• H-tree layout: best for balanced trees
• Radial view
• Balloon view: related to the 3-D cone tree
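
As one possible reading of the radial view, the sketch below places the root at the center, puts each tree level on its own concentric ring, and gives every subtree an angular wedge proportional to its number of leaves. The wedge rule, the `ring_gap` parameter, and the example tree are assumptions; published radial layouts differ in how they allocate angles.

```python
import math


def count_leaves(tree, node):
    """Number of leaves below `node`; `tree` maps a node to its child list."""
    kids = tree.get(node, [])
    return 1 if not kids else sum(count_leaves(tree, k) for k in kids)


def radial_layout(tree, root, ring_gap=1.0):
    """Return node -> (x, y), with depth mapped to radius."""
    pos = {}

    def place(node, depth, start, end):
        angle = (start + end) / 2          # node sits in the middle of its wedge
        radius = depth * ring_gap
        pos[node] = (radius * math.cos(angle), radius * math.sin(angle))
        kids = tree.get(node, [])
        total = count_leaves(tree, node)
        a = start
        for k in kids:                     # split the wedge among the children
            span = (end - start) * count_leaves(tree, k) / total
            place(k, depth + 1, a, a + span)
            a += span

    place(root, 0, 0.0, 2 * math.pi)
    return pos


# Assumed example tree.
tree = {"r": ["a", "b"], "a": ["a1", "a2"]}
pos = radial_layout(tree, "r")
```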

Hierarchical graph layout
• Use the directed structure of the graph to inform layout
• Order the graph into distinct levels
  – This determines one dimension
• Now optimize within levels
  – Determines the second dimension
  – Minimize edge crossings, etc.
• The method used in Graphviz’s “dot” algorithm
• Great for directed acyclic graphs, but often misleading in the case of cycles
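
The first step, ordering the graph into levels, can be sketched with longest-path layering: every node is placed one level below its deepest predecessor. This is a simple stand-in chosen for illustration and is not claimed to be the ranking method Graphviz uses; the function name and example graph are assumptions. The sketch also shows why cycles are a problem, since no consistent level assignment exists for them.

```python
from collections import deque


def longest_path_layers(nodes, edges):
    """Assign each node of a DAG a level so that every edge points downward."""
    succ = {n: [] for n in nodes}
    indeg = {n: 0 for n in nodes}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1

    layer = {n: 0 for n in nodes}
    queue = deque(n for n in nodes if indeg[n] == 0)   # sources go to level 0
    visited = 0
    while queue:                                       # topological sweep
        u = queue.popleft()
        visited += 1
        for v in succ[u]:
            layer[v] = max(layer[v], layer[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    if visited != len(nodes):
        raise ValueError("graph has a cycle; layering needs a DAG")
    return layer


# Assumed example: a diamond with a shortcut edge a -> d.
layers = longest_path_layers(
    ["a", "b", "c", "d"],
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("a", "d")],
)
```

The level gives one coordinate of each node; ordering nodes within a level to reduce crossings is the separate second step named on the slide.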

Hierarchical Graph Layout
• Evolution of the UNIX operating system
• Hierarchical layering based on descent

Hierarchical graph layout
Gnutella network

Radial Layout
• Animated Exploration of Graphs with Radial Layout, Yee et al., 2001
• Gnutella network

Optimization-based layout
• Specify constraints for layout
  – Series of mathematical equations
  – Hand to a “solver” which tries to optimize the constraints
• Examples
  – Minimize edge crossings, line bends, etc.
  – Multi-dimensional scaling (preserve multi-dimensional distance)
  – Force-directed placement (use a physics metaphor)
• Benefits
  – General applicability
  – Often customizable by adding new constraints
• Drawbacks
  – Approximate constraint satisfaction
  – Running time; “organic” look not always desired

Example: Force-Directed Layout
Uses a physics model to lay out the graph: nodes repel each other, edges act as springs, and some amount of friction or drag force is used. Special techniques dampen “jitter”.
• http://getspringy.com/demo.html
• Visual WordNet: http://www.kylescholz.com/projects/wordnet
• Visuwords: http://www.visuwords.com/
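
The physics model can be sketched as a small simulation. The force magnitudes below follow the Fruchterman and Reingold style (repulsion k²/d between all pairs, attraction d²/k along edges), and a velocity damping factor plays the role of drag. The parameter values, the circular starting positions, and the function name are assumptions for illustration, not the settings of any of the linked demos.

```python
import math
from itertools import combinations


def force_layout(nodes, edges, iterations=200, k=1.0, step=0.05, damping=0.9):
    """Return node -> [x, y] after a fixed number of simulation steps."""
    n = len(nodes)
    # Deterministic start: nodes evenly spaced on the unit circle.
    pos = {v: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i, v in enumerate(nodes)}
    vel = {v: [0.0, 0.0] for v in nodes}

    for _ in range(iterations):
        force = {v: [0.0, 0.0] for v in nodes}

        # Every pair of nodes repels.
        for u, v in combinations(nodes, 2):
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            f = k * k / dist
            force[u][0] += f * dx / dist
            force[u][1] += f * dy / dist
            force[v][0] -= f * dx / dist
            force[v][1] -= f * dy / dist

        # Every edge pulls its endpoints together like a spring.
        for u, v in edges:
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            f = dist * dist / k
            force[u][0] -= f * dx / dist
            force[u][1] -= f * dy / dist
            force[v][0] += f * dx / dist
            force[v][1] += f * dy / dist

        # Integrate with damping (drag) so the motion settles instead of jittering.
        for v in nodes:
            for axis in (0, 1):
                vel[v][axis] = (vel[v][axis] + step * force[v][axis]) * damping
                pos[v][axis] += step * vel[v][axis]
    return pos


# Assumed example: a path a - b - c - d.
layout = force_layout(["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d")])
```

In the path example the two end nodes start as close together as the connected pairs, but only repulsion acts between them, so they drift apart while connected nodes stay near the spring length. The all-pairs repulsion loop is quadratic in the number of nodes, which is one source of the running-time drawback mentioned for optimization-based layouts.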

Hyperbolic Browser: Inspiration

Using Distortion and Focus + Context
• The Hyperbolic Tree Browser
  – The Hyperbolic Browser: A Focus + Context Technique for Visualizing Large Hierarchies, Lamping & Rao, CHI 1996.
  – http://www.inxight.com/products/sdks/st/
  – Uses non-Euclidean geometry as the basis of a focus + context technique
    • The hyperbolic browser is a projection into a Euclidean space: a circle
  – The circumference of a circle in Euclidean space increases linearly with radius (2πr)
  – The circumference of a circle in hyperbolic space increases exponentially with radius
• Exponential growth in available space with linear growth of radius
  – Makes tree layout easy
• Size of objects decreases with growth of radius
  – Reduces the expense of drawing trees when drawing is cut off at one pixel
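
The contrast between the two growth rates can be written out explicitly. The formulas below assume the standard hyperbolic plane with constant curvature −1; the symbols C_E and C_H are introduced here only for the comparison.

```latex
\[
  C_E(r) = 2\pi r ,
  \qquad
  C_H(r) = 2\pi \sinh r = \pi \left( e^{r} - e^{-r} \right) \approx \pi e^{r}
  \quad \text{for large } r .
\]
```

A tree whose node count multiplies by a constant factor at each level needs room that grows exponentially with depth, which is why a circle whose circumference grows like e^r can give every level enough space while the radius grows only linearly.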

Appearance of Initial Layout
• Root mapped at center
• Multiple generations of children mapped out towards the edge of the circle
• Drawing of nodes cuts off when they are smaller than one pixel

Structurally-Independent Layout
• Ignore the graph structure.
• Base the layout on other attributes of the data
• Examples:
  – Geography
  – Time
• Benefits
  – Often very quick layout
  – Optimizes communication of particular features
• Drawbacks
  – May or may not present structure well

Structurally Independent Layout
• The “Skitter” Layout
  – Internet connectivity
• Angle = Longitude (geography)
• Radius = Degree (# of connections)
http://www.caida.org/research/topology/as_core_network/2007/images/ascore-simple.2007_big.png
Skitter, www.caida.org
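
A layout of this kind reduces to a polar-coordinate mapping computed per node, with no reference to the edges. The sketch below is an assumed mapping in the spirit of the slide: longitude sets the angle, and a logarithmic function of degree sets the radius so that the best-connected node lands at the center. The exact radius formula used by CAIDA may differ, and the function name and parameters are hypothetical.

```python
import math


def skitter_like_position(longitude_deg, degree, max_degree):
    """Map (longitude, degree) to (x, y) inside the unit circle.

    Assumption: radius shrinks logarithmically with degree, from 1.0 for an
    isolated node down to 0.0 for the node with the maximum degree.
    Requires max_degree >= 1.
    """
    theta = math.radians(longitude_deg % 360.0)
    radius = 1.0 - math.log(degree + 1) / math.log(max_degree + 1)
    return (radius * math.cos(theta), radius * math.sin(theta))


# Assumed example values.
edge_node = skitter_like_position(0.0, 0, 10)     # no connections, longitude 0
core_node = skitter_like_position(90.0, 10, 10)   # the best-connected node
```

Because each position depends only on the node's own attributes, the layout takes time linear in the number of nodes, which matches the "often very quick layout" benefit listed for structurally-independent layouts.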
