  • Number of slides: 51

University of Washington
Department of Electrical Engineering
EE 512 Spring, 2006
Graphical Models
Jeff A. Bilmes
Lecture 1 Slides
March 28th, 2006
Lec 1: March 28th, 2006 EE 512 - Graphical Models - J. Bilmes 1

Outline of Today’s Lecture
• Class overview
• What are graphical models
• Semantics of Bayesian networks

Books and Sources for Today
• Jordan: Chapters 1 and 2

Class Road Map
• L1: Tues, 3/28: Overview, GMs, Intro BNs.
• L2: Thur, 3/30
• L3: Tues, 4/4
• L4: Thur, 4/6
• L5: Tue, 4/11
• L6: Thur, 4/13
• L7: Tues, 4/18
• L8: Thur, 4/20
• L9: Tue, 4/25
• L10: Thur, 4/27
• L11: Tues, 5/2
• L12: Thur, 5/4
• L13: Tues, 5/9
• L14: Thur, 5/11
• L15: Tue, 5/16
• L16: Thur, 5/18
• L17: Tues, 5/23
• L18: Thur, 5/25
• L19: Tue, 5/30
• L20: Thur, 6/1: final presentations

Announcements
• READING: Chapters 1 and 2 in Jordan’s book (pick up the book from the basement of the communications copy center).
• List handout: name, department, and email
• List handout: regular makeup slot, and discussion section
• Syllabus
• Course web page: http://ssli.ee.washington.edu/ee512
• Goal: PowerPoint slides this quarter; they will be on the web page after lecture (but not before, as you are getting them hot off the press).

Graphical Models
• A graphical model is a visual, abstract, and mathematically formal description of properties of families of probability distributions (densities, mass functions)
• There are many different types of graphical model, e.g.:
  – Bayesian Networks
  – Factor Graphs
  – Markov Random Fields
  – Chain Graphs

GMs cover many well-known methods
[Figure labels: Graphical Models; Other Semantics; Chain Graphs; Causal Models; Factor Graphs; DGMs; Dependency Networks; FST; UGMs; AR; Bayesian Networks; ZMs; DBNs; Mixture Models; HMM; Kalman; Factorial HMM/Mixed Memory Markov Models; Simple Models; Decision Trees; Segment Models; MRFs; LDA; Gibbs/Boltzmann Distributions; PCA; BMMs]

Graphical Models Provide:
• Structure
• Algorithms
• Language
• Approximations
• Data-Bases

Graphical Models Provide
GMs give us:
I. Structure: A method to explore the structure of “natural” phenomena (causal vs. correlated relations, properties of natural signals and scenes)
II. Algorithms: A set of algorithms that provide “efficient” probabilistic inference and statistical decision making
III. Language: A mathematically formal, abstract, visual language with which to efficiently discuss families of probabilistic models and their properties.

GMs Provide
GMs give us (cont):
IV. Approximation: Methods to explore systems of approximation and their implications. E.g., what are the consequences of a (perhaps known to be) wrong assumption?
V. Data-base: Provide a probabilistic “data-base” and corresponding “search algorithms” for making queries about properties in such model families.

GMs
• There are many different types of GM.
• Each GM has its semantics
• A GM (under the current semantics) is really a set of constraints. The GM represents all probability distributions that obey these constraints, including those that obey additional constraints (but not including those that obey fewer constraints).
• Most often, the constraints are some form of factorization property, e.g., f() factorizes (is composed of a product of factors of subsets of arguments).
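
As a sketch of what such a factorization constraint looks like in symbols (the factor names \psi_k and the index sets C_k are labels assumed here, not notation from the slides), a function of N arguments factorizes when it can be written as a product of factors, each of which sees only a subset of the arguments:

```latex
f(x_1, \dots, x_N) \;=\; \prod_{k=1}^{K} \psi_k\!\left(x_{C_k}\right),
\qquad C_k \subseteq \{1, \dots, N\}
```

For instance, f(x_1, x_2, x_3) = \psi_1(x_1, x_2)\,\psi_2(x_2, x_3) satisfies a constraint with C_1 = {1, 2} and C_2 = {2, 3}. A function that also splits further, such as g_1(x_1) g_2(x_2) g_3(x_3), obeys additional constraints and is therefore still a member of the same family.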

Types of Queries
• Several types of queries we may be interested in:
  – Compute: p(one subset of vars)
  – Compute: p(one subset of vars | another subset of vars)
  – Find the N most probable configurations of one subset of variables given assignments of values to some other sets
  – Q: Is one subset independent of another subset?
  – Q: Is one subset independent of another given a third?
• How efficiently can we do this? Can this question be answered? What if it is too costly: can we approximate, and if so, how well? These are questions we will answer this term.
• GMs are like a probabilistic data-base (or data structure), a system that can be queried to provide answers to these sorts of questions.
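
The query types listed above can be tried out by brute force on a tiny joint table. The following is a minimal sketch, assuming three binary variables A, B, C and made-up probabilities; the function names are hypothetical, and the brute-force approach is only feasible because the table is tiny.

```python
import itertools
import numpy as np

# Hypothetical joint p(a, b, c) over three binary variables, built as
# p(a) p(b|a) p(c|b). All numbers are illustrative assumptions.
p_a = np.array([0.6, 0.4])
p_b_given_a = np.array([[0.7, 0.3], [0.2, 0.8]])   # rows: a, columns: b
p_c_given_b = np.array([[0.9, 0.1], [0.5, 0.5]])   # rows: b, columns: c
joint = p_a[:, None, None] * p_b_given_a[:, :, None] * p_c_given_b[None, :, :]

def marginal(table, keep):
    """p(subset): sum out every axis not listed in `keep`."""
    drop = tuple(ax for ax in range(table.ndim) if ax not in keep)
    return table.sum(axis=drop)

def conditional_on_b(table):
    """p(a, c | b): renormalize over axes 0 and 2 for each value of b."""
    return table / table.sum(axis=(0, 2), keepdims=True)

def most_probable(table, n=1):
    """The n most probable full configurations, by exhaustive enumeration."""
    configs = itertools.product(*(range(s) for s in table.shape))
    return sorted(configs, key=lambda cfg: -table[cfg])[:n]

def a_indep_c():
    """Is A independent of C? Compare p(a, c) with p(a) p(c)."""
    p_ac = marginal(joint, (0, 2))
    return np.allclose(p_ac, np.outer(p_ac.sum(axis=1), p_ac.sum(axis=0)))

def a_indep_c_given_b():
    """Is A independent of C given B? Compare p(a, c | b) with p(a|b) p(c|b)."""
    p_ac_b = conditional_on_b(joint)
    p_a_b = p_ac_b.sum(axis=2, keepdims=True)
    p_c_b = p_ac_b.sum(axis=0, keepdims=True)
    return np.allclose(p_ac_b, p_a_b * p_c_b)
```

The table has 2^3 entries here, but in general it grows exponentially with the number of variables, which is why the efficiency questions on this slide matter.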

Example
• Typical goal of pattern recognition:
  – Training (say, EM or gradient descent) needs a query of the form: [equation not preserved]. In this form, we need to compute p(o, h) efficiently.
  – The Bayes decision rule needs to find the best class for a given unknown pattern: [equation not preserved]
  – But this is yet another query on a probability distribution.
  – We can train, and perform Bayes decision theory, quickly if we can compute with probabilities quickly. Graphical models provide a way to reason about and understand when this is possible and, if not, how to reasonably approximate.
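
The standard textbook forms of these two queries can be sketched as follows. The symbols are assumptions (o for observed variables, h for hidden variables, c for a class label, x for the unknown pattern), and these are the usual forms rather than necessarily the exact equations shown in lecture:

```latex
% Training with hidden variables: marginal likelihood and posterior over h
p(o) = \sum_{h} p(o, h),
\qquad
p(h \mid o) = \frac{p(o, h)}{\sum_{h'} p(o, h')}

% Bayes decision rule: choose the most probable class for pattern x
c^{*} = \operatorname*{argmax}_{c} \; p(c \mid x)
      = \operatorname*{argmax}_{c} \; p(x \mid c)\, p(c)
```

Both lines reduce to sums or maximizations over a joint distribution, which is why fast computation with probabilities is the bottleneck.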

Some Notation
• Random variables Xi, Yi, X, Y (scalar or vector)
• Distributions: [equations not preserved]
• Subsets: [equations not preserved]

Main types of Graphical Models
• Markov Random Fields
  – a form of undirected graphical model
  – relatively simple to understand their semantics
  – also, log-linear models, Gibbs distributions, Boltzmann distributions, many “exponential models”, conditional random fields (CRFs), etc.
• Bayesian networks
  – a form of directed graphical model
  – originally developed to represent a form of causality, but not ideal for that (they still represent factorization)
  – semantics more interesting (but trickier) than MRFs
• Factor Graphs
  – pure: the “assembly language” of models for factorization properties
  – came out of the coding theory community (LDPC, Turbo codes)

Main types of Graphical Models
• Chain graphs:
  – Hybrid between Bayesian networks and MRFs
  – A set of clusters of undirected nodes connected by directed links
  – Not as widely used, but very powerful.
• Ancestral graphs
  – we probably won’t cover these.

Bayesian Network Examples
• Mixture models [figure nodes: C, X]
• Markov Chains [figure nodes: Q1, Q2, Q3, Q4]
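
As a sketch of the factorizations these two graphs usually stand for, assuming the conventional edge directions (C pointing to X for the mixture model, and each Q_t pointing to Q_{t+1} for the chain), since the arrows themselves are not recoverable from the text:

```latex
% Mixture model: hidden component C selects the distribution of X
p(x) = \sum_{c} p(c)\, p(x \mid c)

% First-order Markov chain over Q_1, \dots, Q_4
p(q_1, q_2, q_3, q_4) = p(q_1)\, p(q_2 \mid q_1)\, p(q_3 \mid q_2)\, p(q_4 \mid q_3)
```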

GMs: PCA and Factor Analysis
[Figure nodes: X, Y; parameters: Q, R]
• PCA: Q = [symbol not preserved], R [symbol not preserved] 0, A = ortho
• FA: Q = I, R = diagonal
• Other generalizations possible, e.g., Q = gen. diagonal, or capture using general A since [equation not preserved]

Independent Component Analysis
[Figure nodes: I1, I2 and X1:4]
• The data X1:4 is explained by the two (marginally) independent causes.

Linear Discriminant Analysis
[Figure nodes: C, X]
• Class conditional data has diff. mean but common covariance matrix.
• Fisher’s formulation: project onto space spanned by the means.

Extensions to LDA
• HDA/QDA [figure nodes: C, X]: Heteroscedastic Discriminant Analysis/Quadratic Discriminant Analysis
• MDA [figure nodes: C, M, X]: Mixture Discriminant Analysis
• HMDA [figure nodes: C, M, X]: Both

Generalized Decision Trees
[Figure nodes: I, D1, D2, D3, O]
• Generalized Probabilistic Decision Trees
• Hierarchical Mixtures of Experts (Jordan)

Example: Printer Troubleshooting (Dechter)

Classifier Combination
[Figure panels: Mixture of Experts (Sum rule); Naive Bayes (prod. rule); Other Combination Schemes. Node labels: X, S, C, C1, C2, X1, X2, ..., XN]

Discriminative and Generative Models
[Figure panels: Generative Model (nodes C, X); Discriminative Model (nodes C, X)]

HMMs/Kalman Filter
[Figure panels: HMM (nodes Q1, Q2, Q3, Q4 and X1, X2, X3, X4); Autoregressive HMM (nodes Q1, Q2, Q3, Q4 and X1, X2, X3, X4)]
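
The HMM graph encodes a factorization of the joint over hidden states Q and observations X, and that structure is what makes likelihood computation cheap. Below is a minimal sketch under stated assumptions: a hypothetical 2-state, 2-symbol HMM with made-up parameters `pi`, `A`, `B`, and the standard forward recursion, which is not described on this slide.

```python
import itertools
import numpy as np

# Hypothetical 2-state, 2-symbol HMM; all numbers are illustrative only.
pi = np.array([0.5, 0.5])                  # p(Q1)
A = np.array([[0.9, 0.1], [0.3, 0.7]])     # A[i, j] = p(Q_{t+1}=j | Q_t=i)
B = np.array([[0.8, 0.2], [0.1, 0.9]])     # B[i, k] = p(X_t=k | Q_t=i)

def joint(q, x):
    """p(q, x) = p(q1) p(x1|q1) * prod_t p(q_t|q_{t-1}) p(x_t|q_t)."""
    p = pi[q[0]] * B[q[0], x[0]]
    for t in range(1, len(x)):
        p *= A[q[t - 1], q[t]] * B[q[t], x[t]]
    return p

def likelihood_brute_force(x):
    """Sum the joint over all hidden sequences; cost grows as |Q|^T."""
    states = range(len(pi))
    return sum(joint(q, x) for q in itertools.product(states, repeat=len(x)))

def likelihood_forward(x):
    """Forward recursion: the same quantity in O(T |Q|^2) time."""
    alpha = pi * B[:, x[0]]
    for obs in x[1:]:
        alpha = (alpha @ A) * B[:, obs]
    return alpha.sum()
```

Calling both functions on a short sequence such as `[0, 1, 1, 0]` should give the same value; only the forward version remains practical as the sequence grows.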

Switching Kalman Filter
[Figure nodes: Q1, Q2, Q3, Q4; S1, S2, S3, S4; X1, X2, X3, X4]

Factorial HMM
[Figure nodes: Q’1, Q’2, Q’3, Q’4; Q1, Q2, Q3, Q4; X1, X2, X3, X4]

Standard Language Modeling
• Example: standard 4-gram

Interpolated Uni-, Bi-, Tri-Grams
• Nothing gets zero probability
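
One common way to realize this idea is linear interpolation of maximum-likelihood uni-, bi-, and tri-gram estimates. The sketch below assumes a toy corpus, fixed hypothetical weights `lambdas`, and hypothetical helper names; it illustrates the general technique and is not the specific model drawn on the slide.

```python
from collections import Counter

# Toy corpus and counts; everything here is an illustrative assumption.
corpus = "the cat sat on the mat the cat ate".split()
total = len(corpus)
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
trigrams = Counter(zip(corpus, corpus[1:], corpus[2:]))
# Context counts: how often each history is followed by some word.
bigram_ctx = Counter(corpus[:-1])
trigram_ctx = Counter(zip(corpus[:-2], corpus[1:-1]))

def ml(count, context_count):
    """Maximum-likelihood ratio; 0.0 when the context was never seen."""
    return count / context_count if context_count else 0.0

def interpolated(w, u, v, lambdas=(0.2, 0.3, 0.5)):
    """p(w | u, v) = l1 p(w) + l2 p(w | v) + l3 p(w | u, v)."""
    l1, l2, l3 = lambdas
    p1 = unigrams[w] / total
    p2 = ml(bigrams[(v, w)], bigram_ctx[v])
    p3 = ml(trigrams[(u, v, w)], trigram_ctx[(u, v)])
    return l1 * p1 + l2 * p2 + l3 * p3
```

Because the unigram term is positive for every in-vocabulary word, a word never seen after a given history still receives non-zero probability. When a history itself is unseen, the higher-order terms contribute zero and the mixture no longer sums to one, which a fuller implementation would handle by renormalizing or by making the weights depend on the history.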

Conditional mixture tri-grams

Bayesian Networks
• … and so on
• We need to be more formal about what BNs mean.
• In the rest of this lecture and in the next, we start with the basic semantics of BNs, move on to undirected models, and then come back to BNs again to clean up …

Bayesian Networks
• Has nothing to do with “Bayesian statistical models” (there are Bayesian and non-Bayesian networks).
[Figure panels: valid; invalid]

BN: Alarm Network Example
• Compact representation: factors of probabilities of children given parents (J. Pearl, 1988)
Sub-family specification: Directed acyclic graph (DAG)
• Nodes - random variables
• Edges - direct “influence”
[Figure nodes: Burglary, Earthquake, Alarm, Radio, Call]
[Table P(A | E, B) over the four settings of (e, b); values as extracted, row order not preserved: 0.1, 0.2, 0.8, 0.9, 0.1, 0.9, 0.01, 0.99]
Instantiation: Set of conditional probability distributions
Together (graph and inst.): Defines a unique distribution in a factored form
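
To make "graph plus instantiation defines a unique distribution" concrete, here is a minimal sketch. It assumes the usual textbook edges for this network (Earthquake and Burglary into Alarm, Earthquake into Radio, Alarm into Call); only P(A | E, B) is named on the slide, and every number below is a hypothetical placeholder, not a value from the slide's table.

```python
import itertools

# Assumed edges: E -> A, B -> A, E -> R, A -> C.
# All probabilities are hypothetical placeholders (probability of value 1).
p_e = 0.01
p_b = 0.02
p_a_given_eb = {(1, 1): 0.95, (1, 0): 0.3, (0, 1): 0.9, (0, 0): 0.01}
p_r_given_e = {1: 0.8, 0: 0.001}
p_c_given_a = {1: 0.7, 0: 0.05}

def bern(p_true, value):
    """Probability that a binary variable takes `value`."""
    return p_true if value == 1 else 1.0 - p_true

def joint(e, b, a, r, c):
    """Product of one factor per node: p(child | parents)."""
    return (bern(p_e, e) * bern(p_b, b) * bern(p_a_given_eb[(e, b)], a)
            * bern(p_r_given_e[e], r) * bern(p_c_given_a[a], c))

# The factored form defines a proper distribution: it sums to one.
total = sum(joint(*cfg) for cfg in itertools.product((0, 1), repeat=5))

def posterior_burglary_given_call():
    """p(B=1 | C=1), computed by brute-force summation."""
    num = sum(joint(e, 1, a, r, 1)
              for e, a, r in itertools.product((0, 1), repeat=3))
    den = sum(joint(e, b, a, r, 1)
              for e, b, a, r in itertools.product((0, 1), repeat=4))
    return num / den
```

Under these assumed edges the factored form needs 1 + 1 + 4 + 2 + 2 = 10 numbers, whereas an unconstrained joint over five binary variables needs 2^5 - 1 = 31, which is the sense in which the representation is compact.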

Bayesian Networks
• Reminder: chain rule of probability. For any order: [equation not preserved]
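
For reference, the chain rule can be written out as below. The permutation symbol \sigma and the parent-set notation \pi_i are assumptions introduced here; the second line is the standard Bayesian network factorization, in which each conditioning set is reduced to the node's parents.

```latex
% Chain rule, valid for any ordering \sigma of the N variables
p(x_1, \dots, x_N) = \prod_{i=1}^{N}
    p\!\left(x_{\sigma(i)} \,\middle|\, x_{\sigma(1)}, \dots, x_{\sigma(i-1)}\right)

% Bayesian network: each variable conditioned only on its parents \pi_i
p(x_1, \dots, x_N) = \prod_{i=1}^{N} p\!\left(x_i \,\middle|\, x_{\pi_i}\right)
```

The first identity holds for every distribution; the second is a constraint that only some distributions satisfy for a given graph.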

Bayesian Networks

Conditional Independence

Conditional Independence & BNs
• GMs can help answer CI queries:
[Figure nodes: A, B, C, D, E, X, Y, Z; answer shown for the pictured query: “No!!”]

Bayesian Networks & CI

Pictorial d-separation: blocked/unblocked paths
[Figure panels: Blocked Paths; Unblocked Paths]

Three Canonical Cases
• Three 3-node examples of BNs and their conditional independence statements.
[Figure nodes: V1, V2, V3]

Case 1
• Markov Chain

Case 2
• Still a Markov Chain

Case 3
• NOT a Markov Chain
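
The three cases are usually summarized by the factorizations below. The node names A, B, C (with B as the middle node) are generic labels assumed here, because the assignment of V1, V2, V3 to positions in each figure is not recoverable from the text.

```latex
% Case 1 (chain A -> B -> C):  A \perp C \mid B
p(a, b, c) = p(a)\, p(b \mid a)\, p(c \mid b)

% Case 2 (common parent A <- B -> C):  A \perp C \mid B
p(a, b, c) = p(b)\, p(a \mid b)\, p(c \mid b)

% Case 3 (common child A -> B <- C):  A \perp C, but in general not A \perp C \mid B
p(a, b, c) = p(a)\, p(c)\, p(b \mid a, c)
```

Cases 1 and 2 describe the same family of distributions, which is why the second is still a Markov chain; the third family is different.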

Examples of the three cases
• SUVs, Greenhouse Gases, Global Warming
• Lung Cancer, Smoking, Bad Breath
• Genetics, Cancer, Smoking

Bayes Ball
• Simple algorithm: a ball bouncing along a path in the graph tells whether the path is blocked or not (more details in text).
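
The sketch below answers the same blocked-or-not question, but it is not the Bayes ball procedure itself. It uses a different, equivalent textbook test: restrict the graph to the ancestors of the nodes involved, moralize it, delete the conditioning nodes, and check ordinary reachability. The function name and the `parents` dictionary format are assumptions made for illustration.

```python
from collections import deque

def d_separated(parents, xs, ys, zs):
    """Return True if xs and ys are d-separated given zs in a DAG.

    `parents` maps a node to a list of its parents (roots may be omitted).
    """
    xs, ys, zs = set(xs), set(ys), set(zs)
    # 1. Keep only the nodes involved and all of their ancestors.
    keep, stack = set(), list(xs | ys | zs)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, []))
    # 2. Moralize: connect each node to its parents, and co-parents to
    #    each other, dropping edge directions.
    nbrs = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, []))
        for p in ps:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                nbrs[p].add(q)
                nbrs[q].add(p)
    # 3. Breadth-first search from xs that never enters a conditioned node.
    seen, queue = set(xs - zs), deque(xs - zs)
    while queue:
        node = queue.popleft()
        if node in ys:
            return False
        for m in nbrs[node] - zs - seen:
            seen.add(m)
            queue.append(m)
    return True
```

On a three-node common-child graph such as `{"b": ["a", "c"]}`, the call `d_separated(g, {"a"}, {"c"}, set())` reports separation, while conditioning on `"b"` removes it, matching the third canonical case.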

What are implied conditional independences?

Two Views of a Family