Learning from Learning Curves using Learning Factors Analysis
Hao Cen, Kenneth Koedinger, Brian Junker
Human-Computer Interaction Institute, Carnegie Mellon University
Cen, H., Koedinger, K., Junker, B. Learning Factors Analysis - A General Method for Cognitive Model Evaluation and Improvement. The 8th International Conference on Intelligent Tutoring Systems. 2006.
Science, 26(2)
Cen, H., Koedinger, K., Junker, B. Is Over Practice Necessary? Improving Learning Efficiency with the Cognitive Tutor. The 13th International Conference on Artificial Intelligence in Education (AIED 2007). 2007.

Student Performance As They Practice with the LISP Tutor

Production Rule Analysis
Evidence for Production Rule as an appropriate unit of knowledge acquisition

Using learning curves to evaluate a cognitive model
• Lisp Tutor Model
  • Learning curves used to validate cognitive model
  • Fit better when organized by knowledge components (productions) rather than surface forms (programming language terms)
• But, curves not smooth for some production rules
  • “Blips” in learning curves indicate the knowledge representation may not be right
  • Corbett, Anderson, O’Brien (1995)
• Let me illustrate …

Curve for “Declare Parameter” production rule
What’s happening on the 6th & 10th opportunities?
• How are steps with blips different from others?
• What’s the unique feature or factor explaining these blips?

Can modify cognitive model using unique factor present at “blips”
• Blips occur when to-be-written program has 2 parameters
• Split Declare-Parameter by parameter-number factor:
  • Declare-first-parameter
  • Declare-second-parameter

Learning curve analysis by hand & eye …
• Steps in programming problems where the function (“method”) has two parameters (Corbett, Anderson, O’Brien, 1995)

Can learning curve analysis be automated?
• Learning curve analysis
  • Identify blips by hand & eye
  • Manually create a new model
  • Qualitative judgment
• Need to automatically:
  • Identify blips by system
  • Propose alternative cognitive models
  • Evaluate each model quantitatively

Overview
• Learning Factors Analysis algorithm
• A Geometry Cognitive Model and Log Data
• Experiments and Results

Learning Factors Analysis (LFA): A Tool for KC Analysis
• LFA is a method for discovering & evaluating alternative cognitive models
  • Finds knowledge component decomposition that best predicts student performance & learning transfer
• Inputs
  • Data: Student success on tasks in domain over time
  • Codes: Hypothesized factors that drive task difficulty
  • A mapping between these factors & domain tasks
• Outputs
  • A rank ordering of most predictive cognitive models
  • For each model, a measure of its generalizability & parameter estimates for knowledge component difficulty, learning rates, & student proficiency

Learning Factors Analysis (LFA) draws from multiple disciplines
• Machine Learning & AI
  • Combinatorial search (Russell & Norvig, 2003)
  • Exponential-family principal component analysis (Gordon, 2002)
• Psychometrics & Statistics
  • Q Matrix & Rule Space (Tatsuoka 1983, Barnes 2005)
  • Item response learning model (Draney, et al., 1995)
  • Item response assessment models (DiBello, et al., 1995; Embretson, 1997; von Davier, 2005)
• Cognitive Psychology
  • Learning curve analysis (Corbett, et al. 1995)

Steps in Learning Factors Analysis
We’ve talked about some of these steps 1-4 before …

LFA – 1. The Q Matrix
• How to represent relationship between knowledge components and student tasks?
  • Tasks also called items, questions, problems, or steps (in problems)
• Q-Matrix (Tatsuoka, 1983)

Item | KC    Add  Sub  Mul  Div
2*8          0    0    1    0
2*8 - 3      0    1    1    0

  • 2*8 is a single-KC item
  • 2*8 - 3 is a conjunctive-KC item, involves two KCs
• What good is a Q matrix? Used to predict student accuracy on items not previously seen, based on KCs involved
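
A minimal sketch of the Q matrix above as a plain Python structure. The names `KCS`, `Q`, `kcs_for_item`, and `is_conjunctive` are illustrative assumptions, not part of the original slides.

```python
# Q matrix from the slide: one row per item, one column per knowledge component (KC).
KCS = ["Add", "Sub", "Mul", "Div"]
Q = {
    "2*8":     [0, 0, 1, 0],
    "2*8 - 3": [0, 1, 1, 0],
}

def kcs_for_item(item):
    """Return the names of the KCs whose Q-matrix entry is 1 for this item."""
    return [kc for kc, flag in zip(KCS, Q[item]) if flag == 1]

def is_conjunctive(item):
    """An item is conjunctive when it involves more than one KC."""
    return sum(Q[item]) > 1

print(kcs_for_item("2*8"))        # single-KC item
print(kcs_for_item("2*8 - 3"))    # conjunctive-KC item
```

Any new item can be added as another row, as long as its KC flags use the same column order.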

LFA – 2. The Statistical Model
• Problem: How to predict student responses from model?
• Solutions: Additive Factor Model (Draney, et al. 1995, Cen, Koedinger, Junker, 2006)
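
The model equation is not present in this transcript. As a hedged sketch, the Additive Factor Model is commonly stated as a logistic regression in which the log-odds of a correct response equal the student proficiency plus, for each KC the item involves, the KC's easiness and its learning rate times the student's prior opportunities on that KC. The function name, argument names, and parameter values below are assumptions for illustration only.

```python
import math

def afm_probability(theta, q_row, beta, gamma, opportunities):
    """Probability of a correct response under an additive logistic model.

    theta          student proficiency (intercept)
    q_row          Q-matrix row for the item (0/1 per KC)
    beta           per-KC easiness (intercept)
    gamma          per-KC learning rate (slope)
    opportunities  prior practice count of this student on each KC
    """
    logit = theta + sum(
        q * (b + g * t)
        for q, b, g, t in zip(q_row, beta, gamma, opportunities)
    )
    return 1.0 / (1.0 + math.exp(-logit))

# Made-up parameter values for the four KCs Add, Sub, Mul, Div.
BETA = [0.5, -0.2, 0.1, -0.8]
GAMMA = [0.1, 0.2, 0.15, 0.1]
q_row = [0, 1, 1, 0]  # the item 2*8 - 3

early = afm_probability(0.0, q_row, BETA, GAMMA, [0, 0, 0, 0])
later = afm_probability(0.0, q_row, BETA, GAMMA, [0, 5, 5, 0])
print(early, later)
```

With positive learning rates, the predicted probability rises as the opportunity counts grow, which is the shape a learning curve is expected to have.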

LFA – 2. An alternative “conjunctive” model
• Conjunctive Factor Model (Cen, Koedinger, Junker, 2008)
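
The conjunctive equation is also missing from this transcript. One common reading, assumed here, is that the probability of a correct response is the product over the item's KCs of a per-KC logistic term, so a student must succeed on every KC involved. The sketch below follows that assumption; names and values are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cfm_probability(theta, q_row, beta, gamma, opportunities):
    """Product of per-KC success probabilities over the KCs the item involves."""
    p = 1.0
    for q, b, g, t in zip(q_row, beta, gamma, opportunities):
        if q:  # only KCs flagged in the Q-matrix row contribute
            p *= sigmoid(theta + b + g * t)
    return p

# Two KCs that are each answered correctly half the time give 0.25 jointly.
print(cfm_probability(0.0, [1, 1], [0.0, 0.0], [0.0, 0.0], [0, 0]))
```

Under this form an item with several KCs can never be easier than its hardest KC, which is the main behavioral difference from the additive model.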

LFA - 4. Model Evaluation
• How to compare cognitive models?
• A good model minimizes prediction risk by balancing fit with data & complexity (Wasserman 2005)
• Compare BIC for the cognitive models
• BIC is “Bayesian Information Criterion”
• BIC = -2*log-likelihood + numPar * log(numOb)
• Better (lower) BIC == better prediction of data that haven’t been seen
• Mimics cross validation, but is faster to compute

LFA – 5. Expert Labeling & P-Matrix
• Problem: How to find potential improvements to the existing cognitive model?
• Solution: Have experts look for difficulty factors that are candidates for new KCs. Put these in P matrix.

Q Matrix
Item | Skill   Add  Sub  Mul
2*8            0    0    1
2*8 - 3        0    1    1
2*8 - 30       0    1    1
3+2*8          1    0    1

P Matrix
Item | Skill   Deal with negative  Order of Ops
2*8            0                   0
2*8 - 3        0                   0
2*8 - 30       1                   0
3+2*8          0                   1
…

LFA – 5. Expert Labeling and P-Matrix
• Operators on Q and P
  • Q + P[,1]
  • Q[,2] * P[,1]

Q-Matrix after adding P[,1]
Item | Skill   Add  Sub  Mul  Div  neg
2*8            0    0    1    0    0
2*8 - 3        0    1    1    0    0
2*8 - 30       0    1    1    0    1

Q-Matrix after splitting P[,1], Q[,2]
Item | Skill   Add  Sub  Mul  Div  Subneg
2*8            0    0    1    0    0
2*8 - 3        0    1    1    0    0
2*8 - 30       0    0    1    0    1
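
The two operators can be sketched on plain lists. This is one possible reading of "add" and "split" that reproduces the two tables on this slide; the function names `add_factor` and `split_kc` are assumptions.

```python
def add_factor(q_rows, p_col):
    """Q + P[,j]: append the difficulty factor as a new KC column."""
    return [row + [p] for row, p in zip(q_rows, p_col)]

def split_kc(q_rows, kc_index, p_col):
    """Q[,k] * P[,j]: split KC k by the factor.

    Items that need KC k AND carry the factor move to a new last column;
    items that need KC k without the factor stay in column k.
    """
    result = []
    for row, p in zip(q_rows, p_col):
        with_factor = row[kc_index] * p
        new_row = list(row)
        new_row[kc_index] = row[kc_index] - with_factor
        result.append(new_row + [with_factor])
    return result

# Rows: 2*8, 2*8 - 3, 2*8 - 30; columns: Add, Sub, Mul, Div.
Q = [[0, 0, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]]
P_NEG = [0, 0, 1]  # first P-matrix column, "Deal with negative"

print(add_factor(Q, P_NEG))     # adds a "neg" column
print(split_kc(Q, 1, P_NEG))    # splits Sub into Sub and Subneg
```

Adding keeps the original KC and layers a new one on top, while splitting moves the affected items out of the original KC, so the two operators produce different models from the same factor.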

LFA – 6. Model Search
• Problem: How to find best model given P-matrix?
• Solution: Combinatorial search
  • A best-first search algorithm (Russell & Norvig 2002)
  • Guided by a heuristic, such as BIC
  • Start from an existing model

Combinatorial Search
• Goal: Do model selection within the logistic regression model space
• Steps:
  1. Start from an initial “node” in search graph
  2. Iteratively create new child nodes by splitting a model using covariates or “factors”
  3. Employ a heuristic (e.g. fit to learning curve) to rank each node
  4. Expand from a new node in the heuristic order by going back to step 2
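
The four steps above can be sketched as a generic best-first search over sets of splits. Everything concrete here is a toy assumption: the split names, the `TOY_GAIN` values, and `toy_score`, which stands in for fitting a logistic regression and computing a heuristic such as BIC (lower is better).

```python
import heapq
from itertools import count

def best_first_search(initial, expand, score, max_expansions=50):
    """Always expand the lowest-scoring node seen so far; return the best node."""
    tie = count()  # tie-breaker so the heap never compares nodes directly
    frontier = [(score(initial), next(tie), initial)]
    seen = {initial}
    best, best_score = initial, frontier[0][0]
    expansions = 0
    while frontier and expansions < max_expansions:
        node_score, _, node = heapq.heappop(frontier)
        expansions += 1
        if node_score < best_score:
            best, best_score = node, node_score
        for child in expand(node):  # step 2: create child models by one more split
            if child not in seen:
                seen.add(child)
                heapq.heappush(frontier, (score(child), next(tie), child))
    return best, best_score

# Toy problem: a model is the frozenset of splits applied to the starting model.
CANDIDATE_SPLITS = ["split-A", "split-B", "split-C"]
TOY_GAIN = {"split-A": 30.0, "split-B": 12.0, "split-C": 2.0}  # made-up fit gains
PENALTY_PER_SPLIT = 8.0                                        # made-up complexity cost

def expand(model):
    return [model | {s} for s in CANDIDATE_SPLITS if s not in model]

def toy_score(model):
    return 1000.0 - sum(TOY_GAIN[s] for s in model) + PENALTY_PER_SPLIT * len(model)

best, best_score = best_first_search(frozenset(), expand, toy_score)
print(sorted(best), best_score)
```

In this toy setup a split is worth keeping only when its fit gain exceeds the complexity penalty, which mirrors the fit-versus-complexity trade-off the heuristic is meant to capture.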

LFA – 6. Model Search
Automates the process of hypothesizing alternative KC models & testing them against data

Overview
• Learning Factors Analysis algorithm
• A Geometry Cognitive Model and Log Data
• Experiments and Results

Domain of current study
• Domain of study: the area unit of the geometry tutor
• Cognitive model: 15 skills
  1. Circle-area
  2. Circle-circumference
  3. Circle-diameter
  4. Circle-radius
  5. Compose-by-addition
  6. Compose-by-multiplication
  7. Parallelogram-area
  8. Parallelogram-side
  9. Pentagon-area
  10. Pentagon-side
  11. Trapezoid-area
  12. Trapezoid-base
  13. Trapezoid-height
  14. Triangle-area
  15. Triangle-side

Log Data -- Skills in the Base Model

Student  Step   Skill                Opportunity
A        p1s1   Circle-area          1
A        p2s1   Circle-area          2
A        p2s2   Rectangle-area       1
A        p2s3   Compose-by-addition  1
A        p3s1   Circle-area          3

The Split
• Binary Split -- splits a skill into a skill with a factor value & a skill without the factor value.

Student  Step   Skill                Opportunity  Factor-Embed
A        p1s1   Circle-area          1            alone
A        p2s1   Circle-area          2            embed
A        p2s2   Rectangle-area       1
A        p2s3   Compose-by-addition  1
A        p3s1   Circle-area          3            alone

After Splitting Circle-area by Embed
Student  Step   Skill                Opportunity
A        p1s1   Circle-area-alone    1
A        p2s1   Circle-area-embed    1
A        p2s2   Rectangle-area       1
A        p2s3   Compose-by-addition  1
A        p3s1   Circle-area-alone    2
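
A sketch of the split applied to the log rows on this slide: the skill is renamed by its factor value and the opportunity counts are recomputed per student and per new skill. The dictionary keys and the function name `split_skill` are assumptions for illustration.

```python
from collections import defaultdict

def split_skill(rows, skill, factor):
    """Rename `skill` by its factor value and recount opportunities in log order."""
    counts = defaultdict(int)
    result = []
    for row in rows:
        name = row["skill"]
        value = row.get(factor)
        if name == skill and value:
            name = f"{skill}-{value}"
        counts[(row["student"], name)] += 1
        result.append({
            "student": row["student"],
            "step": row["step"],
            "skill": name,
            "opportunity": counts[(row["student"], name)],
        })
    return result

LOG = [
    {"student": "A", "step": "p1s1", "skill": "Circle-area", "embed": "alone"},
    {"student": "A", "step": "p2s1", "skill": "Circle-area", "embed": "embed"},
    {"student": "A", "step": "p2s2", "skill": "Rectangle-area"},
    {"student": "A", "step": "p2s3", "skill": "Compose-by-addition"},
    {"student": "A", "step": "p3s1", "skill": "Circle-area", "embed": "alone"},
]

for row in split_skill(LOG, "Circle-area", "embed"):
    print(row["step"], row["skill"], row["opportunity"])
```

The recount matters because the opportunity number is the practice variable in the regression: after the split, the third Circle-area step becomes only the second opportunity for Circle-area-alone.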

The Heuristics
• Good model captures sufficient variation in data but is not overly complicated
  • balance between model fit & complexity, minimizing prediction risk (Wasserman 2005)
• AIC and BIC used as heuristics in the search
  • two estimators for prediction risk
  • balance between fit & parsimony
  • select models that fit well without being too complex
  • AIC = -2*log-likelihood + 2*number of parameters
  • BIC = -2*log-likelihood + number of parameters * log(number of observations)
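
Both criteria are one-line computations once a model's log-likelihood is known. The sketch below uses the standard definitions, with BIC penalizing each parameter by the log of the number of observations, as on the Model Evaluation slide. The input numbers are made up.

```python
import math

def aic(log_likelihood, num_parameters):
    """Akaike Information Criterion (lower is better)."""
    return -2.0 * log_likelihood + 2.0 * num_parameters

def bic(log_likelihood, num_parameters, num_observations):
    """Bayesian Information Criterion (lower is better)."""
    return -2.0 * log_likelihood + num_parameters * math.log(num_observations)

# Made-up example: two models fitted to the same 1000 observations.
simple = bic(log_likelihood=-520.0, num_parameters=10, num_observations=1000)
complex_ = bic(log_likelihood=-515.0, num_parameters=20, num_observations=1000)
print(simple, complex_)
```

With more than about 7 observations, log(n) exceeds 2, so BIC charges more per parameter than AIC and tends to prefer smaller models.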

System: Best-first Search
• an informed graph search algorithm guided by a heuristic
• Heuristics – AIC, BIC
• Start from an existing model

Overview
• Learning Factors Analysis algorithm
• A Geometry Cognitive Model and Log Data
• Experiments and Results

Experiment 1
• Q: How can we describe learning behavior in terms of an existing cognitive model?
• A: Fit logistic regression model in equation above (slide 27) & get coefficients

Experiment 1
• Results:
  • Higher intercept of skill -> easier skill
  • Higher slope of skill -> faster students learn it
  • Higher intercept of student -> student initially knew more

Skill               Intercept  Slope  Avg Opportunities  Initial Probability  Avg Probability  Final Probability
Parallelogram-area  2.14       -0.01  14.9               0.95                 0.94             0.93
Pentagon-area       -2.16      0.45   4.3                0.2                  0.63             0.84

Student   Intercept
student0  1.18
student1  0.82
student2  0.21

Model Statistics
AIC  3,950
BIC  4,285
MAD  0.083

The AIC, BIC & MAD statistics provide alternative ways to evaluate models
MAD = Mean Absolute Deviation
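
The slide only names MAD. A hedged sketch, assuming it is the mean absolute difference between each observed outcome (1 for correct, 0 for incorrect) and the model's predicted probability for that step:

```python
def mean_absolute_deviation(observed, predicted):
    """Average of |observed - predicted| over all steps."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Made-up outcomes and predictions for four steps.
print(mean_absolute_deviation([1, 0, 1, 1], [0.9, 0.2, 0.6, 0.7]))
```

Unlike AIC and BIC, this statistic has no complexity penalty, so it describes fit only.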

Experiment 2
• Q: How can we improve a cognitive model?
• A: Run LFA on data including factors & search through model space

Experiment 2 – Results with BIC

Model 1
  1. Binary split compose-by-multiplication by figurepart segment
  2. Binary split circle-radius by repeat
  3. Binary split compose-by-addition by backward

Model 2
  1. Binary split compose-by-multiplication by figurepart segment
  2. Binary split circle-radius by repeat
  3. Binary split compose-by-addition by figurepart area-difference

Model 3 (Number of Splits: 2)
  1. Binary split compose-by-multiplication by figurepart segment
  2. Binary split circle-radius by repeat

Number of Skills: 18   AIC: 3,888.67   BIC: 4,248.86   MAD: 0.071
Number of Skills: 17   AIC: 3,897.20   BIC: 4,251.07   MAD: 0.075

• Splitting Compose-by-multiplication into two skills – CMarea and CMsegment, making a distinction of the geometric quantity being multiplied

Experiment 3
• Q: Will some skills be better merged than if they are separate skills? Can LFA recover some elements of original model if we search from a merged model, given difficulty factors?
• A: Run LFA on the data of a merged model, and search through the model space

Experiment 3 – Merged Model
• Merge some skills in the original model to remove some distinctions, and add them as difficulty factors to consider
• The merged model has 8 skills:
  • Circle-area, Circle-radius => Circle
  • Circle-circumference, Circle-diameter => Circle-CD
  • Parallelogram-area and Parallelogram-side => Parallelogram
  • Pentagon-area, Pentagon-side => Pentagon
  • Trapezoid-area, Trapezoid-base, Trapezoid-height => Trapezoid
  • Triangle-area, Triangle-side => Triangle
  • Compose-by-addition
  • Compose-by-multiplication
• Add difficulty factor “direction”: forward vs. backward

Experiment 3 – Results

Model 1
  Number of Splits: 4
  Number of skills: 12
  Circle*area
  Circle*radius*initial
  Circle*radius*repeat
  Compose-by-addition*area-difference
  Compose-by-multiplication*area-combination
  Compose-by-multiplication*segment
  AIC: 3,884.95   BIC: 4,169.315   MAD: 0.075

Model 2
  Number of Splits: 3
  Number of skills: 11
  All skills are the same as those in model 1 except that
  1. Circle is split into Circle*backward*initial, Circle*backward*repeat, Circle*forward
  2. Compose-by-addition is not split
  AIC: 3,893.477   BIC: 4,171.523   MAD: 0.079

Model 3
  Number of Splits: 4
  Number of skills: 12
  All skills are the same as those in model 1 except that
  1. Circle is split into Circle*backward*initial, Circle*backward*repeat, Circle*forward
  2. Compose-by-addition is split into Compose-by-addition and Compose-by-addition*segment
  AIC: 3,887.42   BIC: 4,171.786   MAD: 0.077

Experiment 3 – Results
• Recovered three skills (Circle, Parallelogram, Triangle) => distinctions made in the original model are necessary
• Partially recovered two skills (Triangle, Trapezoid) => some original distinctions necessary, some are not
• Did not recover one skill (Circle-CD) => original distinction may not be necessary
• Recovered one skill (Pentagon) in a different way => Original distinction may not be as significant as distinction caused by another factor

Beyond Experiments 1-3
• Q: Can we use LFA to improve tutor curriculum by identifying over-taught or under-taught rules?
  • Thus adjust their contribution to curriculum length without compromising student performance
• A: Combine results from experiments 1-3

Beyond Experiments 1-3 -- Results
• Parallelogram-side is over-taught.
  • high intercept (2.06), low slope (-.01).
  • initial success probability .94, average number of practices per student is 15
• Trapezoid-height is under-taught.
  • low intercept (-1.55), positive slope (.27).
  • final success probability is .69, far away from the level of mastery; the average number of practices per student is 4.
• Suggestions for curriculum improvement
  • Reducing the amount of practice for Parallelogram-side should save student time without compromising their performance.
  • More practice on Trapezoid-height is needed for students to reach mastery.

Beyond Experiments 1-3 -- Results
• How about Compose-by-multiplication?

Skill  Intercept  Slope  Avg Practice Opportunities  Initial Probability  Avg Probability  Final Probability
CM     -.15       .1     10.2                        .65                  .84              .92

With final probability .92 students seem to have mastered Compose-by-multiplication.

Beyond Experiments 1-3 -- Results
• However, after split

Skill      Intercept  Slope  Avg Practice Opportunities  Initial Probability  Avg Probability  Final Probability
CM         -.15       .1     10.2                        .65                  .84              .92
CMarea     -.009      .17    9                           .64                  .86              .96
CMsegment  -1.42      .48    1.9                         .32                  .54              .60

CMarea does well with final probability .96
But CMsegment has final probability only .60 and an average amount of practice less than 2
Suggestions for curriculum improvement: increase the amount of practice for CMsegment
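
A small sketch of how the table above could be screened for under-practiced skills. The 0.90 mastery threshold is a hypothetical value chosen for illustration; the slides do not state one. The class and function names are likewise assumptions.

```python
from dataclasses import dataclass

@dataclass
class SkillSummary:
    name: str
    intercept: float
    slope: float
    avg_opportunities: float
    initial_p: float
    avg_p: float
    final_p: float

MASTERY = 0.90  # hypothetical threshold, not taken from the slides

# Rows copied from the table on this slide.
ROWS = [
    SkillSummary("CM", -0.15, 0.10, 10.2, 0.65, 0.84, 0.92),
    SkillSummary("CMarea", -0.009, 0.17, 9.0, 0.64, 0.86, 0.96),
    SkillSummary("CMsegment", -1.42, 0.48, 1.9, 0.32, 0.54, 0.60),
]

def needs_more_practice(skill, mastery=MASTERY):
    """Flag a skill whose final success probability is below the threshold."""
    return skill.final_p < mastery

flagged = [s.name for s in ROWS if needs_more_practice(s)]
print(flagged)
```

The merged CM row passes the check while CMsegment does not, which is the point of the slide: the split exposes a weak sub-skill that the combined skill hides.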

Conclusions and Future Work
• Learning Factors Analysis combines statistics, human expertise, & combinatorial search to evaluate & improve a cognitive model
• System able to evaluate a model in seconds & search 100s of models in 4-5 hours
  • Model statistics are meaningful
  • Improved models are interpretable & suggest tutor improvement
• Planning to use LFA for datasets from other tutors to test potential for model & tutor improvement

END