- Number of slides: 67
The Case for a Data Mining Approach to Technical Analysis
If I’m so smart, how come I’m not rich yet?
The Case for Data Mining
[Images: “You Before Finance 9790” and “You After Finance 9790”]
1. TA Is a Multivariate Recurrent Prediction Problem
2. The Five Tasks of a Recurrent Prediction Problem
   1) Define the Target (Y)
   2) Propose a List of Candidate Predictors (X’s)
   3) Build a Data Base of Solved Examples
   4) Select the X’s
   5) Determine the Prediction Function
3. Humans & Computers Have Complementary Information Processing Abilities
   • Humans Are Uniquely Able to Handle Tasks 1, 2 & 3 but Are Poor at Tasks 4 & 5
   • Data Mining Algorithms Are Optimal for Tasks 4 & 5
4. TA Practitioners Should Partner Up With Data Mining Algorithms
5. TA Practitioners Should Abandon Outdated Methods & Focus on Their Proper Role in a Human / Machine Partnership
[Diagram labels: Data Bases; Data Mining Practitioner; Data Mining Software]

There Are Two Kinds of Prediction Problems
1. Regression: predicting the FUTURE value of a continuous variable
2. Classification: predicting the class of an object (situation)

In Both Regression & Classification The target variable concerns something that is not yet known!!
In Both Regression & Classification We use information that is known To make the prediction
Two Kinds of Prediction Problems
1. Regression: we wish to predict the FUTURE value of a continuous variable
• This variable is referred to as the dependent variable, the target variable, or Y
• The target variable in a regression problem is a continuous variable
  – It can assume any value within a range
  – Example: the % change in the S&P 500 from now (t0) to a point in time 90 days into the future (t+90)

Two Kinds of Prediction Problems
2. Classification: we wish to predict the class of an object whose class is not yet known
• The target variable in a classification problem is a discrete variable
  – It assumes a limited number of discrete values or names: (0, 1), (+1, 0, -1), (benign / malignant)
  – Example 1: the future class of a company with respect to solvency (bankrupt / non-bankrupt)
  – Example 2: the future trend of the market over the next 90 days (up / down)

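A minimal Python sketch of the two target types, using the market examples above. The function names, the toy price list, and the 3-step horizon are illustrative assumptions, not part of the original slides. A return of exactly zero is arbitrarily labeled "down" here.

```python
def forward_return_pct(prices, t0, horizon):
    """Regression target: % change from t0 to t0 + horizon."""
    return (prices[t0 + horizon] / prices[t0] - 1.0) * 100.0


def direction_class(prices, t0, horizon):
    """Classification target: 'up' or 'down' over the horizon.

    Assumption: a return of exactly zero is labeled 'down'.
    """
    return "up" if forward_return_pct(prices, t0, horizon) > 0 else "down"


# Toy closing prices (made-up numbers, for illustration only).
prices = [100.0, 102.0, 101.0, 105.0, 103.0, 99.0]

y_regression = forward_return_pct(prices, 0, 3)  # continuous Y
y_class = direction_class(prices, 0, 3)          # discrete Y
print(y_regression, y_class)
```

The same price history yields either kind of target; the choice depends on how the problem is posed, not on the data.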
What Is a Recurrent Multivariate Prediction Problem?
1. The same type of prediction is required over and over again.
2. The same set of information is available each time a prediction is required.
• The information is a set of values for each of a multitude of variables
• These variables are referred to as “independent variables,” “predictors,” “candidate predictors,” “indicators,” etc.

Examples of Recurrent Decision Problems: Classification
Does the object belong to Class 1 or Class 2?
1. The same type of prediction is required over and over again.
– Medicine: Is a given tumor malignant or benign?
– Oil Exploration: At a given location, is there oil or no oil? (Drill / Don’t Drill)
– Marketing: Is a given consumer a likely buyer or non-buyer of our product or service?
– Credit Approval: Is a given loan applicant likely to repay or default? (Lend / Don’t Lend)
– Technical Analysis: Is the market more likely to advance or decline? (Buy / Sell)

Examples of Recurrent Decision Problems: Regression
The future value of a continuous Y variable
1. The same type of prediction is required over and over again.
– Medicine: survival time for someone with disease X
– Oil Exploration: amount of oil a new well is likely to produce
– Marketing: What are the likely sales of a product?
– Technical Analysis:
  • How much will the S&P 500 appreciate over the next month?
  • By how much will stock A beat the market over the next month?

Recurrent Decision Problem 2. The same set of information is available each time a decision is required • Information is a set of values for a multitude of variables
Multivariate Information Set
Measured values for a multitude of variables
• Medicine: set of results on medical tests
  – Blood pressure, cholesterol level, blood sugar, etc.
• Oil Exploration: set of values for various geological parameters
• Marketing: set of demographic factors describing the person
  – Zip code, owns car yes/no, etc.
• Credit Approval: set of credit factors describing the loan applicant
  – # years at current address, number of credit cards, payment history

Technical Analysis Information Set
A multitude of indicator readings at a given point in time
1. Close / moving average = 1.075
2. 10 day ma / 50 day ma = 1.067
3. RSI indicator = 74
4. 5 day ma volume / 25 day ma volume
5. VIX (implied volatility on stock options)
6. Ratio of insider sales / purchases
7. Ratio of upside / downside volume

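As a rough illustration of how ratio-style readings like the first, second, and fourth items might be computed, here is a small Python sketch. The helper names, the 2-day and 5-day windows, and the price and volume numbers are all made up for the example; the slides do not specify an implementation.

```python
def moving_average(series, t0, n):
    """Mean of the n most recent values up to and including index t0."""
    if t0 + 1 < n:
        raise ValueError("not enough history for this window")
    window = series[t0 - n + 1 : t0 + 1]
    return sum(window) / n


def close_to_ma(closes, t0, n):
    """Indicator: latest close divided by its n-day moving average."""
    return closes[t0] / moving_average(closes, t0, n)


def ma_ratio(series, t0, fast, slow):
    """Indicator: fast moving average divided by slow moving average."""
    return moving_average(series, t0, fast) / moving_average(series, t0, slow)


# Toy data (illustrative only).
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
volumes = [500.0, 400.0, 450.0, 700.0, 900.0]
t0 = len(closes) - 1  # the most recent day

information_set = {
    "close / 5 day ma": close_to_ma(closes, t0, 5),
    "2 day ma / 5 day ma": ma_ratio(closes, t0, 2, 5),
    "2 day ma volume / 5 day ma volume": ma_ratio(volumes, t0, 2, 5),
}
print(information_set)
```

Each function reads only values at or before t0, so the resulting information set is available on the date the prediction is made.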
[Chart: two points in time, each characterized by its indicator values: (75.5, -2.1, -0.55) and (62.1, +0.1, -0.02)]
In other words: there are 3 candidate predictor variables.

We Can Treat This as a Classification Problem
Class 1: Market return over the next 20 days is > 0
Class 2: Market return over the next 20 days is < 0
The target variable, the thing we wish to predict, is a discrete variable that can assume 2 values: > 0 or < 0 (we can call these Class 1 and Class 2).

[Chart: points in time t0, each characterized by its indicator values: (75.5, -2.1, -0.55) and (62.1, +0.1, -0.02); the target is measured from t0 to t+20]
Do these predictors (indicators) enable us to classify (discriminate) future up-moves from future down-moves, Class 1 from Class 2?

Getting Matters of Time Straight: t0 and t+20
• t0 refers to the date on which the prediction or classification is made
  – This is the date of the most recent values of the predictor variables
• t+20 or t+n refers to a time in the future that the target variable (Y) refers to
  – In the bankruptcy prediction problem it is any time over the following two years
  – So the forward-looking horizon of the target need not be a fixed date

Value of Y Is Based on Future Information; Values of the X’s Are Based on Past and Current Information
[Timeline: Past | t0 | Future]
• Values of the predictors (X) are based on what happens from t-n until t0
• Value of the target (Y) is based on what happens from t0 until t+n

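The time alignment above can be sketched in Python as a simple split of a series around t0. The function name, the window lengths, and the stand-in series are assumptions made for illustration. Note that a return-style Y would also use the t0 value itself as its starting point.

```python
def split_at_t0(series, t0, lookback, horizon):
    """Return (past, future) windows around the prediction date t0.

    past   : values from t0 - lookback + 1 through t0 (usable for the X's)
    future : values from t0 + 1 through t0 + horizon (usable only for Y)
    """
    if t0 - lookback + 1 < 0 or t0 + horizon >= len(series):
        raise ValueError("window falls outside the available history")
    past = series[t0 - lookback + 1 : t0 + 1]
    future = series[t0 + 1 : t0 + horizon + 1]
    return past, future


# Stand-in for a daily series; each value equals its own day index.
series = list(range(10))
past, future = split_at_t0(series, t0=5, lookback=3, horizon=2)
print(past, future)
```

Keeping the two windows disjoint is what prevents future information from leaking into the predictors.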
Task 1: Define the Target Variable (Y)
The single variable we wish to predict
1. Define the type of the problem: classification or regression
A. Classification (discrimination): Y is defined as one of 2 or more distinct classes
• Benign / Malignant
• Lend / Don’t Lend
• Buy / Sell
• Strong Buy / Weak Buy / Weak Sell / Strong Sell
B. Regression: Y is a continuous quantity (e.g., linear regression)
• Future % increase in the market
• Predicted amount of future purchases

Task 2: Propose Candidate Predictors (X’s)
• These are merely candidates because we don’t yet know if any will be useful for predicting the target Y
• Predictors must be based on data known at the time the prediction is made: look back in time from the present
  – Tomorrow’s closing price: No
  – Today’s closing price or prior closing prices: Yes
• Not all indicators need to be useful, but some must be
  – Success in predictive modeling requires that some candidate predictors have useful information about the quantity or class to be predicted (Y)

Task 2 Is Crucial! If Not Done Well, All Is Lost
1. This is the task of the domain expert (YOU)
2. The expert must know which raw data series may contain relevant information
   1. Price
   2. Volume
   3. Open interest
   4. Interest rates, etc.
3. The expert proposes useful ways to transform raw series into indicators
   – Transformations that expose the information in the raw data series to the data mining algorithm
   – For example, in our problem the X’s must be stationary

Skipping Task 3 for a Moment
Task 3: Building the data base of solved examples from which the DM algorithm learns the model

Tasks 4 & 5
4. Selecting indicators from the candidate list that warrant a place in the prediction model
• Determining which candidates contain relevant, non-redundant information about Y
• Finding the set of indicators that work synergistically
5. Determining the prediction function
• What is the mathematical or logical formula for combining the values of the X’s to best estimate the value of Y?
• A complex configural reasoning problem

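To make "relevant" and "non-redundant" concrete, here is a deliberately simple Python sketch of a greedy correlation filter. This is a swapped-in illustration, not the selection method of any particular data mining package or of the slides. The thresholds, names, and data are assumptions. Because it looks only at pairwise linear correlation, it cannot detect indicators that are useful only in combination.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def select_indicators(candidates, y, min_relevance=0.3, max_redundancy=0.9):
    """Greedy filter: keep X's correlated with Y but not with each other."""
    ranked = sorted(candidates, key=lambda name: -abs(pearson(candidates[name], y)))
    selected = []
    for name in ranked:
        if abs(pearson(candidates[name], y)) < min_relevance:
            continue  # irrelevant: little linear information about Y
        if any(abs(pearson(candidates[name], candidates[s])) > max_redundancy
               for s in selected):
            continue  # redundant: nearly duplicates an already selected X
        selected.append(name)
    return selected


# Toy data (made up for illustration).
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
candidates = {
    "X1": [1, 2, 3, 4, 5, 6],     # relevant
    "X2": [2, 4, 6, 8, 10, 13],   # relevant, but nearly a copy of X1
    "X3": [1, -1, 1, -1, 1, -1],  # only weakly related to Y
}
selected = select_indicators(candidates, y)
print(selected)
```

In this toy case only X1 survives: X2 is dropped as redundant with X1, and X3 falls below the relevance threshold.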
What Is a Prediction Function?
• A mathematical or logical formula for combining the selected indicators to produce a best estimate of the target variable
• Simplest case: a 1-predictor model with a linear shape: Y = a X1 + b
  – b is the value of the Y intercept of the line
  – a is the slope of the line
[Figure: straight line plotted with Y on the vertical axis and X1 on the horizontal axis]

Simplest Prediction Model
1 predictor & flat (no hills or valleys) in the model’s surface
[Figure: straight line with Y intercept = b; for this value of X1, the model predicts this value of Y]

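A short Python sketch of the one-predictor linear model, with the slope a and intercept b found by ordinary least squares. The function names and the toy data are assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for Y = a * X1 + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def predict(a, b, x1):
    """The model's estimate of Y for a given value of X1."""
    return a * x1 + b


# Toy solved examples lying exactly on the line Y = 2 * X1 + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

a, b = fit_line(xs, ys)
print(a, b, predict(a, b, 1.5))
```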
Multiple Linear Regression
Combines two or more X’s in a linear way to predict the value of Y
• In multiple linear regression the combining function is assumed to be linear (a weighted sum)
• Y = a1 X1 + a2 X2 + a3 X3 + ... + an Xn + c
• Regression coefficients (weights) are found by the method of least squares
Modern data miners need not assume a linear form. They allow the data mining algorithm to discover it. It may be non-linear & arbitrarily complex.

Linear Model: Flat Response (Y) Surface
Y is a linear function of two features, X1 and X2
[Figure: tilted plane over the X1-X2 plane; “A” is the slope along X1, “B” is the slope along X2, “C” is the Y intercept]
Y = A X1 + B X2 + C

Linear Model Is the Best-Fitting Tilted Flat Surface to the Data
[Figure: tilted plane over the X1-X2 plane; “A” is the slope along X1, “B” is the slope along X2, “C” is the Y intercept]
Y = A X1 + B X2 + C

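The plane Y = A X1 + B X2 + C can be fitted by least squares in a few lines of dependency-free Python. The sketch below solves the normal equations with a small Gauss-Jordan routine. The helper names and the toy data are assumptions, and a production system would normally rely on a numerical library instead.

```python
def solve(matrix, rhs):
    """Solve matrix * w = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_plane(rows, ys):
    """Least-squares weights [a1, ..., an, c] for Y = a1*X1 + ... + an*Xn + c."""
    design = [list(r) + [1.0] for r in rows]  # trailing 1.0 carries the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Toy solved examples generated from Y = 2*X1 - 3*X2 + 0.5 (no noise).
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
ys = [2.0 * x1 - 3.0 * x2 + 0.5 for x1, x2 in rows]

A, B, C = fit_plane(rows, ys)
print(A, B, C)
```

With noise-free data the fit recovers the generating weights; with real market data the weights would only approximate the best-fitting tilted plane.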
The Model’s Prediction Is the Altitude of the Y Surface Corresponding to the Values of X1 and X2
[Figure: given this value of X1 and given this value of X2, the model predicts this value of Y]

Thinking of A Prediction Model’s Output As A Super Indicator A new indicator that condenses & combines the information In two or more indicators (variables) Into a new or super indicator
Model Output as a “Super Indicator”
• The output of a prediction model is a new variable, produced by a function found by regression analysis
• The function is a weighted sum of the indicators serving as inputs to the model (X1, X2, etc.)
• The function’s weights have been optimized to transform values of the inputs into a best estimate of the target (Y)
  – The method of least squares is used to find the optimal weights
  – The weights cause the line or plane to fit the historical data

Multiple Linear Regression
Combines two or more X’s in a linear way to predict the value of Y
• In multiple linear regression the combining function is assumed to be linear (additive)
• Y = a1 X1 + a2 X2 + a3 X3 + ... + an Xn + c
But what if the true shape of the relationship between the indicators (X1 ... Xn) and Y is not a tilted flat surface, but something more complex?

Multiple Linear Regression
Combines two or more X’s in a linear way to predict the value of Y
• In multiple linear regression the combining function is assumed to be linear (additive)
• Y = a1 X1 + a2 X2 + a3 X3 + ... + an Xn + c
Modern data miners do not assume the model surface is linear (free of hills and valleys). They allow the data mining algorithm to discover its shape, which may be non-linear.

Suppose the Authentic Relationship Between X1 & X2 and Y Looks Like This
[Figure: response surface Y over X1 and X2]
Y = f(X1, X2)

Forcing a Linear Model to Describe a Non-Linear Phenomenon Misses the Boat!
It fails to capture the authentic patterns in the data
[Figure: axes Y (future trend) and X1, X2 (TA indicators); the linear model’s predictions are too low in one region and too high in another]
Financial markets are most likely to be complex non-linear systems

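A tiny Python illustration of the "too low / too high" effect, under the assumption that the authentic relationship is a parabola, Y = X1 squared. The choice of curve and the data are made up for the example. The least-squares line comes out flat, so it under-predicts at the extremes and over-predicts in the middle.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for Y = a * X1 + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Assumed non-linear "authentic" relationship: Y = X1 ** 2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]

a, b = fit_line(xs, ys)

# Prediction minus actual: negative means the linear model is too low.
errors = [(a * x + b) - y for x, y in zip(xs, ys)]
print(a, b, errors)
```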
Tasks 4 & 5 Must Be Performed by Data Mining Software
[Diagram: candidate predictors X1, X2, X3, X4, X5, X6 ... Xn (a set of indicators proposed by the human expert) feed a complex combining function Y = f(x), which produces the system outcome Y to predict]
Task 4: Which, if any, of the candidate predictors contain information relevant to Y?
Task 5: What is the shape of the mathematical function that best combines the indicators into a predicted value of Y?
Note!! When the DM method used is multiple linear regression, the prediction function is assumed to be linear.

Human Experts & Data Mining Algorithms Have Different but Complementary Information Processing Abilities
They synergize:
• Where humans are strong, DM algorithms are weak
• Where human experts are weak, DM algorithms are strong

Definition: Configural Thinking
A multitude of variables (indicators) must be considered simultaneously as an inseparable configuration (pattern). Considering each variable individually will not provide the correct conclusion.

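A toy Python example of a configural pattern, constructed purely for illustration: the outcome is 1 only when two indicators agree in sign. No rule based on a single indicator does better than a coin flip, while a rule that considers both together is always right. The cases and rule names are assumptions, not data from the slides.

```python
# Each case is (X1, X2, Y); Y = 1 when X1 and X2 have the same sign.
cases = [(1, 1, 1), (1, -1, 0), (-1, 1, 0), (-1, -1, 1)]


def accuracy(rule):
    """Fraction of cases where rule(x1, x2) matches the known Y."""
    hits = sum(1 for x1, x2, y in cases if rule(x1, x2) == y)
    return hits / len(cases)


# Rules that look at one indicator at a time.
single_rules = {
    "X1 > 0": lambda x1, x2: int(x1 > 0),
    "X1 < 0": lambda x1, x2: int(x1 < 0),
    "X2 > 0": lambda x1, x2: int(x2 > 0),
    "X2 < 0": lambda x1, x2: int(x2 < 0),
}
best_single = max(accuracy(rule) for rule in single_rules.values())

# A configural rule that considers both indicators as one pattern.
joint = accuracy(lambda x1, x2: int(x1 * x2 > 0))

print(best_single, joint)
```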
Human Intelligence
Strengths:
• Creative
  – Posing problems (Y)
  – Proposing candidate indicators (X’s)
Weaknesses:
• Weak configural reasoning
  – Distinguishing relevant from irrelevant X’s
  – Combining multiple variables

Machine Intelligence (Data Mining)
Weaknesses:
• Lacks creativity
  – Unable to pose questions (define Y)
  – Unable to propose candidate indicators (define X’s)
Strengths:
• Excellent ability to handle numerous variables simultaneously (configural reasoning)
  – Can identify relevant, non-redundant indicators
  – Can formulate multivariate prediction functions

Who or What Should Handle the 5 Tasks?
1. Define Y
2. Propose candidate indicators (X’s)
3. Build a data base of solved cases
4. Indicator selection: which candidate X’s are relevant and non-redundant
5. Determining the optimal combining function: a mathematical model that combines useful X’s into a prediction or classification decision
Tasks 4 & 5: a task for automated data mining algorithms

The Evidence
Studies of Human Experts Solving Multivariate Recurrent Prediction Problems Show...
1. Experts realize the necessity for configural reasoning (combining variables in a complex non-linear fashion)
2. Experts are under the impression that they are combining information in a complex configural manner, but studies show...
3. Experts rely primarily on simple linear rules for combining information
4. Their performance is poor
– Inconsistent: the same set of information elicits different decisions on different occasions (correlation 0.6)
– Correlation among experts is also low

Technical Analyst Faced With a Large Set of Conflicting Indicators
[Figure: indicators sorted into Bullish and Bearish groups: 5 bullish factors, 3 bearish factors]
Let each bullish factor = +1 & each bearish factor = -1
Sum of bullish factors = +5 & sum of bearish factors = -3

Human Experts (Technical Analysts) Rely on Intuitive Linear Combining
Sum of bullish factors = +5 & sum of bearish factors = -3
+5 - 3 = +2: “I’m bullish”

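The tally above is itself a linear model in which every indicator gets the same weight. A minimal Python sketch, where the function name and the "neutral" label for a zero score are assumptions added for completeness:

```python
def net_score(factors):
    """Equal-weight linear combination: +1 per bullish, -1 per bearish factor."""
    return sum(+1 if f == "bullish" else -1 for f in factors)


factors = ["bullish"] * 5 + ["bearish"] * 3
score = net_score(factors)

# Assumption: a zero score is called "neutral".
stance = "bullish" if score > 0 else "bearish" if score < 0 else "neutral"
print(score, stance)
```

Nothing in this rule lets one indicator change the meaning of another, which is exactly what configural reasoning would require.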
Comparing the Subjective Predictions of Experts With Multiple Linear Regression Models
Studies began in 1954
The question: How accurate are the predictions of humans compared to multiple linear regression models given the same set of indicators?

Expert’s Subjective Predictions vs. Multiple Regression Models
[Bar chart: r² of predicted vs. actual, expert vs. model, for studies 1 to 9. Legible category labels: Sales Effective., Academic, Stocks, Cancer survival, Student Att., Mental ill., Teach. effective, Business Failure]
Model mean r² = 0.38
Expert mean r² = 0.11

Meta-Analysis of 135 Similar Studies
Draws a conclusion from multiple independent studies
[Diagram: Study 1, Study 2, Study 3 ... Study n]

Swets, Monahan & Dawes 2000 • meta-analysis of >135 studies comparing 3 decision making methods. 1. Expert / intuitive (subjective) judgment based on anecdotal experience & informal reasoning. 2. Statistical models. 3. Combination of methods #1 & #2.
Wide Variety of Disciplines Were Examined in the 135 Studies
• Fields
  – Medical diagnosis
  – Penology (parole recidivism, violence)
  – Psychology (diagnosis and treatment selection)
  – Education (predicting success in academics)
  – Predicting football game outcomes
• Results were quite consistent across fields

Results of Meta-Analysis: 135 Studies
• In 96% of the studies, regression models beat or were equal to expert judgment
• In medical diagnosis, expert judgment was always worse than the regression model
• Experts beat statistical models in only 6 studies
The question: With all this evidence, why do experts insist on making subjective / intuitive predictions & decisions?

Bottom Line for Technical Analysis
Aronson’s editorial opinion: When making predictions, rely on objective statistical models, not subjective judgment

Task #3: Build the Data Base of Solved Examples
The data (experience) base is used by the data mining algorithms to learn how to build the prediction model
This task often takes 90-95% of the time when developing a data-mined model

Data Base of Solved Examples: Known Values of “Y”
• What is a “solved example”? A case (situation, example, etc.) for which the value of the target variable is known, as well as the values of the X’s (candidate predictors)
  – The value of Y is known because the case happened in the past
  – Even though Y is forward looking, the case occurred long enough ago that the value of Y is known
• Each case in the data base is described by 2 kinds of information
  1. The value of the target variable Y
  2. The values of the candidate predictors

Examples of a Solved Case
A. 1 day of market history for the S&P 500
  1. Y value: % change over the month following the date of the case (regression)
  2. X values: values of the indicators on the date of the case
B. An oil drilling site
  1. Y value: did the site produce oil or not (class)
  2. X values: values of 10 geophysical parameters characterizing the site
C. 1 company
  1. Y value: company failed or did not fail within the next 2 years
  2. X values: values of various financial ratios taken from the most recent balance sheet and income statement

Data Base of Solved Examples
• Contains many cases (typically thousands)
  – Why so many? Data density.
• From the many cases the DM algorithm tries to discover
  – Which, if any, of the candidate predictors can solve the regression or classification problem (Task #4)
  – How the selected predictors should be combined mathematically or logically to give the most accurate estimate possible of the value of the target (Y) (Task #5)

Matrix of Examples With Known Values of Both X’s & Y
[Table: one row per solved example (Examp. 1, Examp. 2, Examp. 3, Examp. 4 ... Examp. N); columns are the candidate indicators X1, X2, X3 ... X11 ... XN plus the target Y. Indicator cells hold readings such as 1.2, -2.5, -5.1; the Y column holds 0 or 1]

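A small Python sketch of how such a matrix of solved examples might be assembled from a single price series. The two indicators, the window lengths, and the prices are assumptions picked for the example. Only dates with enough history for the X's and enough elapsed future for Y become cases.

```python
def build_examples(closes, lookback, horizon):
    """Return one row per date t0 where both the X's and Y are known."""
    rows = []
    # Start once a full lookback window exists; stop while Y is still known.
    for t0 in range(lookback - 1, len(closes) - horizon):
        window = closes[t0 - lookback + 1 : t0 + 1]
        x1 = closes[t0] / (sum(window) / lookback)     # close / moving average
        x2 = closes[t0] / closes[t0 - lookback + 1] - 1.0  # return over the window
        y = 1 if closes[t0 + horizon] > closes[t0] else 0  # 1 = up, 0 = not up
        rows.append({"t0": t0, "X1": x1, "X2": x2, "Y": y})
    return rows


# Toy closing prices (made up for illustration).
closes = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0, 12.0]
examples = build_examples(closes, lookback=3, horizon=2)

for row in examples:
    print(row)
```

The most recent dates drop out automatically because their Y values lie in the future and are not yet known.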
Human Intelligence: Unchanging
Computer Power & Machine Intelligence: Growing Exponentially
[Chart: power (arithmetic scale) vs. time]
Moore’s Law: An Increasing Competitive Advantage to the Data Miners

The A, B, C’s of Being an Intelligent Technical Analyst
A. Know how to use data mining tools
B. Know how to define data mining problems (define Y)
C. Know how to define a list of information-rich candidate predictors (X’s)


