  • Number of slides: 61

Mean and Variance

Distribution?

Statistics
distribution of a sample: (sample) statistic
population distribution: (population) parameter

Population distribution
X          %freq
Head  1    0.5
Tail  0    0.5
Total      1.0

Distribution of a sample
X          %freq
Head  1    0.35
Tail  0    0.65
Total      1.0

X          freq   %freq
Head  1    20     0.4
Tail  0    30     0.6
Total      50     1.0

Y      freq   %freq      Y      %freq
1      10     0.1        1      1/6
2      20     0.2        2      1/6
3      10     0.1        3      1/6
4      20     0.2        4      1/6
5      20     0.2        5      1/6
6      20     0.2        6      1/6
Total  100    1.0        Total  1.0

A new variable X from mseg of credit card data
mseg               X
Low Spender        1
Med Low Spender    2
Average Spender    3
Med High Spender   4
High Spender       5

Variable X of credit card data?
X      freq   %freq      X      %freq
1      26     0.26       1      ?
2      20     0.20       2      ?
3      11     0.11       3      ?
4      25     0.25       4      ?
5      18     0.18       5      ?
Total  100    1.00       Total  1.00

Measures of location (center): mean, median, mode, (truncated, winsorized) mean

Mean

Median

50% Median 50%

Mode

Hit/Stop Burst

Dealer's hidden card?

1, 11    2-9    10

Outlier

5 4 6 6 Truncated mean / Winsorized mean

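Truncated mean / Winsorized mean
5 4 6 6
Sorted data: 1 4 5 6 6 9
Truncated (smallest and largest values dropped): 4 5 6 6
Winsorized (smallest and largest values replaced by their nearest neighbors): 4 4 5 6 6 6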
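The two adjustments on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming one value is trimmed or replaced at each end (k = 1), as the slide's numbers suggest; the function names and the parameter k are hypothetical and are not part of the slides.

```python
def truncated_mean(values, k=1):
    """Mean after dropping the k smallest and k largest values (assumes 2*k < len(values))."""
    s = sorted(values)
    kept = s[k:len(s) - k]
    return sum(kept) / len(kept)


def winsorized_mean(values, k=1):
    """Mean after replacing the k smallest/largest values by the nearest kept value."""
    s = sorted(values)
    kept = s[k:len(s) - k]
    replaced = [kept[0]] * k + kept + [kept[-1]] * k
    return sum(replaced) / len(replaced)


data = [1, 4, 5, 6, 6, 9]                 # values from the slide
plain_mean = sum(data) / len(data)        # ordinary mean, 31/6
trimmed = truncated_mean(data)            # mean of 4 5 6 6
winsorized = winsorized_mean(data)        # mean of 4 4 5 6 6 6
print(plain_mean, trimmed, winsorized)
```

For this particular data set the winsorized mean happens to equal the ordinary mean, because pulling 1 up to 4 and 9 down to 6 changes the total by +3 and -3.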

Quartiles
Q1: 25th percentile (25% below, 75% above)
Q2: 50th percentile, the median (50% below, 50% above)
Q3: 75th percentile (75% below, 25% above)
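As a rough illustration of quartiles, the sketch below uses `statistics.quantiles` from the Python standard library (Python 3.8 or later) on the six values from the truncated-mean slide. The choice of `method="inclusive"` is an assumption; textbooks and software use several slightly different quartile conventions, and the slides do not say which one is intended.

```python
import statistics

data = [1, 4, 5, 6, 6, 9]  # values reused from the truncated-mean slide

# n=4 splits the data into four parts and returns the three cut points Q1, Q2, Q3.
# method="inclusive" treats the data as the whole population (an assumption here).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

print("Q1 =", q1)
print("Q2 =", q2, "(median =", statistics.median(data), ")")
print("Q3 =", q3)
```

Whatever convention is used, Q2 coincides with the median.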

Wrong housing statistics make wrong real estate policy. While the median is a better statistic than the mean for representing house prices, the Korean government publishes house price statistics calculated with the mean. The mean price can be distorted by just one or two extreme prices.

Illustration: Yoo Jae-il

Off-target housing statistics, misfiring real estate policy
Korea's PIR (price-to-income ratio) is calculated from the mean house price and the mean household income of urban workers. The US PIR, by contrast, is based on the median price and the median income. The median price is found by lining up the houses traded in an area from the cheapest to the most expensive and picking the middle value. Kim Sun-duk, director of the Construction Industry Strategy Institute, said, "Mean prices and mean incomes can be distorted when a few high-priced houses or extremely high earners are included." Moreover, Korean house prices are asking prices, whereas US house prices are based on actual transaction prices.
Cha Hak-bong, staff reporter. Posted: 2007.03.26 23:31
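A small numerical sketch shows the effect the article describes. The prices below are made-up illustrative numbers, not data from the article.

```python
import statistics

# Hypothetical house prices (arbitrary units); the last one is an extreme value.
prices = [300, 320, 350, 380, 400, 5000]

mean_price = statistics.mean(prices)      # pulled upward by the single extreme price
median_price = statistics.median(prices)  # middle of the sorted prices

print("mean:", mean_price)
print("median:", median_price)
```

With these numbers the mean is about three times the median, although five of the six houses cost 400 or less.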

p-th percentile: p% below, (100 - p)% above

Measures of variability: range, interquartile range (IQR), variance, standard deviation

Range

Variance, standard deviation

Y      freq   %freq      Y      %freq
1      10     0.1        1      1/6
2      20     0.2        2      1/6
3      10     0.1        3      1/6
4      20     0.2        4      1/6
5      20     0.2        5      1/6
6      20     0.2        6      1/6
Total  100    1.0        Total  1.0

Mean (Y) = 1*0.1 + 2*0.2 + 3*0.1 + ... + 6*0.2 = 3.8
Mean (Y) = 1*(1/6) + 2*(1/6) + ... + 6*(1/6) = 3.5
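Both means above are weighted sums of the values, with the relative frequencies as weights. A minimal sketch, with hypothetical variable names and the numbers taken from the two tables:

```python
from fractions import Fraction

# Relative frequencies from the table with observed counts (100 observations).
sample_rel_freq = {1: 0.1, 2: 0.2, 3: 0.1, 4: 0.2, 5: 0.2, 6: 0.2}

# Relative frequencies from the table where every face has weight 1/6.
population_rel_freq = {y: Fraction(1, 6) for y in range(1, 7)}


def weighted_mean(rel_freq):
    """Sum of value * relative frequency over all values."""
    return sum(value * weight for value, weight in rel_freq.items())


sample_mean = weighted_mean(sample_rel_freq)          # approximately 3.8
population_mean = weighted_mean(population_rel_freq)  # exactly 7/2 = 3.5
print(sample_mean, float(population_mean))
```

`Fraction` is used for the 1/6 weights so that the second mean comes out exactly as 7/2 instead of a rounded floating-point number.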

Mean of X
                   X      freq   %freq
Low Spender        1      26     0.26
Med Low Spender    2      20     0.20
Average Spender    3      11     0.11
Med High Spender   4      25     0.25
High Spender       5      18     0.18
---------------------------------------
Total                     100    1.00

Mean (X) = 1*0.26 + 2*0.20 + 3*0.11 + 4*0.25 + 5*0.18 = 2.89

A new variable Q = (X - 3)^2
                   X      Q        %freq
Low Spender        1      (-2)^2   0.26
Med Low Spender    2      (-1)^2   0.20
Average Spender    3      0^2      0.11
Med High Spender   4      1^2      0.25
High Spender       5      2^2      0.18
-----------------------------------------
Total                              1.00

Mean (Q) = (-2)^2*0.26 + (-1)^2*0.20 + 0^2*0.11 + 1^2*0.25 + 2^2*0.18
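The sum for Mean (Q) can be evaluated directly. The sketch below only carries out the arithmetic written on the slide; the variable names are hypothetical.

```python
# Relative frequencies of X from the credit card table.
rel_freq = {1: 0.26, 2: 0.20, 3: 0.11, 4: 0.25, 5: 0.18}

# Q = (X - 3)^2, so Mean(Q) is the weighted sum of the squared deviations from 3.
mean_q = sum((x - 3) ** 2 * weight for x, weight in rel_freq.items())

print(round(mean_q, 2))  # 4*0.26 + 1*0.20 + 0*0.11 + 1*0.25 + 4*0.18 = 2.21
```

Note that the deviations here are taken from the fixed value 3, not from the mean 2.89 of X.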

Let ,

Distribution of a sample

Sample mean

(O) Sample variance

For large n, large enough

Standard deviation

V = (X - 2.89)^2
                   X      V              freq
Low Spender        1      (1 - 2.89)^2   26
Med Low Spender    2      (2 - 2.89)^2   20
Average Spender    3      (3 - 2.89)^2   11
Med High Spender   4      (4 - 2.89)^2   25
High Spender       5      (5 - 2.89)^2   18
----------------------------------------------
Total                                    100

Var*(X) = (1/99)[(1 - 2.89)^2*26 + ... + (5 - 2.89)^2*18] = 2.22
sd*(X) = 1.49
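The calculation above divides by n - 1 = 99 and not by n = 100. A short sketch that reproduces it from the frequency table and cross-checks the result against `statistics.variance`, which uses the same n - 1 divisor (variable names are hypothetical):

```python
import math
import statistics

# Frequency table of X from the credit card data.
freq = {1: 26, 2: 20, 3: 11, 4: 25, 5: 18}

n = sum(freq.values())                            # 100 observations
mean_x = sum(x * f for x, f in freq.items()) / n  # 2.89

# Sum of squared deviations from the mean, each weighted by its frequency.
ss = sum((x - mean_x) ** 2 * f for x, f in freq.items())

var_x = ss / (n - 1)     # divisor n - 1, as on the slide
sd_x = math.sqrt(var_x)

# Cross-check: expand the table into 100 raw observations.
raw = [x for x, f in freq.items() for _ in range(f)]
print(round(var_x, 2), round(sd_x, 2), round(statistics.variance(raw), 2))
```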

Statistics
dist'n of a sample     pop'n dist'n
sample median          population median
sample mean            population mean
sample variance        population variance
....

no. of teeth, no. of phone calls, weight of body

no. of teeth, no. of phone calls, weight of body

Expected value

X          f(x_i)
Head  1    0.5
Tail  0    0.5

Y     f(y_i)
1     1/6
2     1/6
3     1/6
4     1/6
5     1/6
6     1/6

X f(xi) 1 1/2 1 1/4 1 1/8

X     3X     f(x_i)
1     3      1/2
2     6      1/4
3     9      1/8
4     12     1/8
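The table invites a comparison of the expected value of 3X with three times the expected value of X. A minimal check using exact fractions (variable names are hypothetical):

```python
from fractions import Fraction

# Distribution of X from the table: value -> probability.
f = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}

e_x = sum(x * p for x, p in f.items())       # E(X)
e_3x = sum(3 * x * p for x, p in f.items())  # E(3X), computed from the 3X column

print(e_x, e_3x, e_3x == 3 * e_x)
```

Here E(X) = 15/8 and E(3X) = 45/8, so multiplying every value by 3 multiplies the expected value by 3.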

100 x + 10 x

100 x + 10 x
X        Y     100X    10Y    100X+10Y    f
1 (H)    1     100     10     110         1/12
0 (T)    1     0       10     10          1/12
1 (H)    2     100     20     120         1/12
0 (T)    2     0       20     20          1/12
...
1 (H)    6     100     60     160         1/12
0 (T)    6     0       60     60          1/12
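The expected value of 100X + 10Y can be computed from the full table. The sketch below assumes that the table has 12 equally likely rows, one for each combination of a coin face (X = 1 for head, 0 for tail) and a die face (Y = 1 to 6); the slide lists only some of these rows, each with f = 1/12.

```python
from fractions import Fraction
from itertools import product

# Assumed joint distribution: every (coin, die) pair has probability 1/12.
p = Fraction(1, 12)
outcomes = list(product([1, 0], range(1, 7)))  # (x, y) pairs, 12 in total

e_total = sum((100 * x + 10 * y) * p for x, y in outcomes)  # E(100X + 10Y)
e_x = sum(x * p for x, _ in outcomes)                       # E(X) = 1/2
e_y = sum(y * p for _, y in outcomes)                       # E(Y) = 7/2

print(e_total, 100 * e_x + 10 * e_y)
```

Both expressions give 85, that is, 100*(1/2) + 10*(7/2).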

For any constant

Thank you!!