- Number of slides: 92
Statistical Literacy at the Reference Desk Why you should care, and what you can do about it.
“Nothing exists until it is measured”. -- Niels Bohr “Innumeracy is the mathematical equivalent of illiteracy”. -- Joel Best
What we’ll cover… • Background and context. • How you can recognize good, reliable, well-reported statistics. • A chance for YOU to interpret some statistics.
What is ‘Statistical Literacy’?
STATISTICAL LITERACY, NUMERACY AND THE FUTURE Peter Holmes, Senior Consultant, RSS Centre for Statistical Education. Nottingham Trent University, Nottingham, England, 2003 “I think the whole thing started in England. Brits do start some things. We started with a word. We had a word that you didn’t have. In 1959, there was a government report in England that talked about the numeracy problem. … it was talking about the education of 16-year-olds saying that they needed to be literate. There was a literacy strand, but they also needed to be numerate. So there was a numeracy strand. So from 1959, we have had a very good English word called numeracy.”
“…There’s now “Statistical Numeracy,” “Statistical Literacy,” or “Statistical Reasoning” or “Statistical Thinking”….” “But they’re all in the same ballpark. The word numeracy when it was first introduced was in the context of the ability to use numbers in practice.” “… particularly in the context of statistics that you might have to read and interpret. In fact in that first use of [numeracy] in 1959, it was in terms of reading tables.” STATISTICAL LITERACY, NUMERACY AND THE FUTURE Peter Holmes, 2003
A more recent take on Statistical Literacy… “Statistical Literacy studies the use of statistics as evidence in arguments” (Schield, Milo 1998, 1999) “A key element of statistical literacy is assembly: how the statistics are defined, selected and presented” Schield, Milo (2004). “Information Literacy, Statistical Literacy and Data Literacy”. IASSIST Quarterly 28 (2-3): 6-11.
“Literacy matters. There is no argument about that fundamental statement. But numeracy counts. Research in numeracy trails research in literacy by 50 years. It will never catch up if elected leaders and politically appointed officials continue to exclude numeracy. That means numeracy needs to count more.” Lynda E. Colgan: Kingston Whig-Standard, January 18, 2006, p. 5
What Librarians Need to Know: • Know about and how to use major statistical sources (print and electronic, national and international) • Know about value-added commercial products that may ‘hide’ statistical details from us. • Be critical consumers of statistics • Be familiar with and able to make informed decisions about the use of charts, graphs, mapping, etc used in the presentation of statistics. Summarized from: Data and Statistical Literacy for Librarians Ann S. Gray IASSIST Quarterly, Summer/Fall 2004 Special Issue: Developing Statistical Literacy Issue 2/3
More Damned Lies and Statistics: How Numbers Confuse Public Issues Published 2001 Published 2004
Statistics The word “statistics” • Origins in the 1600’s • ‘Political arithmetic’ used to calculate population size & life expectancy • A growing population was thought to reflect a healthy ‘state’ – so early number crunchers became known as ‘statists’. • Hence, development of the term ‘statistics’…
Statistics crop up in a variety of circumstances in Libraries… Copyright: Unshelved.com (c) Overdue Media LLC and used with permission
Statistics Create Social Problems Contrary to Laine’s email signoff, “Smoking is a major cause of statistics”, statistics are in fact a major ‘cause’ of social problems. Statistics identify and define social issues (a.k.a. problems) and provide ‘ammunition’ to those who would promote these issues. Belief in ‘the numbers’, especially those reported by ‘experts’, typically solidifies popular conviction that a problem exists.
Statistics Create Social Problems Issue or situation ‘Official’ statistics Defence of policies, interests, etc. Opposition Awareness General public awareness and/or involvement Polls, etc. Number Laundering Measurement Promotion Activists, media, officials, experts, etc.
Best describes three types of people when it comes to statistics: Cynical, Naïve, and Critical. Cynical – Suspicious of statistics; as consumers of statistics, they are not willing to put much stock in them. They will often discount or ignore statistics that don’t align with their views. Worse, as producers of statistics, cynics will collect and report statistics in such a way as to support their point of view. Derived from Best, 2001, pp. 162-167
Naïve – “Slightly more sophisticated than the Awestruck”; they think they understand something about statistics (but often don’t), basically accept any numbers they encounter, and assume that the numbers mean what they appear to mean. As consumers of numbers, they are bad enough, but as producers of numbers they can be as dangerous as cynics, if not worse. Derived from Best, 2001, pp. 162-167
Critical Thinkers – Not negative or hostile; thoughtful in approaching statistics. They recognize that statistics summarize complex information into relatively simple numbers and that as a consequence “some of the complexity is lost”. Statistics are a product of choices and, more specifically, a compromise among choices. Given this, approaching statistics with a ‘critical’ eye is only being prudent and responsible. ‘Critical thinkers’ ask questions about statistics. Derived from Best, 2001, pp. 162-167
Some Common Problems Geographic comparisons – “there is a good chance statistics gathered from different places are based on different definitions and different measurements”. For example, comparing US and Canadian statistics on ‘race’ is complicated by different perspectives on this issue (i.e. definitions and measurements can vary widely).
Comparing groups “Cult ‘X’ is the fastest growing religion in Canada” On closer examination, the cult grew from 20 to 200 members (a tenfold increase, i.e. 900% growth). To match this, the Catholic Church in Canada would have to grow from 13 million to 130 million – far more than the population of Canada. SIZE MATTERS… (derived from Best, 2001, p. 113)
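A quick check on any "fastest growing" claim is to compute the absolute change alongside the percentage change. The sketch below uses the 20-to-200 membership figures from the slide; the function name and the second comparison (a large group gaining the same 180 people) are illustrative assumptions.

```python
def growth(before, after):
    """Return (absolute change, percentage change) between two counts."""
    absolute = after - before
    percent = absolute * 100 / before
    return absolute, percent

# Cult 'X' from the slide: 20 -> 200 members
cult_abs, cult_pct = growth(20, 200)

# Hypothetical: a group of 13 million that gains the same 180 people
big_abs, big_pct = growth(13_000_000, 13_000_180)

print(f"Cult X:      +{cult_abs} members, {cult_pct:.0f}% growth")
print(f"Large group: +{big_abs} members, {big_pct:.4f}% growth")
```

The same 180 new members produce a huge percentage for a tiny group and a negligible one for a large group, which is why the base size should always be reported with a growth rate.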
Numbers vs Percentages • “Most poor people are white” Take, for example, a population of 700 families: 600 white families, of which 60 are poor (10%); 100 visible minority families, of which 20 are poor (20%). In absolute numbers, more white families are poor, but proportionally, more visible minority families are poor.
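The example above can be worked through in a few lines. This is a minimal sketch using the slide's own figures; the variable names are illustrative.

```python
# Figures from the slide: families and poor families in each group
groups = {
    "white": {"families": 600, "poor": 60},
    "visible minority": {"families": 100, "poor": 20},
}

total_poor = sum(g["poor"] for g in groups.values())

summary = {}
for name, g in groups.items():
    rate = g["poor"] * 100 / g["families"]   # poverty rate within the group
    share = g["poor"] * 100 / total_poor     # group's share of all poor families
    summary[name] = (rate, share)
    print(f"{name}: {g['poor']} poor families, rate {rate:.0f}%, share of poor {share:.0f}%")
```

"Most poor people are white" describes the share column, while the risk of being poor is described by the rate column. Both statements are true at once.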
Mutant Statistics “Not all statistics start out bad”. Even good numbers can be “stretched, twisted, distorted, or mangled”… generating “mutant statistics”. There are three main ways “mutant statistics” are created: Generalizations, Transformations, & Confusion Best, 2001, pp. 62-95
Generalizations… An Economist, Physicist, and Statistician were driving through Scotland, and they see a brown cow… The Economist says, “Fascinating that the cows in Scotland are brown”. The Physicist says, “I’m afraid you’re overgeneralizing from the evidence. All we know is that some cows in Scotland are brown.” The Statistician shakes his head at both of them. “Wrong again. Completely unwarranted by the evidence. All we can infer, logically, is that there exists at least one cow in this country, at least one side of which is brown.” Robert Ludlum, The Ambler Warning 2005, pp. 465-466.
Generalizations Measuring ALL the cases of a given social phenomenon is normally not feasible. We collect samples and generalize, but problems can arise with: Definitions, Measurements, Sampling Best, 2001, pp. 62-95
Definitions – In 1996, “...news media reported on what was considered to be a rash of arson fires against black churches in the southern U.S. Amid those images were fears of raging racism.” Statistics were suspect because of poor definitions of what was an ‘appropriate’ church fire to include in the counts. Analysis of six years of federal, state and local data found that the number of arson cases was up, but that these increases applied to both black and white churches in roughly equal proportions. …There was NO dramatic increase in the number of insurance claims made against church fires. http://www.emergency.com/arsnstat.htm & Best, 2001, pp. 62-95
Measurements – Hate crimes statistics are gathered across many jurisdictions: Race, Religion, Sexual Orientation, Ethnicity/National Origin, Disability, Multiple-Bias Incidents. But, ultimately, any crime could be a hate crime. It comes down to a question of ‘motive’ – and how do you objectively and consistently measure ‘motive’? Best, 2001, pp. 62-95
Sampling – Bad sampling can give rise to mutant statistics. If you’re in the wrong place, or at the right place at the wrong time, your sample won’t be representative. A report on ‘racial profiling’ by Kingston Police was criticized for this. Calculation of the Police Stop Rate: (Number of Stops ÷ Population Estimate) × 1,000 Best, 2001, pp. 62-95
BUT… How, when and where was this ‘mini-census’ conducted?
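The question matters because the stop rate is only as good as its denominator. The sketch below applies the stop-rate formula (stops divided by population estimate, times 1,000) to hypothetical numbers; none of the figures come from the Kingston report, and the function name is illustrative.

```python
def stop_rate(stops, population_estimate, per=1_000):
    """Stops per `per` residents: (stops / population estimate) * per."""
    return stops * per / population_estimate

# Hypothetical: the same 150 stops against three different population estimates
stops = 150
rates = {estimate: stop_rate(stops, estimate) for estimate in (1_000, 1_500, 3_000)}

for estimate, rate in rates.items():
    print(f"population estimate {estimate}: {rate:.0f} stops per 1,000")
```

With the number of stops held fixed, tripling the population estimate cuts the rate to a third, so an unrepresentative count of who is present can change the conclusion entirely.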
Transformations This form of ‘mutant statistics’ results from transforming the meaning of a number. Take the estimate that 6% of the 52,000 Roman Catholic priests in the US are at some point in their adult lives sexually preoccupied with young people. Source: a former priest turned psychologist who treated disturbed clergy and derived this estimate from his observations. This was transformed into “6% of priests are pedophiles”. Best, 2001, pp. 62-95
Transformations: 1. People forgot that it was an estimate and treated it as fact. 2. The original ‘sample’ was drawn from priests who sought psychological help (hence a biased sample) and generalized to all priests. 3. People turned “Sexual preoccupation” into actual behaviour. 4. “Young people” were morphed into “children” – bringing the word ‘pedophile’ into the mix. Best, 2001, pp. 62-95
Confusion “Garbling complex statistics” Wendy Watkins of Carleton University provided an example: Two polling companies, Decima and Compass, surveyed Canadians regarding Harper’s policy on the Middle East. Decima – 30% approval of policy. Statistic based on a single question: “What do you think about Harper’s Middle East policy?” Compass – 60% approval of policy. Statistic based on an amalgam of responses to several questions – Israel’s right to defend itself… Syria flouting UN sanctions… Iran flouting UN sanctions… etc. Compass survey sponsored by a ‘right-leaning’ think tank
“This kind of statistics is about as valid as the one that argues that the average Canadian has one testicle”
• Now, over to Suzette…
How can you recognise good, reliable, well reported statistics?
A critical view Look at: • Who collected the data (source) • Why were they collected • How were they collected • What was counted • When the data were collected • How were the data processed after collection (added up, averaged, grouped etc. ) • How are the data being presented. • Always read the footnotes!
Who? - Formal Organizations • Statistics Canada (National statistical agency) • United Nations Statistics Division (national statistics) • OECD (intergovernmental organization) • Provincial and Municipal governments – Ontario – City of Toronto • Societies and Associations: – Cancer Society; Amnesty International etc.
Sources • Companies: – Sears Canada; Ford etc. • Consumer advocacy groups: – International Coffee Organization – Dairy Farmers of Canada • Publications (print and electronic) – Annual reports from companies and societies – Journal articles, print and electronic – Newspapers, print and electronic, such as Toronto Star, Globe and Mail – Commercial databases such as Datastream
Sources – Media etc. • Media – Magazines range from National Enquirer to Chatelaine, Maclean’s to the Economist – Newsfeeds - Reuters to more dubious ones • Informal Organizations – Wikipedia – variable content – User groups – again a range from professional ones to casual ones – Blogs, Chatrooms
Good or Quality statistics • If the figures are from a “reputable” source then they are usually considered “good” • But still consider the “Why?” Especially for companies, opinion polls, consumer organizations, advocacy organizations such as Greenpeace, United Way etc. • Can get question bias • Can get sample bias
Why were the data collected?
Why were the data collected? • Government planning at all levels • Political reasons (good, bad or neutral) • Academic research • Commercial reasons (company finances, resellers of data, media, etc.) • Baseline data (environment, health) • Advocacy organizations (Greenpeace, Amnesty International, Cancer Society)
How were the data collected?
How were the data collected? • Census and Statistics Canada surveys: can be considered a “gold standard” • Academic research • Companies, product associations • Media
How - Newspapers, Magazines • Maclean’s University issue – “Now in its 16th year, the annual Maclean’s rankings assess Canadian universities on a diverse range of factors” – “From its inception, Maclean’s has consulted with academic experts about the design, composition and methodology of the rankings.” – Universities boycotting it now • Globe and Mail University survey – students register themselves, therefore self-selection – More than 32,700 students answered over 100 questions – “Our assessment has spread to 49 schools -- up from 37” • Toronto Life surveys – Talk to 100 pedestrians about a topic
What is being Counted?
What is being counted? Need to be aware of definitions so you can get comparable data over time and place • If it is a number what does that number represent: – a person, a household, a family? – Total, single or multiple responses? – income or earnings? – a weight, kilograms or pounds? – a currency, Can$ or U.S.$? – Is it a percentage? – Is it in “millions” or does the table have a ‘000 sign?
What is the unit of measurement? • Is it a rate e.g. Unemployment rate? • Is it indexed e.g. Consumer price index? – What is the base date? – Has the “basket of goods” changed? • Is it seasonally adjusted? • Are classifications comparable: – NAICS 2000 vs. SIC 1980, definition of pet food may have changed – Concordances exist
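To compare two index series, both have to share a base date. The sketch below shows the plain arithmetic of rebasing a series so that a chosen year equals 100. The index values and the function name are hypothetical, and the arithmetic does not correct for a changed "basket of goods".

```python
def rebase(index_by_year, new_base_year):
    """Re-express an index series so that new_base_year equals 100."""
    base_value = index_by_year[new_base_year]
    return {year: value * 100 / base_value for year, value in index_by_year.items()}

# Hypothetical index values published with 1992 = 100
series_1992_base = {1992: 100.0, 2002: 120.0, 2006: 132.0}

# The same series re-expressed with 2002 = 100
series_2002_base = rebase(series_1992_base, 2002)

for year, value in series_2002_base.items():
    print(f"{year}: {value:.1f}")
```

Rebasing changes the level of every value but not the ratios between years, so the percentage change between any two years is the same before and after.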
What is being measured? Household internet use at home by internet activity
Internet use by individuals by type of activity
What is the unit of measurement - Geography • Make sure that if data are from different tables or sources that they are for the same geographic area – North America vs. U. S. A. – Maritimes vs. Atlantic Canada – City of Toronto 1998 and before vs. City of Toronto after amalgamation. In the late 1990’s many municipalities amalgamated – Prior to 1949 Newfoundland was not part of Canada – Nunavut included in the Northwest Territories prior to 1999
Date of the Data! • Data are often several years old before publication • There should always be a date that tells you what time period the data are for and the unit of time – monthly, quarterly, annual etc. • Census data – the income information is always for the previous year so the 2006 census will give income for 2005
Presentation of the data • Often crucial for the awareness of the value of statistics • Can be in the form of : • Text • Tables • Graphs and charts • Maps
Text: Mackenzie Investments Burn Rate (RRSP season)
Text: Mackenzie Investments Burn Rate
Table: $ thousands http://www40.statcan.ca/l01/cst01/comm02b.htm
Table: Weight and Footnote http://www.ico.org/prices/m1-a.htm
Graph: Exaggerated Vertical Scale
Map: A change in the variable displayed can make a significant difference to the impact the map makes on the user: Average income vs. Median income
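Average and median income can diverge sharply when a few very high incomes are present, which is why a map of one can look so different from a map of the other. A minimal sketch with hypothetical incomes for one small area:

```python
from statistics import mean, median

# Hypothetical household incomes (dollars) for one small area;
# a single very high income is included on purpose
incomes = [28_000, 31_000, 35_000, 38_000, 40_000, 42_000, 500_000]

average_income = mean(incomes)
median_income = median(incomes)

print(f"average income: {average_income:,.0f}")
print(f"median income:  {median_income:,.0f}")
```

Six of the seven households earn 42,000 or less, yet the average is above 100,000. The median stays close to what a typical household earns.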
HELP! • See Bibliography • See Statistics Canada website
Statistics Canada Resources http://www.statcan.ca/start.html
Statistics Canada Resources http://www.statcan.ca/english/edu/power/about2.htm
Statistics Canada Resources http://www.statcan.ca/english/freepub/11-533-XIE2005001.htm
Discussion Points • What are the responsibilities of reference desk staff in evaluating statistics and educating users? – Do we review the stats with the user when we direct the user to them or is caveat emptor? – Should we direct users to a website or a handout that talks about how to recognize “good” statistics
Discussion points • What are the chances of people actually reading the necessary information? • Does our responsibility vary with the type of library we work in? – School – Public – Post secondary
Statistics Canada Resources http://www.statcan.ca/english/edu/power/toc/contents.htm
Statistics Canada Resources http://www.statcan.ca/english/concepts/index.htm
Statistics Canada Resources http://www.statcan.ca/english/freepub/11-533-XIE/2005001/using/reading.htm
Lies, Damn Lies and Statistics! (attributed to Disraeli, 1804-1881) Scepticism about statistics has been around for a long time – we need to be critical thinkers! What should we look at to get some idea of the validity and reliability of the statistics we or our users have found?
Sources (Who) (adapted from Rice, 2006) Formal Publications Media Organizat. Informal Individuals Organizat. National Govt. Books T. V. Special Interest Statisticians Local Govt Journal Art. Magazines E-Mail Experts Universities Reports Radio User groups Teachers Companies Newspapers Newsfeeds Chatroom Colleagues Non-Govt Organizat. Commercial websites Open Repositories Web Pages (Wikipedia) Librarians Societies Opinion Polls Blogs Family
How were the data collected? • Census and Statistics Canada surveys – Usually a lengthy user guide that gives you details of the methodology http://www.statcan.ca – Structured questionnaire with carefully phrased questions e.g. Census form – Selected sample – who were selected and why, which populations were over or under sampled e.g. some native communities “opt” out of the census – How and when it was carried out – personal interview, telephone survey, web survey. What the follow-up was to get responses from missed respondents.
How were the data collected? • Academic research – Usually can get methodology from researcher – May be mentioned in book or article – May be web-link to method and data • Companies, product associations – May be somewhere on the website e.g. http://www.ico.org – May not give much detail • Media often only give “source” and no details e.g. Statistics Canada
Internet use by individuals by type of activity
Reading tables 101 Laine Ruus
Take a table, one that Statistics Canada publishes like this: Source: STC cat no. 71 -001 -XIE 200612 We can now make part of the table look like…
…this (note, it’s a different date, and therefore different numbers from the previous slide): Full vs part-time employment by gender, Canada, 2005 Source: Labour force historical review: table cd 1 t 15 an. [computer file] 2006 ed. And compute some percentages to make it look like…
…this: Full vs part-time employment by gender, Canada, 2005 Source: Labour force historical review: table cd 1 t 15 an. [computer file] 2006 ed. More males work full-time than part-time: True/False More females work full-time than part-time: True/False Three times as many women as men work part-time: True/False Women are three times more likely to work part-time than men: True/False
Full vs part-time employment by gender, Canada, 2005 100% Source: Labour force historical review: table cd 1 t 15 an. [computer file] 2006 ed. Of those who work full-time, 2/3 are men: Of those who work part-time, 2/3 are women: Almost twice as many women work part-time as full-time: True/False
…but the table behind the numbers is… Source: Labour force historical review: table cd 1 t 15 an. [computer file] 2006 ed.
Do you agree with this Toronto Star reporter? Source: Toronto Star, Dec. 9, 2006
Now for a slightly more complex table: Source: Labour force historical review: table cd 1 t 15 an. [computer file] 2006 ed Less than 15% of males who work full time are over 55: Of males who work part time, the largest number are youth: Fewer women 25 -54 work part-time than full-time: True/False
Same table – but where’s the 100% now? Source: Labour force historical review: table cd 1 t 15 an. [computer file] 2006 ed Twice as many young women as young men work part-time: Twice as many women as men over 65 work part-time: Women over 65 are twice as likely to work part-time as men: Most of the men who work part time are under 24 or over 65: True/False
And here’s what the table values/counts are: Source: Labour force historical review: table cd 1 t 15 an. [computer file] 2006 ed
In this table, where’s the 100% total?
Lesson 1: • You can compare the sizes of percentages and rates only within the row/column in which they have been computed (i.e. where they add up to 100%) • Between rows/columns, you can only compare relative proportions or likelihoods, or counts.
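Lesson 1 can be made concrete by computing both sets of percentages from the same counts. The sketch below uses hypothetical counts (in thousands), NOT the Statistics Canada figures behind the slides; the function names are illustrative.

```python
# Hypothetical counts in thousands; not the Labour Force Survey values
counts = {
    "men":   {"full-time": 8000, "part-time": 1000},
    "women": {"full-time": 6000, "part-time": 2000},
}

def row_percentages(table):
    """Each row sums to 100%: 'of women, what share work part-time?'"""
    return {
        row: {col: n * 100 / sum(cells.values()) for col, n in cells.items()}
        for row, cells in table.items()
    }

def column_percentages(table):
    """Each column sums to 100%: 'of part-timers, what share are women?'"""
    columns = list(next(iter(table.values())))
    totals = {col: sum(cells[col] for cells in table.values()) for col in columns}
    return {
        row: {col: n * 100 / totals[col] for col, n in cells.items()}
        for row, cells in table.items()
    }

by_row = row_percentages(counts)
by_col = column_percentages(counts)

# "Twice as many women as men work part-time" compares counts
count_ratio = counts["women"]["part-time"] / counts["men"]["part-time"]

# "Women are N times as likely to work part-time" compares row percentages
likelihood_ratio = by_row["women"]["part-time"] / by_row["men"]["part-time"]

print(f"count ratio: {count_ratio:.2f}, likelihood ratio: {likelihood_ratio:.2f}")
print(f"women as share of part-timers: {by_col['women']['part-time']:.1f}%")
```

In these made-up numbers, twice as many women as men work part-time, but women are 2.25 times as likely to do so, because there are fewer employed women in total. The two statements answer different questions and need different 100% totals.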
Source: Census of Canada, 2001: legal marital status, age groups, and sex for population (Topic based tabulations; 97 f 0004 xcb 2001001) Why are these two numbers so different? Which one is correct? Source: Census of Canada, 2001: legal marital status, common-law status, age groups, sex and household living arrangements for population 15 years and over (Topic based tabulations; 97 f 0004 xcb 2001040)
Lesson 2: make sure you can identify what’s in the denominator as well as what’s in the numerator!
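A short illustration of why the denominator matters: the same numerator gives very different percentages depending on whether it is divided by the whole population or by the population 15 years and over. All counts below are hypothetical and chosen only to show the arithmetic.

```python
# Hypothetical counts; not taken from the census tables cited in the slides
married = 12_000
population_all_ages = 30_000
population_15_and_over = 24_000

# Same numerator, two different denominators
pct_of_everyone = married * 100 / population_all_ages
pct_of_15_plus = married * 100 / population_15_and_over

print(f"married as % of total population:    {pct_of_everyone:.0f}%")
print(f"married as % of population 15 and over: {pct_of_15_plus:.0f}%")
```

Neither figure is wrong. Each is correct for its own denominator, which is why the table title and footnotes need to be read before two percentages are compared.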
Here’s what the academic called the table
And this is what the original Statistics Canada publication called the same table: Source: Women in Canada. STC cat no. 89-503, pl. 116 Same table, different titles. Which one would you use?
Employment rate and participation rate are not the same thing: • participation rate = (labour force ÷ total population 15 and over) × 100 • employment rate = (employed labour force ÷ total population 15 and over) × 100
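The two definitions can be compared side by side in code. This is a minimal sketch with hypothetical counts (in thousands); it assumes the standard definition of the labour force as employed plus unemployed, which the slide does not spell out, and the function names are illustrative.

```python
def participation_rate(labour_force, population_15_plus):
    """Labour force as a percentage of the population 15 and over."""
    return labour_force * 100 / population_15_plus

def employment_rate(employed, population_15_plus):
    """Employed labour force as a percentage of the population 15 and over."""
    return employed * 100 / population_15_plus

# Hypothetical counts in thousands
population_15_plus = 25_000
employed = 15_000
unemployed = 1_250
labour_force = employed + unemployed  # assumption: labour force = employed + unemployed

p_rate = participation_rate(labour_force, population_15_plus)
e_rate = employment_rate(employed, population_15_plus)

print(f"participation rate: {p_rate:.1f}%")
print(f"employment rate:    {e_rate:.1f}%")
```

Both rates share a denominator, so the participation rate is higher than the employment rate whenever anyone is unemployed. A table labelled with one rate cannot be read as the other.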
This is the original table from the Labour force historical review CD-ROM Source: Labour force historical review 1999 ed.: table tab 01 an. ivt. participation rate = (labour force / tota…
Lesson 3: whenever possible, go back to the original data collector.