
- Number of slides: 51
Evaluating semantic similarity and sameness in studies of polysemy and synonymy Jarno Raukko (U. Helsinki) 1
For a full version of the PPT, see handout distributed Oct 28, 2010. SKY WEBPAGE VERSION 2
Examples 1. Are thrifty and stingy synonyms? EXPECTED ANSWER: ”Well, not quite.” 2. Are violin and fiddle synonyms? EXPECTED ANSWER: ”Well, almost.” (SYNONYMY) 3. Does back have the same meaning in My back hurts and I came back? EXPECTED ANSWER: ”Not at all. Different.” 4. Does back have the same meaning in I came back and I got it back? EXPECTED ANSWER: ”Well, almost.” (POLYSEMY) 3
Relevance of semantic similarity (vs. difference) • In synonymy: you expect similarity for a pair/(set) of items to be of interest • In polysemy: primarily, you expect difference for a pair/(set) of items to be of interest; secondarily, you group items according to similarity and difference 4
Yet…
SYNONYMY
• SIMILARITY (DEFAULT): Synonymy is about similar meanings of different words.
• DIFFERENCE: But you are interested in the differences between near-synonyms.
POLYSEMY
• SIMILARITY: Polysemy is about related (and therefore somehow similar) meanings of a word.
• DIFFERENCE (DEFAULT): Polysemy is about different (related) meanings of a word.
5
synonymy --- polysemy ? • Dirk Geeraerts tomorrow in Helsinki: ”The problem of synonymy and the problem of polysemy are essentially the same” • Dylan Glynn & Justyna Robinson (eds, in press) Polysemy and Synonymy. Corpus methods and applications in Cognitive Linguistics. Amsterdam: Benjamins. 6
synonymy WORD 1 WORD 2 If their semantic content is similar or the same, this is a case of synonymy. If their semantic content is (very) different, a researcher of synonymy ignores this case. 7
polysemy MEANING 1 OF WORD 1 MEANING 2 OF WORD 1 The starting point is that Word 1 has at least 2 (different) meanings. If meanings 1 and 2 are very similar, this might be a case of vagueness. If meanings 1 and 2 are totally different (and not related semantically), this might be a case of homonymy. If meanings 1 and 2 are somewhat different but somehow relatable (or a bit similar), this is probably a case of polysemy. Henceforth: W = word, M = meaning. 8
scale of similarity: synonymy
The meaning of W1 and W2 is…
THE SAME ----------------- DIFFERENT
perfect synonymy; full synonymy; near synonymy; semi synonymy; weak synonymy; quasi synonymy; NOT WORTH DISCUSSION
9
scale of similarity: polysemy
The meaning of M1 and M2 (of W1) is…
THE SAME ----------------- very DIFFERENT
two instances of the same meaning; vagueness; polysemy; homonymy (ambiguity)
two instances of the same meaning type; two instances of different (yet related) meaning types
10
Main question • Is semantic similarity somehow different when we look at polysemy than when we look at synonymy? 11
Differences so far • Which is the default, similarity or difference? • In synonymy, we idealize on the extreme of the scale, but mainly look at the part of the scale which is (fairly) close to the extreme. • In polysemy, we operate pretty much on the whole scale, with focus on the middle. 12
synonymy --- polysemy ? • when you study synonymy, the polysemy of the items gets in the way – can you ever say “W1 and W2 are synonymous”? – should you always say “Mx of W1 and My of W2 are synonymous”? • when you study polysemy, you often use synonyms to talk about meanings – “Are get ‘receive’ and get ‘arrive’ meanings of the same verb?” 13
synonymy --- polysemy ? • Synonymy occurs when meaning is shared (but form differs) • Polysemy occurs when form is shared (but meaning differs) • Synonymy is a relational lexical-semantic property that unites (parts of the semantic potential of) “accidentally” coinciding words – The forms of words involved in the synonymy relationship are arbitrary (although the relationships might be nonarbitrary, cf. Levin this morning) – The semantic value (that is shared) is motivating enough that two or more forms coincide on it – It is typical that one meaning can be expressed with two different words. 14
synonymy --- polysemy ? • Polysemy is a semantic property of one word at a time that unites meanings. The relationship between them is motivated, but it is only sometimes predictable. – It is not accidental or arbitrary that words acquire polysemy. It is in their nature. :-) – It is typical for semantic value to be flexible, extended, and “multiplied”. – Polysemy is about categorization, both between words (W1 covers a semantic territory) and within a word (M1 and M2 are categories too). • One form : One meaning • a principle that cognition may strive for / take as a default • synonymy breaks it • polysemy breaks it 15
synonymy --- polysemy ? • The role of co(n)text • You can evaluate synonymy in identical co(n)texts: I like to play the fiddle in bars. I like to play the violin in bars. • Usually you evaluate polysemy in non-identical co(n)texts I got to Zabriskie Point. I got to a point in my life where… • But you can use identical co(n)texts as well. I got to be the last one. 16
evaluating • To study shades of semantic similarity, we need to evaluate it. • A corpus cannot tell us if two instances are semantically similar – It requires human judgement • The main use of evaluating in this paper: – How informants / test subjects / speakers evaluate the semantic similarity (or difference) of linguistic items in a more or less experimental setting (e.g., similarity rating test) ≈ Data elicitation ≈ Population test 17
evaluating • quantitative: – Estimate the degree of synonymy (or semantic distance between two meanings in polysemy) • qualitative: – Justify / explain / explicate the nature of / the reason for semantic similarity 18
evaluating takes place in real life as well • synonymy (examples) – in linguistic production, you e.g. estimate which of the near-synonyms might suit your needs best – in comprehension, you e.g. estimate whether near-synonyms that you have encountered refer to the same semantic value – in communication, when you negotiate meaning, you e.g. operate with synonymous alternatives • polysemy (examples) – in production, you e.g. apply words to new contexts – in comprehension, you e.g. approximate meanings according to related meanings of the same word – jokes often exploit polysemy – polysemy may cause misunderstandings – in communication, when you negotiate meaning, you e.g. cross-check with polysemy of other words 19
(Back to experiments/elicitation.) Expected difference between synonymy and polysemy, 1 • If an informant is asked to rate the semantic similarity/difference of two words, – the very fact that they are different words might cause her/him to presuppose that there is at least some semantic difference. – Therefore, rating two words ”semantically identical” requires a marked choice. – However, if the informant realizes that the researcher is after synonymy, then evaluating W1 and W2 as semantically similar is more likely. 20
Expected difference between synonymy and polysemy, 2 • If an informant is asked to rate the semantic similarity/difference of two meanings of one word, – the very fact that they are uses/instances of the same word might cause her/him to presuppose that there is at least some semantic similarity. – Therefore, rating two meanings ”semantically totally/very different” requires a marked choice. – However, if the informant realizes that the researcher is after polysemy, then evaluating M1 and M2 as semantically different is more likely. 21
Factors that influence • In both cases (synonymy and polysemy) – it matters a great deal • Which test (type) we use • What the instructions (exact phrasings) are • Whether there is an example rating given by the researcher • What the selection of stimuli is • What the linguistic context of each stimulus is • Which types of cases have been placed in the beginning of the test (or, the order in general) 22
Factors that influence • Should we expect (total) consensus? • No. There will be subjective differences. • Why? • The nature of semantics: • Based on intersubjective convention • Based on negotiation and flexibility • Must allow for variability and variation 23
Examples from (more or less) experimental studies on synonymy and polysemy 24
Whitten & al. 1979 (synonymy) • “Indicate the degree to which two words have the same meaning by writing a digit from 1 to 7.” • 7 = excellent synonymy • 1 = poor synonymy • All 464 stimulus noun pairs were listed as synonyms in standard references. • The rated degree of synonymy ranged from 6.79 to 2.24. The median was 5.08. • If the pairs had been placed among nonsynonym pairs, the ratings at the low end might have been higher. 25
Whitten & al. 1979 (synonymy) cont’d • Stimulus pairs at the high end: • purchase – buy 6.79 • lawyer – attorney 6.78 • autumn – fall 6.72 • penny – cent 6.71 • taxi – cab 6.71 • Stimulus pairs close to the median: • college – university 5.12 • output – yield 5.10 • expert – authority 5.09 • effort – attempt 5.08 • servant – maid 5.08 • soldier – warrior 5.07 26
Whitten & al. 1979 (synonymy) cont’d • Stimulus pairs at the low end: • thunder – clap 2.72 • patient – invalid 2.55 • visit – chat 2.52 • suburb – neighborhood 2.34 • needle – spike 2.24 • Although instructions said that all stimuli are nouns, some of these are more common as verbs: buy, purchase, visit, chat • The polysemy is obvious in many cases: fall, authority, clap, patient, invalid 27
Whitten & al. 1979 (synonymy) cont’d • The main variable that they paid attention to was the order of the two stimuli: ½ of the informants got “forward order”, ½ got “back order”. – In 1979 one of their main aims was to study the structurings of the mental lexicon and lexical access. – Example: purchase => buy 6.72, buy => purchase 6.86 – On average, perceived synonymy was affected by word order. – For 21 word pairs, the effect of the order was significant. 28
Whitten & al. 1979 (synonymy) cont’d • Some of the 21 word pairs where the order played a significant role in the rating of the degree of synonymy:
motive => reason 6.28 | reason => motive 5.56
quarter => fourth 6.24 | fourth => quarter 5.00
mission => task 5.66 | task => mission 4.84
era => age 5.80 | age => era 4.60
appetite => hunger 5.18 | hunger => appetite 4.24
nectar => honey 4.94 | honey => nectar 3.68
aborigine => native 4.52 | native => aborigine 3.22
• Generalization: a more specific, more academic, and less polysemous word, when presented first, prompts a positive synonymy judgement more readily than vice versa. 29
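The order effect in these seven pairs can be made explicit by subtracting the reverse-order rating from the forward-order rating. The following is only a minimal sketch: the ratings are read off this slide, on the assumption that the two columns of numbers pair up in the order listed, and the names `ORDER_PAIRS` and `order_asymmetry` are hypothetical helpers, not part of Whitten & al.'s study.

```python
# Mean synonymy ratings (1-7 scale) read off the slide.
# Assumption: the numbers pair up with the word pairs in the order listed.
# Tuple layout: (first word, second word, rating as listed, rating in reverse order)
ORDER_PAIRS = [
    ("motive", "reason", 6.28, 5.56),
    ("quarter", "fourth", 6.24, 5.00),
    ("mission", "task", 5.66, 4.84),
    ("era", "age", 5.80, 4.60),
    ("appetite", "hunger", 5.18, 4.24),
    ("nectar", "honey", 4.94, 3.68),
    ("aborigine", "native", 4.52, 3.22),
]


def order_asymmetry(pairs):
    """Return (pair label, forward minus reverse rating), largest gap first."""
    gaps = [(f"{a} => {b}", round(fwd - rev, 2)) for a, b, fwd, rev in pairs]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for label, gap in order_asymmetry(ORDER_PAIRS):
        print(f"{label:22s} {gap:+.2f}")
```

A positive gap means the pair was rated as more synonymous when the first-listed word (the more specific one in the generalization) was presented first. The sketch says nothing about statistical significance, which requires the informant-level data.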
Whitten & al. 1979 (synonymy) cont’d
• Variance (between informants)
– Mostly 0.50–1.20 at the end of the 50 most synonymous pairs
• Exceptionally high variance at the high synonymy end: murder => homicide 2.75 (cf. homicide => murder 1.03)
– Mostly 2.00–3.00 at the median of the scale
• Exceptionally low variance: province => territory 1.55
• Exceptionally high variance: congress => legislature 3.79
– Mostly 2.50–4.00 at the end of the 50 least synonymous pairs
• That is, there was little consensus at the lower end of the scale.
30
Raukko 1994 (polysemy) • “Decide whether the word get carries the same meaning or two different meanings in the sentences.” – 0 = the same meaning – 2 = somewhat different meaning – 4 = very different meaning (heuristic post hoc: 4 might mean homonymy; 0 would refer to two instances of the same meaning type; typical polysemy would be 1...3) 31
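The post hoc heuristic can be written down as a small lookup for mean ratings. This is a sketch under stated assumptions: the slide only names the anchor points (0, 1...3, 4), so the cut-offs of 0.5 and 3.5 used here (rounding a mean rating to the nearest anchor) are an illustrative choice, and `heuristic_label` is a hypothetical name, not part of the 1994 test.

```python
def heuristic_label(mean_rating, same_below=0.5, homonymy_above=3.5):
    """Map a mean rating on the 0..4 scale to the slide's post hoc reading.

    The default cut-offs are assumptions made for this sketch; the slide
    only states that 0 suggests the same meaning type, 1...3 typical
    polysemy, and 4 possibly homonymy.
    """
    if not 0.0 <= mean_rating <= 4.0:
        raise ValueError("expected a rating between 0 and 4")
    if mean_rating < same_below:
        return "same meaning type"
    if mean_rating > homonymy_above:
        return "possible homonymy"
    return "typical polysemy"


if __name__ == "__main__":
    for rating in (0.0, 2.0, 4.0):
        print(rating, "->", heuristic_label(rating))
```

Moving either cut-off changes which pairs count as borderline, so the labels are best read as a starting point for discussion rather than a classification.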
Raukko 1994 (polysemy)(cont’d) • Data from my 1994 test, see handout. 32
Comparisons so far • Whitten & al. / synonymy – scale 1...7 (1 = very different meaning, 7 = same meaning) – synonymy ratings ranged 2.24...6.79 – median 5.08 (most pairs were viewed as at least somewhat synonymous) • Raukko / polysemy – scale 0...4 (0 = same meaning, 4 = very different meaning) – polysemy ratings ranged 0.45...3.13 – average rating 1.55, median 1.34 (most pairs were viewed as having fairly similar but not identical meaning) 33
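Because the two scales run in opposite directions and have different lengths, a side-by-side reading is easier after rescaling both to a common 0..1 similarity score (1 = same meaning). The sketch below assumes a plain linear rescaling, which neither study proposes; the function names are hypothetical and the figures are the summary values from this slide.

```python
def whitten_to_similarity(rating):
    """Linearly map a 1..7 rating (7 = same meaning) onto 0..1."""
    if not 1.0 <= rating <= 7.0:
        raise ValueError("expected a rating between 1 and 7")
    return (rating - 1.0) / 6.0


def raukko_to_similarity(rating):
    """Linearly map a 0..4 rating (0 = same meaning) onto 0..1."""
    if not 0.0 <= rating <= 4.0:
        raise ValueError("expected a rating between 0 and 4")
    return 1.0 - rating / 4.0


# Summary values taken from the slide.
SUMMARY = {
    "Whitten & al. lowest": whitten_to_similarity(2.24),
    "Whitten & al. median": whitten_to_similarity(5.08),
    "Whitten & al. highest": whitten_to_similarity(6.79),
    "Raukko most different": raukko_to_similarity(3.13),
    "Raukko median": raukko_to_similarity(1.34),
    "Raukko most similar": raukko_to_similarity(0.45),
}

if __name__ == "__main__":
    for label, value in SUMMARY.items():
        print(f"{label:24s} {value:.2f}")
```

Under this assumed rescaling both medians land about two thirds of the way toward "same meaning". That is only a descriptive convenience: the tests used different instructions, stimuli, and scale labels, so the rescaled numbers do not show that the two scales are commensurable.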
Comparisons so far • Whitten & al. / synonymy – informants saw synonymy where they were supposed to • Raukko / polysemy – informants did not see large meaning difference for the most part => get is polysemous, not homonymous – they saw some similarities, some differences, as predicted => they saw polysemy • both – differing degrees of similarity were apparent – many ratings make sense, some don’t – method is useful but there are skewing effects and unreliability in several details of the setting 34
Conclusions so far • In both synonymy and polysemy studies, semantic intuitions vary. • In both synonymy and polysemy studies, finding a scale of semantic similarity is useful. – Cf. Sandra & Rice 1995: 125 • “[researchers of prepositional polysemy] cannot propose extremely fine-grained distinctions without bothering about empirical data” • “language users’ mental representation [...] is [in fact] characterized by a high degree of granularity” 35
quantitative => qualitative • Whitten & al’s and Raukko’s similarity rating tests did not include informants justifying and explaining their ratings. • E.g., Liu (this symposium) reports tests with informants explaining their choices. • In Raukko’s study, qualitative results come from other types of tests – sorting test: (1) combine stimuli into categories, (2) give names to categories, etc. – production test: (1) produce examples of the use of polysemy, (2) explain links you find between them, etc. • Vanhatalo 2005 36
Vanhatalo 2005 (synonymy) • her PhD thesis, The use of questionnaires in exploring synonymy • several types of tests – choose most likely components – rate components – choose better alternative (cf. Liu) – complete as sentences (only the word given) – define typical frames – spell out semantic differences 37
Vanhatalo 2005 (cont’d) • several factors investigated – 18 Finnish verbs of “nagging”, 17 Estonian verbs of “nagging” • the gender and age of the portrayed speaker (the subject of “nag”) • the degree of irritation of the portrayed speaker and hearer • the volume of the vocal act – 2–4 Finnish adjectives ‘important, central, crucial, significant’: open questions mainly 38
Vanhatalo 2005 (cont’d) • main results (Vanhatalo 2005: 40–45): the questionnaire method – helped to trace differences in the meaning and use of synonyms • many differences not documented before in dictionaries • sometimes consensus, sometimes deviation • useful especially for large groups of semantically similar words • (Vanhatalo did not use the method for placing synonyms on a scale of similarity) • both open questions and ratings should be used 39
Vanhatalo 2005 (cont’d) • main results (Vanhatalo) (cont’d) – helped to find differences between related words in Estonian and Finnish – sociodemographic variables caused fairly little variation • age and education had a somewhat larger effect than gender • answers critique 40
Vanhatalo 2005 (cont’d) • main results (Vanhatalo) (cont’d) – when both corpus method and questionnaire method were applicable, they yielded similar results • however, justification of results was different • questionnaire method dug up semantic properties that corpus method could not • in addition, can tackle low-frequency words – results of questionnaire method can be utilized in the production of electronic dictionaries 41
Other studies of synonymy that employ experimental techniques • Arppe & Järvikivi 2002, 2007 • Divjak & Gries 2008 • Liu, in this symposium • Oversteegen, in this symposium • etc. 42
polysemy / qualitative • In experimental settings (e.g., the sorting test): – An informant gives a name to a meaning type, a category within polysemy – An informant spells out the semantic link between two meanings – An informant draws a hierarchy between macrotypes and microtypes (more general and more specific meaning types) – An informant pinpoints cases that are difficult to evaluate 43
And… • to conclude… 44
Evaluating semantic similarity • Both synonymy and polysemy operate on the scale of semantic similarity vs. difference. • Knowing the degree of similarity is useful in the study of both. • The way to find out about it is to use elicitation/experiments. • There is deviation in informants’ ratings. • A simple explanation: informants use different criteria for evaluation. • Solutions: let them explicate the criteria; use multiple methods. 45
Synonymy vs. polysemy • Evaluating semantic similarity between the meanings of two separate words (synonymy) is a matter of evaluating the match between two separate ”semantic events” – There should be mismatch, but there isn’t. • Evaluating semantic similarity/relatedness/difference between the meanings of one word (polysemy) is a matter of comparing the applications of one single category. – There should be match between the semantic events. 46
Synonymy vs. polysemy • When you evaluate near-synonyms, you balance between (i) the ideal of what would constitute a perfect match and (ii) the nuances of the near-synonyms • When you evaluate meanings of a polysemous word, you balance between (i) the assumption that some meaning should be shared and (ii) the actual semantic profile of the uses 47
Synonymy vs. polysemy • In evaluating synonymy, the idealized equivalence can be taken from the semantic description of either of the two words. • In evaluating polysemy, the common factor (”core meaning”, ”shared meaning”) may be hard to find, or become too abstract. Maybe the first task is easier? 48
General relevance • ”Insights in the equality or similarity of meaning may shed light on meaning itself” (Oversteegen / SKY 2010, Helsinki) • The question of “identical meaning” is a crucial basis for e. g. typology and language comparisons: the problem of tertium comparationis – Cf. Haspelmath’s plenary on Saturday 49
References
Arppe, Antti & Juhani Järvikivi 2007. Every method counts – Combining corpus-based and experimental evidence in the study of synonymy. Corpus Linguistics and Linguistic Theory 3:2: 131–159.
Colombo, Lucia & Giovanni B. Flores d’Arcais 1984. The meaning of Dutch prepositions: a psycholinguistic study of polysemy. Linguistics 22: 51–98.
Divjak, Dagmar & Stefan Gries 2008. Clusters in the mind? Converging evidence from near-synonymy in Russian. The Mental Lexicon 3:2: 188–213.
Geeraerts, Dirk – in this symposium
Liu, Dilin – in this symposium
Oversteegen, Eleonore – in this symposium
Raukko, Jarno 2003. Polysemy as flexible meaning: experiments with English get and Finnish pitää. In Brigitte Nerlich & al (eds) Polysemy. Flexible patterns of meaning in mind and language. 161–193.
CONTINUED... 50
References cont’d
Sandra, Dominiek & Sally Rice 1995. Network analyses of prepositional meaning: mirroring whose mind – the linguist’s or the language user’s? Cognitive Linguistics 6: 89–130.
Vanhatalo, Ulla 2005. Kyselytestit synonymian selvittämisessä (etc.) [The use of questionnaires in exploring synonymy, etc.] PhD thesis, U Helsinki. http://ethesis.helsinki.fi/julkaisut/hum/suoma/vk/vanhatalo/kyselyte.pdf
Whitten, William B. II, W. Newton Suter, and Michael L. Frank 1979. Bidirectional Synonym Ratings of 464 Noun Pairs. Journal of Verbal Learning and Verbal Behavior 18: 109–127.
Author’s contact information
• e-mail: See handout and list of participants.
• home postal address: See handout.
• affiliation: Department of Modern Languages, Metsätalo (Unioninkatu 40 B), FIN-00014 University of Helsinki, Finland
51