CL pres 2.ppt
- Number of slides: 26
Lexical Content and Lexical Form
Lexical Content
- Characterizations of Lexical Content
- When considering definitions of the lexicon, its functions and its linguistic importance, the question arises of what linguistic information it should contain.
- In practice, what information the lexicon should contain depends on the purpose for which the lexicon was built.
Lexical Content
- Chomsky suggests that ‘the lexicon presents, for each lexical item, its abstract phonological form and all the semantic properties associated with it’.
- He maintains that ‘the lexicon is a set of lexical elements’. It must specify, for each element, the phonetic, semantic and syntactic properties which are idiosyncratic to it.
- In addition, depending on the sophistication of the overall grammar, the lexicon should contain information on the subcategory of the word (such as whether a particular verb is transitive or intransitive) and on other syntactic properties (such as the gender of a noun in a language that makes gender distinctions).
Lexicon for Verbs
- Ch. Fillmore, the founder of the theory of semantic roles, insisted that a set of semantic roles should also be included in the lexicon.
- This idea is especially important for the lexicon of verbs, because only verbs can take such arguments as Actor, Goal and Recipient. In this way verb semantics is treated most appropriately. But the lexicon is not just verbs.
The Qualia Structure
- J. Pustejovsky proposes a set of four semantic roles, called the qualia structure, to handle the semantic information carried by nouns and adjectives.
- They are:
- constitutive: the role that specifies the relation between the lexical item and its constituent parts;
- formal: the role that distinguishes the lexical item within a larger domain of items with the same constituent parts;
- telic: the role that specifies the purpose and the function of the lexical item;
- agentive: the role that specifies whatever brings the lexical item about.
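The four qualia roles can be read as four named fields of a record. The following is a minimal sketch of that reading, not Pustejovsky's own formalism: the class name `QualiaStructure` and the fillers chosen for the noun "novel" are illustrative assumptions.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class QualiaStructure:
    """One field per qualia role named on the slide."""
    constitutive: str  # relation between the item and its constituent parts
    formal: str        # what distinguishes the item within a larger domain
    telic: str         # purpose and function of the item
    agentive: str      # whatever brings the item about


# Illustrative (assumed) fillers for the noun "novel".
novel = QualiaStructure(
    constitutive="narrative",
    formal="book",
    telic="read",
    agentive="write",
)

# Convert the record to a plain role -> filler mapping.
novel_roles = asdict(novel)
```

A lexicon for nouns and adjectives could then store one such record per lexical item alongside its other properties.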
Mapping the Lexicon to the Syntax
- According to Pustejovsky, there are 4 levels of mapping the lexicon to the syntax.
- These 4 levels of representation may be listed as:
- Argument structure: the behavior of a word as a function. It indicates how the word is mapped to syntactic expressions.
- Event structure: identification of the particular word or phrase as a state, process or transition.
- Qualia structure: the essential attributes of an object as defined by the lexical item.
- Inheritance structure: how the word is globally related to other concepts in the lexicon.
All-Inclusive Lexicon
- The conception of a richer lexicon leads to a combination and integration of phonological, morphological, collocational, syntactic, semantic and pragmatic information in various ways.
- In 1995 R. Hudson introduced a checklist for an All-Inclusive lexicon. This A-I lexicon would state the distinction between the lexicon and the grammar, reflecting the trend towards lexicalism. The checklist of information looks as follows:
All-Inclusive Lexicon
Phonology
- underlying segment structure, or several such structures if allomorphs are stored in the computer;
- prosodic patterns of the word (to the extent that there are no rules for computing these), i.e. mainly word stress or tone.
Morphology
- structure in terms of morphemes;
- irregular morphological structures linked to particular morphosyntactic features (i.e. irregular inflections);
- partial similarities to other words (i.e. derived words and compounds).
All-Inclusive Lexicon
Syntax
- general word class (e.g. ‘verb’);
- sub-class (e.g. ‘auxiliary’);
- obligatory morphosyntactic features (e.g. to be);
- valency.
Context
- restrictions relating to immediate social structure (e.g. solidarity markers);
- restrictions relating to style (formal, slang);
- restrictions relating to discourse structures (topic-change markers).
All-Inclusive Lexicon
Spelling
- normal orthography;
- standard abbreviations;
- inflectional irregularities in spelling.
Etymology and Language
- the language to which the word belongs;
- the language from which it was borrowed;
- the word on which it is based;
- the date when it was borrowed.
All-Inclusive Lexicon
Usage
- frequency and familiarity;
- age of acquisition;
- particular occasions on which the word was used;
- cliches containing the word;
- taboos.
Specifications:
- In this lexicon checklist there is no clear dividing line between linguistic knowledge and encyclopedic knowledge.
- Nowadays linguists hold that, while lexical and world knowledge must be distinguished, it is impossible to separate lexical (linguistic) knowledge from world knowledge discretely.
The TEI Lexicon and Lexical Form
- A problem: what should go into the lexicon.
- To meet the need for standardization, in August 1991 the Computational Lexicon Working Group created the Text Encoding Initiative program.
- Its primary task was to conduct a survey of currently existing lexicons and produce standards for interchanging electronic documents of various types, i.e. to create lexical databases intended for use by natural language processing systems of all sorts.
TEI Lexicon
The TEI Lexicon should include the following types of information:
Nouns:
- entity nouns (apple, book);
- relational nouns (speed, age, father);
- abstract nouns (courage, love);
- mass nouns (wine, sand);
- proper names (John, IBM);
- complement-taking properties (factive nouns like story).
TEI Lexicon
Pronouns: (I, he, she) and bound anaphors (myself, himself, each other).
Verbs:
- a wide variety of valency classes:
1. intransitive;
2. transitive;
3. ditransitive;
4. clausal complement taking;
5. infinitival complement taking;
6. small clause taking, including bare infinitive.
TEI Lexicon
Modals and Auxiliaries.
Prepositions:
- indication of subclasses of prepositions (case-marking, semantically contentful prepositions).
Adjectives:
- complement-taking properties (proud of, likely to);
- semantic classes of adjectives;
- the position in which an adjective can appear (prenominal, postnominal, predicate).
TEI Lexicon
Determiners and other similar nominal modifiers (articles, quantifiers, demonstratives, etc.).
Multi-word lexical entries.
Inflected categories of noun, verb and adjective: how irregular forms, inflectional paradigms and other morphological information are stored.
Conclusion: different word classes vary in their specification of linguistic content and thus demand different treatment when packed into a computer database.
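The conclusion that each word class needs its own treatment in a database can be illustrated with one record type per class. This is a hedged sketch only, not a TEI encoding: the names `Valency`, `NounType`, `VerbEntry`, `NounEntry` and `verbs_with` are hypothetical, and the verbs "give" and "sleep" are assumed examples (the nouns are taken from the slides).

```python
from dataclasses import dataclass
from enum import Enum


class Valency(Enum):
    # The six valency classes listed for verbs.
    INTRANSITIVE = 1
    TRANSITIVE = 2
    DITRANSITIVE = 3
    CLAUSAL_COMPLEMENT = 4
    INFINITIVAL_COMPLEMENT = 5
    SMALL_CLAUSE = 6


class NounType(Enum):
    ENTITY = "entity"
    RELATIONAL = "relational"
    ABSTRACT = "abstract"
    MASS = "mass"
    PROPER = "proper"


@dataclass(frozen=True)
class VerbEntry:
    lemma: str
    valency: frozenset  # a verb may belong to several valency classes


@dataclass(frozen=True)
class NounEntry:
    lemma: str
    noun_type: NounType


# A toy lexicon mixing the two record types.
lexicon = [
    VerbEntry("give", frozenset({Valency.TRANSITIVE, Valency.DITRANSITIVE})),
    VerbEntry("sleep", frozenset({Valency.INTRANSITIVE})),
    NounEntry("wine", NounType.MASS),
    NounEntry("apple", NounType.ENTITY),
]


def verbs_with(valency):
    """Return the lemmas of all verb entries that allow the given valency."""
    return [e.lemma for e in lexicon
            if isinstance(e, VerbEntry) and valency in e.valency]
```

Keeping a separate record type per word class means a noun entry never carries an empty valency field, which mirrors the point that the classes differ in the information they need.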
Lexical Form
- While the content and specification of each word class might be different, it is very important to provide a uniform means for representing word and world knowledge.
- One solution to the problem was to use the notion of frames. The term ‘frame’ was suggested by Minsky in 1975. He defines it as ‘a data structure for representing a stereotyped situation’.
- On this view, the frame provides the fundamental representation of knowledge in human cognition. A frame can represent the attributes of a particular object or concept more descriptively than is possible using rules alone.
Frame:
- A frame consists of a number of slots (or attributes), each of which may contain a value (‘filler’) or be blank. In other words, a frame basically uses a slot-filler notation. Applied to language, this slot-filler notation forms the basis for defining a lexical frame, i.e. a device to represent lexical items in terms of speaker-dependent constructs (‘lexical representations’).
- A slightly different account of the frame was given by Ch. Fillmore. He defines frames as “the specific lexico-grammatical provisions in a given language for naming and describing the categories and relations found in schemata”.
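The slot-filler notation described above maps naturally onto a dictionary whose values may be empty. The sketch below is one possible reading; the class `Frame`, its methods and the "commercial transaction" slots are hypothetical names chosen for illustration.

```python
class Frame:
    """A frame as a set of named slots, each holding a filler or left blank."""

    def __init__(self, name, slots):
        self.name = name
        # Every slot starts blank (None) until a filler is supplied.
        self.slots = {slot: None for slot in slots}

    def fill(self, slot, filler):
        """Put a filler into an existing slot; unknown slots are rejected."""
        if slot not in self.slots:
            raise KeyError(f"frame {self.name!r} has no slot {slot!r}")
        self.slots[slot] = filler

    def blanks(self):
        """Return the names of the slots that are still blank, in order."""
        return [slot for slot, filler in self.slots.items() if filler is None]


# Illustrative frame for a stereotyped situation (assumed example).
transaction = Frame("commercial transaction",
                    ["buyer", "seller", "goods", "money"])
transaction.fill("buyer", "John")
transaction.fill("goods", "book")
```

After the two calls to `fill`, `transaction.blanks()` lists the slots `seller` and `money`, which remain open.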
Frames and Schemata
- Conceptual schemata are sets of frameworks linked together in the categorization of actions, institutions and objects. Schemata commonly use a slot-filler approach.
- In frame semantics the slots could correspond to linguistic cases, and the fillers would be formal variations of the word for every case. A frame is used when the interpreter tries to make sense of a text by fitting its content into a pattern that is known independently of the text.
- Thus, for the process of text formation Fillmore suggests the notion of a text model, which can be thought of as ‘the assembly of schemata created by the interpreter, which models some set of possible complex scenes’.
Argument-Value Notation
According to this theory, linguistic information can be encoded as an attribute with an associated value:
[Arg1 Val1]
[Arg2 Val2]
……
[Argn Valn]
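A list of attribute-value pairs of this kind corresponds directly to a mapping. The sketch below renders such a mapping in the bracketed notation; the function name `to_bracket_notation` and the attributes in `entry` are assumptions made for the example.

```python
def to_bracket_notation(pairs):
    """Render a mapping as one '[Attribute Value]' line per pair."""
    return "\n".join(f"[{attr} {value}]" for attr, value in pairs.items())


# Illustrative (assumed) attributes for a verb entry.
entry = {
    "category": "verb",
    "valency": "intransitive",
    "tense": "past",
}

rendered = to_bracket_notation(entry)
```

Because Python dictionaries preserve insertion order, the rendered lines appear in the order in which the attributes were listed.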
Lexical Frame
Metzing spoke of 2 requirements that a lexical frame (LF) should fulfill:
1) The LF should correspond intuitively or empirically to the knowledge of a speaker. In appealing to a definite socio-psychological reality, this requirement touches on the Associative Lexicon described above.
2) The LF should provide a formally satisfactory basis for the definition of terms, i.e. form an explicit and theory-neutral grammatical representation for every slot.
Lexical Frame: Example
Lexical Frame of the Verb ‘pop’ (The balloon popped).
Composition(pop): p1 + p2 + p3 (pop consists of three phonemes)
Model(p1): /p/ (the first is p)
Model(p2): /o/ (the second is o)
Model(p3): /p/ (the third is p)
Model(pop): verb (pop is a verb)
Companion(verb): C (the verb pop takes a companion which precedes it)
Referent(pop): pop (the verb pop refers to the lexeme ‘pop’)
Model(pop): event (pop represents an instance of an event that involves a ‘changer’, namely the referent of the companion)
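The frame for ‘pop’ can be stored as a list of (slot, argument, value) triples. A list is used here rather than a dictionary because the slot Model(pop) occurs twice with different values. This is only a sketch of one possible encoding; the names `POP_FRAME` and `values` are hypothetical.

```python
# Each triple is (slot, argument, value), transcribed from the frame above.
POP_FRAME = [
    ("Composition", "pop", "p1 + p2 + p3"),
    ("Model", "p1", "/p/"),
    ("Model", "p2", "/o/"),
    ("Model", "p3", "/p/"),
    ("Model", "pop", "verb"),
    ("Companion", "verb", "C"),
    ("Referent", "pop", "pop"),
    ("Model", "pop", "event"),
]


def values(slot, argument):
    """Return every value recorded for slot(argument), in frame order."""
    return [v for s, a, v in POP_FRAME if s == slot and a == argument]


pop_models = values("Model", "pop")
```

Here `pop_models` holds both "verb" and "event", showing that a single slot may carry more than one value in this encoding.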
Declarative View of Language
- Frame theory supports the view of language as mainly declarative, not procedural.
- The term declarative refers to the association between language and information elements.
- This association should be permitted by the laws of the language, e.g. a declarative view of language is that any clause can consist of 5 elements: Subject, Verb, Object, Complement, Adverbial.
A procedural approach to language
- A procedural approach to language, by contrast, would require specifying which elements are to be processed first, e.g. on encountering a clause we should carry out the following procedures:
1. isolate the Verb element first;
2. note the subcategorization information for the Verb;
3. isolate the Subject element, which usually occurs to the left of the Verb;
4. guess whether the Verb takes C or O;
5. assign either C or O (or both);
6. guess whether there are any remaining prepositional phrases or adverbial phrases.
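The six steps above can be sketched as an ordered procedure over a clause that has already been split into phrases. This is a toy illustration under strong assumptions: the input is pre-chunked and labelled (NP, V, AP, PP, ADVP), the subcategorization table `SUBCAT` is invented for the example, and treating leftover prepositional and adverbial phrases as Adverbials is an interpretation of the last step.

```python
# Hypothetical subcategorization table: elements each verb expects after it.
SUBCAT = {"popped": [], "read": ["O"], "is": ["C"], "made": ["O", "C"]}


def analyse(chunks):
    """Assign S, V, O, C, A to a list of (text, phrase_type) chunks.

    Assumes the clause contains at least one chunk of type "V".
    """
    result = {"S": None, "V": None, "O": None, "C": None, "A": []}

    # Step 1: isolate the Verb element first.
    v_index = next(i for i, (_, kind) in enumerate(chunks) if kind == "V")
    verb = chunks[v_index][0]
    result["V"] = verb

    # Step 2: note the subcategorization information for the Verb.
    expected = list(SUBCAT.get(verb, []))

    # Step 3: the Subject is the nearest NP to the left of the Verb.
    for text, kind in reversed(chunks[:v_index]):
        if kind == "NP":
            result["S"] = text
            break

    for text, kind in chunks[v_index + 1:]:
        if kind in ("NP", "AP") and expected:
            # Steps 4-5: assign O and/or C as the Verb's frame dictates.
            result[expected.pop(0)] = text
        elif kind in ("PP", "ADVP"):
            # Step 6: remaining PPs and adverbial phrases become Adverbials.
            result["A"].append(text)
    return result


parsed = analyse([
    ("the student", "NP"),
    ("read", "V"),
    ("the book", "NP"),
    ("in the library", "PP"),
])
```

The contrast with the declarative view is that the order of the steps matters here: the Subject can only be located once the Verb has been found, whereas the declarative statement merely lists which elements a clause may contain.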
Conclusion
These two approaches to language description correspond to the distinction in computer programming between procedural languages (C++, Pascal) and declarative languages (Prolog). In practice, both declarative and procedural approaches are very often present in any substantial program.


