- Number of slides: 87
Aug. 22nd, 2003 Steps towards Integrated Intelligence Naoyuki OKADA (Professor Emeritus) Kyushu Institute of Technology
Progress
Step 1 Conceptual taxonomy of vocabulary
Step 2 Natural language understanding of moving picture patterns
Step 3 Emotion processing vs. knowledge processing
Step 4 Integrated intelligence
1. Introduction
§ The history of research in Artificial Intelligence (AI) is a repetition of diversification and specialization of its fields, as in other research.
At the beginning of the 1960s
• Natural language processing
• Pattern recognition
• Learning
• Problem solving
At the beginning of the 2000s
1. Introduction: AI subfields in the 2000s
• Fundamentals/Theory: Knowledge representation, reasoning, algorithm, fuzzy theory, ---
• Learning/Discovery: Inductive/deductive learning, example-based reasoning, data mining, ---
• Infrastructure of knowledge: Knowledge acquisition, knowledge base, Web search, ---
• AI architecture/language
• Agent/Distributed AI: Problem solving by collaboration, agent society, ---
• Life/Brain system: Artificial life, genetic algorithm, connectionism, ---
• Natural language understanding, dialog processing, corpus, speech recognition, ---
• Pattern understanding: Image recognition, scene analysis, image sequence processing, ---
• Cognition/Body: Intelligent robot, symbol grounding, cognitive psychology, ---
・・・・・・・・・・・
§ However, too much diversification and specialization weakens the study of the relations among subfields. Those relations are important above all in human intelligence. § So, we should sometimes stop, look back, put the various results in order, and integrate them into a system.
Approach towards integration § Multi-modal Human intelligence accepts multi-modal inputs. - Natural language in letters/voices - Picture patterns
§ Intellect and sensitivity Knowledge and emotions are related like the two wheels of a cart.
2. Conceptual taxonomy of vocabulary § Language is “the window” of the mind. The semantic content of language, that is, the system of concepts, is the most important object of study in clarifying intelligence.
§ Research in early years
C. J. Fillmore ’68 Case grammar
M. R. Quillian ’68 Semantic network
R. Schank ’72 Conceptual dependency
Y. Wilks ’75 Preference semantics
Conceptual analysis § Categories of concepts Concepts are formed for everything in nature. - From the linguistic viewpoint there are five categories: substance, attribute, event, space/time, and miscellaneous. - But each category is vague.
§ Computational definition - What is substance? An individual whose quantity and quality can be recognized by sensors. Fig. 2・1 Substance sensed by eyes --- Mountain
- What is state? Fundamentally, a static relation among several substances. Fig. 2・2 State --- Man in the car
- What is attribute? A special case of state. Fundamentally, the difference between an object and a standard. Fig. 2・3 Object-standard pair --- The mountain is higher than the tree.
- A measure is necessary to detect the difference. Measure: Height (length in the perpendicular direction). This measure brings an attribute to the object.
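The object-standard idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `Substance`, the function `height_attribute`, the `tolerance` parameter and the numeric heights are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Substance:
    name: str
    height_m: float  # the measure: length in the perpendicular direction


def height_attribute(obj: Substance, standard: Substance, tolerance: float = 0.0) -> str:
    """Return the attribute that the height measure gives to obj relative to standard."""
    difference = obj.height_m - standard.height_m
    if difference > tolerance:
        return "higher"
    if difference < -tolerance:
        return "lower"
    return "equal"


# Illustrative values only (assumed, not taken from the slides).
mountain = Substance("mountain", 800.0)
tree = Substance("tree", 12.0)
print(f"The {mountain.name} is {height_attribute(mountain, tree)} than the {tree.name}.")
```

The attribute is not stored on the object; it is computed from the pair, which mirrors the claim that an attribute is a difference detected along a measure.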
- What is event? Fundamentally, a change from a before-state to an after-state. Before-state → Change → After-state Fig. 2・4 Before-after state pair --- A man gets out of a car
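A minimal sketch of the before-after definition of an event, assuming that a state is represented as a set of (relation, subject, object) triples. The representation, the function `detect_events` and the event name `get_out` are assumptions made for illustration, not the system's actual data structures.

```python
def detect_events(before, after):
    """Name events from the relations that differ between two states."""
    events = []
    for relation, subject, obj in sorted(before - after):
        # inside(S, O) in the before-state and outside(S, O) in the
        # after-state is read as "S gets out of O".
        if relation == "inside" and ("outside", subject, obj) in after:
            events.append(("get_out", subject, obj))
    return events


before_state = {("inside", "man", "car")}
after_state = {("outside", "man", "car")}
print(detect_events(before_state, after_state))
```

If the two states are identical, no relation changes and the function returns an empty list, that is, no event.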
- What are space and time? Fundamentally, they identify the location of a substance, attribute or event. Space: position Time: passage
Primitive and complex - Primitive A concept which cannot be decomposed any further (by referring to its word) - Complex A concept which can be decomposed into one or more primitives
§ Formation of complex concepts - Compound Type A: Two primitives are connected by a logical/syntactic relation. Type B: Primitives are connected by a scenario. - Derivative Derived from a primitive
Conceptual classification § Why classification? - Verification of the proposed theory - Acquisition of conceptual data for machine processing § Target vocabulary - About 32,000 words used in everyday language
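Results of classification
Fig. 2・5 Network of substance concepts, centered on 川 (river)
Whole/part: 上流 (upper stream), 中流 (midstream), 下流 (lower course), 本流 (main stream), 支流 (tributary), 川底 (riverbed), 川面 (river surface), 水源 (source), 河口 (river mouth)
Attribute: 深い (deep), 浅い (shallow), 激しい (violent), 急だ (rapid), 清い (clear), 大きい (big), 小さい (small), leading to 瀬 (shallows), 激流 (torrent), 急流 (rapid stream), 清流 (clear stream), 大川 (big river), 小川 (brook)
Event: flow, stagnate, leading to 淀み, 瀞 (pool)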
Table 2・1 Primitives of attribute/event
No.    Subcategory (attribute/event)        Examples (attribute/event)
0・00  Spirit/change_in_spirit              Glad/get angry
0・01  Sense/change_in_sense                Cold/hurt
1・00  Location/change_in_location          Deep/fall
1・01  Direction/change_in_direction        Diagonal/turn over
1・02  Shape/change_in_shape                Sharp/bend
1・03  Quality/change_in_quality            Soft/rot
1・04  Quantity/change_in_quantity          Many/decrease
1・05  Light/change_in_light                Dark/flash
1・06  Color/change_in_color                Red/color
1・07  Heat/change_in_heat                  Hot/cool
1・08  Force・power/change_in_force・power    Strong/strengthen
1・09  Sound/change_in_sound                Noisy/sing
1・10  Appearance・disappearance             Bare/appear
1・11  Start・finish                         Sudden/begin
1・12  Time/change_in_time                  Quick/pass
2・00  Continuation                         Constant/continue
2・01  State                                Fine/tower
3・00  Abstract                             Equivalent/fit
4・00  Others                               Eat
Table 2・2 Case-frame of events
Type                 Example
v(sbj)               Fall(leaf)
v(sbj, org)          Come(smoke, chimney)
v(sbj, goal)         Go(Taro, post office)
v(sbj, ptn)          Collide(truck, bus)
v(sbj, std)          Resemble(children, parents)
v(sbj, obj)          Break(boy, cup)
v(sbj, obj, org)     Unload(driver, box, truck)
v(sbj, obj, goal)    Put(girl, candy, pocket)
v(sbj, obj, inst)    Scoop(Hanako, sugar, spoon)
v(sbj, obj, att)     Feel(Jiro, breeze, cool)
Others
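The case frames of Table 2・2 map naturally onto a lookup table from verb to case roles. The following is a sketch only: the dictionary `CASE_FRAMES` holds a few rows of the table, and the helper `bind_cases` is a hypothetical name introduced here for illustration.

```python
# A few case frames copied from Table 2・2; role names follow the table.
CASE_FRAMES = {
    "fall": ("sbj",),
    "come": ("sbj", "org"),
    "go": ("sbj", "goal"),
    "break": ("sbj", "obj"),
    "put": ("sbj", "obj", "goal"),
    "scoop": ("sbj", "obj", "inst"),
}


def bind_cases(verb, *args):
    """Pair each argument with the case role that the verb's frame expects."""
    roles = CASE_FRAMES[verb]
    if len(args) != len(roles):
        raise ValueError(f"{verb} expects {len(roles)} arguments, got {len(args)}")
    return dict(zip(roles, args))


print(bind_cases("put", "girl", "candy", "pocket"))
```

A frame with the wrong number of arguments is rejected rather than silently truncated, which is one simple way to use case frames as a well-formedness check.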
§ Number of classified concepts
Substance 4,200
Attribute 2,060
Event 3,720
Space/time 1,800
-------
Total 11,780
Evaluation § Our theory covers 70% of the target vocabulary, and almost all of it with a slight extension. § Fundamental data on concepts were obtained, which later contributed to the construction of the EDR concept dictionaries.
Main publications
1973 N. Okada & T. Tamati: Analysis and Classification of Simple Matter Concepts for the Interpretation of Natural Language and Picture Patterns, IECE Trans., Vol. 56-D, No. 9, pp. 523-530.
1980 N. Okada: Conceptual Taxonomy of Japanese Verbs for Understanding Natural Language, Proc. COLING'80, pp. 127-135.
3. Natural language understanding of moving picture patterns § R. A. Kirsch, pioneer - Kirsch proposed integrated processing of language and pictures through a common representation of their meanings [Kirsch 64]. - But he processed only static picture patterns.
§ Approaches to moving picture patterns in early years
N. Badler ’75 Temporal scene analysis: sentence generation as the result of temporal scene analysis
Minsky ’75 Frame theory: universal data structure, particularly the representation of events
§ Our approach - Input Sequential pictures, each of which is a hand-drawn line drawing - Meanings captured Events of change_in_location, the largest subcategory in number - Output Japanese and English sentences
§ Flow of processing
start → Picture reading → Noise cleaning → Primitive picture recognition → Reasoning of occurring events → Structural analysis among primitives → Understanding events → Sentence generation → end
(The figure marks the bottom-up and top-down parts of this flow.)
Fig. 3・1 Natural language understanding of picture sequences
Bottom up process § Picture reading A TV camera follows a line segment by octagonal scanning. § Primitive picture recognition An input line drawing with graph structure is matched against a template, in a manner similar to wave propagation.
(a) Octagonal scanning (b) Line following Fig 3・2 Reading a line segment
(a) Input drawing (nodes P1 to P5) (b) Selected template (nodes q1 to q5) Fig. 3・3 Wave-propagation pattern matching (WPPM)
Top down process - Context and focus of attention Not everything in a picture is necessarily recognized in a given context; instead, attention is focused on some objects.
- Attentional rules
1. Objects related to a goal in the execution of a plan
2. Dangerous objects
3. Favorite things
4. Sudden, big change_in_location/change_in_shape
----
Fig. 3・4 Reasoning network of change_in_location
Each “+” condition refines a more general verb into a more specific one.
move: ((S, thing), time passage, existence(S))
Intermediate nodes V1, V2, V3: +((OX, thing), existence(OX)); +movement_perpendicularly(S); +(S, movable_by_oneself)
descend: +movement_downward(S, (T0, T1))
walk: +((S, legs), walking_figure(S))
go: +((S, direction), go_forward(S), come_close(S, OT))
touch: +touch(S, OT, T1)
collide: +short_time
get out: +((OF, inside), inside(S, OF, T0), outside(S, OF, T1))
get off: +(OF, vehicle)
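The refinement idea of the reasoning network can be sketched as a tree in which each verb adds conditions to its parent. The parent links below are an assumed reading of the figure, the intermediate nodes V1 to V3 are omitted, and the conditions are simplified to plain strings; the names `NETWORK`, `conditions_for` and `applicable_verbs` are hypothetical.

```python
# verb: (parent verb, conditions added at this node). Parent links are assumed.
NETWORK = {
    "move": (None, {"thing(S)", "time_passage", "existence(S)"}),
    "descend": ("move", {"movement_downward(S)"}),
    "walk": ("move", {"legs(S)", "walking_figure(S)"}),
    "go": ("move", {"go_forward(S)", "come_close(S,OT)"}),
    "touch": ("go", {"touch(S,OT)"}),
    "collide": ("touch", {"short_time"}),
    "get out": ("move", {"inside(S,OF,T0)", "outside(S,OF,T1)"}),
    "get off": ("get out", {"vehicle(OF)"}),
}


def conditions_for(verb):
    """Collect the conditions of a verb and of all its ancestors."""
    conditions = set()
    while verb is not None:
        parent, added = NETWORK[verb]
        conditions |= added
        verb = parent
    return conditions


def applicable_verbs(observed):
    """Return every verb whose accumulated conditions all hold."""
    return [verb for verb in NETWORK if conditions_for(verb) <= observed]


observed = {
    "thing(S)", "time_passage", "existence(S)",
    "legs(S)", "walking_figure(S)",
    "inside(S,OF,T0)", "outside(S,OF,T1)",
}
print(applicable_verbs(observed))
```

Because general and specific verbs can hold at the same time, one observation yields several candidate verbs, which is consistent with several sentences being generated for one before-after state pair.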
§ Structural analysis - Technologies * Numerical computation * Logical computation * Gestalt processing * Template matching
§ Logical computation Fig. 3・5 Boolean judgment of “inside/outside” (regions Si and Sj)
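The slide does not say how the Boolean inside/outside judgment is computed. As a stand-in, the sketch below uses the standard ray-casting (even-odd) point-in-polygon test, which is a swapped-in technique and not necessarily the one used in the original system. The function names and the coordinates are assumptions.

```python
def point_in_polygon(point, polygon):
    """Even-odd ray casting: count edge crossings of a ray going in +x."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def region_inside(si, sj):
    """True if every vertex of Si lies inside Sj (sufficient when Sj is convex)."""
    return all(point_in_polygon(p, sj) for p in si)


# Illustrative regions (assumed): a small figure Si and a larger contour Sj.
house = [(0, 0), (10, 0), (10, 10), (0, 10)]
man_before = [(4, 4), (5, 4), (5, 6), (4, 6)]
man_after = [(12, 4), (13, 4), (13, 6), (12, 6)]
print(region_inside(man_before, house), region_inside(man_after, house))
```

Applied to a before-state and an after-state, two such judgments give the inside(S, OF, T0) and outside(S, OF, T1) conditions used by the reasoning network.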
- Gestalt processing Metzger’s rule Continuation: two line segments meeting at an angle of 180° Enclosure: a domain enclosed by contours
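The continuation rule can be checked numerically. This is a minimal sketch: the function names and the angular tolerance of 5° are assumptions, since hand-drawn segments rarely meet at exactly 180° and the slides give no tolerance.

```python
import math


def meeting_angle(a, joint, b):
    """Angle in degrees at `joint` between segments joint-a and joint-b."""
    v1 = (a[0] - joint[0], a[1] - joint[1])
    v2 = (b[0] - joint[0], b[1] - joint[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return abs(math.degrees(math.atan2(cross, dot)))


def is_continuation(a, joint, b, tolerance_deg=5.0):
    """Two segments continue each other if they meet at about 180 degrees."""
    return abs(meeting_angle(a, joint, b) - 180.0) <= tolerance_deg


print(is_continuation((0, 0), (1, 0), (2, 0)))  # straight line through the joint
print(is_continuation((0, 0), (1, 0), (1, 1)))  # right-angle corner
```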
(a) Complex pictures (b) Template Fig. 3・ 6 Symbol processing
Experiments § Reading and recognition of hand-drawn line drawings § Structural analysis of static pictures § Natural language understanding (NLU) of before-after state pairs § NLU of picture sequences
(a) Before-state (b) After-state
Generated sentences:
1) A man(4) moves(1).
2) A man(4) passes(1).
3) A man(4) walks.
4) A man(4) goes forward(1).
5) A man(4) goes out(1) of a house.
6) A man(4) heads for a car.
7) A man(4) goes to(1) a car.
8) A man(4) comes to(1) a car.
9) A man(4) gets near a car.
-----
Fig. 3・7 NLU of a before-after state pair
t=t0 A man(4) is in a house. --------
t=t1 A man(4) goes out(1) of a house.
t=t2 A man(4) gets on(1) a car. --------
t=t3 A car runs(2). --------
t=t4 A car collides with a tree. --------
t=t5 A bird(1) leaves(1) a tree. A man(3) gets off(1) a car. --------
Fig. 3・8 NLU of a picture sequence
Evaluation § Reading and recognition About 150 primitive pictures were input; 88% of them were correctly recognized, and 95% could be recognized with some improvement.
§ Structural analysis of before-after state pairs Note that current image processing technology can process gray-scale image sequences in real time.
§ Meaning understanding Our technology is still useful for all the subcategories of events except the mental ones. § Historical significance This research took the lead in the field of NLU of moving picture patterns in the ’70s.
Main publications
1976 N. Okada & T. Tamachi: Interpretation of Moving Picture Patterns and its Description in Natural Language --- Semantic Analysis, IEICE Trans. (D), Vol. J59-D, No. 5, pp. 331-338.
1979 N. Okada: SUPP --- Understanding Moving Picture Patterns Based on Linguistic Knowledge, Proc. IJCAI, pp. 690-693.
4. Emotion processing vs. knowledge processing Why does AI need emotion processing? (1) Texts, e.g. social articles in newspapers, often touch on human concerns such as glad/sad or gain/loss. (2) Some intelligent agents should be friendly to humans. (3) Some kinds of processing need a mechanism for evaluating input information.
§ Research in early years
J. G. Carbonell ’80 Story understanding by personality
Pfeifer & Nicholas ’85 Simulation of emotion mechanism by “interruption”
Okada ’87 Emotion model in NLU
§ Our approach - Evocation and response Analysis of general property and algorithm - Roles shared by emotion and knowledge
Analysis of emotion § Multi-factor analysis by Plutchik Plutchik divided emotions into two categories: “primary” and “complex” [Plutchik60]. - We follow this idea and take the following as primary emotions: gladness/sadness, like/dislike, surprise, expectancy, anger, and fear.
Gladness: the current state is better than the previous
  physiological: inner pleasure; outer pleasure
  psychological
    goal achievement
      information collection: expected; discover; become clear
      plan: planning
      results: completion; gain; useful
    personal relations
      companion mind: agreement; sympathy; collaboration; make_friends_again
      superiority/inferiority: superior; praise; obedience; hospitality; protection
  others
Fig. 4・1 Hierarchical features of gladness
Evocation of emotion - Reflective Evoked unconsciously by a sudden stimulus from the external world or a remarkable change in the internal state. A reflective response follows it. - Deliberative Evoked consciously by a cognitive process. Deliberate reasoning mediates between the input and its response.
§ Response of emotion - General trends If an input brings one “pleasure”, one promotes the input stimulus through one’s response; otherwise one inhibits it. - Type of response * Free * Constrained
- Free An emotion is evoked directly by a stimulus, and a promoting/inhibitory response follows it. The response may cause the agent to give up a task under execution. - Constrained Even if a free emotion is evoked internally, a task under execution inhibits its direct expression.
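The promote/inhibit trend and the free/constrained distinction can be sketched as a small decision function. All names are hypothetical, and one simplification is made: in the free case the sketch always gives up the running task, whereas the text only says the response may do so.

```python
from dataclasses import dataclass


@dataclass
class Response:
    action: str         # "promote" or "inhibit" the input stimulus
    expressed: bool     # whether the emotion is expressed directly
    abandon_task: bool  # whether the task under execution is given up


def respond(pleasant: bool, task_in_progress: bool, task_constrains: bool) -> Response:
    """Choose a response to an evoked emotion (simplified sketch)."""
    action = "promote" if pleasant else "inhibit"
    if task_in_progress and task_constrains:
        # Constrained: the emotion is evoked internally, but the running
        # task inhibits its direct expression and continues.
        return Response(action, expressed=False, abandon_task=False)
    # Free: the response follows the stimulus directly (assumed here to
    # always interrupt a running task).
    return Response(action, expressed=True, abandon_task=task_in_progress)


print(respond(pleasant=False, task_in_progress=True, task_constrains=True))
```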
Emotion vs. knowledge § Language expression Emotions are expressed by adjectives, whereas knowledge is expressed by verbs. This implies that emotions are attributes. Since an attribute gives a measure for detecting the difference within an object-standard pair, the evocation of an emotion is a measurement of the input stimulus.
§ Subjective and objective Emotion: subjective evaluation of information Knowledge: memory of objective information § Pattern of evaluation Formation of personality
Experiments Simulation of protagonists of fables - Free evocation in a series of actions (Shown in Chapter 5) - Constrained evocation in dialog process
§ A dialog --- invitation
K1 Hi.
P2 Hi.
K3 Where are you going?
P4 To the river for fishing.
K5 Sounds good.
P6 And you?
K7 I’m going to the mansion to drink water of the pond. I’m very thirsty.
(continued)
P8 The mansion is dangerous.
K9 Why?
P10 Because I heard a voice when I passed it a while ago.
K11 Really? I wonder what shall I do.
P12 Why don’t you come to the river with me?
K13 Well, it’s far, isn’t it?
P14 But the water there is colder and more tasty.
K15 O.K. I’ll come with you.
Fig. 4・2 Interaction between discourse and mental analyses
Modules shown: dialogue model, intention recognition, utterance planning, action planning, emotion, language analysis, language generation
Legend: message flow, top-down prediction, dialogue state tracking, dialogue state transition
Labels shown: persuade_to_abandon(E-PLAN), understand(E-PLAN), tentative_acceptance(R-PLAN), understand_drawback(R-PLAN), emphasize_advantage(R-PLAN), accept(R-PLAN), persuade_to_accept(R-PLAN), refuse_for_drawback(R-PLAN), inform(E-PLAN), deny_drawback(R-PLAN), inform_advantage(R-PLAN), seek_advantage(R-PLAN), seek_drawback(R-PLAN), seek_drawback(E-PLAN), ---
Example utterances: (E7) “I’m going to the pond in the mansion to - - -.” (R12) “Why don’t you come to the river with me?” (E13) “Well, it’s far.” (R14) “But the water there is colder and more tasty.”
Evaluation § Conceptual analysis The properties of primitive emotions of children were made clear. § Evocation The so-called “non-logical” algorithm was clarified. § Response Complicated responses in behavior and dialog were verified.
Main publications
1987 N. Okada: Representation of knowledge and emotions, Proc. Kyushu Symp. Information processing, pp. 47-65.
1997 M. Tokuhisa & N. Okada: A Pattern Recognition Approach to Emotion Arousal of Intelligent Agents, Trans. JSAI, Vol. 39, No. 8, pp. 2440-2451.
5. Integrated intelligence Intelligence dwells in the mind. Recent research in the fields of cognitive science (CS) and AI throws light on the comprehensive mechanism of the mind.
Computer Models of the Mind § Existing models - M. Minsky ’85 System of multi-agents - Okada ’87 Mind composed of six domains and five levels - P. N. Johnson-Laird ’88 Systematization of the results of research in CS
§ The author’s model Fundamentally, we follow Minsky’s multi-agent model. A micro-processor, the “μ-agent”, and its “chain activation” are introduced.
μ-agent( name (identifier), domain (attached), input (premise of activation), execution (program), memory (data), description (result), output (message)) Fig. 5・1 Frame representation of μ-agent
- Chain activation Various functions of the mind are executed by a “chain activation”, that is, a series of activations of μ-agents. Recognition → Reasoning → Behavior Fig. 5・2 Chain activation
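The μ-agent frame and chain activation can be sketched as follows. The field names follow the frame of Fig. 5・1, but the message-matching rule, the `max_steps` guard and the three example agents are assumptions made for illustration; the actual system's scheduling is not described in the slides.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MicroAgent:
    name: str                          # identifier
    domain: str                        # domain the agent is attached to
    input: str                         # premise of activation (awaited message)
    execution: Callable[[list], list]  # program
    output: str                        # message sent after execution
    memory: dict = field(default_factory=dict)       # data
    description: list = field(default_factory=list)  # result


def chain_activate(agents, message, data, max_steps=100):
    """Activate agents one after another while some agent awaits the message."""
    trace = []
    for _ in range(max_steps):  # guard against cyclic chains
        agent = next((a for a in agents if a.input == message), None)
        if agent is None:
            break
        data = agent.execution(data)
        agent.description = data
        trace.append(agent.name)
        message = agent.output
    return trace, data


# Hypothetical agents for the chain Recognition → Reasoning → Behavior.
agents = [
    MicroAgent("recognition", "Recognition", "scene",
               lambda d: d + ["pond recognized"], "recognized"),
    MicroAgent("reasoning", "Reasoning&Design", "recognized",
               lambda d: d + ["plan: drink water"], "planned"),
    MicroAgent("behavior", "Expression", "planned",
               lambda d: d + ["walk to pond"], "done"),
]
trace, result = chain_activate(agents, "scene", [])
print(trace)
print(result)
```

Each agent only knows which message it waits for and which message it emits, so the chain emerges from local premises rather than from a central script.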
Domains of processing § The mind consists of six domains, which function as follows:
(1) Recognition
(2) Reasoning&Design
(3) Emotion
(4) Expression
(5) Memory
(6) Language
Fig. 5・3 Domains of processing
Mind (brain): Language, Emotion, Memory, Reasoning&Design, Recognition, Expression
Body: Sensors (Thirst, hunger, …), Actuators
External world: (Scene, speech, …), (Behavior, speech, …)
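Mind (brain): Language, Emotions, Memory, Reasoning&Design, Recognition, Expression
Components shown: Plan controller, Interrupt controller, Control, Planning, Plan generator, Simulator, Evaluator
Body: Sensors (Thirst, hunger, …), Actuators
External world: (Scene, speech, …), (Behavior, speech, …)
Reasoning labels shown: Reasoning of behavior: feasibility; Reasoning of di: existence, etc.; danger
Entries listed in the figure (translated from Japanese):
Recognition・human existence・intersection 1
Recognition・human existence・front 1 of mansion 1
Recognition・possibility of slipping・pond 1
Recognition・possibility of falling down・pond 1
Recognition・possibility of falling in・pond 1
Recognition・human existence・pond 1 of mansion 1
Recognition・human existence・pond 1
Recognition・human existence・front 1 of hunter's lodge 1
Recognition・possibility of catching a cold・pond 1
Recognition・human existence・grape trellis 1 of mansion 1
Recognition・possibility of drowning・pond 1
Recognition・possibility of freezing to death・pond 1
Recognition・human existence・east 1 of bridge 1
Recognition・possibility of slipping・pond 1 of mansion 1
・・・・・・・・・・・・・・・・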
§ Levels of data Along the concept formation process
Level 5 Connected concept
Level 4 Simple concept
Level 3 Conceptual feature
Level 2 Cognitive feature
Level 1 Raw data
Fig. 5・4 Levels of data
Levels shown: Connected concept, Primitive concept, Conceptual feature, Cognitive feature, Raw data
Columns shown: (Substance) Ni, (Attribute) Ai, (Event) Vi
Labels shown include: go with cases Agent and Origin; Shopping(buy, cash/card, ..., store, ...); Human, House, Car; High; Go; is_a; Movable; Inside; Composed: Roof, Wall, Room, ...; Movement_from_inside_to_outside, ...; Difference_in_length, ...; Associated, …; Extracted, …; Visual; (Internal); (External)
Aesopworld Project - Implementation of our theory - Simulation of the physical and mental activities of the protagonists of Aesop’s Fables, e.g. The Fox and the Grapes
Fig. 5・5 Chain activation of μ-agents
Domains shown: Language, Emotion, Reasoning&Design (Controller, Planner, Plan generator, Simulator, Evaluator, Reasoner), Memory (Plan knowledge, Nature reasoning), Recognition, Expression, Sensors, Actuators
Activated μ-agents shown: physiology: thirst; desire: relieve thirst; goal: relieve thirst; plan: drink water; plan: eat fruits; reasoning: water in pond; reasoning: pot in house; reasoning: human near pond
Domains shown: Language, Emotion, Reasoning&Design (Controller, Planner, Plan generator, Simulator, Evaluator, Reasoner), Memory (Plan knowledge, Nature reasoning), Recognition, Expression, Sensors, Actuators
Activated μ-agents shown: plan: go to mansion to drink water; action: movement to mansion
Experiments § Main system Four PCs and fifteen interpreters (subdomains 1 to 15)
PC 1: Turbo Linux 8
PC 2: Turbo Linux 8
PC 3: RedHat Linux 5.2J
PC 4: RedHat Linux 5.2J
Also shown: LAN, Message server
Fig. 5・6 Composition
Fig. 5・7 Snapshot 1
Fig. 5・8 Snapshot 2
Fig. 5・9 Animation
§ Generated monolog by the Fox It’s very hot today. I’m on the animal trail 300 meters from the intersection. I’m very thirsty. I’d like to relieve my thirst in a safe way in a hurry. I’ll search for and drink water. I’ll go home. My home is far. I give up going there. I’ll go under the bridge. It’s far. I give up going there… I study other ways. I’ll search for a place with water. I remember a pond. I’ll find it. I remember the B pond. It’s in the Aesopworld. I’ll go there. A hunter’s lodge is close to it. He’ll probably be in it. He is man. Man is dangerous. I give up going there… I’ll eat watery foods. I’ll search for and eat fruits…
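The monolog suggests a generate-and-evaluate loop: a candidate plan is proposed, evaluated for distance and danger, and given up if it fails. The sketch below imitates that pattern only. The class `Candidate`, the function `choose_plan`, the distance threshold and all numeric values are assumptions, not the Aesopworld implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    plan: str
    distance_m: float
    dangerous: bool


def choose_plan(candidates, max_distance_m):
    """Try candidates in order; give up those that are too far or dangerous."""
    monolog = []
    for candidate in candidates:
        monolog.append(f"I'll {candidate.plan}.")
        if candidate.distance_m > max_distance_m:
            monolog.append("It's far. I give up going there.")
        elif candidate.dangerous:
            monolog.append("It's dangerous. I give up going there.")
        else:
            return candidate, monolog
    return None, monolog


# Illustrative distances and danger flags (assumed).
candidates = [
    Candidate("go home", 2000.0, False),
    Candidate("go under the bridge", 1500.0, False),
    Candidate("go to the B pond", 400.0, True),  # a hunter's lodge is close
    Candidate("search for and eat fruits", 300.0, False),
]
chosen, monolog = choose_plan(candidates, max_distance_m=1000.0)
print("\n".join(monolog))
```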
Table 5.2 Comparison with Minsky and Johnson-Laird
Columns: Minsky ’85, Okada ’87, Johnson-Laird ’88
Approach: Bottom up, Top down
Domains: Many, Six
Levels: Many, Five, Many
Technology: Multi-agents, Turing machine
Experiment: Yes, No, No
Evaluation § Various mental activities discussed in CS and AI can be captured by our six domains and five levels. § An interface to physiology is placed at the level of raw data. § This model can be implemented if the number of μ-agents is less than ten thousand. § Our integrated intelligence took the lead in verifying its validity by experiments.
Main publications
1990 N. Okada and T. Endo: Story Generation Based on Dynamics of the Mind, Computational Intelligence, Vol. 8, No. 1, pp. 123-160.
1996 N. Okada: Integrating Vision, Motion, and Language through the Mind, Artificial Intelligence Review, Vol. 10, pp. 209-234.
6. Residual problems and social applications § Problems - Learning through experience - Implementation in robots § Applications - Support agents for education or diagnosis - Partners for handicapped/elderly people
7. Conclusions § Concepts of substance, attribute, event, and space/time were systematically analyzed and classified. § A system for NLU of picture sequences was constructed. § Primitive emotions were analyzed and implemented in the tasks of action and dialog planning.
§ A computer model of the mind with six domains of processing and five levels of data was proposed, and was implemented with twelve hundred μ-agents on computers. These results led us to the conclusion that an infrastructure for constructing complex intelligence covering many subfields can be obtained.