
Learning Automata (ﺍﺗﻮﻣﺎﺗﺎی یﺎﺩگیﺮ)
Instructor: Saeed Shiry

Introduction

• An automaton is a machine or control mechanism designed to automatically follow a predetermined sequence of operations or respond to encoded instructions.
• The concept of the learning automaton grew out of a fusion of the work of psychologists in modeling observed behavior, the efforts of statisticians to model the choice of experiments based on past observations, the attempts of operations researchers to implement optimal strategies in the context of the two-armed bandit problem, and the endeavors of system theorists to make rational decisions in random environments.

Stochastic Learning Automata – Reinforcement Learning

Stochastic Learning Automata – Reinforcement Learning

• In classical control theory, the control of a process is based on complete knowledge of the process/system. The mathematical model is assumed to be known, and the inputs to the process are deterministic functions of time.
• Later developments in control theory considered the uncertainties present in the system. Stochastic control theory assumes that some of the characteristics of the uncertainties are known.
• However, all those assumptions on uncertainties and/or input functions may be insufficient to successfully control the system if it changes. It is then necessary to observe the process in operation and obtain further knowledge of the system, i.e., additional information must be acquired on-line, since a priori assumptions are not sufficient. One approach is to view these as problems in learning.

Reinforcement Learning

• A crucial advantage of reinforcement learning compared to other learning approaches is that it requires no information about the environment except for the reinforcement signal.
• A reinforcement learning system is slower than other approaches for most applications, since every action needs to be tested a number of times for satisfactory performance.
• Either the learning process must be much faster than the environment changes, or reinforcement learning must be combined with an adaptive forward model that anticipates the changes in the environment.

Applications of Learning Automata

Some recent applications of learning automata to real-life problems:
• control of absorption columns,
• control of bioreactors,
• control of manufacturing plants,
• pattern recognition,
• graph partitioning,
• active vehicle suspension,
• path planning for manipulators,
• distributed fuzzy logic processor training,
• path planning and action selection for autonomous mobile robots.

The Learning Paradigm

• The learning paradigm that the learning automaton presents may be stated as follows: a finite number of actions can be performed in a random environment.
• When a specific action is performed, the environment provides a random response which is either favorable or unfavorable.
• The objective in the design of the automaton is to determine how the choice of the action at any stage should be guided by past actions and responses.
• The important point to note is that the decisions must be made with very little knowledge concerning the "nature" of the environment.
• The uncertainty may be due to the fact that the output of the environment is influenced by the actions of other agents unknown to the decision maker.

The Automaton and the Environment

The Environment

• The environment in which the automaton "lives" responds to the action of the automaton by producing a response, belonging to a set of allowable responses, which is probabilistically related to the automaton action.
• The term environment is not easy to define in the context of learning automata. The definition encompasses a large class of unknown random media in which an automaton can operate.

The Environment

• Mathematically, an environment is represented by a triple {α, c, β}, where:
  • α represents a finite action/output set,
  • β represents a (binary) input/response set, and
  • c is a set of penalty probabilities, where each element c_i corresponds to one action α_i of the set α.

The Environment

• The output (action) α(n) of the automaton belongs to the set α, and is applied to the environment at time t = n.
• The input β(n) from the environment is an element of the set β and can take on one of the values β_1 and β_2.
  • In the simplest case, the values β_i are chosen to be 0 and 1, where 1 is associated with a failure/penalty response.
• The elements of c are defined as c_i = Pr{β(n) = 1 | α(n) = α_i}, i = 1, 2, ..., r.
• Therefore c_i is the probability that the action α_i will result in a penalty input from the environment.
• When the penalty probabilities c_i are constant, the environment is called a stationary environment.
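For concreteness, here is a minimal sketch of such a stationary P-model environment. The class name, method name, and the example penalty probabilities are illustrative assumptions, not taken from the slides.

```python
import random

class StationaryPModelEnvironment:
    """Stationary P-model environment: action i draws a penalty (beta = 1)
    with fixed probability c_i, otherwise a reward (beta = 0)."""

    def __init__(self, penalty_probs):
        # penalty_probs[i] is c_i, the penalty probability of action alpha_i
        self.penalty_probs = penalty_probs

    def respond(self, action_index):
        # Return 1 (penalty) with probability c_i, else 0 (reward)
        return 1 if random.random() < self.penalty_probs[action_index] else 0

# Example: a 3-action environment with c = (0.2, 0.6, 0.4)
env = StationaryPModelEnvironment([0.2, 0.6, 0.4])
beta = env.respond(0)   # response to action alpha_1
```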

Models

• P-model: Models in which the input from the environment can take only one of two values, 0 or 1, are referred to as P-models. In this simplest case, the response value 1 corresponds to an "unfavorable" (failure, penalty) response, while an output of 0 means the action is "favorable".
• Q-model: A further generalization of the environment allows finite response sets with more than two elements that may take a finite number of values in an interval [a, b]. Such models are called Q-models.
• S-model: When the input from the environment is a continuous random variable with possible values in an interval [a, b], the model is named the S-model.

The Automaton

The automaton can be represented by a quintuple {Φ, α, β, F(•, •), H(•, •)} where:
• Φ is a set of internal states. At any instant n, the state φ(n) is an element of the finite set Φ = {φ_1, φ_2, ..., φ_s}.
• α is a set of actions (or outputs of the automaton). The output or action of the automaton at the instant n, denoted by α(n), is an element of the finite set α = {α_1, α_2, ..., α_r}.
• β is a set of responses (or inputs from the environment). The input from the environment β(n) is an element of the set β, which could be either a finite set or an infinite set, such as an interval on the real line: β = {β_1, β_2, ..., β_m} or β = {(a, b)}.

The Automaton

• F(•, •): Φ × β → Φ is a function that maps the current state and input into the next state. F can be deterministic or stochastic: φ(n+1) = F[φ(n), β(n)].
• H(•, •): Φ × β → α is a function that maps the current state and input into the current output. If the current output depends only on the current state, the automaton is referred to as a state-output automaton. In this case, the function H(•, •) is replaced by an output function G(•): Φ → α, which can be either deterministic or stochastic: α(n) = G[φ(n)].

The Stochastic Automaton

• In a stochastic automaton, at least one of the two mappings F and G is stochastic.
• If the transition function F is stochastic, the elements f_ij^β of F represent the probability that the automaton moves from state φ_i to state φ_j following an input β:

  f_ij^β = Pr{φ(n+1) = φ_j | φ(n) = φ_i, β(n) = β}

The Stochastic Automaton

• For the mapping G, the definition is similar: g_ij = Pr{α(n) = α_j | φ(n) = φ_i}.
• Since the f_ij^β are probabilities, they lie in the closed interval [0, 1]; and to conserve probability measure we must have Σ_j f_ij^β = 1 for every state φ_i and every input β.
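A minimal sketch of how such stochastic transitions can be simulated, assuming a two-state automaton with one transition matrix per environment response; the matrix values are illustrative, not from the slides.

```python
import random

# One transition matrix per environment response beta; each row sums to 1.
F = {
    0: [[0.9, 0.1],   # F^0: transition probabilities after a reward (beta = 0)
        [0.2, 0.8]],
    1: [[0.4, 0.6],   # F^1: transition probabilities after a penalty (beta = 1)
        [0.7, 0.3]],
}

def next_state(state, beta):
    """Sample phi(n+1) from the row of probabilities f_{state, j}^beta."""
    row = F[beta][state]
    return random.choices(range(len(row)), weights=row)[0]

state = 0
state = next_state(state, beta=1)  # move to a new state after a penalty response
```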

The Stochastic Automaton

Automaton and Its Performance Evaluation

• A learning automaton generates a sequence of actions on the basis of its interaction with the environment. If the automaton is "learning" in the process, its performance must be superior to "intuitive" methods.
• To judge the performance of the automaton, we need to set up quantitative norms of behavior.
• The quantitative basis for assessing the learning behavior is quite complex, even in the simplest P-model and stationary random environments. To introduce the definitions for "norms of behavior", we will consider this simplest case.

Norms of Behavior

• If no prior information is available, there is no basis on which the different actions α_i can be distinguished. In such a case, all action probabilities would be equal, a "pure chance" situation.
• For an r-action automaton, the action probability vector p(n), with p_i(n) = Pr{α(n) = α_i}, is then given by:

  p_i(n) = 1/r, for i = 1, 2, ..., r.

• Such an automaton is called the "pure chance automaton," and will be used as the standard for comparison.

Norms of Behavior

• Consider a stationary random environment with penalty probabilities {c_1, c_2, ..., c_r}.
• We define a quantity M(n) as the average penalty for a given action probability vector:

  M(n) = E[β(n) | p(n)] = Σ_i c_i p_i(n)

Norms of Behavior

• For the pure-chance automaton, M(n) is a constant denoted by M_0:

  M_0 = (1/r) Σ_i c_i

• Also note that E[M(n)] = E[E[β(n) | p(n)]] = E[β(n)], i.e., E[M(n)] is the average input to the automaton.
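A small worked check of these norms, with assumed numbers that are not from the slides: an automaton whose current action probabilities favor the low-penalty action yields an average penalty below the pure-chance value, which is the comparison behind the notion of expediency used later.

```python
c = [0.2, 0.6, 0.4]          # penalty probability c_i for each action (assumed)
p = [0.7, 0.1, 0.2]          # current action probabilities p_i(n) (assumed)

M_n = sum(ci * pi for ci, pi in zip(c, p))   # average penalty M(n)
M_0 = sum(c) / len(c)                        # pure-chance average penalty M_0

print(M_n, M_0)   # 0.28 < 0.4 here, i.e. better than pure chance
```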

Norms of Behavior

Variable Structure Automata

• A more flexible learning automaton model can be created by considering more general stochastic systems in which the action probabilities (or the state transitions) are updated at every stage using a reinforcement scheme.
• For simplicity, we assume that each state corresponds to one action, i.e., the automaton is a state-output automaton.

Reinforcement Scheme

• A reinforcement scheme can be represented as an update rule that maps the current action probability vector p(n), the chosen action α(n), and the environment response β(n) into the new vector p(n+1), where T_1 and T_2 are the mappings used for this update.

Linear Reinforcement Schemes

• In the general linear schemes, the parameter a is associated with the reward response, and the parameter b with the penalty response.
• If the learning parameters a and b are equal, the scheme is called the linear reward-penalty scheme L_R-P.
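As a concrete illustration, here is a minimal sketch of the standard general linear update for an r-action automaton; the function name and the example parameters are my own. Setting a = b gives the L_R-P scheme above, and setting b = 0 gives the reward-inaction scheme discussed later.

```python
def linear_update(p, chosen, beta, a=0.1, b=0.1):
    """General linear reinforcement update.

    p      : list of current action probabilities p_i(n) (sums to 1)
    chosen : index i of the action alpha_i performed at step n
    beta   : environment response (0 = reward, 1 = penalty)
    a, b   : reward and penalty learning parameters
             (a == b gives L_R-P, b == 0 gives L_R-I)
    """
    r = len(p)
    q = p[:]
    if beta == 0:                       # reward: move probability toward alpha_i
        for j in range(r):
            q[j] = (1 - a) * p[j]
        q[chosen] = p[chosen] + a * (1 - p[chosen])
    else:                               # penalty: move probability away from alpha_i
        for j in range(r):
            q[j] = b / (r - 1) + (1 - b) * p[j]
        q[chosen] = (1 - b) * p[chosen]
    return q

# One iteration: action 0 was performed and rewarded
p = linear_update([1/3, 1/3, 1/3], chosen=0, beta=0)
```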

Linear Reinforcement Schemes

• By analyzing the eigenvalues of the resulting difference equation, it can be shown that the asymptotic solution of the set of difference equations allows us to conclude that the limiting value of E[M(n)] is below M_0.
• Therefore, the multi-action automaton using the L_R-P scheme is expedient for all initial action probabilities and in all stationary random environments.

Expediency

• Expediency is a relatively weak condition on the learning behavior of a variable-structure automaton. An expedient automaton will do better than a pure-chance automaton, but it is not guaranteed to reach the optimal solution.
• In order to obtain a better learning mechanism, the parameters of the linear reinforcement scheme are changed as follows:
  • If the learning parameter b is set to 0, the scheme is named the linear reward-inaction scheme L_R-I. This means that the action probabilities are updated in the case of a reward response from the environment, but no penalties are assessed.

Interconnected Automata

• It is possible that there is more than one automaton in an environment.
• If the interaction between different automata is provided by the environment, the multi-automata case is no different from the single-automaton case: the environment reacts to the actions of multiple automata, and the environment output is a result of the combined effect of the actions chosen by all automata.
• If there is direct interaction between the automata, as in hierarchical (or sequential) automata models, the actions of some automata directly depend on the actions of others.
• It is generally recognized that the potential of learning automata can be increased if specific rules for interconnections can be established.
• Example: vehicle control. Since each vehicle's planning layer will include two automata, one for lateral and one for longitudinal actions, the interdependence of these two sets of actions automatically results in an interconnected automata network.

Application of Learning Automata to Intelligent Vehicle Control

• Designing a system that can safely control a vehicle's actions while contributing to the optimal solution of the congestion problem is difficult.
• When the design of a vehicle capable of carrying out tasks such as vehicle following at high speeds, automatic lane tracking, and lane changing is complete, we must also have a control/decision structure that can intelligently make decisions in order to operate the vehicle in a safe way.

Vehicle Control

• The aim here is to design an automata system that can learn the best possible action (or action pair: one lateral, one longitudinal) based on the data received from on-board sensors.

The Model

• For our model, we assume that an intelligent vehicle is capable of two sets of actions, lateral and longitudinal.
  • Lateral actions are shift-to-left-lane (SL), shift-to-right-lane (SR), and stay-in-lane (SiL).
  • Longitudinal actions are accelerate (ACC), decelerate (DEC), and keep-same-speed (SM).
• There are nine possible action pairs, provided that speed deviations during lane changes are allowed.

Sensors

• An autonomous vehicle must be able to 'sense' the environment around itself. In the simplest case, it is to be equipped with at least one sensor looking in the direction of possible vehicle moves. Furthermore, an autonomous vehicle must also have knowledge of the rate of its own displacement.
• Therefore, we assume that there are four different sensors on board the vehicle: a headway sensor, two side sensors, and a speed sensor.
  • The headway sensor is a distance-measuring device which returns the headway distance to the object in front of the vehicle. An implementation of such a device is a laser radar.
  • Side sensors are assumed to be able to detect the presence of a vehicle traveling in the immediately adjacent lane. Their outputs are binary. Infrared or sonar detectors are currently used for this type of sensor.
  • The speed sensor is simply an encoder returning the current wheel speed of the vehicle.

Automata in a multi-teacher environment connected to the physical layers

Mapping

• The mapping F from sensor module outputs to the input β of the automata can be a binary function (for a P-model environment), a linear combination of four teacher outputs, or a more complex function, as is the case for this application.
• An alternative and possibly more ideal model would use a linear combination of teacher outputs with adjustable weight factors (e.g., an S-model environment).

Buffer in the Regulation Layer

• The regulation layer is not expected to carry out the chosen action immediately. This is not even possible for lateral actions.
• To smooth the system output, the regulation layer carries out an action only if it is recommended m times consecutively by the automaton, where m is a predefined parameter less than or equal to the number of iterations per second.
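A minimal sketch of this buffering behavior: an action is passed on to the regulation layer only after it has been recommended m times in a row. The class and variable names are illustrative, not from the thesis.

```python
class ActionBuffer:
    def __init__(self, m):
        self.m = m            # required number of consecutive recommendations
        self.last = None      # last recommended action
        self.count = 0        # how many times in a row it was recommended

    def push(self, action):
        """Return the action once it has been recommended m times consecutively,
        otherwise return None (keep the current behavior)."""
        if action == self.last:
            self.count += 1
        else:
            self.last, self.count = action, 1
        return action if self.count >= self.m else None

buf = ActionBuffer(m=3)
for rec in ["SL", "SL", "SiL", "SL", "SL", "SL"]:
    decision = buf.push(rec)   # returns "SL" only on the third consecutive "SL"
```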

References

• PhD thesis: Unsal, Cem, "Intelligent Navigation of Autonomous Vehicles in an Automated Highway System: Learning Methods and Interacting Vehicles Approach." http://scholar.lib.vt.edu/theses/available/etd5414132139711101/
• http://ceit.aut.ac.ir/~shiry/lecture/machinelearning/tutorial/LA/

Cellular Learning Automata (ﺍﺗﻮﻣﺎﺗﺎﻱ ﻳﺎﺩگﻴﺮ ﺳﻠﻮﻟﻲ)
Amirkabir University - Machine Learning Course

Cellular Automata (CA)

• A cellular automaton is a grid of cells arranged in a specific topology.
• For each cell, certain other cells are designated as its neighbors.
  • Two properties of neighborhood:
    1) Every cell is a neighbor of itself.
    2) If cell x is a neighbor of cell y, then cell y is also a neighbor of cell x.
• Each cell has a finite number of states, and at any moment each cell is in exactly one of these states.

• In a CA, a set of local rules is defined that determines the next state of each cell based on the states of its neighbors. (Figure: Neighborhood → Rules → Next Step)

The basic properties of cellular automata are:
a) They have a discrete space.
b) Time advances in discrete steps.
c) The set of states a cell can take is finite.
d) All cells are identical.
e) The cells are updated synchronously.
f) The rules are not random; they are applied deterministically.
g) The rule at each location depends only on the values of that cell's neighbors.

Example: Game of Life

Rules (a small sketch of one synchronous update step follows below):
1. Any cell with two or three live neighbors stays alive.
2. Any cell with four or more live neighbors dies of overcrowding.
3. Any cell with one or no live neighbors dies of loneliness.
4. Any dead cell with exactly three live neighbors comes to life.
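A minimal sketch of one synchronous Game of Life step on a small grid, illustrating the four rules above; using a wrap-around (toroidal) border is my own choice, not part of the slides.

```python
def step(grid):
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # count live cells in the 8-cell (Moore) neighborhood, wrapping around
            live = sum(grid[(i + di) % rows][(j + dj) % cols]
                       for di in (-1, 0, 1) for dj in (-1, 0, 1)
                       if (di, dj) != (0, 0))
            if grid[i][j] == 1:
                new[i][j] = 1 if live in (2, 3) else 0   # rules 1-3
            else:
                new[i][j] = 1 if live == 3 else 0        # rule 4
    return new

# A "blinker": a horizontal line of three live cells flips to a vertical line
grid = [[0] * 5 for _ in range(5)]
grid[2][1] = grid[2][2] = grid[2][3] = 1
grid = step(grid)
```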

Capabilities and Shortcomings of Cellular Automata

• A CA is a model built from a set of simple, identical components governed by very simple local rules, yet in the end it can model complex systems.
• A major drawback of CA is the need to determine the exact form of the rules required for a particular application, and the fact that CA is suited to modeling deterministic systems.
• We should therefore look for an approach that extracts suitable rules over time, without requiring the exact form of the rules to be specified in advance. Making the CA cells intelligent and adding a learning capability to them is one such approach!

Cellular Learning Automata (CLA)

• A CLA is a CA in which each cell is equipped with an LA.
• This model is better than a CA because of its learning capability.
• It is also superior to an LA, because it is a collection of LAs that can interact with one another.
• The main idea of the CLA is to use learning to tune how state transitions take place in the CA.

Mathematical Definition of Cellular Learning Automata

A d-dimensional cellular learning automaton is a tuple whose components are:
• a lattice of ordered d-tuples of integers; this lattice may be finite, semi-finite, or infinite;
• a finite set of states;
• a set of learning automata (LA), one assigned to each cell;
• a finite subset of the lattice, called the neighborhood vector;
• the local rule of the CLA, which maps the states of a neighborhood to the set of values that can be accepted as the reinforcement signal.

Applications of Cellular Learning Automata

• Image processing applications (noise removal and edge detection)
• Resource allocation in mobile networks (channel assignment and channel admission)
• Modeling of phenomena (rumor diffusion and economic markets)
• Computational grids
• Design of multi-agent systems
• Optimization algorithms

Application of CLA in Image Processing

• First, the image is mapped onto a two-dimensional cellular learning automaton so that each pixel is assigned to one cell of the learning automaton.
• Depending on the particular application of interest, the possible actions of the cells and the local penalty/reward rule are then defined.

Example: Detecting Image Edges Using CLA

• Each automaton has two actions:
  1. The pixel belongs to an edge.
  2. The pixel does not belong to an edge.
• Initially, each automaton randomly selects one of its actions; the number of automata selecting the first action is chosen to be smaller than the number selecting the second action.
• At each iteration, every automaton compares its situation with that of its neighbors and corrects its behavior based on this comparison, on the basis that:
  • If two to four of a cell's neighboring automata are on an edge, this cell is probably also on an edge.
  • If one, or more than four, of a cell's neighboring automata are not on an edge, this cell is probably not on an edge either.

In this way (a small sketch of this reward rule follows below):
• If a cell in the CLA selects its first action and the number of automata in its 8-cell neighborhood that have selected the same action is between two and four, the selected action is considered appropriate and the cell receives a reward.
• If a cell in the CLA selects its second action and the number of automata in its 8-cell neighborhood that have selected the same action is one or more than four, the selected action is considered appropriate and the cell receives a reward.
• In all other cases the selected action is considered wrong and the cell is penalized.
• The above procedure is repeated a fixed number of times, or until all automata reach a stable configuration.
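A minimal sketch of the local reward/penalty rule just described; EDGE and NOT_EDGE stand for the cell's two actions, and the function name is my own.

```python
EDGE, NOT_EDGE = 0, 1

def local_rule(cell_action, neighbor_actions):
    """Return 0 (reward) or 1 (penalty) for a cell, given the actions chosen by
    its 8-cell neighborhood, following the two reward cases above."""
    same = sum(1 for a in neighbor_actions if a == cell_action)
    if cell_action == EDGE and 2 <= same <= 4:
        return 0                      # reward: 2-4 neighbors also chose "edge"
    if cell_action == NOT_EDGE and (same == 1 or same > 4):
        return 0                      # reward: 1 or more-than-4 chose "not edge"
    return 1                          # penalty otherwise

beta = local_rule(EDGE, [EDGE, EDGE, NOT_EDGE, NOT_EDGE,
                         NOT_EDGE, NOT_EDGE, NOT_EDGE, EDGE])  # -> 0 (reward)
```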

Performance of the CLA Method in Edge Extraction

References

• H. Beigy and M. R. Meybodi, "A Mathematical Framework for Cellular Learning Automata," Advances in Complex Systems, 2004.
• Mohammad Reza Meybodi and Mohammad Rafi Kharazmi, "Cellular Learning Automata and Its Applications in Image Processing" (in Persian), Amirkabir University, Fall 1382 (2003).