  • Number of slides: 37

An Expert System for Chemical Structure Elucidation
Sean Walker
COMP 4200
November 13, 2007

Introduction
• I will be discussing an expert system developed to determine the chemical structure of an unknown compound (structure elucidation)
• The expert system is implemented on a blackboard

Introduction: Motivation
• Structure elucidation is a fundamental component of organic chemistry
• Requires a wide range of expertise
  – Each elucidation technique has its own unique vocabulary that needs to be mastered
• An expert system can be used to simplify this process

Introduction: Outline
• Outline of presentation:
  1) Fundamentals of blackboard systems
  2) The expertise being modeled
     • General spectroscopic techniques
  3) Description of the expert system

Blackboard Systems
“Metaphorically, we can think of a set of workers, all looking at the same blackboard: each is able to read everything that is on it and to judge when he has something worthwhile to add to it.” – Newell, 1969

Blackboard Systems
• A set of experts independently modify solution elements on a central database to produce a complete solution
• The experts communicate solely through their contributions to the central database
• Three major components:
  – 1) a globally accessible database (the blackboard)
  – 2) a set of knowledge sources (the experts)
  – 3) a control mechanism (the scheduler)

Blackboard Systems: The Blackboard
• The blackboard is structured as an abstraction hierarchy
• Problems can be solved from different points by different knowledge sources
• Items on the blackboard are called entries
• Entries on the same level or on different levels of the hierarchy are linked
• Linked entries constitute a potential solution

Blackboard Systems: The Knowledge Sources
• Knowledge sources are structured as condition-action pairs
  – The condition component monitors the blackboard for any changes
  – The action component makes changes to the blackboard when the condition is satisfied
• When the condition is satisfied, the knowledge source is “triggered” and the scheduler decides whether the knowledge source will execute its action

Blackboard Systems: The Scheduler
• One or more problem-solving strategies are implemented
• The scheduler examines the current state of the blackboard and decides which triggered knowledge source to execute based on the problem-solving strategy in place
• The scheduler can abandon a strategy and adopt a new one, or ignore a strategy altogether, in order to pursue the most promising solution
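The three components described on the preceding slides can be sketched in a few lines of Python. This is only an illustrative toy, not the system discussed in the talk: the class names, the `priority` field, the "highest priority wins" strategy, and the string entries are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Blackboard:
    """Globally accessible database; a flat list of entries for simplicity."""
    entries: List[str] = field(default_factory=list)


@dataclass
class KnowledgeSource:
    """A condition-action pair plus a priority consulted by the scheduler."""
    name: str
    condition: Callable[[Blackboard], bool]
    action: Callable[[Blackboard], None]
    priority: int = 0


def run(blackboard: Blackboard, sources: List[KnowledgeSource],
        max_cycles: int = 10) -> List[str]:
    """Scheduler loop: execute one triggered knowledge source per cycle."""
    trace = []
    for _ in range(max_cycles):
        triggered = [ks for ks in sources if ks.condition(blackboard)]
        if not triggered:
            break  # no expert has anything left to contribute
        # Assumed strategy: run the triggered source with the highest priority.
        chosen = max(triggered, key=lambda ks: ks.priority)
        chosen.action(blackboard)
        trace.append(chosen.name)
    return trace


# Two toy experts; their names and entries are hypothetical.
ir_expert = KnowledgeSource(
    name="ir",
    condition=lambda bb: "ir-spectrum" in bb.entries and "carbonyl" not in bb.entries,
    action=lambda bb: bb.entries.append("carbonyl"),
    priority=2,
)
builder = KnowledgeSource(
    name="builder",
    condition=lambda bb: "carbonyl" in bb.entries
    and "candidate-structure" not in bb.entries,
    action=lambda bb: bb.entries.append("candidate-structure"),
    priority=1,
)

board = Blackboard(entries=["ir-spectrum"])
trace = run(board, [builder, ir_expert])
print(trace)          # ['ir', 'builder']
print(board.entries)  # ['ir-spectrum', 'carbonyl', 'candidate-structure']
```

The experts never call each other: the builder only runs because the IR expert's contribution appeared on the blackboard.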

Structure Elucidation
• Modern structure elucidation is done using spectroscopy
• In absorption spectroscopy a sample of the unknown is irradiated with light of a given frequency and the absorption by the compound is measured
• The resulting data is analyzed by an expert and information about the structure of the unknown can be obtained
• The information collected from each spectrum is integrated to determine the complete structure

Spectroscopy: The Electromagnetic Spectrum

Infrared Spectroscopy
• Involves the absorption of light in the infrared region of the electromagnetic spectrum
• Used primarily to determine what functional groups are present in a molecule

Infrared Spectroscopy
• The broad peak at around 3000 cm⁻¹ indicates the presence of a hydroxyl group (OH)
• The strong, sharp peak at around 1750 cm⁻¹ indicates the presence of a carbonyl group
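The kind of peak-to-functional-group reasoning shown on this slide could be encoded as a small rule table. The sketch below is hypothetical: the wavenumber windows are rough illustrative values chosen to cover the two peaks mentioned above, not a real correlation table, and the `Peak` type and `functional_groups` function are names invented for the example.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    wavenumber: float  # position in cm^-1
    shape: str         # "broad" or "sharp"


# (group, low cm^-1, high cm^-1, expected shape).
# Illustrative windows only; real correlation tables are far more detailed.
RULES = [
    ("hydroxyl (OH)", 2500.0, 3600.0, "broad"),
    ("carbonyl (C=O)", 1650.0, 1800.0, "sharp"),
]


def functional_groups(peaks: List[Peak]) -> List[str]:
    """Return the groups whose rule matches at least one observed peak."""
    found = []
    for peak in peaks:
        for group, low, high, shape in RULES:
            matches = low <= peak.wavenumber <= high and peak.shape == shape
            if matches and group not in found:
                found.append(group)
    return found


spectrum = [Peak(3000.0, "broad"), Peak(1750.0, "sharp")]
print(functional_groups(spectrum))  # ['hydroxyl (OH)', 'carbonyl (C=O)']
```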

UV Spectroscopy
• Involves the absorption of light in the ultraviolet region of the electromagnetic spectrum
• Used to determine the level of conjugation in the unknown
  – Conjugation is alternating single and double bonds
• UV spectroscopy is not very useful in structure elucidation

Proton NMR
• Contains information about the hydrogens in the molecule
• Three key aspects:
  1) chemical shift – the “type” of hydrogen
  2) integration – ratio of different types of hydrogens
  3) splitting – nearest neighbour relationship
• Can be used to identify the presence of certain functional groups
• Used primarily to determine how the different functional groups present fit together (the connectivity)

Proton NMR
• The peak at around 10 ppm indicates the presence of an aldehyde
• The peak at 2.6 ppm is split into 4 peaks (a quartet), indicating that these hydrogens are adjacent to a carbon with 3 hydrogens
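The quartet reading above follows the first-order n + 1 rule: a signal with n equivalent neighbouring hydrogens is split into n + 1 peaks. A minimal sketch of that arithmetic follows; the function names are made up for the example, and real spectra can deviate from the simple rule.

```python
MULTIPLET_NAMES = {1: "singlet", 2: "doublet", 3: "triplet", 4: "quartet"}


def peaks_from_neighbours(n_neighbours: int) -> int:
    """n + 1 rule: n equivalent neighbouring hydrogens give n + 1 peaks."""
    if n_neighbours < 0:
        raise ValueError("number of neighbouring hydrogens cannot be negative")
    return n_neighbours + 1


def neighbours_from_peaks(n_peaks: int) -> int:
    """Invert the rule: a multiplet with k peaks implies k - 1 neighbours."""
    if n_peaks < 1:
        raise ValueError("a signal has at least one peak")
    return n_peaks - 1


# The quartet at 2.6 ppm from the slide above:
print(MULTIPLET_NAMES[4], neighbours_from_peaks(4))  # quartet 3
```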

Carbon-13 NMR
• Contains information about the carbons in the molecule
• Three key aspects:
  1) chemical shift – the “type” of carbon
  2) splitting – the number of hydrogens bonded to each carbon
  3) number of unique carbons present
• Used to determine connectivity

Carbon-13 NMR
• The peak at 190 ppm indicates the presence of a carbonyl (C=O)
• There are 7 peaks in total, indicating that there are only 7 unique carbons in the molecule

Mass Spectroscopy
• Mass spectroscopy is used to determine the molecular formula of the unknown compound
• Structural information derived from mass spectroscopy data tends to be unreliable, so it is used only to verify a possible structure or when the other spectral techniques are unsuccessful

Structure Elucidation: Applicability of a Blackboard Architecture
• Each type of spectroscopy is unique
• A human expert will often analyze a set of spectra as a whole, selectively determining which spectral information to utilize at a given time
• The blackboard architecture is ideal for this approach
• The blackboard architecture also allows for new experts to be added (new spectroscopic techniques)

The Expert System: The Blackboard
• An expert system implemented on a distributed blackboard has been developed to determine the structure of a chemical compound
• A sequential implementation of a blackboard would allow only one expert to access the blackboard at a time
• In a distributed system experts can access different sections of the blackboard at the same time

The Expert System: The Blackboard
• The hierarchy of the blackboard is based on the complexity of the structures being produced
  – Low-level, basic structures occupy a certain level of the blackboard while more complicated structures occupy a different level

The Expert System: The Experts
• There are two main types of experts:
  1) Structure generation routines
  2) Spectroscopy experts

Structure Generation Routines: Storing Structures
• Ideally every possible chemical structure could be stored but this is not feasible
  – Even a simple formula such as C23H48 has 5,731,580 structural isomers
• Instead a set of substructures (components) is stored such that any possible structure can be formed from a combination of these components
• There are 630 total components
• Components are classified as primary, secondary or tertiary components

Structure Generation Routines: Types of Components
• 1) Primary Components:
  – Primary components are the most basic components for constructing organic molecules (CH3, CH2, CH, C, CO, OH, O, NH2, NH, N, SH, S, F, Cl, Br, I)
• 2) Secondary Components:
  – Secondary components are combinations of primary components
  – There are 86 secondary components
• 3) Tertiary Components:
  – Tertiary components are secondary components with a restriction on what the component can bond to

Structure Generation Routines
• The structure generation routines produce sets of primary, secondary or tertiary components based on input data
• The sets can be further pruned using spectral information

Spectroscopy Experts
• There is an expert for each type of spectroscopy:
  1) Infrared Expert
  2) Ultraviolet Expert
  3) Proton NMR Expert
  4) Carbon-13 NMR Expert
  5) Mass Spectroscopy Expert

Spectroscopy Experts
• The data contained in a spectrum may be unreliable or ambiguous
  – e.g. in a proton NMR spectrum, if the difference in chemical shift between two hydrogens is < 1 then the splitting observed may be inaccurate
• Heuristic rules are used to handle this ambiguity
• Uncertainty factors are attached to each conclusion drawn from the spectra

Spectroscopy Experts
• Each spectral expert translates the data contained in the spectra into molecular fragments
• These fragments are placed in an “active list” which is used to direct and restrict the structure generation routines
• If fragments from different experts conflict then the fragment with the highest certainty factor is used
• The conflicting fragment is placed in an “inactive list” which is used in the event that a correct structure is not found using the active list
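The active/inactive list bookkeeping described above might look roughly like the sketch below. The slides do not say how a conflict is detected, so the `site` key (two fragments conflict when they describe the same site with different structures) is an assumption, as are the `Fragment` fields, the `resolve` function and the certainty values.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Fragment:
    expert: str       # which spectroscopy expert proposed it
    site: str         # hypothetical key: the part of the molecule described
    structure: str    # the proposed molecular fragment
    certainty: float  # certainty factor attached by the expert


def resolve(fragments: List[Fragment]) -> Tuple[List[Fragment], List[Fragment]]:
    """Split fragments into an active list and an inactive list."""
    best: Dict[str, Fragment] = {}
    inactive: List[Fragment] = []
    for frag in fragments:
        current = best.get(frag.site)
        if current is None:
            best[frag.site] = frag
        elif current.structure == frag.structure:
            # Agreement, not a conflict: keep whichever is more certain.
            if frag.certainty > current.certainty:
                best[frag.site] = frag
        elif frag.certainty > current.certainty:
            inactive.append(current)  # demote the weaker conflicting fragment
            best[frag.site] = frag
        else:
            inactive.append(frag)
    return list(best.values()), inactive


# Made-up fragments for illustration.
proposals = [
    Fragment("infrared", "oxygen", "OH", 0.9),
    Fragment("proton-nmr", "oxygen", "OCH3", 0.4),
    Fragment("carbon-13", "carbonyl", "C=O", 0.8),
]
active, inactive = resolve(proposals)
print([f.structure for f in active])    # ['OH', 'C=O']
print([f.structure for f in inactive])  # ['OCH3']
```

Keeping the losing fragment rather than discarding it is what lets the system fall back to the inactive list when the active list fails to yield a correct structure.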

Spectroscopy Experts
• The spectroscopy experts are also used to test generated structures for consistency with the spectral information
• The system is able to identify when there is not enough information to verify a possible structure

An Example…
• Formula of unknown: C7H12O4
• 93 possible sets of primary components are produced
• Using these primary sets, 497 sets of secondary components are possible
  – The number of sets of secondary components can be decreased if the primary component sets are pruned using spectral data

An Example…
• After pruning the sets of primary components only one possible set remains:
  – Set contains 2 CH3, 2 C=O, 2 OH, 1 C and 2 CH2
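As a sanity check, the surviving component set can be verified against the molecular formula C7H12O4 by simple atom counting. The sketch below is not the system's generation routine: the `COMPONENTS` table covers only the C/H/O primary components needed here, and the free-valence figures are standard valences written in as an assumption about how components bond.

```python
# Per component: (carbons, hydrogens, oxygens, free valences).
# Only the C/H/O primary components are listed; "C=O" is the slide's "CO".
COMPONENTS = {
    "CH3": (1, 3, 0, 1),
    "CH2": (1, 2, 0, 2),
    "CH": (1, 1, 0, 3),
    "C": (1, 0, 0, 4),
    "C=O": (1, 0, 1, 2),
    "OH": (0, 1, 1, 1),
    "O": (0, 0, 1, 2),
}


def formula_of(component_set: dict) -> dict:
    """Sum the atoms contributed by each component in the set."""
    totals = {"C": 0, "H": 0, "O": 0}
    for name, count in component_set.items():
        c, h, o, _ = COMPONENTS[name]
        totals["C"] += c * count
        totals["H"] += h * count
        totals["O"] += o * count
    return totals


def free_valences(component_set: dict) -> int:
    """Total number of bonds the components can still form with each other."""
    return sum(COMPONENTS[name][3] * count for name, count in component_set.items())


def degrees_of_unsaturation(carbons: int, hydrogens: int) -> int:
    """Rings plus pi bonds for a C/H/O compound: (2C + 2 - H) / 2."""
    return (2 * carbons + 2 - hydrogens) // 2


surviving_set = {"CH3": 2, "C=O": 2, "OH": 2, "C": 1, "CH2": 2}
print(formula_of(surviving_set))        # {'C': 7, 'H': 12, 'O': 4}
print(degrees_of_unsaturation(7, 12))   # 2
print(free_valences(surviving_set))     # 16
```

The two degrees of unsaturation are accounted for by the two C=O groups, and 16 free valences is exactly what 9 components need to form 8 connecting bonds in a ring-free skeleton.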

Conclusion
• Determining the chemical structure of an unknown is an important part of organic chemistry
• Expert system technology can be applied to this domain
• A blackboard architecture is especially well suited to this task

References
1) Craig, I. D., Blackboard Systems, Artificial Intelligence Review, 1988, 2, 103-118.
2) Funatsu, K., Susuta, Y., Sasaki, S., Introduction of Two-Dimensional NMR Spectral Information to an Automated Structure Elucidation System, CHEMICS. Utilization of 2D-INADEQUATE Information, J. Chem. Inf. Comput. Sci., 1989, 29, 6-11.
3) Sobczak, Ronald S., Matthews, Manton M., An Expert System for Chemical Structure Elucidation Implemented on a Blackboard, Proceedings of the 3rd International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems, 1990, 91-98.
4) Sobczak, Ronald S., Matthews, Manton M., A Massively Parallel Expert System Architecture for Chemical Structure Analysis, Distributed Memory Computing Conference, 1990, 11-17.
5) Sasaki, S., Kudo, Y., Structure Elucidation System Using Structural Information from Multisources: CHEMICS, J. Chem. Inf. Comput. Sci., 1985, 25, 252-257.