- Number of slides: 20
An agile process for the creation of conceptual models from content descriptions
Hans-Werner Sehring
Centre for Sustainable Content Logistics, TuTech Innovation GmbH / Hamburg University of Technology
Joint work with: Sebastian Boßung, Henner Carl, Joachim W. Schmidt
30 September 2007
An agile modelling process - Hans-Werner Sehring, 2007
Outline
1. Conceptual Content Management
2. Asset expressions and schemata
3. The Asset Schema Inference Process
4. Straight-forward schema inference
5. Cluster-based schema inference
6. Process evaluation
7. Summary and outlook
1. Conceptual Content Management (CCM)
– an approach to domain modelling
– inspired by epistemology: entity description by classes and instances, called Assets
– Assets are dual entity descriptions consisting of content visualising it and a conceptual model describing it
– model-based system generation
Features:
– modelling is carried out by domain experts
– domain models are open to changes
– existing work is preserved, even if changes are applied
– communication between domain experts with individual models is maintained
CCM dynamics
CCM systems (CCMSs) are dynamically generated from domain models:

    model Historiography
    from Time import Timestamp
    from Topology import Place
    class Professor {
      content image
      concept characteristic n : String
              relationship publs : Work*
    }

– immediately realizing model changes
– preserving existing Assets
– maintaining communication
Key contributions to this end:
– modelling language
– model compiler
– architecture for evolvable systems
[Figure: a model is compiled via an intermediate model (parse tree) into a CCMS; the example shows client, distribution, mediation and DB modules combined for the models Political_Iconography (PI), Regents and Artists]
Model-driven development
All SW development starts with a conceptual model
– especially model-driven development approaches call for models with a sufficient degree of formality
– CCM is similar to model-driven development in the respect that software creation is highly automated
– in CCM, software generation is even dynamic
A CCM model is required as a starting point for CCMSs
– usually, some modelling expert (analyst) is consulted
– due to the dynamics requirement, such a modelling expert cannot be employed in CCM
– domain experts are not modelling experts; they usually have problems with, e.g., sufficient formality
– but: experts can “tell their story” by providing examples
2. Asset expressions and schemata
In many domains research starts by regarding instances (samples), not concepts
[Figure: example Asset expressions. Recoverable labels: the names Ludwig Heydenreich, Georg Thilenius and Erwin Panofsky; the titles "Architecture in Italy" and "Die Sakralbau-Studien Leonardo da Vinci's"; the date 24 Feb 1934; the types Professor, FullProfessor, Book, Dissertation, City, CareerStep, Name, Work*, Teacher, Timestamp, Place and GeoPoint; the attribute names name, title, publications, concerns, reviewer, issued, issuedBy, issuedWhen and where]
Asset model from the example
Models consisting of classes
Classes with
• content handles and
• attributes (and constraints)
  • characteristics
  • relationships
Manually defined classes for the example:

    model Historiography
    from Time import Timestamp
    from Topology import Place
    class Professor {
      content image
      concept characteristic name : String
              relationship publications : Work*
    }
    class Work {
      content scan
      concept characteristic title : String
              relationship concerns : Professor*
              relationship issued : Issuing
              relationship reviewers : Professor*
    }
    class Issuing {
      concept relationship issued : Place
              relationship issuedBy : Professor
              relationship issuedWhen : Timestamp
    }
Asset model from the example (cont’d)
Example of personalisation: a domain expert introduces the distinction of documents:

    model MyHistoriography
    from Historiography import Work, Professor
    class Work {
      concept relationship reviewer unused
    }
    class Dissertation refines Work {
      concept relationship reviewer : Professor*
    }

Import and redefinition of classes for
• schema evolution (user communities)
• personalisation (single users)
• …
3. Asset Schema Inference Process (ASIP)
Bootstrapping: CCM itself requires an initial model as a starting point for the open dynamic modelling process
Required: systematic support for domain experts in finding suitable models
Start with Asset Expressions:
– content abstractions and applications: assigned names and bound values
– semantic types (concepts): no inner structure
[Figure: example Asset expression fragments "reviewer: Professor" and ": Professor"]
Concepts and classes are not distinguished in CCM models (intensional and extensional definitions)
Free-form entity descriptions are used as samples; later they become instances of classes
Agile CCMS development
Agility:
– based on the possibility to generate CCMSs dynamically
– domain experts review their models based on experiences with an operational CCMS
– if changes to the model are required, another iteration of the process is started
– entity descriptions created within the CCMS can be used as samples for the next iteration of the process
[Figure: iteration cycle: Create Asset expressions → Construct schema → Generate CCMS]
ASIP phases
The ASIP has four phases:
Phase 1: Sample acquisition
Phase 2: Schema inference
Phase 3: Feedback (questions are posed, the domain expert answers questions)
Phase 4: Prototype generation, system generation
If unhappy with the schema: modify samples (or modify schema) and iterate
Two schema inference experiments
Experiments with alternatives for phases 2 and 3:
– (traditional) schema inference plus user feedback: straight-forward approach starting from singletons
– clustering, supervised by domain experts: statistical approach, semi-supervised learning
Phase 3 (generation of questions to gather feedback) is determined by the alternative chosen
Result of phases 1-3 is a CCM model:
– prototype generation and system generation (phase 4) are carried out by the CCM model compiler
– the domain expert can modify the inferred schema (openness and dynamics)
4. Straight-forward schema inference
Schema construction by traditional schema inference
1. derive naive classes directly from the set of samples
2. apply simplifications
3. if changes were applied to the schema, repeat step 2
Step 1: for each sample create an Asset class with
– a content handle whose type is determined by the encoding format of the sample’s content
– attributes for all abstractions over the content
  • characteristics for certain known types
  • relationships for other types
  • no further constraints
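Step 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `Sample` and `AssetClass` structures, the set `KNOWN_TYPES` and the generated class names are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Assumption: the "certain known types" that yield characteristics.
KNOWN_TYPES = {"String", "Integer", "Timestamp"}

@dataclass
class Sample:
    content_format: str   # encoding format of the sample's content, e.g. "image"
    abstractions: dict    # attribute name -> type name

@dataclass
class AssetClass:
    name: str
    content: str          # content handle type
    characteristics: dict = field(default_factory=dict)
    relationships: dict = field(default_factory=dict)

def naive_classes(samples):
    """Create one Asset class per sample, without further constraints."""
    classes = []
    for i, sample in enumerate(samples):
        cls = AssetClass(name=f"Class{i}", content=sample.content_format)
        for attr, typ in sample.abstractions.items():
            # Known types become characteristics, all others relationships.
            target = cls.characteristics if typ in KNOWN_TYPES else cls.relationships
            target[attr] = typ
        classes.append(cls)
    return classes

samples = [Sample("image", {"name": "String", "publications": "Work"})]
classes = naive_classes(samples)
```

Each sample yields exactly one singleton class here; merging them is left to the simplification step.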
Schema simplification
Step 2: simplifications, repeatedly applied in the specified order
– identical class: unify classes with attributes and content handles with identical names and types
– inheritance: subtype relationship of classes whose sets of attributes are in a subset relationship
– type match: if two classes have attributes and content handles of identical types, prompt expert for unification
– inheritance orphan: ask domain expert about removal of classes with only a few instances
Note:
– often, classes are considered equal if the attributes’ types match
– here the name is considered as well, or else feedback is collected
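The two rules that need no expert feedback ("identical class" and "inheritance") might look as follows in Python. This is a hedged sketch: classes are reduced to plain dictionaries from attribute name to type name, content handles are treated like attributes, and the class with the larger attribute set is assumed to become the subtype. The "type match" and "inheritance orphan" rules are omitted because they depend on the domain expert's answers.

```python
def unify_identical(classes):
    """'identical class' rule: merge classes whose attribute names and types coincide."""
    merged = {}
    for name, attrs in classes.items():
        key = frozenset(attrs.items())
        merged.setdefault(key, name)   # keep the first class name seen
    return {name: dict(key) for key, name in merged.items()}

def infer_inheritance(classes):
    """'inheritance' rule: return (subtype, supertype) pairs for proper attribute subsets."""
    edges = set()
    for sub, sub_attrs in classes.items():
        for sup, sup_attrs in classes.items():
            if sub != sup and set(sup_attrs.items()) < set(sub_attrs.items()):
                edges.add((sub, sup))
    return edges

# Hypothetical naive classes derived from three samples.
naive = {
    "C0": {"title": "String"},
    "C1": {"title": "String"},
    "C2": {"title": "String", "reviewer": "Professor"},
}
simplified = unify_identical(naive)
hierarchy = infer_inheritance(simplified)
```

In the example, `C1` is folded into `C0`, and `C2` becomes a subtype of `C0` because its attributes extend those of `C0`.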
5. Cluster-based schema inference
Schema construction by clustering:
– cluster samples, create classes from clusters
– experiment based on k-means algorithm
Clustering steps:
– classification: assign samples to clusters based on distance measure d:
  $d(s, c) = \alpha \, d_{\mathrm{sem}}(s, c) + (1 - \alpha) \, d_{\mathrm{struct}}(s, c), \quad \alpha \in [0, 1]$
– optimisation: recompute the cluster centres
– inheritance hierarchy creation: as in the simple approach
– feedback: visualise the clusters, allow the expert to partition clusters => semi-supervised learning
Less user interaction than in the traditional approach
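The classification and optimisation steps can be sketched as below. The slides do not say how a cluster centre is recomputed for non-numeric samples, so this sketch assumes the medoid (the member with the smallest total distance to the others). The two distance functions are passed in as parameters; `toy_sem` and `toy_struct` are hypothetical stand-ins for illustration only.

```python
def combined_distance(s, c, alpha, d_sem, d_struct):
    """d(s, c) = alpha * d_sem(s, c) + (1 - alpha) * d_struct(s, c)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * d_sem(s, c) + (1.0 - alpha) * d_struct(s, c)

def assign(samples, centres, alpha, d_sem, d_struct):
    """Classification step: index of the nearest centre for each sample."""
    return [
        min(range(len(centres)),
            key=lambda k: combined_distance(s, centres[k], alpha, d_sem, d_struct))
        for s in samples
    ]

def update_centres(samples, labels, centres, alpha, d_sem, d_struct):
    """Optimisation step. Assumption: the new centre is the medoid of each cluster."""
    new_centres = []
    for k, old in enumerate(centres):
        members = [s for s, label in zip(samples, labels) if label == k]
        if not members:
            new_centres.append(old)   # keep the centre of an empty cluster
            continue
        new_centres.append(min(members, key=lambda m: sum(
            combined_distance(s, m, alpha, d_sem, d_struct) for s in members)))
    return new_centres

# Toy samples: (semantic type, set of attribute names).
samples = [
    ("Book", frozenset({"title"})),
    ("Book", frozenset({"title", "issued"})),
    ("Person", frozenset({"name"})),
]
toy_sem = lambda s, c: 0.0 if s[0] == c[0] else 1.0
toy_struct = lambda s, c: float(len(s[1] ^ c[1]))

centres = [samples[0], samples[2]]
labels = assign(samples, centres, 0.5, toy_sem, toy_struct)
centres = update_centres(samples, labels, centres, 0.5, toy_sem, toy_struct)
```

Repeating `assign` and `update_centres` until the labels stop changing gives a k-means-style loop; the feedback step (letting the expert partition clusters) is not modelled here.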
Structural distance measure
dstruct is based on the length of the shortest edit script (similar to string matching)
Costs like:
edit operation:
– add attribute
– remove attribute
– change attribute name
– broaden attribute type
– narrow attribute type
– increase cardinality of attribute value
– decrease cardinality of attribute value
cost magnitude (values as listed on the slide; the assignment to individual operations is not preserved): low, high, low, medium, very low
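A simplified edit-script cost could be computed as in the following sketch. It is not the measure used in the experiment: the numeric values in `COSTS` are hypothetical (the slide only gives qualitative magnitudes), and only add, remove and type-change operations are considered. Renames, the broaden/narrow distinction and cardinality changes are left out, so the result is not a shortest edit script in general.

```python
# Hypothetical numeric costs, chosen only for illustration.
COSTS = {"add": 1.0, "remove": 3.0, "change_type": 2.0}

def d_struct(sample_attrs, class_attrs, costs=COSTS):
    """Cost of a simple edit script turning sample_attrs into class_attrs.

    Both arguments map attribute names to type names.
    """
    total = 0.0
    for name, typ in class_attrs.items():
        if name not in sample_attrs:
            total += costs["add"]           # attribute must be added
        elif sample_attrs[name] != typ:
            total += costs["change_type"]   # same name, different type
    for name in sample_attrs:
        if name not in class_attrs:
            total += costs["remove"]        # attribute must be removed
    return total

work = {"title": "String"}
dissertation = {"title": "String", "reviewer": "Professor"}
```

With these assumed costs, `d_struct(work, dissertation)` charges one addition, while `d_struct(dissertation, work)` charges one removal, so the measure is deliberately asymmetric.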
Semantic distance measure
dsem is determined by the shortest paths in the class hierarchy

$$
d_{\mathrm{sem}}(s, c) =
\begin{cases}
\tfrac{1}{2}\, h(T_1) & \text{if } T_1 \text{ is a direct supertype of } T_C \\
d_{\mathrm{sem}}(T_1, T_m) + d_{\mathrm{sem}}(T_m, T_C) & \text{if } T_1 \text{ is a direct supertype of } T_m \text{ and } T_m \text{ is a supertype of } T_C \\
d_{\mathrm{sem}}(T_S, T_1) + d_{\mathrm{sem}}(T_S, T_C) & \text{if } T_S \text{ is the most specific common supertype of } T_1 \text{ and } T_C
\end{cases}
$$

[Figure: example class hierarchy over Any, WorkOfArt, Person, Image, Text and Book, annotated with levels 0 to 3 and h(T) values 1, 1/2 and 1/4; the distance between two classes is asked for]
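The three cases of the definition can be followed in a small Python sketch. Several points are assumptions, not taken from the slides: dsem is applied directly to type names, `h(T)` is taken to be 1 at the root and to halve with every level below it, and the `PARENT` table (in particular Book below Text) is only one plausible reading of the example hierarchy.

```python
# Assumed example hierarchy: child type -> direct supertype.
PARENT = {
    "WorkOfArt": "Any",
    "Person": "Any",
    "Image": "WorkOfArt",
    "Text": "WorkOfArt",
    "Book": "Text",
}

def ancestors(t):
    """Path from t up to the root, starting with t itself."""
    path = [t]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def h(t):
    """Assumed weight: 1 at the root, halved at every level below it."""
    return 2.0 ** -(len(ancestors(t)) - 1)

def down(sup, sub):
    """Distance from supertype sup down to subtype sub: 1/2 * h(parent) per edge."""
    path = ancestors(sub)
    idx = path.index(sup)
    return sum(0.5 * h(p) for p in path[1:idx + 1])

def d_sem(t1, tc):
    """Sum of both paths from the most specific common supertype."""
    anc1 = ancestors(t1)
    ts = next(a for a in ancestors(tc) if a in anc1)
    return down(ts, t1) + down(ts, tc)
```

When `t1` is itself a supertype of `tc`, the common supertype is `t1` and the first summand is zero, which covers the first two cases of the definition. Under these assumptions, types deep in the hierarchy are closer to each other than types near the root.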
6. Process evaluation
Schema quality:
– generally difficult to judge
– for domain modelling: not the schema that describes the samples best, but the model that best represents the application domain
Criteria [Cherfi, Akoka, Comyn-Wattiau]:
– specification:
  • graphical legibility
  • simplicity
  • expressiveness
  • syntactical correctness
  • semantic correctness
– usage: completeness, understandability
– implementation: implementability, maintainability
Process evaluation (cont’d)
Selected parameters:
– simplicity: in general depends on
  • the given sample set
  • domain expert’s answers in feedback phase
– syntactical correctness: granted by model generation
– semantic correctness: can be negatively impacted by structurally coinciding classes with different meanings
– understandability:
  • generated class names can be an obstacle
  • but: generated system lowers impact of schema
– implementability: by generation
– maintainability: through dynamics
7. Summary and outlook
Summary:
– Conceptual Content Management allows domain experts to provide and individually change domain models
– domain experts are usually not modelling experts, and they prefer to start with samples describing observations
– a process helps domain experts define initial models to start the open dynamic CCM activity
– as one novel approach, a cluster-based schema inference process has been investigated
Outlook: future work will include …
– the inclusion of the cluster-based approach into the open modelling for extensional concept definitions
– the employment of reasoning techniques (induction, abduction) to guide the schema construction process