
Statistical Testing Project
Maria Grazia Pia, INFN Genova
on behalf of the Statistical Testing Team
LCG-Application Meeting, CERN, 27 November 2002
http://www.ge.infn.it/geant4/analysis/TandA

History and background

What is it?
A project to develop a statistical analysis system, to be used in Geant4 testing
Main application areas in Geant4:
  – physics validation
  – regression testing
  – system testing
Provide tools for the statistical comparison of distributions
  – equivalent reference distributions (for instance, regression testing)
  – experimental measurements
  – data from reference sources
  – functions deriving from theoretical calculations or from fits
Interest in other areas, not only Geant4? LCG?

History
“Statistical testing” agreed in the Geant4 Collaboration as a major objective for 2002
Initial ideas presented at Geant4 TSB meeting, November 2001
Open brainstorming session at a Geant4-WG workshop, 31 May 2002
Inception phase, summer 2002
  – Informal discussions with STT, Geant4 collaborators and interested potential developers
  – Initial collection of user requirements in Geant4
  – First version of software process deliverables: Vision, URD, Risk List
Presentation at Geant4 Workshop + parallel sessions, October 2002
  – http://www.ge.infn.it/geant4/talks/G4workshop/CERN/pia/tanda-2002.ppt
Launch of the project

The team
Development team
  – Pablo Cirrone, INFN Southern National Lab
  – Stefania Donadio, Univ. and INFN Genova
  – Susanna Guatelli, CERN/IT/API Technical Student and INFN Genova
  – Alberto Lemut, Univ. and INFN Genova
  – Barbara Mascialino, Univ. and INFN Genova
  – Sandra Parlati, INFN Gran Sasso National Lab
  – Andreas Pfeiffer, CERN/IT/API
  – Maria Grazia Pia, INFN Genova
+ requirements, suggestions, β-testing by many other Geant4 collaborators
Geant4 system integration team (M. Maire, A. Ribon, L. Urban et al.)
Gabriele Cosmo, CERN/IT/API - Geant4 Release Manager
Sergei Sadilov, CERN/IT/API - Geant4 System Testing Coordinator
Statistical consultancy
  – Paolo Viarengo, Univ. Genova, Statistician
Interested collaborators are welcome!

The vision

Vision: the basics
Have a vision for the project
  – An internal tool for Geant4 physics & STT?
  – Also for Geant4 physics validation in the experiments?
  – Other parties than Geant4 interested?
Clearly define scope, objectives
Who are the stakeholders?
Who are the users?
Who are the developers?
Clearly define roles
Rigorous software process
Software quality
Build on a solid architecture
Flexible, extensible, maintainable system

Scope of the project
The project will provide tools for statistical testing of Geant4
  – physics comparisons and regression testing
  – multiple comparison algorithms
Generality (for application also in other areas) should be pursued
  – facilitated by a component-based architecture
The statistical tools should be used in Geant4 (and in other frameworks)
  – tool to be used in testing frameworks
  – not a testing framework itself
Re-use existing tools whenever possible
  – no attempt to re-invent the wheel
  – but critical, scientific evaluation of candidate tools

Architectural guidelines
The project adopts a solid architectural approach
  – to offer the functionality and the quality needed by the users
  – to be maintainable over a large time scale
  – to be extensible, to accommodate future evolutions of the requirements
Component-based approach
  – Geant4-specific components + general components
  – to facilitate re-use and integration in diverse frameworks
AIDA
  – adopt a (HEP) standard
  – no dependence on any specific analysis tool
Python
The approach adopted is compatible with the recommendations of the LCG Architecture Blueprint RTAG

The reason why we are here…
Core statistics comparison component + user layer can be generalised to a wider scope than Geant4 only
This is the reason why we present the project to LCG
  – to establish a scientific discussion on a topic of common interest
  – to see if there are any interested users
  – to see if there are any interested collaborators
We would all benefit from a collaborative approach to a common problem
  – share expertise, ideas, tools, resources…

Software process guidelines
Significant experience in the team
  – in Geant4 and in other projects
Guidance from ISO 15504
  – standard!
USDP, specifically tailored to the project
  – practical guidance and tools from the RUP
  – both rigorous and lightweight
  – mapping onto ISO 15504
Open to use tools provided by the LCG Software Process Infrastructure project

Who are the stakeholders?
Geant4 STT Coordinator
  – Description: coordinates system testing
  – Responsibilities: ensure that the system meets the needs of Geant4 System Testing
Geant4 physics coordinators
  – Description: coordinate the Geant4 std EM, lowE EM and hadronic WGs
  – Responsibilities: ensure that the system meets the needs of Geant4 Physics Testing
Geant4 TSB
  – Description: is responsible for Geant4 technical matters
  – Responsibilities: provide guidelines, monitor progress
INFN Computing Committee
  – Description: national committee to which part of the developers report; has appointed 4 referees
  – Responsibilities: recommend funding; review the project, monitor progress
Others? Who? LCG? Requirements? Expertise?

Who are the users?
Geant4 physics Working Groups
  – Responsibilities: provide and document requirements, provide feedback on prototypes, perform β-testing on preliminary releases of the product, provide use cases for acceptance testing
Geant4 STT
  – Responsibilities: provide and document requirements, perform formal acceptance testing for adoption in system testing
Other potential users
  – users of the Geant4 Toolkit, wishing to compare the results of their Toolkit applications to reference data or to their own experimental results
  – other projects with requirements for statistical comparisons of distributions (e.g. the LHC Computing Grid project)

Some use cases
Regression testing
  – Throughout the software life-cycle
Online DAQ
  – Monitoring detector behaviour w.r.t. a reference
Simulation validation
  – Comparison with experimental data
Reconstruction
  – Comparison of reconstructed vs. expected distributions
Physics analysis
  – Comparisons of experimental distributions (ATLAS vs. CMS Higgs?)
  – Comparison with theoretical distributions (data vs. Standard Model)

What do the users want?
User requirements from Geant4 (physics, system testing) elicited, analysed, specified and reviewed with the users
  – User Requirements Document: http://www.ge.infn.it/geant4/analysis/TandA/URD_TandA.html
  – Use case model in progress
Specific user requirements related to the core statistical component
  – Detail in progress (URD in preparation)
  – Input from LCG?
Requirement traceability
  – Analysis/design, implementation, test, documentation, results

Are there any constraints?
Geant4 constraint requirements
  – Based on AIDA
  – No concrete dependencies on specific AIDA implementations should appear in the code of the system tests
  – Available on Geant4 supported platforms
  – The system should not require additional licenses w.r.t. what is required for Geant4 development
Other non-functional requirements?

The core statistical component

HBOOK, PAW & Co.
HBOOK manual, 1994:
"Based on considerations such as those given above, as well as considerable computational experience, it is generally believed that tests like the Kolmogorov or Smirnov-Cramer-Von-Mises (which is similar but more complicated to calculate) are probably the most powerful for the kinds of phenomena generally of interest to high-energy physicists. […] The value of PROB returned by HDIFF is calculated such that it will be uniformly distributed between zero and one for compatible histograms, provided the data are not binned. […] The value of PROB should not be expected to have exactly the correct distribution for binned data …"
but: CDF Collaboration, Inclusive jet cross section in p pbar collisions at sqrt(s) = 1.8 TeV, Phys. Rev. Lett. 77 (1996) 438

Goodness-of-fit tests
Pearson’s χ² test
Kolmogorov-Smirnov test
Lilliefors test
Cramer-von Mises test
Anderson-Darling test
Kuiper test
…
It is a difficult domain…
  – Implementing algorithms is easy
  – But comparing real-life distributions is not easy
Incremental and iterative software process
Collaboration with statistics experts
Patience, humility, time…
System open to extension and evolution
Suggestions welcome!

Pearson’s χ²
Applies to discrete distributions
It can also be useful for continuous distributions, but the data must be grouped into classes
Cannot be applied if the theoretical (expected) frequency in any class is < 5
When that happens, one can try merging contiguous classes until the minimum theoretical frequency is reached
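The class-merging rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `merge_small_classes` and `pearson_chi2`, the left-to-right merging order and the handling of a leftover tail are choices made here, not something the slides specify.

```python
def merge_small_classes(observed, expected, min_expected=5.0):
    """Merge contiguous classes, left to right, until each merged class
    has an expected frequency of at least min_expected (assumed rule)."""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0.0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0.0, 0.0
    # A leftover tail that never reached the minimum is folded
    # into the last merged class (or kept alone if there is none).
    if acc_exp > 0.0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def pearson_chi2(observed, expected):
    """Pearson statistic: sum over classes of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Example with made-up counts: the first two and the last two classes are merged.
obs, exp = merge_small_classes([2, 3, 10, 12, 4, 1], [1.5, 3.5, 11, 10, 4, 2])
chi2 = pearson_chi2(obs, exp)
```

The statistic is then compared with a χ² distribution whose number of degrees of freedom depends on the number of merged classes, so merging changes the reference distribution as well as the value.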

Kolmogorov test
The easiest among non-parametric tests
Verifies the agreement of a sample drawn from a continuous random variable with a theoretical distribution
Based on the maximum distance between the empirical distribution function and the theoretical cumulative distribution function
Test statistic: $D = \sup_x |F_O(x) - F_T(x)|$
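As a rough illustration of the statistic D, the sketch below evaluates the distance on both sides of each jump of the empirical distribution function, which is where the supremum is reached. The names `kolmogorov_statistic` and `uniform_cdf` are invented for this example, and the uniform reference distribution is just a convenient assumption.

```python
def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1], used as F_T in the example."""
    return min(1.0, max(0.0, x))


def kolmogorov_statistic(sample, cdf):
    """D = sup |F_O(x) - F_T(x)| for a sample and a theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # F_O jumps from i/n to (i+1)/n at x: check both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


d = kolmogorov_statistic([0.1, 0.4, 0.7], uniform_cdf)
```

Checking both sides of each jump matters: looking only at the value of F_O at the sample points can underestimate D.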

Kolmogorov-Smirnov test
Problem of the two samples
  – mathematically similar to Kolmogorov’s
Instead of comparing an empirical distribution with a theoretical one, try to find the maximum difference between the distributions of the two samples $F_n$ and $G_m$:
$D_{mn} = \sup_x |F_n(x) - G_m(x)|$
Can be applied only to continuous random variables
Conover (1971) and Gibbons and Chakraborti (1992) tried to extend it to cases of discrete random variables
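A minimal two-sample sketch, assuming unbinned samples held as plain Python lists. The function names are hypothetical, and the p-value uses the standard asymptotic Kolmogorov series, which is a large-sample approximation and not necessarily what the project implements.

```python
import math


def ks_two_sample_statistic(a, b):
    """D_mn = sup |F_n(x) - G_m(x)| for two samples."""
    xs, ys = sorted(a), sorted(b)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # Step both empirical CDFs past every point equal to v (handles ties).
        while i < n and xs[i] == v:
            i += 1
        while j < m and ys[j] == v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def kolmogorov_asymptotic_pvalue(d, n, m, terms=100):
    """Asymptotic p-value 2 * sum_k (-1)^(k-1) exp(-2 k^2 lambda^2)."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here and the value is ~1
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))


d_mn = ks_two_sample_statistic([1, 2, 3, 4], [3, 4, 5, 6])
p = kolmogorov_asymptotic_pvalue(d_mn, 4, 4)
```

For samples as small as in this toy call the asymptotic p-value is only indicative; exact small-sample distributions or tables would be needed for a real decision.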

Lilliefors test
Similar to the Kolmogorov test
Based on the null hypothesis that the continuous random variable is normally distributed $N(\mu, \sigma^2)$, with $\mu$ and $\sigma^2$ unknown
Performed by comparing the empirical distribution function $F_O$ of the standardized sample $(z_1, z_2, \ldots, z_n)$ with the standard normal distribution function $\Phi(z)$:
$D^* = \sup_z |F_O(z) - \Phi(z)|$
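The statistic D* can be sketched as follows, assuming the sample is standardized with its own mean and the unbiased sample standard deviation (the slide does not say which estimator is used). The function names are illustrative only.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lilliefors_statistic(sample):
    """D* = sup |F_O(z) - Phi(z)| on the standardized sample."""
    n = len(sample)
    mean = sum(sample) / n
    # Unbiased sample variance (an assumption of this sketch).
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    std = math.sqrt(var)
    zs = sorted((x - mean) / std for x in sample)
    d = 0.0
    for i, z in enumerate(zs):
        f = normal_cdf(z)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


d_star = lilliefors_statistic([4.9, 5.1, 5.6, 6.0, 6.3, 7.2, 4.4, 5.8])
```

Because the mean and variance are estimated from the same data, D* has to be compared with Lilliefors critical values (tabulated or obtained by simulation), not with the Kolmogorov distribution.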

Cramer-von Mises test
Based on the test statistic:
$\omega^2 = \int [F_O(x) - F_T(x)]^2 \, dF_T(x)$
Can be performed both on continuous and discrete variables
Satisfactory for symmetric and right-skewed distributions
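For an unbinned sample the integral is usually evaluated through the standard computing formula $n\omega^2 = \frac{1}{12n} + \sum_{i=1}^{n} \left[ F_T(x_{(i)}) - \frac{2i-1}{2n} \right]^2$. The sketch below returns this n-scaled quantity; the function names and the uniform reference distribution are assumptions made for the example.

```python
def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1], used as F_T in the example."""
    return min(1.0, max(0.0, x))


def cramer_von_mises_statistic(sample, cdf):
    """Return n * omega^2 via 1/(12n) + sum (F_T(x_(i)) - (2i-1)/(2n))^2."""
    xs = sorted(sample)
    n = len(xs)
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(xs, start=1):
        total += (cdf(x) - (2 * i - 1) / (2.0 * n)) ** 2
    return total


w2 = cramer_von_mises_statistic([0.1, 0.4, 0.7], uniform_cdf)
```

Note the factor n: dividing the returned value by the sample size gives the integral as written on the slide.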

Anderson-Darling test
Performed on the test statistic:
$A^2 = \int \frac{[F_O(x) - F_T(x)]^2}{F_T(x)\,[1 - F_T(x)]} \, dF_T(x)$
Can be performed both on continuous and discrete variables
Seems to be suitable to any data-set (Aksenov and Savageau - 2002) with any skewness (symmetric distributions, left or right skewed)
Seems to be sensitive to fat tails of distributions
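A sketch of the usual computing form for an unbinned sample, $A^2_n = -n - \frac{1}{n} \sum_{i=1}^{n} (2i-1) \left[ \ln F_T(x_{(i)}) + \ln\left(1 - F_T(x_{(n+1-i)})\right) \right]$, which equals n times the integral on the slide. The function names and the uniform reference distribution are assumptions of this example.

```python
import math


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1], used as F_T in the example."""
    return min(1.0, max(0.0, x))


def anderson_darling_statistic(sample, cdf):
    """Return the n-scaled A^2; requires 0 < F_T(x) < 1 for every point."""
    xs = sorted(sample)
    n = len(xs)
    f = [cdf(x) for x in xs]
    total = 0.0
    for i in range(1, n + 1):
        # Pairs the i-th smallest point with the i-th largest one.
        total += (2 * i - 1) * (math.log(f[i - 1]) + math.log(1.0 - f[n - i]))
    return -n - total / n


a2 = anderson_darling_statistic([0.1, 0.4, 0.7], uniform_cdf)
```

The logarithms are what give the tails their extra weight: points where F_T is close to 0 or 1 contribute large terms, in line with the sensitivity to fat tails noted above.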

Kuiper test
Based on a quantity that remains invariant for any shift or re-parameterization
Does not work well on tails
$D^* = \max_x [F_O(x) - F_T(x)] + \max_x [F_T(x) - F_O(x)]$
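The invariance under a shift can be checked numerically with a small sketch. The function names are hypothetical, and the example assumes data on a circle of unit length compared with a uniform reference distribution, so that a shift means moving the origin modulo 1.

```python
def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1], used as F_T in the example."""
    return min(1.0, max(0.0, x))


def kuiper_statistic(sample, cdf):
    """D* = max(F_O - F_T) + max(F_T - F_O)."""
    xs = sorted(sample)
    n = len(xs)
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return d_plus + d_minus


sample = [0.1, 0.4, 0.7]
shifted = [(x + 0.5) % 1.0 for x in sample]  # move the origin of the circle
v_original = kuiper_statistic(sample, uniform_cdf)
v_shifted = kuiper_statistic(shifted, uniform_cdf)
```

In this example both calls give the same value, whereas the two one-sided distances taken individually change with the choice of origin.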

Work in progress

OOAD
Preliminary design of the statistical component in progress
  – Core statistics comparison package
  – User layer
  – Policy-based class design
http://www.ge.infn.it/geant4/rose/statistics/
Validation of the design through use cases
Some open issues identified, to be addressed in next design iteration
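One way to picture the split between a core comparison package and a user layer with interchangeable algorithms is sketched below. It is not the project's design: `Comparator`, `mean_difference` and the `threshold` parameter are hypothetical names, and the toy policy stands in for a real goodness-of-fit algorithm.

```python
from typing import Callable, Sequence, Tuple

# A "policy" here is any callable that maps two samples to a distance.
DistancePolicy = Callable[[Sequence[float], Sequence[float]], float]


def mean_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Toy policy: absolute difference of the sample means (stand-in only)."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


class Comparator:
    """Hypothetical user-layer class; the algorithm is injected as a policy."""

    def __init__(self, policy: DistancePolicy, threshold: float) -> None:
        self.policy = policy
        self.threshold = threshold

    def compare(self, a: Sequence[float], b: Sequence[float]) -> Tuple[float, bool]:
        """Return the distance and whether it is within the threshold."""
        distance = self.policy(a, b)
        return distance, distance <= self.threshold


comparator = Comparator(mean_difference, threshold=0.5)
distance, compatible = comparator.compare([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

The point of the arrangement is that a new algorithm can be added without touching the user layer, which matches the stated goal of a system open to extension.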

[Figure: work in progress]
+ more algorithms

[Figure: work in progress]

Use case: compare two continuous distributions
[Figure: work in progress]

Work in progress
Implementation and test of preliminary design
What can be re-used?
  – Algorithms in GSL, NAG libraries (to be evaluated)
Studies in progress
  – Transformation between continuous-discrete distributions
  – Strategies to use Kolmogorov-Smirnov with discrete distributions (E. Dagum + original ideas)
  – How to deal with experimental errors (not only statistical!)
  – Multi-dimensional distributions
  – Bayesian approach
In the to-do list
  – Conversion from AIDA objects to distributions
  – “Pythonisation”
Revision of the initial documents (Vision, URD, Risks)
  – Based on the recent evolutions in the project
  – Input from today’s meeting?

Work in progress: Geant4-specific
Development of general physics tests in the E.M. domain, for comparison of reference distributions
  – Compilation of existing tests
  – Evaluation, documentation of tests
  – Elicitation of requirements for tests among the Geant4 physics groups
  – Collection of reference data/distributions
Prototype for automated comparison w.r.t. reference databases
  – NIST, Sandia etc., directly downloaded from the web
  – Prototype as a risk mitigation strategy
Integration in the Geant4 system testing framework
Integration in Geant4 physics testing frameworks

Where?
Geant4-specific stuff
  – In Geant4
  – May be included in public distribution, if of interest to users
Core statistical component
  – Developed in an independent CVS repository
  – Code, documentation, software process deliverables
Web site
  – http://www.ge.infn.it/geant4/analysis/TandA/index.html
Contact persons
  – Andreas.Pfeiffer@cern.ch, Maria.Grazia.Pia@cern.ch

Time scale
Aggressive time scale driven by Geant4 needs
  – incremental and iterative software process
OOAD + implementation already started
Prototype at CHEP
Advanced functional system summer 2003
Open to the needs/suggestions of LCG
  – compatible with the available resources and Geant4 needs

Conclusions… Beginning…
Geant4 requires a statistical testing system for physics validation and regression testing
  – to provide a high quality product to its user communities
Core statistical component (of potential general interest)
Geant4-specific components
Project compatible with LCG architecture blueprint
  – component-based approach, AIDA, Python…
Rigorous software process
  – to contribute to the quality of the product
Aggressive time scale dictated by Geant4 needs
Open to scientific collaboration