b34799ac3c3a8343e816f8805a43f5dd.ppt
- Number of slides: 9
Validation related issues
Auger-Lecce, 10 November 2009
• BuildBot: introduction
• BuildBot @ pbsfarm
• Site Wide Installation
• Issues related to Install/Config/Valid
• Updates on ValidationTests
• Conclusions and Outlook
BuildBot – Introduction
BuildBot is the system used in Auger to automate the compile/test cycle that validates code changes. By automatically rebuilding and testing the tree each time something has changed, build problems are pinpointed quickly. By running the builds on a variety of platforms, developers who do not have the facilities to test their changes everywhere before check-in will at least know shortly afterwards whether they have broken the build. The overall goal is to reduce tree breakage and to provide a platform for running tests or code-quality checks. The Validation environment uses BuildBot as its "automated testing framework".
BuildBot works in a master/slave daemon scheme. The master receives change notifications from the SVN server and tells the buildslaves to check out, build and test the code. Multiple slaves can run on different platforms. The slaves report their results to the master, which posts them in a waterfall display and sends an email to the appropriate person(s) if problems are found.
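The master/slave flow described above can be pictured with a small toy model. This is only a sketch and does not use the BuildBot API: the names `BuildResult`, `on_change` and `failures_to_report` are hypothetical, and each slave is modelled as a plain function that returns True when checkout, build and test all pass.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BuildResult:
    slave: str      # name of the slave that ran the build
    revision: int   # SVN revision that triggered the build
    ok: bool        # True if checkout/build/test all passed


# Toy model of a slave: revision -> True if the whole cycle passed.
Slave = Callable[[int], bool]


def on_change(revision: int, slaves: Dict[str, Slave]) -> List[BuildResult]:
    """Master side: send one changed revision to every slave and collect results."""
    return [BuildResult(name, revision, run(revision)) for name, run in slaves.items()]


def failures_to_report(results: List[BuildResult]) -> List[str]:
    """Names of the slaves whose build failed (candidates for an email notice)."""
    return [r.slave for r in results if not r.ok]
```

For example, with two slaves named after the platforms used here, a revision that breaks only the 32 bit build would be reported for that slave alone.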
BuildBot @ pbsfarm
Setting up BuildBot slaves on our nodes allows us to automatically test the build/test process on our system platforms. A system virtual machine provides a complete system platform that supports the execution of a complete operating system (OS). On pbsfarm, 2 system virtual machines have been set up:
• auger-le64.le.infn.it   Operating System: Scientific Linux 4.7   Architecture: 64 bit (x86-64)
• auger-le32.le.infn.it   Operating System: Scientific Linux 4.7   Architecture: 32 bit (i386)
They emulate the real pbsfarm nodes used for simulation/reconstruction. The idea is to have BuildBot running on them, using a "site-wide" installation.
Site Wide Installation
Using APE. The installation was done from the virtual machines and is located under nexus06. To use it, add the following to your .bashrc.
For 64 bit architecture:
  export PATH=/nfs/argo/nexus06/gabriella/AugerOffline64Last/ape-0.98/:${PATH}
  export APERC=/nfs/argo/nexus06/gabriella/AugerOffline64Last/ape-0.98/ape.rc
For 32 bit architecture:
  export PATH=/nfs/argo/nexus06/gabriella/AugerOffline32Last/ape-0.98/:${PATH}
  export APERC=/nfs/argo/nexus06/gabriella/AugerOffline32Last/ape-0.98/ape.rc
At login, to configure the environment you need to run:
  eval `ape sh Externals`   (sets only the Externals)
  eval `ape sh Offline`     (Offline settings)
NOTE: It also works for tcsh, using
  eval `ape csh Externals`
  eval `ape csh Offline`
after setting in .tcshrc the equivalent of the exports (setenv PATH ... setenv APERC).
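Since the two installations differ only in the architecture part of the path, the choice can be automated. The following is a minimal sketch, not part of the original setup: the helper name `ape_exports` is hypothetical, the path spelling is the one reconstructed from this slide, and treating every machine type other than x86_64/amd64 as 32 bit is an assumption.

```python
import platform

# Installation roots as given on this slide (64 bit and 32 bit builds).
APE_ROOTS = {
    "64": "/nfs/argo/nexus06/gabriella/AugerOffline64Last/ape-0.98/",
    "32": "/nfs/argo/nexus06/gabriella/AugerOffline32Last/ape-0.98/",
}


def ape_exports(machine: str) -> list:
    """Return the two bash 'export' lines matching the given machine type."""
    # Assumption: anything that is not x86_64/amd64 is handled as 32 bit.
    bits = "64" if machine in ("x86_64", "amd64") else "32"
    root = APE_ROOTS[bits]
    return [
        f"export PATH={root}:${{PATH}}",   # prepend the ape directory to PATH
        f"export APERC={root}ape.rc",      # point APERC at the site-wide ape.rc
    ]


if __name__ == "__main__":
    # Print the lines for the machine this script runs on.
    print("\n".join(ape_exports(platform.machine())))
```

The printed lines can be pasted into .bashrc, or checked against what is already there.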
Issues @ Installation/Configuration
Problems during the Aires build/install (ape-0.98).
In ape-0.98/ape.rc:
  ... [package Aires] fc = g77 ...
This should set g77 as the compiler in use, but it does not work: the compilation stops because gfortran (the default compiler) is not found. I manually changed the compiler setting directly in build/Aires/2-8-4a/config (modified FortCompile="g77") and then ran the command
  build/Aires/2-8-4a/doinstall 0
Apparently things were OK, but in auger-offline-config the Aires build introduces a set of libraries from the system area that point to a system-installed Boost, which conflicts with the Boost in the externals and causes a crash at run time. Solved by manually changing auger-offline-config. TRAC (#34)
It is MANDATORY to have $APERC set.
Issues @ Validation
After a few rounds of validation on le-32 and le-64 (see the waterfall page at http://129.10.132.228:8010/waterfall):
• In some cases the StandardApplications are very slow (particularly on le-32) and the buildbot master kills the application, which would otherwise run forever. The problem seems to have become worse in the last few days, apparently with no related modifications. The StandardApplications run involves a full Sd simulation, starting from a Corsika air shower, and randomizes the core position on the array. It can sometimes happen that a core lands very close to a tank. In such a case an enormous number of particles is run through Geant (... it is not worth simulating them in such detail since those stations are in any case "saturated" ...). Is it only an unlucky sequence?! (Note that the SdSim events are never reproduced.)
• Running the example and StandardApplications shows a difference between le-32 and le-64. On le-64 there are several messages of the form:
  FDTriggerSimulatorOG: MakeMirrorEvent ... TAnalysedPixelData::Analyse() – found invalid 0x7f pattern!
Seen also by Mariangela. Present also on other 64 bit build machines (see example in waterfall). Requests sent to Tom Paul ... + Ralf Ulrich ... + Michael Unger and Steffen Mueller ... (FDEventLib ... responsibility) ... + HJ Mathes ...
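The first issue, an application being killed because it would otherwise run forever, can be illustrated with a plain time limit around a command. This is only a sketch of the idea using the Python standard library, not the mechanism BuildBot itself uses; the function name `run_with_limit` and the three status strings are hypothetical.

```python
import subprocess
import sys


def run_with_limit(cmd, limit_s):
    """Run one command under a time limit; return 'ok', 'failed' or 'killed'."""
    try:
        done = subprocess.run(cmd, timeout=limit_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising TimeoutExpired.
        return "killed"
    return "ok" if done.returncode == 0 else "failed"


if __name__ == "__main__":
    # A stand-in for a slow application: sleeps longer than the limit allows.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_limit(slow, 0.5))
```

Reporting "killed" separately from "failed" keeps slow runs, such as a shower core landing next to a tank, distinguishable from genuine build or test errors.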
ValidationTests
Modifications to the Module Sequences (using the StandardApplications data reconstruction as reference). PLEASE CHECK!
FRec:
  EventFileReaderOG
  EventCheckerOG
  FdCalibratorOG
  FdPulseFinderOG
  PixelSelectorOG
  FdSDPFinderOG
  FdAxisFinderOG
  FdApertureLightOG
  FdProfileReconstructorKG
  RecDataLister
  RecDataWriter
  EventFileExporterOG
  SValidStore
SRec:
  EventFileReaderOG
  EventCheckerOG
  SdCalibratorOG
  SdEventSelectorOG
  SdPlaneFitOG
  LDFFinderOG
  SdEventPosteriorSelectorOG
  SdRecPlotterOG
  RecDataLister
  RecDataWriter
  EventFileExporterOG
  SValidStore
With these Module Sequences the code is working. To do: update the input event before commit.
ValidationTests
IO work. Main idea: check that new releases of Offline can read files produced with older versions. How to approach this:
• Trigger the BuildBot build on an EventIO change.
• As input: a list of reference events written with different versions.
• A script running a read test.
• A script running the hybrid Simulation+Reconstruction, writing the event, and running a reco/sim test.
[Diagram: I/O files for TAG 1, TAG 2, TAG …, TAG N-1 and DEV N; code versions Code 1, Code 2, Code ..., Code N-1 (Sim ref) and Code DEV (Sim, Rec).]
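The read test over reference events from different versions could be organised roughly as follows. This is a sketch under assumptions: the function name `read_test`, the tag names and the file names are hypothetical, and the actual reading step (here the `can_read` callable) would have to be supplied by whatever Offline application performs the read.

```python
from typing import Callable, Dict, List


def read_test(reference_files: Dict[str, str],
              can_read: Callable[[str], bool]) -> List[str]:
    """Return the tags whose reference file the current build cannot read."""
    unreadable = []
    for tag, path in reference_files.items():
        try:
            ok = can_read(path)
        except Exception:
            # A reader that crashes counts as a failed read for that tag.
            ok = False
        if not ok:
            unreadable.append(tag)
    return unreadable
```

An empty result means every tagged reference file is still readable; any listed tag points at the version where backward compatibility broke.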
Conclusions and Outlook
• 2 BuildBot slaves have been set up. They allow us to automatically test the build/test process on the system platforms we use.
• Using a site-wide installation made from machines that emulate the nodes and run BuildBot maximizes our ability to pinpoint problems from our side. (The build is from the trunk with fixed externals.) An Offline reference is available.
• Possible evolution of virtualization: Worker Node on demand for GRID(?). A possible conservative approach: check feasibility, then do it at CNAF; if OK, then propose it to the collaboration. What is the status of porting OFFLINE to the GRID?
• The first issues from Installation, BuildBot setup and Validation are under study.
• For the old SREC/FREC Validation tests, the ModuleSequence has been modified in order to update them. The code is working. Feedback needed!