3585f5d58224b3dc069bff5f8795b63c.ppt
- Number of slides: 28
Coordination: Things to do
James N. Bellinger
University of Wisconsin-Madison
CMS Week June 2010
Decide on Data Location
• Computer cluster for central analysis
  – Analysis can be done elsewhere (e.g. the Barrel fit), but we need a collection point
  – We all need access
• Disk location
• Subdivided by project
  – Input for subproject
  – Output from subproject
• Communications area
• Final output area
  – Inspection, debugging, etc.
[Diagram: directory layout with boxes HOME, LINK, ENDCAP, BARREL, IN, OUT, WORK, CODE]
Callouts: Can we decide today? Who can make this available?
Understand DB Location
• CERN DB for final results
  – Barrel uses its own DB for all phases of processing
• Grouped by project
  – Input for subproject
  – Output from subproject
Callout: Spell this out today.
Define Signaling needs
1. When is data available from/agreed on by all subgroups?
2. When is the Link fit finished?
3. When is the Z-calculator finished?
4. When is the Transfer Line fit finished?
5. When are the SLM fits finished?
6. When is the Barrel fit finished?
7. Do we need to iterate with the Barrel?
8. Do we have a complete collection?
9. Is the process complete?
Callout: Is this complete?
Signaling needs: Breakdown 1
• When is data available from/agreed on by all subgroups?
  – We don’t have this process automated, nor do we have a clear naming convention
  – “Available” means the inputs to Cocoa are ready
  – When done, start the processing
• When is the Link fit finished?
  – Link is fast and first
  – Transfer Line and Z-calculator can begin immediately afterwards, using MAB info. Barrel too, though that’s a design question
• When is the Z-calculator finished?
  – This doesn’t exist yet (all by hand!)
  – When done, info goes to SLM models
Signaling needs: Breakdown 2
• When is the Transfer Line fit finished?
  – This took only about 10 minutes
  – When done, info about Transfer Plate positions has to migrate to the SLM model
  – Barrel model could be revised to use MAB DCOPS constraints
• When are the SLM fits finished?
  – Takes about an hour
  – After this, recover the fit CSC chamber positions and interpolate/fit the rest: part of our deliverables
Signaling needs: Breakdown 3
• When is the Barrel fit finished?
  – Takes over 24 hours
  – Writes to a local database: need to transfer info
• Do we need to iterate with the Barrel?
  – Design question. If the Barrel fit has large shifts or doesn’t agree with the Transfer Line constraints, we may want to iterate: redo Transfer Line et seq.
• Do we have a complete collection?
  – We could have a tentative complete collection even while iterating
• Is the process complete?
  – Write to the DB and set up testing
Testing the fits
• Compare with previous and reference
  – Need estimates of range expected
  – Count excursions, flag if above some level
  – Time plots of selected fit quantities?
• Human eyeballs needed at first
• Data monitoring is a different animal
Callout: We need to spell out what each group is doing right now
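A minimal sketch of the "count excursions, flag if above some level" check. The quantity names, tolerances, and the allowed count are hypothetical placeholders, not values from the alignment group; the real ranges still have to be estimated.

```python
def count_excursions(current, reference, tolerance):
    """Return the names of fit quantities that moved by more than their tolerance.

    current, reference: dicts mapping quantity name -> value
    tolerance: dict mapping quantity name -> allowed absolute shift
    A quantity missing from `current` is reported as an excursion.
    """
    excursions = []
    for name, ref_value in reference.items():
        if name not in current:
            excursions.append(name)
            continue
        if abs(current[name] - ref_value) > tolerance.get(name, 0.0):
            excursions.append(name)
    return excursions


def needs_human_look(excursions, max_allowed):
    """Flag the fit when the number of excursions exceeds the allowed level."""
    return len(excursions) > max_allowed


# Hypothetical quantity names and numbers, for illustration only.
reference = {"slm1/x": 0.0, "slm1/y": 1.0, "slm1/z": 2.0}
current = {"slm1/x": 0.05, "slm1/y": 1.6, "slm1/z": 2.0}
tolerance = {"slm1/x": 0.1, "slm1/y": 0.1, "slm1/z": 0.1}

moved = count_excursions(current, reference, tolerance)
flagged = needs_human_look(moved, max_allowed=0)
```

Keeping the list of names, not just the count, tells the human eyeballs where to look first.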
Working Details
• CMSSW versions are ephemeral
  – Need automated “build me a new release” script
• Work from different areas, different machines (cocoa files overwrite each other)
  – Want robust inter-machine communication and file transfer
    • If all on the same afs/nfs cluster, no problem: semaphore files
    • Don’t want to monkey with socket programming
  – Mother process watches for semaphore, starts jobs
    • Can be tricky
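A sketch of the semaphore-file idea, assuming all jobs see the same shared directory. The function names and the polling interval are illustrative; nothing here is existing group code.

```python
import os
import time


def touch_semaphore(path):
    """Create an empty file that signals 'this job is done'."""
    with open(path, "w"):
        pass


def wait_for_semaphore(path, timeout_s, poll_s=1.0):
    """Poll until `path` exists.

    Returns True if the semaphore appeared, False if `timeout_s` elapsed
    first (hung job, rebooted machine, etc.).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

The mother process would call wait_for_semaphore for each job it is watching and treat a False return as a failure to clean up after, not as "still running".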
Processing cycle: Given an event time or range
1. Collect data from subsystems for the event and massage to fit
2. Run Link job
3. Watch for done on Link
4. Rewrite HSLM models with Link fits, also Z-calculator and Transfer Line; provide info to Barrel
5. Run Z-calculator, Transfer Line, Barrel
6. Watch for done on Z-calculator and Transfer Line
7. Rewrite SLM models, write info for Barrel
8. Run SLM models
9. Watch for done on SLM
10. Fetch and interpolate chamber info from various models
11. Collect and present
12. Watch for done on Barrel
13. Fetch and interpolate chamber info
14. Collect everything and check against reference
At each step, be ready to abort if the step fails or the results are inconsistent
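The cycle above is really a dependency graph: the slow Barrel fit runs alongside the Endcap chain and only rejoins at the final check. A sketch of that ordering, using Python's standard graphlib (Python 3.9 or later). The job names are shorthand invented here, and the dependencies are my reading of the numbered steps.

```python
from graphlib import TopologicalSorter

# job -> jobs that must be done first (hypothetical names for the steps above)
DEPENDS_ON = {
    "collect_data": set(),
    "link": {"collect_data"},
    "z_calculator": {"link"},
    "transfer_line": {"link"},
    "barrel": {"link"},
    "slm": {"z_calculator", "transfer_line"},
    "endcap_chambers": {"slm"},
    "barrel_chambers": {"barrel"},
    "final_check": {"endcap_chambers", "barrel_chambers"},
}


def run_order(depends_on):
    """Return the jobs in waves; jobs within one wave do not depend on each other."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = run_order(DEPENDS_ON)
```

This only shows a valid ordering. A real mother process would mark each job done as its own semaphore appears, so the SLM fits would not wait for the 24-hour Barrel fit.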
Devil in the Details: Step 1: Collect data
• Specify event by time interval?
  – First event in the interval if more than one?
• What naming convention for events?
  – Start time? I like seconds in epoch…
  – All event files, both temporary and permanent, should have the same timing ID somewhere in the name
• Easy to get DCOPS if online, scripts exist to do all this stuff: but we probably want a program, unless “mother” is also a script
• Not sure how to get Endcap analog. Some jitter: do we want to use an average?
• Does Barrel read data directly from the DB?
• Need “don’t use this” flags for known bad readings; each system with its own flags and definitions and lookup file or DB
• Do we have good automatic sanity checks? I still eyeball plots.
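One possible shape for the "same timing ID somewhere in the name" convention, using seconds in epoch. The `<system>_<epoch>.<ext>` pattern is a suggestion made up for this sketch, not an agreed convention.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern: <system>_<epoch seconds>.<extension>
_NAME_RE = re.compile(r"^(?P<system>[A-Za-z0-9]+)_(?P<epoch>\d+)\.(?P<ext>\w+)$")


def event_filename(system, when, ext="txt"):
    """Build a file name carrying the event start time as seconds in epoch."""
    if when.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    return f"{system}_{int(when.timestamp())}.{ext}"


def parse_event_filename(name):
    """Recover (system, epoch seconds) from a name built by event_filename."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not an event file name: {name!r}")
    return match["system"], int(match["epoch"])


start = datetime.fromtimestamp(1276014332, tz=timezone.utc)
name = event_filename("DCOPS", start)
```

Every temporary and permanent file for one event can then be found by matching on the epoch field.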
Devil in the Details: Step 2: Run Link
• Spawn a process? Needs to be in the right directory, with the right arguments. Can do it, but spawning needs careful monitoring. Using a script as the mother might be better.
• Endcap and Transfer Line want
  – MAB fit X/Y/Z and rx/ry/rz; estimated errors would be nice
  – LD fit positions
  – What format is good? Simple text files are easy to read
• Barrel wants
  – MAB fit X/Y/Z and rx/ry/rz
  – ?
  – What format is good for Barrel? DB?
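A sketch of "spawn in the right directory, with the right arguments, and monitor it" using Python's standard subprocess module. The command line passed in is whatever the job needs; no Cocoa command-line options are assumed here.

```python
import subprocess


def run_job(argv, workdir, timeout_s):
    """Run one job in `workdir` and report (ok, detail).

    argv: command and arguments as a list
    Returns (True, captured stdout) on success, otherwise (False, reason).
    """
    try:
        done = subprocess.run(
            argv,
            cwd=workdir,           # start the job in the right directory
            timeout=timeout_s,     # the child is killed if it hangs
            capture_output=True,
            text=True,
        )
    except subprocess.TimeoutExpired:
        return False, "timeout"
    except OSError as exc:         # missing executable, bad directory, ...
        return False, f"could not start: {exc}"
    if done.returncode != 0:
        return False, f"exit code {done.returncode}"
    return True, done.stdout
```

A typical call would be run_job(["some_fit_command", "model.sdf"], "/path/to/workarea", 600), where both the command and the path are placeholders.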
Devil in the Details: Step 4: Rewrite models
• DCOPS uses text SDF files. We can create include files like Minus.LD_1276014332.include and use a soft link from the current one to Minus.LD.include. Thanks to the internal structure of the Transfer Line SDF there would be a lot of these for the MABs.
• Since the Z-calculator doesn’t exist yet we can define any input we like. Simpler is better.
• If I understood correctly, the Barrel looks to a database for all of its input, so the rewrite needs to write to a database as well. Scripts can do this too.
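A sketch of the soft-link step on a POSIX file system: the timestamped include file is written first, then the fixed name is repointed at it. The helper name is invented here; the file names follow the example on the slide.

```python
import os


def point_current_at(timestamped_name, current_name, directory):
    """Make `current_name` a symlink to `timestamped_name` inside `directory`.

    The link is created under a temporary name and then renamed over the old
    one, so a reader never sees a missing include file (POSIX rename is atomic).
    """
    link_path = os.path.join(directory, current_name)
    tmp_path = link_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)                   # leftover from an earlier failure
    os.symlink(timestamped_name, tmp_path)    # relative target, same directory
    os.replace(tmp_path, link_path)
```

For example, point_current_at("Minus.LD_1276014332.include", "Minus.LD.include", workdir) leaves the SDF untouched while changing what its include resolves to.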
Devil in the Details: Step 10: Interpolate data
• Sometimes a simple fit matches the fit chamber positions (or angles) obviously well, and sometimes it doesn’t and I use an interpolation. Or, as when something doesn’t fit at all, I use the disk position and orientation and apply that to the photogrammetry.
  – How does our automatic procedure know which to use?
Devil in the Details: Architecture
• Need to check for failures at each step
• Need to check for timeout failures (hangs, reboots, etc.)
• Need to have appropriate cleanup procedures at each step
• If writing to DB, may need to roll back changes on failure? Or at any rate flag the entries as BAD
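A sketch of the two options in the last bullet, roll back on failure or flag as BAD. It uses Python's built-in sqlite3 purely as a stand-in; the real results DB, its schema, and its client library are not assumed here, and the table below is hypothetical.

```python
import sqlite3


def make_db():
    """In-memory stand-in for the results DB (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fit_results ("
        " event_id INTEGER NOT NULL,"
        " name TEXT NOT NULL,"
        " value REAL NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'OK')"
    )
    return conn


def store_fit(conn, event_id, rows):
    """Insert all (name, value) rows for one event, or none of them."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany(
                "INSERT INTO fit_results (event_id, name, value) VALUES (?, ?, ?)",
                [(event_id, name, value) for name, value in rows],
            )
    except sqlite3.Error:
        return False
    return True


def flag_bad(conn, event_id):
    """Alternative to rollback: keep the rows but mark them as BAD."""
    with conn:
        conn.execute(
            "UPDATE fit_results SET status = 'BAD' WHERE event_id = ?",
            (event_id,),
        )
```

Rollback suits a failure during the write itself; the BAD flag suits a fit that was written cleanly and later failed a consistency check.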
Documentation
1. DAQ
2. DCS general controls
3. Expert-only controls
4. Data handling
5. Fitting procedures
6. What to do with changes
Callout: All I have is #2 and #3 for DCOPS
Additional Material
• Sample scripts
• DAQ Still TODO
Sample scripts 1
• getframe.ALL.awk
  – http://www.hep.wisc.edu/~jnb/cms/tools/getframe.ALL.awk
  – If report.out was generated using the correct flags and Samir’s code modification, this retrieves the positions and angles for each component in the coordinate system of each of its mother volumes in the hierarchy.
  – CMS/yep1/slm_p12 x dx y dy z dz rx drx ry dry rz drz
  – Errors are only valid when using the immediate mother volume
  – In my jargon the output is a “frame” file
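A sketch of reading one line of a "frame" file. It assumes the layout shown on the slide: a volume path followed by twelve whitespace-separated numbers, with the last pair taken to be rz and drz. The real output of the awk script may differ, and the numbers in the example are invented.

```python
from dataclasses import dataclass

# Assumed column order after the volume path.
FIELDS = ("x", "dx", "y", "dy", "z", "dz", "rx", "drx", "ry", "dry", "rz", "drz")


@dataclass
class FrameEntry:
    path: str      # component path in the mother-volume hierarchy
    values: dict   # field name -> float, keys as in FIELDS


def parse_frame_line(line):
    """Parse '<path> x dx y dy z dz rx drx ry dry rz drz' into a FrameEntry."""
    parts = line.split()
    if len(parts) != 1 + len(FIELDS):
        raise ValueError(f"expected {1 + len(FIELDS)} columns, got {len(parts)}")
    numbers = [float(text) for text in parts[1:]]
    return FrameEntry(path=parts[0], values=dict(zip(FIELDS, numbers)))


# Invented numbers, for illustration only.
entry = parse_frame_line("CMS/yep1/slm_p12 1.0 0.1 2.0 0.2 3.0 0.3 0.01 0.001 0.02 0.002 0.03 0.003")
```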
Sample scripts 2
• make.New.TPAwk.File.com
  – http://www.hep.wisc.edu/~jnb/cms/tools/make.New.TPAwk.File.com
  – If the report.out file was generated by a Transfer Line fit and you created a “frame” file using getframe.ALL.awk, then this generates a new awk script whose name incorporates the frame file name
  – You can then use the new awk script to process an ideal SLM’s SDF and create one with Transfer Plate positions as found by the Transfer Line fit.
    • I call this re-writing, but that’s misleading: you make a new file with certain parts changed
Sample scripts 3
• unpdbloose.awk
  – http://www.hep.wisc.edu/~jnb/cms/tools/unpdbloose.awk
  – This takes a text file containing the row data from an event in the DCOPS database and creates a Cocoa text input data file from it
  – Refitting using the root histograms gives quality info that is unfortunately lost in the summary stored in the database, but this works
  – Since not all insanities are flagged, I edit the file to increase the errors on profiles I know to be bad but which pass the simple quality cuts. This script needs to be replaced by a program which reads in a “known-bad” list.
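A sketch of what the replacement program's "known-bad" handling might look like: read a list of profile names and inflate their errors instead of editing the file by hand. The list format, the profile names, and the inflation factor are all hypothetical, and the Cocoa input format itself is not modeled.

```python
def load_known_bad(lines):
    """Parse a known-bad list: one profile name per line, '#' starts a comment."""
    bad = set()
    for line in lines:
        name = line.split("#", 1)[0].strip()
        if name:
            bad.add(name)
    return bad


def inflate_bad_errors(measurements, known_bad, factor=100.0):
    """Return (name, value, error) tuples with errors scaled up for bad profiles.

    Good profiles pass through unchanged; bad ones stay in the fit but with
    so little weight that they cannot pull it.
    """
    return [
        (name, value, error * factor if name in known_bad else error)
        for name, value, error in measurements
    ]


# Invented profile names and numbers, for illustration only.
bad = load_known_bad(["# known bad profiles", "sensorA/up", "", "sensorB/left  # noisy"])
fixed = inflate_bad_errors(
    [("sensorA/up", 12.0, 0.5), ("sensorA/down", 11.5, 0.5)], bad
)
```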
DCOPS DAQ TODO
DCOPS DAQ
• Phoenix DAQ
• DCS
• Data Quality Monitoring
• Data transfer to offline
• Transforming selected event (not really DAQ)
Phoenix DAQ
• Write out every 60th event as root?
  – 1/day, 5 MB/event
  – Gives full plots, more details from fit if required
  – Need to move root files offline to permanent area
• Read Oracle password from protected file
• Automatic start at boot time
  – Can hack this with a cron job and avoid being tied to a single machine
• Tools to remotely kill/restart?
  – Not sure if ssh permissions allow this
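A sketch of "read the password from a protected file" on a POSIX file system: refuse to use the file if group or others have any access to it. The function name is invented, and this checks ordinary Unix mode bits only.

```python
import os
import stat


def read_protected_secret(path):
    """Return the stripped contents of `path`, but only if it is owner-only."""
    mode = os.stat(path).st_mode
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(
            f"{path} is accessible to group/others (mode {stat.S_IMODE(mode):o})"
        )
    with open(path) as handle:
        return handle.read().strip()
```

Failing loudly here is deliberate: a DAQ that silently starts with a world-readable password file hides the problem until someone else finds it.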
DCOPS DCS
• Fix fake error bug
• Clean up user interface
Data Quality Monitoring
• Not sure how to integrate into overall DQM
• Simple job (cron?) can collect raw data for day/week/month and flag excursions in a temperature plot
• First question: is the DAQ still running?
• Need database table (file at first) with the known bad readings flagged: 504 × 4 possible
• Need tool for experts to manipulate aforementioned table
• Need tool to make diagnostics available
• Not keen on reinventing the wheel
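A sketch of the "is the DAQ still running?" check that a cron job could make from event timestamps alone. The allowed gap is a hypothetical parameter; it would have to be set from the actual event rate.

```python
def daq_looks_alive(last_event_epoch, now_epoch, max_gap_s):
    """True if the most recent event is no older than `max_gap_s` seconds."""
    if now_epoch < last_event_epoch:
        raise ValueError("last event is in the future; check the clocks")
    return now_epoch - last_event_epoch <= max_gap_s


def find_gaps(event_epochs, max_gap_s):
    """Return (start, end) pairs where consecutive events are too far apart."""
    ordered = sorted(event_epochs)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > max_gap_s
    ]


# Invented timestamps, for illustration only.
gaps = find_gaps([0, 100, 200, 1000, 1100], max_gap_s=150)
```

The same gap list, collected over a day or week, also shows past outages that nobody noticed at the time.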
DCOPS Data to Offline
• Data is put in the Online DB; never finished the job of moving it offline
• Move root histogram files (if we want them)
DCOPS Event selection
• Easy to create a database query and rewrite the results into a Cocoa-text file
  – Pieces exist, combine
  – Need a “bad profile” reference file
  – Add communication details and locations and naming conventions
  – Partition into different input files
    • HSLM files are special, using analog and Link data also
• This has to be coordinated with the rest of the group
Link DAQ TODO
Barrel DAQ TODO
I haven’t a clue. You tell us.