- Number of slides: 12
CAIGEE: Clarens Architecture
June 23, 2003, Caltech Analysis Workshop, Pasadena
Eric Aslakson, Julian Bunn, Iosif Legrand, Harvey Newman, Conrad Steenberg, Michael Thomas, Frank van Lingen

View from below
Try to make sense of the alphabet soup with a service/functionality-oriented view:
● Providers
● Clients
● Both
● Middleware/information providers
[Diagram labels: CMD line; Prog. Lang X; ROOT, JAS, Lizard, PROOF; Web browser; E-mail; PDA; JAS; MonALISA; Iguana; Middleware/Information providers; CMD line; Prog. Lang X; ROOT, PROOF; Dist. JAS, DIAL; PBS, Condor, LSF; "This space intentionally left blank"; ORCA; POOL; SEAL; RDBMS; Filesystem]

View from below II
● Client/server view based on resource abundance
● Middleware/information providers help organize resources in a resource-scarce environment
  - G****, OGS*, Tomcat, MonALISA, web server
  - Metadata catalogs (some missing)
● Needed: a (more) consistent and unified environment that does not resort to X scripting languages as glue
● Interact with network-enabled components from the most sensible environment for the task
● Not only client/server, but between all components

Enable higher level services
● Reduce the development impedance for higher-level services to function properly
  - E.g. MonALISA uses modules to monitor via 'ping', SNMP, Ganglia, etc., but provides aggregated information through a single remote API (SOAP)
● Reduce manual interaction
● Counter-example: VO management
  - Obtain certificates from X CAs via LDAP
  - Store in VO LDAP server, create VO
  - Extract the structure with a different tool, using a config file to create a new config file (gridmap) used by middleware (Globus gatekeeper)
  - Site admins must maintain separate copies of gridmap files for different clusters/servers

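The VO-management counter-example comes down to deriving every gridmap file from one membership list instead of maintaining copies by hand. A minimal Python sketch of that idea follows. It assumes the usual grid-mapfile layout of a quoted certificate DN followed by a local account name; the member records, DNs, and account names are hypothetical and do not come from the slides.

```python
from typing import Dict, Iterable, List


def render_gridmap(members: Iterable[Dict[str, str]]) -> str:
    """Render grid-mapfile text: one quoted certificate DN and a local account per line."""
    lines: List[str] = []
    seen = set()
    for member in members:
        dn = member["dn"].strip()
        if dn in seen:  # skip duplicate DNs so each DN appears once
            continue
        seen.add(dn)
        lines.append('"%s" %s' % (dn, member["account"]))
    # Sort so that regenerating the file gives a stable, diff-friendly result.
    return "\n".join(sorted(lines)) + "\n"


# Hypothetical VO membership, as it might be read from a VO server.
VO_MEMBERS = [
    {"dn": "/O=Example/OU=People/CN=Alice Analyst", "account": "cms001"},
    {"dn": "/O=Example/OU=People/CN=Bob Builder", "account": "cms002"},
    {"dn": "/O=Example/OU=People/CN=Alice Analyst", "account": "cms001"},
]

if __name__ == "__main__":
    print(render_gridmap(VO_MEMBERS), end="")
```

Generating the text for every cluster from the same list is one way to avoid the separate hand-maintained copies described above.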
CAIGEE architecture
[Diagram labels: Internet; Clarens; Catalog; Planner 1; GDMP; Planner 2; Virtual Data Catalog; Materialized Data Catalog; Execution Priority manager; Grid-wide Execution Service; Grid Processes; Monitoring]

CAIGEE architecture II
GE Analysis examples
● SOCATS: use RDBMS for analysis
  - Submit query to database
  - SOCATS logs progress
  - Sends rows, creates output file (ROOT) upon request
  - May receive commands to terminate/restart with new query
● Clarens/ORCA remote analysis
  - Submit job; ORCA starts up on farm node using farm scheduler
  - Receive shared library/scripts in controlling thread
  - Output files available similar to the above case
  - Head node uses farm scheduler to kill idle jobs

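The SOCATS flow listed above (submit a query, log progress, send rows) can be illustrated with a small Python sketch. It is not SOCATS code: sqlite3 stands in for the RDBMS, and the function name, event names, and table are assumptions made for the example. ROOT output is left out.

```python
import sqlite3
from typing import Iterator, List, Tuple


def run_query(conn: sqlite3.Connection, sql: str,
              batch_size: int = 2) -> Iterator[Tuple[str, object]]:
    """Yield ("rows", batch) and ("progress", rows_so_far) events while a query runs."""
    cursor = conn.execute(sql)
    sent = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        sent += len(batch)
        yield ("rows", batch)
        yield ("progress", sent)
    yield ("done", sent)


def demo() -> List[Tuple[str, object]]:
    # Hypothetical event table; a real deployment would query an existing database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER, energy REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?)",
                     [(i, 10.0 * i) for i in range(5)])
    events = list(run_query(
        conn, "SELECT id, energy FROM events WHERE energy >= 20 ORDER BY id"))
    conn.close()
    return events


if __name__ == "__main__":
    for kind, payload in demo():
        print(kind, payload)
```

Because the rows arrive through a generator, a caller can stop consuming it to terminate the query and call `run_query` again with a new statement, which loosely mirrors the terminate/restart commands in the list.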
Interactive Analysis
● The grid/distributed environment has so far considered two extremes of processing/data access:
  - A. Fetch remote data, process locally (0% “granularity”)
  - B. Submit batch job to remote cluster, get processed data back (100% “granularity”)
● What if another granularity is required?
  - Submit job
  - Get feedback
  - Request more data from same job
● Data staging/job is very time-consuming
[Diagram: granularity scale from 0% to 100%, with “??” in between]

Interactive Analysis II
● Use Clarens as RPC layer
● Python already used as scripting language
● Multithreaded analysis job listens to RPCs
● Use Condor/PBS/LSF as scheduler to start and kill jobs
● Fork a shell as a system user to control Condor
[Diagram labels: Condor; Clarens; Pclarens; Client; Clarens; Analysis WD]

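The idea of a multithreaded analysis job that listens to RPCs can be sketched in Python as below. The sketch uses the standard-library XML-RPC server as a stand-in for the Clarens RPC layer. The class and its methods (`AnalysisJob`, `status`, `more_data`) are hypothetical names chosen for illustration and are not the Clarens API.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


class AnalysisJob:
    """Toy analysis job: holds data and answers requests for more of it."""

    def __init__(self, values):
        self._values = list(values)
        self._lock = threading.Lock()
        self._cursor = 0

    def status(self):
        # Feedback call: how much data has been handed out so far.
        with self._lock:
            return {"served": self._cursor, "total": len(self._values)}

    def more_data(self, count):
        # Hand out the next chunk without restarting the job.
        with self._lock:
            chunk = self._values[self._cursor:self._cursor + count]
            self._cursor += len(chunk)
            return chunk


def start_rpc_listener(job):
    # Port 0 lets the OS pick a free port; logRequests=False keeps stderr quiet.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_instance(job)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, thread


def demo():
    server, thread = start_rpc_listener(AnalysisJob(range(10)))
    host, port = server.server_address
    client = xmlrpc.client.ServerProxy("http://%s:%d" % (host, port))
    first = client.more_data(3)
    second = client.more_data(3)
    status = client.status()
    server.shutdown()
    server.server_close()
    thread.join()
    return first, second, status


if __name__ == "__main__":
    print(demo())
```

The listener runs in its own thread, so the job stays alive between calls and a client can ask for feedback or more data from the same job, which is the intermediate granularity the slides ask for.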
Interactive Analysis III
● The analysis process effectively becomes an application server
● This process gets started with a one-time username/password, so it doesn't have to do certificate-based authentication
● On farm nodes the analysis process will be contacted via the Clarens server on the head node, or an HTTP proxy could be set up. The default will be the former, because sysadmins may not have experience with the latter
● Preliminary code is in COBRA 7_3_0_pre1, shown at CMS week in June using a Qt interface
● Last week: Qt code removed

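A one-time username/password, as described in the second bullet, could work roughly as in the Python sketch below. This illustrates the general technique and is not the COBRA or Clarens implementation; the class name, method names, and job identifier are assumptions.

```python
import hmac
import secrets


class OneTimeCredentials:
    """Issue a username/password pair that is accepted at most once."""

    def __init__(self):
        self._pending = {}

    def issue(self, username):
        # Random password handed to the analysis process when it is started.
        password = secrets.token_urlsafe(16)
        self._pending[username] = password
        return password

    def redeem(self, username, password):
        # The entry is removed on the first attempt, whether or not it succeeds.
        expected = self._pending.pop(username, None)
        if expected is None:
            return False
        # Constant-time comparison on bytes avoids timing leaks.
        return hmac.compare_digest(expected.encode(), password.encode())


if __name__ == "__main__":
    creds = OneTimeCredentials()
    pw = creds.issue("job42")          # hypothetical job identifier
    print(creds.redeem("job42", pw))   # first use
    print(creds.redeem("job42", pw))   # second use of the same pair
```

Discarding the pair on the first attempt means a failed login also invalidates it. That is a design choice made for this sketch, not something the slides specify.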
Current work
● WSDL interface descriptions (Frank v. L., ~done)
● P2P resource/data discovery (Asif Jan M. Phil. - started)
● Expand web management interface (me)
● Integration with CMS analysis tools (me, Vincenzo I.)
● Iguana client for DC04 (Shahzad M. at CERN)
● Ongoing VO integration (me, Yujun Wu, Dantong Yu)
● Packaging/distribution (me, Michael T.)
● Interface to SOCATS (Eric A.)
FUTURE
● Monitoring integration via MonALISA
● Persistent connection implementation (ftp?)
http://clarens.sf.net

Summary
● As many interaction modes as possible, using standard interfaces/protocols, increases the value of any individual component
● Feedback/logging is crucial to make higher-level services work
[Diagram labels: PDA; JAS; CMD line; Prog. Lang X; ROOT, PROOF, JAS, Lizard...; MonALISA; Web browser; E-mail; Iguana; Middleware/Information providers; CMD line; Prog. Lang X; ROOT, PROOF; Dist. JAS, DIAL; PBS, Condor, LSF; MonALISA; ORCA; POOL; SEAL; RDBMS; Filesystem]