
- Number of slides: 21
GRAM: Software Provider Forum
Stuart Martin
Computation Institute, University of Chicago & Argonne National Lab
TeraGrid 2007, Madison, WI
GRAM - Basic Job Submission and Control Service
- A uniform service interface for remote job submission and control
  – Includes file staging and I/O management
  – Includes reliability features
  – Supports basic Grid security mechanisms
  – Asynchronous monitoring
  – Interfaces with local resource managers, simplifies the job of metaschedulers/brokers
- GRAM is not a scheduler.
  – No scheduling
  – No metascheduling/brokering
Performance Comparisons
Concurrent Jobs (as in paper)
Average seconds per 1000 jobs, Condor-G to GRAM to Condor LRM
Stage In/Out | File Clean Up, Unique Job Dir | GRAM2 | GRAM4
None         | No, No                        | 2552  | 2100
1 x 10 KB    | No, No                        | 2608  | 3779
1 x 10 KB    | Yes                           | 2698  | 5695
Concurrent Jobs (as will be in GT 4.0.5)
Average seconds per 1000 jobs, Condor-G to GRAM to Condor LRM
Stage In/Out | File Clean Up, Unique Job Dir | GRAM2 | GRAM4
None         | No, No                        | 2552  | 2176
1 x 10 KB    | No, No                        | 2608  | 2147
1 x 10 KB    | Yes                           | 2698  | 2254
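
The change in GRAM4 throughput between the two concurrent-job tables can be worked out directly from the quoted figures. The sketch below does only that arithmetic; the dictionary keys are shorthand labels for the three table rows, introduced here for illustration.

```python
# GRAM4 average seconds per 1000 concurrent jobs, copied from the two tables.
# Keys are shorthand for the three rows (staging value, clean-up value).
gram4_paper = {"none": 2100, "1x10KB, No": 3779, "1x10KB, Yes": 5695}
gram4_405 = {"none": 2176, "1x10KB, No": 2147, "1x10KB, Yes": 2254}


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent (negative means faster)."""
    return (after - before) / before * 100.0


changes = {
    row: round(percent_change(gram4_paper[row], gram4_405[row]), 1)
    for row in gram4_paper
}

for row, pct in changes.items():
    print(f"{row}: {pct:+.1f}%")
```

On these figures the non-staging case is roughly unchanged, while both staging cases take substantially less time.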
Improving performance for staging jobs
- Adding local method call mechanism for general use in Java WS Core (4.0.5)
  – GRAM is doing this with RFT
  – Any service which calls another in-process service could make similar modifications for local calls and likely benefit from improved performance
- Adding caching of the GridFTP server connections in RFT (4.0.6)
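
To illustrate the general idea behind connection caching, here is a minimal sketch that keeps one open connection per (host, port) and reuses it across transfers. It is not RFT's implementation (RFT is a Java service); `ConnectionCache`, `fake_connect`, and the host name are hypothetical and exist only for this example.

```python
from typing import Callable, Dict, Generic, Tuple, TypeVar

C = TypeVar("C")


class ConnectionCache(Generic[C]):
    """Reuse one open connection per (host, port) instead of reconnecting each time."""

    def __init__(self, connect: Callable[[str, int], C]) -> None:
        self._connect = connect  # hypothetical factory that opens a connection
        self._open: Dict[Tuple[str, int], C] = {}
        self.misses = 0  # how many times a new connection had to be opened

    def get(self, host: str, port: int) -> C:
        key = (host, port)
        if key not in self._open:
            self.misses += 1
            self._open[key] = self._connect(host, port)
        return self._open[key]


def fake_connect(host: str, port: int) -> str:
    # Stand-in for opening a real GridFTP control connection.
    return f"conn:{host}:{port}"


cache = ConnectionCache(fake_connect)
for _ in range(1000):
    cache.get("gridftp.example.org", 2811)  # 2811 is the default GridFTP port

print(cache.misses)
```

A production cache would also need to handle idle timeouts, broken connections, and per-credential separation, none of which is shown here.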
Sequential Jobs
Average seconds per job (Fork)
Delegation | Stage In  | Stage Out | GRAM2 | GRAM4
None       |           |           | N/A   | 1.70
Per Job    | None      |           | 1.07  | 3.53
Per Job    | 1 x 10 KB | None      | 1.78  | 5.57
Shared     | 1 x 10 KB | None      | N/A   | 5.41
Per Job    | 1 x 10 KB |           | 2.44  | 9.08
Sequential Jobs
Average seconds per job (Fork)
Delegation | Stage In  | Stage Out | GRAM2 | GRAM4
None       |           |           | N/A   | 1.46
Per Job    | None      |           | 1.07  | 3.42
Per Job    | 1 x 10 KB | None      | 1.78  | 3.46
Shared     | 1 x 10 KB | None      | N/A   | 3.51
Per Job    | 1 x 10 KB |           | 2.44  | 5.25
GRAM Auditing
TG Gateways
- Lower the barrier for scientists and their applications to use TeraGrid resources
- Provide an application or domain-specific interface that a scientist can easily understand
- Each gateway may have 100s or 1000s of users accessing TG resources
- Must be efficient and scale
Use Cases
- Group Access
  – For efficiency, a "community" credential is used to multiplex many users over a single ID
- Query Job Accounting
  – Gateways need a remote interface to obtain the TG units charged for their users' jobs
- Auditing
  – Grid services provide access to resources
  – TG Resource Providers need a record of actions performed by services
Requirements From Use Cases
- Grid Job Identifier
- Remote client interface to auditing and accounting information
- Creation of service audit and accounting information
- Access to remote LRM accounting information from the audit service
- Scalability in storing information/records
- Secure access (authentication and authorization) to audit and accounting information
Grid Job Identifier
- Uniquely identifies a job
- Shared between the client (Gateway) and service (TG RP)
- Obtained in the normal service interaction/protocol
- In GRAM4 it is the converted EPR
- In GRAM2 it is the job contact (as is)
- GRAM4 Example >>>
GRAM4 EPR:
Remote Client Interface
- Flexible query interface to retrieve audit and accounting records
- Define an operation "getChargeForJob" to return the units consumed by a Grid Job ID
- Keep audit service interface separate from GRAM service to allow flexible deployment scenarios
  – Allow a single audit service for multiple GRAM services
  – Same client interface could be used for other services, for example, charging for data storage or transfers
- OGSA-DAI satisfies these requirements
Creation of Service Auditing Information
- Added GRAM audit record creation upon job termination
  – Record fields: job_grid_id, local_job_id, submission_job_id, subject_name, username, creation_time, queued_time, stage_in_gid, stage_out_gid, clean_up_gid, gt_version, rm_type, job_description, success_flag
  – Gerson Galang (APAC) contribution for GRAM4: audit record creation at beginning of job, update after LRM submission, and final update upon termination
  – Records are needed soon after job termination
- Accounting information is created by the local resource managers
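
The create, update, and final-update lifecycle described above can be sketched with an in-memory SQLite table. This is an illustration only, not the GRAM schema: the table name `gram_audit`, the helper functions, the choice of `job_grid_id` as primary key, and the sample values are all assumptions, and column types are omitted because the slide lists field names only.

```python
import sqlite3

# Field names as listed on the slide; types are deliberately left unspecified.
AUDIT_FIELDS = [
    "job_grid_id", "local_job_id", "submission_job_id", "subject_name",
    "username", "creation_time", "queued_time", "stage_in_gid",
    "stage_out_gid", "clean_up_gid", "gt_version", "rm_type",
    "job_description", "success_flag",
]


def create_audit_table(db: sqlite3.Connection) -> None:
    columns = ", ".join(AUDIT_FIELDS)
    # Assumption: the Grid Job ID uniquely keys the record.
    db.execute(f"CREATE TABLE gram_audit ({columns}, PRIMARY KEY (job_grid_id))")


def record_job_start(db, job_grid_id, subject_name, username, creation_time):
    """Create the audit record at the beginning of the job."""
    db.execute(
        "INSERT INTO gram_audit (job_grid_id, subject_name, username, creation_time) "
        "VALUES (?, ?, ?, ?)",
        (job_grid_id, subject_name, username, creation_time),
    )


def record_lrm_submission(db, job_grid_id, local_job_id, queued_time):
    """Update the record once the local resource manager has accepted the job."""
    db.execute(
        "UPDATE gram_audit SET local_job_id = ?, queued_time = ? WHERE job_grid_id = ?",
        (local_job_id, queued_time, job_grid_id),
    )


def record_termination(db, job_grid_id, success):
    """Final update when the job terminates."""
    db.execute(
        "UPDATE gram_audit SET success_flag = ? WHERE job_grid_id = ?",
        (1 if success else 0, job_grid_id),
    )


db = sqlite3.connect(":memory:")
create_audit_table(db)
# All values below are made-up placeholders.
record_job_start(db, "gjid-example-1", "/O=Example/CN=Jane Doe", "jdoe", "2007-06-04T10:00:00Z")
record_lrm_submission(db, "gjid-example-1", "12345", "2007-06-04T10:00:02Z")
record_termination(db, "gjid-example-1", success=True)

row = db.execute(
    "SELECT local_job_id, success_flag FROM gram_audit WHERE job_grid_id = ?",
    ("gjid-example-1",),
).fetchone()
print(row)
```

Writing the record at job start and updating it later means a partial record survives even if the service stops before the job terminates.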
Access to LRM Accounting Information
- TeraGrid uploads all LRM accounting information from each TG site to a central DB (TGCDB)
- The OGSA-DAI service can be configured to access the remote TGCDB
Scalability in Storing Information/Records
- Estimated that system should handle 100,000+ records
- GRAM service inserts records directly into audit DB
- Audit DB must be local to GRAM service to assure reliability
- Implemented to use either PostgreSQL or MySQL
Secure access
- Standard authentication and authorization methods should be used to limit access to the audit and accounting information
  – Clients must present a valid X.509 certificate
  – Access can be controlled based on a range of policies
- Current policy is to allow access iff the DN of the requestor matches the DN in the audit record
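
The current policy amounts to a single equality check. A minimal sketch follows, assuming the requestor's DN has already been taken from a validated X.509 certificate and that the audit record stores the submitter's DN in its subject_name field. The function names and sample DNs are hypothetical.

```python
def normalize_dn(dn: str) -> str:
    """Strip surrounding whitespace; an assumed, deliberately minimal normalization."""
    return dn.strip()


def may_read_record(requestor_dn: str, record_subject_name: str) -> bool:
    """Allow access if and only if the requestor's DN matches the DN in the audit record."""
    return normalize_dn(requestor_dn) == normalize_dn(record_subject_name)


# Made-up DNs for illustration.
owner = "/O=Example/OU=Gateway/CN=Community User"
other = "/O=Example/OU=Gateway/CN=Someone Else"

print(may_read_record(owner, owner))
print(may_read_record(other, owner))
```

Certificate validation itself is out of scope here; the check is only meaningful once the presented DN has been authenticated.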
[Figure: architecture diagram. Labels: LEAD Gateway; Resource Provider Site; GT4 Java Container; Delegation; RFT; WS GRAM; OGSA-DAI; RFT Audit Table; GRAM Audit Table; Compute Cluster; Resource Manager; RM Accounting; AMIE; TG Central Accounting DB. Arrows are numbered 1 to 9.]
Sequence Description
1. Gateway submits job and gets an EPR on the reply
2. Gateway controls and monitors job with EPR
3. GRAM submits and monitors job in RM
4. GRAM inserts audit record at end of job
5. RM writes job accounting record
6. AMIE uploads RM accounting records to TGCDB. The RM accounting record is converted to TG accounting units.
7. Gateway locally converts EPR to GJID
8. Gateway calls OGSA-DAI getChargeForJob with GJID and gets the job usage on the reply
9. OGSA-DAI processes remote join between GRAM audit and TGCDB
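
The last two steps can be pictured as a lookup by Grid Job ID followed by a join onto the accounting data. The sketch below uses a single in-memory SQLite database to stand in for both stores; in the deployment described, the audit table and TGCDB are separate databases and OGSA-DAI performs the join. The `tgcdb_jobs` table, its columns, the join on `local_job_id`, and all sample values are assumptions made for illustration.

```python
import sqlite3
from typing import Optional


def get_charge_for_job(db: sqlite3.Connection, job_grid_id: str) -> Optional[float]:
    """Return the charged units for a Grid Job ID, or None if no record matches."""
    row = db.execute(
        """
        SELECT acct.charge
        FROM gram_audit AS audit
        JOIN tgcdb_jobs AS acct ON acct.local_job_id = audit.local_job_id
        WHERE audit.job_grid_id = ?
        """,
        (job_grid_id,),
    ).fetchone()
    return None if row is None else row[0]


db = sqlite3.connect(":memory:")
# Reduced stand-ins for the GRAM audit table and the central accounting DB.
db.execute("CREATE TABLE gram_audit (job_grid_id TEXT PRIMARY KEY, local_job_id TEXT)")
db.execute("CREATE TABLE tgcdb_jobs (local_job_id TEXT PRIMARY KEY, charge REAL)")
db.execute("INSERT INTO gram_audit VALUES ('gjid-example-1', '12345')")
db.execute("INSERT INTO tgcdb_jobs VALUES ('12345', 12.5)")

print(get_charge_for_job(db, "gjid-example-1"))
print(get_charge_for_job(db, "gjid-unknown"))
```

A real join would probably also need to match on the resource or site, since local job IDs are only unique within one resource manager.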