1693d21f72ef8044a6920873b416b0dd.ppt
- Number of slides: 32
Grids Are Real: Avaki in the Life Sciences Sector September 2002 Avaki Corporation One Memorial Drive Cambridge, MA 02142 617.374.2500 www.avaki.com
Agenda § Avaki Background § Commercial Requirements § Grids in Commercial Environments § AVAKI Grid Software § Avaki and Standards § Avaki in Life Sciences 2
AVAKI Company Background § Comprehensive grid software - Development began in 1994 - Formerly Legion - First deployed in 1997 as NPACI-Net § Founded in 1998 § Funded by Polaris, General Catalyst, and Sofinnova § Strong customers, leading partners, industry consortia [Logo groups: Customers; Partners; Standards Organizations] 3
Situation § Competitive advantage depends on bringing new products to market faster § Increased automation in drug discovery and new product development has caused an explosion in data and in computation § Mergers, acquisitions, joint ventures, and partnerships are creating distributed and virtual organizations Competitive advantage depends on dynamically matching information technology resources – data, processing, and applications – with the individuals and organizations that depend on them. 5
Challenges § Data Chaos - Numerous or large data sources - Data updated frequently and on different schedules - Accessed by users at multiple locations in different organizations § Increased Computational Load - Compute intensive and high-throughput applications - Spikes in demand for computing power - Complex staging requirements § Supporting Distributed and Virtual Organizations - Heterogeneous information technology resources - Multiple locations and administrative domains - Frequently changing requirements and system failures 6
Requirements § End Users - Easy access to current, consistent data - Convenient access to applications and processing - Easy collaboration with colleagues and partners - Regardless of location, administrative domain, or platform - Applications often cannot, or will not, be modified • MUST WORK WITH LEGACY APPLICATIONS - Increasingly Java, J2EE as execution environment § IT - Support requests for better access and more resources - Streamline data management - Enable more flexibility in the use of resources - Protect corporate assets and intellectual property - IT managers are overworked - and represent 40% of IT costs • Grids must simplify life, not make it more complicated! 7
What you won’t see § MPI is very rare § Cross-platform, cross-site MPI - no interest at all § Multi-site applications - basically no interest § Remote visualization - no interest § Fancy parallel schedulers - little interest § Desire to write new applications - almost nil § Loads of bandwidth 8
What is a Grid System? A Grid system is a collection of distributed resources connected by a network. Examples of Distributed Resources: § Desktop § Handheld hosts § Devices with embedded processing resources such as digital cameras and phones § Tera-scale supercomputers 10
What is a Grid? A grid gathers resources together and makes them accessible to users and applications. It enables users to collaborate securely by sharing processing, applications, and data across heterogeneous systems and administrative domains, giving faster application execution and easier access to data. - Compute Grids - Data Grids 11
Solution: The Grid-Enabled Enterprise Organizations are adopting grid computing to simplify how information technology resources are accessed, to more flexibly and fully utilize resources, and to reduce manual system administration. By grid-enabling, enterprises can: § Establish more productive IT environments for end users, environments that address the complexities of today’s work § Streamline key processes and facilitate essential collaboration within and between companies § Simplify administration and management of IT resources § Fully utilize existing resources and avoid capital expenditures § Support a wider range of infrastructure choices and more easily migrate to new technologies § Provide secure access to resources while supporting the protection of intellectual property 12
Grid Computing Scenarios – Partner Grids § Multiple owners, sites, domains § Multiple file systems § Internet connectivity – Campus/Enterprise Grids § Multiple owners, domains § Multiple file systems § WAN connection – Cluster Grids § Single owner, department, project § Single domain, file system § LAN connection – Desktop Cycle Aggregation § Limited acceptance in commercial enterprises [Diagram label: AVAKI Compute Grid and Data Grid Software] 13
AVAKI Grid Software – Compute and Data Grid Capabilities § Unifies compute, data and application resources § Single, global namespace § Secure access § Simplified administration § Failure detection and restart [Diagram labels: Enterprise Users; Partner Users; Queuing System; Cluster; Desktops; Server; Shared Output; Shared Data Sources; IT Departments; Enterprise User Departments; Shared Data; Partner] 15
AVAKI 2.5 Data Grid § Federates multiple data sources § Provides access to data in local and virtual file systems (DAS, NAS, SAN) § Provides access to shared data through standard interfaces § Caches data locally [Diagram labels: Enterprise Users; Partner Users; Queuing System; Cluster; Desktops; Server; Shared Output; Shared Data Sources; IT Departments; Enterprise User Departments; Shared Data; Partner] 16
Avaki Data Grid – Data Mapped to the Global Namespace § Links directories and files from source location to data grid directory and user-specified name § Generates location-independent grid name § Presents unified view of the data across platforms, locations, firewalls, administrative domains, and data owners [Diagram labels: Windows 2000; Linux; Solaris; IT Departments; Enterprise User Departments; Partner] 17
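The mapping described on this slide can be pictured with a small sketch. This is not Avaki's implementation or API; the class names, hosts, and paths below are hypothetical, chosen only to show how a user-specified grid name can stay fixed while the source location behind it changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Physical location of shared data (all fields are illustrative)."""
    host: str
    platform: str
    path: str


class GlobalNamespace:
    """Toy mapping from location-independent grid names to source locations."""

    def __init__(self):
        self._links = {}

    def link(self, grid_dir, name, source):
        """Link a source directory into a grid directory under a user-specified name."""
        grid_name = f"{grid_dir.rstrip('/')}/{name}"
        self._links[grid_name] = source
        return grid_name

    def resolve(self, grid_path):
        """Return (source, remainder) for the longest linked prefix of grid_path."""
        best = None
        for grid_name in self._links:
            if grid_path == grid_name or grid_path.startswith(grid_name + "/"):
                if best is None or len(grid_name) > len(best):
                    best = grid_name
        if best is None:
            raise KeyError(grid_path)
        return self._links[best], grid_path[len(best):].lstrip("/")

    def relocate(self, grid_name, new_source):
        """Point an existing grid name at a new source; the grid name is unchanged."""
        if grid_name not in self._links:
            raise KeyError(grid_name)
        self._links[grid_name] = new_source


if __name__ == "__main__":
    ns = GlobalNamespace()
    name = ns.link("/grid/shared", "genbank",
                   Source("linux01", "Linux", "/export/db/genbank"))
    print(name, ns.resolve(name + "/release/seq.fa"))
```

Users and applications would only ever see the grid name; which platform or site holds the data is a detail of the lookup.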
Avaki Data Grid – Access Data § Access using standard NFS protocol or Avaki commands § Access using user-specified name § Access based on specified privileges § Single log-on for shared data access § Aggressively caches data locally [Diagram labels: Enterprise Users; Partner Users; AVAKI Data Access Server (x2); Cached Copy; Queuing System; Cluster; Desktops; Server; Shared Output; Shared Data Sources; IT Departments; Enterprise User Departments; Shared Data; Partner] 18
AVAKI 2.5 Data Grid Benefits § Requires no changes to applications or the way users typically access the data § Easy, convenient, wide-area access to data – regardless of location, administrative domain or platform § Provides consistent access to the most recent data available § Eliminates the need to create and maintain multiple copies § Caches remote data locally for high performance § Protects data with fine-grained security § Eases data administration and management 19
AVAKI 2.5 Compute Grid § Federates heterogeneous compute resources § Easy integration to third party queuing systems § Identifies “appropriate” resources § Automatically stages data and applications [Diagram labels: Enterprise Users; Partner Users; Resources; Queuing System; Cluster; Desktops; Server; Shared Output; Shared Data Sources; IT Departments; Enterprise User Departments; Shared Data; Partner] 20
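The slide does not say how "appropriate" resources are identified. As a rough illustration only, the sketch below assumes a job states the platforms it can run on and the CPUs it needs, and the grid picks the eligible resource with the most free capacity. The `Resource` and `Job` types and the selection rule are hypothetical, not Avaki's scheduler.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass
class Resource:
    """A compute resource federated into the grid (fields are illustrative)."""
    name: str
    platform: str
    free_cpus: int


@dataclass
class Job:
    """A job with simple placement requirements (fields are illustrative)."""
    name: str
    platforms: FrozenSet[str]
    cpus: int


def pick_resource(job: Job, resources: List[Resource]) -> Optional[Resource]:
    """Return the eligible resource with the most free CPUs, or None if none fits."""
    candidates = [
        r for r in resources
        if r.platform in job.platforms and r.free_cpus >= job.cpus
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.free_cpus)


if __name__ == "__main__":
    pool = [
        Resource("linux-cluster", "Linux", free_cpus=16),
        Resource("solaris-ws", "Solaris", free_cpus=2),
        Resource("win2000-ws", "Windows 2000", free_cpus=1),
    ]
    job = Job("blast-run", frozenset({"Linux", "Solaris"}), cpus=2)
    print(pick_resource(job, pool))
```

A real grid would also weigh usage policies, queue depth, and data staging cost; this sketch covers only the matching step.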
Avaki 2.5 Compute Grid Benefits § Easy, convenient, wide-area access to processing resources – regardless of location, administrative domain or platform § Eliminates time-consuming searching for available processing cycles § Executes jobs more efficiently § Better utilizes existing resources, helping avoid capital expenditures § Supports flexible usage of resources as required for changing requirements and system failures § Requires no changes to legacy or commercial applications § Protects resources with fine-grained access control § Eases system administration and management § Improves capacity management and planning 21
Commitment to Standards: GGF Commitment: § Respect for the standards: AVAKI will deliver the first and best commercial implementations of the OGSA/I standards § Respect for customer investments: AVAKI will interoperate with other OGSA-compliant apps (following ratification), including Globus Background: § AVAKI taking a visible, active role at the Global Grid Forum (GGF) - Andrew Grimshaw on GGF Steering Committee - AVAKI engineers active in OGSI, OGSA, and numerous other Working Groups § Contributed Secure Grid Naming Protocol (SGNP) to OGSI WG - Spec for scalable naming of grid entities, ability for such entities to communicate securely and reliably in spite of migration, replication, failure, etc. - Public expressions of support from IBM, H-P/Compaq, Sun, Platform 23
Commitment to Standards: I3C The I3C facilitates and enables data exchange, data management, and knowledge management across the entire life science community by promoting common protocols that ensure interoperability in an open, consistent and robust manner § AVAKI is a member of the I3C, alongside IBM, and is active on the Technical Architecture Committee § AVAKI co-authored LSID: a naming standard for distributed data - Distributed, biologically significant data items - Files, database records, and data objects managed by N-tier applications - Accessible over public and/or private networks - Owned, managed, and/or curated by different organizations § Joint demo with IBM, others, at Bio 2002 (July '02) - Integrates LSID to uniquely identify objects & data elements in a distributed, federated fashion 24
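The slide does not show what an LSID looks like. The sketch below assumes the layout used in the later published LSID specification, `urn:lsid:<authority>:<namespace>:<object>[:<revision>]`; the 2002 draft shown at Bio 2002 may have differed. The `parse_lsid` helper and the example identifier are made up for illustration.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Parts of a Life Science Identifier (assumed layout, see lead-in)."""
    authority: str            # organization that issued the identifier
    namespace: str            # collection within that authority
    object_id: str            # the data item itself
    revision: Optional[str]   # optional version of the item


def parse_lsid(text: str) -> LSID:
    """Split an LSID string into its parts, raising ValueError if malformed."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"empty LSID component in {text!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(parts[2], parts[3], parts[4], revision)


if __name__ == "__main__":
    # Hypothetical identifier; example.org is not a real LSID authority.
    print(parse_lsid("urn:lsid:example.org:proteins:P12345:2"))
```

Because the authority is part of the name, data owned or curated by different organizations can be named without a central registry of object IDs, which matches the "distributed, federated" wording on the slide.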
Industry Problem: Increasing Cost and Complexity of Life Science Data-sharing GenBank growth has continued to trend sharply upwards, as have many other classes of biological data § Over 400 public Life Sciences databases § 8x growth in genomics data, last 18 months. And this is just the beginning … - Proteomics data: 1,000x multiplier [2] - Glycomics, new small molecule efforts § Increasing scope of data diversity - Annotations (interactions) - Organism-specific (mouse, human) - Molecule-specific (protein, sugar) - Data-type-specific (gene expression) § Increasingly complex data interrelationships [Figure: Increasingly complex interrelationships between biological research databases today (LION graphic)] [1] Source: TimeLogic estimates. [2] Frank Gleeson, CEO, MDS Proteomics 26
Enterprise IT Problem: Life Sciences Data Management • Multiple research groups, domains • Each dept or site acquires & manages its own data • Coherence issues among researchers • Bandwidth costs • Multiple FTEs allocated to data management efforts [Diagram labels: Public DB; Varying Media (CD, FTP, Web Portal, Tape); Location 2; Location 3; SEQ_1, SEQ_2, SEQ_3; APP 1, APP 2; Bioinformatics; Research; Pharmaceutical Company; External Partner] 27
Data Management Solution – Using Avaki Data Grid § Multiple data sources § One authoritative copy § Consistent data across sites § Automated process eliminates manual and duplicated effort § Write-through cache supports sharing user-created data § “Data Access Problem” solved [Diagram labels: Central IT; Public Data; Internal Data; Partner Data; Avaki Data Cache (x2); Enterprise/Partner Sites] 28
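For readers unfamiliar with the term, a write-through cache sends every write to the authoritative copy before keeping a local copy, so data created at one site is immediately available to the others. The sketch below shows only that general technique; the class names are hypothetical, and the slides do not describe how Avaki keeps caches at different sites coherent, so invalidation is left out.

```python
class AuthoritativeStore:
    """Stands in for the single authoritative copy of the data."""

    def __init__(self):
        self.data = {}
        self.reads = 0  # counts reads that reached the authoritative copy

    def read(self, key):
        self.reads += 1
        return self.data[key]

    def write(self, key, value):
        self.data[key] = value


class WriteThroughCache:
    """Per-site cache: reads are served locally when possible, writes go through."""

    def __init__(self, store):
        self._store = store
        self._local = {}

    def read(self, key):
        # Fetch from the authoritative copy only on a local miss.
        if key not in self._local:
            self._local[key] = self._store.read(key)
        return self._local[key]

    def write(self, key, value):
        # Update the authoritative copy first, then the local copy.
        self._store.write(key, value)
        self._local[key] = value


if __name__ == "__main__":
    store = AuthoritativeStore()
    site_a = WriteThroughCache(store)
    site_b = WriteThroughCache(store)
    site_a.write("SEQ_1", "ACGT")   # user-created data at one site
    print(site_b.read("SEQ_1"))     # visible from another site
```

Repeated reads at a site hit the local copy, which is what keeps wide-area access fast while there is still only one authoritative copy to manage.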
Resource Availability and Access Problem § Some resources are at maximum capacity while other resources are underutilized § Different user interfaces § Multiple queuing systems § Multiple log-ons and complex UID management § Policy and security needs make sharing difficult [Diagram labels: Queuing System; WIN 2000 Workstations; Linux Cluster (x2); Partner Server; Solaris Workstations; Servers; Enterprise] 29
Resource Availability and Access Solution – Using Avaki Compute Grid § Load balanced across resources for improved utilization § Single log-on to run jobs § Single user interface § Single set of commands to access all resources § Usage policies make sharing easy and secure [Diagram labels: Avaki Log-on/Commands; Local Usage Policies; Queuing System (x2); WIN 2000 Workstations; Linux Cluster (x2); Server; Solaris Workstations; Servers; Cluster Server] 30
Summary 31
The AVAKI Difference § AVAKI 2.5 combines data and compute grid capabilities - Provides wide-area access to data, processing and application resources, while protecting corporate assets and intellectual property - Supports complexities of enterprise and global grids - Simplifies system administration and management § Packaged to be deployed quickly - Integrated architecture that requires no additional development - Requires no changes to legacy or commercial applications § Comprehensive support - Design, installation and configuration - Performance tuning - Customer support 32


