
Advanced CPAS
Adam Rauch
LabKey Software
adam@labkey.com

Agenda
• Demo of recent & advanced features
• Pipeline architecture & configuration
• Production installations

What Is CPAS?
A proteomics analysis system that handles all data processing & management for high-throughput labs and core facilities

Demo

“Mini-Pipeline”
• Included & configured in standard install
• CPAS invokes executables (tandem, tpp) directly on web server
• Simple approach works fine for low-throughput evaluation installs

FHCRC Installation
[Diagram: CPAS Pipeline. Component labels, in order: Web Server (2 Proc, 2 GB; Tomcat); Pipeline Mgr; Database Server (4 Proc, 4 GB; MS SQL Server); NetApp; Mass Spec PC; File Server (Sun Hierarchical Storage); mzXML Conversion Server; Cluster; 20+ TB; Tape Robot]

Production Pipeline
• Multi-server, clustered, high-throughput pipeline demands a more sophisticated approach
• CPAS interface for configuring and submitting jobs is identical, but pipeline control & communication are handled differently
• Each project is typically configured with a separate “pipeline root”
• User initiates search by selecting raw file and specifying search parameters (protocol)
• CPAS writes settings file to raw-file directory
• Background process (cron job) running on pipeline server sees new job and kicks off pipeline processing
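The hand-off in the last two bullets (CPAS writes a settings file, a periodically scheduled process notices it) can be sketched as a polling script. This is a minimal sketch under stated assumptions, not the Perl scripts used at FHCRC: the settings file name `search.settings` and the `.claimed` marker file are hypothetical names chosen for illustration.

```python
from pathlib import Path

SETTINGS_NAME = "search.settings"  # hypothetical settings file name
CLAIM_SUFFIX = ".claimed"          # hypothetical marker for jobs already picked up


def find_new_jobs(pipeline_root):
    """Return directories under pipeline_root that hold an unclaimed settings file."""
    new_jobs = []
    for settings in sorted(Path(pipeline_root).rglob(SETTINGS_NAME)):
        marker = settings.with_name(settings.name + CLAIM_SUFFIX)
        if not marker.exists():
            new_jobs.append(settings.parent)
    return new_jobs


def claim_job(job_dir):
    """Drop a marker file so the next scheduled run skips this job."""
    (Path(job_dir) / (SETTINGS_NAME + CLAIM_SUFFIX)).touch()


def poll_once(pipeline_root, start_job):
    """One scheduled invocation: claim each new job, then hand it to start_job."""
    started = []
    for job_dir in find_new_jobs(pipeline_root):
        claim_job(job_dir)
        start_job(job_dir)
        started.append(job_dir)
    return started
```

Scheduling `poll_once` every few minutes keeps the web server out of the processing path: CPAS only writes files, and the pipeline server decides when work starts.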

CPAS Pipeline
Automated pipeline moves MS2 data from instrument, through MS/MS search and post-processing, and into CPAS
[Diagram labels, in order: Sample Input; LTQ FT, MALDI, LCQ; Raw File; MS/MS Search Cluster (X! Tandem, SEQUEST, MASCOT; XPRESS, Peptide/Protein Prophet); Raw File Convert Server; mzXML File; PC #40; mzXML, pepXML, protXML Files; CPAS]

Production Pipeline Workflow
• Cron job state machine manages workflow
  – Initiates RAW to mzXML conversion
    • Conversion server (ConversionQueue)
    • Vendor-specific DLLs require Windows server
  – Submits MS/MS search to cluster scheduler
  – Submits post-processing jobs to cluster scheduler
  – Handles fractionation scenarios (individual, multi)
  – When processing is complete, instructs CPAS to load run
• Job status is reported via log files, which CPAS reads to update web UI
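The workflow above can be read as a small state machine that advances one stage per successful step and reports progress through a log. The following is a minimal sketch of that idea, not the production scripts: the stage names, the linear ordering, and the `DONE:` / `ERROR:` log line format are all assumptions made for illustration, and fractionation handling is left out.

```python
from enum import Enum


class Stage(Enum):
    CONVERT = "RAW to mzXML conversion"
    SEARCH = "MS/MS search"
    POSTPROCESS = "post-processing"
    LOAD = "load run into CPAS"
    COMPLETE = "complete"


# Assumed linear path through the stages listed on the slide.
NEXT_STAGE = {
    Stage.CONVERT: Stage.SEARCH,
    Stage.SEARCH: Stage.POSTPROCESS,
    Stage.POSTPROCESS: Stage.LOAD,
    Stage.LOAD: Stage.COMPLETE,
}


def advance(stage, step_succeeded, log):
    """Append a status line to log and return the stage for the next run."""
    if stage is Stage.COMPLETE:
        return stage  # nothing left to do
    if not step_succeeded:
        log.append(f"ERROR: {stage.value} failed")
        return stage  # stay in place so a later run can retry
    log.append(f"DONE: {stage.value}")
    return NEXT_STAGE[stage]
```

Because each call records its outcome in `log`, a reader of that log (CPAS, in the slide's description) can show job status without talking to the cluster directly.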

Search Engine Configuration
• SEQUEST cluster uses “SequestQueue”
  – Custom Tomcat/Java web application
  – Installed on head node of cluster
  – Pipeline communicates with SequestQueue over HTTP
• Pipeline drives Mascot cluster directly via HTTP
• Pipeline drives X! Tandem via cluster scheduler

Configuring A Production Pipeline
• Install, customize Perl scripts that manage the workflow
  – Scripts used at Fred Hutchinson are available as an example
• Configure conversion server
  – Converters & vendor-specific DLLs
• Install TPP, MS/MS search engine(s) on cluster
• Enable your search engine(s) within CPAS
• Install CPAS FTP server (optional)
  – Useful to allow external collaborators to submit jobs to pipeline
• Configure pipeline email notifications (optional)
  – Email notifications for completion and/or failures

Demo

Production Installation

Web & Database Servers
• Server operating system(s)
  – CPAS runs on all popular operating system platforms
  – Solaris, Linux, Windows, OS X installations
  – Windows has somewhat easier install & upgrade process
    • Graphical installer
    • Pre-compiled binaries
  – Select OS that you & your IT staff are most comfortable with
• Database server
  – PostgreSQL: runs on all popular hardware/OS platforms, free
  – Microsoft SQL Server: Windows only, commercial, well tested
• Server hardware
  – Invest in database server: powerful server, ample storage, reliability
  – Web server much less demanding

IT Infrastructure
• Shared file system (NFS)
  – CPAS and pipeline need access to a common NFS
  – Archive RAW, mzXML, pepXML, etc. files
• Need plan for backing up NFS and database

Select Administrators
• Database administrator
• Server administrators
• CPAS site administrators
• CPAS project administrators

Production Installation Customization & Settings
• Many settings for customizing CPAS to your needs
  – Fully documented on www.labkey.org
  – Review all settings carefully on a regular basis
• CPAS settings are handled in several places
  – Most configuration is done via the “Admin Console”
  – /conf/server.xml
  – /conf/Catalina/localhost/labkey.xml

Database
• JDBC parameters specified in labkey.xml
  – Driver class (PostgreSQL vs. SQL Server)
  – URL includes server name, port, database name
  – User name & password
• Protect your data
  – CPAS database user needs read/write/delete/update perms
  – Use a strong password!
  – Provide no access to database server outside firewall
• PGTest and jtdstest tools can help test config
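To make the JDBC bullets concrete, here is a minimal sketch of how the driver class and URL differ between the two databases. It assumes SQL Server is reached through the open-source jTDS driver (suggested by the jtdstest tool named above) and uses `driverClassName` and `url` as key names by analogy with Tomcat data-source attributes; the exact contents of labkey.xml are not shown in the slides, so treat this as illustrative only. The host and database names in the usage note are hypothetical.

```python
# Driver class names and URL prefixes for the two supported databases.
# Assumption: SQL Server is accessed via the jTDS driver.
DRIVERS = {
    "postgresql": ("org.postgresql.Driver", "jdbc:postgresql://"),
    "sqlserver": ("net.sourceforge.jtds.jdbc.Driver", "jdbc:jtds:sqlserver://"),
}


def jdbc_settings(db_type, server, port, database):
    """Build the driver class and JDBC URL from server name, port, and database name."""
    driver_class, prefix = DRIVERS[db_type]
    return {
        "driverClassName": driver_class,
        "url": f"{prefix}{server}:{port}/{database}",
    }
```

For example, `jdbc_settings("postgresql", "dbhost", 5432, "labkey")` produces a URL of the form `jdbc:postgresql://dbhost:5432/labkey`. The user name and password are deliberately kept out of this helper, in line with the slide's advice to protect them.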

Networking
• Basic Networking
  – Specify port in server.xml
  – Open firewall port(s)
  – Procure server name and update DNS
• SMTP settings
  – Server, port, credentials specified in labkey.xml
  – System email address specified in site settings

Security
• Designed to keep sensitive, unpublished scientific data secure
• Authentication: dual scheme approach
  – Can delegate to institution’s LDAP system
  – External users: invitation only
    • Users choose their own passwords
    • Hash of password is stored in database and used for authentication
• Authorization: Users must be granted explicit permissions
  – All data stored in folder hierarchy managed by the database
  – Users are added to groups
  – Groups are granted permission to folder or hierarchy
  – Authorized only if user belongs to group with required permissions
• Folders can be made “public” (no authentication required)
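The authorization rules above (users belong to groups, groups hold permissions on folders or hierarchies) can be sketched as a lookup. This is a minimal sketch, not CPAS's implementation: it assumes a subfolder inherits the grants of its nearest ancestor that has explicit grants, and that a “public” folder allows only `read` to unauthenticated users. The folder paths, group names, and permission names are hypothetical.

```python
from posixpath import dirname


def effective_grants(folder, grants):
    """Walk up from folder to the nearest folder with explicit grants (assumed inheritance)."""
    current = folder
    while True:
        if current in grants:
            return grants[current]
        if current in ("/", ""):
            return {}  # reached the root without finding any grant
        current = dirname(current)


def is_authorized(user, folder, required, memberships, grants, public=frozenset()):
    """True if the folder is public (read only) or one of the user's groups holds the permission."""
    if folder in public and required == "read":
        return True
    folder_grants = effective_grants(folder, grants)
    user_groups = memberships.get(user, set())
    return any(required in folder_grants.get(group, set()) for group in user_groups)
```

A user with no group membership, or an anonymous user passed as `None`, falls through to `False` unless the folder is public, which matches the "explicit permissions" rule on the slide.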

Security Settings
• SSL
  – We strongly recommend requiring SSL connections
  – Enable SSL port in server.xml
  – Use “Require SSL connections” option & port setting
• LDAP & SASL
  – Configure CPAS to authenticate users to your organization’s LDAP server(s)
  – Specify server name, domain, principal template, SASL
• Email templates
  – Customize new user registration, password change, etc. emails

Other Settings
• Network drive
  – Allows CPAS running as Windows service to attach NFS as a drive
• Site-wide option to enable caBIG™
• Mascot & SEQUEST connection settings
• Site description, color theme, font size, logo

Future Directions
• Web services-based pipeline
• Faster, easier loading of protein annotations
• Multi-engine comparisons
• Improved generalized query support
• Phase 2 of caBIG support

LabKey Software, Inc.
• Private consulting company created by FHCRC and team of software professionals
  – Formed to support, document, and extend the CPAS project to other functions and labs
  – Independent company to directly address other institutions’ needs and secure outside funding
• Partnership:
  – Clients provide scientific leadership
  – LabKey focuses on software development
• LabKey is available to customize, install, and support your pipeline, CPAS, and other LabKey applications
  – Business model ensures you get help & support when you need it

Next Steps
• Visit our booth
• Join our informal receptions here
  – 6:30 – 9:30 PM Tonight & Tomorrow
• Talk to LabKey about your plans

Resources
• http://www.labkey.org
  – CPAS Distribution & Support Site
  – Ask questions, contribute feedback
  – Peruse all the CPAS documentation & tutorials
  – Download the latest version (LabKey 2.1)
    • Graphical installer for Windows installation
    • Well documented “manual” installation for Linux/Mac
• http://www.labkey.com
  – LabKey Software Inc. company web site
• CPAS Paper
  – Rauch A, Bellew M, Eng J, et al. Computational Proteomics Analysis System (CPAS): An Extensible, Open-source Analytic System for Evaluating and Publishing Proteomic Data and High throughput Biological Experiments. J Proteome Res 2006; 5(1): 112-121.

Acknowledgements
• Fred Hutchinson Cancer Research Center
• National Cancer Institute
• Canary Foundation
• Gates Foundation
• Institute for Systems Biology
• Ron Beavis & The GPM
• Numerous developer contributors

Questions?

Advanced Analysis Features
• Filter groups of runs and compare peptides, proteins, ProteinProphet, quantitation, etc.
• Analyze groups of runs based on sample properties
• Search all experiments for a specific protein or gene name
• Link results to protein annotations
  – Load protein knowledgebases: TrEMBL, Swiss-Prot
  – Gene Ontology: produce GO charts analyzing molecular function, cellular location, metabolic process
  – Custom protein annotation lists
• Flexible, custom query capability
  – Join results to protein, experiment, sample tables
  – Display exactly the data you care about