IU ORE-Chem Update

Slide titles:
IU ORE-Chem Update
Slide 2
What We Said We Would Do
Layer Cake of IU Activities
Hardware Layer
Cloud Infrastructure
Slide 7
Data Cloud Infrastructure
Triple Store: Intellidimension
Open Elastic Block Store
Block Store Architecture
Integration with Cloud Computing Systems
Programming Clouds
Multicore and Cloud Technologies to support Data Intensive applications
Slide 15
IU’s ORE-CHEM Pipeline
Iterative MapReduce- Kmeans Clustering and Matrix Multiplication
Slide 18
Conclusions: Dryad for Scientific Computing
Slide 20
Architecture of Swarm Service
Swarm-Grid
Some Details
Dryad Data Partitioning
Programming the Pipeline
Security in Web 2.0
OAuth: REST Security
OAuth Security Status
Acknowledgments
Full Stop
Dryad and DryadLINQ
Slide 32
Slide 33
Slide 34
Slide 35
Slide 36
Drilling Though Data Clouds
Slide 38
Job Execution Time in Swarm-Dryad with Windows HPC 16 nodes
Job Execution Time in Swarm-Dryad various number of nodes

IU ORE-Chem Update
Marlon Pierce, Geoffrey Fox
Indiana University

IU to lead New US NSF Track 2d $10M Award
See http://www.futuregrid.org for more information.

What We Said We Would Do
• Apply data-centric workflow technologies (Dryad)
  – Significant effort
• Install and run triple store
  – Done locally.
  – Need to do this in Azure.
• Design alternative formats for ORE (JSON, Microformats)
  – Nothing to report yet
• Design secure services, compositions, mash-ups
  – OAuth piece done.
  – Significant effort on social network interfaces
  – Nothing to report on ORE-Chem enabled services yet
• Investigate clouds for ORE-Chem
  – Infrastructure and runtime
  – Significant effort on virtual data stores, overheads of virtualization.

Layer Cake of IU Activities
Web 2.0 Research: Security for REST Services
Cloud Computing: Infrastructure and Runtimes
Infrastructure: Windows HPC Testbeds

Hardware Layer

Cloud Infrastructure
• Tempest: HP distributed shared memory cluster with 768 processor cores and 1.5 TB total memory capacity. The cluster includes 13.7 TB of local spinning disk.
  – Tempest can be dynamically reconfigured to act as either a Windows HPC or Linux cluster.
  – Smaller versions: Madrid and Barcelona
• Other machines:
  – The IBM iDataPlex system is an IBM e1350 distributed shared memory cluster with 1024 processor cores and 3 TB total memory capacity.
  – Cray XT5m distributed shared memory cluster with 672 processor cores and 1.3 TB total memory capacity.
  – A shared memory system with at least 480 cores and 640 GB of RAM will also be installed at IU as part of the FutureGrid award.

Data Cloud Infrastructure

Triple Store: Intellidimension
• This has been installed on IU servers.
• We are ready for data.
• Efforts to install this on MS Azure were not successful.
  – Inadequate documentation earlier in the year.
  – We will revisit this.

Open Elastic Block Store
• Amazon EBS is a way to mount virtual disks in cloud-space.
  – Empty disk space or archived data stores
  – ORE-CHEM enabled data sets, for example.
  – Clone-able, so you can keep your own version of community data.
• We are implementing an open version of this.
  – Contribute to Nimbus, an open-source EC2
  – But independent of Xen, etc.
  – Would be interesting to do this for Windows
• Eventual backbone: IU has over a petabyte of disk space in its Lustre file system.
  – Can be used to load and store VMs.
• X. Gao won best student poster award at TG09.
  – Paper accepted to E-Science 2009

Block Store Architecture
Diagram components: Volume Server, Volume Delegate, Virtual Machine Manager (Xen Dom 0), VMM Delegate, VM instance (Xen Dom U), VBS Web Service, VBS Client, VBD, iSCSI
Volume Server operations: Create Volume, Export Volume, Create Snapshot, etc.
Virtual Machine Manager operations: Import Volume, Attach Device, Detach Device, etc.

Integration with Cloud Computing Systems
Diagram components: Volume Server, Volume Delegate, Xen Dom 0, Xen Delegate, Xen Dom U, VBS Web Service, VBS Client, VBD, iSCSI, Nimbus Workspace Service, VBS_Nimbus Web Service
Volume Server operations: Create Volume, Export Volume, Create Snapshot, etc.
Xen Dom 0 operations: Import Volume, Attach Device, Detach Device, etc.
VBS_Nimbus Web Service: Attach-volume <volId> <Nimbus Instance Id> <device>
Nimbus Workspace Service: Query for Xen Dom0 Host and DomUId with <Nimbus Instance Id>

Programming Clouds

Multicore and Cloud Technologies to support Data Intensive applications
• Using Dryad (Microsoft) and MPI to study structure of Gene Sequences on Tempest Cluster. We are working on PubChem.
See http://www.infomall.org/salsa for lab projects (X. Qiu).
The PubChem dataset consists of 166 binary MACCS keys (fingerprints), which indicate whether each chemical compound has a special functional molecule or not. We have a total of 26,466,421 chemical compounds (i.e., the total PubChem dataset has 166 dimensions and 26M records).
Randomly selected 50K chemicals to produce a 3D GTM map. GTM is an algorithm to find a lower-dimensional structure in higher-dimensional data (3D in this case).
http://www.youtube.com/watch?v=nylgjKgnSLg
Courtesy of Jong Y. Choi

IU’s ORE-CHEM Pipeline
Harvest NIH PubChem for 3D Structures
Convert PubChem XML to CML
Convert CML to Gaussian Input
Submit Jobs to TeraGrid with Swarm
Convert Gaussian Output to CML
Convert CML to RDF->ORE-Chem
Insert RDF into RDF Triple Store
Conversions are done with Jumbo/CML tools from Peter Murray Rust’s group at Cambridge.
Swarm is a Web service capable of managing tens of thousands of jobs on the TeraGrid.
We are developing a Dryad version of the pipeline.
Goal is to create a public, searchable triple store populated with ORE-CHEM data on drug-like molecules.

Iterative MapReduce- Kmeans Clustering and Matrix Multiplication
Figure: Iterative MapReduce algorithm for Matrix Multiplication
Figure: Kmeans Clustering implemented as an iterative MapReduce application
Overhead of parallel runtimes – Matrix Multiplication
• Compute intensive application O(n^3)
• Higher data transfer requirements O(n^2)
• CGL-MapReduce shows minimal overheads next to MPI
Overhead of parallel runtimes – Kmeans Clustering
• O(n) calculations in each iteration
• Small data transfer requirements O(1)
• With large data sets, CGL-MapReduce shows negligible overheads
• Much higher overheads in Hadoop and Dryad
Jaliya Ekanayake

High Performance Parallel Computing on Cloud
• Performance of MPI on virtualized resources
  – Evaluated using a dedicated private cloud infrastructure
  – Exactly the same hardware and software configurations in bare-metal and virtual nodes
  – Applications with different communication-to-computation ratios
  – Different virtual machine (VM) allocation strategies {1 VM per node to 8 VMs per node}
Figure: Performance of Matrix multiplication under different VM configurations
Figure: Overhead under different VM configurations for Concurrent Wave
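
Illustrative sketch: Kmeans as an iterative MapReduce application
The Kmeans slide above describes clustering as an iterative MapReduce job with O(n) work per iteration and only a small amount of data exchanged between iterations. The sketch below is one way to picture that structure in plain Python. It does not use CGL-MapReduce, Hadoop, Dryad, or MPI, and the function names (map_phase, reduce_phase, kmeans) and the sample points are assumptions made for illustration, not code from the slides.

```python
from collections import defaultdict

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )

def map_phase(points, centroids):
    """Map: emit (centroid index, (point, 1)) for every input point."""
    for point in points:
        yield nearest(point, centroids), (point, 1)

def reduce_phase(pairs, centroids):
    """Reduce: sum the points assigned to each centroid and return new means."""
    sums = {}
    counts = defaultdict(int)
    for idx, (point, n) in pairs:
        if idx in sums:
            sums[idx] = [s + p for s, p in zip(sums[idx], point)]
        else:
            sums[idx] = list(point)
        counts[idx] += n
    # A centroid that received no points keeps its previous position.
    return [
        tuple(s / counts[i] for s in sums[i]) if i in sums else centroids[i]
        for i in range(len(centroids))
    ]

def kmeans(points, centroids, max_iter=100, tol=1e-9):
    """Repeat map and reduce until the centroids stop moving."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(
            sum((a - b) ** 2 for a, b in zip(old, new))
            for old, new in zip(centroids, updated)
        )
        centroids = updated
        if shift <= tol:
            break
    return centroids

# Hypothetical 2D sample data with two obvious clusters.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
result = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(result)  # [(0.0, 0.5), (10.0, 10.5)]
```

In this sketch the only state carried from one iteration to the next is the list of centroids, which is why the data transferred per iteration stays small while the map phase touches every point.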

