COSC 6374 Parallel Computation
Collective I/O and Scientific Data Libraries
Mohamad Chaarawi and Edgar Gabriel
Spring 2009

Collective I/O Operations
- Same notion as collective communication
- All processes in the communicator participate in the operation
- How do we benefit from collective I/O?

Developing Collective I/O Algorithms
- Three classes of algorithms will be explained:
  - Dynamic segmentation algorithms
  - Static segmentation algorithms
  - Individual algorithms
- Examples use MPI_File_write_all:
    int MPI_File_write_all(MPI_File fh, void *buf, int count,
                           MPI_Datatype datatype, MPI_Status *status)

Dynamic Segmentation Algorithm
- Group processes according to the number of writers
- All processes share location information about the data to be written within their group (MPI_Allgather):
  - File offsets
  - Number of elements to be written
- Sort these lists in ascending order of the file offsets
- Write a fixed number of bytes to disk in each cycle (the cycle buffer size, CBS)
- Using MPI_Gatherv, each process sends the elements it contributes in the current cycle to the writer process assigned to it

Fixed vs. Scaling CBS
- Scaling: each writer writes the specified CBS in one cycle, so the total amount of data written to disk in each cycle is CBS * number of writers
- Fixed: the cycle buffer size is divided among all writers, so in every cycle each writer writes CBS / number of writers

Example: Write, Dynamic Segmentation
[Figure: Process 0, Process 1 (writer), and Process 2 each hold 6 KB in 1 KB blocks, numbered 1 to 18 overall. With a cycle size of 5 KB, the file holds blocks 1 to 5 after cycle 1, 1 to 10 after cycle 2, 1 to 15 after cycle 3, and all 18 blocks after cycle 4.]

Static Segmentation Algorithm
- Data is gathered from all processes at a root process
  which will perform the low-level write operation
- Data is written in fixed chunks, with the size of the chunk being a configurable parameter
- The root process gathers a fixed number of bytes from all processes in each cycle
- Because every process contributes a constant amount of data in every cycle, this algorithm makes better use of the communication resources in the cluster

Static Segmentation Algorithm
- This algorithm is well suited to the following scenarios:
  - With huge caches on the I/O nodes, storage devices are decoupled from the compute cluster and thus show, from the application perspective, virtually no sensitivity to irregular or strided file access patterns
  - Solid-state drives (SSDs) are insensitive to irregular access in the file

Example: Write, Static Segmentation
[Figure: Process 0, Process 1 (writer), and Process 2 each hold 6 KB in 1 KB blocks, numbered 1 to 18 overall. With a cycle size of 2 KB, each cycle gathers two blocks from every process; cycles 1 to 3 are shown.]

Individual Algorithm
- Avoids communication operations entirely and has each process write its data individually to the hard drive
- Extended by a scheduling approach that controls the number of processes concurrently performing I/O operations, thus limiting the burden on metadata servers for some file systems

Test Environment
[Figure: front-end node, metadata server, and compute nodes Node 0 to Node 23 running Process 0 to Process 23; interconnects: IB, GE.]
- File systems: PVFS2, NFS

Test Case
- A simple benchmark in which processes collectively write max_size bytes of data to the file for a given number of iterations
- We measure the execution time required to write all data to the file
- Each test was executed three times, taking the maximum bandwidth achieved
- Need
  to check the difference in performance while varying algorithms, parameters, and file system

PVFS2 Results
- Each process executes MPI_File_write_all operations writing 20 MB of data per function call, writing 1 GB of data to the file in total
- Thus the overall file size is 24 GB for the 24-process test cases and 48 GB for the 48-process test cases
[Plot: CBS 20 MB, 24 processes]
[Plot: CBS 20 MB, 48 processes]
[Plot: SS 20 MB, 24 processes]

NFS Results
- Each process writes 100 MB of data in order to keep the execution time reasonable
- The segment size (SS) is kept constant at 2 MB
- The cycle buffer size is varied between 1 MB, 10 MB, and 20 MB
[Plot: SS 2 MB, 24 processes]
[Plot: SS 2 MB, 48 processes]

Tuning Methodologies
- Results show that many factors affect the performance of collective I/O operations:
  - File system
  - Network interconnect
  - Number of processes
  - Size of file
  - Algorithmic parameters (SS, CBS)
- Static tuning: prior to the execution of the application, tune the algorithmic parameters for the best performance on a given platform
- Dynamic tuning: tune at runtime

Motivation
- MPI I/O is good:
  - It knows about data types and data conversion
  - It can optimize various access patterns in applications
- MPI I/O is bad:
  - It does not store any information about the data type
    - A file written as MPI_INT can be read as MPI_DOUBLE in another application
    - No information is stored on whether it is a two-dimensional data array or anything else

Scientific data libraries
- Handle data on a higher level
- Add more information to the data (metadata):
  - Size of data structure
  - Information about the numerical format
- Read
  and write data structures by name, or add units to your data
- Two widely used libraries are available:
  - NetCDF
  - HDF-5

HDF-5
- Hierarchical Data Format (HDF), developed since 1988 at NCSA (University of Illinois)
  - http://hdf.ncsa.uiuc.edu/HDF5
- Has gone through a long history of changes; the most recent version, HDF-5, has been available since 1999
- HDF-5 supports:
  - Very large files
  - Parallel I/O interface
  - Fortran, C, Java bindings

HDF-5 dataset
Multi
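
As a worked check of the fixed and scaling cycle buffer size (CBS) rules and of the two write examples from the collective I/O part of the lecture, the arithmetic can be sketched in C as follows. The function names `bytes_per_cycle`, `bytes_per_writer`, and `num_cycles` are hypothetical helpers introduced here for illustration; they are not part of MPI or of the lecture's implementation. The fixed case also assumes that the CBS divides evenly among the writers.

```c
#include <assert.h>
#include <stddef.h>

/* Total bytes written to disk per cycle across all writers.
 * scaling != 0: every writer writes a full CBS, so the total is CBS * writers.
 * scaling == 0 (fixed): the CBS is shared among the writers, so the total is CBS. */
size_t bytes_per_cycle(size_t cbs, size_t writers, int scaling)
{
    assert(cbs > 0 && writers > 0);
    return scaling ? cbs * writers : cbs;
}

/* Bytes a single writer handles per cycle.
 * Assumption (not from the lecture): in the fixed case CBS is a multiple of
 * the number of writers, so the integer division loses nothing. */
size_t bytes_per_writer(size_t cbs, size_t writers, int scaling)
{
    assert(cbs > 0 && writers > 0);
    return scaling ? cbs : cbs / writers;
}

/* Number of cycles needed to write total_bytes when per_cycle bytes are
 * written in each cycle; rounds up to cover a final partial cycle. */
size_t num_cycles(size_t total_bytes, size_t per_cycle)
{
    assert(per_cycle > 0);
    return (total_bytes + per_cycle - 1) / per_cycle;
}
```

Under these assumptions, `num_cycles(18 * 1024, bytes_per_cycle(5 * 1024, 1, 0))` evaluates to 4, matching the four cycles of the dynamic segmentation example (18 KB in total, one writer, 5 KB cycle size). Likewise, `num_cycles(6 * 1024, 2 * 1024)` evaluates to 3, matching the static segmentation example, in which each process contributes 2 KB of its 6 KB per cycle.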