CS736 Course Project Report

An In-Depth Examination of Java I/O Performance and Possible Tuning Strategies

Kai Xu, Hongfei Guo
xuk@cs.wisc.edu, guo@cs.wisc.edu

Abstract

There is a growing interest in using Java for high-performance computing because of the many advantages that Java offers as a programming language. To be useful as a language for high-performance computing, however, Java must not only have good support for computation but must also be able to provide high-performance file I/O. In this paper, we first examine possible strategies for doing Java I/O. Then we design and conduct a series of performance experiments accordingly, using C/C++ as a comparison group. Based on the experimental results and analysis, we reach our conclusions: Java raw I/O is slower than C/C++, since system calls in Java are more expensive; buffering improves Java I/O performance, for it reduces system calls, yet there is no big gain for larger buffer size; direct buffering is better than the Java-provided buffered I/O classes, since the user can tailor it for his own needs; increasing the operation size helps I/O performance without overheads; I/O-related system calls implemented within Java native methods are cheap, while the overhead of calling Java native methods is rather high. When the number of JNI calls is reduced properly, a performance comparable to C/C++ can be achieved.

1 Introduction

There is a growing interest in using Java for high-performance computing because of the many advantages that Java offers as a programming language. To be useful as a language for high-performance computing, however, Java must not only have good support for computation but must also be able to provide high-performance file I/O, as many scientific applications have significant I/O requirements. However, while much work has been done in evaluating Java performance as a programming language, little has been done in a satisfying evaluation of Java I/O performance. In this paper, we investigate in depth the I/O capabilities of Java
and examine how, and how well, different possible tuning strategies work compared to C/C++.

1.1 Contribution of This Paper

The contributions of this paper are threefold. First, we explored possible strategies one can utilize to get high performance in Java I/O. Second, we designed and conducted a series of experiments that examine the performance of each individual strategy in comparison to C/C++. Finally, the experiment results are thoroughly analyzed and conclusions are reached.

1.2 Related Work

There are already some papers discussing Java I/O performance. Our work differs from those in that we summarize possible I/O strategies in Java and give a thorough Java I/O performance evaluation and analysis in comparison to C/C++. [1] describes in detail possible strategies for improving Java I/O. However, it gives no convincing experiments to show how well those strategies work, nor does it study Java I/O in comparison to that of C/C++. [2] compares Java I/O to that of C/C++ and proposes bulk I/O extensions. However, that paper mainly focuses on parallel Java I/O for specific applications instead of examining Java I/O in general.

1.3 Organization

The rest of this paper is organized as follows. In Section 2, we describe the basic I/O mechanisms defined in Java. In Section 3, we discuss our test methodology and experiment design. We then present the corresponding experiment results and analysis in Section 4. Conclusions and ideas for future work are presented in Section 5.

2 Java I/O Overview

To understand the issues associated with performing I/O in Java, it is necessary to briefly review the Java I/O model. When discussing Java I/O, it is worth noting that the Java programming language assumes two distinct types of disk file organization. One is based on streams of bytes, the other on character sequences. Byte-oriented I/O includes bytes, integers, floats, doubles, and so forth; text-oriented I/O includes characters and text. In the Java language, a character is
represented using two bytes instead of the one-byte representation in C/C++. Because of this, some translation is required to handle characters in file I/O. In this project, since our major concern is to compare Java I/O to that of C/C++, we will focus on byte-oriented I/O.

In Java, byte-oriented I/O is handled by input streams and output streams, where a stream is an ordered sequence of bytes of unknown length. Java provides a rich set of classes and methods for operating on byte input and output streams. These classes are hierarchical, and at the base of this hierarchy are the abstract classes InputStream and OutputStream. It is useful to briefly discuss this class hierarchy in order to clarify why we are interested in FileInputStream, FileOutputStream, BufferedInputStream, BufferedOutputStream, and RandomAccessFile in our test cases. Figure 2.1 provides a graphical representation of this I/O hierarchy. Note that we have not included every class that deals with byte-oriented I/O, but only those classes that are pertinent to our discussion.

[Figure 2.1: Pertinent Java I/O classes hierarchy. Classes shown: InputStream, OutputStream, RandomAccessFile, FileInputStream, FileOutputStream, FilterInputStream, FilterOutputStream, BufferedInputStream, BufferedOutputStream, DataInputStream, DataOutputStream.]

2.1 InputStream and OutputStream Classes

The abstract classes InputStream and OutputStream are the foundation for all input and output streams. They define methods for reading/writing raw bytes from/to streams. For example, the InputStream class provides methods for reading a single byte, a byte array, or reading the available data into a particular region of a byte array. The OutputStream class provides methods for writing that are analogous to those of InputStream.

2.2 File Input and Output Streams

The FileInputStream and FileOutputStream classes are concrete subclasses of InputStream and OutputStream, respectively, which provide a mechanism to read from and write to files sequentially.
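The byte-stream methods described in Sections 2.1 and 2.2 can be illustrated with a minimal sketch. It is not the code used in the experiments; the class name RawFileStreams, its helper methods, and the temporary file are hypothetical and chosen only for illustration. It contrasts reading a file one byte at a time with reading into a region of a byte array, using only FileInputStream and FileOutputStream.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical illustration class; not taken from the report's experiments.
class RawFileStreams {
    // Write the whole array to the file with one bulk write call.
    static void writeAll(File file, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(data);
        }
    }

    // Read the file one byte at a time; every read() is a separate
    // request to the unbuffered file stream.
    static long countBytesSingle(File file) throws IOException {
        long count = 0;
        try (FileInputStream in = new FileInputStream(file)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    // Read the file into a region of a byte array, chunk by chunk.
    // Assumes chunkSize is positive.
    static long countBytesChunked(File file, int chunkSize) throws IOException {
        long count = 0;
        byte[] chunk = new byte[chunkSize];
        try (FileInputStream in = new FileInputStream(file)) {
            int n;
            while ((n = in.read(chunk, 0, chunk.length)) != -1) {
                count += n;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("raw-io-sketch", ".bin");
        tmp.deleteOnExit();
        writeAll(tmp, new byte[10_000]);
        System.out.println(countBytesSingle(tmp));
        System.out.println(countBytesChunked(tmp, 4096));
    }
}
```

Both read loops return the same byte count; they differ only in how many read calls reach the file stream, which is the kind of difference the operation-size experiments are concerned with.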
Both classes provide all the methods of their superclasses. These two classes are the lowest-level file I/O classes provided to users.

2.3 Filter Streams

Filter streams provide methods to chain streams together to build composite streams. For example, a BufferedOutputStream can be chained to a FileOutputStream to reduce the number of calls to the file system. The FilterInputStream and FilterOutputStream classes also define a number of subclasses that manipulate the data of an underlying stream.

2.4 Buffered Input
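The chaining described in Section 2.3 can be sketched as follows. This is only an illustration under stated assumptions: the class ChainedStreams, the CountingOutputStream helper, and the writeBytes method are hypothetical names introduced here, and the counter measures write calls reaching the wrapped FileOutputStream at the Java level, not actual system calls.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical illustration class; not taken from the report's experiments.
class ChainedStreams {
    // Hypothetical helper: counts write calls that reach the wrapped stream.
    static class CountingOutputStream extends FilterOutputStream {
        long calls = 0;

        CountingOutputStream(OutputStream out) {
            super(out);
        }

        @Override
        public void write(int b) throws IOException {
            calls++;
            out.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            calls++;
            out.write(b, off, len); // forward as one bulk write
        }
    }

    // Writes `total` single bytes. If bufferSize > 0, a BufferedOutputStream
    // is chained in front of the file stream; otherwise bytes go straight through.
    // Returns the number of write calls that reached the file stream.
    static long writeBytes(File file, int total, int bufferSize) throws IOException {
        CountingOutputStream counter =
                new CountingOutputStream(new FileOutputStream(file));
        OutputStream chained = bufferSize > 0
                ? new BufferedOutputStream(counter, bufferSize)
                : counter;
        try (OutputStream out = chained) {
            for (int i = 0; i < total; i++) {
                out.write(i); // only the low-order byte of i is written
            }
        }
        return counter.calls;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("chained-sketch", ".bin");
        tmp.deleteOnExit();
        System.out.println("unbuffered calls: " + writeBytes(tmp, 10_000, 0));
        System.out.println("buffered calls:   " + writeBytes(tmp, 10_000, 1024));
    }
}
```

With the buffer in place, the single-byte writes are collected in memory and forwarded in bulk, so far fewer calls reach the file stream than in the unbuffered case.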