
An FPGA-based Simulator for Datacenter Networks




Zhangxi Tan, Krste Asanovic, David Patterson
Computer Science Division, UC Berkeley, CA
xtan@eecs.berkeley.edu, krste@eecs.berkeley.edu, pattrsn@eecs.berkeley.edu

ABSTRACT

We describe an FPGA-based datacenter network simulator for researchers to rapidly experiment with O(10,000)-node datacenter network architectures. Our simulation approach configures the FPGA hardware to implement abstract models of key datacenter building blocks, including all levels of switches and servers. We model servers using a complete SPARC v8 ISA implementation, enabling each node to run real node software, such as LAMP and Hadoop. Our initial implementation simulates a 64-server system and has successfully reproduced the TCP incast throughput collapse problem. When running a modern parallel benchmark, simulation performance is two orders of magnitude faster than a popular full-system software simulator. We plan to scale up our testbed to run on multiple BEE3 FPGA boards, where each board is capable of simulating 1500 servers with switches.

1 INTRODUCTION

In recent years, datacenters have been growing rapidly to scales of 10,000 to 100,000 servers [18]. Many key technologies make such incredible scaling possible, including modularized container-based datacenter construction and server virtualization. Traditionally, datacenter networks employ a fat-tree-like three-tier hierarchy containing thousands of switches at all levels: rack level, aggregate level, and core level [1].

As observed in [13], the network infrastructure is one of the most vital optimizations in a datacenter. First, networking infrastructure has a significant impact on server utilization, which is an important factor in datacenter power consumption. Second, network infrastructure is crucial for supporting data-intensive Map-Reduce jobs. Finally, network infrastructure accounts for 18% of the monthly datacenter costs, which is the third
largest contributing factor. In addition, existing large commercial switches and routers command very healthy margins, despite being relatively unreliable [26]. Sometimes, correlated failures are found in replicated million-dollar units [26]. Therefore, many researchers have proposed novel datacenter network architectures [14, 15, 17, 22, 25, 26], with most of them focusing on new switch designs. There are also several new network products emphasizing low latency and simple switch designs [3, 4].

When comparing these new network architectures, we found a wide variety of design choices in almost every aspect of the design space, such as switch designs, network topology, protocols, and applications. For example, there is an ongoing debate between low-radix and high-radix switch design. We believe these basic disagreements about fundamental design decisions are due to the different observations and assumptions taken from various existing datacenter infrastructures and applications, and the lack of a sound methodology to evaluate new options. Most proposed designs have only been tested with a very small testbed running unrealistic microbenchmarks, as it is very difficult to evaluate network architecture innovations at scale without first building a large datacenter.

To address the above issue, we propose using Field-Programmable Gate Arrays (FPGAs) to build a reconfigurable simulation testbed at the scale of O(10,000) nodes. Each node in the testbed is capable of running real datacenter applications. Furthermore, network elements in our testbed provide detailed visibility so that we can examine the complex network behavior that administrators see when deploying equivalently scaled datacenter software.

We built the testbed on top of a cost-efficient FPGA-based full-system manycore simulator, RAMP Gold [24]. Instead of mapping the real target hardware directly, we build several abstracted models of key datacenter components and compose them together in FPGAs. We can then construct a 10,000-node system from a rack of
multi-FPGA boards, e.g., the BEE3 [10] system. To the best of our knowledge, our approach will probably be the first to simulate datacenter hardware along with real software at such a scale. The testbed also provides an excellent environment to quantitatively analyze and compare existing network architecture proposals.

We show that although the simulation performance is slower than prototyping a datacenter using real hardware, abstract FPGA models allow flexible parameterization and are still two orders of magnitude faster than software simulators at the equivalent level of detail. As a proof of concept, we built a prototype of our simulator in a single Xilinx Virtex-5 LX110 FPGA, simulating 64 servers connecting to a 64-port rack switch. Employing this testbed, we have successfully reproduced the TCP Incast throughput collapse effect [27], which occurs in real datacenters. We also show the importance of simulating real node software when studying the TCP Incast problem.

Network Architecture                Testbed                               Scale                      Workload
Policy-aware switching layer [17]   Click software router                 Single switch              Microbenchmark
DCell [16]                          Commercial hardware                   20 nodes                   Synthetic workload
Portland (v1) [6]                   Virtual machine + commercial switch   20 switches + 16 servers   Microbenchmark
Portland (v2) [22]                  Virtual machine + NetFPGA             20 switches + 16 servers   Synthetic workload
BCube [15]                          Commercial hardware + NetFPGA         8 switches + 16 servers    Microbenchmark
VL2 [14]                            Commercial hardware                   10 servers + 10 switches   Microbenchmark
Thacker's container network [26]    Prototyping with FPGA boards

Table 1: Datacenter network architecture proposals and their evaluations

2 EVALUATING DATACENTER NETWORKS

We begin by identifying the key issues in evaluating datacenter networks. Several recent novel network architectures employ a simple, low-latency, supercomputer-like interconnect. For example, the Sun Infiniband datacenter switch [3] has a 300 ns port-to-port latency, as opposed to the 7-8 µs of common Gigabit Ethernet switches. As a result, evaluating datacenter network architectures
really requires simulating a computer system with the following three features:

1. Scale: Datacenters contain O(10,000) servers or more.

2. Performance: Large datacenter switches have 48-96 ports and are massively parallel. Each port has 1-4 K flow tables and several input/output packet buffers. In the worst case, there are 200 concurrent events every clock cycle.

3. Accuracy: A datacenter network operates at nanosecond time scales. For example, transmitting a 64-byte packet on a 10 Gbps link takes only 50 ns, which is
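
The 64-byte figure in the Accuracy item can be checked with a short calculation. The sketch below is not from the paper: the function name `serialization_delay_ns` is hypothetical, and the calculation assumes that only the packet's own bits count, ignoring the Ethernet preamble and inter-frame gap.

```python
def serialization_delay_ns(packet_bytes: int, link_bps: float) -> float:
    """Return the time, in nanoseconds, needed to clock a packet onto a link.

    Assumption: only the packet's own bytes are counted; framing overhead
    such as the Ethernet preamble and inter-frame gap is ignored.
    """
    bits = packet_bytes * 8
    seconds = bits / link_bps
    return seconds * 1e9


if __name__ == "__main__":
    # 64-byte packet on a 10 Gbps link: 512 bits / 10^10 bit/s, roughly 51 ns.
    print(f"10 Gbps: {serialization_delay_ns(64, 10e9):.1f} ns")
    # Same packet on a 1 Gbps link, for comparison: roughly 512 ns.
    print(f" 1 Gbps: {serialization_delay_ns(64, 1e9):.1f} ns")
```

Under these assumptions the result is about 51 ns, consistent with the rounded 50 ns quoted in the text. A simulator that must order events on 10 Gbps links therefore needs a timing resolution of tens of nanoseconds or finer.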

