© 2003 A.W. Krings, CS449/549 Fault-Tolerant Systems, Sequence 30: Tandem

• Tandem NonStop Systems - Cyclone
– Affordable Commercial Database Systems with very long MTTF
– Modularity
  » units of service, failure, diagnosis, repair, growth
  » fault containment regions
  » expandable for performance
– Fail Fast Mode
  » terminate operation immediately after error detection
  » reduces error propagation
  » single error correction / double error detection
  » ECC, data coding
  » hardware self checking
  » software and firmware consistency checks
  » after a failure, the OS redistributes the failed processor's applications across the remaining processors
  » load balancing is transparent to the user
– Architecture (Overview in Pra96, Fig. 4.1)
  » loosely coupled MIMD, up to 16 processors
  » dual processors, independent & asynchronous
  » heavy use of low-level dual redundancy
  » multiple, physically separate sections
  » each section: up to 4 processors, communication via Dynabus
  » write through cache
  » mirrored disks
– Processor Pair
  » primary/backup approach
  » primary sends checkpoints
  » when primary proc. fails:
    • backup becomes primary
    • rolls back to last checkpoint
    • picks up from that point
– Hardware Fault Tolerance
  » single fault tolerance
  » primary objective: prevent a single fault from bringing down the system
  » redundant hardware: processors, buses, I/O controllers, disks, power supplies
  » spare RAM chips
  » each processor has its own power supply
– Software Fault Tolerance
  » processors can detect other halted processors
  » "I'm alive" protocol
  » GUARDIAN 90 OS maintains idle backups of user processes
  » processor consistency check via checkpoint messages
– Networking and I/O
  » Networks
    • Dynabus: 40 MB/s = 2 independent 20 MB/s buses
    • Dynabus+: 4 unidirectional fiber optics,
      – up to 50 m physical separation
      – robust to electro-magnetic interference
  » I/O
    • processor can support 2 I/O systems
    • each system has 2 channels
    • each channel supports up to 32 I/O devices
    • burst data of 5 MB/s = 10 MB/s per processor
    • DMA I/O
    • mirrored disks (dual ported)
– On-line Maintenance
  » Field replaceable units (FRU)
    • processors
    • I/O controllers
    • fans
    • power supplies
    • can be installed/replaced by user
  » Warm swaps of FRU
  » Effective MTTR = milliseconds => very high Availability

• Tandem - Himalaya
– Main features
  » loosely coupled massively parallel computer
  » 2 to 4080 processors
  » cross-coupled MIPS R4400 RISC processors
    • one logical processor
    • both processors operate in lockstep
  » 32K primary cache, 4MB secondary cache
  » up to 256 MB RAM
  » 4 independent I/O channels
  » fiber-optic TorusNet
    • horizontal controller => 4 sections (each section = 4 processors)
    • vertical controller => 14 nodes = domain
    • depth controller => 16 domains

• Himalaya 2000
  K2000SE server
  optional expansion cabinet
– TorusNet
  » section
  » node
  » ring
– K200, K2000, K20000 Servers Spec. Features:
  » Target: online transaction processing
  » standard RISC technology
  » loosely coupled architecture
  » dual interprocessor buses
  » dual-ported controllers
  » fault-tolerant power subsystem
  » in case of a power outage, server memory is preserved via integrated battery backup
– NonStop Operating system
  » core of Tandem’s open systems environment
  » enables operations to run as primary and backup processes
  » before performing any critical function, the primary sends the backup process a checkpoint message containing data and status information
  » kernel supports end-to-end integrity features
  » micro-kernel is message-based (parallel processing software)
  » kernel supports application program and operations control interfaces called "personalities"
  » these personalities support applications from different platforms
  » e.g. relational database management personalities: applications can be developed using:
    • SQL, Data Access Language (Macintosh), SQL Server (Microsoft/Sybase), ODBC (Microsoft), Oracle Tools (Oracle)
  » other personalities are transaction processing personalities, which allow parallel transaction processing services for different systems
  » "guardian services" allow compatibility with Tandem applications
  » "open systems services" support UNIX
  » Transaction Manager (NonStop TM/MP) deals with the effects of incomplete transactions, system failures and network failures
  » Remote Duplicate Facility allows data to be located at a remote site to shield it from environmental disaster
  » Safeguard security management facility deals with security issues
  » Network support includes TCP/IP, IPX/SPX, NETBIOS, AppleTalk, SNA, OSI and ATM
– Maintenance
  » key data logged and evaluated by an expert system to identify potential problems
  » can automatically dial the online support center
  » field replaceable units can be exchanged by warm swaps
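
The processor-pair takeover described in the slides (primary sends checkpoints; on failure the backup becomes primary, rolls back to the last checkpoint, and picks up from that point) can be illustrated with a small Python sketch. This is only a toy model under stated assumptions: the names `Process`, `ProcessPair`, `update`, and `fail_primary` are hypothetical and are not part of GUARDIAN or the NonStop kernel, and a real checkpoint travels as a message over the interprocessor bus rather than as an in-memory copy.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """One half of a hypothetical process pair (illustrative only)."""
    name: str
    state: dict = field(default_factory=dict)       # live working state
    checkpoint: dict = field(default_factory=dict)  # last checkpoint received


class ProcessPair:
    """Toy primary/backup pair; not the actual Tandem mechanism."""

    def __init__(self):
        self.primary = Process("primary")
        self.backup = Process("backup")

    def update(self, key, value, send_checkpoint=True):
        """Apply an update on the primary and optionally checkpoint it."""
        self.primary.state[key] = value
        if send_checkpoint:
            # Checkpoint message: a copy of the primary's data and status.
            self.backup.checkpoint = dict(self.primary.state)

    def fail_primary(self):
        """Primary halts: the backup takes over from the last checkpoint."""
        self.backup.state = dict(self.backup.checkpoint)  # roll back
        self.primary = self.backup                        # backup becomes primary
        self.primary.name = "primary"
        self.backup = None                                # no backup until repair


pair = ProcessPair()
pair.update("balance", 100)                        # checkpointed
pair.update("balance", 70, send_checkpoint=False)  # lost if the primary fails now
pair.fail_primary()
print(pair.primary.state)  # {'balance': 100}
```

The second update is deliberately left without a checkpoint to show that work done after the last checkpoint is lost on takeover and must be redone, which is why the slides note that a checkpoint is sent before any critical function.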

