Portability
The material for this lecture is drawn, in part, from The Practice of Programming (Kernighan & Pike), Chapter 8
Professor Jennifer Rexford
http://www.cs.princeton.edu/~jrex

Goals of this Lecture
• Learn to write code that works with multiple:
  • Hardware platforms
  • Operating systems
  • Compilers
  • Human cultures
• Why?
  • Moving existing code to a new context is easier/cheaper than writing new code for the new context
  • Code that is portable is (by definition) easier to move; portability reduces software costs
  • Relative to other high-level languages (e.g., Java), C is notoriously non-portable

The Real World is Heterogeneous
• Multiple kinds of hardware
  • 32-bit Intel Architecture
  • 64-bit IA, PowerPC, SPARC, MIPS, ARM, …
• Multiple operating systems
  • Linux
  • Windows, Mac, Sun, AIX, …
• Multiple character sets
  • ASCII
  • Latin-1, Unicode, …
• Multiple human alphabets and languages

Portability
• Goal: Run program on any system
  • No modifications to source code required
  • Program continues to perform correctly
    • Ideally, the program performs well too

C is Notoriously Non-Portable
• Recall C design goals…
  • Create Unix operating system and associated software
  • Reasonably "high level", but…
  • Close to the hardware for efficiency
• So C90 is underspecified
  • Compiler designer has freedom to reflect the design of the underlying hardware
• But hardware systems differ!
  • So C compilers differ
• Extra care is required to write portable C code

General Heuristics
Some general portability heuristics…

Intersection
(1) Program to the intersection
• Use only features that are common to all target environments
• I.e., program to the intersection of features, not the union
• When that's not possible…

Encapsulation
(2) Encapsulate
• Localize and encapsulate features that are not in the intersection
• Use parallel source code files, so non-intersection code can be chosen at link-time
• Use parallel data files, so non-intersection data (e.g.
textual messages) can be chosen at run-time
• When that's not possible, as a last resort…

Conditional Compilation
(3) Use conditional compilation

    #ifdef __UNIX__
    /* Unix-specific code */
    #endif
    …
    #ifdef __WINDOWS__
    /* MS Windows-specific code */
    #endif
    …

• And above all…

Test!!!
(4) Test the program with multiple:
• Hardware (Intel, MIPS, SPARC, …)
• Operating systems (Linux, Solaris, MS Windows, …)
• Compilers (GNU, MS Visual Studio, …)
• Cultures (United States, Europe, Asia, …)

Hardware Differences
• Some hardware differences, and corresponding portability heuristics…

Natural Word Size
• Obstacle: Natural word size
  • In some systems, natural word size is 4 bytes
  • In some (esp. older) systems, natural word size is 2 bytes
  • In some (esp. newer) systems, natural word size is 8 bytes
• C90 intentionally does not specify sizeof(int); it depends upon the natural word size of the underlying hardware

Natural Word Size (cont.)
(5) Don't assume data type sizes
• Not portable:

    int *p;
    …
    p = malloc(4);
    …

• Portable:

    int *p;
    …
    p = malloc(sizeof(int));
    …

Right Shift
• Obstacle: Right shift operation
  • In some systems, right shift operation is logical
    • Right shift of a negative signed int fills with zeroes
  • In some systems, right shift operation is arithmetic
    • Right shift of a negative signed int fills with ones
• C90 intentionally does not specify the result of right-shifting a negative signed int; it depends upon the right shift operator of the underlying hardware

Right Shift (cont.)
(6) Don't right-shift signed ints
• Not portable:

    …
    -3 >> 1
    …

  Logical shift => 2147483646 (with a 32-bit int)
  Arithmetic shift => -2
• Portable:

    …
    /* Don't do that!!! */
    …

Byte Order
• Obstacle: Byte order
  • Some systems (e.g. Intel) use little endian byte order
    • Least significant byte of a multi-byte entity is stored at lowest memory address
  • Some systems (e.g.
SPARC) use big endian byte order
    • Most significant byte of a multi-byte entity is stored at lowest memory address

  The int 5 at address 1000 (little endian):

    1000: 00000101
    1001: 00000000
    1002: 00000000
    1003: 00000000

  The int 5 at address 1000 (big endian):

    1000: 00000000
    1001: 00000000
    1002: 00000000
    1003: 00000101

Byte Order (cont.)
(7) Don't rely on byte order in code
• Not portable:

    int i = 5;
    char c;
    …
    c = *(char*)&i; /* Silly, but legal */

  Little endian: c = 5
  Big endian: c = 0
• Portable:

    int i = 5;
    char c;
    …
    /* Don't do that! Or... */
    c = (char)i;

Byte Order (cont.)
(8) Use text for data exchange
• Not portable:
  Run on a little endian computer (fwrite() writes raw data to a file):

    unsigned short s = 5;
    FILE *f = fopen("myfile", "w");
    fwrite(&s, sizeof(unsigned short), 1, f);

  myfile: 00000101 00000000

  Run on a big endian computer (fread() reads raw data from a file):

    unsigned short s;
    FILE *f = fopen("myfile", "r");
    fread(&s, sizeof(unsigned short), 1, f);

  Reads 1280!!!

Byte Order (cont.)
• Portable:
  Run on a big or little endian computer (fprintf() converts raw data to ASCII text):

    unsigned short s = 5;
    FILE *f = fopen("myfile", "w");
    fprintf(f, "%hu", s);

  myfile: 00110101 (ASCII code for '5')

  Run on a big or little endian computer (fscanf() reads ASCII text and converts to raw data):

    unsigned short s;
    FILE *f = fopen("myfile", "r");
    fscanf(f, "%hu", &s);

  Reads 5

Byte Order (cont.)
If you must exchange raw data…
(9) Write and read one byte at a time
• Decide on big-endian data exchange format
  Run on a big or little endian computer:

    unsigned short s = 5;
    FILE *f = fopen("myfile", "w");
    fputc(s >> 8, f);   /* high-order byte */
    fputc(s & 0xFF, f); /* low-order byte */

  myfile: 00000000 00000101

  Run on a big or little endian computer:

    unsigned short s;
    FILE *f = fopen("myfile", "r");
    s = fgetc(f) << 8;    /* high-order byte */
    s |= fgetc(f) & 0xFF; /* low-order byte */

  Reads 5

OS Differences
• Some operating system differences, and corresponding portability heuristics…

End-of-Line Characters
• Obstacle: Representation of "end-of-line"
  • Unix (including Mac OS X) represents end-of-line as 1 byte: 00001010
(binary)
  • Mac OS 9 represents end-of-line as 1 byte: 00001101 (binary)
  • MS Windows represents end-of-line as 2 bytes: 00001101 00001010 (binary)

End-of-Line Characters (cont.)
(10) Use binary mode for textual data exchange
• Not portable:
  Open the file in ordinary text mode:

    FILE *f = fopen("myfile", "w");
    fputc('\n', f);

  Run on Unix: 00001010 (\n)
  Run on Mac OS 9: 00001101 (\r)
  Run on MS Windows: 00001101 00001010 (\r \n)
  • Trouble if read via fgetc() on "wrong" operating system

End-of-Line Characters (cont.)
• Portable:
  • No problem if read via fgetc() in binary mode on "wrong" operating system
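
The portable code for heuristic (10) is cut off in the source, so the following is only a minimal sketch of the idea. It assumes that "binary mode" means passing "wb" and "rb" to fopen(), which asks the C library not to translate '\n' on output or input. The helper names write_line_binary and count_bytes_binary are hypothetical and do not come from the lecture.

```c
#include <stdio.h>

/* Hypothetical helper: write text followed by exactly one '\n' byte.
   "wb" requests binary mode, so no end-of-line translation is done. */
int write_line_binary(const char *path, const char *text)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    if (fputs(text, f) == EOF || fputc('\n', f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Hypothetical helper: count the bytes seen by fgetc() in binary mode.
   Returns -1 if the file cannot be opened. */
long count_bytes_binary(const char *path)
{
    FILE *f = fopen(path, "rb");
    long n = 0;
    if (f == NULL)
        return -1;
    while (fgetc(f) != EOF)
        n++;
    fclose(f);
    return n;
}
```

With these helpers, writing the two-character text "hi" should produce a 3-byte file on every operating system, whereas the text-mode version above would produce 4 bytes on MS Windows.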
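
Heuristic (9), writing and reading raw data one byte at a time in an agreed big-endian exchange format, can be wrapped in a pair of small functions. This is a sketch under stated assumptions rather than the lecture's own code: the names write_u16_be and read_u16_be are hypothetical, only the low 16 bits of the value are exchanged, and the caller is assumed to open the file in binary mode ("wb" / "rb") in line with heuristic (10).

```c
#include <stdio.h>

/* Hypothetical helper: write the low 16 bits of v, high-order byte first.
   Returns 0 on success, -1 on a write error. */
int write_u16_be(FILE *f, unsigned int v)
{
    if (fputc((int)((v >> 8) & 0xFFu), f) == EOF)  /* high-order byte */
        return -1;
    if (fputc((int)(v & 0xFFu), f) == EOF)         /* low-order byte */
        return -1;
    return 0;
}

/* Hypothetical helper: read two bytes, high-order byte first, into *out.
   Returns 0 on success, -1 if either byte is missing. */
int read_u16_be(FILE *f, unsigned int *out)
{
    int hi = fgetc(f);
    int lo;
    if (hi == EOF)
        return -1;
    lo = fgetc(f);
    if (lo == EOF)
        return -1;
    *out = ((unsigned int)hi << 8) | (unsigned int)lo;
    return 0;
}
```

Because the shifts operate on values rather than on memory layout, the same two bytes are written and the same value is recovered regardless of the byte order of the machine running the code. Checking each fgetc() result against EOF before combining the bytes also avoids silently building a value from a truncated file.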