Unformatted text preview:

CS107 Handout 07 Spring 2008 April 7, 2008 The Ins and Outs of C Arrays This handout was written by Nick Parlante and Julie Zelenski. C Arrays As you recall, a C array is formed by laying out all the elements contiguously in memory from low to high. The array as a whole is referred to by the address of the first element. For example, the variable intArray below is synonymous with the address of the first element and can be used in expressions like an int *. int intArray[6]; intArray[0]...intArray[1] The programmer can refer to elements in the array with the simple [] syntax such as intArray[1]. This scheme works by combing the base address of the array with the index to compute the base address of the desired element in the array, using some simple arithmetic. Each element takes up a fixed number of bytes known at compile-time. So address of the nth element in the array (0-based indexing) will be at an offset of (n * element_size) bytes from the base address of the whole array. address of nth element = address_of_0th_element + (n * element_size_in_bytes) The square bracket syntax [] deals with this address arithmetic for you, but it's useful to know what it's doing. The [] multiplies the integer index by the element size, adds the resulting offset to the array base address, and finally dereferences the resulting pointer to get to the desired element. intArray[3] = 13; [0] [1] [2] [3] [4] [5]0 4 8 12 16 20indexOffset in bytesAssume sizeof(int) = 4.Each element takes up 4 bytes.13base + 3*elem_size2 '+' Syntax In a closely related piece of syntax, adding an integer to a pointer does the same offset computation, but leaves the result as a pointer. The square bracket syntax dereferences that pointer to access the nth element while the + syntax just computes the pointer to the nth element. So the expression (intArray + 3) is a pointer to the integer intArray[3]. (intArray + 3) is of type (int*) while intArray[3] is of type int. The two expression only differ by whether the pointer is dereferenced or not. So the expression (intArray + 3) is exactly equivalent to the expression (&(intArray[3])). In fact those two probably compile to exactly the same code. They both represent a pointer to the element at index 3. Any [] expression can be written with the + syntax instead. We just need to add in the pointer dereference. So intArray[3] is exactly equivalent to *(intArray + 3). For most purposes, it's easiest and most readable to use the [] syntax. Every once in a while the + is convenient if you needed a pointer to the element instead of the element itself. Pointer++ If p is a pointer to an element in an array, then (p+1) points to the next element in the array. Code can exploit this using the construct p++ to step a pointer over the elements in an array. It doesn't help readability any, so I can't recommend the technique, but you may see it in code written by others. Here's a sequence of versions of strcpy written in order: from most verbose to most cryptic. In the first one, the straightforward for loop needs a little follow-up to ensure that the terminating null character is copied over. The second removes that opportunity for error by rearranging into a while loop and moving the assignment into the test. The last two are cute (and they demonstrate using ++ on pointers), but not really the sort of code you want to maintain. Among the four, I think the second is the best stylistically. With a smart compiler, all four will compile to basically the same code with the same efficiency. void strcpy1(char dest[], const char source[]) { int i; for (i = 0; source[i] != '\0'; i++) dest[i] = source[i]; dest[i] = '\0'; // don't forget to null-terminate! }3 // Move the assignment into the test void strcpy2(char dest[], const char source[]) { int i = 0; while ((dest[i] = source[i]) != '\0') i++; } // Get rid of i and just move the pointers. // Relies on the precedence of * and ++. void strcpy3(char dest[], const char source[]) { while ((*dest++ = *source++) != '\0') ; } // Rely on the fact that '\0' is equivalent to FALSE void strcpy4(char dest[], const char source[]) { while (*dest++ = *source++) ; } Pointer Type Effects Both [] and + implicitly use the compile time type of the pointer to compute the element_size which effects the offset arithmetic. When looking at code, it's easy to assume that everything is in the units of bytes. int *p; p = p + 12; // at run-time, what does this add to p? 12? The above code does not add the number 12 to the address in p— that would increment p by 12 bytes. The code above increments p by 12 ints. Each int takes 4 bytes, so at run time the code will effectively increment the address in p by 48. The compiler figures all this out based on the type of the pointer. You can manipulate this using casts. For example, the following code really does just add 12 to the address in the pointer p. It works by telling the compiler that the pointer points to char instead of int. The size of char is defined to be exactly 1 byte (or whatever the smallest addressable unit is on the computer). In other words, sizeof(char) is always 1. We then cast the resulting (char*) back to an (int*). You can use casting like this to change the code the compiler generates. The compiler just blindly follows your orders. p = (int*) ((char*)p + 12);4 Arithmetic on a void pointer For a (void*) pointer, array subscripting and pointer arithmetic don't quite make sense. These manipulations include implicit multiplication by the size of the element type, and what is sizeof(void)? Unknown! Some compilers assume that it should treat it like a (char*), but if you were to depend on this you would be creating non-portable code. To be precise and correct, you should cast the (void*) to (char*) before doing any math to make clear that all arithmetic is done in one-byte increments and any necessary multiplication will be done explicitly. void *ptr; p = (char*)ptr + 4; // increments ptr by exactly 4, no extra multiplication Note that you do not need to cast the result back to (void*), a (void*) is the "universal recipient" of pointer types and can be freely assigned any type of pointer. It is best to use casts only when you absolutely must. Arrays and Pointers One effect of the C array scheme is that the compiler does not meaningfully distinguish between arrays and pointers— they both just look like pointers. In the following


View Full Document

Stanford CS 107 - Study Notes

Download Study Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Study Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Study Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?