DOC PREVIEW
UMD ASTR 415 - Data Representations

This preview shows page 1-2-3-4 out of 12 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Data RepresentationsData RepresentationsComputers store data as different variable types, e.g. integer, floating point, complex, etc.Different machines have different wordlengths, e.g. 4-byte ints on a 32-bit machine (Pentium), 8-byte ints on a 64-bit machine (Alpha).This makes (binary) data non-portable.IntegersIntegersAll data types represented by 0's and 1's.An integer value:N = # of bits in wordsi= value of bit i in binary string se.g. 0 0 0 0 0 1 1 0 = 22 + 21 = 6 for 8-bit word.Use "two's complement" method for sign.=∑=× −Integers, Cont'dIntegers, Cont'dLargest value that can be represented is 2N - 1.For 32-bit word this is 4,294,967,295.Arithmetic with integers is exact, except:When division results in remainder.Result exceeds largest representable integere.g. 2 × 109 + 3 × 109 = overflow errorNote multiplication by 2's can be achieved by left-shift, which is very fast (in C: "<<" operator).Two's ComplementTwo's ComplementExploits finite size of data representations (cyclic groups) and properties of binary arithmetic.To get negative of binary number, invert all bits and add 1 to the result.e.g. 1 = 0 0 0 0 0 0 0 1 in 8-bitinvert bits: 1 1 1 1 1 1 1 0add 1: 0 0 0 0 0 0 0 1result: 1 1 1 1 1 1 1 1 = -1In 8 bits, signed char ranges from -128 to +127.Negative Powers of 2Negative Powers of 2Binary notation can be extended to cover negative powers of 2, e.g. "110.101" is:1 × 22 + 1 × 21 + 1 × 2-1 + 1 × 2-3 = 6.625Can represent real numbers by specifying some location in the word as the "binary point" → fixed-point representation.In practice, use some bits for an exponent → floating-point representation.FloatsFloatsFor most machines these days, real numbers are represented by floating-point format:s = sign B = base (usually 2, sometimes 16)M = mantissa e = exponentE = bias, usually 127.In past, manufacturers used different number of bits for each of M and e → non-portable code.=× ×−Floats, Cont'dFloats, Cont'dCurrently, most manufacturers adopt IEEE standard:s = 1st bitNext 8 bits are eLast 23 bits are M, expressed as a binary fraction, either 1.F, or, if e=0, 0.F, where F is in base 2.Largest single-precision float fmax =2127≈ 1038.Smallest (and least precise!) fmin = 2-149 ≈ 10-45.Round-off ErrorRound-off ErrorNot all values along real axis can be represented.There are 1038 integers between fmin & fmax, but only 232 ≈ 109 bit patterns.Values < |10-45| result in "underflow" error.If value cannot be represented, next nearest value is produced. Difference between desired and actual value is called "round-off error" (RE).Round-off Error, Cont'dRound-off Error, Cont'dSmallest value em for which 1 + em > 1 is called "machine accuracy", typically ∼10-7 for 32 bits.Double precision greatly reduces em (~ 10-16).RE accumulates in a calculation:Random walk: total error N1/2 em after N operations.But algorithms rarely random → linear error N em.Round-off Error, Cont'dRound-off Error, Cont'dSubtraction of two very nearly equal numbers can give rise to large RE.e.g. Solution of quadratic equation......can go badly wrong whenever ac << b2 (Cf. PS#2).RE cannot be avoidedit is a consequence of using a finite number of bits to represent real values.=−±−  Truncation ErrorTruncation ErrorIn practice, most numerical algorithms approxi-mate desired solution with a finite number of artithmetic operations.e.g. evaluating integral by quadrature summing series using finite number of termsDifference between true solution and numerical approximation to solution is called "truncation error" (TE).Truncation Error, Cont'dTruncation Error, Cont'dTE exists even on "perfect" machine with no RE.TE is under programmer's control; much effort goes into reducing it.Usually RE and TE do not interact.Sometimes TE can amplify RE until it swamps calculation. Solution is then called unstable.e.g. Integer powers of Golden Mean (Cf.


View Full Document

UMD ASTR 415 - Data Representations

Download Data Representations
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Data Representations and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Data Representations 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?