Unformatted text preview:

RELATIVE INPUT/OUTPUT FILES. COMMANDS - ABOUT THE SAME AS INDEXED COMMANDS. LOGIC IS QUITE DIFFERENTDEFINITION OF A RELATIVE FILE:. ONE FILE ONLY - ONLY A DATA COMPONENT. ACCESS: RANDOMLY BY A RELATIVE RECORD NUMBER.(actually use a field within the record (symbolic key) to derive the Relative Key, which is Anormally@ NOT part of the record.). A RELATIVE FILE IS LIKE AN AEXTERNAL ARRAY@IT MAY HAVE UNFILLED ELEMENTS.Relative Record #1Relative Record #2Relative Record #3.....Relative Record #n. KEY (SYMBOLIC) AND RELATIVE RECORD NUMBER (Relative Key) CAN BE THE SAME... MAY BE FIELD W/I RECORD WITH VALUES 1,2,...,N... SYMBOLIC KEY IS THEN RELATIVE KEY.CONSIDER: 1A RECORD OF DATA GATHERED FROM A LAB EXPERIMENT.LAB EXPERIMENT #1 RESULTED IN ..... DATALAB EXPERIMENT #2 RESULTED IN .... DATA001 LAB DATA FIELDS002 LAB DATA FIELDSETC.. CONSECUTIVE RELEATIVE RECORD NUMBERS CAN OCCUR QUITE NATURALLY IN PHYSICAL PHENOMENON. NORMALLY, MUST MODIFY A SYMBOLIC KEY AND CONVERT INTO A RELATIVE RECORD NUMBER PRIOR TO ANY I/O.. DISCUSS: PRACTICALITY OF USING SSAN AS RELATIVE KEY?. DISCUSS: INPUT SYMBOLIC KEY WITH VALUES 100, 200, ETC.?. DISCUSS SYMBOLIC KEY W/RANGE 40,000 TO 50,000?DIVIDE EACH KEY BY 40,000 AND TAKE REMAINDER.. USE REMAINDER AS RELATIVE KEY... ALLOCATE SPACE FOR 1000 RECORDS W/RELATIVE KEYS:1 -> 1000.. THIS LEADS US TO AN IMPORTANT CONCEPT:ENTER: AN AALGORITHM@ FOR CONVERTING A SYMBOLIC KEY TO A RELATIVE KEY (RELATIVE RECORD NUMBER). THE PROCESS OF AKEY TO ADDRESS@ TRANSFORMATION IS REFERRED TO AS AHASHING.@. OBJECTIVE OF HASHING: AHASH TO ARRIVE AT A RELATIVE 2RECORD NUMBER - PLACE WHERE RECORD WILL BE STOREDIN THE FILE.. AHASHING@ REFERS TO APPLYING AN ALGORITHM TO A SYMBOLIC KEY TO ARRIVE AT A RELATIVE KEY.N.B. RELATIVE RECORD NUMBER IS . ANORMALLY@ NOT FOUND WITHIN THE RECORD, . IS NOT STORED IN THE RECORD AFTER HASHING, AND . MUST BE COMPUTED FOR EVERY INPUT OR OUTPUT ACTION ATTEMPTING TO ACCESS THE TARGET RECORD.RELATIVE FILE PROCESSING IS VERY FAST - MUCH FASTER THAN INDEXED SEQUENTIAL.....WHY??NOT AS FAST AS A TABLE RESIDENT W/I PROGRAM, BUT SIMILAR.THERE ARE MANY ALGORITHMS FOR HASHING TO AN ADDRESS.(HASHING ALGORITHMS)LET=S CONSIDER A PRACTICAL EXAMPLE:CONSIDER A KEY OF RANGE 40000 TO 50000.WE WILL AHASH@ TO AN AADDRESS@ USING THE DIVISION/REMAINDER METHOD. WE WILL DIVIDE BY 40,000 AND ADD 1 TO PRODUCE KEYS WITH A RANGE OF 1 TO 10,000. THIS => EVERY RECORD HAS ITS OWN, UNIQUE SLOT IN FILE.. NOT TOO PRACTICAL, IF FILE CONTAINS 1000 RECORDS.. EXTREMELY WASTEFUL TO ALLOCATE SPACE FOR 10,000 3RECORDS WHEN WE KNOW ONLY 100 RECORDS WILL BE AACTIVE.@. MODIFY THE HASHING ALGORITHM:(RECALL: SYMBOLIC KEY VALUES: 40,000 -> 50,000).. DIVIDE BY 40,000. YIELDS REMAINDERS OF 1 TO 10,000.. DIVIDE REMAINDER BY 125, AND ADD 1 YIELDS A REMAINDER (RELATIVE RECORD NUMBER) OF 1 ... 125.EXAMPLES:1. 42000 = 2000 (=> QUOT = 16, REM=0) + 1 => REL KEY = 1 40000 1252. 42200 = 2200 (=> QUOT = 17, REM=75) + 1 => REL KEY =76 40000 1253. 42220 = 2220 (=> QUOT = 17, REM=95) + 1 => REL KEY =96 40000 1254. 42249 = 2249 (=>QUOT = 17, REM=124) + 1 => REL KEY = 125 40000 125LOOKS GOOD...ALL KEY-TO-ADDRESSS TRANSFORMATIONS SEEM TO HASH TO ADDRESSES BETWEEN 1 AND 125.CONSIDER THE SYMBOLIC KEYS: 42125, 42250, 42375, ... ALL HASH TO THE SAME ADDRESS!!!. 40,000 WOULD HASH TO 1 THAT IS, REL KEY = 1 . 42125 = 2125 (=> QUOT = 17, REM=0) + 1 => REL KEY = 1 40000 1254. 42250 = 2250 Q=18 W/REM = 0 + 1 => REL KEY = 1 40000 125THUS WE WILL HAVE MANY RECORDS WHO HASH TO SAME ADDRESS.THIS IS DESIRABLE!!WE ACCEPT THIS PHENOMENON TO AVOID HAVING TO ALLOCATE DISK SPACE FOR UNIQUE KEY-TO-ADDRESS TRANSFORMATIONS WHEN YOU KNOW THAT THE NUMBER OF RECORDS IN THE FILE IS ONLY A SMALL SUBSET OF THE RANGE OF VALUES. HAVING MULTIPLE RECORDS HASH TO THE SAME ADDRESS DOES, HOWEVER, PRESENT A PROBLEM.. CANNOT HAVE MORE THAN ONE RECORD PHYSICALLY OCCUPYING THE SAME SPACE ON THE DISK.. THUS, WE NEED TO DO SOMETHING ABOUT RECORDS THAT HASH TO AN OCCUPIED RELATIVE RECORD LOCATION. WE TYPICALLY ALLOCATE 20-25% MORE SPACE THAN THE FILE SIZE MIGHT SUGGEST.. EXTRAPOLATING: WE FEEL THAT FILESIZE = 1000 RECORDS;ALLOCATE SPACE FOR 1250 RECORDS... THIS DEPENDS UPON THE FILE DYNAMICS!!5 WE CALL THE SITUATION WHEN SEVERAL SYMBOLIC KEYS HASH TO THE SAME ADDRESS COLLISION.  TWO OR MORE RECORDS THAT HASH TO THE SAME ADDRESS ARE CALLED SYNONYMS.SO...LET=S LOOK AT SOME 1. HASHING ALGORITHMS, AND 2. COLLISION ALGORITHMS6HASHING ALGORITHMSNOTE: ONE GOAL IN CHOOSING ANY HASHING ALGORITHM SHOULD BE TO SPREAD OUT RECORDS AS UNIFORMLY AS POSSIBLE OVER THE RANGE OF ADDRESSES AVAILABLE.NOTE: AHASH@ MEANS TO CUT INTO PIECES....MUDDLE, CONFUSE.THE MORE CHARACTERS IN A KEY WE USE, IN GENERAL, THE MORE WE ARE ARANDOMIZING.@ LET=S CONSIDER A VERY GOOD ALGORITHM.... (USUALLY GIVES GOOD RESULTS...)1. REPRESENT THE KEY IN NUMERICAL FORM2. FOLD AND ADD3. DIVIDE BY A PRIME NUMBER AND USE THE REMAINDER ASTHE ADDRESS....1) IF KEY IS NUMERIC, THIS STEP IS DONE.IF NOT, USE THE ASCII CODE OF EACH CHARACTER AND USE IT TO FORM A NUMBERIF WE WERE TO USE A ANAME@ FIELD (THAT WOULD CONTAIN BLANKS), WE SHOULD USE THE ENTIRE KEY, RATHER THAN ONE/TWO CHARACTERS.BY USING MORE PARTS OF A KEY, WE INCREASE THE LIKELIHOOD THAT DIFFERENCES AMONG THE KEYS CAUSE DIFFERENCES IN ADDRESSES PRODUCED.EXTRA TIME IN PROCESSING IS INSIGNIFICANT WHEN COMPARED TO THE POTENTIAL IMPROVEMENT INPERFORMANCE.2) FOLD AND ADD....MEANS CHOPPING OFF PIECES OF THE NUMBER AND ADDING THEM TOGETHERIF DOING THIS IN C, CAN ADD CHARACTERS DIRECTLY.IN PASCAL, WE CAN ORD( ) TO OBTAIN INTEGER VALUESAME IN COBOL....(SEE COBOL 85 EXTENSIONS)SO, CAN ADD THE ASCII VALUES TOGETHER - EACH TWOCHARACTERS FORMING A FOUR DIGIT NUMBER.ADD THE FOUR DIGIT NUMBERS TOGETHER TO CREATE A FINAL VALUE...OR, IF KEY IS ALREADY NUMERIC, SAY FIVE DIGITS, 12345, MAY TREAT THESE AS 01 + 23 + 45. (= 69)3) DIVIDE BY THE SIZE OF THE ADDRESS SPACE.HERE WE WANT TO CUT DOWN THE SIZE OF THE NUMBER PRODUCED ABOVE SO IT FALLS WITHIN THE RANGE OF ADDRESSES OF RECORDS IN THE FILE.FOR ADDING THREE 2-DIGIT PAIRS, THE SUM COULD BE 99 + 99 + 99 = 297. DO WE WANT TO RESERVE ROOM FOR 297 RECORDS?SO, WE DIVIDE THE SUM BY A NUMBER THAT IS THE


View Full Document

UNF COP 3531 - Study Notes

Download Study Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Study Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Study Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?