DOC PREVIEW
UI STAT 5400 - Queries and SQL More on Data Integrity

This preview shows page 1 out of 4 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 4 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 4 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

122S:166SASQueries and SQLMore on Data IntegrityLecture 20Nov. 26, 2007Kate Cowles374 SH, [email protected] Query Language• query: a view of data which represents thedata from one or more tables• queries bui lt in a relational database usingStructured Query Language or SQL• SQL is the standard language for relationaldatabases• includes th e capability of manipulating boththe structure of a database and its data• most common use: to create a simple SE-LECT query3Proc sql in SAS• SAS data files and SQL tables– structure of an SQL tab le is very similarto that of a SAS data file– only difference: SASdata file has inherentordering– in SAS System, SQL table is representedas a SASdata file• proc sql can perform some of the operationsprovided y the data step and the print, sort,and means procedures– often can achieve same results as theseprocedures with fewer and short statements– why sh o uld you still know how to do thesetasks with print, sort, means, etc.?∗ because you are likely to have to main-tain or modify older programs writtenbefore proc sql was added to SAS4Queries using proc sql select statement• select statement i n proc sql finds and dis-plays specified records and variables• can a lso link files, calculate summary statis-tics, sort, etc.5Return to sites and deposition exampleoptions linesize = 75 pagesize = 60 nodate nonumber ;data depo ;infile ’depoRep90s.asp’ firstobs = 8 ;input SiteID $ Per $8. Year Crit1 Crit2 Crit3 Crit4 Ca MgK Na NH4 NO3 InorgN Cl SO4 HLab HField Svol Ppt Pct ValidF ValidLDays @196 Date1 mmddyy10. @209 Date2 mmddyy10. ;drop Per Crit1-Crit4 Ca Mg K Na NH4 NO3 InorgN Cl HLab HFieldSvol Ppt Pct ValidF ValidL ;daysop = Date2 - Date1 ;format Date2 Date1 date8. ;run ;data sites ;infile ’/space/kcowles/166/lectures/lect1mkc/stateCO.asp’ firstobs = 19missover ;input @13 SiteID $ @20 sitename $18. @40 strtdate mmddyy10. @53 stopdate mmddyy10.if strtdate ne . ; * subsetting if: exclude observations meeting conditionformat strtdate stopdate date8. ;drop sitename ;run ;6• sites file– SiteID– strtdate– stopdate– elev– SiteID• depo file– SiteID– Year– SO47proc sql ;title ’Proc sql listings’ ;select * from sites ; /* list all variables and records */Proc sql listingsSiteID strtdate stopdate elev--------------------------------------CO00 22APR80 . 2298CO01 04OCT83 . 1213CO02 05JUN84 . 3520CO08 29DEC87 . 2502CO10 02FEB99 . 2926CO15 20MAR79 . 1998CO19 29MAY80 . 2490CO21 17OCT78 . 2362CO22 22MAY79 . 1641CO91 26MAY92 . 3292CO92 13JAN88 . 3206CO93 14OCT86 . 2527CO94 04NOV86 . 2524CO95 29JUL86 02JAN90 2758CO96 29JUL86 . 3249CO97 07FEB84 . 3234CO98 16AUG83 . 3159CO99 28APR81 . 21728select SiteID, elev from sites ; /* list selected variables, all recsProc sql listingsSiteID elev------------------CO00 2298CO01 1213CO02 3520CO08 2502CO10 2926CO15 1998CO19 2490CO21 2362CO22 1641CO91 3292CO92 3206CO93 2527CO94 2524CO95 2758CO96 3249CO97 3234CO98 3159CO99 21729* multiple-table query ;title2 ’Multiple table query’ ;select s.siteID, s.elev, d.SO4, d.Yearfrom sites s, depo dwhere s.SiteID = d.SiteIDorder by s.SiteID ;Multiple table querySiteID elev SO4 Year--------------------------------------CO00 2298 1.2 1992CO00 2298 1.28 1998CO00 2298 1.07 1996CO00 2298 2.08 1991CO00 2298 1.01 1999CO00 2298 1.18 2000CO00 2298 1.1 1997CO00 2298 1.5 1993CO00 2298 1.46 1995CO00 2298 1.31 1994CO01 1213 3.64 1995CO01 1213 3.09 1992CO01 1213 2.98 1994CO01 1213 2.3 1993CO01 1213 2.44 1998CO01 1213 2.53 1997CO01 1213 3.19 1991CO01 1213 2.06 2000...10* more sophisticated query ;title2 ’More complicated SELECT and ORDER’ ;select s.siteID, s.elev, d.SO4, d.Yearfrom sites s, depo dwhere s.SiteID = d.SiteID and d.Year > 1995order by s.SiteID, d.Year ;More complicated SELECT and ORDERSiteID elev SO4 Year--------------------------------------CO00 2298 1.07 1996CO00 2298 1.1 1997CO00 2298 1.28 1998CO00 2298 1.01 1999CO00 2298 1.18 2000CO01 1213 2.99 1996CO01 1213 2.53 1997CO01 1213 2.44 1998CO01 1213 3.49 1999CO01 1213 2.06 2000CO02 3520 14.82 1996CO02 3520 10.9 1997CO02 3520 8.67 1998CO02 3520 10.7 1999CO02 3520 19.32 2000...11* summing and grouping ;title2 ’Total Deposition’ ;select siteID, sum(SO4) as totso4from depogroup by SiteIDorder by SiteID;Proc sql listingsTotal DepositionSiteID totso4------------------CO00 13.19CO01 28.71CO02 115.68CO08 26.25CO10 4.85CO15 23.28CO19 26.17CO21 35.49CO22 30.16CO91 69.82CO92 33.01CO93 65.73CO94 40.81CO96 39.53CO97 82.16CO98 57.27CO99 39.8812Producing report with proc meansproc means data = depo ;class SiteID ; /* separate summary stats by this variable */var SO4 ; /* which numeric variable to summarize */output out = meandepo mean=avgso4 ; /* identify output dataset andvariable name for summary stat */run ;proc print data = meandepo ;run ;SiteObs ID _TYPE_ _FREQ_ avgso41 0 161 4.54652 CO00 1 10 1.31903 CO01 1 10 2.87104 CO02 1 10 11.56805 CO08 1 10 2.62506 CO10 1 2 2.42507 CO15 1 10 2.32808 CO19 1 10 2.61709 CO21 1 10 3.549010 CO22 1 10 3.016011 CO91 1 9 7.757812 CO92 1 10 3.301013 CO93 1 10 6.573014 CO94 1 10 4.081015 CO96 1 10 3.953016 CO97 1 10 8.216017 CO98 1 10 5.727018 CO99 1 10 3.988013title ’Report produced using proc means’ ;proc print data = meandepo noobs ;var SiteID avgso4 ;where _type_ = 1 ;run ;Report produced using proc meansSiteID avgso4CO00 1.3190CO01 2.8710CO02 11.5680CO08 2.6250CO10 2.4250CO15 2.3280CO19 2.6170CO21 3.5490CO22 3.0160CO91 7.7578CO92 3.3010CO93 6.5730CO94 4.0810CO96 3.9530CO97 8.2160CO98 5.7270CO99 3.988014Types of data integrity• “data integrity” is database language for datavalidity a nd checking• types• entity integrity: no d uplicate rows• domain integrity: values in any given columnfall within an acceptable range or set• referential integrity: foreign key value s pointto val id rows in the referenced table• user-defined integrity: data complies with oth errules specific to the appl


View Full Document

UI STAT 5400 - Queries and SQL More on Data Integrity

Documents in this Course
Load more
Download Queries and SQL More on Data Integrity
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Queries and SQL More on Data Integrity and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Queries and SQL More on Data Integrity 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?