Completeness
February 27, 2006
Geog 458: Map Sources and Errors

Outline
•Completeness
•Testing completeness
•Documenting completeness in the metadata
•Data quality

Completeness
•The data set is called “complete” if what’s defined/needed is encoded in the DB
•Spatial completeness: degree to which all features are captured according to the data capture specifications
•Attribute completeness: degree to which the relevant attributes of a feature are available according to the given capture specifications
•Data quality component that describes whether the entity objects represent all entity instances of the corresponding abstract universe
•Relationship between the objects represented in the data set and the abstract universe of all such objects

Abstract universe
•Can be thought of as a reference frame
•Data set = digital representation of a subset of (perceived) reality
•Abstract universe = terrain nominale; abstract view of the universe; universe of discourse; miniworld; subset of perceived reality (it involves a selection and abstraction process)
•The data set is intended to represent the abstract universe
•Since completeness is the relationship between the data set and the abstract universe, a useful characterization of completeness relies on a comprehensive definition of the abstract universe

Data completeness vs. Model completeness
•Completeness can be classified into two categories depending on how the abstract universe is defined or specified
•Data completeness: the abstract universe is defined on generic uses of data; application-independent
•Model completeness: the abstract universe is defined on specific uses of data; application-dependent
•So which would be more flexible? Which would have multiple versions of completeness on the same data?

Spatial completeness
•Let’s say the abstract universe “lake” is defined as a water body with an area of more than 1 square mile
•Count the number of entities in the abstract universe; set this number to A
•Count the number of entities encoded in the DB (lake data set); set this number to B
•Completeness would be B/A
•The definition of “lake” varies depending on the application, so A varies as well

Attribute completeness
•Subordinate to spatial completeness
•Define what the relevant attributes will be
–A lake will have area, depth, type (freshwater), and so on
•Check whether attribute values are missing for the entity at hand
–The geometric description might be incomplete (area)
•Report the number of missing values out of the total number of features for each attribute

Relation to other data quality components
•Completeness may affect the logical consistency of a data set
–Missing arc or node → connectivity, closed polygon
–Missing attribute (left and right node) → connectivity
–Missing attribute in PK → key constraint
–Missing attribute in FK → referential constraint
•So where do I document this: in completeness or in logical consistency?
–If incompleteness causes logical inconsistency, describe it in the logical consistency section
–Else include it in the completeness section

Data quality vs. fitness of use
•Data quality
–The totality of features and characteristics of a data set that bear on its ability to satisfy a stated set of requirements; application-independent
•Fitness of use
–The totality of features and characteristics of a data set that bear on its ability to satisfy a set of requirements given by the application; application-dependent

Data quality vs. fitness of use
•Data quality information is usually provided by the producer of a data set
•Fitness of use is assessed by users when they evaluate a data set for their own use → this principle is referred to as truth in labelling (users are indeed responsible for quality control)
•See the different approaches to quality control in the lecture note on spatial data quality

Data quality report
•What you report in the data quality section of the metadata should be application-independent, so that it can be reused for any potential use of the data
•Reporting data quality can be thought of as the process of evaluating the ability of the data set to meet the requirements
•It covers how close the values are to ground truth (attribute/positional accuracy), whether the data are free of contradictions (logical consistency), and whether what’s relevant is encoded in the DB (completeness)
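
The spatial and attribute completeness checks described in the slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lake identifiers, attribute values, and the function names `spatial_completeness` and `missing_values` are hypothetical and do not come from the lecture. It also assumes that B counts only encoded entities that actually belong to the abstract universe, so extra features in the DB do not inflate the ratio.

```python
# Hypothetical abstract universe: every lake larger than 1 square mile
# that the capture specification says should exist (A = 4 here).
reference_lake_ids = {"L1", "L2", "L3", "L4"}

# Hypothetical lake data set as encoded in the DB (B = 3 here).
# None marks a missing attribute value.
lake_dataset = [
    {"id": "L1", "area": 2.5, "depth": 40.0, "type": "freshwater"},
    {"id": "L2", "area": None, "depth": 12.0, "type": "freshwater"},
    {"id": "L4", "area": 7.1, "depth": None, "type": None},
]


def spatial_completeness(dataset, reference_ids):
    """Return B/A, where A is the size of the abstract universe and
    B is the number of its entities that are encoded in the data set."""
    if not reference_ids:
        raise ValueError("the abstract universe is empty")
    encoded = {feature["id"] for feature in dataset} & set(reference_ids)
    return len(encoded) / len(reference_ids)


def missing_values(dataset, attributes):
    """Count, for each attribute, the features whose value is missing."""
    return {
        name: sum(1 for feature in dataset if feature.get(name) is None)
        for name in attributes
    }


ratio = spatial_completeness(lake_dataset, reference_lake_ids)
missing = missing_values(lake_dataset, ["area", "depth", "type"])

print(f"Spatial completeness (B/A): {ratio:.2f}")
for name, count in missing.items():
    print(f"{name}: {count} missing out of {len(lake_dataset)} features")
```

Changing the definition of “lake” changes `reference_lake_ids`, and therefore A, which is why the same data set can have a different completeness value for each application.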


DePaul GEOG 458 - Completeness
