TRINITY CSCI 2323 - Data Processing and Mining (8 pages)


Lecture Notes


Pages: 8
School: Trinity University
Course: CSCI 2323 - Scientific Computation

Data Processing and Mining
10/19/2005

1 Opening Discussion

How are things going with the project? You should try to get the writeup done soon, hopefully today. I'll send out a questionnaire so you can evaluate group work.

2 Beyond Numerics

So far, what we have explored is mainly the use of computers to do numerics. The reason is obvious: computers can do calculations much faster than you or I can by hand. The ability to do numerics was one of the key reasons that computers were developed at all.

More recently, scientists have begun using computers for a different reason: to process vast amounts of data. Early computers didn't have much memory, so this wasn't an option. Now you can easily get a 1 TB drive, which can hold more information than every book you will ever touch in your life.

3 Applications

Many different areas of science now need to process large quantities of data.

Biology has the areas of bioinformatics and genomics. The ability to look at the genome has produced large data sets that have to be compared.

Astronomy has large surveys of the sky that keep track of all types of information on a large number of stars.

Geology and atmospheric physics have more monitoring stations than in the past, and those stations return more data than they used to.

The biggest data sets currently come from particle physics. Collider events can produce many GB from a single smash, and colliders will produce TBs from an experiment.

4 Flat Files

We will begin by talking about handling data in standard text files and the tools for looking through them. These types of files are easy to use and have the advantage that you don't really need special tools. They are less than ideal when the data sets get really big. We will use Perl for this type of processing.

5 Databases

When the amount of data gets beyond many MB and into the GB and TB range, using flat files becomes much less efficient, and it becomes useful to put the data into databases. Databases give you the ability to quickly find information
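To make the contrast between flat files and databases concrete, here is a minimal sketch. The notes use Perl for flat-file processing; this example uses Python instead, purely for illustration. The record layout (a station name followed by a reading), the sample values, and every function name are assumptions invented for the example, not part of the course material. The first function scans every line of a text file's contents, as a flat-file tool would; the others load the same records into an in-memory SQLite table with an index, so a lookup need not read every record.

```python
import sqlite3

# Hypothetical flat-file contents: one whitespace-separated record per line
# (station name, reading). These values are invented for illustration only.
SAMPLE_LINES = [
    "alpha 21.5",
    "beta 19.0",
    "alpha 22.1",
    "gamma 25.3",
    "beta 18.4",
]


def scan_flat_file(lines, station):
    """Flat-file approach: look at every line and keep the matching readings."""
    readings = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, value = fields
        if name == station:
            readings.append(float(value))
    return readings


def build_database(lines):
    """Database approach: load the same records into an indexed table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (station TEXT, value REAL)")
    split_lines = (line.split() for line in lines)
    rows = [(f[0], float(f[1])) for f in split_lines if len(f) == 2]
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    # The index lets SQLite find a station without scanning the whole table.
    conn.execute("CREATE INDEX idx_station ON readings (station)")
    conn.commit()
    return conn


def query_database(conn, station):
    """Return the readings for one station, in insertion order."""
    cur = conn.execute(
        "SELECT value FROM readings WHERE station = ? ORDER BY rowid",
        (station,),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    print(scan_flat_file(SAMPLE_LINES, "alpha"))
    conn = build_database(SAMPLE_LINES)
    print(query_database(conn, "alpha"))
```

With five records the two approaches are indistinguishable in speed. The difference only matters at the data sizes the notes describe, where scanning a whole file for each question becomes the bottleneck.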


