
Exploratory Data Analysis with Categorical Variables






Exploratory Data Analysis with Categorical Variables: An Improved Rank-by-Feature Framework and a Case Study

Jinwook Seo and Heather Gordish-Dressman
{jseo, hgordish}@cnmcresearch.org
Research Center for Genetic Medicine, Children's Research Institute
111 Michigan Ave NW, Washington, DC 20010

RUNNING HEAD: RANK-BY-FEATURE FRAMEWORK FOR CATEGORICAL DATA

Acknowledgement: This work was supported by NIH 5R24HD050846-02 (Integrated molecular core for rehabilitation medicine) and NIH 1P30HD40677-01 (MRDDRC Genetics Core). We also thank the FMS study group, especially Joseph Devaney and Eric Hoffman, for providing genotype data.

Corresponding Author's Contact Information: Jinwook Seo, Research Center for Genetic Medicine, Children's Research Institute, 111 Michigan Ave NW, Washington, DC 20010. Tel: 1-202-884-4942. Fax: 1-202-884-6014.

ABSTRACT

Multidimensional datasets often include categorical information. When most dimensions have categorical information, clustering the dataset as a whole can reveal interesting patterns in the dataset. However, the categorical information is often more useful as a way to partition the dataset: gene expression data for healthy vs. diseased samples, or stock performance for common, preferred, or convertible shares. We present novel ways to utilize categorical information in exploratory data analysis by enhancing the rank-by-feature framework. First, we present ranking criteria for categorical variables and ways to improve the score overview. Second, we present a novel way to utilize the categorical information together with clustering algorithms. Users can partition the dataset according to categorical information, vertically or horizontally, and the clustering result for each partition can serve as new categorical information. We report the results of a longitudinal case study with a biomedical research team, including insights gained and potential future work. Color figures are available at www.cs.umd.edu/hcil/ben60.

1 INTRODUCTION

In many analytic domains multidimensional datasets


