
Summary

Below is the syllabus for the SPRING 2021 offering of the course. Fall 2021 will be similar, though the class will be offered in person only.

INFO 2950 is an intro-level information science course on the foundations of data science. It covers topics including the standard Python data science stack; univariate and multivariate statistical analysis of small and medium-size datasets; regression methods; hypothesis testing; probability models; basic supervised and unsupervised machine learning; data visualization; and network analysis. Students who complete the course will be able to produce meaningful, data-driven analyses of real-world problems and will be prepared to begin more advanced work in data-intensive domains.

Texts

There is no required textbook for this class, though individual readings may be assigned throughout the semester. If you want additional information about course topics, we recommend the following books and resources:

General introduction to Python: John Guttag, Introduction to Computation and Programming Using Python
General introduction to statistics: Allen B. Downey, Think Stats
Principles of data science: Joel Grus, Data Science from Scratch
Python data science stack: Jake VanderPlas, Python Data Science Handbook
Ethical issues in data science: Princeton Dialogues on AI and Ethics case studies

Schedule

Week 1 (beginning Monday, Feb 8): Intro and setup
HW 0 released.
Reading: If you aren't familiar with Jupyter notebooks and JupyterLab, review the JupyterLab documentation before section on Friday.
Monday 2/8: Intro
Wednesday 2/10: More intro, advice, notebooks, data types

Week 2 (Feb 15): Dataframes and Pandas
HW 0 due 2/18; project phase 0 due 2/17; HW 1 released.
Reading: 10 minutes to Pandas
Monday 2/15: Toward Pandas
Wednesday 2/17: Pandas

Week 3 (Feb 22): Summary statistics, grouping, basic visualization
HW 1 due; project phase I due; HW 2 released.
Monday 2/22: COVID data case study
Wednesday 2/24: Projects, COVID II, avocados

Week 4 (March 1): Correlation and covariance, transformations, joining data
HW 2 due; HW 3 released.
Reading: Downey, chapters 2 (distributions) and 7 (relationships between variables). See the Texts section above for the link. Note: Read for the statistical concepts, not Downey's code. We will use standard Pandas and NumPy functions for our work.
Monday 3/1: Covariance and correlation
Wednesday 3/3: Correlation, continued, and bias

Week 5 (March 8): Distance, similarity, clustering
No class on Wednesday. HW 3 due Saturday 3/13 at 11:59 (due to wellness days); no new HW released; project work during section.
Reading (optional): Python Data Science Handbook on k-means clustering
Monday 3/8: Bias, continued; clustering
Wednesday 3/10: No class (wellness day)

Week 6 (March 15): Linear regression
HW 4 released; project phase II due.
Monday 3/15: Clustering wrap-up; intro to linear regression
Wednesday 3/17: Linear regression II

Week 7 (March 22): Multiple regression
HW 4 due; HW 5 released; project phase II peer review due.
Monday 3/22: Model evaluation, multiple linear regression
Wednesday 3/24: Binary inputs, collinearity, logistic regression

Week 8 (March 29): Hypothesis testing
HW 5 due; HW 6 released.
Monday 3/29: Logistic regression
Wednesday 3/31: Model evaluation
Friday 4/2: Supplemental lecture videos to watch before Monday 4/5 (code):
  o Classification reports and why we model data
  o Permutation and p-values
  o Bootstrap resampling and confidence intervals
  o Optional: How to read documentation

Week 9 (April 5): Probabilistic models and simulation
HW 6 due; HW 7 released.
Monday 4/5: Probabilistic models
Wednesday 4/7: Model selection, Bayes' rule

Week 10 (April 12): Dimension reduction, matrix decomposition
HW 7 due; project work during section; project phase III due; no new HW released.
Monday 4/12: Hypothesis tests and distributions
Wednesday 4/14: Dimension reduction and matrix factorization

Week 11 (April 19): Supervised learning and text as data
No section on Friday (wellness day); no new homework for the remainder of the semester.
Monday 4/19: Conclude dimension reduction and matrix factorization
Wednesday 4/21: Text as data, Bayesian classifiers

Week 12 (April 26): Text as data
No lecture on Monday (wellness day). Project phase IV due. Project and review work in section.
Wednesday 4/28: Text as data II

Week 13 (May 3): Networks
Project phase IV peer review due. Project work in section.
Monday 5/3: Networks I
Wednesday 5/5: Networks II

Week 14 (May 10): Wrap-up
Project phase V (final submission) due. No section on Friday. Course concludes; no final exam.

Policies

Harassment and respect

All students are entitled to respect from course staff and from their fellow students. All staff are entitled to respect from students and from fellow staff members. Violations of this principle, whether large or small, will not be tolerated.

Respect means that your ideas are taken seriously, that you feel welcome in class settings (including in study groups and online fora), and that you are treated as a full, co-equal member of the class. Harassment describes any action, intentional or otherwise, that abridges the respect owed to every member of the class.

If you experience harassment in any form, or if you would like to discuss your experience in the class, please see me in office hours or contact me by email. The university also has reporting and counseling resources available, including those for sexual harassment and for other bias incidents.

Academic integrity

Each student in this course is expected to abide by the Cornell University Code of Academic Integrity. Any work submitted by a student in this course for academic credit will be the student's own work, unless specifically and explicitly permitted otherwise.

Using other people's code is an important part of programming, but for group projects the code should be substantially the work of the group members, except for standard libraries. Any code used in projects that was not written by the group members should be placed in separate files and clearly labeled with its source URLs. If you have benefitted from online resources such as StackOverflow, list the URLs in comments in your own code, even if you did not directly copy anything.

Project work that relates to your other classes or research is encouraged, but you may not recycle assignments. There must be no doubt that the work you turn in for this class was done for this class. When in doubt, consult with me or with a graduate TA during office hours.

Disabilities

Every student's access is important to us. If you have, or think you may have, a disability, please contact Student



City Tech INFO 2950 - Syllabus
