Daemen University

Data Exploration -- MTH 400

 

Course Description (from the College): An advanced statistical methods course on exploratory data analysis and its application in the fields of health science, marketing, finance, and political science. Using the R software package, students will examine the basic tenets of Exploratory Data Analysis (EDA). Topics will include: transforming and standardizing data, handling missing data, data visualization (distributions, relationships, clusters, etc.), data summarization, data reduction, cluster identification, and hypothesis development. (3 hours)
Prerequisite: MTH 325 and CSC 350 (UG)

 

Syllabus/Syllabus -- in Word format
Final Project Directions

Homeworks
 

Important Dates

 

see the syllabus for a more detailed calendar
 

Email List

available through Blackboard

Announcements:

Math Adjunct Office
My voicemail
My Daemen email: bmccall@daemen.edu
Office hours: see syllabus


 

Homeworks

Swirl Course: Getting and Cleaning Data

Package Summaries

Data Explorations
(data files are in .xlsx format)

DE #1
DE #2
DE #3
DE #4
DE #5
DE #6

General Data Exploration Directions

Final Project Suggestions (or select your own)

Readings

9/4 EDA
9/11 Tidyverse or Not
9/18 Categorical Data
9/25 Numerical Data
10/2 Comparing Data / Data Viz
10/9 Outliers
10/16 Missing Values / Imputation
10/23 Quarto
10/30 Animations / Time Series
11/6 Importing Data / APIs
11/13 Spatial Data / Maps
11/20 Feature Engineering
11/27 Text Data / Word Clouds
12/4 Storytelling with Data

R Tutorials:

Bar Graphs (base R)
Boxplot (base R)
Dotplots (base R)
Histogram (base R)
Normal Distribution shaded between two values (base R)
Normal Probability Plots (base R)
Scatterplots with Trendlines and Residual Graphs (base R)

 

Handouts:

Answer Keys


Resources

Tutorials on Advanced Stats and Machine Learning With R
Applied Statistics with R (textbook)
Intermediate Statistics with R (textbook)
Probability and Statistics for Engineering and the Sciences, Jay L. Devore, 8th ed. (textbook)
Introductory Statistics (textbook)
Practical Statistics for Data Scientists (textbook)
Online Statistics Book (textbook)
A Little Book of Time Series Analysis for R (textbook)
A Course in Time Series Analysis (textbook/notes)
Introduction to Probability for Data Science (textbook)
Friedman's ANOVA Test
How to Perform Friedman's Test in R
Kendall's Tau
Calculating Kendall's Rank Correlation in R
Introduction to Bootstrapping (Statistics by Jim)
Bootstrapping in R
Tutorial on Permutation Tests in R
How to Use Permutation Tests
Understanding AUC-ROC Curves
Some Packages for ROC Curves
Time Series Analysis in R
Getting Started with Multiple Imputation in R
Basic Statistics Using R
Learning Statistics with R
Statistics with R (Table of Contents)
Stats and R
Intro to Hypothesis Testing in R
R-Tutorial: An R Introduction to Statistics
Tidy Modeling with R
R Cheatsheets
Free Web Books for Learning (Statistics) with R
Easier ggplot with ggcharts
R Color Brewer's Palettes
Markdown Cheat Sheet
Smoothing
Cubic and Smoothing Splines in R
B-Spline Basis for Polynomial Splines

R Project
RStudio
Anaconda
Using R with Anaconda

 

Links!

PDF Graph Paper
Bad Graphs (Convention Speeches)
Visualizing Data Badly: 8 Examples
Correlation is not Causation: original article / handout
Presidents by State
TI-Connect Software
How much people lie on surveys
On the Hazards of Significance Testing
Exploring Correlation and Regression
Central Limit Theorem: with Bunnies and Dragons
SOCR: Statistics Online Computational Resources
How Many Ways Can You Arrange a Deck of Cards?
Free Online Math Courses
Confidence Interval for Rho
Free Courses from Coursera

Coding
MTH 324/MTH 325

 

 

 
(c) 2013, 2007, 2004 by Betsy McCall, all rights reserved
To contact the webmistress, email betsy@pewtergallery.com
Last updated: 2022 May 8th