Buffalo State College
Databases and the Data Science Information Life Cycle, DSA 610
Course Description (from the College):
Introduction to a “big picture” understanding of data flow for strategic,
data-driven decision making, including data storage, data organization,
data gathering and preparation, exploratory data analysis, and meaningful
visualizations and communication. Includes hands-on practice. (3 credits)
Syllabus -- in Word format
Homework
Important Dates
Final Project -- Due Wednesday, May 17th
more detailed schedule in the syllabus
Email List
via Brightspace
Announcements:
Buffalo State Office:
My BSC voicemail:
My BSC email: mccallb@buffalostate.edu
Office Hours: by appt (after class in-person, or make appt over zoom)
Answer Keys
Homeworks to be Turned in
Homework #1 -- Data
Homework #2 -- Data
Homework #3
Homework #4 -- Data
Homework #5
Homework #6
Homework #7 -- Data/Data
Homework #8 -- Data/Data
Homework #9 -- Data
Practice
Labs & Class Notes
N - class notes
D - dataset
J - Jupyter notebook
R - review
B - blank (unexecuted) Jupyter notebook
P - pdf of a Jupyter notebook
H - html version of a Jupyter notebook
E - other example
My server isn't a fan of the Jupyter notebook files. You can access all
the versions of the file (blank, executed, pdf and html with associated
files) in the zip file here.
Joke
Projects
Project #1 -- Data
Project #2 -- Data
Project #3 -- Data/Data
Final Project -- Data set options posted in Blackboard
Peer Reviews
Handouts Links:
Data Lifecycle
Data Lifecycle Management
The Lifecycle of Data
16-Step Lifecycle
Data Lifecycle, Best Practices
Data Lifecycle Management (DLM)
Data Analytics Lifecycle
Data Protection and Information Lifecycle Management
The Data Science Lifecycle
Excel Easy
Data Analysis in Excel
Excel 2016+
Microsoft Excel Video Training
SQL Tutorial
Learn SQL
SQL Basics for Beginners
SQLite Tutorial
Beginners' Guide to SQLite
SQL Query Cheat Sheet
SQL Tutorial
SQL Tutorial for Beginners
Interactive SQL Course
10+ Free Python Books
Python for Beginners
Learn Python
Which Library should I use for my Python Dashboard?
Best Python Data Visualization Libraries
Big Data: How is it generated?
Data Capture
Data Classification
Types of Data Classification
Data Classification
Ethics of Data Collection
The Murky Ethics of Data Gathering in a Post-Cambridge Analytica World
What is Data Validation?
Data Privacy
5 Things to Know about Data Privacy
Data Storage
Data Storage: Emerging Technologies
Data Lakes vs. Data Warehouses
Data Security
Most Common Passwords
Spreadsheets vs. Databases
Relational Databases
Definition and Overview of ODBMS
Object-Oriented Databases and Advantages
JSON Databases
JSON Interchange Standard
JSON vs. XML
What is JSON?
Importing XML into Pandas
Data Maintenance vs. Data Cleansing
Tips to Maintaining Your Data
Data Management
History of Data Management
Data Management: A Cheat Sheet
Data sharing and how it can benefit your scientific career
What is Data Sharing?
Data Reuse
Your Data Can Live Forever: How to Plan for Data Reuse
Why Data Sharing and Reuse are Hard to Do
Data Retention, Archiving and Disposing
Dos and Don'ts of Data Archiving
Data Retention Best Practices
Data Retention and Archiving Policy -- example
OECD Data Retention Policy -- example
Historical Data, Archiving and Retention -- example (HIT)
Data Retention 101
The Essentials of Data Retention: Policies, Plans, and Templates
Safe Data Destruction 101: Why Data Destruction is Necessary
Dispose of Information Properly
Secure Data Disposal and Destruction: 6 Methods
Data Disposal Laws
Data Discovery
Data Preparation
Data Preparation in Data Mining
Why is Data Preparation Important?
Stats NZ
Public Datasets (List of Sources)
Recommended Data Repositories
Data.gov
Color Brewer (for maps)
Color Brewer for Python
Plotly Graphing Library
Maps with Folium
Geographic Maps with Basemap
Python Libraries for GIS
State FIPS codes
Web Scraping with Beautiful Soup
Twitter API
Using APIs with Python
Beginner's Guide to Using an API with Python
Handling Missing Data
Missing Values in Machine Learning
Missing Values Guide
Missing Values from Python Data Science Handbook
Model Planning
Map Reduce
Spark vs. Map Reduce
Natural Language Processing
Sentiment Analysis
Clustering
Classification
Regression
Graph Theory
Statistical Hypothesis Testing
Data and Racial Equity
Communicating Results
Analyzing Data and Communicating Results
Telling a Story with Data
What is a Data Dashboard?
Time Series
Forecasting
Introduction to Time Series Forecasting
5 Common Time Series Methods
Four Phases of Operationalizing Analytics
5 Keys to Operationalize Big Data Analytics in the Cloud
Operationalizing Analytics
What is a Robust Machine Learning Model?
Correct Model Validation
Cross Validation
PDF Graph Paper
I Will Derive song
How to draw Greek
GraphCalc
Free Online Math Courses
Mathnotes
Coding
Spring 2021
Spring 2022