Syllabus
Instructor
Dr. Andrew Heiss
639 TNRB
andrew_heiss@byu.edu
@andrewheiss
Office hours: Sign up here.
E-mail is the best way to get in contact with me—I will respond to all course-related e-mails within 24 hours (really).
Course
Thursdays
September 6–December 13, 2018
7:35–9:45 PM
417 SLC
Course objectives
By the end of this course, you will (1) be data literate and (2) be able to answer your own questions with statistical and data scientific tools.
Specifically, you’ll be able to:
- Be curious and confident with data
- Feel comfortable with R
- Tidy and wrangle data imperturbably
- Find patterns in data with appropriate graphs and visualizations
- Pose and test insightful hypotheses
- Accurately interpret results from basic inferential statistical analyses
- Understand the assumptions and limitations of statistical procedures
- Communicate the results of your analyses in accessible language
Given these objectives, this course fulfills three of the four learning outcomes for BYU’s Master of Public Administration (MPA) program:
- Quantitative Analysis: BYU MPA graduates are skilled at evaluating programs and policies. They know how to gather data, correctly analyze it, and employ the analysis to recommend policy and programmatic action in public service organizations.
- Public Service Values: BYU MPA graduates demonstrate an understanding of, passion for, and commitment to public service values, including reverence for the dignity and worth of all people and dedication to ethical governance.
- Communication: BYU MPA graduates effectively convey verbal and written information with the polish and professionalism appropriate for the public service context. They listen to and promote understanding among people with diverse viewpoints.
Course philosophy
Classical statistics classes spend substantial time covering probability theory, null hypothesis testing, and other statistical tests first developed hundreds of years ago. Some classes don’t use software or actual real data and instead live in the world of mathematical proofs. They can be math-heavy and full of often unintuitive concepts and equations.
In this class, we will take the opposite approach. We begin with data and learn how to tidy, wrangle, manipulate, and visualize it with code. Later in the semester we turn to more classical topics like inference and statistical modeling, but continue to keep the focus on data as we do so.
In other words, there’s way less of this:
\[ f(x) = \dfrac{1}{\sqrt{2\pi}} e^{-\frac12 x^2} \]
And way more of this:
summary_monthly_temp <- weather %>%
  group_by(month) %>%
  summarize(mean = mean(temp),
            std_dev = sd(temp))
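The snippet above assumes a `weather` data frame with `month` and `temp` columns is already loaded. As a rough, self-contained sketch of the same pattern, the version below runs on R's built-in `airquality` dataset instead. That dataset is only a stand-in, not the data used in class, and the code assumes the dplyr package is installed.

```r
library(dplyr)

# airquality ships with R: daily New York readings, May-September 1973.
# Its columns are capitalized (Month, Temp), unlike the weather example.
summary_monthly_temp <- airquality %>%
  group_by(Month) %>%
  summarize(mean = mean(Temp),
            std_dev = sd(Temp))

# One row per month, with the mean and standard deviation of temperature
print(summary_monthly_temp)
```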
Over the last decade there has been a revolution in statistical and scientific computing. Open source languages like R and Python have overtaken older (and expensive!) corporate software packages like SAS and SPSS, and there are now thousands of books and blog posts and other online resources with excellent tutorials about how to analyze pretty much any kind of data.
This class will expose you to R, one of the most popular and in-demand statistical programming languages. Armed with the foundation of R skills you’ll learn in this class, you’ll know enough to figure out how to answer almost any data-based question in the future.
Course materials
There are no formal physical textbooks for the class. We will only use (free!) online resources.
Books
Chester Ismay and Albert Y. Kim, ModernDive: An Introduction to Statistical and Data Sciences via R, version 0.4.0 (2018). [FREE online]
ModernDive: ModernDive will be our primary textbook. Please take the time to read the assigned chapters in depth and complete the Learning Checks when possible.
Hadley Wickham and Garrett Grolemund, R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (Sebastopol, California: O’Reilly Media, 2017). [FREE online; $13 used, $18 new at Amazon]
R for Data Science: Some of your readings will also come from the free R for Data Science book. Please take the time to read this book carefully as well and walk through the examples in RStudio.
David M. Diez, Christopher D. Barr, and Mine Çetinkaya-Rundel, OpenIntro Statistics (3rd edition, 2017). [FREE online; $15 new at Amazon]
OpenIntro Statistics: A few of your readings will also come from the free OpenIntro Statistics book.
There will also occasionally be additional articles and videos to read and watch. When this happens, links to these other resources will be included on the reading page for that week.
I also highly recommend subscribing to the R Weekly newsletter. This e-mail is sent every Monday and is full of helpful tutorials about how to do stuff with R.
DataCamp
Update: This text is still here for historical reasons, but all links to DataCamp have been removed, and future versions of this class will not use DataCamp content due to the organization’s horrible handling of sexual harassment claims and its deeper issues with organizational culture.
We will use DataCamp (a collection of fantastic interactive videos, tutorials, and coding exercises available online) to supplement your readings. These exercises will give you additional practice with R and will be essential to understanding the material (especially in the first half of the course).
I will assign specific chapters from DataCamp courses as part of your readings, but by being in this class, you have access to the entire DataCamp course library for the next six months. If you ever feel bored, adventurous, or both, check out other courses and learn even more! For real, this is an incredible opportunity!
Importantly, you do not have to purchase any services or classes from DataCamp.
I will enroll you in our class DataCamp group using the e-mail address you have registered with BYU. You will then receive an e-mail from DataCamp with a link to register on their site.
R and RStudio
You will do all of your analysis with the open source (and free!) programming language R. You will use RStudio as the main program to access R. Think of R as an engine and RStudio as a car dashboard—R handles all the calculations and the actual statistics, while RStudio provides a nice interface for running R code.
R is free, but it can sometimes be a pain to install and configure. To make life easier, you can (and should!) use the free RStudio.cloud service, which lets you run a full instance of RStudio in your web browser. This means you won’t have to install anything on your computer to get started with R! We will have a shared class workspace in RStudio.cloud that will let you quickly copy templates for labs and problem sets.
RStudio.cloud is convenient, but it can be slow and it is not designed to be able to handle larger datasets or more complicated analysis. Over the course of the semester, you’ll probably want to get around to installing R, RStudio, and other R packages on your computer and wean yourself off of RStudio.cloud. This isn’t necessary, but it’s helpful.
You can find instructions for installing R, RStudio, and all the tidyverse packages here.
Online help and Gitter
Data science and statistical programming can be difficult. Computers are stupid and little errors in your code can cause hours of headache (even if you’ve been doing this stuff for years!).
Fortunately there are tons of online resources to help you with this. Two of the most important are StackOverflow (a Q&A site with hundreds of thousands of answers to all sorts of programming questions) and RStudio Community (a forum specifically designed for people using RStudio and the tidyverse, i.e. you).
Additionally, we have a class chatroom at Gitter where anyone in the class can ask questions and anyone can answer. Ask questions about the readings, Learning Checks, lectures, or problem sets in the class Gitter. I will monitor the chatroom regularly, and you should too. You’ll likely have the same questions as your peers, and you’ll likely be able to answer other people’s questions too.
Course policies
Be nice. Be honest. Don’t cheat.
We will also follow the full list of Marriott School and BYU classroom policies.
Counseling and Psychological Services (CAPS)
Life at BYU can be complicated and challenging. You might feel overwhelmed, experience anxiety or depression, or struggle with relationships or family responsibilities. Counseling and Psychological Services (CAPS) provides free, confidential support for students who are struggling with mental health and emotional challenges. The CAPS office is staffed by professional psychologists who are attuned to the needs of all types of college and professional students. Please do not hesitate to contact CAPS for assistance—getting help is a smart and courageous thing to do.
Basic needs security
If you have difficulty affording groceries or accessing sufficient food to eat every day, or if you lack a safe and stable place to live, and you believe this may affect your performance in this course, please contact the Dean of Students for support. Please also consider speaking with your local LDS bishop regarding Church welfare assistance regardless of whether or not you are LDS. Additionally, please talk to me if you are comfortable in doing so. This will enable me to provide any resources that I might possess.
Class conduct and expectations
On the first day of class, we came up with a few specific rules, expectations, and policies for the course:
- Late work will be penalized by 10% each week it is late (starting the day after it is due)
- If you miss a reading quiz, that’s fine—let me know and I can reopen it for you without penalty
- Be responsible with laptop use.
- Be good and respectful. You’re all adults.
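As a rough illustration of how the late-work rule above plays out, here is a minimal sketch in R. The function name `late_score()` is hypothetical, and the code makes three assumptions the rule does not spell out: any part of a week counts as a full week, the 10% is taken off the score you earned, and the penalty stops growing at 100%.

```r
# Hypothetical helper, not an official grading script.
# Assumptions: partial weeks round up, the penalty applies to the
# earned score, and the penalty is capped at 100%.
late_score <- function(raw_points, days_late) {
  weeks_late <- ceiling(days_late / 7)   # 1-7 days late counts as one week
  penalty <- min(0.10 * weeks_late, 1)   # 10% per week, capped at 100%
  raw_points * (1 - penalty)
}

on_time       <- late_score(40, 0)   # no penalty
one_day_late  <- late_score(40, 1)   # one week of penalty
eight_days    <- late_score(40, 8)   # two weeks of penalty
```

Under these assumptions, a 40-point problem set turned in one day late keeps 90% of its score, and one turned in eight days late keeps 80%.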
Laptops
This is a computer-heavy course and each class session will require extensive laptop use. Occasionally I may ask that laptops be closed for some in-class activities, but otherwise you will be expected to use your computer. Please note that this is different from the laptop policy in other Romney Institute classes. Use your computer responsibly in class.
Teams
This class is team-based. In-class activities will be done largely in teams. Your problem sets will be turned in individually, but you can (and should!) work on them together in your assigned teams. Your final project will be completed and turned in as a team.
Please follow all the best practices you learned in Organizational Behavior to ensure that your team works well and that there is no free riding.
Assignments and grades
You can find descriptions for all the assignments on the assignments page.
Assignment | Points | Percent |
---|---|---|
Preparation (≈ 10.5 × 14) | 150 | 14.6% |
Problem sets (7 × 40) | 280 | 27.2% |
Code-through | 50 | 4.9% |
Exam 1 | 100 | 9.7% |
Exam 2 | 100 | 9.7% |
Exam 3 | 100 | 9.7% |
Final project | 250 | 24.3% |
Total | 1030 | — |
Grade | Range | Grade | Range |
---|---|---|---|
A | 93–100% | C | 73–76% |
A− | 90–92% | C− | 70–72% |
B+ | 87–89% | D+ | 67–69% |
B | 83–86% | D | 63–66% |
B− | 80–82% | D− | 60–62% |
C+ | 77–79% | F | < 60% |
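To see how the two tables fit together, here is a minimal sketch in R that turns points into a percentage and then into a letter grade. The point totals come from the assignment table above. The student scores are invented for illustration, and treating the lower end of each range as a hard cutoff (with no rounding) is an assumption, since the scale does not say how fractional percentages are handled.

```r
# Points available per category, copied from the assignment table
points_possible <- c(
  preparation   = 150,
  problem_sets  = 280,
  code_through  = 50,
  exam_1        = 100,
  exam_2        = 100,
  exam_3        = 100,
  final_project = 250
)

# Hypothetical scores for one student (invented for illustration)
points_earned <- c(
  preparation   = 140,
  problem_sets  = 250,
  code_through  = 45,
  exam_1        = 85,
  exam_2        = 90,
  exam_3        = 88,
  final_project = 230
)

total_possible <- sum(points_possible)                 # 1030
percent <- 100 * sum(points_earned) / total_possible   # 928 / 1030, about 90.1

# Lower bound of each range in the grade scale (assumed to be hard cutoffs)
cutoffs <- c(0, 60, 63, 67, 70, 73, 77, 80, 83, 87, 90, 93, Inf)
grade_labels <- c("F", "D-", "D", "D+", "C-", "C", "C+",
                  "B-", "B", "B+", "A-", "A")

# right = FALSE makes each interval include its lower bound: [90, 93) is A-
letter_grade <- as.character(
  cut(percent, breaks = cutoffs, labels = grade_labels, right = FALSE)
)
```

With these made-up scores, the student earns 928 of 1030 points, which lands in the A- range.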
Red pandas
Once you have read this entire syllabus and the assignments page, please click here and e-mail me a picture of a red panda. For real. Brownie points if it’s animated.