Tuesday, January 26, 2010

Notes on Stratified Sampling and Ratio Estimation

In addition to falling further behind schedule, I've been a bit slow to post the lecture notes. Here are the notes on stratified sampling and ratio estimation.

Wednesday, January 13, 2010

Finite Population Inference

Here are the notes from Lecture 3. They are a little shorter than the earlier sets, though I will also quickly go over the material on unequal sampling that was not covered in Lecture 1 or Lecture 2. This is fairly theoretical material, though not difficult, and most people find it interesting.

The first problem set is now available. It's due in class next Wednesday.

Monday is a holiday, so the next class will be Wednesday January 20. The pace will pick up substantially at that point, since we will be having class twice a week.

Wednesday, January 6, 2010

Simple Random Sampling Notes

The notes on simple random sampling (Lecture 2) are posted here. The Horvitz-Thompson material from Lecture 1, with some edits, appears at the end of these notes, though I doubt we'll get to it tomorrow.

The notes are a bit more technical than I'd intended, but most of this material should be familiar (aside maybe from finite population corrections). Some background references on finite probability are in the footnotes, but this really should be stuff that you know. (I remain eternally optimistic.)
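For anyone who has not met the finite population correction before, here is a minimal sketch of where it enters, assuming a simple random sample drawn without replacement from a population of known size N. The function name `srs_mean_with_fpc` and the toy population are my own illustrative choices, not something taken from the lecture notes. The usual variance estimate s²/n is multiplied by (1 - n/N), so the standard error shrinks as the sample covers more of the population and is exactly zero for a census.

```python
import math
import random
import statistics


def srs_mean_with_fpc(sample, population_size):
    """Estimate a population mean and its standard error from a simple
    random sample drawn without replacement (hypothetical helper)."""
    n = len(sample)
    if not 2 <= n <= population_size:
        raise ValueError("need 2 <= n <= N")
    mean = statistics.fmean(sample)
    s2 = statistics.variance(sample)   # sample variance, divisor n - 1
    fpc = 1 - n / population_size      # finite population correction
    se = math.sqrt(fpc * s2 / n)
    return mean, se


# Toy population (an assumption for illustration): the integers 1..100.
population = list(range(1, 101))
rng = random.Random(0)
sample = rng.sample(population, 20)    # sampling without replacement

mean, se = srs_mean_with_fpc(sample, len(population))
se_no_fpc = math.sqrt(statistics.variance(sample) / len(sample))
print(f"mean={mean:.2f}  se with fpc={se:.3f}  se without fpc={se_no_fpc:.3f}")
```

With n = 20 and N = 100 the correction factor is 0.8, so the corrected standard error is about 89% of the uncorrected one. When n is a tiny fraction of N, as in most national polls, the correction is negligible.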

There will be no class on Monday January 11. The next class will be on Wednesday January 13.

Monday, January 4, 2010

Syllabus and Notes from Lecture 1

Here are the syllabus and the notes from the first lecture. I covered sections 1 and 2 of the notes in class and will resume with section 3 on Wednesday. (I've corrected some typos in the notes, so they're a little different from what was distributed in class.) Please read chapter 2 of Lohr before class on Wednesday.

Sunday, January 3, 2010

First Class Meeting


The class meets on Monday and Wednesday mornings from 9:00-10:30 a.m. in Wallenberg Hall (Building 160), Room 329 (the cave-like room pictured above).

A few notes about the course: the approach to survey sampling in this course will be statistical and practical. By "statistical," I mean that it's about the effective use of quantitative data and includes such issues as model building, design, estimation, and inference (though not necessarily in that order!). By "practical," I refer to the attention that will be paid to the large and small imperfections that occur in real-world surveys. It is generally impossible to implement the sampling designs exactly, substantial amounts of non-response and self-selection are inevitable, and the models employed will be, at best, approximations.

I will not be discussing "survey methodology." Writing a questionnaire, training interviewers, and managing a Web panel are all important skills for conducting a survey. This is largely an art (as indicated by the title of Stanley Payne's The Art of Asking Questions, still one of my favorites) and best learned by doing. In recent years, numerous studies have tested various hypotheses about survey methods, but this literature tends to be an ad hoc collection of results, often of limited generality, and not, despite claims to the contrary, a coherent "new science." At least, that's my opinion.

The applications that will be covered in the course come largely from surveys of U.S. elections. This reflects primarily my personal interests, but the methods and results have much wider applicability. Between campaign and media polls, exit polls, academic surveys (such as the American National Election Studies), and Internet panels, we encounter all of the common designs (simple random samples, stratified samples, one- and multi-stage cluster samples, probability proportional to size, systematic, and balanced selection), estimation methods (ratio and regression estimators, post-stratification, raking, propensity scores, matching, hierarchical and empirical Bayes), and problems (frame imperfections, nonresponse, self-selection).