Research Methods I: Data Analysis for Economics and Policy
Syllabus
The detailed class syllabus is available here.
Instructor Information
- Instructor: Fernando Rios-Avila
- Email: friosavi@levy.org
- Office Hours: Wednesdays, 1:30 pm - 4:00 pm, or by appointment. Other times can be arranged remotely.
- Class Time: Wednesday, 9:30 am - 12:45 pm
Course Description
This course gives students the tools and skills needed to conduct data analysis for economics and policy research. Students will be exposed to the entire process of data analysis, from formulating questions and collecting data to cleaning, exploring, analyzing, and presenting results. The course covers exploratory data analysis and regression analysis, and introduces prediction with machine learning. Students will gain hands-on experience using Stata for analysis, Quarto for reproducible reporting, and GitHub for version control and collaboration.
Course Objectives
By the end of this course, students will be able to:
- Apply advanced data analysis techniques to economic and policy questions.
- Use modern tools such as GitHub and Quarto for research collaboration and reproducibility.
- Formulate research questions and design appropriate data collection methods.
- Clean, organize, and explore data using various techniques and visualizations.
- Apply regression analysis techniques to analyze relationships between variables.
- Use machine learning methods for prediction and classification tasks.
- Implement data analysis techniques using Stata.
- Effectively communicate research findings through written reports and oral presentations.
Required Textbook
Békés, G., & Kézdi, G. (2021). Data Analysis for Business, Economics, and Policy. Cambridge University Press.
Software Requirements
- Stata: A student license will be provided.
- Quarto: Free and open-source software for reproducible research.
- VSCode: Free and open-source code editor.
- GitHub/GitHub-Desktop: Free platform for version control and collaboration.
- Zotero: Free reference manager.
All homework assignments must be written in Quarto format and submitted through GitHub repositories.
Audio Podcasts
If you are interested in listening to podcast-style audio summaries of the chapters, you can find them here.
The audio files were generated with NotebookLM by Google from the text of the book. They are not perfect, but they can be a useful way to listen to the book's content while you are doing other activities.
Course Outline
Part I: Introduction to Modern Research Tools
Week 1: Course Overview and Tools Setup
- Introduction to GitHub and Quarto
- Data Organization and Management
- Slides
- Homework 1
Part II: Data Analysis and Exploration
Week 2: Introduction to Data Analysis
- Data Collection and Preparation
- Tidy Data Principles
- Reading: Békés & Kézdi (2021), Chapters 1-2
- Slides
- Homework 2
Week 3: Data Exploration
- Exploratory Data Analysis Techniques
- Data Cleaning and Tidy Data Principles
- Reading: Békés & Kézdi (2021), Chapters 3-4
- Slides
- Homework 3
Part III: Generalization and Regression Analysis
Week 4: Generalization: From Sample to Population
- Sampling and Hypothesis Testing
- Confidence Intervals and Errors
- Reading: Békés & Kézdi (2021), Chapters 5-6
- Slides
- Homework 4
Weeks 5-6: Regression Analysis I: Simple Regression
- Dates: October 9 and 16
- Linear Regression and Causality
- Model Assumptions and Transformations
- Reading: Békés & Kézdi (2021), Chapters 7-9
- Slides
- Homework 5
- Homework 6
- Audio Summary v1. Chapter 7
Week 7: Regression Analysis II: Multiple Regression
- Date: October 18 (Make-up class)
- Estimation and Inference
- Interactions and Non-linearities
- Reading: Békés & Kézdi (2021), Chapter 10
- Slides
- Homework 7
Week 8: Regression Analysis III: Modeling Probabilities
- Date: October 23
- Logit and Probit Models
- Interpretation and Predictive Power
- Reading: Békés & Kézdi (2021), Chapter 11
- Slides
- Homework 8
Part IV: Advanced Topics
Week 9: Time Series Analysis
- Date: October 30
- Trend, Seasonality, and Stationarity
- Reading: Békés & Kézdi (2021), Chapter 12
- Slides
- Homework 9
Week 10: Prediction
- Date: November 6
- Model Fit and Cross-validation
- Reading: Békés & Kézdi (2021), Chapter 13
- Slides
- Homework 10
Week 11: Model Building for Prediction: LASSO
- Date: November 13
- LASSO for Prediction and Diagnosis
- Reading: Békés & Kézdi (2021), Chapter 14
- Slides
- Homework 11
Week 12: Predicting Probabilities and Classification
- Date: November 20
- Classification Techniques and ROC Curves
- Reading: Békés & Kézdi (2021), Chapter 17
- Slides
- Homework 12
Week 13: Forecasting Data
- Date: November 27 (TBD)
- ARIMA and Forecasting Techniques
- Reading: Békés & Kézdi (2021), Chapter 18
- Slides
- Homework 13
Grading Policy
- Weekly Quizzes: 10%
- Weekly Problem Sets: 30%
- Term Paper: 60%
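As a rough illustration of how the weights above combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale; the function and dictionary names are hypothetical and are not part of any official grading tool for the course.

```python
# Weights copied from the Grading Policy list above.
WEIGHTS = {"quizzes": 0.10, "problem_sets": 0.30, "term_paper": 0.60}


def final_grade(scores):
    """Return the weighted course grade.

    `scores` maps each component name in WEIGHTS to a score on a
    0-100 scale (the scale is an assumption made for this sketch).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    example = {"quizzes": 80.0, "problem_sets": 90.0, "term_paper": 85.0}
    print(round(final_grade(example), 2))
```

Because the term paper carries 60% of the weight, a one-point change in the term paper score moves the final grade six times as much as a one-point change in the quiz average.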
Term Paper Schedule (60% of final grade)
Part I: Research Proposal (5%, due Week 2). See here for a few examples of research proposals.
Part II: Data Collection and Cleaning (10%, due Week 4)
Interim Progress Report (5%, due Week 7): October 18
Peer Review Report (Week 8): October 23
Part III: Data Analysis (15%, due Week 10): November 6
Peer Review Report (Week 11): November 13
Part IV: Final Report (15%): December 2
Presentation (5%, due on the last day of class): December 4 or 11
The complete research paper should include:
Introduction
Literature Review
Data and Methodology
Robustness Checks or Sensitivity/Sub-group Analysis
Conclusion
References
Appendices (if any)
Presentation (5%)
- 15-minute presentation of your research to the class
See here for an example of the kind of report expected at each stage, based on the first research proposal on the impact of remote work on urban housing prices.
Additional Requirements
- You should submit a PDF of your report by the deadline, along with the GitHub repository link
- Your GitHub repository should include all code, data, and the Quarto document for your report
- At each stage, you should push your work to GitHub so that the progress of your project can be followed
Resources
- Textbook: Békés, G., & Kézdi, G. (2021). Data Analysis for Business, Economics, and Policy. Cambridge University Press. Additional resources are available on the book’s website: Data Analysis
Course Policies
- Attendance: Attendance is highly recommended. Classes will not be recorded and, except in exceptional cases, there will be no online classes.
- Late Assignments: Late assignments will not be accepted unless prior arrangements have been made with the instructor.
- Academic Integrity: All work submitted must be your own. Plagiarism will not be tolerated and will result in a failing grade for the assignment or course.
- AI Usage: The use of AI in the class is allowed. However, you must disclose any AI tools used in your assignments. AI is a tool you can use to generate ideas, edit your text, provide help with coding, etc. However, it is completely unacceptable to use AI to generate the entire assignment. You will have to be able to explain and defend your work in class.