Python for Business Research
Rice University
Fall 2024

Instructor

Kerry Back
J. Howard Creekmore Professor of Finance and Professor of Economics
kerryback@gmail.com

Class Meetings

Room 217, 2:15 – 3:45, 8/26/2024 – 10/7/2024

Course Description

This course is intended for PhD students in business and economics, but it may also be appropriate for students in other programs.

Python is a general-purpose programming language that has become especially important for machine learning and for data analysis more generally. It can be used as a substitute for MATLAB, Stata, SAS, and R, and also for web scraping, natural language processing, and much more. This course provides an introduction to the language and to the libraries that are most useful for business research. The topics to be covered are listed in the Course Schedule below. This is a hands-on course: part of each class session will be set aside for students to work individually or in groups, applying and extending the class material.

Generative AI can write most of our code, so memorizing syntax is largely unnecessary now. I recommend getting an academic subscription to Julius.ai, which combines various large language models (at the time of this writing: ChatGPT-4o, Claude 3.5 Sonnet, and Gemini) with an integrated Python environment, so it can both write and run code. Because it can run code, it can catch and correct most of its own errors.

It is fine if you have zero experience with Python or any other programming language. It is also fine if you are experienced with Python and are taking the course only to learn about certain libraries you haven’t used before. If you are new to programming, I will not expect you to become a proficient programmer in six weeks; my goal is to introduce you to the possibilities and show you how to get started. Googling and ChatGPT can take care of the rest.

Grading

There will be weekly assignments. You may use Google and/or generative AI for help with the assignments. There will not be an exam.

Course Schedule

  1. Preliminaries: libraries, IDEs, environments
  2. Basic Python: data types, conditional statements, loops, functions, classes
  3. Vectors and matrices: numpy
  4. Data handling: pandas
  5. Visualization: matplotlib, seaborn, plotly
  6. Scientific programming: scipy
  7. Statistics: statsmodels, linearmodels
  8. Tree models for machine learning: scikit-learn, xgboost
  9. Neural networks for machine learning: pytorch
  10. Neural networks for solving games, dynamic programming, and differential equations
  11. Reinforcement learning
  12. Web scraping: Beautiful Soup, selenium