Assignment 2
BUSI 520: Python for Business Research
Jones Graduate School of Business
Rice University

  1. Basics of pandas series and dataframes:
    1. Create a series from a list of integers.
    2. Extract values at specific indices from the series.
    3. Change the index of the series to alphabetical letters.
    4. Create a dataframe from a dictionary of lists.
    5. Extract specific columns from the dataframe.
    6. Add a new column to the dataframe.
    7. Create a dataframe filled with random numbers.
  2. Basic dataframe operations:
    1. Calculate the summary statistics for a dataframe column.
    2. Sort the dataframe based on a specific column.
    3. Filter rows based on certain criteria.
    4. Replace specific values in a dataframe.
    5. Rename columns.
    6. Map values in a column to other values using a dictionary.
  3. Missing values:
    1. Find all missing values in a dataframe.
    2. Fill missing values with zeros.
    3. Fill missing values in a column with the column’s mean value.
    4. Drop rows with missing data.
    5. Find duplicate rows.
    6. Drop all but the last row in each set of duplicate rows.
  4. Filtering and aggregation:
    1. Using the ‘tips’ dataset, filter the rows where the total bill is greater than $10.
    2. Create a new column in the ‘tips’ dataset called ‘bill_per_person’ which is the total bill divided by the size of the party.
    3. Group by the ‘day’ column and compute the average total bill for each day.
  5. WRDS: Create a dataset containing monthly CRSP returns and book-to-market ratios for 1970 through 2023. Use the Fama-French June 30 convention for book-to-market. Follow the Fama-French definitions for book equity and market equity (variable definitions are at French’s data library). You can use any resources for assistance except other people; for example, there might be advice at WRDS.
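
A note on the timing in problem 5: under the June 30 convention, book equity for the fiscal year ending in calendar year t-1, divided by market equity at the end of December of year t-1, is matched to returns from July of year t through June of year t+1. The sketch below illustrates only that alignment, on made-up numbers; it is not a solution. The column names (`permno`, `date`, `ret`, `be`, `me_dec`) and the helper `ff_formation_year` are hypothetical, and the WRDS queries and the book-equity calculation are left out.

```python
import pandas as pd


def ff_formation_year(dates: pd.Series) -> pd.Series:
    """Return the June formation year t that applies to each return month.

    Returns from July of year t through June of year t+1 use the
    book-to-market ratio assigned at the end of June of year t.
    (Hypothetical helper, for illustration only.)
    """
    dates = pd.to_datetime(dates)
    # July-December keep their own calendar year;
    # January-June fall back to the previous year's formation.
    return dates.dt.year.where(dates.dt.month >= 7, dates.dt.year - 1)


# Made-up annual data: one row per firm and calendar year, with book equity
# for the fiscal year ending in that year and December market equity.
annual = pd.DataFrame({
    "permno": [1, 1],
    "year": [2000, 2001],
    "be": [50.0, 60.0],
    "me_dec": [100.0, 150.0],
})
annual["bm"] = annual["be"] / annual["me_dec"]
# Data from calendar year t-1 are used at the June formation of year t.
annual["formation_year"] = annual["year"] + 1

# Made-up monthly returns for the same firm.
monthly = pd.DataFrame({
    "permno": 1,
    "date": pd.to_datetime(
        ["2001-06-30", "2001-07-31", "2002-06-30", "2002-07-31"]
    ),
    "ret": [0.01, 0.02, -0.01, 0.03],
})
monthly["formation_year"] = ff_formation_year(monthly["date"])

# Attach the book-to-market ratio that applies to each return month.
merged = monthly.merge(
    annual[["permno", "formation_year", "bm"]],
    on=["permno", "formation_year"],
    how="left",
)
print(merged[["date", "ret", "bm"]])
```

In this toy example the June 2001 return has no ratio, because it would need data for 1999, while July 2001 through June 2002 share the ratio built from the 2000 data. The ratio changes only at the July 2002 return.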