Assignment 2
BUSI 520: Python for Business Research
Jones Graduate School of Business
Rice University
- Basics of pandas series and dataframes:
- Create a series from a list of integers.
- Extract values at specific indices from the series.
- Change the index of the series to alphabetical letters.
- Create a dataframe from a dictionary of lists.
- Extract specific columns from the dataframe.
- Add a new column to the dataframe.
- Create a dataframe filled with random numbers.
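
The series and dataframe basics above could be approached along the following lines. This is only a minimal sketch: the values, column names (`name`, `age`, `salary`, `salary_k`), and the random seed are made up for illustration and are not part of the assignment.

```python
import numpy as np
import pandas as pd

# Create a series from a list of integers (illustrative values).
s = pd.Series([10, 20, 30, 40, 50])

# Extract values at specific positions with .iloc.
picked = s.iloc[[0, 2, 4]]

# Change the index of the series to alphabetical letters.
s.index = list("abcde")

# Create a dataframe from a dictionary of lists (hypothetical data).
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy"],
    "age": [31, 45, 28],
    "salary": [70000, 85000, 62000],
})

# Extract specific columns; a list of names returns a dataframe.
subset = df[["name", "salary"]]

# Add a new column computed from an existing one.
df["salary_k"] = df["salary"] / 1000

# Dataframe filled with random numbers: 4 rows, 3 columns, standard normal draws.
rng = np.random.default_rng(seed=0)
rand_df = pd.DataFrame(rng.standard_normal((4, 3)), columns=["x", "y", "z"])
```

After the index is changed to letters, label-based access such as `s["c"]` works alongside positional access with `s.iloc[2]`.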
- Basic dataframe operations:
- Calculate the summary statistics for a dataframe column.
- Sort the dataframe based on a specific column.
- Filter rows based on certain criteria.
- Replace specific values in a dataframe.
- Rename columns.
- Map values in a column to other values using a dictionary.
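
One possible way to work through the operations above is sketched here. The dataframe, its column names (`ticker`, `sector`, `ret`), and the sector codes are hypothetical and exist only to make the example runnable.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC", "DDD"],
    "sector": ["tech", "energy", "tech", "retail"],
    "ret": [0.05, -0.02, 0.10, 0.01],
})

# Summary statistics (count, mean, std, quartiles, ...) for one column.
stats = df["ret"].describe()

# Sort the dataframe by a column, largest value first.
sorted_df = df.sort_values("ret", ascending=False)

# Filter rows with a boolean mask.
winners = df[df["ret"] > 0]

# Replace a specific value in a column.
df["sector"] = df["sector"].replace("retail", "consumer")

# Rename columns with a mapping from old names to new names.
df = df.rename(columns={"ret": "monthly_return"})

# Map values in a column to other values using a dictionary.
# Values missing from the dictionary would become NaN.
codes = {"tech": 1, "energy": 2, "consumer": 3}
df["sector_code"] = df["sector"].map(codes)
```

Most of these methods return a new object rather than modifying the dataframe in place, which is why the results are assigned back to a name.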
- Missing values:
- Find all missing values in a dataframe.
- Fill missing values with zeros.
- Fill missing values in a column with the column’s mean value.
- Drop rows with missing data.
- Find duplicate rows.
- Drop all but the last row in each set of duplicate rows.
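
The missing-value and duplicate tasks above might look roughly like this. The dataframe and its columns (`firm`, `assets`, `sales`) are invented for the sketch. Note that pandas treats missing values in the same position as equal when it compares rows for duplicates.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values and repeated rows.
df = pd.DataFrame({
    "firm": ["A", "B", "B", "C", "C"],
    "assets": [100.0, np.nan, np.nan, 250.0, 250.0],
    "sales": [50.0, 80.0, 80.0, np.nan, np.nan],
})

# Find all missing values: a boolean mask, plus a count per column.
missing_mask = df.isna()
missing_per_column = missing_mask.sum()

# Fill missing values with zeros.
filled_zero = df.fillna(0)

# Fill missing values in one column with that column's mean.
filled_mean = df.copy()
filled_mean["assets"] = filled_mean["assets"].fillna(filled_mean["assets"].mean())

# Drop rows with any missing data.
complete_rows = df.dropna()

# Find duplicate rows; keep=False flags every member of a duplicate set.
dup_flags = df.duplicated(keep=False)

# Drop all but the last row in each set of duplicate rows.
last_only = df.drop_duplicates(keep="last")
```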
- Filtering and aggregation:
- Using the ‘tips’ dataset, filter the rows where the total bill is greater than $10.
- Create a new column in the ‘tips’ dataset called ‘bill_per_person’ which is the total bill divided by the size of the party.
- Group by the ‘day’ column and compute the average total bill for each day.
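
The filtering and aggregation steps above can be sketched as follows. To keep the example self-contained, it builds a small stand-in dataframe with made-up numbers; it assumes the real dataset is the one shipped with seaborn (loadable with `seaborn.load_dataset("tips")`), whose column names `total_bill`, `tip`, `day`, and `size` are reused here.

```python
import pandas as pd

# Small stand-in for the 'tips' dataset (made-up rows, same column names).
tips = pd.DataFrame({
    "total_bill": [8.50, 16.99, 24.00, 9.75, 30.00],
    "tip": [1.50, 3.00, 4.00, 2.00, 5.00],
    "day": ["Thur", "Sun", "Sun", "Thur", "Sat"],
    "size": [1, 2, 4, 1, 3],
})

# Filter the rows where the total bill is greater than $10.
over_10 = tips[tips["total_bill"] > 10]

# Total bill divided by party size. Use bracket access for 'size',
# because tips.size is the dataframe's element count, not the column.
tips["bill_per_person"] = tips["total_bill"] / tips["size"]

# Group by day and compute the average total bill for each day.
avg_bill_by_day = tips.groupby("day")["total_bill"].mean()
```

In the seaborn version of the data, `day` is a categorical column, so recent pandas versions may warn unless `observed=True` (or `observed=False`) is passed to `groupby`.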
- WRDS: Create a dataset containing monthly CRSP returns and book-to-market ratios for 1970 through 2023. Use the Fama-French June 30 convention for book-to-market. Follow the Fama-French definitions for book equity and market equity (variable definitions are at Kenneth French's data library). You may use any resource for assistance except other people; for example, WRDS may offer relevant guidance.
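
The timing part of the WRDS task can be illustrated with toy data. The sketch below assumes the usual reading of the June 30 convention: returns from July of year t through June of year t+1 are paired with book equity for the fiscal year ending in calendar year t-1, divided by market equity at the end of December of year t-1. All names (`annual`, `monthly`, `be`, `me_dec`, `bm`, `bm_year`) and all numbers are hypothetical. It does not pull data from WRDS or construct book equity, and the definitions should be checked against French's data library.

```python
import pandas as pd

# Hypothetical annual inputs for one firm: book equity for the fiscal year
# ending in calendar year `bm_year`, and market equity at the end of December
# of that same year.
annual = pd.DataFrame({
    "permno": [10001, 10001],
    "bm_year": [1999, 2000],
    "be": [120.0, 150.0],
    "me_dec": [200.0, 300.0],
})
annual["bm"] = annual["be"] / annual["me_dec"]

# Hypothetical monthly returns for the same firm.
monthly = pd.DataFrame({
    "permno": [10001, 10001, 10001, 10001],
    "date": pd.to_datetime(["2000-06-30", "2000-07-31", "2001-06-29", "2001-07-31"]),
    "ret": [0.01, 0.02, -0.01, 0.03],
})

# Assumed convention: July of year t through June of year t+1 uses the
# ratio from year t-1, so January-June months look back one extra year.
first_half = (monthly["date"].dt.month <= 6).astype(int)
monthly["bm_year"] = monthly["date"].dt.year - 1 - first_half

# Left merge keeps every monthly return, even when no ratio is available.
merged = monthly.merge(
    annual[["permno", "bm_year", "bm"]],
    on=["permno", "bm_year"],
    how="left",
)
```

In this toy example the June 2000 return has no matching ratio, because it would need data from 1998. July 2000 through June 2001 use the 1999 ratio, and July 2001 switches to the 2000 ratio.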