- 🏆 20 points available
- ✏️ Last updated on 9/13/2022
▶️ First, run the code cell below to import unittest, a module used for 🧭 Check Your Work sections and the autograder.
# DO NOT MODIFY THE CODE IN THIS CELL
import unittest
tc = unittest.TestCase()

🎯 Challenge 1: Import Pandas and NumPy
👇 Tasks
- ✔️ Import the following Python packages.
  - pandas: Use alias pd.
  - numpy: Use alias np.
# YOUR CODE BEGINS
# YOUR CODE ENDS

🧭 Check Your Work
Run the code cell below to test your solution.
- ✔️ If the code cell runs without errors, you’re good to move on.
- ❌ If the code cell produces an error, review your code and fix any mistakes.
_test_case = 'import-pandas-numpy'
_points = 2
tc.assertTrue("pd" in globals(), "Check whether you have correctly imported Pandas with an alias.")
tc.assertTrue("np" in globals(), "Check whether you have correctly imported NumPy with an alias.")

🎯 Challenge 2: Create a Pandas Series
👇 Tasks
- ✔️ Create a new Pandas Series named sample_series with the following four values: -20, -10, 10, 20
🚀 Hint
The code below creates a new Pandas Series with the values 1 and 2.
my_new_series = pd.Series([1, 2])

# YOUR CODE BEGINS
# YOUR CODE ENDS
print(sample_series)

🧭 Check Your Work
Run the code cell below to test your solution.
- ✔️ If the code cell runs without errors, you’re good to move on.
- ❌ If the code cell produces an error, review your code and fix any mistakes.
_test_case = 'create-a-pandas-series'
_points = 2
pd.testing.assert_series_equal(sample_series, pd.Series(x * 10 for x in [-2, -1, 1, 2]))

🎯 Challenge 3: Create a Pandas DataFrame
👇 Tasks
- ✔️ You are given two lists, brands and rankings, that contain the names of brands and their rankings.
- ✔️ Using the two lists, create a new Pandas DataFrame named df_brands that has the following two columns:
  - brand: Names of the brands
  - ranking: Ranking of the brands
- ✔️ Note that the column names are singular.
🚀 Hint
The code below creates a new Pandas DataFrame from two series.
my_new_dataframe = pd.DataFrame({
    "column_one": my_series1,
    "column_two": my_series2
})

🔑 Expected Output
| | brand | ranking |
|---|---|---|
| 0 | Apple | 1 |
| 1 | Amazon | 2 |
| 2 | Google | 3 |
brands = ["Apple", "Amazon", "Google"]
rankings = [1, 2, 3]
# YOUR CODE BEGINS
# YOUR CODE ENDS
display(df_brands)

🧭 Check Your Work
Run the code cell below to test your solution.
- ✔️ If the code cell runs without errors, you’re good to move on.
- ❌ If the code cell produces an error, review your code and fix any mistakes.
_test_case = 'create-a-pandas-dataframe'
_points = 2
pd.testing.assert_frame_equal(
    df_brands.reset_index(drop=True),
    pd.DataFrame(
        {"brand": {0: "Apple", 1: "Amazon", 2: "Google"},
 "ranking": {0: 1, 1: 2, 2: 3}})
)

Exercises using the Maven Toys Dataset
For the remainder of this exercise, you’ll be working with toy products data.
Data Source: Maven Analytics Datasets
📌 Load data
▶️ Run the code cell below to create a new DataFrame named df_products.
df_products = pd.read_csv("https://raw.githubusercontent.com/bdi475/datasets/main/maven-toys-data/products.csv")
# Used to keep a clean copy
df_products_copy = df_products.copy()
# Display the first 5 rows
df_products.head()

The table below describes the columns in df_products.
| Field | Description | 
|---|---|
| Product_ID | Product ID | 
| Product_Name | Product name | 
| Product_Category | Product category |
| Product_Cost | Product cost (USD) | 
| Product_Price | Product retail price (USD) | 
🎯 Challenge 4: Find the number of rows and columns
👇 Tasks
- ✔️ Store the number of rows in df_products to a new variable named num_rows.
- ✔️ Store the number of columns in df_products to a new variable named num_cols.
- ✔️ Use .shape, not len().
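🚀 Hint
The sketch below shows how .shape behaves on a small made-up DataFrame. The name df_example and its contents are assumptions for illustration only and are not part of the exercise data.

```python
import pandas as pd

# Hypothetical three-row, two-column DataFrame (illustration only)
df_example = pd.DataFrame({
    "item": ["pen", "notebook", "eraser"],
    "price": [1.5, 3.0, 0.75],
})

# .shape is a tuple: (number of rows, number of columns)
example_shape = df_example.shape
print(example_shape)     # (3, 2)
print(example_shape[0])  # 3, the number of rows
print(example_shape[1])  # 2, the number of columns
```

Because .shape is a plain tuple, each element can be read by its position.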
# YOUR CODE BEGINS
# YOUR CODE ENDS
print(f"Number of rows: {num_rows}")
print(f"Number of columns: {num_cols}")

🧭 Check Your Work
Run the code cell below to test your solution.
- ✔️ If the code cell runs without errors, you’re good to move on.
- ❌ If the code cell produces an error, review your code and fix any mistakes.
_test_case = 'find-num-rows-and-cols'
_points = 2
tc.assertEqual(num_rows, len(df_products_copy.index), f"Number of rows should be {len(df_products_copy.index)}")
tc.assertEqual(num_cols, len(df_products_copy.columns), f"Number of columns should be {len(df_products_copy.columns)}")

🎯 Challenge 5: Find all games
👇 Tasks
- ✔️ Using df_products, find all products in the "Games" category (df_products["Product_Category"] == "Games").
- ✔️ Store the result to a new variable named df_games.
- ✔️ df_products should remain unaltered.
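🚀 Hint
The sketch below filters rows with a boolean condition on a small made-up DataFrame. The names df_pets, is_dog, and df_dogs are assumptions for illustration only; the same pattern should carry over to df_products.

```python
import pandas as pd

# Hypothetical data (illustration only)
df_pets = pd.DataFrame({
    "name": ["Rex", "Tom", "Bella", "Kitty"],
    "species": ["dog", "cat", "dog", "cat"],
})

# Comparing a column to a value yields a boolean Series (one True/False per row)
is_dog = df_pets["species"] == "dog"

# Indexing with that boolean Series returns a new DataFrame containing only
# the rows marked True; df_pets itself is not modified
df_dogs = df_pets[is_dog]
print(df_dogs)
```

Note that the filtered rows keep their original index labels (0 and 2 here), which is why the expected output tables in this exercise show non-consecutive index values.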
🔑 Expected Output of df_games
| | Product_ID | Product_Name | Product_Category | Product_Cost | Product_Price |
|---|---|---|---|---|---|
| 3 | 4 | Chutes & Ladders | Games | 9.99 | 12.99 | 
| 4 | 5 | Classic Dominoes | Games | 7.99 | 9.99 | 
| 7 | 8 | Deck Of Cards | Games | 3.99 | 6.99 | 
| 13 | 14 | Glass Marbles | Games | 5.99 | 10.99 | 
| 15 | 16 | Jenga | Games | 2.99 | 9.99 | 
| 21 | 22 | Monopoly | Games | 13.99 | 19.99 | 
| 29 | 30 | Rubik’s Cube | Games | 17.99 | 19.99 | 
| 34 | 35 | Uno Card Game | Games | 3.99 | 7.99 | 
# YOUR CODE BEGINS
# YOUR CODE ENDS
display(df_games)

🧭 Check Your Work
Run the code cell below to test your solution.
- ✔️ If the code cell runs without errors, you’re good to move on.
- ❌ If the code cell produces an error, review your code and fix any mistakes.
_test_case = 'find-all-games'
_points = 3
import base64
q = b'UHJvZHVjdF9DYXRlZ29yeSA9PSAnR2FtZXMn'
pd.testing.assert_frame_equal(
    df_games.sort_values(df_games.columns.to_list()).reset_index(drop=True),
    df_products_copy.query(base64.b64decode(q).decode('ascii')).sort_values(df_products_copy.columns.to_list()).reset_index(drop=True)
)
pd.testing.assert_frame_equal(
    df_products.reset_index(drop=True),
    df_products_copy.reset_index(drop=True),
    "The original DataFrame should remain unchanged."
)

🎯 Challenge 6: Find electronics with a product cost over $10
👇 Tasks
- ✔️ Using df_products, find all products that match the following two conditions:
  - in the "Electronics" category (df_products["Product_Category"] == "Electronics")
  - and the product cost is over 10 dollars (df_products["Product_Cost"] > 10).
- ✔️ Store the result to a new variable named df_electronics_over_10.
- ✔️ df_products should remain unaltered.
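🚀 Hint
The sketch below combines two conditions with the & operator on a small made-up DataFrame. The names df_pets and df_old_cats are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data (illustration only)
df_pets = pd.DataFrame({
    "name": ["Rex", "Tom", "Bella", "Kitty"],
    "species": ["dog", "cat", "dog", "cat"],
    "age": [7, 3, 2, 9],
})

# Wrap each condition in its own parentheses and join them with & ("and").
# The parentheses matter because & binds more tightly than == and >.
df_old_cats = df_pets[(df_pets["species"] == "cat") & (df_pets["age"] > 5)]
print(df_old_cats)
```

Python's and keyword does not work element-wise on a Series, so & is used here instead.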
🔑 Expected Output of df_electronics_over_10
| | Product_ID | Product_Name | Product_Category | Product_Cost | Product_Price |
|---|---|---|---|---|---|
| 12 | 13 | Gamer Headphones | Electronics | 14.99 | 20.99 | 
| 33 | 34 | Toy Robot | Electronics | 20.99 | 25.99 | 
# YOUR CODE BEGINS
# YOUR CODE ENDS
display(df_electronics_over_10)

🧭 Check Your Work
Run the code cell below to test your solution.
- ✔️ If the code cell runs without errors, you’re good to move on.
- ❌ If the code cell produces an error, review your code and fix any mistakes.
_test_case = 'find-electronics-over-10-dollars'
_points = 4
import base64
q = b'KFByb2R1Y3RfQ2F0ZWdvcnkgPT0gJ0VsZWN0cm9uaWNzJykgJiAoUHJvZHVjdF9Db3N0ID4gMTAp'
pd.testing.assert_frame_equal(
    df_electronics_over_10.sort_values(df_electronics_over_10.columns.to_list()).reset_index(drop=True),
    df_products_copy.query(base64.b64decode(q).decode('ascii')).sort_values(df_products_copy.columns.to_list()).reset_index(drop=True)
)
pd.testing.assert_frame_equal(
    df_products.reset_index(drop=True),
    df_products_copy.reset_index(drop=True),
    "The original DataFrame should remain unchanged."
)

🎯 Challenge 7: Sort by Price in Descending Order
👇 Tasks
- ✔️ Sort df_products by price (Product_Price column) in descending order.
- ✔️ Store the sorted result to a new variable named df_sorted_by_price.
- ✔️ df_products should remain unaltered.
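🚀 Hint
The sketch below sorts a small made-up DataFrame by one column in descending order. The names df_pets and df_oldest_first are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data (illustration only)
df_pets = pd.DataFrame({
    "name": ["Rex", "Tom", "Bella", "Kitty"],
    "age": [7, 3, 2, 9],
})

# sort_values() returns a new, sorted DataFrame by default;
# ascending=False puts the largest values first
df_oldest_first = df_pets.sort_values("age", ascending=False)
print(df_oldest_first)
```

Since sort_values() returns a new DataFrame unless inplace=True is passed, the original stays in its initial order.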
# YOUR CODE BEGINS
# YOUR CODE ENDS
display(df_sorted_by_price)

🧭 Check Your Work
Run the code cell below to test your solution.
- ✔️ If the code cell runs without errors, you’re good to move on.
- ❌ If the code cell produces an error, review your code and fix any mistakes.
_test_case = 'sort-by-price-desc'
_points = 2
pd.testing.assert_series_equal(
    df_sorted_by_price["Product_Price"].reset_index(drop=True),
    df_products_copy.sort_values("Product_Price").iloc[::-1]["Product_Price"].reset_index(drop=True)
)
pd.testing.assert_frame_equal(
    df_products.reset_index(drop=True),
    df_products_copy.reset_index(drop=True),
    "The original DataFrame should remain unchanged."
)

🎯 Challenge 8: Sort by Product Category and Product Price
👇 Tasks
- ✔️ Sort df_products by product category in ascending order and then by product price in descending order for products within each category.
- ✔️ Store the sorted result to a new variable named df_sorted_by_category_price.
- ✔️ If two rows have the same product category and the same price, the order of those two rows doesn’t matter.
- ✔️ df_products should remain unaltered.
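🚀 Hint
The sketch below sorts a small made-up DataFrame by two columns with a different direction for each. The names df_pets and df_by_species_age are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data (illustration only)
df_pets = pd.DataFrame({
    "name": ["Rex", "Tom", "Bella", "Kitty"],
    "species": ["dog", "cat", "dog", "cat"],
    "age": [7, 3, 2, 9],
})

# Pass a list of columns and a matching list of directions:
# species ascending (A to Z), then age descending within each species
df_by_species_age = df_pets.sort_values(
    ["species", "age"],
    ascending=[True, False],
)
print(df_by_species_age)
```

The two lists are matched by position, so the first direction applies to the first column and the second direction to the second column.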
🔑 Sample Output of df_sorted_by_category_price
| | Product_ID | Product_Name | Product_Category | Product_Cost | Product_Price |
|---|---|---|---|---|---|
| 25 | 26 | PlayDoh Playset | Art & Crafts | 20.99 | 24.99 | 
| 10 | 11 | Etch A Sketch | Art & Crafts | 10.99 | 20.99 | 
| 16 | 17 | Kids Makeup Kit | Art & Crafts | 13.99 | 19.99 | 
| 18 | 19 | Magic Sand | Art & Crafts | 13.99 | 15.99 | 
| 27 | 28 | Playfoam | Art & Crafts | 3.99 | 10.99 | 
| 26 | 27 | PlayDoh Toolkit | Art & Crafts | 3.99 | 4.99 | 
| 2 | 3 | Barrel O’ Slime | Art & Crafts | 1.99 | 3.99 | 
| 24 | 25 | PlayDoh Can | Art & Crafts | 1.99 | 2.99 | 
| 33 | 34 | Toy Robot | Electronics | 20.99 | 25.99 | 
| 12 | 13 | Gamer Headphones | Electronics | 14.99 | 20.99 | 
| 5 | 6 | Colorbuds | Electronics | 6.99 | 14.99 | 
| 21 | 22 | Monopoly | Games | 13.99 | 19.99 | 
| 29 | 30 | Rubik’s Cube | Games | 17.99 | 19.99 | 
| 3 | 4 | Chutes & Ladders | Games | 9.99 | 12.99 | 
| 13 | 14 | Glass Marbles | Games | 5.99 | 10.99 | 
| 4 | 5 | Classic Dominoes | Games | 7.99 | 9.99 | 
| 15 | 16 | Jenga | Games | 2.99 | 9.99 | 
| 34 | 35 | Uno Card Game | Games | 3.99 | 7.99 | 
| 7 | 8 | Deck Of Cards | Games | 3.99 | 6.99 | 
| 19 | 20 | Mini Basketball Hoop | Sports & Outdoors | 8.99 | 24.99 | 
| 23 | 24 | Nerf Gun | Sports & Outdoors | 14.99 | 19.99 | 
| 6 | 7 | Dart Gun | Sports & Outdoors | 11.99 | 15.99 | 
| 31 | 32 | Supersoaker Water Gun | Sports & Outdoors | 11.99 | 14.99 | 
| 11 | 12 | Foam Disk Launcher | Sports & Outdoors | 8.99 | 11.99 | 
| 20 | 21 | Mini Ping Pong Set | Sports & Outdoors | 6.99 | 9.99 | 
| 30 | 31 | Splash Balls | Sports & Outdoors | 7.99 | 8.99 | 
| 17 | 18 | Lego Bricks | Toys | 34.99 | 39.99 | 
| 28 | 29 | Plush Pony | Toys | 8.99 | 19.99 | 
| 0 | 1 | Action Figure | Toys | 9.99 | 15.99 | 
| 9 | 10 | Dinosaur Figures | Toys | 10.99 | 14.99 | 
| 1 | 2 | Animal Figures | Toys | 9.99 | 12.99 | 
| 32 | 33 | Teddy Bear | Toys | 10.99 | 12.99 | 
| 8 | 9 | Dino Egg | Toys | 9.99 | 10.99 | 
| 22 | 23 | Mr. Potatohead | Toys | 4.99 | 9.99 | 
| 14 | 15 | Hot Wheels 5-Pack | Toys | 3.99 | 5.99 | 
# YOUR CODE BEGINS
# YOUR CODE ENDS
display(df_sorted_by_category_price)

🧭 Check Your Work
Run the code cell below to test your solution.
- ✔️ If the code cell runs without errors, you’re good to move on.
- ❌ If the code cell produces an error, review your code and fix any mistakes.
_test_case = 'sort-by-cat-asc-name-desc'
_points = 3
sample_sorted = df_products_copy.sort_values(["Product_Price", "Product_Category"][::-1], ascending=[False, True]).iloc[::-1]
pd.testing.assert_series_equal(
    df_sorted_by_category_price["Product_Category"].reset_index(drop=True),
    sample_sorted["Product_Category"].reset_index(drop=True)
)
pd.testing.assert_series_equal(
    df_sorted_by_category_price["Product_Price"].reset_index(drop=True),
    sample_sorted["Product_Price"].reset_index(drop=True)
)
pd.testing.assert_frame_equal(
    df_products.reset_index(drop=True),
    df_products_copy.reset_index(drop=True),
    "The original DataFrame should remain unchanged."
)