Skip to article frontmatterSkip to article content

Basic Queries in SQLite

SQLite is a lightweight, serverless, self-contained SQL database engine. It is widely used in applications ranging from small mobile apps to large-scale web applications due to its simplicity and ease of integration.

While SQLite can be used directly through SQL commands, we will use Python’s built-in sqlite3 module to interact with SQLite databases. This allows us to leverage the power of SQL while working within a Python environment.

▶️ Import pandas, numpy, and sqlite3.

import pandas as pd
import numpy as np
import sqlite3

💻 Connecting to a Database

Because SQLite is self-contained, connecting to a database is as simple as specifying the database file. If the file does not exist, SQLite will create it.

▶️ Run the code below to create a new SQLite database file.

# Connect to my_database.db database file
# If the file does not exist, create a new file
conn = sqlite3.connect("my_database.db")

▶️ Run the code below to create a new employees table.

# Create a cursor
# A cursor enables users of a DBMS to traverse through the result set
# without having to retrieve all results at once
c = conn.cursor()

# Drop (delete) employees table if it already exists
c.execute("DROP TABLE IF EXISTS employees")
conn.commit()

# Define a query
# Triple quotes (""") denote multiline strings
create_table_query = """
CREATE TABLE IF NOT EXISTS employees (
  emp_id INTEGER PRIMARY KEY,
  name TEXT NOT NULL,
  department TEXT,
  salary REAL
);
"""

c.execute(create_table_query)
conn.commit()

We can use the conn object to execute SQL commands and manage the database. The cursor object allows us to execute SQL queries and fetch results.

▶️ Run the code below to insert sample data into the employees table.

c.executescript(
    """
INSERT INTO employees (name, department, salary) VALUES ('Mia', 'HR', 140000.0);
INSERT INTO employees (name, department, salary) VALUES ('Michael', 'Marketing', 200000.0);
INSERT INTO employees (name, department, salary) VALUES ('Sean', 'HR', 250000.0);
INSERT INTO employees (name, department, salary) VALUES ('Dylan', 'Marketing', 205000.0);
"""
)

conn.commit()
LineFunctionExplanation
executescript()Run multiple SQL commandsInserts multiple employee records
commit()Save changesWrites all changes permanently

🔍 Retrieving Data

We can use pandas to run SQL queries and load the results into a DataFrame for easier analysis. This is easier than using the cursor object directly for data retrieval.

▶️ Query all records from the employees table and load them into a DataFrame named df.

Note that each row (record) corresponds to an employee, and each column corresponds to an attribute of the employee (e.g., name, department, salary).

df = pd.read_sql_query("SELECT * FROM employees;", con=conn)
display(df)
Loading...

▶️ Query the employees table for Marketing department employees sorted by salary in descending order.

df = pd.read_sql_query(
    """
SELECT emp_id, name, salary
FROM employees
WHERE department = "Marketing"
ORDER BY salary DESC;
""",
    con=conn,
)
display(df)
Loading...

📚 Delete and Update Records

SQLite supports the standard CRUD (Create, Read, Update, Delete) operations. We have already seen how to create tables and read data. Let’s look at how to update and delete records.


➖ Delete Records

The DELETE statement is used to remove records from a table. You can specify which records to delete using a WHERE clause.

▶️ Delete employees from the "HR" department.

c.execute("DELETE FROM employees WHERE department = 'HR';")
conn.commit()

You can verify the deletion by querying the employees table again. You should see that the employees from the HR department have been removed.

▶️ Print all remaining employees to verify the deletion.

df = pd.read_sql_query("SELECT * FROM employees;", con=conn)
display(df)
Loading...

➕ Update Records

The UPDATE statement is used to modify existing records in a table. You can specify which records to update using a WHERE clause.

Assume Michael has moved to the “Purchasing” department, and we need to update his record accordingly.

▶️ Update Michael’s department to "Purchasing".

c.execute("UPDATE employees SET department = 'Purchasing' WHERE name = 'Michael';")
conn.commit()

▶️ Print all employees to verify the update.

df = pd.read_sql_query("SELECT * FROM employees;", con=conn)
display(df)
Loading...

🆚 Comparing Pandas and SQL using Airbnb Listings

In this section, we will compare how to perform similar tasks using both pandas and SQL. We will use a dataset of Airbnb listings for this comparison.

▶️ Create a new SQLite database file named airbnb.db and create a table named listings.

# Connect to airbnb.db database file
# If the file does not exist, create a new file
conn = sqlite3.connect("airbnb.db")

# Create a cursor
# A cursor enables users of a DBMS to traverse through the result set
# without having to retrieve all results at once
c = conn.cursor()

# Drop (delete) listings table if it already exists
c.execute("DROP TABLE IF EXISTS listings")
conn.commit()

# Define a query
# Triple quotes (''') denote multiline strings
create_table_query = """
CREATE TABLE IF NOT EXISTS listings (
    neighbourhood TEXT,
    room_type TEXT,
    bathrooms REAL,
    bedrooms REAL,
    price REAL,
    number_of_reviews INTEGER,
    review_score REAL,
    is_superhost INT
)
"""

c.execute(create_table_query)
conn.commit()

▶️ Read a CSV file containing Airbnb listings into a pandas DataFrame.

df_listings = pd.read_csv(
    "https://github.com/bdi475/datasets/raw/main/chicago-airbnb-listings-small.csv"
)

display(df_listings.head(5))
Loading...

▶️ Populate the listings table with data from the CSV file.

tables = list(
    pd.read_sql_query("SELECT * FROM sqlite_master WHERE type='table';", con=conn)[
        "tbl_name"
    ]
)
if "listings" in tables:
    c.execute(f"DELETE FROM listings")
    conn.commit()

df_listings.to_sql(name="listings", index=False, con=conn, if_exists="append")
826

📌 Example 1: Select all rows and columns

▶️ Use pandas.

display(df_listings)
Loading...

▶️ Use SQLite.

query_select_all = """
SELECT *
FROM listings
"""

df_result = pd.read_sql_query(query_select_all, con=conn)
display(df_result)
Loading...

📌 Example 2: Select only three columns

Select only the "bathrooms", "bedrooms", and "price" columns.

▶️ Use pandas.

df_listings[["bathrooms", "bedrooms", "price"]]
Loading...

▶️ Use SQLite.

query_select_three_columns = """
SELECT bathrooms, bedrooms, price
FROM listings
"""

df_result = pd.read_sql_query(query_select_three_columns, con=conn)
display(df_result)
Loading...

📌 Example 3: Select all columns of expensive listings

Filter listings where the price is greater than $400.

▶️ Use pandas.

df_expensive = df_listings[df_listings["price"] > 400]

display(df_expensive)
Loading...

▶️ Use SQLite.

query_expensive_listings = """
SELECT *
FROM listings
WHERE price > 400
"""

df_result = pd.read_sql_query(query_expensive_listings, con=conn)
display(df_result)
Loading...

📌 Example 4: Select all columns of West Town listings

Filter listings located in the "West Town" neighborhood.

▶️ Use pandas.

df_west_town = df_listings[df_listings["neighbourhood"] == "West Town"]

display(df_west_town)
Loading...

▶️ Use SQLite.

query_west_town_listings = """
SELECT *
FROM listings
WHERE neighbourhood = 'West Town'
"""

df_result = pd.read_sql_query(query_west_town_listings, con=conn)
display(df_result)
Loading...

📌 Example 5: Sort all listings by price

Sort all listings by "price" in ascending order. Both pandas and SQL sort in ascending order by default.

▶️ Use pandas.

df_sorted_by_price = df_listings.sort_values(by="price")

display(df_sorted_by_price)
Loading...

▶️ Use SQLite.

query_sort_by_price = """
SELECT *
FROM listings
ORDER BY price
"""

df_result = pd.read_sql_query(query_sort_by_price, con=conn)
display(df_result)
Loading...

📌 Example 6: Select only the price column and sort in descending order

Select only the "price" column and sort the results in descending order.

▶️ Use pandas.

df_price = df_listings[["price"]].sort_values(by="price", ascending=False)

display(df_price)
Loading...

▶️ Use SQLite.

query_sort_by_price_desc = """
SELECT price
FROM listings
ORDER BY price DESC
"""

df_result = pd.read_sql_query(query_sort_by_price_desc, con=conn)
display(df_result)
Loading...

📌 Example 7: A complicated query

Find all listings with a price less than $100 and more than 500 reviews, sorted by number of reviews in descending order. Additionally, only select the "room_type", "bathrooms", "bedrooms", "price", "number_of_reviews" columns.

▶️ Use pandas.

df_complicated = df_listings[
    (df_listings["price"] < 100) & (df_listings["number_of_reviews"] > 500)
].sort_values(by="number_of_reviews", ascending=False)
df_complicated = df_complicated[
    ["room_type", "bathrooms", "bedrooms", "price", "number_of_reviews"]
]

display(df_complicated)
Loading...

▶️ Use SQLite.

query_final = """
SELECT room_type, bathrooms, bedrooms, price, number_of_reviews
FROM listings
WHERE (price < 100) AND (number_of_reviews > 500)
ORDER BY number_of_reviews DESC
"""

df_result = pd.read_sql_query(query_final, con=conn)
display(df_result)
Loading...