How to Build an Automated Ulta Price Tracker Using Python and Playwright

Cloud Developer Hub is a space for developers exploring cloud, DevOps, and scalable systems.

Monitoring beauty product prices manually is a losing game. Between flash sales, "Buy 2 Get 1 Free" deals, and dynamic pricing, the cost of skincare or makeup fluctuates daily. If you're a developer, the solution isn't checking the website every morning; it's building a system to do it for you.

This guide covers how to build a working price and promotion tracker for Ulta.com. We'll use Python and Playwright to extract data, automate the process with GitHub Actions, and visualize the findings using a Streamlit dashboard.

By the end of this tutorial, you'll have a pipeline that automatically identifies price drops and active promotions across a personalized beauty wishlist.

Prerequisites & Setup

To follow along, you'll need a basic understanding of Python. We will use the Ulta.com-Scrapers repository as a foundation for the extraction logic.

1. Clone the Repository

First, grab the starter code. Open your terminal and run:

git clone https://github.com/scraper-bank/Ulta.com-Scrapers.git
cd Ulta.com-Scrapers/python/playwright

2. Install Dependencies

Use a virtual environment to keep your global Python installation clean.

python -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate
pip install playwright playwright-stealth pandas streamlit
playwright install chromium

3. Get a ScrapeOps API Key

Ulta employs anti-bot measures that can block standard scraper requests. To ensure high success rates, we'll use the ScrapeOps Proxy Aggregator to route requests through reliable residential IPs.

  1. Sign up for a free ScrapeOps account.

  2. Copy your API key from the dashboard.

  3. Open product_data/scraper/ulta_scraper_product_data_v1.py and replace YOUR-API_KEY with your actual key.


Step 1: Configuring the Playwright Scraper

The repository includes a script designed to extract deep product details. Let's examine the ScrapedData dataclass in the Playwright implementation. This schema is designed to catch both the current price and the preDiscountPrice.

@dataclass
class ScrapedData:
    name: str = ""
    brand: str = ""
    price: float = 0.0
    preDiscountPrice: Optional[float] = None
    productId: str = ""
    url: str = ""
    # ... other fields

The scraper uses a detect_currency helper and targets JSON-LD (structured data) embedded in the page. This is more reliable than standard CSS selectors because e-commerce sites change their layouts frequently, but they rarely break the structured data used for SEO.
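
To make the JSON-LD idea concrete, here is a minimal sketch of pulling a Product object out of a page's structured data using only the standard library. It is not the repository's implementation: the names JsonLdParser and extract_product_jsonld and the sample HTML are hypothetical, and a real product page may nest its data differently.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_product_jsonld(html):
    """Return the first JSON-LD object whose @type is Product, or None."""
    parser = JsonLdParser()
    parser.feed(html)
    for raw in parser.blocks:
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # Skip malformed blocks instead of failing the scrape
        candidates = payload if isinstance(payload, list) else [payload]
        for item in candidates:
            if isinstance(item, dict) and item.get("@type") == "Product":
                return item
    return None


# Hypothetical sample page, not real Ulta markup
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "Product", "name": "Example Serum",
 "offers": {"@type": "Offer", "price": "88.00", "priceCurrency": "USD"}}
</script>
</head><body></body></html>
"""

product = extract_product_jsonld(SAMPLE_HTML)
price = float(product["offers"]["price"])
print(product["name"], price)
```

Because the lookup keys on the @type field rather than on page layout, a redesign that moves the price element around would not affect this kind of extraction as long as the structured data stays in place.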

To track a specific wishlist, modify the execution loop to iterate through your target URLs:

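import asyncio
from playwright.async_api import async_playwright

# Create a list of products you want to monitor
wishlist_urls = [
    "https://www.ulta.com/p/advanced-night-repair-synchronized-multi-recovery-complex-pimprod2017431",
    "https://www.ulta.com/p/double-wear-stay-in-place-foundation-xlsImpprod1460283"
]

async def main():
    async with async_playwright() as p:
        # PROXY_CONFIG, DataPipeline, and extract_data come from the repository script
        browser = await p.chromium.launch(proxy=PROXY_CONFIG)
        pipeline = DataPipeline(jsonl_filename="wishlist_prices.jsonl")

        for url in wishlist_urls:
            page = await browser.new_page()
            try:
                await page.goto(url)
                data = await extract_data(page)
                if data:
                    data.url = url
                    pipeline.add_data(data)
            finally:
                # Close the page even if navigation or extraction fails
                await page.close()

        await browser.close()

# Keep the script's existing entry point if it already has one
if __name__ == "__main__":
    asyncio.run(main())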

Step 2: Running the Scraper and Saving Data

Run the script from your terminal:

python product_data/scraper/ulta_scraper_product_data_v1.py

The scraper generates a .jsonl (JSON Lines) file. Unlike a standard JSON array, JSONL stores one object per line. This format is ideal for price tracking because it is streamable and append-friendly. Each day the scraper runs, we can append new lines to the file without loading the entire historical dataset into memory.
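
The append-friendly property can be sketched in a few lines. This is an illustration under stated assumptions, not the repository's DataPipeline: the helpers append_record and iter_records are hypothetical, and the scraped_at timestamp is an added field that the original schema does not show.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def append_record(path, record):
    """Append one JSON object as a single line; existing lines are untouched."""
    stamped = dict(record)
    # Hypothetical field: records when the observation was made
    stamped["scraped_at"] = datetime.now(timezone.utc).isoformat()
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(stamped) + "\n")


def iter_records(path):
    """Yield records one line at a time instead of loading the whole file."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


# Demo against a temporary file so nothing in the project is overwritten
demo_path = os.path.join(tempfile.mkdtemp(), "wishlist_prices.jsonl")
append_record(demo_path, {"name": "Advanced Night Repair", "price": 88.0})
append_record(demo_path, {"name": "Advanced Night Repair", "price": 92.0})
records = list(iter_records(demo_path))
print(len(records), records[-1]["price"])
```

Opening the file in "a" mode means each run only writes its own lines, and the generator reads history lazily, which is the behavior a JSON array cannot offer without rewriting the whole file.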

A successful entry looks like this:

{"name": "Advanced Night Repair", "price": 88.0, "preDiscountPrice": 110.0, "productId": "2017431", "url": "..."}

If preDiscountPrice is higher than price, you've caught a sale.
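
That comparison can be expressed as a small helper. The sketch below assumes records shaped like the entry above; the function name sale_info is hypothetical, and it treats a missing or zero preDiscountPrice as "no sale" rather than an error.

```python
def sale_info(record):
    """Return savings details if the record is discounted, otherwise None."""
    price = record.get("price")
    original = record.get("preDiscountPrice")
    # A missing original price, or one that is not above the current price, means no sale
    if price is None or original is None or original <= 0 or original <= price:
        return None
    saved = original - price
    return {
        "saved": round(saved, 2),
        "percent": round(saved / original * 100, 1),
    }


entry = {"name": "Advanced Night Repair", "price": 88.0, "preDiscountPrice": 110.0}
info = sale_info(entry)
print(info)
```

For the sample entry, the helper reports 22.0 saved, which is 20.0 percent off the pre-discount price.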


Step 3: Automating Scrapes with GitHub Actions

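Running the script by hand every day defeats the purpose. GitHub Actions can run it on a schedule, and standard runners are free for public repositories (private repositories draw on a monthly free quota). Note that cron schedules in GitHub Actions use UTC, so the workflow below fires at 08:00 UTC, not 8:00 AM local time, and scheduled runs can start a few minutes late when the service is busy.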

Create a file at .github/workflows/daily_scrape.yml in your repository:

name: Daily Ulta Price Check

on:
  schedule:
    - cron: '0 8 * * *' # Runs at 08:00 UTC every day
  workflow_dispatch: # Allows manual trigger

jobs:
  scrape:
    runs-on: ubuntu-latest
    permissions:
      contents: write # Required so the "Commit Data" step can push
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.9'
      
      - name: Install dependencies
        run: |
          pip install playwright playwright-stealth
          playwright install chromium
          
      - name: Run Scraper
        env:
          SCRAPEOPS_API_KEY: ${{ secrets.SCRAPEOPS_API_KEY }}
        run: python python/playwright/product_data/scraper/ulta_scraper_product_data_v1.py

      - name: Commit Data
        run: |
          git config --global user.name 'PriceBot'
          git config --global user.email 'bot@github.com'
          git add wishlist_prices.jsonl
          git commit -m "Update daily prices [skip ci]" || exit 0
          git push

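To make this work, open your repository's Settings, navigate to Secrets and variables, then Actions, and add a repository secret named SCRAPEOPS_API_KEY. The workflow only exposes the key as an environment variable, so the script must read SCRAPEOPS_API_KEY from the environment rather than rely on a hard-coded value. This also keeps the key out of the commits the workflow pushes.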
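
A minimal sketch of reading the key from the environment follows. The helper names load_api_key and build_proxy_config are hypothetical, and the proxy server address and username are placeholders: take the real values from your ScrapeOps dashboard or from the PROXY_CONFIG already defined in the repository script. The only part assumed about Playwright is that launch(proxy=...) accepts a dict with server, username, and password keys.

```python
import os


def load_api_key(env=None):
    """Read SCRAPEOPS_API_KEY from the environment and fail fast if it is missing."""
    env = os.environ if env is None else env
    key = env.get("SCRAPEOPS_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SCRAPEOPS_API_KEY is not set")
    return key


def build_proxy_config(api_key, server="http://proxy.example.com:8000"):
    """Build a proxy dict in the shape Playwright's launch(proxy=...) expects.

    The server and username below are placeholders, not real ScrapeOps values.
    """
    return {
        "server": server,
        "username": "placeholder-user",
        "password": api_key,
    }


# Demo with an explicit mapping so the example runs without a real secret
config = build_proxy_config(load_api_key({"SCRAPEOPS_API_KEY": "demo-key"}))
print(sorted(config))
```

Failing fast on a missing key makes a misconfigured secret show up as a clear error in the workflow log instead of as a series of blocked requests.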


Step 4: Building the Analysis Dashboard

Once you have gathered data, you can visualize it. We'll use Pandas for data manipulation and Streamlit to create a web dashboard.

Create a file called dashboard.py:

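import streamlit as st
import pandas as pd
import json

def load_data(file_path):
    data = []
    with open(file_path, 'r') as f:
        for line in f:
            line = line.strip()
            if line:  # Skip blank lines so json.loads does not fail
                data.append(json.loads(line))
    return pd.DataFrame(data)

st.title("💄 Ulta Price & Promo Tracker")

df = load_data('wishlist_prices.jsonl')

# Missing prices arrive as None; coerce to NaN so the arithmetic below works
df['price'] = pd.to_numeric(df['price'], errors='coerce')
df['preDiscountPrice'] = pd.to_numeric(df['preDiscountPrice'], errors='coerce')

# Calculate Savings
df['discount_amount'] = df['preDiscountPrice'] - df['price']
df['savings_percent'] = (df['discount_amount'] / df['preDiscountPrice'] * 100).fillna(0)

# Filter for active deals
deals = df[df['price'] < df['preDiscountPrice']].copy()

st.header("🔥 Active Deals Detected")
if not deals.empty:
    st.dataframe(deals[['brand', 'name', 'price', 'preDiscountPrice', 'savings_percent']])
else:
    st.write("No active discounts found today. Check back tomorrow!")

st.header("Latest Prices")
# Each run appends a row per product, so chart only the most recent row for each name
latest = df.drop_duplicates(subset='name', keep='last')
st.bar_chart(latest.set_index('name')['price'])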

Launch the dashboard by running: streamlit run dashboard.py
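
The dashboard compares price against preDiscountPrice within a single record. Spotting a drop between runs requires comparing successive observations of the same product. The sketch below shows one way to do that in plain Python; the function name find_price_drops and the sample data are hypothetical, and it assumes records appear in the file in the order they were scraped.

```python
from collections import defaultdict


def find_price_drops(records):
    """Compare the two most recent observations per productId (file order = time order)."""
    history = defaultdict(list)
    for rec in records:
        product_id = rec.get("productId")
        if product_id and rec.get("price") is not None:
            history[product_id].append(rec)

    drops = []
    for product_id, rows in history.items():
        if len(rows) < 2:
            continue  # Need at least two observations to compare
        previous, latest = rows[-2], rows[-1]
        if latest["price"] < previous["price"]:
            drops.append({
                "productId": product_id,
                "name": latest.get("name", ""),
                "previous": previous["price"],
                "latest": latest["price"],
                "drop": round(previous["price"] - latest["price"], 2),
            })
    return drops


# Hypothetical observations from two consecutive runs
observations = [
    {"productId": "A1", "name": "Serum", "price": 110.0},
    {"productId": "B2", "name": "Foundation", "price": 52.0},
    {"productId": "A1", "name": "Serum", "price": 88.0},
    {"productId": "B2", "name": "Foundation", "price": 52.0},
]
drops = find_price_drops(observations)
print(drops)
```

Here only the product with ID A1 is reported, since its price fell from 110.0 to 88.0 while the other stayed flat.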


To Wrap Up

You've built a functional e-commerce intelligence tool. By combining the extraction logic from the Ulta.com-Scrapers repository with scheduled automation, you now have a system that:

  • Handles Anti-Bots: Uses ScrapeOps proxies to reduce the risk of IP blocks.

  • Captures Precise Data: Distinguishes between current prices and pre-discount prices.

  • Runs Automatically: GitHub Actions handles daily execution.

  • Visualizes Insights: Streamlit highlights the best deals.