scrape_leafly
Leafly Strain Scraper – Harvest ALL strains into terpene_profiles.yaml.
Extracts strain data directly from Leafly’s __NEXT_DATA__ JSON embedded in listing pages. Each listing page contains ~18 strains with full terpene profiles, effects, cannabinoids, and metadata.
Total: ~9000 strains across ~500 pages.
# 💀🔥 scraping the entire weed bible 🌿
#
# Usage:
#   python scrape_leafly.py                       # scrape ALL strains
#   python scrape_leafly.py --pages 5             # first 5 pages only
#   python scrape_leafly.py --merge               # merge into terpene_profiles.yaml
#   python scrape_leafly.py --output my_strains.yaml
- scrape_leafly.parse_listing_strain(raw)[source]
Parse a single strain from listing page __NEXT_DATA__.
Each strain object in the listing contains:
- slug, name, category
- terps: {terpene_name: {score: float}}
- effects: {effect_name: {score: float}}
- cannabinoids: {thc: {percentile50: float}, …}
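Given the field layout above, the flattening step can be sketched as follows. This is a minimal illustration, not the module's actual implementation; the exact key names and nesting (`terps`, `effects`, `cannabinoids.thc.percentile50`) are taken from the description above and should be treated as assumptions about Leafly's JSON, not a stable API.

```python
def parse_listing_strain(raw):
    """Flatten one raw strain dict from __NEXT_DATA__ into a plain record.

    Key names follow the layout described above (an assumption about
    Leafly's embedded JSON, which may change without notice).
    """
    return {
        "slug": raw.get("slug"),
        "name": raw.get("name"),
        "category": raw.get("category"),
        # Collapse {name: {score: x}} maps down to {name: x}.
        "terpenes": {t: v.get("score") for t, v in (raw.get("terps") or {}).items()},
        "effects": {e: v.get("score") for e, v in (raw.get("effects") or {}).items()},
        # Median THC, if present.
        "thc": (raw.get("cannabinoids", {}).get("thc") or {}).get("percentile50"),
    }

# A fabricated raw strain in the shape the listing JSON is described to use:
sample = {
    "slug": "blue-dream",
    "name": "Blue Dream",
    "category": "Hybrid",
    "terps": {"myrcene": {"score": 0.42}},
    "effects": {"happy": {"score": 0.61}},
    "cannabinoids": {"thc": {"percentile50": 18.0}},
}
record = parse_listing_strain(sample)
```

The value of flattening here is that each record becomes directly serializable into the YAML output without nested `{score: ...}` wrappers.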
- scrape_leafly.scrape_all_strains(max_pages=None, output_path='leafly_strains.yaml', page_delay=1.5)[source]
Scrape all Leafly strains from listing pages.
Each listing page’s __NEXT_DATA__ contains ~18 strains with terpene profiles, effects, and cannabinoid data. No need to visit individual strain pages.
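The `__NEXT_DATA__` extraction this relies on can be sketched like so. The `<script id="__NEXT_DATA__">` tag is standard Next.js behavior, but the path to the strain list inside the parsed JSON is a placeholder assumption, not Leafly's documented structure:

```python
import json
import re

# Next.js embeds the page's server-rendered state in this script tag.
NEXT_DATA_RE = re.compile(
    r'<script id="__NEXT_DATA__" type="application/json">(.*?)</script>',
    re.DOTALL,
)

def extract_next_data(html):
    """Pull the embedded Next.js JSON payload out of a listing page."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("no __NEXT_DATA__ script tag found")
    return json.loads(match.group(1))

# Minimal fabricated page for illustration; the "strains" key path
# inside pageProps is assumed, not confirmed.
page = (
    '<html><body>'
    '<script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"strains": [{"slug": "blue-dream"}]}}}'
    '</script></body></html>'
)
data = extract_next_data(page)
```

Because the full strain payload rides along in this one tag, a single GET per listing page yields ~18 complete records, which is what keeps the whole ~9000-strain crawl down to ~500 requests.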