# Quick Start Guide ## Simple Usage (Recommended) 1. **Make sure you're in the project directory:** ```bash cd pokemon-disco ``` 2. **Run the complete scraper and PDF generator:** ```bash ./run.sh ``` This single command will: - Set up the Python virtual environment - Install all required packages - Scrape Pokemon TCG products from Dollar General - Generate a professional PDF catalog with barcodes - Create timestamped files for easy organization ## What You'll Get ### Generated Files: - **`pokemon_tcg_products_YYYYMMDD_HHMMSS.json`** - Raw data in JSON format - **`catalog_output/pokemon_tcg_catalog_YYYYMMDD_HHMMSS.pdf`** - Professional PDF catalog ### PDF Catalog Contents: - Product images (downloaded automatically) - Product details (title, price, stock, SKU) - UPC-A barcodes for each product (generated from SKU) - Table of contents for easy navigation - Professional formatting suitable for printing ## Alternative Commands If you prefer more control: ```bash # Activate virtual environment first source venv/bin/activate # Run only the scraper python scraper.py # Run only the PDF generator (after scraping) python pdf_generator.py pokemon_tcg_products_YYYYMMDD_HHMMSS.json # Run everything (installs requirements automatically) python run_scraper.py ``` ## Output Location All generated files will be in: - JSON data: Current directory - PDF catalog: `catalog_output/` directory - Product images: `catalog_output/images/` - Barcode images: `catalog_output/barcodes/` ## Requirements - Python 3.7+ - pandoc (for PDF generation) - Internet connection (for scraping) The script will automatically handle Python dependencies via virtual environment. ## Troubleshooting If you encounter issues: 1. **Permission denied:** Make sure the script is executable: ```bash chmod +x run.sh ``` 2. **Pandoc not found:** Install pandoc for your system: ```bash # Ubuntu/Debian sudo apt install pandoc # Arch Linux sudo pacman -S pandoc # macOS brew install pandoc ``` 3. **No products found:** The website may have anti-bot protection or changed structure. The script includes fallback mechanisms. 4. **PDF generation fails:** The markdown file will still be generated, which you can manually convert or view. ## File Naming Convention All output files include Unix-friendly timestamps: - Format: `YYYYMMDD_HHMMSS` (e.g., `20241221_143025`) - This ensures chronological sorting with `ls` command - No spaces or special characters for script-friendly handling ## Example Output ``` pokemon-disco/ ├── pokemon_tcg_products_20241221_143025.json # Scraped data ├── catalog_output/ │ ├── pokemon_tcg_catalog_20241221_143025.pdf # Final catalog │ ├── pokemon_tcg_catalog_20241221_143025.md # Markdown source │ ├── images/ │ │ ├── product_1_SKU123456.jpg # Product images │ │ └── product_2_SKU789012.jpg │ └── barcodes/ │ ├── barcode_SKU123456.png # UPC-A barcodes │ └── barcode_SKU789012.png ```