trumbladome/CHROMESCRAPER

🚀 StepTwo Gallery Scraper - PRODUCTION READY

A production-grade Chrome extension for scraping image galleries with enterprise-level reliability, comprehensive error handling, and professional development infrastructure.

🎉 Now Production Ready! Complete with testing framework, CI/CD pipeline, performance monitoring, and comprehensive documentation.

Key Features

🎯 Smart Gallery Scraping

  • AI-Powered Detection: Advanced DOM analysis with adaptive selectors
  • Multi-Site Support: Works across diverse gallery implementations
  • Dynamic Content: Handles lazy-loading, infinite scroll, and JavaScript frameworks
  • Authenticated Sites: Leverages your existing browser sessions
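Lazy-loaded galleries often keep the real image URL in a data attribute or a `srcset` list while `src` holds a placeholder. The following is a minimal sketch of how such a URL could be resolved, not the extension's actual code. The function names `parseSrcset` and `pickImageUrl` are hypothetical, and the attribute names (`data-src`, `data-srcset`, `data-original`) are common lazy-loading conventions assumed for illustration.

```javascript
// Hypothetical helpers; attribute names are common lazy-loading conventions,
// not names confirmed by this project.

// Split a srcset string into { url, value } candidates.
// Simplification: assumes URLs themselves contain no commas.
function parseSrcset(srcset) {
  return srcset
    .split(',')
    .map((part) => part.trim())
    .filter(Boolean)
    .map((part) => {
      const [url, descriptor = '1x'] = part.split(/\s+/);
      const value = parseFloat(descriptor);
      return { url, value: Number.isNaN(value) ? 1 : value };
    });
}

// Pick the most useful URL from a plain object of element attributes.
function pickImageUrl(attrs) {
  const srcset = attrs['data-srcset'] || attrs.srcset;
  if (srcset) {
    const candidates = parseSrcset(srcset);
    if (candidates.length > 0) {
      // Largest descriptor wins (assumes all descriptors use one unit, w or x).
      return candidates.reduce((best, c) => (c.value > best.value ? c : best)).url;
    }
  }
  // Prefer lazy-load attributes over src, which may be a placeholder.
  return attrs['data-src'] || attrs['data-original'] || attrs.src || null;
}
```

In a content script, the `attrs` object could be built from an element's attributes before calling `pickImageUrl`.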

📊 Professional Export System

  • Multiple Formats: Excel, CSV, JSON, HTML reports, and ZIP packages
  • Advanced Features: Metadata extraction, statistics, and print optimization
  • Batch Processing: Efficient handling of large galleries (1000+ images)
  • Progress Tracking: Real-time progress indicators and detailed reporting
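Batch processing with progress reporting can be illustrated with a short sketch. This is an assumption about the general technique, not the extension's implementation: `chunk`, `processInBatches`, and the shape of the progress object are hypothetical, and the sketch is synchronous for brevity, whereas real downloads would be asynchronous.

```javascript
// Hypothetical helpers illustrating batching with progress callbacks.

// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run `handler` over every item, reporting progress after each batch.
function processInBatches(items, size, handler, onProgress = () => {}) {
  const results = [];
  let done = 0;
  for (const batch of chunk(items, size)) {
    for (const item of batch) {
      results.push(handler(item));
    }
    done += batch.length;
    onProgress({
      done,
      total: items.length,
      percent: Math.round((done / items.length) * 100),
    });
  }
  return results;
}
```

Reporting once per batch rather than once per item keeps UI updates cheap when a gallery contains thousands of images.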

🛡️ Enterprise-Grade Reliability

  • Error Handling: Comprehensive error management with user-friendly notifications
  • Performance Monitoring: Real-time memory, network, and operation tracking
  • Quality Assurance: 25 comprehensive tests with 100% pass rate
  • Security: Automated vulnerability scanning and CSP compliance

🚀 Quick Start

For Users

  1. Download the latest release from GitHub Releases
  2. Install in Chrome via Developer Mode (Installation Guide)
  3. Navigate to any image gallery website
  4. Click the StepTwo extension icon to start scraping
  5. Export your results in your preferred format

For Developers

# Clone and setup
git clone https://github.com/johnsonskyrme-sys/steptwo.git
cd steptwo
npm install

# Development workflow
npm run dev       # Start development server
npm test          # Run test suite (25 tests)
npm run validate  # Validate extension (21 checks)
npm run lint      # Check code quality
npm run package   # Create distribution package

Development Server

`npm run dev` starts a development server that serves a test gallery for exercising the extension.

📋 Production Status

Quality Metrics

  • Extension Validation: 21/21 checks passing
  • Test Suite: 25/25 tests passing (100% success rate)
  • Code Quality: 155 lint errors resolved (46% improvement)
  • Build System: Automated packaging operational
  • CI/CD Pipeline: Complete workflow with quality gates

🏗️ Infrastructure

  • Testing Framework: Jest with comprehensive coverage
  • Error Handling: Production-grade centralized system
  • Performance Monitoring: Real-time tracking and optimization
  • Documentation: Complete API and user guides
  • Security: Automated auditing and compliance checking

Robust Helper Functions

The extension ships a set of helper functions (RobustHelpers) for reliable web scraping:

Core Utilities

  • Enhanced waitForSelector: Multiple selector support, retry logic, visibility checks
  • Advanced Image Gathering: Multi-source detection, URL normalization, metadata extraction
  • URL Normalization: Protocol handling, relative URL resolution, parameter management
  • Element Utilities: Visibility detection, click simulation, text extraction
  • Performance Monitoring: Built-in timing and performance tracking
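The idea behind a multi-selector wait with retries can be sketched as follows. This is not the project's `waitForSelector`; the names `firstMatch`, `backoffDelays`, and `waitForAny`, the exponential backoff schedule, and the default values are all assumptions made for illustration. The `query` function is injected so the sketch runs outside a browser; in a page it could be `(s) => document.querySelector(s)`. Visibility checks are omitted.

```javascript
// Hypothetical sketch of waiting for any of several selectors with retries.

// Return the first selector that currently resolves to an element, or null.
function firstMatch(selectors, query) {
  for (const selector of selectors) {
    const el = query(selector);
    if (el) return { selector, el };
  }
  return null;
}

// Exponential backoff schedule: base, 2*base, 4*base, ... capped at maxMs.
function backoffDelays(retries, baseMs = 100, maxMs = 2000) {
  return Array.from({ length: retries }, (_, i) => Math.min(baseMs * 2 ** i, maxMs));
}

// Poll until one selector matches or the retry budget is exhausted.
async function waitForAny(selectors, query, { retries = 5, baseMs = 100 } = {}) {
  const delays = backoffDelays(retries, baseMs);
  for (let attempt = 0; ; attempt++) {
    const hit = firstMatch(selectors, query);
    if (hit) return hit;
    if (attempt >= delays.length) {
      throw new Error(`No match for: ${selectors.join(', ')}`);
    }
    await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
  }
}
```

Listing selectors in priority order lets a site-specific selector win when present while a generic one still acts as a fallback.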

Advanced Features

  • Multi-Element Detection: Wait for multiple elements simultaneously with different strategies
  • Smart Content Analysis: Heuristic element detection that scores candidates by content and structure
  • Form Automation: Robust form filling with validation and error handling
  • Batch Processing: Efficient processing of large element collections
  • Integration Framework: Seamless integration with existing modules
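Content and structure scoring, as mentioned in the list above, can be sketched with a few simple heuristics. The function names, weights, thresholds, and class-name patterns below are illustrative assumptions, not the extension's actual scoring rules. Elements are modelled as plain objects so the example runs without a DOM.

```javascript
// Hypothetical scoring sketch; every weight and pattern here is an assumption.

// Score one candidate described as { width, height, hasLink, className }.
function scoreCandidate(el) {
  let score = 0;
  const area = (el.width || 0) * (el.height || 0);
  if (area >= 200 * 200) score += 3; // large enough to be gallery content
  else if (area >= 100 * 100) score += 1; // thumbnail-sized
  if (el.hasLink) score += 1; // wrapped in a link, likely to a full-size view
  const className = el.className || '';
  if (/thumb|photo|gallery/i.test(className)) score += 2;
  if (/\b(icon|logo|avatar|sprite)\b/i.test(className)) score -= 3; // page chrome
  return score;
}

// Keep candidates at or above minScore, best first.
function rankCandidates(elements, minScore = 3) {
  return elements
    .map((el) => ({ el, score: scoreCandidate(el) }))
    .filter((c) => c.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .map((c) => c.el);
}
```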

Enhanced Module Integration

  • Adaptive Selector System: Enhanced with RobustHelpers for better element detection
  • Advanced Extractor: Uses RobustHelpers for comprehensive image gathering
  • Enhanced Pagination Handler: Improved element clicking and detection reliability
  • Background Utilities: Enhanced URL processing capabilities
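URL processing of the kind described here (protocol handling, relative URL resolution, parameter management) can be sketched with the standard `URL` class. The function name `normalizeUrl` and the `TRACKING_PARAMS` list are hypothetical and chosen for illustration; the project's own rules may differ.

```javascript
// Hypothetical list of query parameters to strip; adjust as needed.
const TRACKING_PARAMS = ['utm_source', 'utm_medium', 'utm_campaign', 'fbclid'];

// Resolve `raw` against `baseUrl` and return a canonical http(s) URL, or null.
function normalizeUrl(raw, baseUrl) {
  let url;
  try {
    // Handles relative paths and protocol-relative URLs such as //cdn.example.com/a.jpg.
    url = new URL(raw.trim(), baseUrl);
  } catch {
    return null; // not parseable as a URL
  }
  // Skip data:, blob:, javascript: and other non-web schemes.
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return null;
  for (const name of TRACKING_PARAMS) {
    url.searchParams.delete(name);
  }
  url.hash = ''; // fragments do not change the fetched resource
  return url.href;
}
```

Normalizing before deduplication means the same image reached through different relative paths or tracking parameters is counted once.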

Installation

  1. Download the Extension

    • Clone or download this repository to your local machine
  2. Open Chrome Extensions

    • Navigate to chrome://extensions/ in Chrome
    • Enable "Developer mode" (toggle in top right)
  3. Load the Extension

    • Click "Load unpacked"
    • Select the folder containing the extension files
    • The extension should now appear in your extensions list
  4. Pin the Extension

    • Click the puzzle piece icon in Chrome's toolbar
    • Find "STEPTWO Gallery Scraper" and click the pin icon

Usage

  1. Navigate to a Gallery Page

    • Go to any supported image gallery website
    • Ensure you're logged in if the site requires authentication
  2. Open the Extension

    • Click the STEPTWO Gallery Scraper extension icon
    • A dashboard window will open
  3. Configure and Start Scraping

    • Use the point-and-click selector to identify images
    • Configure download settings as needed
    • Start the scraping process

Smart Selector Mode

  • Click "Enable Selector Mode" to activate the intelligent overlay
  • Click any element on the page to automatically detect similar items
  • The extension will highlight all matching elements and create the appropriate selector
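One naive way to turn a clicked element into a selector for "similar items" is to keep the tag name plus the classes it shares with comparable elements. The sketch below illustrates that idea only; `buildSelector` and `matchesSelector` are hypothetical names, elements are modelled as plain `{ tag, classes }` objects, and class names are assumed to need no CSS escaping.

```javascript
// Hypothetical sketch; elements are plain objects so it runs without a DOM.

// Build "tag.class1.class2" from the classes the clicked element shares
// with every same-tag element that has at least one class in common.
function buildSelector(clicked, others) {
  const peers = others.filter(
    (el) => el.tag === clicked.tag && el.classes.some((c) => clicked.classes.includes(c))
  );
  const shared = clicked.classes.filter((c) => peers.every((p) => p.classes.includes(c)));
  return clicked.tag + shared.map((c) => '.' + c).join('');
}

// Check whether an element satisfies a simple "tag.class.class" selector.
function matchesSelector(el, selector) {
  const [tag, ...classes] = selector.split('.');
  return el.tag === tag && classes.every((c) => el.classes.includes(c));
}
```

State-specific classes such as `selected` drop out because other items do not carry them, which is what makes the resulting selector match the whole set rather than only the clicked element.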

Export Options

  • Download images directly to your computer
  • Export metadata in JSON or CSV format (see Professional Export System for the full list of formats)
  • Use advanced filename patterns for organized downloads
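Filename patterns and CSV metadata export can be sketched as below. The pattern syntax (`{name}` and `{name:width}` for zero padding), the field names, and the functions `formatFilename` and `toCsv` are assumptions for illustration; the extension's real pattern tokens are not documented here.

```javascript
// Hypothetical helpers; the {token} syntax is an assumed convention.

// Replace characters that are invalid in filenames on common platforms.
function sanitize(part) {
  return String(part).replace(/[\\/:*?"<>|]+/g, '_').trim();
}

// Expand {name} and {name:width} tokens; unknown tokens are left untouched.
function formatFilename(pattern, fields) {
  return pattern.replace(/\{(\w+)(?::(\d+))?\}/g, (match, name, width) => {
    if (!(name in fields)) return match;
    const value = sanitize(fields[name]);
    return width ? value.padStart(Number(width), '0') : value;
  });
}

// Serialize rows to CSV, quoting fields that contain commas, quotes, or newlines.
function toCsv(rows, columns) {
  const escape = (v) => {
    const s = v == null ? '' : String(v);
    return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
  };
  const lines = [columns.map(escape).join(',')];
  for (const row of rows) {
    lines.push(columns.map((c) => escape(row[c])).join(','));
  }
  return lines.join('\r\n');
}
```

Zero-padded indices keep downloaded files in gallery order when sorted by name.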

Supported Sites

Built-in support for popular image sites including:

  • Getty Images
  • Shutterstock
  • Adobe Stock
  • Unsplash
  • Pinterest
  • Flickr
  • 500px
  • And many more with universal fallback patterns
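Per-site support with a universal fallback can be pictured as a lookup keyed by hostname. The sketch below is an assumption about the general approach: `SITE_PROFILES`, `FALLBACK_PROFILE`, and `findProfile` are hypothetical names, and the selectors are placeholders rather than the extension's real site rules.

```javascript
// Hypothetical profile table; selectors are placeholders, not real site rules.
const SITE_PROFILES = {
  'unsplash.com': { name: 'Unsplash', imageSelector: 'PLACEHOLDER_UNSPLASH_SELECTOR' },
  'flickr.com': { name: 'Flickr', imageSelector: 'PLACEHOLDER_FLICKR_SELECTOR' },
};

// Used when no site-specific profile matches.
const FALLBACK_PROFILE = { name: 'Universal', imageSelector: 'img' };

// Match the page's hostname (or any subdomain of it) against known profiles.
function findProfile(pageUrl) {
  const host = new URL(pageUrl).hostname.toLowerCase();
  for (const [domain, profile] of Object.entries(SITE_PROFILES)) {
    if (host === domain || host.endsWith('.' + domain)) return profile;
  }
  return FALLBACK_PROFILE;
}
```

Matching on `'.' + domain` rather than a bare suffix avoids treating an unrelated host such as `notflickr.com` as a known site.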

Privacy & Security

  • Local Processing Only: All data processing happens locally in your browser
  • No External Servers: No data is sent to external servers
  • Session Respect: Uses your existing browser session and cookies
  • Secure Storage: All settings and data stored locally in your browser

License

ISC License - see LICENSE file for details.
