ScrapeCraft - AI-Powered Web Scraping Editor

ScrapeCraft is a web-based editor, similar to Cursor but specialized for web scraping. It uses AI assistance to help users build scraping pipelines on top of the ScrapeGraphAI API.

Features

  • πŸ€– AI-powered assistant using OpenRouter (Kimi-k2 model)
  • πŸ”— Multi-URL bulk scraping support
  • πŸ“‹ Dynamic schema definition with Pydantic
  • πŸ’» Python code generation with async support
  • πŸš€ Real-time WebSocket streaming
  • πŸ“Š Results visualization (table & JSON views)
  • πŸ”„ Auto-updating deployment with Watchtower

Tech Stack

  • Backend: FastAPI, LangGraph, ScrapeGraphAI
  • Frontend: React, TypeScript, Tailwind CSS
  • Database: PostgreSQL
  • Cache: Redis
  • Deployment: Docker, Docker Compose, Watchtower

Prerequisites

  • Docker and Docker Compose
  • OpenRouter API key (create one at openrouter.ai)
  • ScrapeGraphAI API key (create one from the ScrapeGraphAI dashboard)

Quick Start with Docker

  1. Clone the repository

    git clone https://github.com/ScrapeGraphAI/scrapecraft.git
    cd scrapecraft
  2. Set up environment variables

    cp .env.example .env

    Edit the .env file and add your API keys:
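
    OPENROUTER_API_KEY=your_openrouter_api_key
    SCRAPEGRAPH_API_KEY=your_scrapegraph_api_key
    JWT_SECRET=a_long_random_string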

  3. Start the application with Docker

    docker compose up -d
  4. Access the application

    The frontend is typically served at http://localhost:3000 and the backend API at http://localhost:8000; check docker-compose.yml for the exact port mappings.

  5. Stop the application

    docker compose down

Development Mode

If you want to run the application in development mode without Docker:

Backend Development

cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Frontend Development

cd frontend
npm install
npm start

Usage

  1. Create a Pipeline: Click "New Pipeline" to start
  2. Add URLs: Use the URL Manager to add websites to scrape
  3. Define Schema: Create fields for data extraction (see the Pydantic sketch after this list)
  4. Generate Code: Ask the AI to generate scraping code
  5. Execute: Run the pipeline to scrape data
  6. Export Results: Download as JSON or CSV
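
The schemas in step 3 are plain Pydantic models. A minimal sketch, with hypothetical model and field names that are not part of ScrapeCraft's codebase:

from typing import Optional

from pydantic import BaseModel, Field

class Product(BaseModel):
    # Illustrative fields only; define whatever your extraction needs
    name: str = Field(description="Product name as shown on the page")
    price: Optional[float] = Field(None, description="Price in USD, if listed")
    url: str = Field(description="Canonical product page URL")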

Remote Updates

The application includes Watchtower for automatic updates:

  1. Push new Docker images to your registry
  2. Watchtower will automatically detect and update containers
  3. No manual intervention required
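
The bundled docker-compose.yml should already start Watchtower alongside the app; if you need to run it standalone, the standard invocation mounts the Docker socket (the 300-second poll interval is just an example):

docker run -d \
  --name watchtower \
  -v /var/run/docker.sock:/var/run/docker.sock \
  containrrr/watchtower --interval 300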

API Endpoints

  • POST /api/chat/message - Send message to AI assistant
  • GET /api/pipelines - List all pipelines
  • POST /api/pipelines - Create new pipeline
  • PUT /api/pipelines/{id} - Update pipeline
  • POST /api/pipelines/{id}/run - Execute pipeline
  • WS /ws/{pipeline_id} - WebSocket connection
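
A minimal sketch of driving these endpoints from Python, assuming the backend listens on localhost:8000; the request payload and response fields are assumptions, not documented schemas:

import asyncio
import json

import requests
import websockets

BASE = "http://localhost:8000"  # assumed backend address

# Create a pipeline; the "name" field is an assumed payload shape
pipeline = requests.post(f"{BASE}/api/pipelines", json={"name": "demo"}).json()

# Start an execution of the new pipeline
requests.post(f"{BASE}/api/pipelines/{pipeline['id']}/run").raise_for_status()

async def stream(pipeline_id):
    # Follow real-time progress on the pipeline's WebSocket
    async with websockets.connect(f"ws://localhost:8000/ws/{pipeline_id}") as ws:
        async for message in ws:
            print(json.loads(message))

asyncio.run(stream(pipeline["id"]))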

Environment Variables

  • OPENROUTER_API_KEY: your OpenRouter API key (create one in the OpenRouter dashboard)
  • SCRAPEGRAPH_API_KEY: your ScrapeGraphAI API key (create one in the ScrapeGraphAI dashboard)
  • JWT_SECRET: secret key for signing JWT tokens (generate a random string; see below)
  • DATABASE_URL: PostgreSQL connection string (auto-configured with Docker)
  • REDIS_URL: Redis connection string (auto-configured with Docker)
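
For example, a suitable JWT_SECRET can be generated with Python's standard library:

python -c "import secrets; print(secrets.token_urlsafe(32))"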

License

MIT
