ScrapeCraft is a web-based scraping editor similar to Cursor but specialized for web scraping. It uses AI assistance to help users build scraping pipelines with the ScrapeGraphAI API.
- AI-powered assistant using OpenRouter (Kimi-k2 model)
- Multi-URL bulk scraping support
- Dynamic schema definition with Pydantic
- Python code generation with async support
- Real-time WebSocket streaming
- Results visualization (table and JSON views)
- Auto-updating deployment with Watchtower
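Extraction schemas are defined with Pydantic. As a minimal sketch of what a user-defined schema might look like (the `Product` and `PageResult` models and their fields are illustrative assumptions, not part of ScrapeCraft):

```python
from typing import List

from pydantic import BaseModel  # Pydantic v2 assumed


# Hypothetical extraction schema -- field names are examples only.
class Product(BaseModel):
    name: str
    price: float
    in_stock: bool


class PageResult(BaseModel):
    url: str
    products: List[Product]


result = PageResult(
    url="https://example.com/shop",
    products=[{"name": "Widget", "price": 9.99, "in_stock": True}],
)
print(result.model_dump_json())
```

Validation happens at construction time, so malformed scrape output fails fast instead of propagating into results.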
- Backend: FastAPI, LangGraph, ScrapeGraphAI
- Frontend: React, TypeScript, Tailwind CSS
- Database: PostgreSQL
- Cache: Redis
- Deployment: Docker, Docker Compose, Watchtower
- Docker and Docker Compose
- OpenRouter API key (Get it from OpenRouter)
- ScrapeGraphAI API key (Get it from ScrapeGraphAI)
1. **Clone the repository**

   ```bash
   git clone https://github.com/ScrapeGraphAI/scrapecraft.git
   cd scrapecraft
   ```

2. **Set up environment variables**

   ```bash
   cp .env.example .env
   ```

   Edit the `.env` file and add your API keys:
   - `OPENROUTER_API_KEY`: get it from OpenRouter
   - `SCRAPEGRAPH_API_KEY`: get it from ScrapeGraphAI

3. **Start the application with Docker**

   ```bash
   docker compose up -d
   ```

4. **Access the application**
   - Frontend: http://localhost:3000
   - API: http://localhost:8000
   - API Docs: http://localhost:8000/docs

5. **Stop the application**

   ```bash
   docker compose down
   ```
If you want to run the application in development mode without Docker:

Backend:

```bash
cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

Frontend:

```bash
cd frontend
npm install
npm start
```
- Create a Pipeline: Click "New Pipeline" to start
- Add URLs: Use the URL Manager to add websites to scrape
- Define Schema: Create fields for data extraction
- Generate Code: Ask the AI to generate scraping code
- Execute: Run the pipeline to scrape data
- Export Results: Download as JSON or CSV
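Exported JSON results can also be converted to CSV locally. A minimal sketch, assuming the export is a flat list of records (the record shape shown is an assumption based on the table view, not the documented export format):

```python
import csv
import io
import json

# Hypothetical exported results: one flat record per scraped page.
results_json = '[{"url": "https://example.com", "title": "Example Domain"}]'
rows = json.loads(results_json)

# Write the records to an in-memory CSV; column names come from the first record.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```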
The application includes Watchtower for automatic updates:
- Push new Docker images to your registry
- Watchtower will automatically detect and update containers
- No manual intervention required
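A Watchtower service in `docker-compose.yml` typically looks like the sketch below; the poll interval and flags here are illustrative defaults, not ScrapeCraft's actual configuration:

```yaml
watchtower:
  image: containrrr/watchtower
  volumes:
    # Watchtower needs the Docker socket to inspect and restart containers.
    - /var/run/docker.sock:/var/run/docker.sock
  # Check for new images every 300 seconds and remove old ones after updating.
  command: --interval 300 --cleanup
```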
- `POST /api/chat/message` - Send message to AI assistant
- `GET /api/pipelines` - List all pipelines
- `POST /api/pipelines` - Create new pipeline
- `PUT /api/pipelines/{id}` - Update pipeline
- `POST /api/pipelines/{id}/run` - Execute pipeline
- `WS /ws/{pipeline_id}` - WebSocket connection
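The WebSocket endpoint streams pipeline events to the client. A sketch of client-side message handling, assuming a simple `{"type": ..., "data": ...}` envelope (the message keys are assumptions, not ScrapeCraft's documented protocol):

```python
import json


def handle_message(raw: str) -> str:
    # Dispatch on a hypothetical message envelope: {"type": ..., "data": ...}.
    msg = json.loads(raw)
    if msg.get("type") == "progress":
        return f"progress: {msg['data']['done']}/{msg['data']['total']} URLs"
    if msg.get("type") == "result":
        return f"result ready for {msg['data']['url']}"
    return f"unhandled message type: {msg.get('type')}"


print(handle_message('{"type": "progress", "data": {"done": 1, "total": 3}}'))
```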
| Variable | Description | How to Get |
|---|---|---|
| `OPENROUTER_API_KEY` | Your OpenRouter API key | Get API Key |
| `SCRAPEGRAPH_API_KEY` | Your ScrapeGraphAI API key | Get API Key |
| `JWT_SECRET` | Secret key for JWT tokens | Generate a random string |
| `DATABASE_URL` | PostgreSQL connection string | Auto-configured with Docker |
| `REDIS_URL` | Redis connection string | Auto-configured with Docker |
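One way to generate a `JWT_SECRET` value; any high-entropy random string should work, and the 32-byte length here is a common choice rather than a documented requirement:

```python
import secrets

# 32 random bytes rendered as 64 hex characters, suitable for JWT_SECRET.
print(secrets.token_hex(32))
```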
MIT