Web Scraping API

Crawlee API

Extract structured metadata from any URL

https://crawl.vincenzo.web.id

Interactive Tester

Enter a URL below to test the crawl endpoint in real-time.

Title
-
H1
-
Description
-
Links
Images
-
Keywords
-
Text Preview

-


                            

API Endpoints

Complete reference for the Crawlee API endpoints.

GET /health

Health check endpoint to verify the API is running and responsive.

Response Example
Example Response
{
"status": "ok"
}
GET /crawl?url=

Crawl a single URL and extract page data including title, description, keywords, headings, text content, links, and images.

Try it
Example Request (cURL)
curl "https://crawl.vincenzo.web.id/crawl?url=https%3A%2F%2Fexample.com"
Response Example
Example Response
{
  "success": true,
  "resultId": "abc123...",
  "resultUrl": "https://example.com",
  "data": {
    "url": "https://example.com",
    "title": "Example Domain",
    "description": "...",
    "keywords": "...",
    "h1": "Example Domain",
    "h2": "...",
    "text": "...",
    "links": ["..."],
    "images": ["..."]
  }
}
POST /crawl

Crawl a URL using POST request with JSON body. Useful for URLs with special characters.

Try it
Example Request (cURL)
curl -X POST "https://crawl.vincenzo.web.id/crawl" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
Response Example
Example Response
{
  "success": true,
  "resultId": "abc123...",
  "resultUrl": "https://example.com",
  "data": {
    "url": "https://example.com",
    "title": "Example Domain",
    "description": "...",
    "keywords": "...",
    "h1": "Example Domain",
    "h2": "...",
    "text": "...",
    "links": ["..."],
    "images": ["..."]
  }
}
GET /storage

List all files stored in the API's storage. Returns file names, URLs, and sizes.

Response Example
Example Response
{
  "files": [
    {
      "name": "screenshot_2024-01-15.png",
      "url": "https://crawl.vincenzo.web.id/storage/screenshot_2024-01-15.png",
      "size": 245632
    }
  ]
}
GET /storage/

Download a specific file from storage by filename.

Try it
Example Request (cURL)
curl -O "https://crawl.vincenzo.web.id/storage/screenshot.png"

Data Schema

Complete reference for all data fields returned by the crawl endpoint.

Field Type Description
success boolean Indicates if the crawl was successful
resultId string Unique identifier for this crawl result
resultUrl string The URL that was crawled (after redirects)
data.url string Original URL that was requested
data.title string Page title from the HTML title tag
data.description string Meta description content
data.keywords string Meta keywords content
data.h1 string Primary heading (first h1 tag)
data.h2 string Secondary headings (comma-separated)
data.text string Extracted text content (HTML stripped)
data.links string[] Array of all URLs found in anchor tags
data.images string[] Array of all image URLs found

Code Examples

Copy and paste ready-to-use code snippets for integrating with the Crawlee API.

# Health Check
curl "https://crawl.vincenzo.web.id/health"

# GET Request (URL encode your URL)
curl "https://crawl.vincenzo.web.id/crawl?url=https%3A%2F%2Fexample.com"

# POST Request
curl -X POST "https://crawl.vincenzo.web.id/crawl" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

# List Storage
curl "https://crawl.vincenzo.web.id/storage"

# Download File
curl -O "https://crawl.vincenzo.web.id/storage/screenshot.png"
// Health Check
fetch('https://crawl.vincenzo.web.id/health')
  .then(res => res.json())
  .then(console.log);

// GET Request
const url = encodeURIComponent('https://example.com');
fetch(`https://crawl.vincenzo.web.id/crawl?url=${url}`)
  .then(res => res.json())
  .then(console.log);

// POST Request
fetch('https://crawl.vincenzo.web.id/crawl', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ url: 'https://example.com' })
})
  .then(res => res.json())
  .then(console.log);

// List Storage
fetch('https://crawl.vincenzo.web.id/storage')
  .then(res => res.json())
  .then(console.log);
import requests

# Health Check
resp = requests.get('https://crawl.vincenzo.web.id/health')
print(resp.json())

# GET Request
params = {'url': 'https://example.com'}
resp = requests.get('https://crawl.vincenzo.web.id/crawl', params=params)
print(resp.json())

# POST Request
resp = requests.post(
    'https://crawl.vincenzo.web.id/crawl',
    json={'url': 'https://example.com'}
)
print(resp.json())

# List Storage
resp = requests.get('https://crawl.vincenzo.web.id/storage')
print(resp.json())

# Download File
resp = requests.get('https://crawl.vincenzo.web.id/storage/screenshot.png')
with open('screenshot.png', 'wb') as f:
    f.write(resp.content)