Web Scraping API

Crawlee API

Extract structured metadata from any URL

https://crawl.vincenzo.web.id

Interactive Tester

Enter a URL below to test the crawl endpoint in real-time.

Title

Description

Links

Images

Keywords

Text Preview

API Endpoints

Complete reference for the Crawlee API endpoints.

GET /health

Health check endpoint to verify the API is running and responsive.

Response Example

Example Response

{
"status": "ok"
}

GET /crawl?url=

Crawl a single URL and extract page data including title, description, keywords, headings, text content, links, and images.

Try it

Example Request (cURL)

curl "https://crawl.vincenzo.web.id/crawl?url=https%3A%2F%2Fexample.com"

Response Example

Example Response

{
  "success": true,
  "resultId": "abc123...",
  "resultUrl": "https://example.com",
  "data": {
    "url": "https://example.com",
    "title": "Example Domain",
    "description": "...",
    "keywords": "...",
    "h1": "Example Domain",
    "h2": "...",
    "text": "...",
    "links": ["..."],
    "images": ["..."]
  }
}

POST /crawl

Crawl a URL using POST request with JSON body. Useful for URLs with special characters.

Try it

Example Request (cURL)

curl -X POST "https://crawl.vincenzo.web.id/crawl" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Response Example

Example Response

{
  "success": true,
  "resultId": "abc123...",
  "resultUrl": "https://example.com",
  "data": {
    "url": "https://example.com",
    "title": "Example Domain",
    "description": "...",
    "keywords": "...",
    "h1": "Example Domain",
    "h2": "...",
    "text": "...",
    "links": ["..."],
    "images": ["..."]
  }
}

GET /storage

List all files stored in the API's storage. Returns file names, URLs, and sizes.

Response Example

Example Response

{
  "files": [
    {
      "name": "screenshot_2024-01-15.png",
      "url": "https://crawl.vincenzo.web.id/storage/screenshot_2024-01-15.png",
      "size": 245632
    }
  ]
}

GET /storage/

Download a specific file from storage by filename.

Try it

Example Request (cURL)

curl -O "https://crawl.vincenzo.web.id/storage/screenshot.png"

Data Schema

Complete reference for all data fields returned by the crawl endpoint.

Field	Type	Description
`success`	boolean	Indicates if the crawl was successful
`resultId`	string	Unique identifier for this crawl result
`resultUrl`	string	The URL that was crawled (after redirects)
`data.url`	string	Original URL that was requested
`data.title`	string	Page title from the HTML title tag
`data.description`	string	Meta description content
`data.keywords`	string	Meta keywords content
`data.h1`	string	Primary heading (first h1 tag)
`data.h2`	string	Secondary headings (comma-separated)
`data.text`	string	Extracted text content (HTML stripped)
`data.links`	string[]	Array of all URLs found in anchor tags
`data.images`	string[]	Array of all image URLs found

Code Examples

Copy and paste ready-to-use code snippets for integrating with the Crawlee API.

# Health Check
curl "https://crawl.vincenzo.web.id/health"

# GET Request (URL encode your URL)
curl "https://crawl.vincenzo.web.id/crawl?url=https%3A%2F%2Fexample.com"

# POST Request
curl -X POST "https://crawl.vincenzo.web.id/crawl" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

# List Storage
curl "https://crawl.vincenzo.web.id/storage"

# Download File
curl -O "https://crawl.vincenzo.web.id/storage/screenshot.png"

// Health Check
fetch('https://crawl.vincenzo.web.id/health')
  .then(res => res.json())
  .then(console.log);

// GET Request
const url = encodeURIComponent('https://example.com');
fetch(`https://crawl.vincenzo.web.id/crawl?url=${url}`)
  .then(res => res.json())
  .then(console.log);

// POST Request
fetch('https://crawl.vincenzo.web.id/crawl', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ url: 'https://example.com' })
})
  .then(res => res.json())
  .then(console.log);

// List Storage
fetch('https://crawl.vincenzo.web.id/storage')
  .then(res => res.json())
  .then(console.log);

import requests

# Health Check
resp = requests.get('https://crawl.vincenzo.web.id/health')
print(resp.json())

# GET Request
params = {'url': 'https://example.com'}
resp = requests.get('https://crawl.vincenzo.web.id/crawl', params=params)
print(resp.json())

# POST Request
resp = requests.post(
    'https://crawl.vincenzo.web.id/crawl',
    json={'url': 'https://example.com'}
)
print(resp.json())

# List Storage
resp = requests.get('https://crawl.vincenzo.web.id/storage')
print(resp.json())

# Download File
resp = requests.get('https://crawl.vincenzo.web.id/storage/screenshot.png')
with open('screenshot.png', 'wb') as f:
    f.write(resp.content)