Web Scraping API
Crawlee API
Extract structured metadata from any URL
https://crawl.vincenzo.web.id
Interactive Tester
Enter a URL below to test the crawl endpoint in real-time.
Title
-
H1
-
Description
-
Links
-
Images
-
Keywords
-
Text Preview
-
API Endpoints
Complete reference for the Crawlee API endpoints.
GET
/health
Health check endpoint to verify the API is running and responsive.
Response Example
Example Response
{
"status": "ok"
}
GET
/crawl?url=
Crawl a single URL and extract page data including title, description, keywords, headings, text content, links, and images.
Try it
Example Request (cURL)
curl "https://crawl.vincenzo.web.id/crawl?url=https%3A%2F%2Fexample.com"
Response Example
Example Response
{
"success": true,
"resultId": "abc123...",
"resultUrl": "https://example.com",
"data": {
"url": "https://example.com",
"title": "Example Domain",
"description": "...",
"keywords": "...",
"h1": "Example Domain",
"h2": "...",
"text": "...",
"links": ["..."],
"images": ["..."]
}
}
POST
/crawl
Crawl a URL using POST request with JSON body. Useful for URLs with special characters.
Try it
Example Request (cURL)
curl -X POST "https://crawl.vincenzo.web.id/crawl" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
Response Example
Example Response
{
"success": true,
"resultId": "abc123...",
"resultUrl": "https://example.com",
"data": {
"url": "https://example.com",
"title": "Example Domain",
"description": "...",
"keywords": "...",
"h1": "Example Domain",
"h2": "...",
"text": "...",
"links": ["..."],
"images": ["..."]
}
}
GET
/storage
List all files stored in the API's storage. Returns file names, URLs, and sizes.
Response Example
Example Response
{
"files": [
{
"name": "screenshot_2024-01-15.png",
"url": "https://crawl.vincenzo.web.id/storage/screenshot_2024-01-15.png",
"size": 245632
}
]
}
GET
/storage/
Download a specific file from storage by filename.
Try it
Example Request (cURL)
curl -O "https://crawl.vincenzo.web.id/storage/screenshot.png"
Data Schema
Complete reference for all data fields returned by the crawl endpoint.
| Field | Type | Description |
|---|---|---|
success |
boolean | Indicates if the crawl was successful |
resultId |
string | Unique identifier for this crawl result |
resultUrl |
string | The URL that was crawled (after redirects) |
data.url |
string | Original URL that was requested |
data.title |
string | Page title from the HTML title tag |
data.description |
string | Meta description content |
data.keywords |
string | Meta keywords content |
data.h1 |
string | Primary heading (first h1 tag) |
data.h2 |
string | Secondary headings (comma-separated) |
data.text |
string | Extracted text content (HTML stripped) |
data.links |
string[] | Array of all URLs found in anchor tags |
data.images |
string[] | Array of all image URLs found |
Code Examples
Copy and paste ready-to-use code snippets for integrating with the Crawlee API.
# Health Check
curl "https://crawl.vincenzo.web.id/health"
# GET Request (URL encode your URL)
curl "https://crawl.vincenzo.web.id/crawl?url=https%3A%2F%2Fexample.com"
# POST Request
curl -X POST "https://crawl.vincenzo.web.id/crawl" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
# List Storage
curl "https://crawl.vincenzo.web.id/storage"
# Download File
curl -O "https://crawl.vincenzo.web.id/storage/screenshot.png"
// Health Check
fetch('https://crawl.vincenzo.web.id/health')
.then(res => res.json())
.then(console.log);
// GET Request
const url = encodeURIComponent('https://example.com');
fetch(`https://crawl.vincenzo.web.id/crawl?url=${url}`)
.then(res => res.json())
.then(console.log);
// POST Request
fetch('https://crawl.vincenzo.web.id/crawl', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ url: 'https://example.com' })
})
.then(res => res.json())
.then(console.log);
// List Storage
fetch('https://crawl.vincenzo.web.id/storage')
.then(res => res.json())
.then(console.log);
import requests
# Health Check
resp = requests.get('https://crawl.vincenzo.web.id/health')
print(resp.json())
# GET Request
params = {'url': 'https://example.com'}
resp = requests.get('https://crawl.vincenzo.web.id/crawl', params=params)
print(resp.json())
# POST Request
resp = requests.post(
'https://crawl.vincenzo.web.id/crawl',
json={'url': 'https://example.com'}
)
print(resp.json())
# List Storage
resp = requests.get('https://crawl.vincenzo.web.id/storage')
print(resp.json())
# Download File
resp = requests.get('https://crawl.vincenzo.web.id/storage/screenshot.png')
with open('screenshot.png', 'wb') as f:
f.write(resp.content)