Serverless Webhooks with Cerebrium
Processing large PDF files often leads to HTTP timeouts. You send a document, wait, and the connection dies before processing completes. Cerebrium’s serverless platform solves this with custom FastAPI webhooks and built-in security.
The code base can be found on GitHub
What is Cerebrium? #
Cerebrium is a serverless GPU provider that lets you run applications on demand. This is perfect for our PDF processing use case, since we only pay for GPU time while documents are actually being converted, avoiding the cost of running dedicated servers.
GPU pricing is straightforward: L4 costs $0.000222/second while L40s costs $0.000542/second. For a 600-page PDF, L4 takes about 20 minutes ($0.27) while L40s completes in 10 minutes ($0.33). The $30 free credit covers 100+ large document conversions.
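As a quick sanity check on those figures (the 600-page runtimes are estimates, not guarantees):
# Back-of-the-envelope cost for a 600-page PDF at the per-second rates above
l4_cost = 0.000222 * 20 * 60    # ~20 min on an L4  -> ~$0.27
l40s_cost = 0.000542 * 10 * 60  # ~10 min on an L40s -> ~$0.33
print(f"L4: ${l4_cost:.2f}, L40s: ${l40s_cost:.2f}")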
The Problem #
Large PDFs can take minutes to convert to markdown. A 600-page PDF takes 10-20 minutes depending on GPU tier, and further optimization is possible with deployment tuning.
Webhooks #
Cerebrium allows you to write custom FastAPI endpoints with built-in webhook security. Instead of keeping HTTP connections open:
- Send PDF → Get immediate “processing started” response
- Process in background → Cerebrium serverless function converts PDF
- Receive callback → Webhook delivers results to your endpoint
Cerebrium provides $30 of free credits for testing, making it easy to get started with serverless PDF processing.
Implementation #
Submit PDF with webhook:
import urllib.parse
import requests
# pdf_endpoint, api_key, webhook_url, and webhook_secret come from your Cerebrium deployment;
# files holds the PDF to convert, e.g. {"file": open("document.pdf", "rb")}
webhook_url_encoded = urllib.parse.quote(webhook_url, safe="")
api_url = f"{pdf_endpoint}?async=true&webhookEndpoint={webhook_url_encoded}"
headers = {
    "accept": "application/json",
    "Authorization": f"Bearer {api_key}",
    "X-Webhook-Secret": webhook_secret,
}
response = requests.post(api_url, headers=headers, files=files)
# Returns immediately: {"run_id": "random-uuid-here"}
Webhook server receives results:
from fastapi import FastAPI, Request
app = FastAPI()
@app.post("/webhook/pdf-results")
async def receive_pdf_results(request: Request):
    result = await request.json()
    # You can implement your own saving functions
    save_markdown(result["data"]["text"])
    return {"status": "received"}
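save_markdown is left to you; a minimal sketch (the output path is just an example) that writes the converted markdown to disk:
from pathlib import Path
def save_markdown(text: str, path: str = "output.md") -> None:
    # Persist the converted markdown; swap this for S3, a database, or wherever results should live
    Path(path).write_text(text, encoding="utf-8")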
Cerebrium recommends verifying webhook signatures to ensure security.
Webhook Signature Verification
import hmac
import hashlib
def verify_webhook_signature(request_id, timestamp, body, signature, secret):
    # Remove the 'v1,' prefix from the signature
    signature = signature.split(',')[1] if ',' in signature else signature
    # Construct the signed content
    signed_content = f"{request_id}.{timestamp}.{body}"
    # Calculate expected signature
    expected_signature = hmac.new(
        secret.encode(),
        signed_content.encode(),
        hashlib.sha256
    ).hexdigest()
    # Compare signatures
    return hmac.compare_digest(expected_signature, signature)
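To wire this into the receiver above, read the ID, timestamp, and signature from the incoming headers and reject anything that fails verification. The header names below are assumptions for illustration; confirm the exact names against Cerebrium's documentation.
from fastapi import HTTPException, Request
@app.post("/webhook/pdf-results")
async def receive_pdf_results(request: Request):
    raw_body = (await request.body()).decode()
    # Header names and the shared secret are assumptions; adjust to what Cerebrium actually sends
    if not verify_webhook_signature(
        request_id=request.headers.get("webhook-id", ""),
        timestamp=request.headers.get("webhook-timestamp", ""),
        body=raw_body,
        signature=request.headers.get("webhook-signature", ""),
        secret=webhook_secret,
    ):
        raise HTTPException(status_code=401, detail="Invalid webhook signature")
    result = await request.json()
    save_markdown(result["data"]["text"])
    return {"status": "received"}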
Benefits #
- No timeouts - HTTP connection closes immediately
- Scalable - Handle multiple PDFs concurrently
- Reliable - Results delivered even if processing takes hours
- Connection resilience - Short-lived HTTP requests handle unstable connections better than long-running ones
- Batch processing - Perfect for evaluating multiple documents where immediate responses aren’t needed
Caveats #
Cerebrium webhooks do not have a built-in retry mechanism: if delivery to your endpoint fails, the result is not re-sent. Other providers may offer retries.
Accessibility #
One challenge with webhooks is making your endpoint accessible from serverless functions. During development, you can use tunneling services:
# Using ngrok
ngrok http 5000
# Or cloudflared tunnel
cloudflared tunnel --url http://localhost:5000
These create public URLs that forward to your local webhook server, allowing serverless functions to reach your endpoint during development.
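Whatever public URL the tunnel prints is what you pass as the webhook endpoint when submitting the PDF (the hostname below is a placeholder):
# Placeholder tunnel hostname; use the URL ngrok or cloudflared actually prints
webhook_url = "https://your-tunnel-subdomain.example.com/webhook/pdf-results"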
When to Use Webhooks #
- Use webhooks for long-running tasks where immediate response isn’t feasible.
- Ideal for batch processing, large datasets or operations taking minutes to hours.
Production Patterns #
When deploying webhook-based processing to production, consider these patterns:
- Track job IDs - Store the run_id returned from initial requests with document metadata for tracking and retrieval (see the sketch after this list)
- Webhook reliability - Cerebrium has no retry mechanism. Lost webhooks mean lost results. Ensure your webhook endpoint is highly available
- Queue systems - For high-volume processing, implement a queue-based system to manage async jobs and handle backpressure
- Database tracking - Maintain job state in your database (pending, processing, completed, failed) to provide status updates to users
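A minimal sketch of the job-tracking idea, using an in-memory dict in place of a real database; how you correlate a callback with its run_id depends on what Cerebrium includes in the webhook payload or headers.
jobs = {}  # hypothetical in-memory store; use Postgres, Redis, etc. in production
def record_submission(run_id: str, filename: str) -> None:
    # Call this right after the initial request, e.g. with response.json()["run_id"]
    jobs[run_id] = {"file": filename, "status": "processing"}
def record_completion(run_id: str, markdown: str) -> None:
    # Call this from the webhook handler once results arrive
    jobs[run_id]["status"] = "completed"
    save_markdown(markdown, path=f"{run_id}.md")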
Extensions #
This webhook pattern serves as a foundation for more complex systems:
- Caching strategies - Cache extracted content by document hash to prevent repeated processing (a brief sketch follows this list). Use Redis, disk cache, or object storage with TTLs based on document update frequency
- Self-hosted queues - Implement Celery, BullMQ, or similar queue systems for managing async job lifecycles with retry logic
- Custom monitoring - Build dashboards tracking job completion rates, processing times, and failure patterns
- Webhook middleware - Add retry mechanisms, circuit breakers, and fallback handlers to handle delivery failures
- Hybrid approaches - Combine serverless processing with self-hosted orchestration for cost optimization
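As an example of the caching idea above, a content-hash lookup before submitting a PDF might look like this; the local cache directory is just a stand-in for Redis or object storage.
import hashlib
from pathlib import Path
CACHE_DIR = Path("markdown_cache")  # hypothetical local cache location
def cached_markdown(pdf_path: str) -> str | None:
    # Key the cache on the PDF's content hash so re-submitting the same file costs nothing
    digest = hashlib.sha256(Path(pdf_path).read_bytes()).hexdigest()
    cached = CACHE_DIR / f"{digest}.md"
    return cached.read_text(encoding="utf-8") if cached.exists() else None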
Conclusion #
Serverless webhooks eliminate timeout issues in long-running jobs. By leveraging Cerebrium's custom FastAPI endpoints and webhook security, you can process large PDFs efficiently without worrying about connection drops.
The pattern works great for PDF processing, image conversion, or any operation that takes more than a few minutes to complete.