Scrapyrt in Practice: Running Spiders over an HTTP API

📂 Stage: Stage 5, Power-Up (Distributed and Advanced Topics)

1. Installation and Startup

pip install scrapyrt

# Start the Scrapyrt service on port 6023 (run this from the Scrapy project directory; the default port is 9080)
scrapyrt -p 6023
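
Before sending crawl requests it can help to confirm that the service is actually listening. The following is a minimal sketch using only the standard library; the helper name `is_port_open` and the one-second timeout are illustrative assumptions, not part of Scrapyrt.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 6023, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 6023 matches the `scrapyrt -p 6023` command above
    print("Scrapyrt reachable:", is_port_open())
```

This only checks that something accepts TCP connections on the port; it does not verify that the listener is Scrapyrt.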

2. Calling over HTTP

# Run the spider named "example" against the given URL
curl "http://localhost:6023/crawl.json?spider_name=example&url=http://example.com"

# Response (abridged)
{
  "status": "ok",
  "items": [
    {"title": "...", "price": "..."}
  ]
}
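
The target URL travels inside a query string, so characters such as `&` or `?` in it need percent-encoding, and the JSON response should be checked before its items are used. Here is a small sketch of both steps. The helper names `build_crawl_url` and `extract_items` are hypothetical, and reading an error description from a `message` field is an assumption about the error payload.

```python
from urllib.parse import urlencode


def build_crawl_url(spider_name: str, url: str, base_url: str = "http://localhost:6023") -> str:
    """Build a crawl.json URL with the spider name and target URL percent-encoded."""
    query = urlencode({"spider_name": spider_name, "url": url})
    return f"{base_url}/crawl.json?{query}"


def extract_items(payload: dict) -> list:
    """Return the scraped items, or raise if the response status is not "ok"."""
    if payload.get("status") != "ok":
        # Assumption: error responses describe the problem in a "message" field
        raise ValueError(f"crawl failed: {payload.get('message', 'unknown error')}")
    return payload.get("items", [])


if __name__ == "__main__":
    # The query string of the target URL is encoded so it is not mistaken for API parameters
    print(build_crawl_url("example", "http://example.com/?a=1&b=2"))
    print(extract_items({"status": "ok", "items": [{"title": "...", "price": "..."}]}))
```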

3. Calling from Python

import requests

response = requests.get(
    'http://localhost:6023/crawl.json',
    params={
        'spider_name': 'example',
        'url': 'http://example.com'
    }
)

items = response.json()['items']
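
A crawl can take a while or fail, so a client used in production would usually set a timeout and check both the HTTP status and the `status` field before reading `items`. The version below is one possible sketch, not a definitive client. The function name `fetch_items`, the 30-second timeout, the `http_get` injection parameter, and the `message` field in error payloads are all assumptions made for illustration.

```python
from typing import Any, Callable, Optional


def fetch_items(
    spider_name: str,
    url: str,
    base_url: str = "http://localhost:6023",
    timeout: float = 30.0,
    http_get: Optional[Callable[..., Any]] = None,
) -> list:
    """Run a spider through Scrapyrt and return its items, raising on any failure."""
    if http_get is None:
        import requests  # imported lazily so a stub can be injected in tests

        http_get = requests.get

    response = http_get(
        f"{base_url}/crawl.json",
        params={"spider_name": spider_name, "url": url},
        timeout=timeout,  # without a timeout, requests may wait indefinitely
    )
    response.raise_for_status()  # raises on 4xx/5xx HTTP responses

    payload = response.json()
    if payload.get("status") != "ok":
        # Assumption: error payloads carry a human-readable "message" field
        raise RuntimeError(f"crawl failed: {payload.get('message', 'unknown error')}")
    return payload.get("items", [])


if __name__ == "__main__":
    for item in fetch_items("example", "http://example.com"):
        print(item)
```

Passing the HTTP function in as a parameter keeps the status-checking logic testable without a running Scrapyrt instance.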

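4. Summary

Advantages of Scrapyrt:

1. HTTP API: easy to integrate with other systems
2. Real-time crawling: spiders run on demand, per request
3. Stateless: easy to scale out

Typical use cases:
- Microservice architectures
- On-demand crawling
- Sitting behind an API gateway

💡 Remember: Scrapyrt turns a spider into a service, which is a common pattern when other systems need to trigger crawls on demand.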

🔗 Further Reading

Scrapyrt
