Feed URL lists to AI agents and RAG pipelines
LLM agents need to discover all pages on a domain before crawling them for content. SitemapKit provides a clean URL list that agents can use as seed URLs for Crawl4AI, Firecrawl, or custom crawlers.
import requests
# Step 1: Get all URLs from sitemap
response = requests.post(
    "https://sitemapkit.com/api/v1/sitemap/full",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={"url": "docs.example.com"},
)
response.raise_for_status()  # Fail fast on an HTTP error response
urls = [u["loc"] for u in response.json()["urls"]]
# Step 2: Feed to your RAG pipeline
# (crawl_page and index are placeholders for your own crawler and vector store)
for url in urls:
    content = crawl_page(url)  # Crawl4AI, Firecrawl, etc.
    index.add(content)  # Add to vector store

Free tier includes 100 API calls/month. No credit card required.
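
Before handing the URL list to a crawler, it can help to deduplicate it and restrict it to the section of the site you want indexed. The following is a minimal sketch of that step, assuming the response shape used in the snippet above (a "urls" list of objects with a "loc" field). The names select_seed_urls, path_prefix, and limit are hypothetical helpers for illustration and are not part of the SitemapKit API.

```python
from urllib.parse import urlsplit


def select_seed_urls(payload, path_prefix="/", limit=None):
    """Return unique page URLs from a sitemap response, in original order.

    Assumes a payload shaped like {"urls": [{"loc": "..."}, ...]}.
    path_prefix and limit are illustrative parameters, not API options.
    """
    seen = set()
    seeds = []
    for entry in payload.get("urls", []):
        loc = entry.get("loc")
        if not loc or loc in seen:
            continue  # skip entries without a URL and exact duplicates
        path = urlsplit(loc).path or "/"  # a bare domain has an empty path
        if not path.startswith(path_prefix):
            continue  # keep only the part of the site we want to index
        seen.add(loc)
        seeds.append(loc)
        if limit is not None and len(seeds) >= limit:
            break  # cap the number of seed URLs passed to the crawler
    return seeds


# Made-up sample data in the assumed response shape
sample = {
    "urls": [
        {"loc": "https://docs.example.com/guide/intro"},
        {"loc": "https://docs.example.com/guide/intro"},
        {"loc": "https://docs.example.com/blog/news"},
        {"loc": "https://docs.example.com/guide/setup"},
        {},
    ]
}

print(select_seed_urls(sample, path_prefix="/guide"))
```

The returned list can be used in place of urls in the Step 2 loop. Filtering before crawling keeps the crawl and the vector store limited to pages that are actually relevant to the agent.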