SitemapKit + LlamaIndex

Feed sitemap URLs into LlamaIndex to build a knowledge base: discover every page listed in a domain's sitemaps, then index those pages for RAG-based question answering.

Quick Start

import requests
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader

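# Get all URLs from the domain's sitemaps (discover + extract in one call)
resp = requests.post(
    "https://sitemapkit.com/api/v1/sitemap/full",
    headers={"x-api-key": "YOUR_API_KEY", "Content-Type": "application/json"},
    json={"url": "docs.example.com"},
    timeout=60,
)
resp.raise_for_status()  # stop here on an HTTP error instead of failing on a missing "urls" key
urls = [u["loc"] for u in resp.json()["urls"]]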

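# Load and index the first 100 pages (requires the llama-index-readers-web package;
# VectorStoreIndex uses LlamaIndex's default OpenAI embeddings unless another model is configured)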
documents = SimpleWebPageReader(html_to_text=True).load_data(urls[:100])
index = VectorStoreIndex.from_documents(documents)

# Query the index
query_engine = index.as_query_engine()
response = query_engine.query("How do I configure authentication?")
print(response)
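
The Quick Start caps the crawl at the first 100 URLs. For larger sites, one option is to deduplicate the URL list and load it in fixed-size batches. The following is a minimal sketch of that idea; the helper names `unique_in_order` and `batches` are hypothetical and are not part of SitemapKit or LlamaIndex.

```python
def unique_in_order(urls):
    """Drop duplicate URLs while keeping the first occurrence of each."""
    seen = set()
    result = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result


def batches(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical sample standing in for the `urls` list from the Quick Start.
sample_urls = [
    "https://docs.example.com/a",
    "https://docs.example.com/b",
    "https://docs.example.com/a",  # duplicate, dropped below
    "https://docs.example.com/c",
]
url_batches = list(batches(unique_in_order(sample_urls), 2))
```

Each batch could then be passed to `SimpleWebPageReader(html_to_text=True).load_data(batch)`, with the resulting documents added to an existing index one at a time via `index.insert(document)`. The batch size that works well depends on page size and on the rate limits of the embedding provider.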

How it works

  1. Get your API key — Sign up for a free SitemapKit account to get your sk_live_* API key.
  2. Call the API — Use the /api/v1/sitemap/full endpoint to discover and extract all sitemaps from a domain in one call.
  3. Process the data — The response includes structured JSON with all URLs, lastmod dates, and sitemap metadata.
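
As an illustration of step 3, the lastmod dates can be used to re-index only pages that changed recently. The sketch below assumes each entry in the response's `urls` list carries a `loc` key and an optional `lastmod` key holding a W3C/ISO 8601 date string; the `lastmod` key name and its format are assumptions based on the sitemap convention, and both function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_lastmod(value):
    """Parse a lastmod string into a timezone-aware datetime, or return None."""
    if not value:
        return None
    try:
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        parsed = datetime.fromisoformat(value.strip().replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Date-only or naive values are treated as UTC (an assumption).
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def urls_modified_since(entries, cutoff, keep_undated=True):
    """Return the `loc` of every entry modified at or after `cutoff`.

    Entries with a missing or unparseable lastmod are kept when
    `keep_undated` is True, since their age is unknown.
    """
    selected = []
    for entry in entries:
        modified = parse_lastmod(entry.get("lastmod"))
        if modified is None:
            if keep_undated:
                selected.append(entry["loc"])
        elif modified >= cutoff:
            selected.append(entry["loc"])
    return selected


# Hypothetical entries shaped like the assumed API response.
sample_entries = [
    {"loc": "https://docs.example.com/a", "lastmod": "2024-03-01T12:00:00Z"},
    {"loc": "https://docs.example.com/b", "lastmod": "2023-01-01"},
    {"loc": "https://docs.example.com/c"},
]
cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)
recent = urls_modified_since(sample_entries, cutoff)
```

Keeping undated entries by default errs on the side of re-indexing a page rather than silently skipping it; pass `keep_undated=False` to index only pages with a confirmed recent change.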

API Endpoints

  • POST /api/v1/sitemap/discover — Find all sitemaps on a domain
  • POST /api/v1/sitemap/extract — Parse a sitemap URL and extract all URLs
  • POST /api/v1/sitemap/full — Discover + extract in one call (recommended)
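
The three endpoints can be wrapped in one small request builder. This is a sketch only: the base URL, the `x-api-key` header, and the `{"url": ...}` body are taken from the Quick Start call to `/full`, and it is an assumption that `/discover` and `/extract` accept the same header and body shape. The name `build_request` is hypothetical.

```python
BASE_URL = "https://sitemapkit.com"

# Endpoint paths as listed above.
ENDPOINTS = {
    "discover": "/api/v1/sitemap/discover",
    "extract": "/api/v1/sitemap/extract",
    "full": "/api/v1/sitemap/full",
}


def build_request(operation, target_url, api_key):
    """Build keyword arguments for a POST to one of the sitemap endpoints.

    Assumption: every endpoint takes the same JSON body as /full.
    """
    if operation not in ENDPOINTS:
        raise ValueError(f"unknown operation: {operation!r}")
    return {
        "method": "POST",
        "url": BASE_URL + ENDPOINTS[operation],
        "headers": {"x-api-key": api_key, "Content-Type": "application/json"},
        "json": {"url": target_url},
    }


request_kwargs = build_request("full", "docs.example.com", "YOUR_API_KEY")
```

The returned dictionary can be sent with `requests.request(**request_kwargs, timeout=60)`. Building the request separately from sending it keeps the endpoint selection easy to check without a network call.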

Start using SitemapKit with LlamaIndex

100 free API calls/month. No credit card required.
