Connect Diffbot to your AI agent

AI Tools: 35 actions available

Diffbot provides AI-powered tools to extract and structure data from web pages, transforming unstructured web content into structured, linked data.

We set up the connection using your own Diffbot account, with keys you control, and keep it running. Your agent picks it up and starts doing the work.

What your agent can do in Diffbot

Each one is a real action the agent can take on its own, the same kind of thing a person clicking around Diffbot could do. Actions are read-only by default; write actions are confirmed against your policy.

  • Combine Entity Profiles: Combine multiple entity profiles into a unified view using the Diffbot Knowledge Graph. Returns enhanced person or organization data by matching on identifying attributes like name, email, employer, or URL. Use this to…
  • Create Bulk Extract Job: Tool to submit a bulk extract job to process multiple URLs with Extract APIs. Use when you need to process many URLs asynchronously using any Extract API. The job will process URLs in the background and provide download…
  • Create or Update Custom API: Tool to create or update the parameters and ruleset of a Custom API. Use this when you need to define custom extraction rules for specific websites that require tailored parsing logic beyond standard Diffbot APIs. Allow…
  • Create Bulk Enhance Job: Tool to submit a bulk enhance job to enrich multiple entities asynchronously. Use when you need to process many Person or Organization records in batch. The API accepts entity descriptions and returns enriched data from…
  • Delete Custom API: Tool to delete custom API definitions for a given URL pattern. Removes custom extraction rules from your account. Use when you need to remove previously configured custom APIs.
  • Delete KG Enhance Bulkjob: Tool to delete an Enhance Bulkjob. Removes the bulk job and its results from the system. Use when cleaning up completed or failed jobs.
  • Download Bulk Job Results: Tool to download results of a bulk enhance job with filtering options via POST request. Use this to retrieve processed results from a completed or running bulk job. Supports multiple export formats (json, jsonl, csv, xl…
  • Enhance Entity with Knowledge Graph: Enrich a person or organization with comprehensive data from the Diffbot Knowledge Graph. Provide identifiers like name, email, employer, or URL and receive detailed entity information including employment history, educ…
  • Diffbot Extract Job: Tool to extract structured job posting data from job listing pages. Returns job title, company, location, salary, requirements, skills, and other job-related information. Use when you need to parse and structure data fr…
  • Diffbot Extract List: Tool to extract structured data from list-style pages like news indexes, product listings, and directory pages. Returns an array of items with their titles, links, and descriptions. Use when you need to extract multiple…
  • Get Diffbot Account Details: Retrieves comprehensive Diffbot account information including subscription plan details, credit balance, usage history, and account status. Returns account holder name, email, current plan, available credits, and daily…
  • Diffbot Analyze: Automatically analyzes a web page to determine its type and extract structured data. The Analyze API intelligently classifies pages into types (article, product, discussion, image, video, organization, etc.) and extract…
  • Get Article Data: Tool to extract information from articles, including authors, publication dates, and images. Use when you need structured metadata from a web article URL.
  • Get Bulk Job Data: Tool to download extracted results from a completed bulk job. Use after a bulk job has finished processing to retrieve the data. Supports JSON and CSV formats.
  • Get Bulk Job Status: Tool to poll the status of a specific Diffbot Knowledge Graph Enhance bulk job. Use when you need to check the progress, completion status, or details of a bulk enhancement job.
  • Get Bulk Job Results: Tool to download the results of a completed Enhance Bulkjob. Returns enriched records from the bulk job. Use after a bulk enhance job has completed processing.
  • Get Bulk Single Result: Tool to download the result of a single job within a Diffbot bulk enhance job. Returns enriched entity data for a specific input record by its index. Use after a bulk enhance job has completed to retrieve individual res…
  • Get Crawl Data: Download extracted results from a completed crawl job. Returns all structured data extracted during crawl processing (articles, products, etc.). Use after a crawl job has completed to retrieve the collected data.
  • Get Discussion Thread: Extract structured discussion threads from web pages including forums, comment sections, product reviews, Reddit discussions, and blog comments. Returns posts with author info, timestamps, content, and hierarchical rela…
  • Diffbot Get Event: Tool to extract event details from web pages. Use when you need structured event data such as venue, date, and description.
  • Diffbot Get Image: Tool to extract detailed information about images, including dimensions and recognition data. Use after confirming the image URL is publicly accessible.
  • Get KG Coverage Report by ID: Download Knowledge Graph coverage report by report ID. Returns detailed CSV coverage statistics showing field presence across query results. Use this after generating a coverage report from a DQL query to retrieve the s…
  • Diffbot Get Product: Tool to extract product information such as specifications, prices, availability, and reviews. Use when you need structured product data including specs, pricing, and reviews.
  • Get Video Data: Tool to extract information from videos, including titles, descriptions, and embedded HTML. Use when you need structured video metadata from any web page.
  • List Bulk Jobs: Tool to list all Bulk jobs associated with a specific token. Use after authenticating to retrieve statuses of all jobs for the account.
  • List Bulk Jobs Status For Token: Tool to get the status of all bulk enhance jobs for a token. Returns list of all bulk jobs associated with your API token. Use when you need to monitor or retrieve the status of multiple bulk jobs at once.
  • List Custom APIs: Tool to retrieve all Custom APIs and their extraction rules currently defined on your Diffbot token. Use when you need to list, review, or audit custom API configurations for your account.
  • Manage Crawl Job: Manages Diffbot crawl jobs: pause, restart, delete, or view status. Returns list of all active crawl jobs when called without parameters. Use 'name' parameter with action flags (pause=1, restart=1, delete=1) to control…
  • Resolve Lost ID: Tool to resolve lost IDs in the Knowledge Graph. Use when you need to map a lost identifier to its canonical counterpart for data consistency.
  • Diffbot Knowledge Graph Search: Search the Diffbot Knowledge Graph using DQL (Diffbot Query Language). Query billions of entities including organizations, people, articles, products, and more. Use structured queries to filter by type, fields, and rela…
  • Search Crawl Job Data: Tool to query crawl job collections using DQL (Diffbot Query Language). Use when you need to search extracted data from completed crawl or bulk jobs by collection name.
  • Start Bulk Job: Tool to start a Bulk Extract job. Use when processing large numbers of URLs asynchronously. The Diffbot Bulk API uses GET requests with query parameters to create jobs.
  • Start Crawl Job: Initiates a Diffbot crawl job that spiders a website starting from seed URLs and processes discovered pages with a specified Extract API. The crawler follows links within the domain, collects structured data (articles,…
  • Stop Bulk Job: Tool to pause (stop) a running Bulk job. Pausing halts further processing of URLs while preserving existing progress. To resume, use the appropriate resume action. Specify the exact job name (case-sensitive) as provided…
  • Stop KG Bulk Job By ID: Tool to stop an active Knowledge Graph Enhance bulk job by its ID. Halts processing of a running KG bulk job immediately. Use when you need to stop a specific KG bulk job using its bulkjobId.
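
To make the Manage Crawl Job description concrete, here is a minimal sketch of how the 'name' parameter and the action flags (pause=1, restart=1, delete=1) might be assembled into a request URL. The endpoint constant, the `token` query parameter, and the helper name `build_crawl_request` are assumptions for illustration, not details confirmed on this page; check Diffbot's API documentation before relying on them. The sketch only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

# Assumed endpoint for crawl management; confirm against Diffbot's API docs.
CRAWL_ENDPOINT = "https://api.diffbot.com/v3/crawl"

# Action flags named in the Manage Crawl Job description.
ACTION_FLAGS = ("pause", "restart", "delete")


def build_crawl_request(token, name=None, action=None):
    """Build the URL for a crawl management call (hypothetical helper).

    No name: list the crawl jobs for the token.
    Name only: view that job's status.
    Name plus action: set the matching flag to 1.
    """
    if action is not None and action not in ACTION_FLAGS:
        raise ValueError(f"unknown action: {action!r}")
    if action is not None and name is None:
        raise ValueError("an action needs a job name")

    params = {"token": token}
    if name is not None:
        params["name"] = name
    if action is not None:
        params[action] = 1
    return f"{CRAWL_ENDPOINT}?{urlencode(params)}"
```

For example, `build_crawl_request("YOUR_TOKEN", name="news-crawl", action="pause")` yields a URL carrying `name=news-crawl&pause=1`. Rejecting unknown flags up front keeps a mistyped action from silently turning into a status request.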

How we connect it

  1. Connect your account

    You create a key in Diffbot, one you control, and paste it in once. It lives in a secrets store on your server, not with us.

  2. Set the guardrails

    Read-only by default. You choose which write actions the agent may take, and anything outside that policy gets confirmed with you first.

  3. We keep it running

    Health checks on every connection, updates handled for you, and we watch the first week of activity to make sure the work lands.
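
The guardrail step can be pictured as a small policy check. The sketch below is an illustration under stated assumptions, not the actual implementation: the `Policy` class, the `decide` function, and the choice of which actions count as read-only are all hypothetical. It shows the rule described above: reads pass, approved writes pass, and anything else is held for confirmation.

```python
from dataclasses import dataclass, field

# Hypothetical classification: a few actions from the list above that only read data.
READ_ACTIONS = {
    "Get Article Data",
    "Get Bulk Job Status",
    "List Custom APIs",
}


@dataclass
class Policy:
    """Write actions the account owner has approved in advance (hypothetical)."""
    allowed_writes: set = field(default_factory=set)


def decide(action, policy):
    """Return "allow" or "confirm" for a requested action.

    Read actions are always allowed. A write action is allowed only if the
    policy lists it. Everything else, including unknown actions, is held
    for confirmation.
    """
    if action in READ_ACTIONS:
        return "allow"
    if action in policy.allowed_writes:
        return "allow"
    return "confirm"
```

With an empty `Policy()`, `decide("Delete Custom API", Policy())` returns `"confirm"`; once that action is added to `allowed_writes`, the same call returns `"allow"`. Treating unknown actions as needing confirmation keeps the default conservative.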

Diffbot questions, answered.

The agent connects with a key you create and control. You paste it in once, it is kept in a secrets store on your server, permissions are scoped to the minimum the agent needs, and you can revoke it at any time.
The agent can take the actions Diffbot's API allows, the same things a person clicking around the app could do. Connections start read-only by default; write actions are confirmed against the policy you set before the agent takes them.
Connections are priced per tool on top of the base plan. Some are included, some are premium. See pricing for how connection charges work.
Standard tools are ready within 7 business days of the setup call. We test the connection end to end, walk you through how the agent uses it, and watch the first week of activity.

Ready to put Diffbot to work?

Tell us what your team runs on. We set up the connection, secure it, and your agent takes it from there.

All product names, logos, and brands are property of their respective owners; used for identification only. ZeroToClaw is not affiliated with or endorsed by Diffbot.