Connect HoneyHive to your AI agent
HoneyHive is a modern AI observability and evaluation platform that enables developers and domain experts to collaboratively build reliable AI applications faster.
We set up the connection using your own HoneyHive account, with keys you control, and keep it running. Your agent picks it up and starts doing the work.
What your agent can do in HoneyHive
Each item below is an action the agent can take on its own, the same things a person clicking around HoneyHive could do. The agent is read-only by default; write actions are checked against your policy, and anything outside it is confirmed with you first.
- Add datapoints to dataset: Tool to add datapoints to a dataset. Use when you need to append multiple entries with specified input, ground truth, and history mappings.
- Compare Experiment Runs: Tool to retrieve experiment comparison between two evaluation runs. Use when you need to analyze the differences in metrics, datapoints, and events between two runs.
- Compare Runs Events: Tool to compare events between two experiment runs side-by-side. Use when analyzing differences in model behavior, performance metrics, or outputs between evaluation runs. Returns matched event pairs with their respecti…
- Batch Create Datapoints: Tool to create multiple datapoints in a single batch operation. Use when you need to bulk-import events into a dataset or create many datapoints at once. Supports filtering by date range, event IDs, or custom criteria.…
- Create Batch Model Events: Tool to create multiple model events in a single request. Use when you need to log a batch of event interactions to HoneyHive.
- Create Batch Tool Events: Tool to log a batch of external API calls as tool events. Use when you need to record multiple tool events in one request, after gathering all event data.
- Create Configuration: Creates a new configuration in HoneyHive for managing LLM or pipeline settings. Use this to define reusable configurations with specific models, prompts, and parameters that can be deployed across different environments…
- Create Datapoint: Tool to create a new datapoint with input-output pairs. Use when you need to add a single datapoint with inputs, ground truth, conversation history, and metadata.
- Create Dataset: Tool to create a dataset. Use when you need to initialize a new dataset within a project.
- Create Event: Tool to create a new event in HoneyHive to track execution of different parts of your application. Use when you need to log a model call, tool execution, or chain step. Events can be grouped into sessions and nested hie…
- Create Metric: Tool to create a new metric in HoneyHive. Use when you need to define how to evaluate model outputs, whether through code (PYTHON), AI evaluation (LLM), human review (HUMAN), or combining multiple metrics (COMPOSITE). I…
- Create Model Event: Tool to create a new model event to log LLM call data. Use when you need to track a single model interaction including messages, responses, usage, and metadata.
- Create Tool: Creates a new tool definition in a HoneyHive project. Use this to register functions or plugins that can be invoked and tracked within HoneyHive. Tools are defined with a JSON Schema for their parameters, allowing Honey…
- Delete Datapoint: Tool to delete a specific datapoint by its ID. Use when you need to remove a datapoint from HoneyHive after confirming its identifier.
- Delete Dataset: Tool to delete a dataset by ID. Use when you need to remove a dataset after confirming its ID.
- End Evaluation Run: Tool to update an evaluation run's status and metadata. Use to mark a run as completed after finishing evaluations, or update run properties like name, metadata, configuration, and associated event/datapoint IDs.
- Get Configurations: Tool to retrieve a list of configurations. Use when you need to fetch all configurations for a specific project before making changes.
- Get Datasets: Retrieve datasets from HoneyHive for a specified project. Use this tool when you need to: - List all datasets within a project - Find datasets by type (evaluation or fine-tuning) - Retrieve a specific dataset by its ID…
- Get Events: Tool to query events with filters and projections from HoneyHive. Use this action when you need to retrieve events with lightweight filtering (limit 1000 results). For bulk exports or more complex queries, use the Retri…
- Get Events By Session ID: Tool to retrieve the complete tree of nested events for a specific session. Use when you need to analyze all events (model calls, tool calls, chains) that occurred within a session, including their hierarchical relation…
- Get Events Chart: Tool to retrieve charting and analytics data for events over time. Use when you need aggregated metrics (duration, cost, token usage) grouped by time buckets or fields. Supports percentile analysis (p50, p95, p99) for l…
- Get Metrics: Retrieves all metrics associated with a HoneyHive project. Returns a list of metrics including their configuration (name, type, description, thresholds, evaluator details) and metadata (creation/update timestamps, sampl…
- Get Projects: Tool to retrieve all projects in the HoneyHive account. Use when you need to list available projects, get project IDs for use in other API calls, or search for a specific project by name.
- Get Evaluation Run Details: Tool to get details of an evaluation run by its UUID. Use when you need to check the status, configuration, results, or metadata of a specific evaluation run.
- Get Run Metrics: Tool to get event metrics for an experiment run. Use when you need to retrieve metrics computed on events within a specific experiment run. Returns an array of event objects with their associated metrics, which can be f…
- Get Evaluation Runs: Tool to retrieve a list of evaluation runs from HoneyHive. Use when you need to: - List all evaluation runs for analysis - Find runs by status, name, or dataset - Get specific runs by their IDs - Paginate through large…
- Get Runs Schema: Tool to retrieve the schema for experiment runs in HoneyHive. Use when you need to understand available fields, datasets, and mappings for experiment runs.
- Get Session: Retrieve a complete session tree by session ID from HoneyHive. Use this tool to fetch the full session hierarchy including all nested events (model calls, tool calls, chains) with their inputs, outputs, durations, and m…
- List Tools: Tool to list all available HoneyHive tools. Use when you need to discover which functions or plugins are registered for use.
- Retrieve Datapoint: Retrieve a specific datapoint by its ID from HoneyHive. Use this tool when you need the full details of a single datapoint, including its inputs, ground truth, conversation history, linked datasets, and metadata. Prereq…
- Retrieve Datapoints: Retrieve datapoints from a HoneyHive project. Use this tool to fetch evaluation datapoints containing inputs, ground truth, and metadata. Supports filtering by specific datapoint IDs or dataset name. Commonly used to: -…
- Retrieve Events: Retrieve and export events from a HoneyHive project. Use this tool to query traced events (model calls, tool calls, sessions, chains) with optional filters by event_type, metadata, feedback scores, or date range. Return…
- Retrieve Experiment Result: Tool to retrieve the result of a specific experiment run. Use when you need the status, metrics, and datapoint-level details of a completed experiment.
- Start Evaluation Run: Creates a new evaluation run to group and track multiple session events for analysis. Use this action when you want to: - Compare model performance across multiple sessions - Create evaluation batches for quality assura…
- Start Session: Start a new HoneyHive session for tracing and observability. Use this tool to initiate a tracking session that groups together related model, tool, and chain events. Returns a session_id that should be used to link subs…
- Update Configuration: Tool to update an existing HoneyHive configuration. Use when you need to modify a configuration's name, provider, model parameters, environments, or other settings. You must provide the configuration ID (obtainable via…
- Update Datapoint: Update an existing datapoint by ID. Use this to modify any combination of inputs, ground_truth, history, metadata, linked_datasets, or linked_evals for a datapoint. Requires a valid datapoint ID obtained from retrieve_d…
- Update Dataset: Tool to update an existing dataset. Use when you need to modify a dataset's details (name, description, datapoints, linked evaluations, or metadata) after confirming its ID.
- Update Event: Update an existing HoneyHive event by ID. Use to attach feedback, metrics, metadata, outputs, config, user properties, or update duration on events created via start_session or batch event creation. At least one optiona…
- Update Metric: Tool to update an existing metric. Use when you need to modify a metric’s properties after creation. Ensure you retrieve the metric first to verify its current state.
- Update Project: Updates an existing HoneyHive project's name or description. Use this action to modify project metadata after creation. You must provide the project_id and at least one field to update (name or description). To find pro…
- Update Tool: Tool to update an existing tool in HoneyHive. Use when you need to modify a tool's name, description, parameters, or type after confirming its ID. At least one optional field must be provided alongside the required tool…
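Several of the actions above (Get Session, Get Events By Session ID) describe a session as a tree of nested model, tool, and chain events. As a rough illustration of working with such a tree, here is a minimal sketch in Python. The dictionary shape, including the `children`, `event_type`, and `event_name` keys and the sample session itself, is assumed for the example and is not taken from the HoneyHive API reference.

```python
from collections import Counter
from typing import Any, Iterator

Event = dict[str, Any]


def walk(event: Event, depth: int = 0) -> Iterator[tuple[int, Event]]:
    """Yield (depth, event) pairs in depth-first order.

    Assumes each event keeps its nested events under a "children" key;
    that key name is hypothetical.
    """
    yield depth, event
    for child in event.get("children", []):
        yield from walk(child, depth + 1)


# Hand-written sample data standing in for a fetched session tree.
session: Event = {
    "event_type": "session",
    "event_name": "support-chat",
    "children": [
        {
            "event_type": "chain",
            "event_name": "answer",
            "children": [
                {"event_type": "model", "event_name": "draft", "children": []},
                {"event_type": "tool", "event_name": "search", "children": []},
            ],
        },
        {"event_type": "model", "event_name": "summarize", "children": []},
    ],
}

# Count events by type and find how deeply the tree nests.
counts = Counter(event["event_type"] for _, event in walk(session))
max_depth = max(depth for depth, _ in walk(session))

# Print an indented outline of the session.
for depth, event in walk(session):
    print("  " * depth + f'{event["event_type"]}: {event["event_name"]}')
```

A depth-first walk keeps each child directly under its parent in the output, which matches how a nested trace is usually read.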
How we connect it
1. Connect your account
You create a key in HoneyHive and paste it in once. The key stays under your control: it lives in a secrets store on your server, not with us.
2. Set the guardrails
Read-only by default. You choose which write actions the agent may take, and anything outside that policy gets confirmed with you first.
3. We keep it running
Health checks on every connection, updates handled for you, and we watch the first week of activity to make sure the work lands.
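The guardrail rule in step 2 can be pictured as a small decision function. The following is a minimal sketch under stated assumptions, not the actual implementation: the `Decision` enum, the `decide` function, and the idea of classifying an action as read-only by its leading verb (Get, List, Retrieve, Compare) are all illustrative choices based on the action names listed earlier.

```python
from enum import Enum

# Leading verbs treated as read-only. This classification is an assumption
# inferred from the action names, not a documented rule.
READ_VERBS = ("Get", "List", "Retrieve", "Compare")


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"


def is_read_only(action: str) -> bool:
    """Treat an action as read-only when its first word is a read verb."""
    words = action.split()
    return bool(words) and words[0] in READ_VERBS


def decide(action: str, allowed_writes: frozenset[str]) -> Decision:
    """Allow reads and pre-approved writes; ask for confirmation otherwise."""
    if is_read_only(action) or action in allowed_writes:
        return Decision.ALLOW
    return Decision.CONFIRM


# Hypothetical policy: the only write actions this team has approved.
policy = frozenset({"Create Datapoint", "Update Event"})

for name in ("Get Projects", "Create Datapoint", "Delete Dataset"):
    print(name, "->", decide(name, policy).value)
```

Keeping the approved writes as an explicit allowlist means a newly added write action falls through to confirmation until someone opts in.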
Ready to put HoneyHive to work?
Tell us what your team runs on. We set up the connection, secure it, and your agent takes it from there.
All product names, logos, and brands are property of their respective owners and are used for identification only. ZeroToClaw is not affiliated with or endorsed by HoneyHive.