Description
The documentation for JudgevalCallbackHandler mentions that it "captures node executions, tool calls, and LLM interactions," but it does not explain several behaviors that developers care about in production or debugging contexts:
1. Retry Behavior:
It's unclear whether the handler logs all retry attempts of a failed LLM call or tool invocation, or if it only captures the final (successful or failed) attempt. This makes it difficult to understand or diagnose multiple retries during execution.
2. Intermediate Outputs:
There is no explanation of whether intermediate state updates or tool responses—especially those that don't result in a final graph output—are included in the trace. Developers often need insight into these mid-graph steps for debugging.
3. Failures and Exceptions:
It is not specified whether:
- Failed tool executions (e.g., tools that raise exceptions or return error messages) are logged,
- Nodes that raise unhandled exceptions are still captured in the trace with partial context.
4. Trace Completeness Guarantees:
The documentation doesn’t describe whether the trace is guaranteed to be complete even when the graph crashes partway, or what guarantees exist around span closure for partial runs.
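To make the four questions above concrete, here is a minimal, dependency-free sketch of one possible set of answers: every retry attempt gets its own span, intermediate outputs are stored on the span, failures are recorded with the error, and every span is closed even when the call crashes. `ToyTracer`, `Span`, `call_with_retries`, and `make_flaky_tool` are hypothetical names invented for this illustration. This is not Judgeval's API and does not describe how JudgevalCallbackHandler actually behaves; it only shows the kind of behavior the documentation should spell out.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Span:
    """One recorded attempt of a node or tool call (hypothetical structure)."""
    name: str
    attempt: int
    status: str = "open"            # becomes "ok" or "error" when the span closes
    error: Optional[str] = None
    output: Optional[object] = None  # intermediate output, if any


@dataclass
class ToyTracer:
    """Hypothetical in-memory recorder; NOT the Judgeval implementation."""
    spans: List[Span] = field(default_factory=list)

    @contextmanager
    def span(self, name: str, attempt: int):
        s = Span(name=name, attempt=attempt)
        self.spans.append(s)
        try:
            yield s
            s.status = "ok"
        except Exception as exc:
            # Close the span with the error, then let the exception propagate.
            s.status = "error"
            s.error = repr(exc)
            raise


def call_with_retries(tracer: ToyTracer, name: str,
                      fn: Callable[[], object], max_attempts: int = 3):
    """Run fn, recording one span per attempt (assumed policy: log all retries)."""
    last_exc: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            with tracer.span(name, attempt) as s:
                s.output = fn()
                return s.output
        except Exception as exc:
            last_exc = exc
    assert last_exc is not None, "max_attempts must be at least 1"
    raise last_exc


def make_flaky_tool(failures: int) -> Callable[[], str]:
    """Return a tool that raises on its first `failures` calls, then succeeds."""
    calls = {"n": 0}

    def tool() -> str:
        calls["n"] += 1
        if calls["n"] <= failures:
            raise RuntimeError(f"transient failure {calls['n']}")
        return "tool result"

    return tool


tracer = ToyTracer()

# Two failed attempts followed by a success: three spans are recorded.
result = call_with_retries(tracer, "flaky_tool", make_flaky_tool(2))

# A node that never succeeds: the exception propagates, but its span is closed.
crashed = False
try:
    call_with_retries(tracer, "broken_node", make_flaky_tool(99), max_attempts=1)
except RuntimeError:
    crashed = True

for s in tracer.spans:
    print(s.name, s.attempt, s.status, s.error)
```

In this toy model the answers are explicit: retries are all logged, failed calls appear with status `error`, and no span is left `open` after a crash. The requested documentation should state which of these properties, if any, JudgevalCallbackHandler provides.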
Suggested Fix
Please enhance the documentation for JudgevalCallbackHandler by adding a dedicated section that covers:
- What does JudgevalCallbackHandler trace?
- Retries
- Failures
- Partial Runs
- Intermediate Outputs
Adding this section will significantly improve clarity for developers who rely on the handler for debugging and observability in LangGraph workflows.