This feature is currently in public preview. Functionality and behavior may change in future updates.
Pre-requisites
Run the following command to enable watsonx Orchestrate Developer Edition to process documents:
BASH
Note:
Allocate at least 20 GB of RAM to your Docker engine when you install watsonx Orchestrate Developer Edition. Document processing features require this minimum.
Note:
To run the document classifier, you must define the WO_INSTANCE, WO_API_KEY, and AUTHORIZATION_URL credentials in your .env file. For more information on configuring the .env file, see Installing the watsonx Orchestrate Developer Edition.
Configuring the document classifier node in agentic workflows
- Define document classes
Create a class that defines the document classes to classify. Each document class must follow this structure:
Python
- Configure the document classifier node
Use the docclassifier() method in your agentic workflow to classify the document. This method accepts the following input arguments:
Parameter | Type | Required | Description |
---|---|---|---|
name | string | Yes | Unique identifier for the node. |
llm | string | Yes | The LLM used for document classification. |
display_name | string | No | Display name for the node. |
classes | object | Yes | The document classification classes. |
description | string | No | Description of the node. |
min_confidence | float | No | Minimum confidence threshold for classification. |
review_fields | List[string] | No | The fields that require user review. |
input_map | DataMap | No | Define input mappings using a structured collection of Assignment objects. |
enable_review | bool | No | Enables or disables the human-in-the-loop feature. Set to True to enable it or False to disable it. The default value is False. |
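The required and optional arguments in the table can be checked before a node is built. The following is a minimal sketch in plain Python, not the ADK's API: the helper check_docclassifier_args and the sample values (the node name, the model identifier, and the classes placeholder) are hypothetical, and only the parameter names and Required flags come from the table.

```python
# Parameter names and "Required" flags copied from the table above.
REQUIRED = {"name", "llm", "classes"}
OPTIONAL = {
    "display_name", "description", "min_confidence",
    "review_fields", "input_map", "enable_review",
}


def check_docclassifier_args(kwargs: dict) -> list:
    """Return a list of problems found in keyword arguments meant for docclassifier().

    Hypothetical helper for illustration; it is not part of the ADK.
    """
    supplied = set(kwargs)
    problems = []
    # Every argument marked "Yes" in the Required column must be present.
    for missing in sorted(REQUIRED - supplied):
        problems.append(f"missing required argument: {missing}")
    # Anything not listed in the table is flagged as unknown.
    for unknown in sorted(supplied - REQUIRED - OPTIONAL):
        problems.append(f"unknown argument: {unknown}")
    # The table lists min_confidence as a float.
    conf = kwargs.get("min_confidence")
    if conf is not None and not isinstance(conf, (int, float)):
        problems.append("min_confidence must be a number")
    return problems


# Hypothetical values, for illustration only.
example_args = {
    "name": "classify_document",
    "llm": "example-model-id",
    "classes": object(),  # placeholder for the document classes object
    "min_confidence": 0.8,
    "review_fields": ["class_name"],
    "enable_review": True,
}

print(check_docclassifier_args(example_args))
print(check_docclassifier_args({"name": "classify_document"}))
```

A check like this only catches missing or misspelled argument names early. It does not validate the values against what the service actually accepts.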
Note:
The min_confidence and review_fields settings control the human-in-the-loop feature. This feature only works when you run the Flow from a chat session.
If a field is extracted with confidence lower than min_confidence, and its name appears in review_fields, the agent opens a review window in the chat. You can then review and confirm the extracted values.
Example of a docext node in an agentic workflow:
Python
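The review rule described in the note above can be sketched in plain Python. This illustrates the stated condition only and is not the ADK's implementation: the function needs_review and its arguments are hypothetical names, the comparison is strict because the note says "lower than", and gating on enable_review is an assumption based on the parameter table.

```python
def needs_review(field_name, confidence, min_confidence, review_fields, enable_review=False):
    """Return True when a field would be shown in the chat review window.

    Hypothetical helper. Assumption: nothing is reviewed when
    enable_review is False.
    """
    if not enable_review:
        return False
    # Both conditions from the note must hold: the confidence is below the
    # threshold AND the field name is listed in review_fields.
    return confidence < min_confidence and field_name in review_fields


# Hypothetical field names and confidence values, for illustration only.
review_fields = ["invoice_number"]
print(needs_review("invoice_number", 0.62, 0.80, review_fields, enable_review=True))
print(needs_review("invoice_number", 0.95, 0.80, review_fields, enable_review=True))
print(needs_review("total", 0.10, 0.80, review_fields, enable_review=True))
```

Under these assumptions, only the first call reports that a review is needed: the second has a confidence above the threshold, and the third names a field that is not in review_fields.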