Creating Knowledge Bases
With the ADK, you can create knowledge bases for your agents, either by connecting to your own Elasticsearch or Milvus instance or by uploading your documents.
Use YAML, JSON or Python files to create your knowledge bases for watsonx Orchestrate.
Creating built-in Milvus knowledge bases
If you don’t have an existing Milvus or Elasticsearch instance to connect to, you can create a knowledge base by simply uploading your documents. These documents will be ingested into the built-in Milvus instance, which will serve as the backend for your knowledge base.
Once the knowledge base is created, you can check its status to see when it’s ready for use.
Creating external knowledge bases
External knowledge bases allow you to connect your existing Milvus or Elasticsearch databases as a knowledge source for your agent. To configure a knowledge base with your external database, use conversational_search_tool.index_config to define the connection details for your Milvus or Elasticsearch instance.
Use the field_mapping in your index_config to specify which fields from the search results are used for the title, body and, optionally, url of the search result.
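As a rough illustration of what a field_mapping expresses, the sketch below maps a raw document from an external index onto the title, body and url of a search result. The helper apply_field_mapping and the index field names (doc_title, text, link) are hypothetical and are not part of the ADK; the sketch only shows the idea of the mapping.

```python
# Hypothetical illustration only: this is not ADK code.
def apply_field_mapping(hit: dict, field_mapping: dict) -> dict:
    """Build a search result from a raw index document using field_mapping."""
    result = {
        "title": hit.get(field_mapping["title"]),
        "body": hit.get(field_mapping["body"]),
    }
    # url is optional in the mapping, so only include it when it is mapped.
    url_field = field_mapping.get("url")
    if url_field is not None:
        result["url"] = hit.get(url_field)
    return result


# Assumed field names in the external index.
field_mapping = {"title": "doc_title", "body": "text", "url": "link"}
hit = {
    "doc_title": "Refund policy",
    "text": "Refunds are issued within 30 days.",
    "link": "https://example.com/refunds",
}
mapped = apply_field_mapping(hit, field_mapping)
```

With a mapping that omits url, the resulting search result simply has no url entry.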
Milvus
When connecting to a Milvus instance:
Ensure the provided embedding_model_id is the one used when ingesting the documents in your index.
Additionally, ensure you use the gRPC host and port from your Milvus instance. Connections will fail if you use the HTTP host or port.
Elasticsearch
For Elasticsearch, you can provide a custom query_body that will be sent as the POST body in the search request. This allows for advanced query customization.
- If provided, the query_body must include the $QUERY token, which will be replaced by the user’s query at runtime.
- If no custom query_body is provided, a keyword search will be used.
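The following sketch shows one possible custom query_body, written as a Python dictionary, together with a small check that it contains the required $QUERY token. The index field name text and the helper has_query_token are assumptions made for illustration; only the $QUERY requirement comes from the description above.

```python
import json


def has_query_token(query_body: dict) -> bool:
    """Return True if the serialized query body contains the $QUERY token."""
    return "$QUERY" in json.dumps(query_body)


# A simple match query; "text" is an assumed field name in your index.
query_body = {
    "query": {
        "match": {"text": "$QUERY"}
    }
}

is_valid = has_query_token(query_body)
```

A check like this can catch a missing token before the knowledge base is imported.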
To further customize the Elasticsearch query, result_filter can be set to an array of Elasticsearch filters. If using both query_body and result_filter, the query_body must include the $FILTER token, which will be replaced by the result_filter array at runtime.
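To make the token replacement concrete, here is a minimal sketch of how $QUERY and $FILTER could be substituted into a query_body. It is not the product’s implementation: the function render_query_body, the field names text and category, and the choice to place "$FILTER" as the value of the bool filter clause are all assumptions for illustration.

```python
def render_query_body(query_body, user_query, result_filter=None):
    """Return a copy of query_body with $QUERY and $FILTER replaced."""

    def walk(node):
        if isinstance(node, dict):
            return {key: walk(value) for key, value in node.items()}
        if isinstance(node, list):
            return [walk(item) for item in node]
        if isinstance(node, str):
            # A string that is exactly "$FILTER" becomes the filter array.
            if node == "$FILTER":
                return list(result_filter or [])
            # Any other string has $QUERY replaced by the user's query.
            return node.replace("$QUERY", user_query)
        return node

    return walk(query_body)


# Assumed field names: "text" and "category".
query_body = {
    "query": {
        "bool": {
            "must": [{"match": {"text": "$QUERY"}}],
            "filter": "$FILTER",
        }
    }
}
result_filter = [{"term": {"category": "billing"}}]

rendered = render_query_body(query_body, "how do refunds work", result_filter)
```

The original query_body is left untouched, so the same template can be rendered again for each user query.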
For more information about customizing the Elasticsearch query body and filters, see How to configure the advanced Elasticsearch settings.
Custom search engine
You can create knowledge bases for your own custom search engine by following these examples:
Configuring generation options
With the ADK, you can further fine-tune how your agent uses knowledge through the conversational_search_tool configuration in your knowledge base.
You can apply these settings to both built-in Milvus knowledge bases and external knowledge bases. Below are the configurable options available within the conversational_search_tool section:
Configuration | Description |
---|---|
prompt_instruction | Set this under generation. If specified, this instruction will be included in the prompt sent to the language model to guide response generation. |
generated_response_length | Set this under generation to one of Concise, Moderate or Verbose. This setting adjusts the prompt to request responses of the specified length. If not set, the default is Moderate. |
retrieval_confidence_threshold | Set this under confidence_thresholds to one of Lowest, Low, High or Highest. This threshold determines the minimum confidence required that the retrieved documents answer the user’s query. If the confidence is below the threshold, the agent will return a default “I don’t know” response instead of generating a response. The default is Low. |
response_confidence_threshold | Set this under confidence_thresholds to one of Lowest, Low, High or Highest. This threshold evaluates the confidence that both the generated response and the retrieved documents answer the user’s query. If the confidence is below the threshold, the agent will return a default “I don’t know” response. The default is Low. |
query_rewrite | If enabled, the user’s query is rewritten using the context of the conversation to support multi-turn interactions. This setting is enabled by default. |
citations_shown | Set this under citations. This controls the maximum number of citations shown to the user in a knowledge-based response. If not set, the default is -1, which means all available citations will be displayed. |
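As a sketch of how these options nest, the dictionary below mirrors a conversational_search_tool section using the parent keys named in the table, and a small validator checks the allowed values and defaults. The validator validate_generation_options and the example values are hypothetical. The query_rewrite option is left out because the table does not state where or in what shape it is set.

```python
ALLOWED_LENGTHS = {"Concise", "Moderate", "Verbose"}
ALLOWED_THRESHOLDS = {"Lowest", "Low", "High", "Highest"}

# Example values are illustrative; the key names come from the table above.
conversational_search_tool = {
    "generation": {
        "prompt_instruction": "Answer using only the retrieved documents.",
        "generated_response_length": "Concise",
    },
    "confidence_thresholds": {
        "retrieval_confidence_threshold": "Low",
        "response_confidence_threshold": "High",
    },
    "citations": {"citations_shown": 3},
}


def validate_generation_options(tool: dict) -> list:
    """Return a list of problems found; an empty list means the options look valid."""
    errors = []

    # Default is Moderate when the option is not set.
    length = tool.get("generation", {}).get("generated_response_length", "Moderate")
    if length not in ALLOWED_LENGTHS:
        errors.append(f"generated_response_length: unexpected value {length!r}")

    # Both thresholds default to Low when not set.
    thresholds = tool.get("confidence_thresholds", {})
    for key in ("retrieval_confidence_threshold", "response_confidence_threshold"):
        value = thresholds.get(key, "Low")
        if value not in ALLOWED_THRESHOLDS:
            errors.append(f"{key}: unexpected value {value!r}")

    # -1 means all available citations are shown.
    shown = tool.get("citations", {}).get("citations_shown", -1)
    if not isinstance(shown, int) or shown < -1:
        errors.append(f"citations_shown: unexpected value {shown!r}")

    return errors


problems = validate_generation_options(conversational_search_tool)
```

The same structure can be written in a YAML or JSON knowledge base file; the dictionary form is used here only so the values can be checked in code.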
Configuring the Hate, Abuse, and Profanity (HAP) filter
A Hate, Abuse, and Profanity (HAP) filter is a feature that helps maintain an inclusive environment by identifying and addressing hate speech, abuse, and profanity. This filter is used to provide a positive online atmosphere and a safe community for users. It filters content to prevent the generation of hate speech, abuse, and profanity, and provides a generic fallback response if such content is detected.
You can configure HAP settings for your knowledge bases by using the enabled and threshold parameters. You must set both parameters under conversational_search_tool > hap_filtering > output in the knowledge base schema.
Parameter | Description |
---|---|
enabled | Turn HAP on or off in your knowledge base. Set it to true to enable HAP filtering, or false to disable it. |
threshold | Set how sensitive the HAP filter is. Use a value between 0 and 1. |
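The nesting described above can be sketched as a Python dictionary. The helper make_hap_filtering is hypothetical and only illustrates where enabled and threshold sit and the 0 to 1 range for threshold; the resulting hap_filtering section belongs under conversational_search_tool.

```python
def make_hap_filtering(enabled: bool, threshold: float) -> dict:
    """Build the hap_filtering section that goes under conversational_search_tool."""
    # The threshold must be a value between 0 and 1.
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    return {
        "hap_filtering": {
            "output": {
                "enabled": enabled,
                "threshold": threshold,
            }
        }
    }


# 0.5 is an arbitrary example value.
conversational_search_tool = make_hap_filtering(enabled=True, threshold=0.5)
```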