Overview

Embeddings turn text into vectors that capture semantic meaning, so that similar texts end up close together in vector space. This makes them useful for search, clustering, recommendations, classification, anomaly detection, and semantic similarity checks such as finding duplicate or related content. In practice, embeddings are often used in retrieval-augmented generation (RAG): embed your documents, embed the user’s question, compare the vectors, and pass the most relevant passages to a model. They are also commonly used for text search and “find things like this” workflows, where simple keyword matching falls short. WxOEmbeddings supports the same API as LangChain’s embeddings abstractions, so it can be used as a direct replacement for them in code that runs inside Orchestrate.
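
The "compare the vectors, return the most relevant passages" step can be sketched without any service call. The snippet below is only an illustration: the three-dimensional vectors are made-up stand-ins for real embeddings, and the helper names cosine_similarity and top_k are not part of the SDK.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return (index, score) pairs for the k document vectors closest to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings (illustrative values only).
passages = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Bananas are yellow.",
]
passage_vecs = [[0.9, 0.1, 0.0], [0.7, 0.3, 0.1], [0.0, 0.1, 0.9]]
question_vec = [1.0, 0.0, 0.0]

for index, score in top_k(question_vec, passage_vecs, k=2):
    print(f"{score:.3f}  {passages[index]}")
```

With real embeddings, question_vec would come from embed_query and passage_vecs from embed_documents; a vector store such as FAISS does the same ranking at scale.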

Initialization patterns

From Instance Credentials (Standalone/Runs-Elsewhere Mode)

PYTHON
from ibm_watsonx_orchestrate_sdk.langchain import WxOEmbeddings

embeddings = WxOEmbeddings.from_instance_credentials(
    instance_url="https://your-instance.cloud.ibm.com",
    api_key="your-wxo-api-key",
    model="openai/text-embedding-3-small"
)

# Embed a single query
query_embedding = embeddings.embed_query("What is machine learning?")
print(f"Embedding dimension: {len(query_embedding)}")
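
Hard-coding the API key is fine for a quick test, but in practice you may prefer to read the credentials from the environment. The sketch below is one way to do that. The variable names WXO_INSTANCE_URL, WXO_API_KEY, and WXO_EMBEDDING_MODEL and the helper load_wxo_settings are hypothetical; the SDK does not define them.

```python
import os


def load_wxo_settings(env=None, default_model="openai/text-embedding-3-small"):
    """Collect keyword arguments for from_instance_credentials from the environment.

    The environment variable names are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("WXO_INSTANCE_URL", "WXO_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "instance_url": env["WXO_INSTANCE_URL"],
        "api_key": env["WXO_API_KEY"],
        "model": env.get("WXO_EMBEDDING_MODEL", default_model),
    }


# Demonstration with an explicit mapping instead of the real environment.
demo = load_wxo_settings(
    {
        "WXO_INSTANCE_URL": "https://your-instance.cloud.ibm.com",
        "WXO_API_KEY": "your-wxo-api-key",
    }
)
print(demo["model"])

# With the SDK installed and real credentials exported, the result can be unpacked:
# embeddings = WxOEmbeddings.from_instance_credentials(**load_wxo_settings())
```

The returned keys match the instance_url, api_key, and model parameters used in the example above, so the dictionary can be passed with ** unpacking.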

From Runnable Config (Runtime/Runs-On Mode)

PYTHON
from ibm_watsonx_orchestrate_sdk.langchain import WxOEmbeddings
from langchain_core.runnables import RunnableConfig

def create_agent(config: RunnableConfig):
    embeddings = WxOEmbeddings.from_runnable_config(
        config=config,
        model="openai/text-embedding-3-small"
    )
    
    # Embed a single query
    query_embedding = embeddings.embed_query("What is machine learning?")
    print(f"Embedding dimension: {len(query_embedding)}")
    
    return embeddings

From Execution Context (Runtime/Runs-On Mode)

PYTHON
from ibm_watsonx_orchestrate_sdk.langchain import WxOEmbeddings

# The execution context is provided by the WxO runtime through the
# RunnableConfig passed to your agent (named runnable_config here)
execution_context = runnable_config.get("configurable", {}).get("execution_context")

embeddings = WxOEmbeddings.from_execution_context(
    execution_context=execution_context,
    model="openai/text-embedding-3-small"
)

query_embedding = embeddings.embed_query("What is machine learning?")
print(f"Embedding dimension: {len(query_embedding)}")

Usage examples

Basic Embeddings

PYTHON
from ibm_watsonx_orchestrate_sdk.langchain import WxOEmbeddings

embeddings = WxOEmbeddings.from_instance_credentials(
    instance_url="https://your-instance.cloud.ibm.com",
    api_key="your-api-key",
    model="openai/text-embedding-3-small"
)

# Embed a single query
query = "What is the capital of France?"
query_embedding = embeddings.embed_query(query)
print(f"Query embedding: {len(query_embedding)} dimensions")

# Embed multiple documents
documents = [
    "Paris is the capital of France.",
    "London is the capital of England.",
    "Berlin is the capital of Germany."
]
doc_embeddings = embeddings.embed_documents(documents)
print(f"Embedded {len(doc_embeddings)} documents")
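
For larger document sets you may want to call embed_documents in smaller batches. This page does not say whether the service limits request size, so the batch size here is an assumption, as are the helper embed_in_batches and the stand-in class LengthEmbedder. The stand-in lets the sketch run offline; any object with an embed_documents method, such as the embeddings instance above, could take its place.

```python
def embed_in_batches(embedder, texts, batch_size=16):
    """Call embedder.embed_documents on consecutive slices and concatenate the results."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        vectors.extend(embedder.embed_documents(batch))
    return vectors


class LengthEmbedder:
    """Offline stand-in, NOT part of the SDK: maps each text to a 1-d vector of its length."""

    def __init__(self):
        self.calls = 0

    def embed_documents(self, texts):
        self.calls += 1
        return [[float(len(text))] for text in texts]


stub = LengthEmbedder()
vectors = embed_in_batches(stub, ["a", "bb", "ccc", "dddd", "eeeee"], batch_size=2)
print(f"Embedded {len(vectors)} documents in {stub.calls} requests")
```

Output order matches input order, so vectors[i] still corresponds to texts[i].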

Async Embeddings

PYTHON
import asyncio

from ibm_watsonx_orchestrate_sdk.langchain import WxOEmbeddings

async def embed_async():
    embeddings = WxOEmbeddings.from_instance_credentials(
        instance_url="https://your-instance.cloud.ibm.com",
        api_key="your-api-key",
        model="openai/text-embedding-3-small"
    )
    
    # Async single query
    query_embedding = await embeddings.aembed_query("What is AI?")
    print(f"Query embedding: {len(query_embedding)} dimensions")
    
    # Async multiple documents
    documents = ["Document 1", "Document 2", "Document 3"]
    doc_embeddings = await embeddings.aembed_documents(documents)
    print(f"Embedded {len(doc_embeddings)} documents")

asyncio.run(embed_async())
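
The async methods become more useful when several requests run concurrently. The following sketch uses asyncio.gather with a semaphore to cap the number of in-flight calls. FakeAsyncEmbedder is a hypothetical stand-in so the example runs offline, and the concurrency limit of 4 is an arbitrary assumption rather than a documented service limit.

```python
import asyncio


class FakeAsyncEmbedder:
    """Offline stand-in exposing aembed_query; NOT part of the SDK."""

    async def aembed_query(self, text):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [float(len(text))]


async def embed_many(embedder, queries, max_concurrency=4):
    """Embed all queries concurrently, with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def embed_one(text):
        async with semaphore:
            return await embedder.aembed_query(text)

    # asyncio.gather returns results in the same order as the inputs
    return await asyncio.gather(*(embed_one(query) for query in queries))


results = asyncio.run(
    embed_many(FakeAsyncEmbedder(), ["What is AI?", "What is ML?", "Hi"])
)
print(results)
```

Replacing FakeAsyncEmbedder() with a WxOEmbeddings instance should work unchanged, since only aembed_query is called.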

Semantic Search with Vector Store

PYTHON
from ibm_watsonx_orchestrate_sdk.langchain import WxOEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document

# Initialize embeddings
embeddings = WxOEmbeddings.from_instance_credentials(
    instance_url="https://your-instance.cloud.ibm.com",
    api_key="your-api-key",
    model="openai/text-embedding-3-small"
)

# Create documents
documents = [
    Document(page_content="Paris is the capital of France.", metadata={"country": "France"}),
    Document(page_content="London is the capital of England.", metadata={"country": "England"}),
    Document(page_content="Berlin is the capital of Germany.", metadata={"country": "Germany"}),
    Document(page_content="Madrid is the capital of Spain.", metadata={"country": "Spain"}),
]

# Create vector store
vectorstore = FAISS.from_documents(documents, embeddings)

# Perform similarity search
query = "What is the capital of France?"
results = vectorstore.similarity_search(query, k=2)

for doc in results:
    print(f"Content: {doc.page_content}")
    print(f"Metadata: {doc.metadata}\n")

RAG (Retrieval-Augmented Generation)

PYTHON
from ibm_watsonx_orchestrate_sdk.langchain import ChatWxO, WxOEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Initialize embeddings and LLM
embeddings = WxOEmbeddings.from_instance_credentials(
    instance_url="https://your-instance.cloud.ibm.com",
    api_key="your-api-key",
    model="openai/text-embedding-3-small"
)

llm = ChatWxO.from_instance_credentials(
    instance_url="https://your-instance.cloud.ibm.com",
    api_key="your-api-key",
    model="watsonx/ibm/granite-3-8b-instruct"
)

# Create knowledge base
documents = [
    Document(page_content="Python is a high-level programming language."),
    Document(page_content="JavaScript is used for web development."),
    Document(page_content="Java is an object-oriented programming language."),
]

vectorstore = FAISS.from_documents(documents, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

# Create RAG chain
template = """Answer the question based on the following context:

Context: {context}

Question: {question}

Answer:"""

prompt = ChatPromptTemplate.from_template(template)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
)

# Ask a question
response = rag_chain.invoke("What is Python?")
print(response.content)

Similarity Calculation

PYTHON
import numpy as np
from ibm_watsonx_orchestrate_sdk.langchain import WxOEmbeddings

embeddings = WxOEmbeddings.from_instance_credentials(
    instance_url="https://your-instance.cloud.ibm.com",
    api_key="your-api-key",
    model="openai/text-embedding-3-small"
)

# Embed texts
text1 = "Machine learning is a subset of artificial intelligence"
text2 = "AI includes machine learning and deep learning"
text3 = "The weather is nice today"

embedding1 = embeddings.embed_query(text1)
embedding2 = embeddings.embed_query(text2)
embedding3 = embeddings.embed_query(text3)

# Calculate cosine similarity
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

sim_1_2 = cosine_similarity(embedding1, embedding2)
sim_1_3 = cosine_similarity(embedding1, embedding3)

print(f"Similarity between text1 and text2: {sim_1_2:.4f}")
print(f"Similarity between text1 and text3: {sim_1_3:.4f}")
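
When many vectors are compared against each other, for example to flag near-duplicate content, it can help to normalize each vector once so that cosine similarity reduces to a plain dot product. The sketch below uses only the standard library and toy two-dimensional vectors. The helper names and the 0.9 threshold are illustrative assumptions; a sensible threshold depends on the model and the data.

```python
import math
from itertools import combinations


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def similar_pairs(vectors, threshold=0.9):
    """Return (i, j, score) for every pair whose cosine similarity meets the threshold."""
    unit = [normalize(v) for v in vectors]
    pairs = []
    for i, j in combinations(range(len(unit)), 2):
        # For unit vectors the dot product equals the cosine similarity
        score = sum(a * b for a, b in zip(unit[i], unit[j]))
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


# Toy vectors standing in for real embeddings (illustrative values only).
toy = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
print([(i, j) for i, j, _ in similar_pairs(toy, threshold=0.9)])
```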

Supported methods

WxOEmbeddings supports the following methods:
  • embed_query(text) - Embed a single text query
  • embed_documents(texts) - Embed multiple documents
  • aembed_query(text) - Async embed a single text query
  • aembed_documents(texts) - Async embed multiple documents
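
Because the interface is limited to these four methods, code can be written against the method names rather than the concrete class, which makes offline unit tests straightforward. The sketch below is a minimal illustration. EmbeddingsLike and HashEmbeddings are hypothetical names that are not part of the SDK or LangChain, the return types follow LangChain's embeddings interface, and the hash-based vectors carry no semantic meaning.

```python
import asyncio
import hashlib
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class EmbeddingsLike(Protocol):
    """Structural type covering the four methods listed above."""

    def embed_query(self, text: str) -> List[float]: ...
    def embed_documents(self, texts: List[str]) -> List[List[float]]: ...
    async def aembed_query(self, text: str) -> List[float]: ...
    async def aembed_documents(self, texts: List[str]) -> List[List[float]]: ...


class HashEmbeddings:
    """Deterministic offline stand-in for tests; vectors are NOT semantically meaningful."""

    def __init__(self, dimensions: int = 8):
        if not 1 <= dimensions <= 32:  # a SHA-256 digest has 32 bytes
            raise ValueError("dimensions must be between 1 and 32")
        self.dimensions = dimensions

    def embed_query(self, text: str) -> List[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [byte / 255.0 for byte in digest[: self.dimensions]]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self.embed_query(text) for text in texts]

    async def aembed_query(self, text: str) -> List[float]:
        return self.embed_query(text)

    async def aembed_documents(self, texts: List[str]) -> List[List[float]]:
        return self.embed_documents(texts)


fake = HashEmbeddings()
print(isinstance(fake, EmbeddingsLike))
print(len(fake.embed_query("What is machine learning?")))
print(len(asyncio.run(fake.aembed_documents(["Document 1", "Document 2"]))))
```

Note that the runtime isinstance check only verifies that the methods exist, not their signatures.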

Class methods

WxOEmbeddings supports the following class methods:
  • from_instance_credentials(instance_url, api_key, model, **kwargs) - Create from instance credentials (standalone/runs-elsewhere)
  • from_execution_context(execution_context, model, **kwargs) - Create from execution context (runtime/runs-on)
  • from_session(session, model, **kwargs) - Create from AgenticSession (runtime/runs-on)
  • from_runnable_config(config, model, **kwargs) - Create from RunnableConfig (runtime/runs-on)

Embedding model IDs

Use the model ID format returned by the watsonx Orchestrate /models endpoint:
    provider/model-name
Examples:
  • openai/text-embedding-3-small
  • openai/text-embedding-3-large
  • openai/text-embedding-ada-002
  • watsonx/ibm/slate-30m-english-rtrvr
Notes:
  • WxOEmbeddings provides a drop-in replacement for embeddings usage in LangChain-based agents.
  • The model ID must follow the format returned by the platform.
  • Authentication and request routing are handled through the SDK interface.
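
A small check on the model ID format can catch typos before a request is sent. The helper below, split_model_id, is a hypothetical sketch and not an SDK function. It assumes the provider is the text before the first slash, which fits the examples above (including watsonx/ibm/slate-30m-english-rtrvr, where the model name itself contains a slash). It does not check that the model actually exists on your instance.

```python
def split_model_id(model_id):
    """Split 'provider/model-name' at the first slash; raise ValueError if malformed."""
    provider, separator, model_name = model_id.partition("/")
    if not separator or not provider or not model_name:
        raise ValueError(f"expected 'provider/model-name', got {model_id!r}")
    return provider, model_name


for candidate in ("openai/text-embedding-3-small", "watsonx/ibm/slate-30m-english-rtrvr"):
    provider, model_name = split_model_id(candidate)
    print(f"provider={provider}  model={model_name}")
```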
