This page contains the complete API reference for the langchain-couchbase package. For usage examples and tutorials, see the Usage page.

The package provides three main components:

Vector Stores - Store and search document embeddings in Couchbase
- Couchbase Query Vector Store (Query and Index service)
- Couchbase Search Vector Store (Search service)
Caching - Cache LLM responses and enable semantic caching
Chat Message History - Store conversation history in Couchbase

Vector Stores

Couchbase Query Vector Store

Note

This vector store uses the Query and Index service to store and search document embeddings. And is available in Couchbase Server versions 8.0 and above.

class langchain_couchbase.vectorstores.query_vector_store.CouchbaseQueryVectorStore(cluster: Cluster, bucket_name: str, scope_name: str, collection_name: str, embedding: Embeddings, distance_metric: DistanceStrategy, *, text_key: str | None = 'text', embedding_key: str | None = 'embedding')[source]

Bases: BaseCouchbaseVectorStore

__Couchbase__ vector store integration using Query and Index service.

Setup:

Install langchain-couchbase and head over to Couchbase Capella and create a new cluster with a bucket and collection.

For more information on the indexes, see Hyperscale Vector Index documentation or Composite Vector Index documentation.

pip install -U langchain-couchbase

import getpass

COUCHBASE_CONNECTION_STRING = getpass.getpass("Enter the connection string for the Couchbase cluster: ")
DB_USERNAME = getpass.getpass("Enter the username for the Couchbase cluster: ")
DB_PASSWORD = getpass.getpass("Enter the password for the Couchbase cluster: ")

Key init args — indexing params:

embedding: Embeddings: Embedding function to use.

Key init args — client params:

cluster: Cluster: Couchbase cluster object with active connection.
bucket_name: str: Name of the bucket to store documents in.
scope_name: str: Name of the scope in the bucket to store documents in.
collection_name: str: Name of the collection in the scope to store documents in.
distance_metric: DistanceStrategy: Distance metric to use for the index. Options are: DOT, L2, EUCLIDEAN, COSINE, L2_SQUARED, EUCLIDEAN_SQUARED.

Instantiate:

from datetime import timedelta
from langchain_openai import OpenAIEmbeddings
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from langchain_couchbase import CouchbaseQueryVectorStore
from langchain_couchbase.vectorstores import DistanceStrategy

auth = PasswordAuthenticator(DB_USERNAME, DB_PASSWORD)
options = ClusterOptions(auth)
cluster = Cluster(COUCHBASE_CONNECTION_STRING, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

BUCKET_NAME = "langchain_bucket"
SCOPE_NAME = "_default"
COLLECTION_NAME = "_default"

embeddings = OpenAIEmbeddings()

vector_store = CouchbaseQueryVectorStore(
    cluster=cluster,
    bucket_name=BUCKET_NAME,
    scope_name=SCOPE_NAME,
    collection_name=COLLECTION_NAME,
    embedding=embeddings,
    distance_metric=DistanceStrategy.DOT,
)

Add Documents:

from langchain_core.documents import Document

document_1 = Document(page_content="foo", metadata={"baz": "bar"})
document_2 = Document(page_content="thud", metadata={"bar": "baz"})
document_3 = Document(page_content="i will be deleted :(")

documents = [document_1, document_2, document_3]
ids = ["1", "2", "3"]
vector_store.add_documents(documents=documents, ids=ids)

Delete Documents:

vector_store.delete(ids=["3"])

Search:

results = vector_store.similarity_search(query="thud",k=1)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Search with filter:

results = vector_store.similarity_search(query="thud",k=1, where_str="metadata.bar = 'baz'")
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Search with score:

results = vector_store.similarity_search_with_score(query="qux",k=1)
for doc, score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")

* [SIM=-0.832155] foo [{'baz': 'bar'}]

Async:

# add documents
await vector_store.aadd_documents(documents=documents, ids=ids)

# delete documents
await vector_store.adelete(ids=["3"])

# search
results = vector_store.asimilarity_search(query="thud",k=1)

# search with score
results = await vector_store.asimilarity_search_with_score(query="qux",k=1)
for doc,score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")

* [SIM=-0.832155] foo [{'baz': 'bar'}]

Use as Retriever:

retriever = vector_store.as_retriever(
    search_kwargs={"k": 1, "fetch_k": 2, "lambda_mult": 0.5},
)
retriever.invoke("thud")

[Document(id='2', metadata={'bar': 'baz'}, page_content='thud')]

Create a new index for the Query vector store.

Args:

index_type (IndexType): Type of the index (BHIVE or COMPOSITE) to create. index_description (str): Description of the index like “IVF,SQ8”. distance_metric (Optional[DistanceStrategy]): Distance metric to use for the

index_name (str): Name of the index to create.: Defaults to “langchain_{index_type}_query_index”.
vector_field (str): Name of the vector field to use for the index.: Defaults to the embedding key in the constructor.
vector_dimension (Optional[int]): Dimension of the vector field.: If not provided, it will be determined from the embedding object.
fields (List[str]): List of fields to include in the index.: Defaults to the text field in the constructor.
where_clause (Optional[str]): Optional where clause to filter the documents to index.: Defaults to None.
index_scan_nprobes (Optional[int]): Number of probes to use for the index.: Defaults to None.
index_trainlist (Optional[int]): Number of training samples to use for the index.: Defaults to None.

classmethod from_texts(texts: List[str], embedding: Embeddings, metadatas: List[dict] | None = None, **kwargs: Any) → CouchbaseQueryVectorStore[source]

Construct a Couchbase Query Vector Store from a list of texts.

Example:

from langchain_couchbase import CouchbaseQueryVectorStore
from langchain_couchbase.vectorstores import DistanceStrategy
from langchain_openai import OpenAIEmbeddings

from couchbase.cluster import Cluster
from couchbase.auth import PasswordAuthenticator
from couchbase.options import ClusterOptions
from datetime import timedelta

auth = PasswordAuthenticator(username, password)
options = ClusterOptions(auth)
connect_string = "couchbases://localhost"
cluster = Cluster(connect_string, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

embeddings = OpenAIEmbeddings()

texts = ["hello", "world"]

vectorstore = CouchbaseQueryVectorStore.from_texts(
    texts,
    embedding=embeddings,
    cluster=cluster,
    bucket_name="BUCKET_NAME",
    scope_name="SCOPE_NAME",
    collection_name="COLLECTION_NAME",
    distance_metric=DistanceStrategy.COSINE,
)

Args:

texts (List[str]): list of texts to add to the vector store. embedding (Embeddings): embedding function to use. metadatas (optional[List[Dict]): list of metadatas to add to documents. **kwargs: Keyword arguments used to initialize the vector store with and/or

Returns:

A Couchbase Query Vector Store.

similarity_search(query: str, k: int = 4, where_str: str | None = None, **kwargs: Any) → List[Document][source]

Return documents most similar to the query.

Args:

query (str): Query to look up for similar documents k (int): Number of Documents to return.

where_str (Optional[str]): Optional where clause to filter the documents.: Defaults to None.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to all the fields stored in the index.

Returns:

List of Documents most similar to the query.

similarity_search_by_vector(embedding: List[float], k: int = 4, where_str: str | None = None, **kwargs: Any) → List[Document][source]

Return documents that are most similar to the vector embedding.

Args:

embedding (List[float]): Embedding to look up documents similar to. k (int): Number of Documents to return.

where_str (Optional[str]): Optional where clause to filter the documents.: Defaults to None.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to document text and metadata fields.

Returns:

List of Documents most similar to the query.

similarity_search_with_score(query: str, k: int = 4, where_str: str | None = None, **kwargs: Any) → List[Tuple[Document, float]][source]

Return documents that are most similar to the query with their distances. Lower distances are more similar.

Args:

query (str): Query to look up for similar documents k (int): Number of Documents to return.

where_str (Optional[str]): Optional where clause to filter the documents.: Defaults to None.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to text and metadata fields.

Returns:

List of (Document, distance) that are most similar to the query. Lower distances are more similar.

similarity_search_with_score_by_vector(embedding: List[float], k: int = 4, where_str: str | None = None, **kwargs: Any) → List[Tuple[Document, float]][source]

Return docs most similar to embedding vector with their distances. Lower distances are more similar.

Args:

embedding (List[float]): Embedding vector to look up documents similar to. k (int): Number of Documents to return.

where_str (Optional[str]): Optional where clause to filter the documents.: Defaults to None.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to all the fields stored in the index.

Returns:

List of (Document, distance) that are the most similar to the query vector. Lower distances are more similar.

class langchain_couchbase.vectorstores.query_vector_store.DistanceStrategy(*values)[source]

Bases: Enum

Distance strategy for the similarity search.

COSINE = 'cosine'

DOT = 'dot'

EUCLIDEAN = 'euclidean'

EUCLIDEAN_SQUARED = 'euclidean_squared'

L2 = 'l2'

L2_SQUARED = 'l2_squared'

class langchain_couchbase.vectorstores.query_vector_store.IndexType(*values)[source]

Bases: Enum

Type of the Query index to create.

BHIVE = 'bhive'

COMPOSITE = 'composite'

Couchbase Search Vector Store

Note

This vector store uses the Search service to store and search document embeddings. And is available in Couchbase Server versions 7.6 and above.

class langchain_couchbase.vectorstores.search_vector_store.CouchbaseSearchVectorStore(cluster: Cluster, bucket_name: str, scope_name: str, collection_name: str, embedding: Embeddings, index_name: str, *, text_key: str | None = 'text', embedding_key: str | None = 'embedding', scoped_index: bool = True)[source]

Bases: BaseCouchbaseVectorStore

__Couchbase__ vector store integration using Search/FTS service.

Setup:

Install langchain-couchbase and head over to Couchbase Capella and create a new cluster with a bucket, collection and a search index.

For more information on Search service, see the Couchbase Search Service documentation.

pip install -U langchain-couchbase

import getpass

COUCHBASE_CONNECTION_STRING = getpass.getpass("Enter the connection string for the Couchbase cluster: ")
DB_USERNAME = getpass.getpass("Enter the username for the Couchbase cluster: ")
DB_PASSWORD = getpass.getpass("Enter the password for the Couchbase cluster: ")

Key init args — indexing params:

embedding: Embeddings: Embedding function to use.

Key init args — client params:

cluster: Cluster: Couchbase cluster object with active connection.
bucket_name: str: Name of the bucket to store documents in.
scope_name: str: Name of the scope in the bucket to store documents in.
collection_name: str: Name of the collection in the scope to store documents in.
index_name: str: Name of the Search index to use.

Instantiate:

from datetime import timedelta
from langchain_openai import OpenAIEmbeddings
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from langchain_couchbase import CouchbaseSearchVectorStore

auth = PasswordAuthenticator(DB_USERNAME, DB_PASSWORD)
options = ClusterOptions(auth)
cluster = Cluster(COUCHBASE_CONNECTION_STRING, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

BUCKET_NAME = "langchain_bucket"
SCOPE_NAME = "_default"
COLLECTION_NAME = "_default"
SEARCH_INDEX_NAME = "langchain-test-index"

embeddings = OpenAIEmbeddings()

vector_store = CouchbaseSearchVectorStore(
    cluster=cluster,
    bucket_name=BUCKET_NAME,
    scope_name=SCOPE_NAME,
    collection_name=COLLECTION_NAME,
    embedding=embeddings,
    index_name=SEARCH_INDEX_NAME,
)

Add Documents:

from langchain_core.documents import Document

document_1 = Document(page_content="foo", metadata={"baz": "bar"})
document_2 = Document(page_content="thud", metadata={"bar": "baz"})
document_3 = Document(page_content="i will be deleted :(")

documents = [document_1, document_2, document_3]
ids = ["1", "2", "3"]
vector_store.add_documents(documents=documents, ids=ids)

Delete Documents:

vector_store.delete(ids=["3"])

Search:

results = vector_store.similarity_search(query="thud",k=1)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Search with filter:

from couchbase.search import MatchQuery

filter = MatchQuery("baz",field="metadata.bar")
results = vector_store.similarity_search(query="thud",k=1,filter=filter)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Hybrid Search:

results = vector_store.similarity_search(query="thud",k=1,search_options={"query": {"field":"metadata.bar", "match": "baz"}})
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Search with score:

results = vector_store.similarity_search_with_score(query="qux",k=1)
for doc, score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")

* [SIM=0.500762] foo [{'baz': 'bar'}]

Async:

# add documents
await vector_store.aadd_documents(documents=documents, ids=ids)

# delete documents
await vector_store.adelete(ids=["3"])

# search
results = vector_store.asimilarity_search(query="thud",k=1)

# search with score
results = await vector_store.asimilarity_search_with_score(query="qux",k=1)
for doc,score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")

* [SIM=0.500735] foo [{'baz': 'bar'}]

Use as Retriever:

retriever = vector_store.as_retriever(
    search_kwargs={"k": 1, "fetch_k": 2, "lambda_mult": 0.5},
)
retriever.invoke("thud")

[Document(id='2', metadata={'bar': 'baz'}, page_content='thud')]

classmethod from_texts(texts: List[str], embedding: Embeddings, metadatas: List[dict] | None = None, **kwargs: Any) → CouchbaseSearchVectorStore[source]

Construct a Couchbase Search Vector Store from a list of texts.

Example:

from langchain_couchbase import CouchbaseSearchVectorStore
from langchain_openai import OpenAIEmbeddings

from couchbase.cluster import Cluster
from couchbase.auth import PasswordAuthenticator
from couchbase.options import ClusterOptions
from datetime import timedelta

auth = PasswordAuthenticator(username, password)
options = ClusterOptions(auth)
connect_string = "couchbases://localhost"
cluster = Cluster(connect_string, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

embeddings = OpenAIEmbeddings()

texts = ["hello", "world"]

vectorstore = CouchbaseSearchVectorStore.from_texts(
    texts,
    embedding=embeddings,
    cluster=cluster,
    bucket_name="",
    scope_name="",
    collection_name="",
    index_name="vector-index",
)

Args:

texts (List[str]): list of texts to add to the vector store. embedding (Embeddings): embedding function to use. metadatas (optional[List[Dict]): list of metadatas to add to documents. **kwargs: Keyword arguments used to initialize the vector store

Returns:

A Couchbase Searchvector store.

similarity_search(query: str, k: int = 4, search_options: Dict[str, Any] | None = {}, filter: SearchQuery | None = None, **kwargs: Any) → List[Document][source]

Return documents most similar to embedding vector with their scores.

Args:

query (str): Query to look up for similar documents k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional hybrid search options

that are passed to Couchbase search service. Used for combining vector similarity with text-based search criteria.

Defaults to empty dictionary.

Examples:

{"query": {"field": "metadata.category", "match": "action"}}

{"query": {"field": "metadata.year", "min": 2020, "max": 2023}}

filter (Optional[SearchQuery]): Optional filter to apply before

vector search execution. It reduces the search space.

Defaults to None.

Examples:

NumericRangeQuery(field="metadata.year", min=2020, max=2023)

TermQuery("search_term",field="metadata.category")

ConjunctionQuery(query1, query2)

fields (Optional[List[str]]): Optional list of fields to include in the

metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to all the fields stored in the index.

Returns:

List of Documents most similar to the query.

Note:

Use search_options for hybrid search combining vector similarity with other supported search queries
Use filter for efficient pre-search filtering, especially with large datasets
Both parameters can be used together for complex search scenarios

similarity_search_by_vector(embedding: List[float], k: int = 4, search_options: Dict[str, Any] | None = {}, filter: SearchQuery | None = None, **kwargs: Any) → List[Document][source]

Return documents that are most similar to the vector embedding.

Args:

embedding (List[float]): Embedding to look up documents similar to. k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional hybrid search options

that are passed to Couchbase search service. Used for combining vector similarity with text-based search criteria.

Defaults to empty dictionary.

Examples:

{"query": {"field": "metadata.category", "match": "action"}}

{"query": {"field": "metadata.year", "min": 2020, "max": 2023}}

filter (Optional[SearchQuery]): Optional filter to apply before

vector search execution. It reduces the search space.

Defaults to None.

Examples:

NumericRangeQuery(field="metadata.year", min=2020, max=2023)

TermQuery("search_term",field="metadata.category")

ConjunctionQuery(query1, query2)

fields (Optional[List[str]]): Optional list of fields to include in the

metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to document text and metadata fields.

Returns:

List of Documents most similar to the query.

Note:

Use search_options for hybrid search combining vector similarity with other supported search queries
Use filter for efficient pre-search filtering, especially with large datasets
Both parameters can be used together for complex search scenarios

similarity_search_with_score(query: str, k: int = 4, search_options: Dict[str, Any] | None = {}, filter: SearchQuery | None = None, **kwargs: Any) → List[Tuple[Document, float]][source]

Return documents that are most similar to the query with their scores.

Args:

query (str): Query to look up for similar documents k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional hybrid search options

that are passed to Couchbase search service. Used for combining vector similarity with text-based search criteria.

Defaults to empty dictionary.

Examples:

{"query": {"field": "metadata.category", "match": "action"}}

{"query": {"field": "metadata.year", "min": 2020, "max": 2023}}

filter (Optional[SearchQuery]): Optional filter to apply before

vector search execution. It reduces the search space.

Defaults to None.

Examples:

NumericRangeQuery(field="metadata.year", min=2020, max=2023)

TermQuery("search_term",field="metadata.category")

ConjunctionQuery(query1, query2)

fields (Optional[List[str]]): Optional list of fields to include in the

metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to text and metadata fields.

Returns:

List of (Document, score) that are most similar to the query.

Note:

Use search_options for hybrid search combining vector similarity with other supported search queries
Use filter for efficient pre-search filtering, especially with large datasets
Both parameters can be used together for complex search scenarios

similarity_search_with_score_by_vector(embedding: List[float], k: int = 4, search_options: Dict[str, Any] | None = {}, filter: SearchQuery | None = None, **kwargs: Any) → List[Tuple[Document, float]][source]

Return docs most similar to embedding vector with their scores.

Args:

embedding (List[float]): Embedding vector to look up documents similar to. k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional hybrid search options

that are passed to Couchbase search service. Used for combining vector similarity with text-based search criteria.

Defaults to empty dictionary.

Examples:

{"query": {"field": "metadata.category", "match": "action"}}

{"query": {"field": "metadata.year", "min": 2020, "max": 2023}}

filter (Optional[SearchQuery]): Optional filter to apply before

vector search execution. It reduces the search space.

Defaults to None.

Examples:

NumericRangeQuery(field="metadata.year", min=2020, max=2023)

TermQuery("search_term",field="metadata.category")

ConjunctionQuery(query1, query2)

fields (Optional[List[str]]): Optional list of fields to include in the

metadata of results. Note that these need to be stored in the index.: If nothing is specified, defaults to all the fields stored in the index.

Returns:

List of (Document, score) that are the most similar to the query vector.

Note:

Use search_options for hybrid search combining vector similarity with other supported search queries
Use filter for efficient pre-search filtering, especially with large datasets
Both parameters can be used together for complex search scenarios

Couchbase Vector Store

Warning

This class is deprecated since version 0.3.0 and will be removed in version 1.0.0. Use CouchbaseSearchVectorStore instead.

Couchbase vector stores.

class langchain_couchbase.vectorstores.vectorstores.CouchbaseVectorStore(cluster: Cluster, bucket_name: str, scope_name: str, collection_name: str, embedding: Embeddings, index_name: str, *, text_key: str | None = 'text', embedding_key: str | None = 'embedding', scoped_index: bool = True)[source]

Bases: VectorStore

Deprecated since version 0.3.0: Use :class:`~langchain_couchbase.vectorstores.CouchbaseSearchVectorStore` instead. It will not be removed until langchain-couchbase==1.0.0.

__Couchbase__ vector store integration.

Deprecated since version 0.1.0: This class is deprecated and will be removed in version 1.0.0. Use CouchbaseSearchVectorStore instead.

Setup:

Install langchain-couchbase and head over to the Couchbase [website](https://cloud.couchbase.com) and create a new connection, with a bucket, collection, and search index.

pip install -U langchain-couchbase

import getpass

COUCHBASE_CONNECTION_STRING = getpass.getpass("Enter the connection string for the Couchbase cluster: ")
DB_USERNAME = getpass.getpass("Enter the username for the Couchbase cluster: ")
DB_PASSWORD = getpass.getpass("Enter the password for the Couchbase cluster: ")

Key init args — indexing params:

embedding: Embeddings: Embedding function to use.

Key init args — client params:

cluster: Cluster: Couchbase cluster object with active connection.
bucket_name: str: Name of the bucket to store documents in.
scope_name: str: Name of the scope in the bucket to store documents in.
collection_name: str: Name of the collection in the scope to store documents in.
index_name: str: Name of the Search index to use.

Instantiate:

from datetime import timedelta
from langchain_openai import OpenAIEmbeddings
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from langchain_couchbase import CouchbaseVectorStore

auth = PasswordAuthenticator(DB_USERNAME, DB_PASSWORD)
options = ClusterOptions(auth)
cluster = Cluster(COUCHBASE_CONNECTION_STRING, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

BUCKET_NAME = "langchain_bucket"
SCOPE_NAME = "_default"
COLLECTION_NAME = "_default"
SEARCH_INDEX_NAME = "langchain-test-index"

embeddings = OpenAIEmbeddings()

vector_store = CouchbaseVectorStore(
    cluster=cluster,
    bucket_name=BUCKET_NAME,
    scope_name=SCOPE_NAME,
    collection_name=COLLECTION_NAME,
    embedding=embeddings,
    index_name=SEARCH_INDEX_NAME,
)

Add Documents:

from langchain_core.documents import Document

document_1 = Document(page_content="foo", metadata={"baz": "bar"})
document_2 = Document(page_content="thud", metadata={"bar": "baz"})
document_3 = Document(page_content="i will be deleted :(")

documents = [document_1, document_2, document_3]
ids = ["1", "2", "3"]
vector_store.add_documents(documents=documents, ids=ids)

Delete Documents:

vector_store.delete(ids=["3"])

Search:

results = vector_store.similarity_search(query="thud",k=1)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Search with filter:

results = vector_store.similarity_search(query="thud",k=1,search_options={"query": {"field":"metadata.bar", "match": "baz"}})
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Search with score:

results = vector_store.similarity_search_with_score(query="qux",k=1)
for doc, score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")

* [SIM=0.500778] foo [{'baz': 'bar'}]

Async:

# add documents
await vector_store.aadd_documents(documents=documents, ids=ids)

# delete documents
await vector_store.adelete(ids=["3"])

# search
results = vector_store.asimilarity_search(query="thud",k=1)

# search with score
results = await vector_store.asimilarity_search_with_score(query="qux",k=1)
for doc,score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")

* [SIM=0.500762] foo [{'baz': 'bar'}]

Use as Retriever:

retriever = vector_store.as_retriever(
    search_kwargs={"k": 1, "fetch_k": 2, "lambda_mult": 0.5},
)
retriever.invoke("thud")

[Document(id='2', metadata={'bar': 'baz'}, page_content='thud')]

DEFAULT_BATCH_SIZE = 100

add_texts(texts: Iterable[str], metadatas: List[dict] | None = None, ids: List[str] | None = None, batch_size: int | None = None, **kwargs: Any) → List[str][source]

Run texts through the embeddings and persist in vectorstore.

If the document IDs are passed, the existing documents (if any) will be overwritten with the new ones.

Args:

texts (Iterable[str]): Iterable of strings to add to the vectorstore. metadatas (Optional[List[Dict]]): Optional list of metadatas associated

ids (Optional[List[str]]): Optional list of ids associated with the texts.: IDs have to be unique strings across the collection. If it is not specified uuids are generated and used as ids.
batch_size (Optional[int]): Optional batch size for bulk insertions.: Default is 100.

Returns:

List[str]:List of ids from adding the texts into the vectorstore.

delete(ids: List[str] | None = None, **kwargs: Any) → bool | None[source]

Delete documents from the vector store by ids.

Args:: ids (List[str]): List of IDs of the documents to delete. batch_size (Optional[int]): Optional batch size for bulk deletions.
Returns:: bool: True if all the documents were deleted successfully, False otherwise.

property embeddings: Embeddings: Return the query embedding object.

classmethod from_texts(texts: List[str], embedding: Embeddings, metadatas: List[dict] | None = None, **kwargs: Any) → CouchbaseVectorStore[source]

Construct a Couchbase vector store from a list of texts.

Example:

from langchain_couchbase import CouchbaseVectorStore
from langchain_openai import OpenAIEmbeddings

from couchbase.cluster import Cluster
from couchbase.auth import PasswordAuthenticator
from couchbase.options import ClusterOptions
from datetime import timedelta

auth = PasswordAuthenticator(username, password)
options = ClusterOptions(auth)
connect_string = "couchbases://localhost"
cluster = Cluster(connect_string, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

embeddings = OpenAIEmbeddings()

texts = ["hello", "world"]

vectorstore = CouchbaseVectorStore.from_texts(
    texts,
    embedding=embeddings,
    cluster=cluster,
    bucket_name="",
    scope_name="",
    collection_name="",
    index_name="vector-index",
)

Args:

texts (List[str]): list of texts to add to the vector store. embedding (Embeddings): embedding function to use. metadatas (optional[List[Dict]): list of metadatas to add to documents. **kwargs: Keyword arguments used to initialize the vector store with and/or

Returns:

A Couchbase vector store.

similarity_search(query: str, k: int = 4, search_options: Dict[str, Any] | None = {}, **kwargs: Any) → List[Document][source]

Return documents most similar to embedding vector with their scores.

Args:

query (str): Query to look up for similar documents k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional search options that are: passed to Couchbase search. Defaults to empty dictionary
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to all the fields stored in the index.

Returns:

List of Documents most similar to the query.

similarity_search_by_vector(embedding: List[float], k: int = 4, search_options: Dict[str, Any] | None = {}, **kwargs: Any) → List[Document][source]

Return documents that are most similar to the vector embedding.

Args:

embedding (List[float]): Embedding to look up documents similar to. k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional search options that are: passed to Couchbase search. Defaults to empty dictionary.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to document text and metadata fields.

Returns:

List of Documents most similar to the query.

similarity_search_with_score(query: str, k: int = 4, search_options: Dict[str, Any] | None = {}, **kwargs: Any) → List[Tuple[Document, float]][source]

Return documents that are most similar to the query with their scores.

Args:

query (str): Query to look up for similar documents k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional search options that are: passed to Couchbase search. Defaults to empty dictionary.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to text and metadata fields.

Returns:

List of (Document, score) that are most similar to the query.

similarity_search_with_score_by_vector(embedding: List[float], k: int = 4, search_options: Dict[str, Any] | None = {}, **kwargs: Any) → List[Tuple[Document, float]][source]

Return docs most similar to embedding vector with their scores.

Args:

embedding (List[float]): Embedding vector to look up documents similar to. k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional search options that are: passed to Couchbase search. Defaults to empty dictionary.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to all the fields stored in the index.

Returns:

List of (Document, score) that are the most similar to the query vector.

Base Vector Store

Base vector store for Couchbase.

class langchain_couchbase.vectorstores.base_vector_store.BaseCouchbaseVectorStore(cluster: Cluster, bucket_name: str, scope_name: str, collection_name: str, embedding: Embeddings, *, text_key: str | None = 'text', embedding_key: str | None = 'embedding')[source]

Bases: VectorStore

Base vector store for Couchbase. This class handles the data input and output for the vector store. This class is meant to be used as a base class for other vector stores.

DEFAULT_BATCH_SIZE = 100

add_texts(texts: Iterable[str], metadatas: List[dict] | None = None, ids: List[str] | None = None, batch_size: int | None = None, **kwargs: Any) → List[str][source]

Run texts through the embeddings and persist in vectorstore.

If the document IDs are passed, the existing documents (if any) will be overwritten with the new ones.

Args:

texts (Iterable[str]): Iterable of strings to add to the vectorstore. metadatas (Optional[List[Dict]]): Optional list of metadatas associated

ids (Optional[List[str]]): Optional list of ids associated with the texts.: IDs have to be unique strings across the collection. If it is not specified uuids are generated and used as ids.
batch_size (Optional[int]): Optional batch size for bulk insertions.: Default is 100.

Returns:

List[str]:List of ids from adding the texts into the vectorstore.

delete(ids: List[str] | None = None, **kwargs: Any) → bool | None[source]

Delete documents from the vector store by ids.

Args:: ids (List[str]): List of IDs of the documents to delete. batch_size (Optional[int]): Optional batch size for bulk deletions.
Returns:: bool: True if all the documents were deleted successfully, False otherwise.

property embeddings: Embeddings: Return the query embedding object.

Caching

LangChain Couchbase Caches

Functions “_hash”, “_loads_generations” and “_dumps_generations” are copied from the LangChain community module:

class langchain_couchbase.cache.CouchbaseCache(cluster: Cluster, bucket_name: str, scope_name: str, collection_name: str, ttl: timedelta | None = None, **kwargs: Dict[str, Any])[source]

Bases: BaseCache

Couchbase LLM Cache LLM Cache that uses Couchbase as the backend

LLM = 'llm'

PROMPT = 'prompt'

RETURN_VAL = 'return_val'

clear(**kwargs: Any) → None[source]: Clear the cache. This will delete all documents in the collection. This requires an index on the collection.

lookup(prompt: str, llm_string: str) → Sequence[Generation] | None[source]: Look up from cache based on prompt and llm_string.

update(prompt: str, llm_string: str, return_val: Sequence[Generation]) → None[source]: Update cache based on prompt and llm_string.

class langchain_couchbase.cache.CouchbaseSemanticCache(cluster: Cluster, embedding: Embeddings, bucket_name: str, scope_name: str, collection_name: str, index_name: str, score_threshold: float | None = None, ttl: timedelta | None = None)[source]

Bases: BaseCache, CouchbaseSearchVectorStore

Couchbase Semantic Cache Cache backed by a Couchbase Server with Vector Store support

LLM = 'llm_string'

RETURN_VAL = 'return_val'

clear(**kwargs: Any) → None[source]: Clear the cache. This will delete all documents in the collection. This requires an index on the collection.

lookup(prompt: str, llm_string: str) → Sequence[Generation] | None[source]: Look up from cache based on the semantic similarity of the prompt

update(prompt: str, llm_string: str, return_val: Sequence[Generation]) → None[source]: Update cache based on the prompt and llm_string

Chat Message History

Couchbase Chat Message History

Couchbase Chat Message History

class langchain_couchbase.chat_message_histories.CouchbaseChatMessageHistory(*, cluster: Cluster, bucket_name: str, scope_name: str, collection_name: str, session_id: str, session_id_key: str = 'session_id', message_key: str = 'message', create_index: bool = True, ttl: timedelta | None = None)[source]

Bases: BaseChatMessageHistory

Couchbase Chat Message History Chat message history that uses Couchbase as the storage

add_message(message: BaseMessage) → None[source]: Add a message to the cache

add_messages(messages: Sequence[BaseMessage]) → None[source]: Add messages to the cache in a batched manner

clear() → None[source]: Clear the cache

property messages: List[BaseMessage]: Get all messages in the cache associated with the session_id

Package Overview

class langchain_couchbase.CouchbaseCache(cluster: Cluster, bucket_name: str, scope_name: str, collection_name: str, ttl: timedelta | None = None, **kwargs: Dict[str, Any])[source]

Bases: BaseCache

Couchbase LLM Cache LLM Cache that uses Couchbase as the backend

LLM = 'llm'

PROMPT = 'prompt'

RETURN_VAL = 'return_val'

clear(**kwargs: Any) → None[source]: Clear the cache. This will delete all documents in the collection. This requires an index on the collection.

lookup(prompt: str, llm_string: str) → Sequence[Generation] | None[source]: Look up from cache based on prompt and llm_string.

update(prompt: str, llm_string: str, return_val: Sequence[Generation]) → None[source]: Update cache based on prompt and llm_string.

class langchain_couchbase.CouchbaseChatMessageHistory(*, cluster: Cluster, bucket_name: str, scope_name: str, collection_name: str, session_id: str, session_id_key: str = 'session_id', message_key: str = 'message', create_index: bool = True, ttl: timedelta | None = None)[source]

Bases: BaseChatMessageHistory

Couchbase Chat Message History Chat message history that uses Couchbase as the storage

add_message(message: BaseMessage) → None[source]: Add a message to the cache

add_messages(messages: Sequence[BaseMessage]) → None[source]: Add messages to the cache in a batched manner

clear() → None[source]: Clear the cache

property messages: List[BaseMessage]: Get all messages in the cache associated with the session_id

class langchain_couchbase.CouchbaseQueryVectorStore(cluster: Cluster, bucket_name: str, scope_name: str, collection_name: str, embedding: Embeddings, distance_metric: DistanceStrategy, *, text_key: str | None = 'text', embedding_key: str | None = 'embedding')[source]

Bases: BaseCouchbaseVectorStore

__Couchbase__ vector store integration using Query and Index service.

Setup:

Install langchain-couchbase and head over to Couchbase Capella and create a new cluster with a bucket and collection.

For more information on the indexes, see Hyperscale Vector Index documentation or Composite Vector Index documentation.

pip install -U langchain-couchbase

import getpass

COUCHBASE_CONNECTION_STRING = getpass.getpass("Enter the connection string for the Couchbase cluster: ")
DB_USERNAME = getpass.getpass("Enter the username for the Couchbase cluster: ")
DB_PASSWORD = getpass.getpass("Enter the password for the Couchbase cluster: ")

Key init args — indexing params:

embedding: Embeddings: Embedding function to use.

Key init args — client params:

cluster: Cluster: Couchbase cluster object with active connection.
bucket_name: str: Name of the bucket to store documents in.
scope_name: str: Name of the scope in the bucket to store documents in.
collection_name: str: Name of the collection in the scope to store documents in.
distance_metric: DistanceStrategy: Distance metric to use for the index. Options are: DOT, L2, EUCLIDEAN, COSINE, L2_SQUARED, EUCLIDEAN_SQUARED.

Instantiate:

from datetime import timedelta
from langchain_openai import OpenAIEmbeddings
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from langchain_couchbase import CouchbaseQueryVectorStore
from langchain_couchbase.vectorstores import DistanceStrategy

auth = PasswordAuthenticator(DB_USERNAME, DB_PASSWORD)
options = ClusterOptions(auth)
cluster = Cluster(COUCHBASE_CONNECTION_STRING, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

BUCKET_NAME = "langchain_bucket"
SCOPE_NAME = "_default"
COLLECTION_NAME = "_default"

embeddings = OpenAIEmbeddings()

vector_store = CouchbaseQueryVectorStore(
    cluster=cluster,
    bucket_name=BUCKET_NAME,
    scope_name=SCOPE_NAME,
    collection_name=COLLECTION_NAME,
    embedding=embeddings,
    distance_metric=DistanceStrategy.DOT,
)

Add Documents:

from langchain_core.documents import Document

document_1 = Document(page_content="foo", metadata={"baz": "bar"})
document_2 = Document(page_content="thud", metadata={"bar": "baz"})
document_3 = Document(page_content="i will be deleted :(")

documents = [document_1, document_2, document_3]
ids = ["1", "2", "3"]
vector_store.add_documents(documents=documents, ids=ids)

Delete Documents:

vector_store.delete(ids=["3"])

Search:

results = vector_store.similarity_search(query="thud",k=1)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Search with filter:

results = vector_store.similarity_search(query="thud",k=1, where_str="metadata.bar = 'baz'")
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Search with score:

results = vector_store.similarity_search_with_score(query="qux",k=1)
for doc, score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")

* [SIM=-0.832155] foo [{'baz': 'bar'}]

Async:

# add documents
await vector_store.aadd_documents(documents=documents, ids=ids)

# delete documents
await vector_store.adelete(ids=["3"])

# search
results = vector_store.asimilarity_search(query="thud",k=1)

# search with score
results = await vector_store.asimilarity_search_with_score(query="qux",k=1)
for doc,score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")

* [SIM=-0.832155] foo [{'baz': 'bar'}]

Use as Retriever:

retriever = vector_store.as_retriever(
    search_kwargs={"k": 1, "fetch_k": 2, "lambda_mult": 0.5},
)
retriever.invoke("thud")

[Document(id='2', metadata={'bar': 'baz'}, page_content='thud')]

Create a new index for the Query vector store.

Args:

index_type (IndexType): Type of the index (BHIVE or COMPOSITE) to create. index_description (str): Description of the index like “IVF,SQ8”. distance_metric (Optional[DistanceStrategy]): Distance metric to use for the

index_name (str): Name of the index to create.: Defaults to “langchain_{index_type}_query_index”.
vector_field (str): Name of the vector field to use for the index.: Defaults to the embedding key in the constructor.
vector_dimension (Optional[int]): Dimension of the vector field.: If not provided, it will be determined from the embedding object.
fields (List[str]): List of fields to include in the index.: Defaults to the text field in the constructor.
where_clause (Optional[str]): Optional where clause to filter the documents to index.: Defaults to None.
index_scan_nprobes (Optional[int]): Number of probes to use for the index.: Defaults to None.
index_trainlist (Optional[int]): Number of training samples to use for the index.: Defaults to None.

classmethod from_texts(texts: List[str], embedding: Embeddings, metadatas: List[dict] | None = None, **kwargs: Any) → CouchbaseQueryVectorStore[source]

Construct a Couchbase Query Vector Store from a list of texts.

Example:

from langchain_couchbase import CouchbaseQueryVectorStore
from langchain_couchbase.vectorstores import DistanceStrategy
from langchain_openai import OpenAIEmbeddings

from couchbase.cluster import Cluster
from couchbase.auth import PasswordAuthenticator
from couchbase.options import ClusterOptions
from datetime import timedelta

auth = PasswordAuthenticator(username, password)
options = ClusterOptions(auth)
connect_string = "couchbases://localhost"
cluster = Cluster(connect_string, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

embeddings = OpenAIEmbeddings()

texts = ["hello", "world"]

vectorstore = CouchbaseQueryVectorStore.from_texts(
    texts,
    embedding=embeddings,
    cluster=cluster,
    bucket_name="BUCKET_NAME",
    scope_name="SCOPE_NAME",
    collection_name="COLLECTION_NAME",
    distance_metric=DistanceStrategy.COSINE,
)

Args:

texts (List[str]): list of texts to add to the vector store. embedding (Embeddings): embedding function to use. metadatas (optional[List[Dict]): list of metadatas to add to documents. **kwargs: Keyword arguments used to initialize the vector store with and/or

Returns:

A Couchbase Query Vector Store.

similarity_search(query: str, k: int = 4, where_str: str | None = None, **kwargs: Any) → List[Document][source]

Return documents most similar to the query.

Args:

query (str): Query to look up for similar documents k (int): Number of Documents to return.

where_str (Optional[str]): Optional where clause to filter the documents.: Defaults to None.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to all the fields stored in the index.

Returns:

List of Documents most similar to the query.

similarity_search_by_vector(embedding: List[float], k: int = 4, where_str: str | None = None, **kwargs: Any) → List[Document][source]

Return documents that are most similar to the vector embedding.

Args:

embedding (List[float]): Embedding to look up documents similar to. k (int): Number of Documents to return.

where_str (Optional[str]): Optional where clause to filter the documents.: Defaults to None.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to document text and metadata fields.

Returns:

List of Documents most similar to the query.

similarity_search_with_score(query: str, k: int = 4, where_str: str | None = None, **kwargs: Any) → List[Tuple[Document, float]][source]

Return documents that are most similar to the query with their distances. Lower distances are more similar.

Args:

query (str): Query to look up for similar documents k (int): Number of Documents to return.

where_str (Optional[str]): Optional where clause to filter the documents.: Defaults to None.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to text and metadata fields.

Returns:

List of (Document, distance) that are most similar to the query. Lower distances are more similar.

similarity_search_with_score_by_vector(embedding: List[float], k: int = 4, where_str: str | None = None, **kwargs: Any) → List[Tuple[Document, float]][source]

Return docs most similar to embedding vector with their distances. Lower distances are more similar.

Args:

embedding (List[float]): Embedding vector to look up documents similar to. k (int): Number of Documents to return.

where_str (Optional[str]): Optional where clause to filter the documents.: Defaults to None.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to all the fields stored in the index.

Returns:

List of (Document, distance) that are the most similar to the query vector. Lower distances are more similar.

class langchain_couchbase.CouchbaseSearchVectorStore(cluster: Cluster, bucket_name: str, scope_name: str, collection_name: str, embedding: Embeddings, index_name: str, *, text_key: str | None = 'text', embedding_key: str | None = 'embedding', scoped_index: bool = True)[source]

Bases: BaseCouchbaseVectorStore

__Couchbase__ vector store integration using Search/FTS service.

Setup:

Install langchain-couchbase and head over to Couchbase Capella and create a new cluster with a bucket, collection and a search index.

For more information on Search service, see the Couchbase Search Service documentation.

pip install -U langchain-couchbase

import getpass

COUCHBASE_CONNECTION_STRING = getpass.getpass("Enter the connection string for the Couchbase cluster: ")
DB_USERNAME = getpass.getpass("Enter the username for the Couchbase cluster: ")
DB_PASSWORD = getpass.getpass("Enter the password for the Couchbase cluster: ")

Key init args — indexing params:

embedding: Embeddings: Embedding function to use.

Key init args — client params:

cluster: Cluster: Couchbase cluster object with active connection.
bucket_name: str: Name of the bucket to store documents in.
scope_name: str: Name of the scope in the bucket to store documents in.
collection_name: str: Name of the collection in the scope to store documents in.
index_name: str: Name of the Search index to use.

Instantiate:

from datetime import timedelta
from langchain_openai import OpenAIEmbeddings
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from langchain_couchbase import CouchbaseSearchVectorStore

auth = PasswordAuthenticator(DB_USERNAME, DB_PASSWORD)
options = ClusterOptions(auth)
cluster = Cluster(COUCHBASE_CONNECTION_STRING, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

BUCKET_NAME = "langchain_bucket"
SCOPE_NAME = "_default"
COLLECTION_NAME = "_default"
SEARCH_INDEX_NAME = "langchain-test-index"

embeddings = OpenAIEmbeddings()

vector_store = CouchbaseSearchVectorStore(
    cluster=cluster,
    bucket_name=BUCKET_NAME,
    scope_name=SCOPE_NAME,
    collection_name=COLLECTION_NAME,
    embedding=embeddings,
    index_name=SEARCH_INDEX_NAME,
)

Add Documents:

from langchain_core.documents import Document

document_1 = Document(page_content="foo", metadata={"baz": "bar"})
document_2 = Document(page_content="thud", metadata={"bar": "baz"})
document_3 = Document(page_content="i will be deleted :(")

documents = [document_1, document_2, document_3]
ids = ["1", "2", "3"]
vector_store.add_documents(documents=documents, ids=ids)

Delete Documents:

vector_store.delete(ids=["3"])

Search:

results = vector_store.similarity_search(query="thud",k=1)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Search with filter:

from couchbase.search import MatchQuery

filter = MatchQuery("baz",field="metadata.bar")
results = vector_store.similarity_search(query="thud",k=1,filter=filter)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Hybrid Search:

results = vector_store.similarity_search(query="thud",k=1,search_options={"query": {"field":"metadata.bar", "match": "baz"}})
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Search with score:

results = vector_store.similarity_search_with_score(query="qux",k=1)
for doc, score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")

* [SIM=0.500762] foo [{'baz': 'bar'}]

Async:

# add documents
await vector_store.aadd_documents(documents=documents, ids=ids)

# delete documents
await vector_store.adelete(ids=["3"])

# search
results = vector_store.asimilarity_search(query="thud",k=1)

# search with score
results = await vector_store.asimilarity_search_with_score(query="qux",k=1)
for doc,score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")

* [SIM=0.500735] foo [{'baz': 'bar'}]

Use as Retriever:

retriever = vector_store.as_retriever(
    search_kwargs={"k": 1, "fetch_k": 2, "lambda_mult": 0.5},
)
retriever.invoke("thud")

[Document(id='2', metadata={'bar': 'baz'}, page_content='thud')]

classmethod from_texts(texts: List[str], embedding: Embeddings, metadatas: List[dict] | None = None, **kwargs: Any) → CouchbaseSearchVectorStore[source]

Construct a Couchbase Search Vector Store from a list of texts.

Example:

from langchain_couchbase import CouchbaseSearchVectorStore
from langchain_openai import OpenAIEmbeddings

from couchbase.cluster import Cluster
from couchbase.auth import PasswordAuthenticator
from couchbase.options import ClusterOptions
from datetime import timedelta

auth = PasswordAuthenticator(username, password)
options = ClusterOptions(auth)
connect_string = "couchbases://localhost"
cluster = Cluster(connect_string, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

embeddings = OpenAIEmbeddings()

texts = ["hello", "world"]

vectorstore = CouchbaseSearchVectorStore.from_texts(
    texts,
    embedding=embeddings,
    cluster=cluster,
    bucket_name="",
    scope_name="",
    collection_name="",
    index_name="vector-index",
)

Args:

texts (List[str]): list of texts to add to the vector store. embedding (Embeddings): embedding function to use. metadatas (optional[List[Dict]): list of metadatas to add to documents. **kwargs: Keyword arguments used to initialize the vector store

Returns:

A Couchbase Searchvector store.

similarity_search(query: str, k: int = 4, search_options: Dict[str, Any] | None = {}, filter: SearchQuery | None = None, **kwargs: Any) → List[Document][source]

Return documents most similar to embedding vector with their scores.

Args:

query (str): Query to look up for similar documents k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional hybrid search options

that are passed to Couchbase search service. Used for combining vector similarity with text-based search criteria.

Defaults to empty dictionary.

Examples:

{"query": {"field": "metadata.category", "match": "action"}}

{"query": {"field": "metadata.year", "min": 2020, "max": 2023}}

filter (Optional[SearchQuery]): Optional filter to apply before

vector search execution. It reduces the search space.

Defaults to None.

Examples:

NumericRangeQuery(field="metadata.year", min=2020, max=2023)

TermQuery("search_term",field="metadata.category")

ConjunctionQuery(query1, query2)

fields (Optional[List[str]]): Optional list of fields to include in the

metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to all the fields stored in the index.

Returns:

List of Documents most similar to the query.

Note:

Use search_options for hybrid search combining vector similarity with other supported search queries
Use filter for efficient pre-search filtering, especially with large datasets
Both parameters can be used together for complex search scenarios

similarity_search_by_vector(embedding: List[float], k: int = 4, search_options: Dict[str, Any] | None = {}, filter: SearchQuery | None = None, **kwargs: Any) → List[Document][source]

Return documents that are most similar to the vector embedding.

Args:

embedding (List[float]): Embedding to look up documents similar to. k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional hybrid search options

that are passed to Couchbase search service. Used for combining vector similarity with text-based search criteria.

Defaults to empty dictionary.

Examples:

{"query": {"field": "metadata.category", "match": "action"}}

{"query": {"field": "metadata.year", "min": 2020, "max": 2023}}

filter (Optional[SearchQuery]): Optional filter to apply before

vector search execution. It reduces the search space.

Defaults to None.

Examples:

NumericRangeQuery(field="metadata.year", min=2020, max=2023)

TermQuery("search_term",field="metadata.category")

ConjunctionQuery(query1, query2)

fields (Optional[List[str]]): Optional list of fields to include in the

metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to document text and metadata fields.

Returns:

List of Documents most similar to the query.

Note:

Use search_options for hybrid search combining vector similarity with other supported search queries
Use filter for efficient pre-search filtering, especially with large datasets
Both parameters can be used together for complex search scenarios

similarity_search_with_score(query: str, k: int = 4, search_options: Dict[str, Any] | None = {}, filter: SearchQuery | None = None, **kwargs: Any) → List[Tuple[Document, float]][source]

Return documents that are most similar to the query with their scores.

Args:

query (str): Query to look up for similar documents k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional hybrid search options

that are passed to Couchbase search service. Used for combining vector similarity with text-based search criteria.

Defaults to empty dictionary.

Examples:

{"query": {"field": "metadata.category", "match": "action"}}

{"query": {"field": "metadata.year", "min": 2020, "max": 2023}}

filter (Optional[SearchQuery]): Optional filter to apply before

vector search execution. It reduces the search space.

Defaults to None.

Examples:

NumericRangeQuery(field="metadata.year", min=2020, max=2023)

TermQuery("search_term",field="metadata.category")

ConjunctionQuery(query1, query2)

fields (Optional[List[str]]): Optional list of fields to include in the

metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to text and metadata fields.

Returns:

List of (Document, score) that are most similar to the query.

Note:

Use search_options for hybrid search combining vector similarity with other supported search queries
Use filter for efficient pre-search filtering, especially with large datasets
Both parameters can be used together for complex search scenarios

similarity_search_with_score_by_vector(embedding: List[float], k: int = 4, search_options: Dict[str, Any] | None = {}, filter: SearchQuery | None = None, **kwargs: Any) → List[Tuple[Document, float]][source]

Return docs most similar to embedding vector with their scores.

Args:

embedding (List[float]): Embedding vector to look up documents similar to. k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional hybrid search options

that are passed to Couchbase search service. Used for combining vector similarity with text-based search criteria.

Defaults to empty dictionary.

Examples:

{"query": {"field": "metadata.category", "match": "action"}}

{"query": {"field": "metadata.year", "min": 2020, "max": 2023}}

filter (Optional[SearchQuery]): Optional filter to apply before

vector search execution. It reduces the search space.

Defaults to None.

Examples:

NumericRangeQuery(field="metadata.year", min=2020, max=2023)

TermQuery("search_term",field="metadata.category")

ConjunctionQuery(query1, query2)

fields (Optional[List[str]]): Optional list of fields to include in the

metadata of results. Note that these need to be stored in the index.: If nothing is specified, defaults to all the fields stored in the index.

Returns:

List of (Document, score) that are the most similar to the query vector.

Note:

Use search_options for hybrid search combining vector similarity with other supported search queries
Use filter for efficient pre-search filtering, especially with large datasets
Both parameters can be used together for complex search scenarios

class langchain_couchbase.CouchbaseSemanticCache(cluster: Cluster, embedding: Embeddings, bucket_name: str, scope_name: str, collection_name: str, index_name: str, score_threshold: float | None = None, ttl: timedelta | None = None)[source]

Bases: BaseCache, CouchbaseSearchVectorStore

Couchbase Semantic Cache Cache backed by a Couchbase Server with Vector Store support

LLM = 'llm_string'

RETURN_VAL = 'return_val'

clear(**kwargs: Any) → None[source]: Clear the cache. This will delete all documents in the collection. This requires an index on the collection.

lookup(prompt: str, llm_string: str) → Sequence[Generation] | None[source]: Look up from cache based on the semantic similarity of the prompt

update(prompt: str, llm_string: str, return_val: Sequence[Generation]) → None[source]: Update cache based on the prompt and llm_string

class langchain_couchbase.CouchbaseVectorStore(cluster: Cluster, bucket_name: str, scope_name: str, collection_name: str, embedding: Embeddings, index_name: str, *, text_key: str | None = 'text', embedding_key: str | None = 'embedding', scoped_index: bool = True)[source]

Bases: VectorStore

Deprecated since version 0.3.0: Use :class:`~langchain_couchbase.vectorstores.CouchbaseSearchVectorStore` instead. It will not be removed until langchain-couchbase==1.0.0.

__Couchbase__ vector store integration.

Deprecated since version 0.1.0: This class is deprecated and will be removed in version 1.0.0. Use CouchbaseSearchVectorStore instead.

Setup:

Install langchain-couchbase and head over to the Couchbase [website](https://cloud.couchbase.com) and create a new connection, with a bucket, collection, and search index.

pip install -U langchain-couchbase

import getpass

COUCHBASE_CONNECTION_STRING = getpass.getpass("Enter the connection string for the Couchbase cluster: ")
DB_USERNAME = getpass.getpass("Enter the username for the Couchbase cluster: ")
DB_PASSWORD = getpass.getpass("Enter the password for the Couchbase cluster: ")

Key init args — indexing params:

embedding: Embeddings: Embedding function to use.

Key init args — client params:

cluster: Cluster: Couchbase cluster object with active connection.
bucket_name: str: Name of the bucket to store documents in.
scope_name: str: Name of the scope in the bucket to store documents in.
collection_name: str: Name of the collection in the scope to store documents in.
index_name: str: Name of the Search index to use.

Instantiate:

from datetime import timedelta
from langchain_openai import OpenAIEmbeddings
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from langchain_couchbase import CouchbaseVectorStore

auth = PasswordAuthenticator(DB_USERNAME, DB_PASSWORD)
options = ClusterOptions(auth)
cluster = Cluster(COUCHBASE_CONNECTION_STRING, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

BUCKET_NAME = "langchain_bucket"
SCOPE_NAME = "_default"
COLLECTION_NAME = "_default"
SEARCH_INDEX_NAME = "langchain-test-index"

embeddings = OpenAIEmbeddings()

vector_store = CouchbaseVectorStore(
    cluster=cluster,
    bucket_name=BUCKET_NAME,
    scope_name=SCOPE_NAME,
    collection_name=COLLECTION_NAME,
    embedding=embeddings,
    index_name=SEARCH_INDEX_NAME,
)

Add Documents:

from langchain_core.documents import Document

document_1 = Document(page_content="foo", metadata={"baz": "bar"})
document_2 = Document(page_content="thud", metadata={"bar": "baz"})
document_3 = Document(page_content="i will be deleted :(")

documents = [document_1, document_2, document_3]
ids = ["1", "2", "3"]
vector_store.add_documents(documents=documents, ids=ids)

Delete Documents:

vector_store.delete(ids=["3"])

Search:

results = vector_store.similarity_search(query="thud",k=1)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Search with filter:

results = vector_store.similarity_search(query="thud",k=1,search_options={"query": {"field":"metadata.bar", "match": "baz"}})
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* thud [{'bar': 'baz'}]

Search with score:

results = vector_store.similarity_search_with_score(query="qux",k=1)
for doc, score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")

* [SIM=0.500778] foo [{'baz': 'bar'}]

Async:

# add documents
await vector_store.aadd_documents(documents=documents, ids=ids)

# delete documents
await vector_store.adelete(ids=["3"])

# search
results = vector_store.asimilarity_search(query="thud",k=1)

# search with score
results = await vector_store.asimilarity_search_with_score(query="qux",k=1)
for doc,score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")

* [SIM=0.500762] foo [{'baz': 'bar'}]

Use as Retriever:

retriever = vector_store.as_retriever(
    search_kwargs={"k": 1, "fetch_k": 2, "lambda_mult": 0.5},
)
retriever.invoke("thud")

[Document(id='2', metadata={'bar': 'baz'}, page_content='thud')]

DEFAULT_BATCH_SIZE = 100

add_texts(texts: Iterable[str], metadatas: List[dict] | None = None, ids: List[str] | None = None, batch_size: int | None = None, **kwargs: Any) → List[str][source]

Run texts through the embeddings and persist in vectorstore.

If the document IDs are passed, the existing documents (if any) will be overwritten with the new ones.

Args:

texts (Iterable[str]): Iterable of strings to add to the vectorstore. metadatas (Optional[List[Dict]]): Optional list of metadatas associated

ids (Optional[List[str]]): Optional list of ids associated with the texts.: IDs have to be unique strings across the collection. If it is not specified uuids are generated and used as ids.
batch_size (Optional[int]): Optional batch size for bulk insertions.: Default is 100.

Returns:

List[str]:List of ids from adding the texts into the vectorstore.

delete(ids: List[str] | None = None, **kwargs: Any) → bool | None[source]

Delete documents from the vector store by ids.

Args:: ids (List[str]): List of IDs of the documents to delete. batch_size (Optional[int]): Optional batch size for bulk deletions.
Returns:: bool: True if all the documents were deleted successfully, False otherwise.

property embeddings: Embeddings: Return the query embedding object.

classmethod from_texts(texts: List[str], embedding: Embeddings, metadatas: List[dict] | None = None, **kwargs: Any) → CouchbaseVectorStore[source]

Construct a Couchbase vector store from a list of texts.

Example:

from langchain_couchbase import CouchbaseVectorStore
from langchain_openai import OpenAIEmbeddings

from couchbase.cluster import Cluster
from couchbase.auth import PasswordAuthenticator
from couchbase.options import ClusterOptions
from datetime import timedelta

auth = PasswordAuthenticator(username, password)
options = ClusterOptions(auth)
connect_string = "couchbases://localhost"
cluster = Cluster(connect_string, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

embeddings = OpenAIEmbeddings()

texts = ["hello", "world"]

vectorstore = CouchbaseVectorStore.from_texts(
    texts,
    embedding=embeddings,
    cluster=cluster,
    bucket_name="",
    scope_name="",
    collection_name="",
    index_name="vector-index",
)

Args:

texts (List[str]): list of texts to add to the vector store. embedding (Embeddings): embedding function to use. metadatas (optional[List[Dict]): list of metadatas to add to documents. **kwargs: Keyword arguments used to initialize the vector store with and/or

Returns:

A Couchbase vector store.

similarity_search(query: str, k: int = 4, search_options: Dict[str, Any] | None = {}, **kwargs: Any) → List[Document][source]

Return documents most similar to embedding vector with their scores.

Args:

query (str): Query to look up for similar documents k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional search options that are: passed to Couchbase search. Defaults to empty dictionary
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to all the fields stored in the index.

Returns:

List of Documents most similar to the query.

similarity_search_by_vector(embedding: List[float], k: int = 4, search_options: Dict[str, Any] | None = {}, **kwargs: Any) → List[Document][source]

Return documents that are most similar to the vector embedding.

Args:

embedding (List[float]): Embedding to look up documents similar to. k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional search options that are: passed to Couchbase search. Defaults to empty dictionary.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to document text and metadata fields.

Returns:

List of Documents most similar to the query.

similarity_search_with_score(query: str, k: int = 4, search_options: Dict[str, Any] | None = {}, **kwargs: Any) → List[Tuple[Document, float]][source]

Return documents that are most similar to the query with their scores.

Args:

query (str): Query to look up for similar documents k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional search options that are: passed to Couchbase search. Defaults to empty dictionary.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to text and metadata fields.

Returns:

List of (Document, score) that are most similar to the query.

similarity_search_with_score_by_vector(embedding: List[float], k: int = 4, search_options: Dict[str, Any] | None = {}, **kwargs: Any) → List[Tuple[Document, float]][source]

Return docs most similar to embedding vector with their scores.

Args:

embedding (List[float]): Embedding vector to look up documents similar to. k (int): Number of Documents to return.

search_options (Optional[Dict[str, Any]]): Optional search options that are: passed to Couchbase search. Defaults to empty dictionary.
fields (Optional[List[str]]): Optional list of fields to include in the: metadata of results. Note that these need to be stored in the index. If nothing is specified, defaults to all the fields stored in the index.

Returns:

List of (Document, score) that are the most similar to the query vector.