Overview

A vector store stores embedded data and performs similarity search over it.

Interface

LangChain provides a unified interface for vector stores, built around three core methods:
  • add_documents - Add documents to the store.
  • delete - Remove stored documents by ID.
  • similarity_search - Query for semantically similar documents.
This abstraction lets you switch between different implementations without altering your application logic.
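
To make the three methods concrete, here is a minimal, self-contained sketch of a store that exposes the same method names. `ToyVectorStore`, `toy_embed`, and `cosine` are hypothetical names introduced for illustration; this is not LangChain's implementation. It stores plain strings rather than Document objects, and the letter-frequency "embedding" only stands in for a real model.

```python
import math
from collections import Counter


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: a 26-dim letter-frequency vector."""
    counts = Counter(ch for ch in text.lower() if "a" <= ch <= "z")
    return [float(counts.get(chr(ord("a") + i), 0)) for i in range(26)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Illustrative only: keeps (vector, text) pairs in a dict keyed by ID."""

    def __init__(self, embed=toy_embed):
        self._embed = embed
        self._rows = {}  # id -> (vector, text)

    def add_documents(self, documents: list[str], ids: list[str]) -> list[str]:
        # Embed each text once at insert time and store it under its ID.
        for doc_id, text in zip(ids, documents):
            self._rows[doc_id] = (self._embed(text), text)
        return ids

    def delete(self, ids: list[str]) -> None:
        # Unknown IDs are ignored in this sketch.
        for doc_id in ids:
            self._rows.pop(doc_id, None)

    def similarity_search(self, query: str, k: int = 4) -> list[str]:
        # Brute force: score every stored vector against the query vector.
        q = self._embed(query)
        ranked = sorted(
            self._rows.values(), key=lambda row: cosine(q, row[0]), reverse=True
        )
        return [text for _, text in ranked[:k]]
```

Because application code only touches these three methods, replacing the toy class with a real backend would leave the calling code unchanged, which is the point of the abstraction.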

Initialization

To initialize a vector store, provide it with an embedding model:
from langchain_core.vectorstores import InMemoryVectorStore
vector_store = InMemoryVectorStore(embedding=SomeEmbeddingModel())
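
`SomeEmbeddingModel` above is a placeholder. LangChain embedding models are built around two methods, `embed_documents` and `embed_query`, and the sketch below shows that shape using only the standard library. `HashEmbedding` is a hypothetical class, not part of LangChain; its vectors carry no semantic meaning and would only be suitable for wiring tests, in the same spirit as a deterministic fake embedding.

```python
import hashlib


class HashEmbedding:
    """Hypothetical deterministic embedding: the same text always yields the same vector."""

    def __init__(self, size: int = 8):
        # A SHA-256 digest has 32 bytes, so this sketch supports at most 32 dimensions.
        if not 1 <= size <= 32:
            raise ValueError("size must be between 1 and 32")
        self.size = size

    def embed_query(self, text: str) -> list[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Map each of the first `size` bytes (0..255) to a float in [-1.0, 1.0].
        return [byte / 127.5 - 1.0 for byte in digest[: self.size]]

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        # Documents and queries share one embedding function in this sketch.
        return [self.embed_query(text) for text in texts]
```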

Adding documents

Add Document objects (holding page_content and optional metadata) like so:
vector_store.add_documents(documents=[doc1, doc2], ids=["id1", "id2"])

Deleting documents

Delete by specifying IDs:
vector_store.delete(ids=["id1"])

Similarity search

Issue a semantic query using similarity_search, which returns the closest embedded documents:
similar_docs = vector_store.similarity_search("your query here")
Many vector stores support parameters like:
  • k — number of results to return
  • filter — conditional filtering based on metadata
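
The `k` parameter caps how many of the best-scoring matches come back. A minimal sketch of that selection step, assuming similarity scores have already been computed and that a higher score means a closer match (the `top_k` helper and the example scores are hypothetical):

```python
import heapq


def top_k(scores: dict[str, float], k: int = 4) -> list[tuple[str, float]]:
    """Return the k highest-scoring (id, score) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Hypothetical similarity scores keyed by document ID.
scores = {"id1": 0.12, "id2": 0.91, "id3": 0.55}
best_two = top_k(scores, k=2)
```

If `k` exceeds the number of stored documents, this sketch simply returns everything it has.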

Similarity metrics & indexing

Embedding similarity may be computed using:
  • Cosine similarity
  • Euclidean distance
  • Dot product
Efficient search often employs indexing methods such as HNSW (Hierarchical Navigable Small World), though specifics depend on the vector store.
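
For reference, the three metrics listed above can be written in a few lines of plain Python. This is an illustrative sketch with hypothetical helper names, not the code any particular vector store runs. Note that cosine similarity and dot product are "higher is closer", while Euclidean distance is "lower is closer".

```python
import math


def dot_product(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Assumes both vectors are non-zero.
    return dot_product(a, b) / (
        math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    )


def euclidean_distance(a: list[float], b: list[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]

dot_ab = dot_product(a, b)          # 3*4 + 4*3
cos_ab = cosine_similarity(a, b)    # dot / (|a| * |b|), with |a| = |b| = 5
dist_ab = euclidean_distance(a, b)  # sqrt((3-4)^2 + (4-3)^2)
```

When vectors are normalized to unit length, cosine similarity equals the dot product, and squared Euclidean distance equals 2 - 2 * cosine, so all three metrics produce the same ranking.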

Metadata filtering

Filtering by metadata (e.g., source, date) can refine search results:
vector_store.similarity_search(
  "query",
  k=3,
  filter={"source": "tweets"}
)
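
Filter syntax and semantics are store-specific, so the dictionary form above may not be accepted everywhere. As a rough mental model, the sketch below assumes an exact-match filter that is applied before the top-k cut, so every returned result satisfies the filter. The `matches` and `filtered_search` helpers, the documents, and the precomputed scores are all hypothetical.

```python
def matches(metadata: dict, flt: dict) -> bool:
    """Exact-match filter: every key in `flt` must equal the stored metadata value."""
    return all(metadata.get(key) == value for key, value in flt.items())


def filtered_search(docs: list[dict], k: int, flt: dict) -> list[str]:
    # 1. Keep only documents whose metadata satisfies the filter.
    candidates = [d for d in docs if matches(d["metadata"], flt)]
    # 2. Rank the survivors by similarity score (higher is closer) and cut to k.
    candidates.sort(key=lambda d: d["score"], reverse=True)
    return [d["text"] for d in candidates[:k]]


# Hypothetical documents with similarity scores already computed for some query.
docs = [
    {"text": "Shipping v2 today", "metadata": {"source": "tweets"}, "score": 0.71},
    {"text": "Release notes for v2", "metadata": {"source": "blog"}, "score": 0.93},
    {"text": "Coffee first", "metadata": {"source": "tweets"}, "score": 0.40},
    {"text": "v2 is live", "metadata": {"source": "tweets"}, "score": 0.65},
]

results = filtered_search(docs, k=2, flt={"source": "tweets"})
```

Here the blog post has the highest score but is excluded because its source does not match, so the two best tweets are returned instead.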

Top integrations

Select embedding model:
pip install -qU langchain-openai
import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
  os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
pip install -qU langchain-openai
import getpass
import os

if not os.environ.get("AZURE_OPENAI_API_KEY"):
  os.environ["AZURE_OPENAI_API_KEY"] = getpass.getpass("Enter API key for Azure: ")

from langchain_openai import AzureOpenAIEmbeddings

embeddings = AzureOpenAIEmbeddings(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    azure_deployment=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"],
    openai_api_version=os.environ["AZURE_OPENAI_API_VERSION"],
)
pip install -qU langchain-google-genai
import getpass
import os

if not os.environ.get("GOOGLE_API_KEY"):
  os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter API key for Google Gemini: ")

from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/gemini-embedding-001")
pip install -qU langchain-google-vertexai
from langchain_google_vertexai import VertexAIEmbeddings

embeddings = VertexAIEmbeddings(model="text-embedding-005")
pip install -qU langchain-aws
from langchain_aws import BedrockEmbeddings

embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v2:0")
pip install -qU langchain-huggingface
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
pip install -qU langchain-ollama
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="llama3")
pip install -qU langchain-cohere
import getpass
import os

if not os.environ.get("COHERE_API_KEY"):
  os.environ["COHERE_API_KEY"] = getpass.getpass("Enter API key for Cohere: ")

from langchain_cohere import CohereEmbeddings

embeddings = CohereEmbeddings(model="embed-english-v3.0")
pip install -qU langchain-mistralai
import getpass
import os

if not os.environ.get("MISTRALAI_API_KEY"):
  os.environ["MISTRALAI_API_KEY"] = getpass.getpass("Enter API key for MistralAI: ")

from langchain_mistralai import MistralAIEmbeddings

embeddings = MistralAIEmbeddings(model="mistral-embed")
pip install -qU langchain-nomic
import getpass
import os

if not os.environ.get("NOMIC_API_KEY"):
  os.environ["NOMIC_API_KEY"] = getpass.getpass("Enter API key for Nomic: ")

from langchain_nomic import NomicEmbeddings

embeddings = NomicEmbeddings(model="nomic-embed-text-v1.5")
pip install -qU langchain-nvidia-ai-endpoints
import getpass
import os

if not os.environ.get("NVIDIA_API_KEY"):
  os.environ["NVIDIA_API_KEY"] = getpass.getpass("Enter API key for NVIDIA: ")

from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings

embeddings = NVIDIAEmbeddings(model="NV-Embed-QA")
pip install -qU langchain-voyageai
import getpass
import os

if not os.environ.get("VOYAGE_API_KEY"):
  os.environ["VOYAGE_API_KEY"] = getpass.getpass("Enter API key for Voyage AI: ")

from langchain_voyageai import VoyageAIEmbeddings

embeddings = VoyageAIEmbeddings(model="voyage-3")
pip install -qU langchain-ibm
import getpass
import os

if not os.environ.get("WATSONX_APIKEY"):
  os.environ["WATSONX_APIKEY"] = getpass.getpass("Enter API key for IBM watsonx: ")

from langchain_ibm import WatsonxEmbeddings

embeddings = WatsonxEmbeddings(
    model_id="ibm/slate-125m-english-rtrvr",
    url="https://us-south.ml.cloud.ibm.com",
    project_id="<WATSONX PROJECT_ID>",
)
pip install -qU langchain-core
from langchain_core.embeddings import DeterministicFakeEmbedding

embeddings = DeterministicFakeEmbedding(size=4096)
pip install -qU langchain-xai
import getpass
import os

if not os.environ.get("XAI_API_KEY"):
  os.environ["XAI_API_KEY"] = getpass.getpass("Enter API key for xAI: ")

from langchain.chat_models import init_chat_model

model = init_chat_model("grok-2", model_provider="xai")
pip install -qU langchain-perplexity
import getpass
import os

if not os.environ.get("PPLX_API_KEY"):
  os.environ["PPLX_API_KEY"] = getpass.getpass("Enter API key for Perplexity: ")

from langchain.chat_models import init_chat_model

model = init_chat_model("llama-3.1-sonar-small-128k-online", model_provider="perplexity")
pip install -qU langchain-deepseek
import getpass
import os

if not os.environ.get("DEEPSEEK_API_KEY"):
  os.environ["DEEPSEEK_API_KEY"] = getpass.getpass("Enter API key for DeepSeek: ")

from langchain.chat_models import init_chat_model

model = init_chat_model("deepseek-chat", model_provider="deepseek")
Select vector store:
pip install -qU langchain-core
from langchain_core.vectorstores import InMemoryVectorStore

vector_store = InMemoryVectorStore(embeddings)
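pip install -qU boto3 requests-aws4auth opensearch-py langchain-community
import boto3
from langchain_community.vectorstores import OpenSearchVectorSearch
from opensearchpy import RequestsHttpConnection
from requests_aws4auth import AWS4Auth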

service = "es"  # must set the service as 'es'
region = "us-east-2"
credentials = boto3.Session(
    aws_access_key_id="xxxxxx", aws_secret_access_key="xxxxx"
).get_credentials()
awsauth = AWS4Auth("xxxxx", "xxxxxx", region, service, session_token=credentials.token)

vector_store = OpenSearchVectorSearch.from_documents(
    docs,
    embeddings,
    opensearch_url="host url",
    http_auth=awsauth,
    timeout=300,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    index_name="test-index",
)
pip install -qU langchain-astradb
from langchain_astradb import AstraDBVectorStore

vector_store = AstraDBVectorStore(
    embedding=embeddings,
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    collection_name="astra_vector_langchain",
    token=ASTRA_DB_APPLICATION_TOKEN,
    namespace=ASTRA_DB_NAMESPACE,
)
pip install -qU langchain-azure-cosmosdb azure-cosmos
from langchain_azure_cosmosdb import AzureCosmosDBNoSqlVectorSearch

vector_store = AzureCosmosDBNoSqlVectorSearch.from_documents(
    documents=docs,
    embedding=embeddings,
    cosmos_client=cosmos_client,
    database_name=database_name,
    container_name=container_name,
    vector_embedding_policy=vector_embedding_policy,
    full_text_policy=full_text_policy,
    indexing_policy=indexing_policy,
    cosmos_container_properties=cosmos_container_properties,
    cosmos_database_properties={},
    full_text_search_enabled=True,
)
pip install -qU langchain-azure-ai pymongo
from langchain_azure_ai.vectorstores.azure_cosmos_db_mongo_vcore import (
    AzureCosmosDBMongoVCoreVectorSearch,
)

vector_store = AzureCosmosDBMongoVCoreVectorSearch.from_documents(
    docs,
    embeddings,
    collection=collection,
    index_name=INDEX_NAME,
)
pip install -qU langchain-chroma
from langchain_chroma import Chroma

vector_store = Chroma(
    collection_name="example_collection",
    embedding_function=embeddings,
    persist_directory="./chroma_langchain_db",  # Where to save data locally, remove if not necessary
)
pip install -qU langchain-cockroachdb
from langchain_cockroachdb import AsyncCockroachDBVectorStore, CockroachDBEngine

CONNECTION_STRING = "cockroachdb://user:pass@host:26257/db?sslmode=verify-full"

engine = CockroachDBEngine.from_connection_string(CONNECTION_STRING)
await engine.ainit_vectorstore_table(
    table_name="vectors",
    vector_dimension=1536,
)

vector_store = AsyncCockroachDBVectorStore(
    engine=engine,
    embeddings=embeddings,
    collection_name="vectors",
)
Install the package and start Elasticsearch locally using the start-local script:
pip install -qU langchain-elasticsearch
curl -fsSL https://elastic.co/start-local | sh
This creates an elastic-start-local folder. To start Elasticsearch:
cd elastic-start-local
./start.sh
Elasticsearch will be available at http://localhost:9200. The password for the elastic user and API key are stored in the .env file in the elastic-start-local folder.
from langchain_elasticsearch import ElasticsearchStore

vector_store = ElasticsearchStore(
    index_name="langchain-demo",
    embedding=embeddings,
    es_url="http://localhost:9200",
)
pip install -qU langchain-milvus
from langchain_milvus import Milvus

URI = "./milvus_example.db"

vector_store = Milvus(
    embedding_function=embeddings,
    connection_args={"uri": URI},
    index_params={"index_type": "FLAT", "metric_type": "L2"},
)
pip install -qU langchain-mongodb
from langchain_mongodb import MongoDBAtlasVectorSearch

vector_store = MongoDBAtlasVectorSearch(
    embedding=embeddings,
    collection=MONGODB_COLLECTION,
    index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
    relevance_score_fn="cosine",
)
pip install -qU langchain-postgres
from langchain_postgres import PGVector

vector_store = PGVector(
    embeddings=embeddings,
    collection_name="my_docs",
    connection="postgresql+psycopg://..."
)
pip install -qU langchain-postgres
from langchain_postgres import PGEngine, PGVectorStore

pg_engine = PGEngine.from_connection_string(
    url="postgresql+psycopg://..."
)

vector_store = PGVectorStore.create_sync(
    engine=pg_engine,
    table_name="test_table",
    embedding_service=embeddings,
)
pip install -qU langchain-pinecone
from langchain_pinecone import PineconeVectorStore
from pinecone import Pinecone

pc = Pinecone(api_key=...)
index = pc.Index(index_name)

vector_store = PineconeVectorStore(embedding=embeddings, index=index)
pip install -qU langchain-qdrant
from qdrant_client.models import Distance, VectorParams
from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

client = QdrantClient(":memory:")

vector_size = len(embeddings.embed_query("sample text"))

if not client.collection_exists("test"):
    client.create_collection(
        collection_name="test",
        vectors_config=VectorParams(size=vector_size, distance=Distance.COSINE)
    )
vector_store = QdrantVectorStore(
    client=client,
    collection_name="test",
    embedding=embeddings,
)
pip install -qU langchain-oracledb
import oracledb
from langchain_oracledb.vectorstores import OracleVS
from langchain_oracledb.vectorstores.oraclevs import create_index
from langchain_community.vectorstores.utils import DistanceStrategy

username = "<username>"
password = "<password>"
dsn = "<hostname>:<port>/<service_name>"

connection = oracledb.connect(user=username, password=password, dsn=dsn)

vector_store = OracleVS(
    client=connection,
    embedding_function=embeddings,
    table_name="VECTOR_SEARCH_DEMO",
    distance_strategy=DistanceStrategy.EUCLIDEAN_DISTANCE
)
pip install -qU langchain-turbopuffer
from langchain_turbopuffer import TurbopufferVectorStore
from turbopuffer import Turbopuffer

tpuf = Turbopuffer(region="gcp-us-central1")
ns = tpuf.namespace("langchain-test")

vector_store = TurbopufferVectorStore(embedding=embeddings, namespace=ns)
pip install -qU "langchain-aws[valkey]"
from langchain_aws.vectorstores import ValkeyVectorStore

vector_store = ValkeyVectorStore(
    embedding=embeddings,
    valkey_url="valkey://localhost:6379",
    index_name="my_index"
)
Vectorstore | Delete by ID | Filtering | Search by Vector | Search with score | Async | Passes Standard Tests | Multi Tenancy | IDs in add Documents
AstraDBVectorStore
AzureCosmosDBNoSqlVectorStore
AzureCosmosDBMongoVCoreVectorStore
Chroma
AsyncCockroachDBVectorStore
CouchbaseSearchVectorStore
DatabricksVectorSearch
ElasticsearchStore
InMemoryVectorStore
LambdaDB
Milvus
Moorcheh
MongoDBAtlasVectorSearch
openGauss
PGVector
PGVectorStore
PineconeVectorStore
QdrantVectorStore
Weaviate
SQLServer
TurbopufferVectorStore
ValkeyVectorStore
ZeusDB
Oracle AI Database

All vector stores

Activeloop Deep Lake

Alibaba Cloud MySQL

Astra DB Vector Store

Azure Cosmos DB Mongo vCore

Azure Cosmos DB No SQL

Azure Database for PostgreSQL - Flexible Server

Chroma

CockroachDB

Couchbase

Databricks

IBM Db2

Amazon Document DB

Elasticsearch

Gel

Google AlloyDB

Google BigQuery Vector Search

Google Cloud SQL for MySQL

Google Cloud SQL for PostgreSQL

Firestore

Google Memorystore for Redis

Google Spanner

Google Bigtable

Google Vertex AI Feature Store

Google Vertex AI Vector Search

Kinetica

LambdaDB

Lindorm

MariaDB

Amazon MemoryDB

Milvus

Moorcheh

MongoDB Atlas

Neo4j Vector Index

Oceanbase

openGauss

Oracle AI Database

PGVector

PGVectorStore

Pinecone

Pinecone (sparse)

Qdrant

SAP HANA Cloud Vector Engine

SingleStore

SQLServer

SurrealDB

Teradata VectorStore

turbopuffer

Valkey

VDMS

veDB for MySQL

Vectara

Volcengine RDS for MySQL

Weaviate

YDB

ZeusDB