diff --git a/docs/components/vectordbs/dbs/opensearch.mdx b/docs/components/vectordbs/dbs/opensearch.mdx
index dcad7d7413..da2335d9d1 100644
--- a/docs/components/vectordbs/dbs/opensearch.mdx
+++ b/docs/components/vectordbs/dbs/opensearch.mdx
@@ -56,6 +56,30 @@ config = {
 }
 ```
 
+### Configuration Options
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `collection_name` | string | required | Name of the OpenSearch index |
+| `host` | string | required | OpenSearch endpoint URL |
+| `port` | int | 9200 | Port number |
+| `http_auth` | object | None | Authentication credentials (e.g., AWSV4SignerAuth) |
+| `embedding_model_dims` | int | 1536 | Dimension of embedding vectors |
+| `use_ssl` | bool | False | Enable SSL/TLS connection |
+| `verify_certs` | bool | False | Verify SSL certificates |
+| `auto_refresh` | bool | False | Automatically refresh index after insert. OpenSearch refreshes every ~1 second by default, so this is rarely needed. |
+
+
+  The defaults above match a local OpenSearch instance. The AWS OpenSearch Serverless
+  example earlier on this page intentionally overrides them with `port=443`, `use_ssl=True`,
+  and `verify_certs=True`, which are required when connecting to a Serverless collection.
+
+
+
+  For **AWS OpenSearch Serverless**, keep `auto_refresh=False` (the default).
+  The `indices.refresh()` API is not supported on Serverless collections.
+
+
 ### Add Memories
 
 ```python
diff --git a/docs/integrations/hermes.mdx b/docs/integrations/hermes.mdx
index 51349a4773..2d91bac140 100644
--- a/docs/integrations/hermes.mdx
+++ b/docs/integrations/hermes.mdx
@@ -1,35 +1,42 @@
 ---
 title: Hermes Agent
-description: "Add long-term memory to Hermes agents using Mem0 as a pluggable memory provider with automatic background sync and zero-latency prefetch."
+description: "Add long-term memory to Hermes agents with Mem0, on managed Mem0 Cloud or fully self-hosted (OSS), with automatic background sync and zero-latency prefetch."
 ---
 
-Add long-term memory to [Hermes Agent](https://github.com/NousResearch/hermes-agent) — a self-improving AI agent CLI by Nous Research. Hermes has a pluggable memory system, and Mem0 is one of the supported providers. Once enabled, Mem0 automatically learns facts from your conversations and surfaces relevant ones before each turn — all without slowing down the chat.
+Add long-term memory to [Hermes Agent](https://github.com/NousResearch/hermes-agent), a self-improving AI agent CLI by Nous Research. Hermes has a pluggable memory system, and Mem0 is one of the supported providers. Once enabled, Mem0 learns facts from your conversations and surfaces relevant ones before each turn, without slowing down the chat.
 
-## Overview
+You can run Mem0 in two ways:
 
-Hermes runs a built-in memory system (file-based `MEMORY.md` and `USER.md`) alongside one external provider. When Mem0 is active, it works additively with the built-in system at three key moments in every conversation turn:
+- **Platform mode** (default): managed Mem0 Cloud. Add your API key and you are ready.
+- **OSS mode**: fully self-hosted with your own LLM, embedder, and vector store. No data leaves your machine.
 
-### 1. Before the Agent Responds (Prefetch)
+## How It Works
 
-When you send a message, Hermes checks if it already has cached Mem0 search results from the previous turn. If so, those memories are injected into the system prompt so the LLM can see them. This is **zero-latency** — no waiting for an API call.
+Hermes runs a built-in memory system (file-based `MEMORY.md` and `USER.md`) alongside one external provider. When Mem0 is active, it works additively with the built-in system at three points in every conversation turn.
 
-### 2. After the Agent Responds (Sync)
+### 1. Before the agent responds (prefetch)
 
-Once the LLM finishes responding, Hermes sends the `(user message, assistant response)` pair to Mem0's API in a **background thread**. Mem0's server-side LLM automatically extracts facts (e.g., "user prefers Python", "user works at Acme Corp") — you don't have to tell it what to remember.
+When you send a message, Hermes checks for cached Mem0 search results from the previous turn. If they exist, those memories are injected into the system prompt so the model can see them. This is zero-latency, with no waiting on an API call.
 
-### 3. Background Prefetch for Next Turn
+### 2. After the agent responds (sync)
 
-At the same time as sync, Hermes kicks off a background search on Mem0 to pre-load relevant memories for the next turn. By the time you type your next message, the memories are already cached.
+Once the model finishes, Hermes sends the `(user message, assistant response)` pair to Mem0 in a background thread. Mem0 extracts facts automatically (for example, "user prefers Python" or "user works at Acme Corp"), so you never have to tell it what to remember. Each write is tagged with the gateway channel it came from.
+
+### 3. Background prefetch for the next turn
+
+At the same time, Hermes runs a background search to pre-load relevant memories for your next message. By the time you type, the results are already cached.
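
The three-step turn described in the "How It Works" text above (cached prefetch, background sync, background prefetch for the next turn) can be illustrated with a minimal sketch. Everything here is hypothetical: `FakeMemoryBackend` and `TurnMemory` are illustrative names, not the Hermes or Mem0 API, and for brevity the sketch runs the write and the search on a single daemon thread rather than concurrently.

```python
import threading


class FakeMemoryBackend:
    """Hypothetical stand-in for a memory service; not the Mem0 client API."""

    def __init__(self):
        self._facts = []
        self._lock = threading.Lock()

    def add(self, user_msg, assistant_msg):
        with self._lock:
            self._facts.append(f"{user_msg} -> {assistant_msg}")

    def search(self, query):
        # A real backend would rank by relevance; this one returns everything.
        with self._lock:
            return list(self._facts)


class TurnMemory:
    """Read the cache before a turn; sync and re-prefetch after it."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = []
        self._lock = threading.Lock()
        self._worker = None

    def before_turn(self):
        # Step 1: only the cache is read, so no backend call sits on the hot path.
        with self._lock:
            return list(self._cache)

    def after_turn(self, user_msg, assistant_msg):
        # Steps 2 and 3: store the exchange, then search for the next turn,
        # on a daemon thread so the conversation is never blocked.
        def work():
            self._backend.add(user_msg, assistant_msg)
            results = self._backend.search(user_msg)
            with self._lock:
                self._cache = results

        self._worker = threading.Thread(target=work, daemon=True)
        self._worker.start()

    def wait(self):
        # Only for this demo; a real agent would not join the worker.
        if self._worker is not None:
            self._worker.join()


memory = TurnMemory(FakeMemoryBackend())
first = memory.before_turn()      # nothing cached before the first exchange
memory.after_turn("I prefer Python", "Noted.")
memory.wait()
second = memory.before_turn()     # cache was filled in the background
```

The design point is that `before_turn` never touches the backend, which is what makes recall effectively free at the start of a turn; the cost is that results are one turn stale.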
 ## Agent Tools
 
-When Mem0 is active, the LLM gets three extra tools it can call during conversations:
+When Mem0 is active, the model gets five tools it can call during a conversation:
 
-| Tool | Description |
-|------|-------------|
-| `mem0_profile` | Fetch all stored memories about the user |
-| `mem0_search` | Semantic search through memories (supports optional reranking via `rerank` and `top_k` parameters) |
-| `mem0_conclude` | Store a specific fact verbatim — uses `infer=False` so no server-side LLM extraction happens |
+| Tool | Description | Parameters |
+|------|-------------|------------|
+| `mem0_list` | List all stored memories, for a full overview | `page`, `page_size` (default 100, max 200) |
+| `mem0_search` | Semantic search by meaning, ranked by relevance | `query` (required), `top_k` (default 10, max 50), `rerank` (default `true`, Platform mode only) |
+| `mem0_add` | Store a fact verbatim, with no LLM extraction | `content` (required) |
+| `mem0_update` | Update a memory's text by ID | `memory_id`, `text` (both required) |
+| `mem0_delete` | Delete a memory by ID | `memory_id` (required) |
 
 ## Installation
 
@@ -40,17 +47,19 @@ curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scri
 source ~/.bashrc
 ```
 
-The `mem0ai` Python package is automatically installed when you enable the Mem0 provider — no manual pip install needed.
+The `mem0ai` package is installed automatically when you enable the Mem0 provider, so there is no manual pip step. OSS providers may need extra packages (for example `qdrant-client`, `psycopg2-binary`, or `ollama`), which the setup flow installs for you when you pick them.
 
-## Setup
+## Platform Setup
 
-### Option 1: Interactive Setup Wizard (Recommended)
+Platform mode uses managed Mem0 Cloud and is the fastest way to start.
+
+### Option 1: Interactive wizard (recommended)
 
 ```bash
 hermes memory setup
 ```
 
-Select **mem0** as the provider and enter your Mem0 API key when prompted. The wizard writes your config to `~/.hermes/mem0.json`.
+Select **mem0**, choose **Platform**, and paste your API key when prompted. The wizard writes the non-secret settings to `~/.hermes/mem0.json` and keeps the key in `~/.hermes/.env`.
 
 Get your API key from app.mem0.ai.
 
@@ -68,33 +77,151 @@ memory:
   provider: mem0
 ```
 
-That's it — Mem0 runs automatically from this point.
+That's it. Mem0 runs automatically from here.
+
+## OSS (Self-Hosted) Setup
+
+OSS mode runs Mem0 entirely on your own infrastructure: your LLM, your embedder, and your vector store. No data is sent to Mem0 Cloud, and no Mem0 API key is required.
+
+### Interactive
+
+```bash
+hermes memory setup
+# Select "mem0", then "Open Source (self-hosted)"
+# Follow the prompts for LLM, embedder, and vector store
+```
+
+### With flags
 
-## Configuration Options
+```bash
+hermes memory setup mem0 --mode oss \
+  --oss-llm openai --oss-llm-key sk-... \
+  --oss-vector qdrant
+```
+
+### Supported providers
+
+| Component | Providers |
+|-----------|-----------|
+| LLM | `openai` (default model `gpt-5-mini`), `ollama` (local, default `llama3.1:8b`) |
+| Embedder | `openai` (default `text-embedding-3-small`), `ollama` (local, default `nomic-embed-text`) |
+| Vector store | `qdrant` (local path or server), `pgvector` |
+
+### Flag reference
+
+| Flag | Description |
+|------|-------------|
+| `--mode` | `platform` or `oss` |
+| `--oss-llm` | LLM provider (`openai` or `ollama`, default `openai`) |
+| `--oss-llm-key` | LLM API key (for `openai`) |
+| `--oss-llm-model` | Override the LLM model |
+| `--oss-llm-url` | LLM base URL (for `ollama` or a custom endpoint) |
+| `--oss-embedder` | Embedder provider (default `openai`) |
+| `--oss-embedder-key` | Embedder API key |
+| `--oss-vector` | Vector store (`qdrant` or `pgvector`, default `qdrant`) |
+| `--oss-vector-path` | Local Qdrant storage path |
+| `--oss-vector-host`, `--oss-vector-port` | PGVector or remote Qdrant host and port |
+| `--oss-vector-user`, `--oss-vector-password`, `--oss-vector-dbname` | PGVector connection details |
+| `--user-id` | Canonical user identifier |
+| `--dry-run` | Preview the resolved config without writing it |
+
+## Switching Modes
+
+You can move between Platform and OSS at any time. Run the setup command again, or edit `~/.hermes/mem0.json` directly.
+
+```bash
+# Platform to OSS
+hermes memory setup mem0 --mode oss --oss-llm-key sk-...
 
-Configuration is stored in `~/.hermes/mem0.json`. Values can also be set via environment variables.
+# OSS to Platform
+hermes memory setup mem0 --mode platform --api-key sk-...
 
-| Key | Env Variable | Default | Description |
-|-----|-------------|---------|-------------|
-| `api_key` | `MEM0_API_KEY` | — | **Required.** Mem0 Platform API key |
-| `user_id` | `MEM0_USER_ID` | `hermes-user` | User identifier for scoping memories |
-| `agent_id` | `MEM0_AGENT_ID` | `hermes` | Agent identifier |
-| `rerank` | — | `true` | Enable reranking for memory recall |
+# Preview without writing anything
+hermes memory setup mem0 --mode oss --oss-llm-key sk-... --dry-run
+```
+
+A self-hosted `~/.hermes/mem0.json` looks like this:
+
+```json
+{
+  "mode": "oss",
+  "oss": {
+    "llm": {"provider": "openai", "config": {"model": "gpt-5-mini"}},
+    "embedder": {"provider": "openai", "config": {"model": "text-embedding-3-small"}},
+    "vector_store": {"provider": "qdrant", "config": {"path": "~/.hermes/mem0_qdrant"}}
+  }
+}
+```
+
+## Configuration
+
+Behavioral settings live in `~/.hermes/mem0.json` and are written for you by `hermes memory setup`. Only the secret `MEM0_API_KEY` belongs in `~/.hermes/.env`.
+
+| Key | Default | Description |
+|-----|---------|-------------|
+| `mode` | `platform` | `platform` (Mem0 Cloud) or `oss` (self-hosted) |
+| `api_key` | none | Mem0 Platform API key, required in Platform mode. Stored in `.env` as `MEM0_API_KEY` |
+| `user_id` | `hermes-user` | Identifier that scopes memories. See cross-channel behavior below |
+| `agent_id` | `hermes` | Agent identifier attached to writes |
+| `rerank` | `true` | Rerank search results for relevance (Platform mode only) |
+
+### Cross-channel memories
+
+Hermes can run from the CLI and from gateways like Telegram, Slack, and Discord. The `user_id` setting controls how memories are scoped across them:
+
+- **Set a `user_id`** and it applies to every gateway, so one person gets a single merged memory store no matter where they talk to the agent.
+- **Leave it unset** (or at the default `hermes-user`) and each gateway uses its own native id, keeping per-platform memories separate.
+
+Either way, every write is tagged with `metadata.channel` (for example `telegram` or `cli`), so per-channel views are still possible at query time.
 
 ## Reliability
 
-- **Circuit Breaker** — If Mem0's API fails 5 times in a row, Hermes stops calling it for 2 minutes, then retries. The agent keeps working fine without memory during that time.
-- **Non-blocking** — All Mem0 API calls happen in background daemon threads. A slow or failed API call never blocks your conversation.
-- **Thread-safe** — The Mem0 client uses lazy initialization with locking, safe for concurrent access.
+- **Circuit breaker**: if Mem0 fails five times in a row, Hermes pauses calls for two minutes, then retries. The agent keeps working without memory during that window. Expected client errors, like a 404 on a missing memory id, do not count toward tripping the breaker.
+- **Non-blocking**: every Mem0 call runs in a background daemon thread, so a slow or failed call never blocks your conversation.
+- **Thread-safe**: the client uses lazy initialization with locking, and the background sync and prefetch threads are guarded so concurrent gateway messages cannot produce duplicate memories.
+
+## Troubleshooting
+
+### "Mem0 temporarily unavailable"
+
+The circuit breaker tripped after five consecutive failures and resets after two minutes.
+
+- **Platform mode**: check your API key and internet connection.
+- **OSS mode**: make sure your vector store (Qdrant or PGVector) is running and reachable.
+
+### OSS: vector store connection refused
+
+```bash
+# Local Qdrant: confirm the storage path is writable
+ls -la ~/.hermes/mem0_qdrant
+
+# Qdrant server: confirm it is reachable
+curl http://localhost:6333/healthz
+
+# PGVector: confirm PostgreSQL is accepting connections
+pg_isready -h localhost -p 5432
+```
+
+### OSS: Ollama not reachable
+
+```bash
+curl http://localhost:11434/api/tags
+```
+
+### Memories not appearing
+
+- `mem0_add` stores text verbatim with no extraction. Ordinary conversation turns are extracted automatically by the background sync.
+- Search is semantic, so try a broader query.
+- Confirm `user_id` is the same across sessions (check `~/.hermes/mem0.json`).
 
 ## Key Features
 
-1. **Zero-Latency Recall** — Memories are prefetched in the background and cached, ready before you type
-2. **Server-side Extraction** — Mem0's API automatically extracts and deduplicates facts from each exchange
-3. **Non-blocking** — All API calls run in background daemon threads
-4. **Fault Tolerant** — Circuit breaker ensures the agent works even if Mem0 is temporarily unreachable
-5. **Additive Memory** — Works alongside Hermes' built-in file-based memory system (MEMORY.md, USER.md)
+1. **Two ways to run**: managed Platform or fully self-hosted OSS, switchable at any time.
+2. **Zero-latency recall**: memories are prefetched in the background and cached before you type.
+3. **Automatic extraction**: Mem0 extracts and deduplicates facts from each exchange for you.
+4. **Non-blocking and fault tolerant**: background threads plus a circuit breaker keep the agent responsive even when Mem0 is unreachable.
+5. **Additive memory**: works alongside Hermes' built-in file memory (`MEMORY.md`, `USER.md`).
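
The circuit-breaker behavior described under Reliability (five consecutive failures, a two-minute pause, expected client errors not counted) can be sketched roughly as follows. This is an illustration under stated assumptions, not the Hermes implementation: the `CircuitBreaker` class, its method names, and the choice to treat only HTTP 404 as an "expected" error are all hypothetical.

```python
import time


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive counted failures."""

    def __init__(self, threshold=5, cooldown=120.0, clock=time.monotonic):
        self.threshold = threshold      # consecutive failures before opening
        self.cooldown = cooldown        # seconds to stay open
        self._clock = clock             # injectable so the demo needs no sleep
        self._failures = 0
        self._opened_at = None

    def allow(self):
        """Return True if a call may be attempted right now."""
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.cooldown:
            # Cooldown elapsed: close the breaker and let calls through again.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_success(self):
        self._failures = 0

    def record_failure(self, status=None):
        # Assumption: a 404 is an expected client error and is not counted.
        if status == 404:
            return
        self._failures += 1
        if self._failures >= self.threshold:
            self._opened_at = self._clock()


now = [0.0]                              # fake clock for the demo
breaker = CircuitBreaker(clock=lambda: now[0])

breaker.record_failure(status=404)       # ignored
for _ in range(4):
    breaker.record_failure(status=500)
still_closed = breaker.allow()           # four counted failures: still closed

breaker.record_failure(status=500)       # fifth counted failure opens it
tripped = not breaker.allow()

now[0] = 120.0                           # two minutes later
recovered = breaker.allow()
```

Injecting the clock keeps the sketch testable without sleeping; a caller would check `allow()` before each memory request and skip the request while the breaker is open.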
} href="/integrations/openclaw">
diff --git a/mem0-ts/src/client/mem0.types.ts b/mem0-ts/src/client/mem0.types.ts
index 185b1dbea7..ba778415d8 100644
--- a/mem0-ts/src/client/mem0.types.ts
+++ b/mem0-ts/src/client/mem0.types.ts
@@ -169,7 +169,9 @@ export interface PaginatedMemories {
 
 export interface ProjectResponse {
   customInstructions?: string;
-  customCategories?: string[];
+  // The API returns category objects (`[{ "": "" }]`),
+  // not bare strings (see issue #5738).
+  customCategories?: custom_categories[];
   [key: string]: any;
 }
 
diff --git a/mem0-ts/src/client/tests/utils.test.ts b/mem0-ts/src/client/tests/utils.test.ts
index 9007a6b879..d763858453 100644
--- a/mem0-ts/src/client/tests/utils.test.ts
+++ b/mem0-ts/src/client/tests/utils.test.ts
@@ -99,4 +99,50 @@ describe("camelToSnakeKeys / snakeToCamelKeys", () => {
       });
     });
   });
+
+  describe("user-controlled customCategories names (issue #5738)", () => {
+    it("converts the outer key but leaves multi-word category names on write", () => {
+      expect(
+        camelToSnakeKeys({
+          customCategories: [
+            { work_life_balance: "desc" },
+            { AIResearch: "desc" },
+          ],
+        }),
+      ).toEqual({
+        // outer SDK key is snake_cased, user-defined category names are not
+        custom_categories: [
+          { work_life_balance: "desc" },
+          { AIResearch: "desc" },
+        ],
+      });
+    });
+
+    it("converts the outer key but leaves category names verbatim on read", () => {
+      expect(
+        snakeToCamelKeys({
+          custom_categories: [
+            { work_life_balance: "desc" },
+            { AIResearch: "desc" },
+          ],
+        }),
+      ).toEqual({
+        customCategories: [
+          { work_life_balance: "desc" },
+          { AIResearch: "desc" },
+        ],
+      });
+    });
+
+    it("round-trips category names losslessly (write then read)", () => {
+      const customCategories = [
+        { work_life_balance: "balance between work and life" },
+        { AIResearch: "artificial intelligence research" },
+      ];
+      const roundTripped = snakeToCamelKeys(
+        camelToSnakeKeys({ customCategories }),
+      );
+      expect(roundTripped.customCategories).toEqual(customCategories);
+    });
+  });
 });
diff --git a/mem0-ts/src/client/utils.ts b/mem0-ts/src/client/utils.ts
index 817dd02cff..aca9ffbcd5 100644
--- a/mem0-ts/src/client/utils.ts
+++ b/mem0-ts/src/client/utils.ts
@@ -29,6 +29,11 @@ const OPAQUE_VALUE_KEYS = new Set([
   "metadata",
   "structuredDataSchema",
   "structured_data_schema",
+  // Custom-category names are user-controlled keys (`[{ "": "" }]`).
+  // Listed in both casings so they round-trip verbatim in both directions
+  // (see issue #5738; same class as `metadata`/`structuredDataSchema`).
+  "customCategories",
+  "custom_categories",
 ]);
 
 /**
diff --git a/mem0-ts/src/oss/src/utils/memory.ts b/mem0-ts/src/oss/src/utils/memory.ts
index 8328093692..26c18c730d 100644
--- a/mem0-ts/src/oss/src/utils/memory.ts
+++ b/mem0-ts/src/oss/src/utils/memory.ts
@@ -31,9 +31,11 @@ const parse_vision_messages = async (messages: Message[]) => {
       typeof message.content === "object" &&
       message.content.type === "image_url"
     ) {
-      const description = await get_image_description(
-        message.content.image_url.url,
-      );
+      const imageUrl = message.content.image_url?.url;
+      if (!imageUrl) {
+        throw new Error("image_url content part is missing image_url.url");
+      }
+      const description = await get_image_description(imageUrl);
       new_message.content =
         typeof description === "string"
           ? description
diff --git a/mem0/configs/vector_stores/opensearch.py b/mem0/configs/vector_stores/opensearch.py
index 9b4ce34552..bf5f43ef86 100644
--- a/mem0/configs/vector_stores/opensearch.py
+++ b/mem0/configs/vector_stores/opensearch.py
@@ -18,6 +18,12 @@ class OpenSearchConfig(BaseModel):
         "RequestsHttpConnection", description="Connection class for OpenSearch"
     )
     pool_maxsize: int = Field(20, description="Maximum number of connections in the pool")
+    auto_refresh: bool = Field(
+        False,
+        description="Automatically refresh index after insert operations to make documents "
+        "immediately searchable. Disabled by default for OpenSearch Serverless compatibility. "
+        "OpenSearch automatically refreshes indices every ~1 second, so most users don't need this.",
+    )
 
     @model_validator(mode="before")
     @classmethod
diff --git a/mem0/memory/utils.py b/mem0/memory/utils.py
index dbfd3384d8..dd7b1e4cc0 100644
--- a/mem0/memory/utils.py
+++ b/mem0/memory/utils.py
@@ -206,7 +206,10 @@ def parse_vision_messages(messages, llm=None, vision_details="auto"):
         elif isinstance(content, dict) and content.get("type") == "image_url":
             if llm is None:
                 continue
-            image_url = content["image_url"]["url"]
+            image_url_obj = content.get("image_url")
+            image_url = image_url_obj.get("url") if isinstance(image_url_obj, dict) else None
+            if not image_url:
+                raise ValueError("image_url content part is missing image_url.url")
             try:
                 description = get_image_description(image_url, llm, vision_details)
                 returned_messages.append({"role": role, "content": description})
diff --git a/mem0/reranker/cohere_reranker.py b/mem0/reranker/cohere_reranker.py
index 8de2d4ac9e..281fabcc64 100644
--- a/mem0/reranker/cohere_reranker.py
+++ b/mem0/reranker/cohere_reranker.py
@@ -1,3 +1,4 @@
+import logging
 import os
 from typing import List, Dict, Any
 
@@ -9,6 +10,8 @@ except ImportError:
     COHERE_AVAILABLE = False
 
+logger = logging.getLogger(__name__)
+
 
 class CohereReranker(BaseReranker):
     """Cohere-based reranker implementation."""
 
@@ -78,8 +81,9 @@ def rerank(self, query: str, documents: List[Dict[str, Any]], top_k: int = None)
 
             return reranked_docs
 
-        except Exception:
+        except Exception as e:
             # Fallback to original order if reranking fails
+            logger.warning("Cohere reranking failed, falling back to original order: %s", e)
             for doc in documents:
                 doc['rerank_score'] = 0.0
             final_top_k = top_k or self.config.top_k
diff --git a/mem0/reranker/huggingface_reranker.py b/mem0/reranker/huggingface_reranker.py
index 6d641964a8..8116c012e8 100644
--- a/mem0/reranker/huggingface_reranker.py
+++ b/mem0/reranker/huggingface_reranker.py
@@ -1,3 +1,4 @@
+import logging
 from typing import List, Dict, Any, Union
 
 import numpy as np
@@ -12,6 +13,8 @@ except ImportError:
     TRANSFORMERS_AVAILABLE = False
 
+logger = logging.getLogger(__name__)
+
 
 class HuggingFaceReranker(BaseReranker):
     """HuggingFace Transformers based reranker implementation."""
 
@@ -139,8 +142,9 @@ def rerank(self, query: str, documents: List[Dict[str, Any]], top_k: int = None)
 
             return reranked_docs
 
-        except Exception:
+        except Exception as e:
             # Fallback to original order if reranking fails
+            logger.warning("HuggingFace reranking failed, falling back to original order: %s", e)
             for doc in documents:
                 doc['rerank_score'] = 0.0
             final_top_k = top_k or self.config.top_k
diff --git a/mem0/reranker/llm_reranker.py b/mem0/reranker/llm_reranker.py
index b89e25f4a5..cef2dee66d 100644
--- a/mem0/reranker/llm_reranker.py
+++ b/mem0/reranker/llm_reranker.py
@@ -1,3 +1,4 @@
+import logging
 import re
 from typing import Any, Dict, List, Union
 
@@ -6,6 +7,8 @@ from mem0.reranker.base import BaseReranker
 from mem0.utils.factory import LlmFactory
 
+logger = logging.getLogger(__name__)
+
 
 class LLMReranker(BaseReranker):
     """LLM-based reranker implementation."""
 
@@ -151,8 +154,9 @@ def rerank(self, query: str, documents: List[Dict[str, Any]], top_k: int = None)
                 scored_doc['rerank_score'] = score
                 scored_docs.append(scored_doc)
 
-            except Exception:
+            except Exception as e:
                 # Fallback: assign neutral score if scoring fails
+                logger.warning("LLM reranking failed for a document, assigning neutral score: %s", e)
                 scored_doc = doc.copy()
                 scored_doc['rerank_score'] = 0.5
                 scored_docs.append(scored_doc)
diff --git a/mem0/reranker/sentence_transformer_reranker.py b/mem0/reranker/sentence_transformer_reranker.py
index 2df3b05e68..d891294f14 100644
--- a/mem0/reranker/sentence_transformer_reranker.py
+++ b/mem0/reranker/sentence_transformer_reranker.py
@@ -1,3 +1,4 @@
+import logging
 from typing import List, Dict, Any, Union
 
 import numpy as np
@@ -11,6 +12,8 @@ except ImportError:
     SENTENCE_TRANSFORMERS_AVAILABLE = False
 
+logger = logging.getLogger(__name__)
+
 
 class SentenceTransformerReranker(BaseReranker):
     """Sentence Transformer based reranker implementation."""
 
@@ -102,8 +105,9 @@ def rerank(self, query: str, documents: List[Dict[str, Any]], top_k: int = None)
 
             return reranked_docs
 
-        except Exception:
+        except Exception as e:
             # Fallback to original order if reranking fails
+            logger.warning("SentenceTransformer reranking failed, falling back to original order: %s", e)
             for doc in documents:
                 doc['rerank_score'] = 0.0
             final_top_k = top_k or self.config.top_k
diff --git a/mem0/reranker/zero_entropy_reranker.py b/mem0/reranker/zero_entropy_reranker.py
index df57623067..dcf71bfaf5 100644
--- a/mem0/reranker/zero_entropy_reranker.py
+++ b/mem0/reranker/zero_entropy_reranker.py
@@ -1,3 +1,4 @@
+import logging
 import os
 from typing import List, Dict, Any
 
@@ -9,6 +10,8 @@ except ImportError:
     ZERO_ENTROPY_AVAILABLE = False
 
+logger = logging.getLogger(__name__)
+
 
 class ZeroEntropyReranker(BaseReranker):
     """Zero Entropy-based reranker implementation."""
 
@@ -89,8 +92,9 @@ def rerank(self, query: str, documents: List[Dict[str, Any]], top_k: int = None)
 
             return reranked_docs
 
-        except Exception:
+        except Exception as e:
             # Fallback to original order if reranking fails
+            logger.warning("Zero Entropy reranking failed, falling back to original order: %s", e)
             for doc in documents:
                 doc['rerank_score'] = 0.0
             final_top_k = top_k or self.config.top_k
diff --git a/mem0/vector_stores/chroma.py b/mem0/vector_stores/chroma.py
index 0d399aad7b..b378726439 100644
--- a/mem0/vector_stores/chroma.py
+++ b/mem0/vector_stores/chroma.py
@@ -169,7 +169,7 @@ def delete(self, vector_id: str):
         Args:
             vector_id (str): ID of the vector to delete.
         """
-        self.collection.delete(ids=vector_id)
+        self.collection.delete(ids=[vector_id])
 
     def update(
         self,
@@ -185,7 +185,11 @@ def update(
             vector (Optional[List[float]], optional): Updated vector. Defaults to None.
             payload (Optional[Dict], optional): Updated payload. Defaults to None.
         """
-        self.collection.update(ids=vector_id, embeddings=vector, metadatas=payload)
+        self.collection.update(
+            ids=[vector_id],
+            embeddings=[vector] if vector is not None else None,
+            metadatas=[payload] if payload is not None else None,
+        )
 
     def get(self, vector_id: str) -> Optional[OutputData]:
         """
diff --git a/mem0/vector_stores/opensearch.py b/mem0/vector_stores/opensearch.py
index 8b6966ba51..ea8fb49596 100644
--- a/mem0/vector_stores/opensearch.py
+++ b/mem0/vector_stores/opensearch.py
@@ -39,6 +39,8 @@ def __init__(self, **kwargs):
         self.collection_name = config.collection_name
         self.embedding_model_dims = config.embedding_model_dims
 
+        self.auto_refresh = config.auto_refresh
+
         self.create_col(self.collection_name, self.embedding_model_dims)
 
     def create_index(self) -> None:
@@ -148,8 +150,6 @@ def insert(
             }
             try:
                 self.client.index(index=self.collection_name, body=body)
-                # Force refresh to make documents immediately searchable for tests
-                self.client.indices.refresh(index=self.collection_name)
 
                 results.append(
                     OutputData(
@@ -162,6 +162,14 @@ def insert(
                 logger.error(f"Error inserting vector {id_}: {e}", exc_info=True)
                 raise
 
+        # Refresh once after the full batch (not per document) if explicitly enabled.
+        # Disabled by default for Serverless compatibility: OpenSearch Serverless does not
+        # support the indices.refresh() API, and refreshing per document would cause a
+        # cluster-level I/O stall on every insert.
+        # See: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-genref.html
+        if self.auto_refresh:
+            self.client.indices.refresh(index=self.collection_name)
+
         return results
 
     def search(
diff --git a/server/dashboard/src/app/(root)/dashboard/memories/page.tsx b/server/dashboard/src/app/(root)/dashboard/memories/page.tsx
index 66f95447b1..a7419ca5df 100644
--- a/server/dashboard/src/app/(root)/dashboard/memories/page.tsx
+++ b/server/dashboard/src/app/(root)/dashboard/memories/page.tsx
@@ -27,6 +27,8 @@ import { useApiQuery } from "@/hooks/use-api-query";
 import { Memory } from "@/types/api";
 
 const PAGE_SIZE = 20;
+// Keep in sync with ALL_MEMORIES_LIMIT in server/main.py.
+const MEMORY_FETCH_LIMIT = 1000;
 
 export default function MemoriesPage() {
   const [userId, setUserId] = useState("");
@@ -41,7 +43,9 @@ export default function MemoriesPage() {
     refetch,
   } = useApiQuery(
     async () => {
-      const params = userId.trim() ? { user_id: userId.trim() } : undefined;
+      const params = userId.trim()
+        ? { user_id: userId.trim(), top_k: MEMORY_FETCH_LIMIT }
+        : { top_k: MEMORY_FETCH_LIMIT };
       const res = await api.get(MEMORY_ENDPOINTS.BASE, { params });
       const raw = res.data?.results ?? res.data ?? [];
       return Array.isArray(raw) ? raw : [];
@@ -96,7 +100,7 @@ export default function MemoriesPage() {

Memories

-          {memories.length >= 1000 && (
+          {memories.length >= MEMORY_FETCH_LIMIT && (