Abstract:
Mem0’s infer=True mode is designed to intelligently filter and deduplicate memories, but the default prompts are tuned for personal information management rather than general use cases. This post explores the root causes of the aggressive filtering and provides practical solutions for more permissive memory management.
Estimated reading time: 6 minutes
Mem0’s Aggressive infer=True Behavior: When LLM Filtering Goes Too Far
I’ve been exploring Mem0’s memory management capabilities lately and discovered something frustrating about the default infer=True behavior. While the concept of intelligent memory filtering is compelling, the implementation is surprisingly aggressive about what it considers “worth storing.”
After digging through the codebase, I found the root cause: the default LLM prompts are designed for a very specific use case (personal information management) but are being applied too broadly. Here’s what’s happening and how to fix it.
The Problem: Overly Restrictive Default Prompts
The aggressive filtering stems from two main LLM prompts that are more restrictive than they should be for general use cases.
1. Fact Extraction Prompt
Located in mem0/configs/prompts.py, the FACT_RETRIEVAL_PROMPT is written specifically as a “Personal Information Organizer” and includes problematic few-shot examples:
Input: Hi.
Output: {"facts" : []}
Input: There are branches in trees.
Output: {"facts" : []}
The prompt teaches the LLM to return empty arrays for common conversational content and general factual statements. While this might make sense for a personal information manager, it’s too restrictive for general knowledge storage.
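To see the effect in practice, here is a minimal sketch (assuming a default Memory() instance with an LLM provider configured). With the stock prompt, a general statement like the one below often produces no extracted facts, so nothing gets stored:
from mem0 import Memory

memory = Memory()  # default config, so the default FACT_RETRIEVAL_PROMPT applies

# A general factual statement rather than "personal information"
result = memory.add(
    [{"role": "user", "content": "There are branches in trees."}],
    user_id="demo",
)

# With the default prompt this frequently comes back empty, because the
# few-shot examples teach the LLM to return {"facts": []} for input like this.
print(result)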
2. Memory Update Prompt
The DEFAULT_UPDATE_MEMORY_PROMPT has overly strict similarity detection logic. It’s designed to prevent duplicates, but the threshold for what constitutes “same information” is too aggressive.
For example, the prompt treats “Likes cheese pizza” and “Loves cheese pizza” as identical information, which might be appropriate for personal preferences but not for general knowledge storage.
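To make that decision logic concrete, here is a simplified, illustrative view of the comparison the update step performs; the field names below are condensed for illustration, and the exact schema lives in DEFAULT_UPDATE_MEMORY_PROMPT:
# Existing memory vs. a newly extracted fact, as the update step sees them
existing_memory = {"id": "0", "text": "Likes cheese pizza"}
new_fact = "Loves cheese pizza"

# Illustrative outcome with the default prompt: the two statements are treated
# as the same information, so nothing changes.
default_decision = {"id": "0", "text": "Likes cheese pizza", "event": "NONE"}

# A more permissive prompt could instead refresh the stored wording:
permissive_decision = {
    "id": "0",
    "text": "Loves cheese pizza",
    "event": "UPDATE",
    "old_memory": "Likes cheese pizza",
}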
The Impact: Empty Results and Skipped Updates
When the fact extraction returns an empty array, the entire memory update process is skipped. From mem0/memory/main.py:
if not new_retrieved_facts:
    logger.debug("No new facts retrieved from input. Skipping memory update LLM call.")
This means that if the LLM decides there are no “personal facts” to extract, nothing gets stored at all. This is particularly problematic when you’re trying to store general knowledge, context, or diverse types of information.
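If you hit this in practice, one pragmatic workaround is to check whether anything was actually stored and fall back to raw storage. This is a rough sketch, assuming a Memory instance and a messages list like the ones used later in this post, and assuming the v1.1 response shape with a "results" key:
# Hypothetical fallback: if LLM filtering stored nothing, keep the raw content
result = memory.add(messages, user_id="user123")
if not result.get("results"):
    result = memory.add(messages, user_id="user123", infer=False)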
Solutions: Custom Prompts and Configuration
The good news is that Mem0 provides configuration options to override these aggressive defaults. Here are the practical solutions:
Solution 1: Use Custom Prompts
You can override the default prompts with more permissive versions:
from mem0 import Memory
# Less restrictive fact extraction prompt
custom_fact_prompt = """You are a memory extraction system. Extract all meaningful facts, information, and insights from conversations. Focus on:
1. Personal information and preferences
2. General facts and knowledge
3. Important details and context
4. Any information that might be useful for future reference
Examples:
Input: Hi.
Output: {"facts": []}
Input: There are branches in trees.
Output: {"facts": ["Trees have branches"]}
Input: I like pizza and the weather is sunny today.
Output: {"facts": ["Likes pizza", "Weather is sunny today"]}
Return facts in JSON format: {"facts": ["fact1", "fact2", ...]}
"""
# Less aggressive update prompt
custom_update_prompt = """You are a memory manager. Compare new facts with existing memories and decide:
- ADD: New information not in memory
- UPDATE: Information that significantly expands or corrects existing memory
- NONE: Only if information is exactly identical
- DELETE: Only if information directly contradicts existing memory
Be more permissive in adding new information. Only mark as NONE if the information is truly identical.
"""
config = {
    "custom_fact_extraction_prompt": custom_fact_prompt,
    "custom_update_memory_prompt": custom_update_prompt,
    "version": "v1.1"
}
memory = Memory.from_config(config)
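With this configuration in place, the same kind of general statement that the default prompt drops should now yield extracted facts. A quick, illustrative check:
# Illustrative check: general knowledge should now survive fact extraction
result = memory.add(
    [{"role": "user", "content": "The Eiffel Tower is in Paris and I enjoy visiting it."}],
    user_id="user123",
)
print(result)  # expect a general fact about the Eiffel Tower plus the personal preference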
Solution 2: Use infer=False for Immediate Relief
For cases where you want to store all content without LLM filtering:
memory.add(messages, user_id="user123", infer=False)
This bypasses the LLM filtering entirely and stores the raw content, which can be useful for debugging or when you want maximum information retention.
Important Trade-off: Using infer=False means you lose Mem0’s sophisticated semantic chunking capabilities. Instead of LLM-powered fact extraction, you get traditional text-based chunking.
Why This Matters
The issue highlights an important consideration in LLM-powered systems: the prompts that work well for one use case might be too restrictive for others. Mem0’s default prompts are optimized for personal information management, but they’re being applied to general memory storage scenarios.
This creates a mismatch between user expectations (store my information) and system behavior (filter out most of it). The solution isn’t to abandon intelligent filtering, but to make it configurable and appropriate for different use cases.
The Semantic Chunking Trade-off
Here’s where things get interesting: infer=True doesn’t just filter content; it performs sophisticated semantic chunking that’s fundamentally different from traditional text splitting.
How infer=True Does Semantic Chunking
When you use infer=True, Mem0 uses LLM-powered fact extraction to break down conversations into meaningful semantic units:
# From mem0/memory/main.py
system_prompt, user_prompt = get_fact_retrieval_messages(parsed_messages)
response = self.llm.generate_response(
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ],
    response_format={"type": "json_object"},
)
new_retrieved_facts = json.loads(response)["facts"]
The LLM extracts discrete facts from conversations, creating semantically meaningful “chunks” rather than arbitrary text segments.
Example: Semantic vs. Traditional Chunking
# Input conversation
messages = [
    {"role": "user", "content": "Hi, I'm Sarah. I work as a data scientist at Google. I love hiking and recently visited Yosemite. My favorite food is sushi."},
    {"role": "assistant", "content": "Nice to meet you Sarah! That sounds like a great trip."}
]

# With infer=True (semantic chunking):
facts = [
    "Name is Sarah",
    "Works as a data scientist at Google",
    "Loves hiking",
    "Recently visited Yosemite",
    "Favorite food is sushi"
]

# With infer=False (traditional chunking):
chunks = [
    "Hi, I'm Sarah. I work as a data scientist at Google. I love hiking and recently visited Yosemite. My favorite food is sushi.",
    "Nice to meet you Sarah! That sounds like a great trip."
]
Why Semantic Chunking Matters
| Aspect | Traditional Chunking | Semantic Chunking |
|---|---|---|
| Retrieval | Text similarity | Meaning-based search |
| Granularity | Fixed-size blocks | Semantic concepts |
| Deduplication | Overlapping chunks | LLM merges similar info |
| Personalization | Raw text | User-relevant facts |
| Context | Arbitrary splits | Meaningful relationships |
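The retrieval row is the easiest one to feel in practice. Here is a hedged sketch using memory.search, assuming the Sarah conversation above was stored with infer=True under user_id="sarah" and assuming the v1.1 response shape with a "results" list: a meaning-based query can land on an extracted fact even though it shares almost no words with the original message.
# Meaning-based retrieval over extracted facts (illustrative)
hits = memory.search("What does Sarah do for a living?", user_id="sarah")
for hit in hits.get("results", []):
    print(hit["memory"])
# A fact like "Works as a data scientist at Google" can match this query,
# whereas raw-chunk storage has to rely on surface-level text overlap.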
The Trade-off Decision
When choosing between infer=True and infer=False, you’re really choosing between:
- infer=True: sophisticated semantic chunking with aggressive filtering
- infer=False: no filtering, but traditional text-based chunking
The ideal solution is often custom prompts that give you semantic chunking without the overly restrictive filtering.
Key Takeaways
- Default prompts are domain-specific: Mem0’s prompts are designed for personal information management, not general knowledge storage.
- Configuration is available: The system provides hooks to override the default behavior with custom prompts.
- infer=False loses semantic chunking: Using infer=False bypasses LLM filtering, but it also loses the sophisticated semantic chunking capabilities.
- Semantic chunking is powerful: infer=True provides LLM-powered fact extraction that’s much more intelligent than traditional text chunking.
- Similarity detection can be too aggressive: The deduplication logic might be preventing legitimate new information from being stored.
Practical Recommendations
For most use cases, I’d recommend:
- Start with infer=False to understand what content you’re actually generating
- Create custom prompts that match your specific use case and information types
- Test with diverse content to ensure your prompts aren’t too restrictive
- Monitor the fact extraction results to see what’s being filtered out (see the sketch after this list)
- Consider the semantic chunking trade-off when choosing between infer=True and infer=False
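For the monitoring step, a simple starting point is to raise mem0’s log level and inspect what add() returns. This is a minimal sketch; the logger name and the "results" key are assumptions based on the snippets shown earlier:
import logging

# Surface mem0's internal decisions (e.g. "No new facts retrieved from input.")
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("mem0").setLevel(logging.DEBUG)

result = memory.add(messages, user_id="user123")
print(result.get("results", []))  # inspect exactly which facts were kept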
The goal should be intelligent filtering that enhances rather than hinders your memory management workflow, while preserving the benefits of semantic chunking.
Wrapping Up
Mem0’s infer=True behavior is a case study in how LLM prompts can be both powerful and problematic. The default prompts are well-designed for their intended use case, but they’re too restrictive for general memory management.
The solution isn’t to abandon the intelligent filtering approach, but to make it more configurable and appropriate for different scenarios. With custom prompts and the option to use infer=False, you can get the benefits of intelligent memory management without the drawbacks of overly aggressive filtering.
The key insight is that infer=True provides sophisticated semantic chunking that’s much more intelligent than traditional text splitting, but the filtering can be too aggressive. The ideal approach is custom prompts that preserve the semantic chunking benefits while being more permissive about what gets stored.