Search Models¶
Models for search queries and results.
Overview¶
Search models provide type-safe interfaces for searching conversations with BM25 ranking, date filtering, and title matching.
SearchQuery¶
Bases: BaseModel
Search query parameters with filters.
Encapsulates all search parameters including keywords, title filtering, date range filtering, and result limits. All filters are optional but at least one should be provided for meaningful results.
Immutability
This model is FROZEN - attempting to modify fields will raise ValidationError. Use .model_copy(update={...}) to create modified instances.
Example
from datetime import date
from echomine import SearchQuery
# Keyword search
query = SearchQuery(keywords=["algorithm", "design"], limit=10)
# Title filter only (fast, metadata-only)
query = SearchQuery(title_filter="Project")
# Combined filters
query = SearchQuery(
    keywords=["refactor"],
    title_filter="Project",
    from_date=date(2024, 1, 1),
    to_date=date(2024, 3, 31),
    limit=20,
)
# Check filter types
if query.has_keyword_search():
    print("Performing full-text search")
Attributes:
| Name | Type | Description |
|---|---|---|
| keywords | list[str] \| None | Keywords for full-text search (OR logic, case-insensitive) |
| title_filter | str \| None | Partial match on conversation title (metadata-only, fast) |
| from_date | date \| None | Start date for date range filter (inclusive) |
| to_date | date \| None | End date for date range filter (inclusive) |
| limit | int | Maximum results to return (1-1000, default: 10) |
validate_message_count_bounds
¶
Validate min_messages <= max_messages when both are set (FR-005).
Raises:
| Type | Description |
|---|---|
| ValueError | If min_messages > max_messages |
Source code in src/echomine/models/search.py
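The bounds check described above can be sketched with a frozen dataclass. This is a minimal illustration, not echomine's implementation: `MessageCountBounds` is a hypothetical stand-in that carries only the `min_messages` and `max_messages` fields, and the error message wording is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MessageCountBounds:
    """Hypothetical stand-in for the message-count fields of SearchQuery."""

    min_messages: Optional[int] = None
    max_messages: Optional[int] = None

    def __post_init__(self) -> None:
        # Compare only when both bounds are set; a single bound is always valid.
        if (
            self.min_messages is not None
            and self.max_messages is not None
            and self.min_messages > self.max_messages
        ):
            raise ValueError(
                f"min_messages ({self.min_messages}) must be <= "
                f"max_messages ({self.max_messages})"
            )

    def has_message_count_filter(self) -> bool:
        # True if either bound is set, mirroring the documented helper.
        return self.min_messages is not None or self.max_messages is not None
```

Constructing `MessageCountBounds(min_messages=10, max_messages=3)` raises `ValueError`, while setting only one bound passes validation.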
has_keyword_search
¶
Check if keyword search is requested.
Returns:
| Type | Description |
|---|---|
| bool | True if keywords provided and non-empty, False otherwise |
Source code in src/echomine/models/search.py
has_title_filter
¶
Check if title filtering is requested.
Returns:
| Type | Description |
|---|---|
| bool | True if title_filter provided and non-empty, False otherwise |
Source code in src/echomine/models/search.py
has_date_filter
¶
Check if date range filtering is requested.
Returns:
| Type | Description |
|---|---|
| bool | True if either from_date or to_date provided, False otherwise |
Source code in src/echomine/models/search.py
has_phrase_search
¶
Check if phrase search is requested.
Returns:
| Type | Description |
|---|---|
| bool | True if phrases provided and non-empty, False otherwise |
Source code in src/echomine/models/search.py
has_exclude_keywords
¶
Check if exclude keywords filtering is requested.
Returns:
| Type | Description |
|---|---|
| bool | True if exclude_keywords provided and non-empty, False otherwise |
Source code in src/echomine/models/search.py
has_message_count_filter
¶
Check if message count filtering is requested (FR-004).
Returns:
| Type | Description |
|---|---|
| bool | True if either min_messages or max_messages is set, False otherwise |
Source code in src/echomine/models/search.py
SearchResult¶
Bases: BaseModel, Generic[ConversationT]
Generic search result with relevance scoring.
Represents a conversation match from a search query with relevance metadata. Results are typically sorted by score (descending) before being returned to the user.
Generic Type
ConversationT: Provider-specific conversation type (e.g., Conversation for OpenAI)
Immutability
This model is FROZEN - attempting to modify fields will raise ValidationError. Use .model_copy(update={...}) to create modified instances.
Example
from echomine.models import Conversation, SearchResult
result: SearchResult[Conversation] = SearchResult(
    conversation=conversation,
    score=0.85,
    matched_message_ids=["msg-001", "msg-005"],
)
print(f"Relevance: {result.score:.2f}")
print(f"Title: {result.conversation.title}")
print(f"Matched {len(result.matched_message_ids)} messages")
# Sort results by relevance, highest score first.
# __lt__ is already reversed, so do not pass reverse=True.
results = sorted(results)
Attributes:
| Name | Type | Description |
|---|---|---|
| conversation | ConversationT | Matched conversation object (full conversation, not just ID) |
| score | float | Relevance score (0.0-1.0, higher = better match) |
| matched_message_ids | list[str] | Message IDs containing keyword matches |
__lt__
¶
Enable sorting by relevance (descending).
When using sorted() or .sort(), results will be ordered by relevance score in descending order (highest score first).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other | SearchResult[ConversationT] | Another SearchResult to compare against | required |
Returns:
| Type | Description |
|---|---|
| bool | True if self.score > other.score (reversed for descending sort) |
Example
results = [
    SearchResult(conversation=c1, score=0.5, matched_message_ids=[]),
    SearchResult(conversation=c2, score=0.9, matched_message_ids=[]),
    SearchResult(conversation=c3, score=0.7, matched_message_ids=[]),
]
# Sort descending by relevance. Because __lt__ is reversed, plain sorted()
# puts the highest score first; reverse=True would yield ascending order.
sorted_results = sorted(results)
# Order: [0.9, 0.7, 0.5]
Source code in src/echomine/models/search.py
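To see how a reversed `__lt__` interacts with `sorted()`, the following sketch uses a hypothetical stand-in class, `ScoredResult`, rather than the real `SearchResult`. It assumes only what the Returns table states: the comparison is `self.score > other.score`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredResult:
    """Hypothetical stand-in for SearchResult; only the score matters here."""

    title: str
    score: float

    def __lt__(self, other: "ScoredResult") -> bool:
        # Reversed on purpose: "less than" means "more relevant".
        return self.score > other.score


results = [ScoredResult("a", 0.5), ScoredResult("b", 0.9), ScoredResult("c", 0.7)]

descending = sorted(results)               # highest score first
ascending = sorted(results, reverse=True)  # lowest score first

print([r.score for r in descending])  # [0.9, 0.7, 0.5]
print([r.score for r in ascending])   # [0.5, 0.7, 0.9]
```

With this design, the default sort order is the one callers usually want, at the cost of `reverse=True` meaning "least relevant first".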
Usage Examples¶
Basic Keyword Search¶
from echomine import OpenAIAdapter, SearchQuery
from pathlib import Path
adapter = OpenAIAdapter()
export_file = Path("conversations.json")
# Create search query
query = SearchQuery(
    keywords=["algorithm", "leetcode"],
    limit=10,
)
# Execute search
for result in adapter.search(export_file, query):
    print(f"[{result.score:.2f}] {result.conversation.title}")
    print(f" Snippet: {result.snippet}")  # v1.1.0: automatic preview
    print(f" Matches: {len(result.matched_message_ids)} messages")
Advanced Search Features (v1.1.0+)¶
# Exact phrase matching
query = SearchQuery(phrases=["algo-insights", "data pipeline"])
for result in adapter.search(export_file, query):
    print(f"{result.conversation.title}: {result.snippet}")
# Boolean match mode (require ALL keywords)
query = SearchQuery(
    keywords=["python", "async", "testing"],
    match_mode="all",  # AND logic
)
# Exclude unwanted results
query = SearchQuery(
    keywords=["python"],
    exclude_keywords=["django", "flask"],
)
# Role filtering
query = SearchQuery(
    keywords=["refactor"],
    role_filter="user",  # Search only your messages
)
# Combine all features
query = SearchQuery(
    keywords=["optimization"],
    phrases=["algo-insights"],
    match_mode="all",
    exclude_keywords=["test"],
    role_filter="user",
    limit=10,
)
Title Filtering (Fast)¶
Title filtering reads only conversation metadata, so it is much faster than full-text search:
# Search by title (partial match, case-insensitive)
query = SearchQuery(
    title_filter="Project Alpha",
    limit=10,
)
for result in adapter.search(export_file, query):
    print(result.conversation.title)
Date Range Filtering¶
from datetime import date
# Filter by creation date
query = SearchQuery(
    from_date=date(2024, 1, 1),
    to_date=date(2024, 3, 31),
    limit=20,
)
for result in adapter.search(export_file, query):
    print(f"{result.conversation.title} - {result.conversation.created_at.date()}")
Combined Filtering¶
Combine multiple filters for precision:
query = SearchQuery(
    keywords=["python", "async"],
    title_filter="Tutorial",
    from_date=date(2024, 1, 1),
    to_date=date(2024, 12, 31),
    limit=5,
)
for result in adapter.search(export_file, query):
    print(f"[{result.score:.2f}] {result.conversation.title}")
    print(f" Created: {result.conversation.created_at.date()}")
    print(f" Messages: {len(result.conversation.messages)}")
Working with Results¶
# Collect results
results = list(adapter.search(export_file, query))
# Results are sorted by relevance (descending); this check is also safe
# when fewer than two results are returned
assert all(a.score >= b.score for a, b in zip(results, results[1:]))
# Access conversation data
for result in results:
    conv = result.conversation
    print(f"Title: {conv.title}")
    print(f"Score: {result.score:.2f}")
    print(f"Messages: {len(conv.messages)}")
Validation¶
SearchQuery validates constraints automatically:
from datetime import date
from pydantic import ValidationError
# ❌ Invalid: from_date > to_date
try:
    invalid = SearchQuery(
        from_date=date(2024, 12, 31),
        to_date=date(2024, 1, 1),
        keywords=["test"],
    )
except ValidationError as e:
    print(f"Error: {e}")
# ❌ Invalid: limit < 1
try:
    invalid = SearchQuery(
        keywords=["test"],
        limit=0,
    )
except ValidationError as e:
    print(f"Error: {e}")  # limit must be >= 1
# ✅ Valid: all constraints met
valid = SearchQuery(
    keywords=["test"],
    from_date=date(2024, 1, 1),
    to_date=date(2024, 12, 31),
    limit=10,
)
SearchQuery Fields¶
Content Matching Fields (v1.1.0+)¶
- keywords (list[str] | None): Keywords for BM25 full-text search
- phrases (list[str] | None): Exact phrases to match (preserves special characters)
- match_mode (Literal["any", "all"]): Keyword matching logic (default: "any")
  - "any": OR logic - match if ANY keyword present
  - "all": AND logic - match if ALL keywords present
- exclude_keywords (list[str] | None): Terms to exclude (OR logic - excludes if ANY present)
- role_filter (Literal["user", "assistant", "system"] | None): Filter by message author role
Legacy Filter Fields¶
- title_filter (str | None): Partial title match (case-insensitive)
- from_date (date | None): Minimum creation date (inclusive)
- to_date (date | None): Maximum creation date (inclusive)
Output Control¶
- limit (int): Maximum results to return (default: 10, min: 1)
Validation Rules¶
- At least one filter: Must specify keywords, phrases, or title_filter
- Date range: If both dates specified, from_date <= to_date
- Limit: Must be >= 1
- Match mode: Only affects keywords (phrases always use OR logic)
- Role filter: Must be one of: "user", "assistant", "system" (case-insensitive)
SearchResult Fields¶
Fields¶
- conversation (Conversation): The matched conversation
- score (float): Relevance score (0.0 to 1.0, higher is better)
- matched_message_ids (list[str]): IDs of messages that matched the search query (v1.1.0+)
- snippet (str): Preview text from first matching message, ~100 characters (v1.1.0+)
Score Interpretation¶
- 1.0: Perfect match (all keywords present, high frequency)
- 0.8-0.9: Excellent match (most keywords, good frequency)
- 0.6-0.7: Good match (some keywords, moderate frequency)
- 0.4-0.5: Fair match (few keywords, low frequency)
- <0.4: Weak match
Note: Title filtering and date filtering do not affect score. Score is based on BM25 ranking when keywords or phrases are specified.
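The score bands above can be turned into labels with a small helper. This is an illustrative sketch only: `describe_score` is a hypothetical function, not part of echomine, and it assumes the bands are contiguous (for example, anything from 0.8 up to but not including 1.0 counts as "excellent").

```python
def describe_score(score: float) -> str:
    """Map a relevance score in [0.0, 1.0] to a rough label (hypothetical helper)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0.0, 1.0], got {score}")
    if score >= 1.0:
        return "perfect"
    if score >= 0.8:
        return "excellent"
    if score >= 0.6:
        return "good"
    if score >= 0.4:
        return "fair"
    return "weak"


for value in (1.0, 0.85, 0.65, 0.45, 0.1):
    print(f"{value:.2f} -> {describe_score(value)}")
```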
Snippet Features (v1.1.0+)¶
- Extracted from first matching message
- Truncated to ~100 characters with "..." suffix
- Multiple matches indicated by "(+N more)" in CLI output
- Fallback text for empty/malformed content
- Always present (never None)
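The truncation and fallback behavior listed above might look roughly like the sketch below. The function name `make_snippet`, the fallback string, and the whitespace collapsing are assumptions made for illustration; the library's actual extraction logic may differ.

```python
def make_snippet(text: str, max_length: int = 100, fallback: str = "[no content]") -> str:
    """Build a short preview string that is never None (hypothetical helper)."""
    # Collapse runs of whitespace so the preview fits on one line.
    cleaned = " ".join(text.split())
    if not cleaned:
        # Empty or whitespace-only content falls back to placeholder text.
        return fallback
    if len(cleaned) <= max_length:
        return cleaned
    # Cut at the limit and mark the truncation with a "..." suffix.
    return cleaned[:max_length].rstrip() + "..."


print(make_snippet("Short message"))
print(make_snippet("word " * 60))
```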
Working with Matched Messages¶
for result in adapter.search(export_file, query):
    conversation = result.conversation
    matched_ids = result.matched_message_ids
    # Find actual matched messages
    matched_messages = [
        msg for msg in conversation.messages
        if msg.id in matched_ids
    ]
    print(f"Found {len(matched_messages)} matching messages")
    for msg in matched_messages:
        print(f" [{msg.role}] {msg.content[:50]}...")
Search Behavior¶
Two-Stage Matching Process (v1.1.0+)¶
Stage 1: Content Matching (OR relationship)
Conversations match if ANY of these are true:
- Phrases: ANY phrase is found (exact match, case-insensitive)
- Keywords: Match according to match_mode
  - match_mode="any" (default): ANY keyword present
  - match_mode="all": ALL keywords present
Key insight: Phrases and keywords are alternatives, not cumulative. If both specified, matches if EITHER phrases match OR keywords match.
Stage 2: Post-Match Filters (AND relationship)
After Stage 1, results are filtered by ALL of these:
- exclude_keywords: Remove if ANY excluded term found
- role_filter: Only messages from specified role
- title_filter: Only conversations with matching title
- from_date / to_date: Only in date range
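The two stages can be sketched as a single predicate. This is a simplified model under stated assumptions, not echomine's code: `Query` and `matches` are hypothetical names, matching is plain case-insensitive substring search instead of BM25 tokenization, and the role and date filters are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Query:
    """Hypothetical stand-in for the content-matching fields of SearchQuery."""

    keywords: List[str] = field(default_factory=list)
    phrases: List[str] = field(default_factory=list)
    match_mode: str = "any"
    exclude_keywords: List[str] = field(default_factory=list)
    title_filter: Optional[str] = None


def matches(title: str, text: str, query: Query) -> bool:
    haystack = text.lower()

    # Stage 1: content matching. Phrases and keywords are alternatives (OR).
    phrase_hit = any(p.lower() in haystack for p in query.phrases)
    combine = all if query.match_mode == "all" else any
    keyword_hit = bool(query.keywords) and combine(
        k.lower() in haystack for k in query.keywords
    )
    if (query.phrases or query.keywords) and not (phrase_hit or keyword_hit):
        return False

    # Stage 2: post-match filters. Every configured filter must pass (AND).
    if any(x.lower() in haystack for x in query.exclude_keywords):
        return False
    if query.title_filter and query.title_filter.lower() not in title.lower():
        return False
    return True
```

For example, `matches("Notes", "Building an API client in Java", Query(phrases=["api"], keywords=["python"]))` passes Stage 1 through the phrase alone, and adding `exclude_keywords=["java"]` rejects the same conversation in Stage 2.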
Filter Combination Examples¶
# Phrase OR keyword (matches either)
SearchQuery(phrases=["api"], keywords=["python"])
# Multiple keywords with ALL mode (requires both)
SearchQuery(keywords=["python", "async"], match_mode="all")
# Content + exclusion
SearchQuery(phrases=["api"], keywords=["python"], exclude_keywords=["java"])
# Role-specific search
SearchQuery(keywords=["python"], role_filter="user")
Legacy Behavior (v1.0.x)¶
For backward compatibility, v1.0.x behavior is preserved:
- Date range filter (if specified)
- Title filter (if specified) - metadata-only
- Keyword search (if specified) - full-text with BM25
- Limit results
Keyword Search (BM25)¶
When keywords or phrases are specified:
- Full-text search across message content
- BM25 relevance ranking
- Results sorted by score (descending)
- Snippet extraction from first match
Performance: Scans all conversation content. Slower but comprehensive.
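For readers unfamiliar with BM25, the sketch below computes raw scores with one common variant of the formula. The function name `bm25_scores`, the parameters `k1=1.5` and `b=0.75`, and the pre-tokenized input are assumptions for illustration. This page does not say how echomine tokenizes text or maps raw scores onto the 0.0-1.0 range, so neither is modeled here.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query_terms: List[str],
    documents: List[List[str]],
    k1: float = 1.5,
    b: float = 0.75,
) -> List[float]:
    """Return the raw (unnormalized) BM25 score of each tokenized document."""
    n_docs = len(documents)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in documents) / n_docs
    terms = set(query_terms)
    # Document frequency: how many documents contain each query term.
    doc_freq = {t: sum(1 for doc in documents if t in doc) for t in terms}

    scores = []
    for doc in documents:
        counts = Counter(doc)
        score = 0.0
        for term in terms:
            tf = counts[term]
            if tf == 0:
                continue
            # Rare terms get a larger weight; the "1 +" keeps the idf non-negative.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Longer-than-average documents are penalized via b.
            length_norm = 1 - b + b * len(doc) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


docs = [["python", "async", "python"], ["java", "spring"], ["python", "testing"]]
scores = bm25_scores(["python"], docs)
print(scores)
```

Documents that do not contain any query term score exactly 0.0, and repeated occurrences of a term raise the score with diminishing returns.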
Title Filtering¶
When only title_filter is specified:
- Metadata-only search (no message content scan)
- Partial match, case-insensitive
- Results returned in file order
Performance: Fast (metadata-only). Use when you remember the title.
Date Filtering¶
When date range is specified:
- Filters by conversation.created_at
- Inclusive range (from_date <= created_at <= to_date)
- Can be combined with keyword, phrase, or title search
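The inclusive range check can be written as a short predicate. This is a sketch under assumptions: `in_date_range` is a hypothetical helper, it compares the calendar day of `created_at` against the bounds, and time zone handling is not addressed.

```python
from datetime import date, datetime
from typing import Optional


def in_date_range(
    created_at: datetime,
    from_date: Optional[date] = None,
    to_date: Optional[date] = None,
) -> bool:
    """Return True if created_at falls within the inclusive range (hypothetical helper)."""
    day = created_at.date()
    # A missing bound leaves that side of the range open.
    if from_date is not None and day < from_date:
        return False
    if to_date is not None and day > to_date:
        return False
    return True


# The last day of the range is still included.
print(in_date_range(datetime(2024, 3, 31, 23, 59), date(2024, 1, 1), date(2024, 3, 31)))
```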
Performance Tips¶
- Use title filtering when possible: 10-100x faster than keyword search
- Limit results: Use limit to avoid processing thousands of matches
- Narrow date ranges: Reduces conversations to search
- Specific keywords: More specific keywords = better ranking
Related Models¶
- Conversation: Result conversation model
- Message: Message model within conversations