Search Models

Models for search queries and results.

Overview

Search models provide type-safe interfaces for searching conversations with BM25 ranking, date filtering, and title matching.

SearchQuery

Bases: BaseModel

Search query parameters with filters.

Encapsulates all search parameters including keywords, title filtering, date range filtering, and result limits. All filters are optional but at least one should be provided for meaningful results.

Immutability

This model is FROZEN - attempting to modify fields will raise ValidationError. Use .model_copy(update={...}) to create modified instances.

Example
from datetime import date

# Keyword search
query = SearchQuery(keywords=["algorithm", "design"], limit=10)

# Title filter only (fast, metadata-only)
query = SearchQuery(title_filter="Project")

# Combined filters
query = SearchQuery(
    keywords=["refactor"],
    title_filter="Project",
    from_date=date(2024, 1, 1),
    to_date=date(2024, 3, 31),
    limit=20
)

# Check filter types
if query.has_keyword_search():
    print("Performing full-text search")

Attributes:

  • keywords (list[str] | None): Keywords for full-text search (OR logic, case-insensitive)
  • title_filter (str | None): Partial match on conversation title (metadata-only, fast)
  • from_date (date | None): Start date for date range filter (inclusive)
  • to_date (date | None): End date for date range filter (inclusive)
  • limit (int): Maximum results to return (1-1000, default: 10)

validate_message_count_bounds

validate_message_count_bounds() -> SearchQuery

Validate min_messages <= max_messages when both are set (FR-005).

Raises:

  • ValueError: If min_messages > max_messages (Pydantic surfaces this to callers as a ValidationError)

Example
from pydantic import ValidationError

# Valid: min <= max
query = SearchQuery(min_messages=5, max_messages=20)

# Invalid: min > max
try:
    query = SearchQuery(min_messages=20, max_messages=5)
except ValidationError as e:
    print(e)  # message includes "min_messages (20) must be <= max_messages (5)"
Source code in src/echomine/models/search.py
@model_validator(mode="after")
def validate_message_count_bounds(self) -> SearchQuery:
    """Validate min_messages <= max_messages when both are set (FR-005).

    Raises:
        ValueError: If min_messages > max_messages

    Example:
        ```python
        # Valid: min <= max
        query = SearchQuery(min_messages=5, max_messages=20)

        # Invalid: min > max
        try:
            query = SearchQuery(min_messages=20, max_messages=5)
        except ValidationError as e:
            print(e)  # "min_messages (20) must be <= max_messages (5)"
        ```
    """
    if self.min_messages is not None and self.max_messages is not None:
        if self.min_messages > self.max_messages:
            raise ValueError(
                f"min_messages ({self.min_messages}) must be <= max_messages ({self.max_messages})"
            )
    return self

has_keyword_search

has_keyword_search() -> bool

Check if keyword search is requested.

Returns:

  • bool: True if keywords provided and non-empty, False otherwise

Example
query = SearchQuery(keywords=["algorithm"])
assert query.has_keyword_search() is True
Source code in src/echomine/models/search.py
def has_keyword_search(self) -> bool:
    """Check if keyword search is requested.

    Returns:
        True if keywords provided and non-empty, False otherwise

    Example:
        ```python
        query = SearchQuery(keywords=["algorithm"])
        assert query.has_keyword_search() is True
        ```
    """
    return self.keywords is not None and len(self.keywords) > 0

has_title_filter

has_title_filter() -> bool

Check if title filtering is requested.

Returns:

  • bool: True if title_filter provided and non-empty, False otherwise

Example
query = SearchQuery(title_filter="Project")
assert query.has_title_filter() is True
Source code in src/echomine/models/search.py
def has_title_filter(self) -> bool:
    """Check if title filtering is requested.

    Returns:
        True if title_filter provided and non-empty, False otherwise

    Example:
        ```python
        query = SearchQuery(title_filter="Project")
        assert query.has_title_filter() is True
        ```
    """
    return self.title_filter is not None and len(self.title_filter.strip()) > 0

has_date_filter

has_date_filter() -> bool

Check if date range filtering is requested.

Returns:

  • bool: True if either from_date or to_date provided, False otherwise

Example
from datetime import date

query = SearchQuery(from_date=date(2024, 1, 1))
assert query.has_date_filter() is True
Source code in src/echomine/models/search.py
def has_date_filter(self) -> bool:
    """Check if date range filtering is requested.

    Returns:
        True if either from_date or to_date provided, False otherwise

    Example:
        ```python
        from datetime import date

        query = SearchQuery(from_date=date(2024, 1, 1))
        assert query.has_date_filter() is True
        ```
    """
    return self.from_date is not None or self.to_date is not None

has_phrase_search

has_phrase_search() -> bool

Check if phrase search is requested.

Returns:

  • bool: True if phrases provided and non-empty, False otherwise

Example
query = SearchQuery(phrases=["algo-insights"])
assert query.has_phrase_search() is True
Source code in src/echomine/models/search.py
def has_phrase_search(self) -> bool:
    """Check if phrase search is requested.

    Returns:
        True if phrases provided and non-empty, False otherwise

    Example:
        ```python
        query = SearchQuery(phrases=["algo-insights"])
        assert query.has_phrase_search() is True
        ```
    """
    return self.phrases is not None and len(self.phrases) > 0

has_exclude_keywords

has_exclude_keywords() -> bool

Check if exclude keywords filtering is requested.

Returns:

  • bool: True if exclude_keywords provided and non-empty, False otherwise

Example
query = SearchQuery(keywords=["python"], exclude_keywords=["django"])
assert query.has_exclude_keywords() is True
Source code in src/echomine/models/search.py
def has_exclude_keywords(self) -> bool:
    """Check if exclude keywords filtering is requested.

    Returns:
        True if exclude_keywords provided and non-empty, False otherwise

    Example:
        ```python
        query = SearchQuery(keywords=["python"], exclude_keywords=["django"])
        assert query.has_exclude_keywords() is True
        ```
    """
    return self.exclude_keywords is not None and len(self.exclude_keywords) > 0

has_message_count_filter

has_message_count_filter() -> bool

Check if message count filtering is requested (FR-004).

Returns:

  • bool: True if either min_messages or max_messages is set, False otherwise

Example
query = SearchQuery(min_messages=10)
assert query.has_message_count_filter() is True

query2 = SearchQuery(max_messages=50)
assert query2.has_message_count_filter() is True

query3 = SearchQuery()
assert query3.has_message_count_filter() is False
Source code in src/echomine/models/search.py
def has_message_count_filter(self) -> bool:
    """Check if message count filtering is requested (FR-004).

    Returns:
        True if either min_messages or max_messages is set, False otherwise

    Example:
        ```python
        query = SearchQuery(min_messages=10)
        assert query.has_message_count_filter() is True

        query2 = SearchQuery(max_messages=50)
        assert query2.has_message_count_filter() is True

        query3 = SearchQuery()
        assert query3.has_message_count_filter() is False
        ```
    """
    return self.min_messages is not None or self.max_messages is not None

SearchResult

Bases: BaseModel, Generic[ConversationT]

Generic search result with relevance scoring.

Represents a conversation match from a search query with relevance metadata. Results are typically sorted by score (descending) before being returned to the user.

Generic Type

ConversationT: Provider-specific conversation type (e.g., Conversation for OpenAI)

Immutability

This model is FROZEN - attempting to modify fields will raise ValidationError. Use .model_copy(update={...}) to create modified instances.

Example
from echomine.models import Conversation, SearchResult

result: SearchResult[Conversation] = SearchResult(
    conversation=conversation,
    score=0.85,
    matched_message_ids=["msg-001", "msg-005"]
)

print(f"Relevance: {result.score:.2f}")
print(f"Title: {result.conversation.title}")
print(f"Matched {len(result.matched_message_ids)} messages")

# Sort results by relevance (highest score first)
results = sorted(results)  # __lt__ is already reversed, so reverse=True is not needed

Attributes:

  • conversation (ConversationT): Matched conversation object (full conversation, not just ID)
  • score (float): Relevance score (0.0-1.0, higher = better match)
  • matched_message_ids (list[str]): Message IDs containing keyword matches

__lt__

__lt__(other: SearchResult[ConversationT]) -> bool

Enable sorting by relevance (descending).

When using sorted() or .sort(), results will be ordered by relevance score in descending order (highest score first).

Parameters:

  • other (SearchResult[ConversationT], required): Another SearchResult to compare against

Returns:

  • bool: True if self.score > other.score (reversed for descending sort)

Example
results = [
    SearchResult(conversation=c1, score=0.5, matched_message_ids=[]),
    SearchResult(conversation=c2, score=0.9, matched_message_ids=[]),
    SearchResult(conversation=c3, score=0.7, matched_message_ids=[]),
]

# Sort descending by relevance (__lt__ is already reversed, so plain sorted() suffices)
sorted_results = sorted(results)
# Order: [0.9, 0.7, 0.5]
Source code in src/echomine/models/search.py
def __lt__(self, other: SearchResult[ConversationT]) -> bool:
    """Enable sorting by relevance (descending).

    When using sorted() or .sort(), results will be ordered by
    relevance score in descending order (highest score first).

    Args:
        other: Another SearchResult to compare against

    Returns:
        True if self.score > other.score (reversed for descending sort)

    Example:
        ```python
        results = [
            SearchResult(conversation=c1, score=0.5, matched_message_ids=[]),
            SearchResult(conversation=c2, score=0.9, matched_message_ids=[]),
            SearchResult(conversation=c3, score=0.7, matched_message_ids=[]),
        ]

        # Sort descending by relevance (__lt__ is already reversed)
        sorted_results = sorted(results)
        # Order: [0.9, 0.7, 0.5]
        ```
    """
    return self.score > other.score  # Reverse for descending sort
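
Because __lt__ is reversed, a plain sorted() call already puts the highest score first, and passing reverse=True flips the order to ascending. The sketch below shows this with a hypothetical stand-in class; ScoredStub is not part of echomine and only copies the comparison shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredStub:
    """Hypothetical stand-in for SearchResult; only the score matters here."""

    score: float

    def __lt__(self, other: "ScoredStub") -> bool:
        # Same reversed comparison as SearchResult.__lt__ above.
        return self.score > other.score


stubs = [ScoredStub(0.5), ScoredStub(0.9), ScoredStub(0.7)]

# sorted() relies on __lt__, so the "smallest" item is the highest score.
descending = [s.score for s in sorted(stubs)]
# reverse=True inverts that ordering, giving lowest score first.
ascending = [s.score for s in sorted(stubs, reverse=True)]

print(descending)  # [0.9, 0.7, 0.5]
print(ascending)   # [0.5, 0.7, 0.9]
```

If an explicit sort key reads more clearly in your code, sorted(results, key=lambda r: r.score, reverse=True) gives the same highest-first order without depending on __lt__.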

Usage Examples

from echomine import OpenAIAdapter, SearchQuery
from pathlib import Path

adapter = OpenAIAdapter()
export_file = Path("conversations.json")

# Create search query
query = SearchQuery(
    keywords=["algorithm", "leetcode"],
    limit=10
)

# Execute search
for result in adapter.search(export_file, query):
    print(f"[{result.score:.2f}] {result.conversation.title}")
    print(f"  Snippet: {result.snippet}")  # v1.1.0: automatic preview
    print(f"  Matches: {len(result.matched_message_ids)} messages")

Advanced Search Features (v1.1.0+)

# Exact phrase matching
query = SearchQuery(phrases=["algo-insights", "data pipeline"])
for result in adapter.search(export_file, query):
    print(f"{result.conversation.title}: {result.snippet}")

# Boolean match mode (require ALL keywords)
query = SearchQuery(
    keywords=["python", "async", "testing"],
    match_mode="all"  # AND logic
)

# Exclude unwanted results
query = SearchQuery(
    keywords=["python"],
    exclude_keywords=["django", "flask"]
)

# Role filtering
query = SearchQuery(
    keywords=["refactor"],
    role_filter="user"  # Search only your messages
)

# Combine all features
query = SearchQuery(
    keywords=["optimization"],
    phrases=["algo-insights"],
    match_mode="all",
    exclude_keywords=["test"],
    role_filter="user",
    limit=10
)

Title Filtering (Fast)

Title filtering is metadata-only and therefore much faster than full-text search:

# Search by title (partial match, case-insensitive)
query = SearchQuery(
    title_filter="Project Alpha",
    limit=10
)

for result in adapter.search(export_file, query):
    print(result.conversation.title)

Date Range Filtering

from datetime import date

# Filter by creation date
query = SearchQuery(
    from_date=date(2024, 1, 1),
    to_date=date(2024, 3, 31),
    limit=20
)

for result in adapter.search(export_file, query):
    print(f"{result.conversation.title} - {result.conversation.created_at.date()}")

Combined Filtering

Combine multiple filters for precision:

query = SearchQuery(
    keywords=["python", "async"],
    title_filter="Tutorial",
    from_date=date(2024, 1, 1),
    to_date=date(2024, 12, 31),
    limit=5
)

for result in adapter.search(export_file, query):
    print(f"[{result.score:.2f}] {result.conversation.title}")
    print(f"  Created: {result.conversation.created_at.date()}")
    print(f"  Messages: {len(result.conversation.messages)}")

Working with Results

# Collect results
results = list(adapter.search(export_file, query))

# Results are sorted by relevance (descending)
if len(results) >= 2:  # guard against queries with fewer than two matches
    assert results[0].score >= results[1].score

# Access conversation data
for result in results:
    conv = result.conversation
    print(f"Title: {conv.title}")
    print(f"Score: {result.score:.2f}")
    print(f"Messages: {len(conv.messages)}")

Validation

SearchQuery validates constraints automatically:

from pydantic import ValidationError

# ❌ Invalid: from_date > to_date
try:
    invalid = SearchQuery(
        from_date=date(2024, 12, 31),
        to_date=date(2024, 1, 1),
        keywords=["test"]
    )
except ValidationError as e:
    print(f"Error: {e}")

# ❌ Invalid: limit < 1
try:
    invalid = SearchQuery(
        keywords=["test"],
        limit=0
    )
except ValidationError as e:
    print(f"Error: {e}")  # limit must be >= 1

# ✅ Valid: all constraints met
valid = SearchQuery(
    keywords=["test"],
    from_date=date(2024, 1, 1),
    to_date=date(2024, 12, 31),
    limit=10
)

SearchQuery Fields

Content Matching Fields (v1.1.0+)

  • keywords (list[str] | None): Keywords for BM25 full-text search
  • phrases (list[str] | None): Exact phrases to match (preserves special characters)
  • match_mode (Literal["any", "all"]): Keyword matching logic (default: "any")
      • "any": OR logic - match if ANY keyword present
      • "all": AND logic - match if ALL keywords present
  • exclude_keywords (list[str] | None): Terms to exclude (OR logic - excludes if ANY present)
  • role_filter (Literal["user", "assistant", "system"] | None): Filter by message author role

Legacy Filter Fields

  • title_filter (str | None): Partial title match (case-insensitive)
  • from_date (date | None): Minimum creation date (inclusive)
  • to_date (date | None): Maximum creation date (inclusive)

Output Control

  • limit (int): Maximum results to return (default: 10, min: 1, max: 1000)

Validation Rules

  1. At least one filter (recommended, not enforced): All filters are optional, but a query with none set gives no meaningful results
  2. Date range: If both dates specified, from_date <= to_date
  3. Limit: Must be between 1 and 1000
  4. Match mode: Only affects keywords (phrases always use OR logic)
  5. Role filter: Must be one of: "user", "assistant", "system" (case-insensitive)
  6. Message count: If both are specified, min_messages <= max_messages

SearchResult Fields

Fields

  • conversation (Conversation): The matched conversation
  • score (float): Relevance score (0.0 to 1.0, higher is better)
  • matched_message_ids (list[str]): IDs of messages that matched the search query (v1.1.0+)
  • snippet (str): Preview text from first matching message, ~100 characters (v1.1.0+)

Score Interpretation

  • 1.0: Perfect match (all keywords present, high frequency)
  • 0.8-0.9: Excellent match (most keywords, good frequency)
  • 0.6-0.7: Good match (some keywords, moderate frequency)
  • 0.4-0.5: Fair match (few keywords, low frequency)
  • <0.4: Weak match

Note: Title filtering and date filtering do not affect score. Score is based on BM25 ranking when keywords or phrases are specified.
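
The bands above can be turned into a small labelling helper for display code. This is a minimal sketch: describe_score is a hypothetical name, and because the documented bands leave gaps (for example between 0.7 and 0.8), treating each band's lower bound as a threshold is an assumption rather than library behavior.

```python
def describe_score(score: float) -> str:
    """Hypothetical helper: map a relevance score to the bands listed above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within 0.0-1.0, got {score}")
    if score >= 1.0:
        return "perfect"
    if score >= 0.8:  # assumption: the lower bound of each band is the cutoff
        return "excellent"
    if score >= 0.6:
        return "good"
    if score >= 0.4:
        return "fair"
    return "weak"


for value in (1.0, 0.85, 0.65, 0.45, 0.1):
    print(f"{value:.2f} -> {describe_score(value)}")
```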

Snippet Features (v1.1.0+)

  • Extracted from first matching message
  • Truncated to ~100 characters with "..." suffix
  • Multiple matches indicated by "(+N more)" in CLI output
  • Fallback text for empty/malformed content
  • Always present (never None)
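
The snippet rules above can be sketched as a small function. This is an illustration only: make_snippet and FALLBACK_TEXT are hypothetical names, the fallback wording is a placeholder, and the library's exact truncation rule may differ.

```python
from typing import Optional

# Placeholder: the library's actual fallback wording is not documented here.
FALLBACK_TEXT = "[no preview available]"


def make_snippet(content: Optional[str], max_chars: int = 100) -> str:
    """Hypothetical helper: build a one-line preview of roughly max_chars."""
    if not content or not content.strip():
        # Empty or malformed content still yields a string, never None.
        return FALLBACK_TEXT
    # Collapse newlines and repeated whitespace so the preview stays on one line.
    text = " ".join(content.split())
    if len(text) <= max_chars:
        return text
    # Cut at the limit and mark the truncation with a "..." suffix.
    return text[:max_chars].rstrip() + "..."


print(make_snippet("How do I design a rate limiter?"))
print(make_snippet("word " * 60))
print(make_snippet(None))
```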

Working with Matched Messages

for result in adapter.search(export_file, query):
    conversation = result.conversation
    matched_ids = result.matched_message_ids

    # Find actual matched messages
    matched_messages = [
        msg for msg in conversation.messages
        if msg.id in matched_ids
    ]

    print(f"Found {len(matched_messages)} matching messages")
    for msg in matched_messages:
        print(f"  [{msg.role}] {msg.content[:50]}...")

Search Behavior

Two-Stage Matching Process (v1.1.0+)

Stage 1: Content Matching (OR relationship)

Conversations match if ANY of these are true:

  • Phrases: ANY phrase is found (exact match, case-insensitive)
  • Keywords: Match according to match_mode
      • match_mode="any" (default): ANY keyword present
      • match_mode="all": ALL keywords present

Key insight: Phrases and keywords are alternatives, not cumulative. If both specified, matches if EITHER phrases match OR keywords match.

Stage 2: Post-Match Filters (AND relationship)

After Stage 1, results are filtered by ALL of these:

  • exclude_keywords: Remove if ANY excluded term found
  • role_filter: Only messages from specified role
  • title_filter: Only conversations with matching title
  • from_date / to_date: Only in date range
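
The two stages can be sketched with plain functions. This is a simplified illustration, not the library's implementation: content_matches and passes_filters are hypothetical names, matching is done by case-insensitive substring test rather than BM25 tokenization, and the role and date filters are left out for brevity.

```python
from typing import Optional, Sequence


def content_matches(
    text: str,
    keywords: Optional[Sequence[str]] = None,
    phrases: Optional[Sequence[str]] = None,
    match_mode: str = "any",
) -> bool:
    """Stage 1 (OR): the text matches if phrases hit OR keywords hit."""
    haystack = text.lower()
    phrase_hit = any(p.lower() in haystack for p in phrases or [])
    keyword_hits = [k.lower() in haystack for k in keywords or []]
    if match_mode == "all":
        # AND logic; an empty keyword list must not count as a match.
        keyword_hit = bool(keyword_hits) and all(keyword_hits)
    else:
        keyword_hit = any(keyword_hits)
    return phrase_hit or keyword_hit


def passes_filters(
    text: str,
    title: str,
    exclude_keywords: Optional[Sequence[str]] = None,
    title_filter: Optional[str] = None,
) -> bool:
    """Stage 2 (AND): every configured filter must pass."""
    haystack = text.lower()
    if any(x.lower() in haystack for x in exclude_keywords or []):
        return False
    if title_filter and title_filter.lower() not in title.lower():
        return False
    return True


text = "We wrapped the Python API in an async client"
print(content_matches(text, keywords=["python"], phrases=["graphql"]))
print(content_matches(text, keywords=["python", "rust"], match_mode="all"))
print(passes_filters(text, "Project Alpha", exclude_keywords=["java"]))
```

Keeping Stage 1 and Stage 2 as separate predicates mirrors the description above: a conversation is returned only if content_matches(...) and passes_filters(...) both hold.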

Filter Combination Examples

# Phrase OR keyword (matches either)
SearchQuery(phrases=["api"], keywords=["python"])

# Multiple keywords with ALL mode (requires both)
SearchQuery(keywords=["python", "async"], match_mode="all")

# Content + exclusion
SearchQuery(phrases=["api"], keywords=["python"], exclude_keywords=["java"])

# Role-specific search
SearchQuery(keywords=["python"], role_filter="user")

Legacy Behavior (v1.0.x)

For backward compatibility, v1.0.x behavior is preserved:

  1. Date range filter (if specified)
  2. Title filter (if specified) - metadata-only
  3. Keyword search (if specified) - full-text with BM25
  4. Limit results

Keyword Search (BM25)

When keywords or phrases are specified:

  1. Full-text search across message content
  2. BM25 relevance ranking
  3. Results sorted by score (descending)
  4. Snippet extraction from first match

Performance: Scans all conversation content. Slower but comprehensive.
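
For readers unfamiliar with BM25, the sketch below implements the standard Okapi BM25 formula over whitespace-tokenized text. It is illustrative only: bm25_scores is a hypothetical function, k1=1.5 and b=0.75 are common textbook defaults, and echomine's tokenizer, parameters, and the normalization that produces 0.0-1.0 scores are not described in this page.

```python
import math
from collections import Counter
from typing import List, Sequence


def bm25_scores(
    docs: Sequence[str],
    query_terms: Sequence[str],
    k1: float = 1.5,
    b: float = 0.75,
) -> List[float]:
    """Return one raw (unnormalized) BM25 score per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs if n_docs else 0.0

    # Document frequency: in how many documents does each term occur?
    doc_freq: Counter = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in (t.lower() for t in query_terms):
            tf = term_freq[term]
            if tf == 0:
                continue
            # Non-negative IDF variant: ln((N - n + 0.5) / (n + 0.5) + 1).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Length normalization penalizes documents longer than average.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = ["python async tutorial", "cooking recipes", "python python testing"]
scores = bm25_scores(docs, ["python"])
for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {doc}")
```

The repeated term in the third document raises its score above the first, and the document without the term scores zero, which is the ranking behavior the steps above rely on.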

Title Filtering

When only title_filter is specified:

  1. Metadata-only search (no message content scan)
  2. Partial match, case-insensitive
  3. Results returned in file order

Performance: Fast (metadata-only). Use when you remember the title.
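
A title-only search amounts to a single pass over conversation metadata. The sketch below assumes (id, title) pairs as a stand-in for that metadata; filter_by_title is a hypothetical helper, not an echomine function.

```python
from typing import Iterable, Iterator, Tuple


def filter_by_title(
    conversations: Iterable[Tuple[str, str]],
    title_filter: str,
) -> Iterator[Tuple[str, str]]:
    """Yield (id, title) pairs whose title contains the filter, in input order."""
    needle = title_filter.strip().lower()
    for conv_id, title in conversations:
        # Partial, case-insensitive match; message content is never read.
        if needle in title.lower():
            yield conv_id, title


metadata = [
    ("c1", "Project Alpha kickoff"),
    ("c2", "Weekend recipes"),
    ("c3", "project alpha retro"),
]
matches = list(filter_by_title(metadata, "Project Alpha"))
print(matches)  # [('c1', 'Project Alpha kickoff'), ('c3', 'project alpha retro')]
```

Because the function is a generator that never sorts, matches come back in the order they appear in the input, which corresponds to the file-order behavior described above.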

Date Filtering

When date range is specified:

  1. Filters by conversation.created_at
  2. Inclusive range (from_date <= created_at <= to_date)
  3. Can be combined with keyword, phrase, or title search
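
The inclusive range check can be sketched as follows. in_date_range is a hypothetical helper, and comparing on the calendar date of created_at (ignoring time of day and timezone conversion) is an assumption about how the boundary is handled.

```python
from datetime import date, datetime
from typing import Optional


def in_date_range(
    created_at: datetime,
    from_date: Optional[date] = None,
    to_date: Optional[date] = None,
) -> bool:
    """Return True if created_at falls within [from_date, to_date], inclusive."""
    day = created_at.date()  # assumption: compare calendar dates only
    if from_date is not None and day < from_date:
        return False
    if to_date is not None and day > to_date:
        return False
    # Either bound may be omitted, which leaves that side of the range open.
    return True


q1_start, q1_end = date(2024, 1, 1), date(2024, 3, 31)
print(in_date_range(datetime(2024, 3, 31, 23, 59), q1_start, q1_end))  # True
print(in_date_range(datetime(2024, 4, 1, 0, 0), q1_start, q1_end))     # False
print(in_date_range(datetime(2023, 6, 1, 12, 0), to_date=q1_end))      # True
```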

Performance Tips

  1. Use title filtering when possible: 10-100x faster than keyword search
  2. Limit results: Use limit to avoid processing thousands of matches
  3. Narrow date ranges: Reduces conversations to search
  4. Specific keywords: More specific keywords = better ranking

See Also