Local-First, Multi-Layered Search Architecture
How Deck implements a full search pipeline from exact match to semantic search, entirely offline — FTS5, fuzzy matching, NLEmbedding, and sqlite-vec.
Search makes or breaks a clipboard manager. You might copy hundreds of things a day — if you can't quickly find that one snippet from this morning, the history is just noise.
Cloud-based apps can offload this to Elasticsearch or Algolia. Deck doesn't have that option — it's a macOS-native clipboard manager built with Swift and SwiftUI, and all data stays on the user's machine. No server, no network calls, no telemetry. The search system has to be built entirely on top of SQLite and Apple's native frameworks.
This post covers how that works.
The Challenge
Local search is deceptively hard. Consider what a clipboard manager must handle:
- CJK text, which has no whitespace delimiters.
- Typos and fuzzy recall — the user types "clpboard" and still expects to find "clipboard".
- Semantic intent — searching "network error" should surface entries about "connection timeout".
- Sub-second latency — search must feel instant, even with tens of thousands of entries.
- Zero resource waste — a background app cannot burn CPU or memory.
Deck's answer is a five-layer search pipeline — each layer catches what the previous one missed.
Architecture Overview
```mermaid
flowchart TD
    Input["User Query"] --> Parser["SearchRuleParser.parse"]
    Parser --> Keyword["Keyword"]
    Parser --> Filters["Rule Filters\n(app / date / type / lang / size)"]
    Keyword --> DB["DB Layer"]
    DB --> FTS["FTS5 Full-Text Search (BM25)"]
    DB --> LIKE["SQL LIKE Fallback"]
    DB --> VEC["sqlite-vec Vector Search"]
    FTS --> Candidates["Candidate Set"]
    LIKE --> Candidates
    VEC --> Candidates
    Candidates --> RuleFilter["SearchRuleFilters.apply"]
    Filters --> RuleFilter
    RuleFilter --> MemSearch["In-Memory Search"]
    MemSearch --> Exact["exact"]
    MemSearch --> Regex["regex"]
    MemSearch --> Fuzzy["fuzzy (Fuse)"]
    MemSearch --> Mixed["mixed\nexact → regex → fuzzy"]
    Exact --> Rerank["SemanticSearchService.rank"]
    Regex --> Rerank
    Fuzzy --> Rerank
    Mixed --> Rerank
    Rerank --> Results["Ranked Results"]
```

The five search modes:
| Mode | Description |
|---|---|
| `exact` | Exact substring match |
| `fuzzy` | Fuse-style subsequence matching |
| `regex` | Regular expression |
| `mixed` | exact → regex → fuzzy chain |
| `semantic` | mixed + semantic reranking |
Layer 1: FTS5 Full-Text Search with Trigram CJK Support
SQLite's FTS5 is the workhorse of Deck's search. But out of the box, FTS5's default tokenizer splits on whitespace and punctuation, which leaves CJK text, written without spaces, effectively unsearchable.
The fix: trigram tokenization. Instead of splitting on word boundaries, trigram slides a 3-character window across the text, generating overlapping tokens. 剪贴板管理 becomes 剪贴板, 贴板管, 板管理 — and substring search Just Works.
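The sliding-window idea is easy to sketch. This is an illustrative Python version, not Deck's code; SQLite's trigram tokenizer does the equivalent in C:

```python
def trigrams(text: str) -> list[str]:
    """Slide a 3-character window across the text, emitting overlapping tokens."""
    return [text[i:i + 3] for i in range(len(text) - 2)]

# A CJK phrase with no spaces still yields searchable tokens.
print(trigrams("剪贴板管理"))  # ['剪贴板', '贴板管', '板管理']
```

Any 3-character-or-longer substring of the indexed text is then guaranteed to share at least one token with the index, which is why substring search "just works."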
```swift
func createFTS5Table(forceRecreate: Bool, preferTrigram: Bool) {
    let tokenizer = preferTrigram && isFTSTrigramAvailable()
        ? "tokenize='trigram'"
        : "tokenize='unicode61'"
    let sql = """
    CREATE VIRTUAL TABLE IF NOT EXISTS ClipboardHistory_fts
    USING fts5(search_text, content='ClipboardHistory',
               content_rowid='id', \(tokenizer))
    """
    // ...
}
```

At launch, `ensureFTSTrigramIfAvailable()` checks whether the current SQLite build supports trigram. If it does and the existing table uses unicode61, the table is transparently rebuilt; no user action needed.
Query Building
The query builder adapts to the tokenizer:
```swift
func buildFTSQuery(from keyword: String, useTrigram: Bool) -> String? {
    let terms = keyword.components(separatedBy: .whitespaces)
        .filter { !$0.isEmpty }
    let processed: [String] = terms.compactMap { term in
        if useTrigram {
            guard term.count >= 3 else { return nil } // trigram minimum
            return "\"\(term)\""   // exact phrase
        } else {
            return "\"\(term)\"*"  // prefix match
        }
    }
    // No usable terms (e.g. all shorter than the trigram minimum) → no FTS query.
    return processed.isEmpty ? nil : processed.joined(separator: " OR ")
}
```

Key differences between the two modes:
| | Trigram | Unicode61 |
|---|---|---|
| Min query length | 3 chars | 1 char |
| Match style | Exact phrase (`"term"`) | Prefix match (`"term"*`) |
| CJK support | Native | Broken |
| Index size | ~3× larger | Compact |
Results are ranked by BM25 — the same algorithm behind Elasticsearch and Lucene — giving relevance-scored ordering out of the box.
CJK Detection
Deck detects CJK characters to decide fallback strategy:
| Range | Script |
|---|---|
| `U+4E00`–`U+9FFF` | CJK Unified Ideographs |
| `U+3040`–`U+309F` | Hiragana |
| `U+30A0`–`U+30FF` | Katakana |
| `U+AC00`–`U+D7AF` | Hangul Syllables |
If trigram is unavailable and the query contains CJK, Deck falls back to SQL LIKE (database-level) or, as a last resort, in-memory string matching. The results may be slower, but the user always gets results.
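The range check itself is a few lines. A minimal Python sketch using the table above (Deck's actual implementation is a Swift string extension):

```python
# Unicode ranges from the table above.
CJK_RANGES = [
    (0x4E00, 0x9FFF),   # CJK Unified Ideographs
    (0x3040, 0x309F),   # Hiragana
    (0x30A0, 0x30FF),   # Katakana
    (0xAC00, 0xD7AF),   # Hangul Syllables
]

def contains_cjk(text: str) -> bool:
    """True if any character falls inside one of the CJK ranges."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in CJK_RANGES)

print(contains_cjk("clipboard"))  # False
print(contains_cjk("剪贴板"))      # True
```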
Layer 2: Bitap-Style Fuzzy Matching (FuseSearch)
FTS5 is fast but literal — it won't help when the user mistypes "screnshot" for "screenshot". That's where FuseSearch comes in.
The algorithm works in two phases:
1. Exact scan — `O(n)` substring search. If it hits, return immediately with a perfect score.
2. Fuzzy match — walk through the text character by character, greedily matching each pattern character in order. Track the positions and gaps.
Scoring
The score formula (lower is better, 0 = perfect match):
```
consecutiveBonus = 1.0 - (totalDistance / denominator)
startBonus       = 1.0 - (firstMatchIndex / denominator)
coverageRatio    = patternLength / denominator
score = 1.0 - (consecutiveBonus × 0.5 + startBonus × 0.3 + coverageRatio × 0.2)
```

Three signals, weighted by importance:
| Signal | Weight | Intuition |
|---|---|---|
| Consecutive bonus | 50% | Reward matches that are close together |
| Start bonus | 30% | Reward matches near the beginning |
| Coverage ratio | 20% | Reward longer pattern coverage |
Guard Rails
Fuzzy search is inherently expensive, so Deck sets hard limits:
| Constant | Value | Purpose |
|---|---|---|
| `threshold` | 0.6 | Discard matches worse than this |
| `maxPatternLength` | 32 | Ignore absurdly long patterns |
| `fuzzySearchLimit` | 5000 | Skip entries longer than this |
In the mixed search mode, fuzzy is the last resort — the chain runs exact → regex → fuzzy, and fuzzy only fires when the first two come up empty.
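The chain itself is simple control flow: each stage runs only if the previous one found nothing. An illustrative Python sketch (the fuzzy fallback here is a bare subsequence check, standing in for the full scorer):

```python
import re

def mixed_search(query: str, entries: list[str]) -> list[str]:
    """exact → regex → fuzzy: later stages fire only when earlier ones miss."""
    hits = [e for e in entries if query in e]            # exact
    if hits:
        return hits
    try:
        pat = re.compile(query)
        hits = [e for e in entries if pat.search(e)]     # regex
    except re.error:
        pass                                             # not a valid regex
    if hits:
        return hits

    def is_subsequence(p: str, t: str) -> bool:
        it = iter(t)
        return all(ch in it for ch in p)                 # in-order match

    return [e for e in entries if is_subsequence(query, e)]   # fuzzy
```

Ordering the stages by cost means the expensive fuzzy pass almost never runs on a well-formed query.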
Layer 3: Semantic Vector Search with NLEmbedding
This is where the search system transcends keyword matching. Using Apple's built-in NLEmbedding from the Natural Language framework, Deck generates sentence embeddings entirely offline — no API calls, no model downloads, no privacy concerns.
Bilingual Embedding
```swift
func embedding(for text: String) -> [Double]? {
    let language: NLLanguage = text.containsChineseCharacters
        ? .simplifiedChinese
        : .english
    guard let model = NLEmbedding.sentenceEmbedding(for: language) else {
        return nil
    }
    let truncated = String(text.prefix(maxSemanticTextLength)) // 512 chars
    return model.vector(for: truncated)
}
```

Deck dynamically selects the embedding model based on whether the text contains Chinese characters. Two models, one interface: the caller never needs to think about language.
Accelerated Similarity
Cosine similarity is computed using vDSP_mmul from Apple's Accelerate framework — hardware-optimized matrix multiplication that runs on the CPU's vector units.
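The underlying math is a dot product over pre-normalized vectors. A plain-Python sketch of the idea (Deck batches the same computation through `vDSP_mmul`; the `EmbeddingBox` class here only mirrors the norm-caching idea, not Deck's actual type):

```python
import math

class EmbeddingBox:
    """Caches a vector together with its pre-computed L2 norm."""
    def __init__(self, vector: list[float]):
        self.vector = vector
        self.norm = math.sqrt(sum(x * x for x in vector))

def cosine_similarity(a: EmbeddingBox, b: EmbeddingBox) -> float:
    """Dot product divided by the product of cached norms."""
    if a.norm == 0.0 or b.norm == 0.0:
        return 0.0
    dot = sum(x * y for x, y in zip(a.vector, b.vector))
    return dot / (a.norm * b.norm)
```

Caching the norm matters because one query is compared against thousands of stored vectors: each stored norm is computed once at insert time, not on every search.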
To avoid memory spikes when computing similarity against thousands of entries, the computation is done in blocks:
```swift
let matrixBlockSize = 256
for blockStart in stride(from: 0, to: candidates.count, by: matrixBlockSize) {
    let blockEnd = min(blockStart + matrixBlockSize, candidates.count)
    // vDSP_mmul on this block only
    // ...
}
```

Dynamic Thresholds
Short queries produce noisier embeddings, so Deck adjusts the similarity threshold:
| Query length | Min score | Rationale |
|---|---|---|
| ≤ 2 chars | 0.18 | Short queries are vague, so be strict |
| ≤ 4 chars | 0.12 | Moderate confidence |
| > 4 chars | 0.08 | Longer queries are more precise, so the threshold can relax |
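The threshold table above reduces to a tiny function. A sketch, using the published values (the function name is hypothetical):

```python
def min_semantic_score(query: str) -> float:
    """Stricter similarity cutoffs for shorter, noisier queries."""
    if len(query) <= 2:
        return 0.18
    if len(query) <= 4:
        return 0.12
    return 0.08

print(min_semantic_score("网络"))          # 0.18
print(min_semantic_score("network error"))  # 0.08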
Resource Management
An embedding model sitting in memory is expensive. Deck uses a lazy-load + auto-release pattern:
| Parameter | Value | Purpose |
|---|---|---|
| `cache.countLimit` | 400 | Max cached vectors |
| `embeddingReleaseDelay` | 30s | Release model after idle |
| `maxSemanticTextLength` | 512 | Truncate before embedding |
| `matrixBlockSize` | 256 | Block size for batch similarity |
The EmbeddingBox struct caches both the vector and its pre-computed L2 norm, avoiding redundant normalization on every comparison.
Layer 4: sqlite-vec — Persistent Vector Index
In-memory cosine similarity doesn't scale when the history grows to hundreds of thousands of entries. Deck offloads vector search to sqlite-vec, a SQLite extension that adds vector indexing as a virtual table.
Loading Strategy
```swift
if sqlite3_vec_init != nil {
    // 1. Try static linking first
    sqlite3_vec_init(db, nil, nil)
} else {
    // 2. Fall back to dynamic loading
    let candidates = ["vec0", "vec0.dylib", "sqlite-vec", "sqlite-vec.dylib"]
    for name in candidates {
        if sqlite3_load_extension(db, name, nil, nil) == SQLITE_OK { break }
    }
}
```

Virtual Table & Query
```sql
-- Create the vector index
CREATE VIRTUAL TABLE ClipboardHistory_vec
USING vec0(embedding float[512]);

-- Search by vector similarity
SELECT rowid, distance
FROM ClipboardHistory_vec
WHERE embedding MATCH ?   -- L2-normalized JSON array
ORDER BY distance
LIMIT ?;
```

The data pipeline:
```
NLEmbedding → [Double] → L2 normalize → JSON array
  → INSERT INTO ClipboardHistory_embedding
  → updateVecIndex() syncs to vec virtual table
  → searchVecCandidates() returns rowid + distance
```

This gives Deck sub-millisecond vector search even with large histories: the index lives on disk, loads lazily, and integrates seamlessly with SQLite's query planner.
Layer 5: Slash-Rule Filters — Unicode Private Use Area Encoding
Power users need structured filters: "show me images from Figma, copied last week." Deck supports this via slash commands — /app:Figma /type:image /date:7d. The underlying implementation uses a Unicode trick worth explaining.
The PUA Trick
When the user types /app:, the search bar replaces it with a colored capsule tag (via NSTextAttachment). Under the hood, this capsule contains a Unicode Private Use Area character:
```swift
enum SearchRuleToken {
    static let app: Character    = "\u{E000}" // U+E000
    static let date: Character   = "\u{E001}" // U+E001
    static let type: Character   = "\u{E002}" // U+E002
    static let lang: Character   = "\u{E003}" // U+E003
    static let size: Character   = "\u{E004}" // U+E004
    static let marker: Character = "\u{2063}" // Invisible separator
}
```

Why PUA? Because these code points are guaranteed to never appear in normal user text: no accidental collisions, no escaping needed. The invisible separator U+2063 marks tokens that were inserted by the slash trigger, so the parser knows which characters are filter tokens vs. user input.
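A Python sketch of the parsing side. The exact wire format inside the capsule is not shown above, so this assumes each capsule is encoded as marker + token character + value + marker; the `parse` function is illustrative, not Deck's `SearchRuleParser`:

```python
TOKEN_FILTERS = {
    "\uE000": "app",
    "\uE001": "date",
    "\uE002": "type",
    "\uE003": "lang",
    "\uE004": "size",
}
MARKER = "\u2063"  # invisible separator (U+2063)

def parse(raw: str) -> tuple[str, dict[str, str]]:
    """Split a search string into (keyword, filters) by scanning for PUA tokens.
    Assumed capsule layout: MARKER, token char, value, MARKER."""
    filters: dict[str, str] = {}
    keyword: list[str] = []
    i = 0
    while i < len(raw):
        if raw[i] == MARKER and i + 1 < len(raw) and raw[i + 1] in TOKEN_FILTERS:
            end = raw.find(MARKER, i + 2)            # closing separator
            filters[TOKEN_FILTERS[raw[i + 1]]] = raw[i + 2:end]
            i = end + 1
        else:
            keyword.append(raw[i])
            i += 1
    return "".join(keyword).strip(), filters

print(parse("screenshot \u2063\uE000Xcode\u2063"))
# ('screenshot', {'app': 'Xcode'})
```

Because the PUA characters cannot occur in pasted text, the scan never misfires on ordinary input.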
Filter Dimensions
| Filter | Example | Description |
|---|---|---|
| `/app:` | `/app:Xcode` | Source application |
| `/date:` | `/date:7d` | Time range |
| `/type:` | `/type:image` | Content type |
| `/lang:` | `/lang:swift` | Programming language |
| `/size:` | `/size:>1kb` | Content size |
Each filter also supports exclusion: `/-app:Safari` means "exclude entries from Safari."
The parser `SearchRuleParser.parse` scans the rich text for PUA characters and extracts a structured `ParsedSearchQuery(keyword, ruleFilters)`; `SearchRuleFilters.apply(to:)` then filters the candidate set in memory.
Graceful Degradation: Search Under Safe Mode
Deck has a Safe Mode where search_text is encrypted with AES-GCM. This means FTS5 indexes become useless — you can't full-text search ciphertext.
The search system degrades gracefully:
| Feature | Normal Mode | Safe Mode |
|---|---|---|
| FTS5 | Active | Disabled |
| Vector index | Active | Disabled (vecIndexEnabled = false) |
| Search method | Multi-layer pipeline | searchWithLike (decrypt → match in memory) |
| Scan limit | Unlimited | max(5000, limit × 200), capped at 20,000 |
| Text cache | N/A | searchTextCache — 300 decrypted entries |
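The Safe Mode scan limit is a one-liner worth making explicit. A sketch using the values from the table above (the function name is hypothetical):

```python
def safe_mode_scan_limit(limit: int) -> int:
    """max(5000, limit × 200), capped at 20,000: bound how many encrypted
    entries get decrypted and matched in memory per search."""
    return min(max(5000, limit * 200), 20_000)

print(safe_mode_scan_limit(10))   # 5000
print(safe_mode_scan_limit(150))  # 20000
```

The floor guarantees small result pages still scan a useful window of history; the cap keeps decryption cost bounded.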
Security and searchability are in tension. Deck makes the trade-off explicit: Safe Mode protects your data at the cost of search speed, but never at the cost of search availability.
Putting It All Together
Here's the full sequence when a user types screenshot /app:Xcode in the search bar:
```mermaid
sequenceDiagram
    participant U as User
    participant SB as Search Bar
    participant P as SearchRuleParser
    participant DB as SQLite (FTS5 + vec)
    participant RF as SearchRuleFilters
    participant SS as SearchService
    participant SEM as SemanticSearchService
    U->>SB: Type "screenshot /app:Xcode"
    SB->>SB: Replace "/app:Xcode" with PUA capsule
    SB->>P: Rich text with PUA tokens
    P->>P: Extract keyword="screenshot", filter={app: "Xcode"}
    P->>DB: FTS5 query: "screenshot"* (BM25 ranked)
    DB-->>P: Candidate rowids + scores
    P->>DB: sqlite-vec MATCH (embedding of "screenshot")
    DB-->>P: Vector candidates + distances
    P->>RF: Merge candidates → apply app=Xcode filter
    RF-->>SS: Filtered candidate set
    SS->>SS: mixed mode: exact → regex → fuzzy
    SS->>SEM: Semantic rerank on top results
    SEM-->>U: Final ranked results
```

The whole pipeline — FTS5, sqlite-vec, in-memory fuzzy matching, semantic reranking, plus the slash-rule filter — completes in under 100ms on a MacBook Air.
What We Learned
No single search technique covers every case. FTS5 handles most queries well, but it falls apart on typos and can't do semantic matching. Fuzzy search catches typos but gets noisy on long texts. Semantic search understands intent but costs memory and CPU. Stacking them together and letting each one cover the others' gaps turned out to be more practical than trying to build one perfect search algorithm.
CJK support was a decision we made early on. FTS5's default tokenizer splits on whitespace, which means Chinese, Japanese, and Korean text is effectively unsearchable. Trigram tokenization fixes this at the cost of ~3× index size — a trade-off we're comfortable with.
The resource management side took more iteration than expected. A clipboard manager runs all day, and embedding models are not small. We went through a few rounds of tuning before landing on the current pattern: lazy-load the model, cache up to 400 vectors, release the model after 30 seconds of idle, and block matrix multiplication into 256-row chunks to keep memory flat.
Everything described in this post runs locally on the user's Mac. No data leaves the device, no cloud API is called. That was a constraint from the start, not a feature we added later.
Deck is a macOS clipboard manager that keeps everything local. Learn more at deckclip.app.