Hybrid search that fuses BM25, vectors, and metadata — in one SQL query.
Most "AI search" tools force you to pick: keyword fast-and-precise, or vector flexible-and-fuzzy. StrataFS runs them together and lets you tune the weights.
- auth/middleware/refresh.go (local · code, score 0.86)
  // RefreshToken validates the refresh JWT and issues a new access token.
  func RefreshToken(ctx context.Context, raw string) (*Tokens, error) {
  Matches "JWT refresh" semantically + filename contains "refresh".
- docs/runbooks/auth-rotation.md (s3://acme-docs, score 0.78)
  When the access token expires we exchange the long-lived refresh token at /v2/auth/refresh (see the sequence diagram below).
  FTS hit on "refresh token" + semantic match on the rotation flow.
- internal/session/jwt.go (local · code, score 0.69)
  // JWT signing keys are rotated daily. Refresh tokens persist 30 days.
  Cross-file context: JWT + refresh policy.
Canned demo — results are illustrative. Real StrataFS runs locally and indexes your own files.
The three signals
Every result has three scores. The total ranking is a weighted sum; defaults are tuned for code+docs corpora, but you can override per query.
- FTS5 BM25: SQLite's built-in full-text index. Precise for keyword queries; tokenization is Unicode-aware. Filename and path tokens get a small boost.
- Vector cosine similarity: embeddings from a local ONNX model (BGE Base EN v1.5 by default). Captures intent and synonyms. Stored in sqlite-vec with an HNSW index for sub-linear retrieval.
- Metadata scoring: recency (modified time), filename match, content-type bonus (e.g. .md > .log), and source priority.
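
As a rough illustration of the weighted sum described above, here is a minimal Python sketch. The weight values and the function name are hypothetical, chosen only for the example; the actual StrataFS defaults are not stated here. A signal that is missing for a result is treated as contributing zero.

```python
from typing import Dict, Optional

# Hypothetical weights for illustration only; not the real StrataFS defaults.
EXAMPLE_WEIGHTS: Dict[str, float] = {"fts": 0.4, "vec": 0.4, "meta": 0.2}


def total_score(
    fts: Optional[float],
    vec: Optional[float],
    meta: Optional[float],
    weights: Dict[str, float] = EXAMPLE_WEIGHTS,
) -> float:
    """Weighted sum of the three signals; a missing signal (None) counts as 0."""
    return (
        (fts or 0.0) * weights["fts"]
        + (vec or 0.0) * weights["vec"]
        + (meta or 0.0) * weights["meta"]
    )
```

Passing a different `weights` dict corresponds to overriding the weights for a single query.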
The query, in one CTE
Conceptually, the hybrid query looks like this:
WITH
  fts AS (
    -- FTS5 bm25() returns lower (more negative) values for better matches,
    -- so negate it to make higher mean better, like the other signals.
    SELECT chunk_id, -bm25(file_chunks_fts) AS score
    FROM file_chunks_fts WHERE file_chunks_fts MATCH ?
  ),
  vec AS (
    -- Nearest-neighbour lookup; similarity = 1 - cosine distance.
    SELECT chunk_id, 1.0 - distance AS score
    FROM file_chunks_vec
    WHERE embedding MATCH ? AND k = 50
  ),
  meta AS (
    SELECT id AS chunk_id,
           recency_score(updated_at) * 0.4 +
           filename_match(?, path) * 0.4 +
           filetype_bonus(content_type) * 0.2 AS score
    FROM file_chunks
  )
SELECT c.*, w.total
FROM file_chunks c
JOIN (
  SELECT chunk_id,
         COALESCE(fts.score, 0) * :w_fts +
         COALESCE(vec.score, 0) * :w_vec +
         COALESCE(meta.score, 0) * :w_meta AS total
  FROM fts FULL OUTER JOIN vec USING (chunk_id)
           FULL OUTER JOIN meta USING (chunk_id)
) w ON w.chunk_id = c.id  -- the subquery exposes chunk_id, not id
ORDER BY w.total DESC
LIMIT :limit;

Three modes, one flag
The mode parameter clamps the weights:
- mode=hybrid (default): all three signals contribute. Best for natural-language queries.
- mode=fts: pure BM25. Use when you know the exact phrase or identifier (e.g. "OAuth2RefreshGrant").
- mode=vector: pure semantic. Use when the answer might live in a file whose vocabulary differs from yours.
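
One plausible reading of "clamps the weights" is sketched below in Python: the single-signal modes set every other weight to zero, and hybrid leaves the weights untouched. The function name and the weight keys are assumptions made for this example, not the StrataFS API.

```python
from typing import Dict


def weights_for_mode(mode: str, base: Dict[str, float]) -> Dict[str, float]:
    """Return the effective signal weights for a mode (illustrative only)."""
    if mode == "hybrid":
        # All three signals contribute with the caller's weights.
        return dict(base)
    if mode == "fts":
        # Pure BM25: the vector and metadata signals are switched off.
        return {"fts": 1.0, "vec": 0.0, "meta": 0.0}
    if mode == "vector":
        # Pure semantic: the BM25 and metadata signals are switched off.
        return {"fts": 0.0, "vec": 1.0, "meta": 0.0}
    raise ValueError(f"unknown mode: {mode!r}")
```

Whether the real fts and vector modes also zero the metadata weight is an assumption here; "pure" is taken literally.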
Filters
Filters apply before ranking, in the SQL WHERE clause, so they're as fast as a regular SQLite query:
- source: local, s3://bucket, jira://infra, ...
- path glob: "src/**/*.go", "docs/**/*.md"
- content_type: code, markdown, pdf, spreadsheet
- since: "2025-01-01" or relative "7d"
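
To make the "filters become a WHERE clause" idea concrete, here is a small Python sketch that turns a filter dict into a parameterized SQL fragment. It is an illustration under several assumptions: the `source` column name, the ISO-8601 text format of `updated_at`, and the helper names are all hypothetical. SQLite's GLOB operator is used for the path filter, and it only approximates `**` glob semantics.

```python
import re
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def resolve_since(value: str, now: datetime) -> str:
    """Turn "7d" or "2025-01-01" into an ISO date string."""
    relative = re.fullmatch(r"(\d+)d", value)
    if relative:
        return (now - timedelta(days=int(relative.group(1)))).date().isoformat()
    return datetime.strptime(value, "%Y-%m-%d").date().isoformat()


def build_where(filters: Dict[str, str], now: datetime) -> Tuple[str, List[str]]:
    """Build a WHERE fragment plus bound parameters (column names are assumed)."""
    clauses: List[str] = []
    params: List[str] = []
    if "source" in filters:
        clauses.append("source = ?")  # hypothetical column
        params.append(filters["source"])
    if "path" in filters:
        clauses.append("path GLOB ?")  # approximates ** semantics
        params.append(filters["path"])
    if "content_type" in filters:
        clauses.append("content_type = ?")
        params.append(filters["content_type"])
    if "since" in filters:
        clauses.append("updated_at >= ?")  # assumes ISO-8601 text timestamps
        params.append(resolve_since(filters["since"], now))
    # "1" is always true in SQLite, so an empty filter set matches everything.
    return (" AND ".join(clauses) or "1", params)
```

Binding values as parameters rather than formatting them into the SQL string keeps user-supplied filter values out of the query text.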
Performance
On a 2024 MacBook Air, indexing throughput is 50–100 files per second and warm search latency is under 100 ms on a 10 000-file corpus. At 500 000 chunks, hybrid queries land in the 100–300 ms range. Cold queries (first search after process start) take roughly one second while the embedding model loads into memory.
Why SQLite? Local file, no daemon, ACID, sub-millisecond reads,
FTS5 built in, sqlite-vec for embeddings. The "embedded
search server" used to be a contradiction. It isn't anymore.
Query it from anywhere
The same search engine is exposed three ways:
- CLI: stratafs search "where do we handle JWT refresh" --mode hybrid --limit 5
- REST: GET /search?q=...&mode=hybrid&source=docs&limit=10
- MCP: tools/call with name=stratafs.search, arguments identical to the REST shape. See the MCP page.
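
For the REST shape, a minimal Python sketch of building the request URL might look like the following. The base URL (host and port) and the function name are assumptions for illustration; only the /search path and the q, mode, source, and limit parameters come from the text above.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical address; use whatever host and port your server actually listens on.
BASE_URL = "http://localhost:8080"


def search_url(
    q: str,
    mode: str = "hybrid",
    source: Optional[str] = None,
    limit: int = 10,
    base_url: str = BASE_URL,
) -> str:
    """Build a GET /search URL with properly escaped query parameters."""
    params = {"q": q, "mode": mode, "limit": limit}
    if source is not None:
        params["source"] = source
    return f"{base_url}/search?{urlencode(params)}"
```

The resulting string can be passed to any HTTP client, for example `urllib.request.urlopen`.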
Try the hybrid query on your files.
Pick a directory, run stratafs serve, hit /search. Under a minute.