Can I migrate from Elasticsearch to StrataFS?

If your Elasticsearch index is the source of truth, no — StrataFS reads from the original files, not from an existing index. The migration story is: point StrataFS at the same files Elasticsearch was indexing, run them in parallel for a week, decide which results you prefer.
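One way to make "decide which results you prefer" less subjective is to log the top results from both engines for the same queries and look at where they disagree. The sketch below is a minimal, engine-agnostic illustration: it assumes you have already collected result IDs (file paths, chunk IDs, whatever both systems can agree on) from each side. The function names and the choice of Jaccard overlap are ours, not part of either tool.

```python
from __future__ import annotations

from typing import Sequence


def overlap_at_k(a: Sequence[str], b: Sequence[str], k: int = 10) -> float:
    """Jaccard overlap between the top-k result IDs of two engines."""
    top_a, top_b = set(a[:k]), set(b[:k])
    if not top_a and not top_b:
        return 1.0  # both engines returned nothing: treat as agreement
    return len(top_a & top_b) / len(top_a | top_b)


def summarize(
    runs: dict[str, tuple[Sequence[str], Sequence[str]]], k: int = 10
) -> list[tuple[str, float]]:
    """Return (query, overlap) pairs, lowest overlap first.

    `runs` maps each query string to (engine A result IDs, engine B result IDs).
    """
    rows = [(query, overlap_at_k(a, b, k)) for query, (a, b) in runs.items()]
    return sorted(rows, key=lambda row: row[1])
```

The queries at the top of the `summarize` output are the ones worth reading by hand, since that is where the two engines disagree most.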

Is StrataFS faster than Elasticsearch?

On the corpora StrataFS is designed for (under ~50M chunks per source) and the query patterns it optimizes for (hybrid FTS + vector), warm latency is comparable: both land in the 50–150 ms range. Elasticsearch scales further; StrataFS starts faster and needs no cluster operations.

Why would I pick Elasticsearch in 2026?

Existing investment, cross-team shared infrastructure, billion-scale indexes, or a workflow centered on Kibana for analytics. Elasticsearch is mature and well-supported. The case for StrataFS is operational simplicity for per-user/per-team semantic search.

Comparison

StrataFS vs. Elasticsearch for semantic code search — when each makes sense

Elasticsearch is the workhorse of full-text search. StrataFS is the new workhorse of embedded semantic search. Here's an honest comparison for developer-focused use cases.

By Dipankar Sarkar · June 2, 2026 · 12 min read

Tags: elasticsearch, comparison, semantic-search, codebase, developer-tools

The question we get most often, in some form: “We already run Elasticsearch. Why would we use StrataFS instead of bolting vector search onto Elastic?”

This is the long version of the answer. Elasticsearch is excellent. StrataFS solves a different shape of problem. Knowing which problem you have is most of the work.

The honest summary

Dimension	StrataFS	Elasticsearch
Deployment model	Single binary, SQLite on disk	JVM cluster (1–N nodes)
Operational surface	None — it’s a file	Cluster ops, JVM tuning, shards
Cold start time	~1 second	30–60 seconds (node + warm-up)
Warm hybrid query (10k chunks)	< 100 ms	< 100 ms
Warm hybrid query (10M chunks)	200–400 ms (one source per DB)	100–200 ms
Warm hybrid query (1B chunks)	Not supported in one query	100–300 ms (sharded)
Multi-tenant in one cluster	One DB per source	Native
MCP server for AI agents	Built in	Plugin/middleware
Native embedding generation	Built in (local ONNX)	External (or paid subscription for ELSER)
License	MIT	AGPLv3, SSPL, or Elastic License v2
Cost at 10M chunks	$0 + your laptop	$$ (cluster + ops)
Cost at 1B chunks	Not the right tool	$$$ but feasible

Read this table as: “the costs are different at every scale, and the question is which scale you’re at.”

Where StrataFS wins

Per-user / per-developer / per-team deployment. When the consumer of search is one person or one team, StrataFS’s single-binary, zero-ops model is qualitatively better. You install it like a CLI. No cluster.

Agent-side retrieval. Putting a search engine on each developer’s laptop, indexing their code, and exposing it via MCP is not what Elasticsearch is built for, and the friction shows. StrataFS is built for it.

Hybrid FTS + vector in one query, with low effort. Elasticsearch does both individually, and combining them is a tractable engineering project. StrataFS does it out of the box and the query is one SQL statement. The default just works.

Multi-storage indexing. StrataFS reads from Local + S3 + GCS + Azure + SharePoint + Drive + Jira in a unified way. Elasticsearch needs an ingest pipeline per source.

Operational simplicity. No JVM. No GC pauses. No cluster split-brain. The “ops handbook” for StrataFS is “rsync the SQLite files for backup; upgrade the binary in place”.
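A note on the backup half of that handbook: copying a SQLite file while something is writing to it can capture a half-finished transaction. If the index might be live during the backup, SQLite's online backup API is a safer alternative to a raw file copy. The sketch below uses Python's standard `sqlite3` module and assumes only that the index is an ordinary SQLite file; the file names and the `chunks` table are hypothetical, and this is not a documented StrataFS procedure.

```python
import sqlite3
import tempfile
from pathlib import Path


def snapshot(src: Path, dest: Path) -> None:
    """Copy a SQLite database to dest via SQLite's online backup API."""
    source = sqlite3.connect(src)
    target = sqlite3.connect(dest)
    try:
        # Yields a transactionally consistent copy, unlike a raw file copy
        # taken in the middle of a write.
        source.backup(target)
    finally:
        target.close()
        source.close()


# Demo on a throwaway database; "index.db" is a hypothetical file name.
with tempfile.TemporaryDirectory() as tmp:
    live = Path(tmp) / "index.db"
    copy = Path(tmp) / "index.backup.db"

    conn = sqlite3.connect(live)
    conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO chunks (body) VALUES ('hello')")
    conn.commit()
    conn.close()

    snapshot(live, copy)

    check = sqlite3.connect(copy)
    restored_rows = check.execute("SELECT body FROM chunks").fetchall()
    check.close()
```

Once the snapshot exists, rsync of the copy is safe, because nothing is writing to it.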

Where Elasticsearch wins

Billion-scale corpora. Elasticsearch shards across nodes. StrataFS doesn’t. At >50M chunks per logical scope, Elastic is the right tool.

Shared multi-tenant infrastructure. A central team running Elasticsearch for many product teams gets economies of scale that per-developer StrataFS instances don’t.

Analytics and aggregations. Elasticsearch’s analytics features (date histograms, geo aggregations, Kibana dashboards) are best-in-class. StrataFS is a search engine, not an analytics platform.

Mature ecosystem. Logstash, Beats, Kibana, the whole ELK story. StrataFS is young and doesn’t try to be all of those things.

Existing institutional knowledge. If your platform team already runs Elastic and is good at it, the marginal cost of adding vector search to Elastic is lower than introducing a new tool.

The hybrid-search story, specifically

Elasticsearch supports both full-text (default) and dense vector search (since 8.0). You can combine them with RRF or a custom script_score. Done well, it works. Done quickly with defaults, it’s just OK.
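For reference, this is roughly what the RRF combination looks like on the Elastic side. The sketch builds a `_search` request body using the retriever syntax as we understand it from recent 8.x releases; check it against Elastic's documentation for your version before relying on it. The field names `content` and `embedding` and all the default values are placeholders, not recommendations.

```python
from __future__ import annotations

from typing import Any


def hybrid_rrf_body(
    query_text: str,
    query_vector: list[float],
    *,
    text_field: str = "content",       # hypothetical field name
    vector_field: str = "embedding",   # hypothetical dense_vector field
    k: int = 10,
    num_candidates: int = 100,
    rank_window_size: int = 50,
    rank_constant: int = 60,
) -> dict[str, Any]:
    """Build a hybrid search body: full-text and kNN results fused with RRF."""
    return {
        "retriever": {
            "rrf": {
                "retrievers": [
                    # Lexical side: a plain match query.
                    {"standard": {"query": {"match": {text_field: query_text}}}},
                    # Vector side: approximate kNN over the dense_vector field.
                    {
                        "knn": {
                            "field": vector_field,
                            "query_vector": query_vector,
                            "k": k,
                            "num_candidates": num_candidates,
                        }
                    },
                ],
                "rank_window_size": rank_window_size,
                "rank_constant": rank_constant,
            }
        }
    }
```

The resulting dict is sent as the JSON body of a `_search` request. Note that the query vector has to come from somewhere: this body assumes you have already embedded the query text yourself.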

Three friction points show up:

Vector indexing cost. Elasticsearch’s HNSW build is good but ingest-heavy; you’ll want to provision specifically for it. SQLite + sqlite-vec builds the HNSW in milliseconds per chunk.
Rank fusion is application-level. Elastic’s RRF is a function, but tuning weighted fusion with metadata signals requires script_score queries. StrataFS does this in plain SQL.
You bring your own embedder. Elastic has ELSER (their proprietary sparse encoder) and supports calling external embedders. StrataFS runs ONNX in-process. The difference is zero API calls and zero rate-limits.
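To make the "plain SQL" point concrete, here is a minimal sketch of weighted reciprocal rank fusion with one metadata signal, written as a single SQLite statement and run through Python's standard `sqlite3` module. Everything about the schema is an assumption for illustration: the `fts_hits`, `vec_hits`, and `chunks` tables, their columns, and the `is_test` penalty are ours, and this is not StrataFS's actual schema or query. In a real setup the two hit tables would be the output of the keyword and vector searches.

```python
import sqlite3

# Weighted RRF: score = w_fts / (k + rank_fts) + w_vec / (k + rank_vec),
# minus a penalty for chunks flagged as test code (hypothetical metadata).
# Uses window functions, so it needs SQLite 3.25 or newer.
FUSION_SQL = """
WITH fts AS (
    SELECT chunk_id,
           ROW_NUMBER() OVER (ORDER BY keyword_score DESC) AS rnk
    FROM fts_hits
),
vec AS (
    SELECT chunk_id,
           ROW_NUMBER() OVER (ORDER BY distance ASC) AS rnk
    FROM vec_hits
),
ids AS (
    SELECT chunk_id FROM fts
    UNION
    SELECT chunk_id FROM vec
)
SELECT c.chunk_id,
       COALESCE(:w_fts / (:k + fts.rnk), 0.0)
     + COALESCE(:w_vec / (:k + vec.rnk), 0.0)
     - CASE WHEN c.is_test = 1 THEN :test_penalty ELSE 0.0 END AS score
FROM ids
JOIN chunks AS c ON c.chunk_id = ids.chunk_id
LEFT JOIN fts ON fts.chunk_id = ids.chunk_id
LEFT JOIN vec ON vec.chunk_id = ids.chunk_id
ORDER BY score DESC, c.chunk_id
LIMIT :limit
"""


def fuse(conn, w_fts=1.0, w_vec=1.0, k=60, test_penalty=0.0, limit=10):
    """Return chunk IDs ordered by fused score, best first."""
    params = {
        "w_fts": float(w_fts),  # floats avoid SQLite integer division
        "w_vec": float(w_vec),
        "k": k,
        "test_penalty": float(test_penalty),
        "limit": limit,
    }
    return [row[0] for row in conn.execute(FUSION_SQL, params)]


def demo_db():
    """In-memory database with a few hypothetical hits."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE chunks (chunk_id INTEGER PRIMARY KEY, path TEXT, is_test INTEGER);
        CREATE TABLE fts_hits (chunk_id INTEGER, keyword_score REAL);  -- higher is better
        CREATE TABLE vec_hits (chunk_id INTEGER, distance REAL);       -- lower is better
        INSERT INTO chunks VALUES (1, 'auth.py', 0), (2, 'auth_test.py', 1), (3, 'db.py', 0);
        INSERT INTO fts_hits VALUES (2, 9.0), (1, 7.0);
        INSERT INTO vec_hits VALUES (1, 0.10), (3, 0.20), (2, 0.30);
        """
    )
    return conn
```

With no penalty, `fuse(demo_db())` ranks the test file second because it is the top keyword hit; with `test_penalty=0.02` it drops to last. Changing the ranking policy is an edit to the SQL, which is the point being made above.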

For a new hybrid-search project, the StrataFS path is shorter. For extending an existing Elastic deployment, the Elastic path is shorter.

The case for running both

In some teams, the right answer is “both, for different things”:

Elasticsearch for shared corpora — central docs, cross-team logs, customer-data search. Run by the platform team. Scaled, observed, integrated.
StrataFS for personal/team semantic search — each developer’s codebase, an AI agent’s knowledge tool, departmental Drive folders. Run by the user. Zero ops.

These don’t compete. They cover different points on the developer-experience spectrum.

Concrete decision tree

If you can answer “yes” to most of these, StrataFS is the right tool:

The consumer of search is one person or one small team.
The corpus is under 50M chunks per logical scope.
You want hybrid FTS + vector in one query with zero ops.
You want an MCP server for AI agents without writing one.
You’d rather upgrade a binary than run a JVM cluster.

If you can answer “yes” to most of these, Elasticsearch is the right tool:

The corpus is billion-scale or growing toward it.
Search is a shared service across many internal customers.
You need Kibana / analytics / aggregations alongside search.
You already operate Elastic well and don’t want to introduce a new tool.
You need SLAs for sustained query throughput in the millions of QPS.

If you’re somewhere in the middle, install StrataFS for a week. Five minutes to install, an hour to point at your corpus, a week of natural usage to know whether it’s enough.
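If it helps to see the two checklists as a procedure, they amount to a simple tally. The sketch below reads "most of these" as a strict majority of yes answers, which is our interpretation rather than a rule stated above, and treats every other outcome (both lists mostly yes, or neither) as the "somewhere in the middle" case.

```python
from typing import Sequence


def mostly_yes(answers: Sequence[bool]) -> bool:
    """True when a strict majority of the answers are 'yes'."""
    return sum(answers) * 2 > len(answers)


def recommend(stratafs_answers: Sequence[bool], elastic_answers: Sequence[bool]) -> str:
    """Map yes/no answers for the two checklists to a recommendation."""
    strata = mostly_yes(stratafs_answers)
    elastic = mostly_yes(elastic_answers)
    if strata and not elastic:
        return "StrataFS"
    if elastic and not strata:
        return "Elasticsearch"
    # Both lists mostly yes, or neither: the checklists do not decide it.
    return "undecided: trial StrataFS for a week"
```

For example, four yes answers on the first list and none on the second gives `"StrataFS"`.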

A pattern that often works

Start with StrataFS for the per-developer use case (codebase + personal docs + agent retrieval). Keep Elasticsearch (or whatever you have) for the shared service. If StrataFS naturally absorbs enough of the shared use case over time (many teams find it does), consolidate. If it doesn’t, you still have a per-developer tool that never added to your Elasticsearch cluster bill.

This is the inverse of the usual “let’s run one search service for everyone” pattern. It works because per-developer search is a different shape of problem than enterprise search, and forcing both into one architecture is what made enterprise search expensive in the first place.

Reading list

The Architecture page covers StrataFS’s pipeline in detail.
Hybrid search deep-dive explains the ranking math.
SQLite as a search engine covers the storage layer.
Elastic’s vector search docs are the canonical reference for their side.

Both tools are good. They solve different shapes of problem. Pick the tool whose shape matches yours.