Hybrid search that just works
One SQL query fuses FTS5 BM25, vector cosine similarity, and metadata scoring. Toggle the weights, switch to pure FTS or vector — same fast SQLite engine.
Index Local, S3, GCS, Azure, SharePoint, Drive, and Jira. Search them like a human. Hand them to your agent through a native Model Context Protocol server. Open source, MIT licensed, runs on your laptop.
Switch between Hybrid, FTS, and Vector to see how StrataFS fuses BM25, cosine similarity, and metadata into a single ranked list.
// RefreshToken validates the refresh JWT and issues a new access token.
func RefreshToken(ctx context.Context, raw string) (*Tokens, error) {
When the access token expires we exchange the long-lived refresh token at /v2/auth/refresh; see the sequence diagram below.
// JWT signing keys are rotated daily. Refresh tokens persist 30 days.
Canned demo: results are illustrative. Real StrataFS runs locally and indexes your own files.
Model Context Protocol server on port 8081, ready for Claude Desktop, ChatGPT plugins, and custom agents. No glue code, no API gateway, no SaaS bridge.
Local + S3 + GCS + Azure Blob + SharePoint + Google Drive + Jira: one factory pattern, one query interface, and a separate, isolated database per source.
StrataFS never modifies your files. It mounts every source read-only and keeps all index state in a parallel .stratafs/ directory. No surprise mutations.
SQLite with FTS5 + sqlite-vec. No Elasticsearch cluster, no Pinecone account, no separate vector store. The database is a file, on your disk, that you own.
Native installers for macOS, Linux, Windows. Wails desktop UI. Spotlight / Nautilus / Windows Search integration. FUSE mount exposes the index as a filesystem.
Point StrataFS at one or more sources — local directories, S3 buckets, GCS, Azure, SharePoint, Google Drive, or Jira projects. Read-only credentials only. Nothing leaves your machine.
fsnotify monitors local paths in real time. Cloud sources poll via delta APIs. Changes land in a SQLite-backed job queue with retry, priority, and recovery on restart.
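A SQLite-backed job queue with retry, priority, and restart recovery could look roughly like the sketch below. This is not StrataFS's actual schema: the table layout, the `MAX_ATTEMPTS` budget, the exponential backoff, and every function name are assumptions chosen for illustration.

```python
import sqlite3
import time

# Hypothetical schema: status is one of pending, running, done, failed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,
    priority INTEGER NOT NULL DEFAULT 0,
    status TEXT NOT NULL DEFAULT 'pending',
    attempts INTEGER NOT NULL DEFAULT 0,
    run_after REAL NOT NULL DEFAULT 0
)
"""

MAX_ATTEMPTS = 3  # assumed retry budget, not a documented StrataFS setting


def open_queue(db_path=":memory:"):
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    # Recovery on restart: jobs a crashed process left 'running' become 'pending' again.
    conn.execute("UPDATE jobs SET status = 'pending' WHERE status = 'running'")
    conn.commit()
    return conn


def enqueue(conn, path, priority=0):
    conn.execute("INSERT INTO jobs (path, priority) VALUES (?, ?)", (path, priority))
    conn.commit()


def claim(conn, now=None):
    """Claim the highest-priority job that is due, or return None."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT id, path FROM jobs WHERE status = 'pending' AND run_after <= ? "
        "ORDER BY priority DESC, id ASC LIMIT 1",
        (now,),
    ).fetchone()
    if row is None:
        return None
    conn.execute(
        "UPDATE jobs SET status = 'running', attempts = attempts + 1 WHERE id = ?",
        (row[0],),
    )
    conn.commit()
    return row


def finish(conn, job_id, ok, now=None):
    """Mark a job done, or reschedule it with exponential backoff until the budget runs out."""
    now = time.time() if now is None else now
    if ok:
        conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
    else:
        (attempts,) = conn.execute(
            "SELECT attempts FROM jobs WHERE id = ?", (job_id,)
        ).fetchone()
        if attempts >= MAX_ATTEMPTS:
            conn.execute("UPDATE jobs SET status = 'failed' WHERE id = ?", (job_id,))
        else:
            conn.execute(
                "UPDATE jobs SET status = 'pending', run_after = ? WHERE id = ?",
                (now + 2 ** attempts, job_id),
            )
    conn.commit()
```

Because the queue state lives in the database file rather than in process memory, a restart only needs the single `UPDATE` in `open_queue` to pick up where it left off.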
35+ parsers extract text from PDFs, DOCX, code, spreadsheets, markup, and config files. Streaming chunkers keep memory flat. FastEmbed + ONNX runs embeddings locally — no API calls.
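To illustrate how a streaming chunker can keep memory flat, here is a minimal sketch that reads a file-like object in fixed-size blocks and yields overlapping chunks as soon as they are complete. The character-based sizes, the overlap, and the name `stream_chunks` are assumptions; StrataFS's own chunkers may split on tokens or document structure instead.

```python
import io
from typing import Iterator, TextIO


def stream_chunks(reader: TextIO, chunk_chars: int = 1000, overlap: int = 100,
                  read_size: int = 4096) -> Iterator[str]:
    """Yield overlapping chunks; the buffer never exceeds chunk_chars + read_size characters."""
    if not 0 <= overlap < chunk_chars:
        raise ValueError("overlap must be in [0, chunk_chars)")
    buffer = ""
    carried = 0  # leading characters of `buffer` that were already emitted
    while True:
        block = reader.read(read_size)
        if not block:
            break
        buffer += block
        while len(buffer) >= chunk_chars:
            yield buffer[:chunk_chars]
            # Keep only the tail so neighbouring chunks share `overlap` characters.
            buffer = buffer[chunk_chars - overlap:]
            carried = overlap
    # Emit the remainder only if it holds text that no earlier chunk covered.
    if len(buffer) > carried:
        yield buffer


if __name__ == "__main__":
    sample = io.StringIO("abcdefghij")
    print(list(stream_chunks(sample, chunk_chars=4, overlap=1, read_size=3)))
    # prints ['abcd', 'defg', 'ghij']
```

Since the generator hands each chunk to the caller before reading further, a multi-gigabyte file costs no more memory than a small one.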
One SQL query fuses FTS5 BM25, vector cosine, and metadata into a single ranked list. Exposed via REST (port 8080) and a native MCP server (port 8081) for AI agents.
Read-only credentials. Per-source isolated SQLite databases. Real-time for local, polled with delta APIs for the cloud.
Local (Real-time): fsnotify-driven, instant updates when files change.
S3 (Polled): polling-based sync, IAM-friendly read-only credentials.
GCS (Polled): service account JSON, bucket-level scoping.
Azure Blob (Polled): account key or SAS, container-level scoping.
SharePoint (Polled): Microsoft Graph delta API, enterprise-ready.
Google Drive (Polled): OAuth2 + native Docs export.
Jira (Polled): issues, descriptions, and attachments via REST API.
Pick your package manager. stratafs config init writes a
sensible default config; stratafs serve brings up the REST
API on port 8080 and the MCP server on port 8081.
npm install -g stratafs
stratafs config init
stratafs serve

| Feature | StrataFS | Elasticsearch | Pinecone | grep / ripgrep |
|---|---|---|---|---|
| Self-hosted, MIT licensed | ✓ | ~ | — | ✓ |
| Runs entirely offline | ✓ | ✓ | — | ✓ |
| Full-text + vector in one query | ✓ | ~ | — | — |
| Indexes Local + S3 + GCS + Azure | ✓ | ~ | — | — |
| Native MCP server for AI agents | ✓ | — | — | — |
| Zero ops: SQLite on disk | ✓ | — | — | ✓ |
| Real-time file change indexing | ✓ | ~ | — | — |
| FUSE / Spotlight / Explorer integration | ✓ | — | — | — |
Comparison reflects features as of June 2026. See individual /compare pages for nuance.
Elasticsearch is the workhorse of full-text search. StrataFS is the new workhorse of embedded semantic search. Here's an honest comparison for developer-focused use cases.
Wire StrataFS into Claude Desktop in five minutes. Hybrid search across your code and docs becomes a tool Claude knows how to call. Step-by-step, with the config and the troubleshooting.
Indexing cloud storage used to mean pipelines, copies, and a separate search service. StrataFS reads buckets in place with read-only credentials and exposes a hybrid search across all of them.
Most 'AI search' tools want your data on their servers. We didn't want to send our files anywhere, so we built StrataFS. Here's the case for self-hosted semantic search.
StrataFS is an open-source semantic filesystem that indexes Local, S3, GCS, Azure, SharePoint, Google Drive, and Jira sources, runs hybrid full-text + vector search over them, and exposes the result as a Model Context Protocol server for AI agents.
StrataFS is for developers who want better-than-grep search over their codebase, AI engineers who need a Model Context Protocol-native retrieval layer, and enterprise teams who want semantic search across SharePoint, Drive, and Jira without a SaaS vendor in the data path.
StrataFS is MIT licensed. The full source is on GitHub at github.com/neul-labs/stratafs. No usage limits, no telemetry, no 'we may change this license later' clause.
StrataFS is at version 0.2.1 as of June 2026 and under active development. It is production-ready for single-user and small-team deployments; enterprise RBAC and at-rest encryption are being built for the next release.
Hybrid search fuses BM25 (full-text), vector cosine similarity (semantic), and metadata signals (recency, filename match, file-type) into one ranking. It catches the queries pure full-text misses (synonyms, intent) AND the queries pure vector misses (exact identifiers, error codes). StrataFS does this in a single SQL query.
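The arithmetic behind this kind of fusion can be sketched as a weighted sum of normalized signals. The version below is plain Python rather than the single SQL query StrataFS uses, and the weights, the min-max normalization, and the `Candidate` and `fuse` names are all illustrative assumptions, not the project's actual scoring formula.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    doc_id: str
    bm25: float      # full-text relevance, higher is better; SQLite FTS5's bm25()
                     # returns lower-is-better values, so negate those first
    cosine: float    # cosine similarity in [-1, 1]
    metadata: float  # assumed pre-combined recency / filename / file-type signal in [0, 1]


def min_max(values):
    """Rescale values to [0, 1]; a constant list carries no ranking signal, so it maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fuse(candidates, w_fts=0.5, w_vec=0.4, w_meta=0.1):
    """Return (doc_id, score) pairs sorted best-first. Default weights are hypothetical."""
    if not candidates:
        return []
    fts = min_max([c.bm25 for c in candidates])
    vec = [(c.cosine + 1.0) / 2.0 for c in candidates]  # map [-1, 1] onto [0, 1]
    scored = [
        (c.doc_id, w_fts * f + w_vec * v + w_meta * c.metadata)
        for c, f, v in zip(candidates, fts, vec)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


docs = [
    Candidate("auth.go", bm25=12.0, cosine=0.10, metadata=0.5),    # exact identifier hit
    Candidate("design.md", bm25=0.0, cosine=0.90, metadata=0.5),   # semantic hit only
    Candidate("notes.txt", bm25=3.0, cosine=0.20, metadata=0.0),   # weak on both
]
```

With the default weights `fuse(docs)` ranks `auth.go` first, while `fuse(docs, w_fts=0, w_vec=1, w_meta=0)` puts `design.md` on top, which is the difference between hybrid and pure vector ranking in miniature.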
Yes, you can run full-text or vector search on its own. The 'mode' parameter on /search switches between 'hybrid' (default), 'fts' (BM25 only), and 'vector' (cosine only). Use 'fts' when you know the exact phrase and 'vector' when you want semantic exploration.
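A small client for switching modes might look like the sketch below. Only the /search path, the 'mode' values, and port 8080 come from the text above; the `q` parameter name, the use of GET, the localhost host, and the JSON response are assumptions, so check the API reference before relying on them.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # REST port as documented; the host is an assumption
VALID_MODES = ("hybrid", "fts", "vector")


def build_search_url(query, mode="hybrid", base_url=BASE_URL):
    """Build a /search URL for the given mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # 'q' is a hypothetical parameter name; only 'mode' is named in the docs.
    params = urllib.parse.urlencode({"q": query, "mode": mode})
    return f"{base_url}/search?{params}"


def search(query, mode="hybrid", timeout=5.0):
    """Send the request and decode the body as JSON (assumed response format)."""
    url = build_search_url(query, mode)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `build_search_url("refresh token", "fts")` targets BM25-only search, and dropping the second argument falls back to the hybrid default.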
MIT-licensed. Multi-storage. Native MCP server for AI agents. Local, S3, GCS, Azure, SharePoint, Drive and Jira.