v0.2.1 · MIT licensed · 35+ file types

A semantic filesystem for the AI era.

Index local directories, S3, GCS, Azure, SharePoint, Google Drive, and Jira. Search them like a human. Hand them to your agent through a native Model Context Protocol server. Open source, MIT licensed, runs on your laptop.

Live demo

One query. Three ranking modes. Same SQLite engine.

Switch between Hybrid, FTS, and Vector to see how StrataFS fuses BM25, cosine similarity, and metadata into a single ranked list.

  1. auth/middleware/refresh.go · local · code · 0.86
    // RefreshToken validates the refresh JWT and issues a new access token.
    func RefreshToken(ctx context.Context, raw string) (*Tokens, error) {
    Matches "JWT refresh" semantically + filename contains "refresh".
  2. docs/runbooks/auth-rotation.md · s3://acme-docs · 0.78
    When the access token expires we exchange the long-lived refresh token at /v2/auth/refresh — see the sequence diagram below.
    FTS hit on "refresh token" + semantic match on the rotation flow.
  3. internal/session/jwt.go · local · code · 0.69
    // JWT signing keys are rotated daily. Refresh tokens persist 30 days.
    Cross-file context: JWT + refresh policy.

Canned demo — results are illustrative. Real StrataFS runs locally and indexes your own files.

What it does

A storage layer that understands what's inside your files.

Native MCP server

Model Context Protocol on port 8081, ready for Claude Desktop, ChatGPT plugins, and custom agents. No glue code, no API gateway, no SaaS bridge.
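For a rough picture of what an agent sends to an MCP server, the sketch below builds a JSON-RPC 2.0 `tools/call` request, which is the message shape MCP uses to invoke a tool. The tool name `search` and its `query` and `mode` arguments are assumptions for illustration; the tools StrataFS actually exposes are not listed on this page.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool name and arguments, for illustration only.
payload = build_tool_call(1, "search", {"query": "JWT refresh", "mode": "hybrid"})
print(payload)
```

In practice an MCP client such as Claude Desktop builds these messages for you; the sketch only shows what travels over the wire.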

Seven storage backends

Local + S3 + GCS + Azure Blob + SharePoint + Google Drive + Jira — one factory pattern, one query interface, separate isolated databases per source.

Read-only by design

StrataFS never modifies your files. It mounts every source read-only and keeps all index state in a parallel .stratafs/ directory, so there are no surprise mutations.

Embedded, not embedded-into

SQLite with FTS5 + sqlite-vec. No Elasticsearch cluster, no Pinecone account, no separate vector store. The database is a file, on your disk, that you own.

Cross-platform desktop & CLI

Native installers for macOS, Linux, Windows. Wails desktop UI. Spotlight / Nautilus / Windows Search integration. FUSE mount exposes the index as a filesystem.

How it works

A pipeline from your storage to your search bar — and your agent's context window.

  1. Connect storage

    Point StrataFS at one or more sources — local directories, S3 buckets, GCS, Azure, SharePoint, Google Drive, or Jira projects. Read-only credentials only. Nothing leaves your machine.

  2. Watch and queue

    fsnotify monitors local paths in real time. Cloud sources poll via delta APIs. Changes land in a SQLite-backed job queue with retry, priority, and recovery on restart.

  3. Parse, chunk, embed

    35+ parsers extract text from PDFs, DOCX, code, spreadsheets, markup, and config files. Streaming chunkers keep memory flat. FastEmbed + ONNX runs embeddings locally — no API calls.

  4. Search & serve

    One SQL query fuses FTS5 BM25, vector cosine, and metadata into a single ranked list. Exposed via REST (port 8080) and a native MCP server (port 8081) for AI agents.
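The job queue in step 2 can be pictured with a minimal sketch like the one below: a single SQLite table that supports priority ordering, bounded retries, and recovery of interrupted jobs on restart. The table layout, state names, and the retry limit of 3 are assumptions chosen for illustration, not StrataFS's actual schema.

```python
import sqlite3

# Hypothetical schema; state is one of: queued, running, done, failed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id        INTEGER PRIMARY KEY,
    path      TEXT NOT NULL,
    priority  INTEGER NOT NULL DEFAULT 0,
    attempts  INTEGER NOT NULL DEFAULT 0,
    state     TEXT NOT NULL DEFAULT 'queued'
)
"""

MAX_ATTEMPTS = 3  # assumed retry limit


def open_queue(db_path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    # Recovery on restart: jobs a crashed process left 'running' are requeued.
    conn.execute("UPDATE jobs SET state = 'queued' WHERE state = 'running'")
    conn.commit()
    return conn


def enqueue(conn, path, priority=0):
    conn.execute("INSERT INTO jobs (path, priority) VALUES (?, ?)", (path, priority))
    conn.commit()


def claim(conn):
    """Take the highest-priority queued job, oldest first; None if the queue is empty."""
    row = conn.execute(
        "SELECT id, path FROM jobs WHERE state = 'queued' "
        "ORDER BY priority DESC, id ASC LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute(
        "UPDATE jobs SET state = 'running', attempts = attempts + 1 WHERE id = ?",
        (row[0],),
    )
    conn.commit()
    return row


def finish(conn, job_id, ok):
    """Mark a job done on success; on failure requeue it until MAX_ATTEMPTS is reached."""
    if ok:
        new_state = "done"
    else:
        attempts = conn.execute(
            "SELECT attempts FROM jobs WHERE id = ?", (job_id,)
        ).fetchone()[0]
        new_state = "queued" if attempts < MAX_ATTEMPTS else "failed"
    conn.execute("UPDATE jobs SET state = ? WHERE id = ?", (new_state, job_id))
    conn.commit()
```

Because the queue lives in a database file rather than in memory, a worker that dies mid-job loses nothing: the next `open_queue` call puts the interrupted job back in line.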

Storage backends

Seven backends. One query interface.

Read-only credentials. Per-source isolated SQLite databases. Real-time for local, polled with delta APIs for the cloud.
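One way to picture "one factory, one interface, one database per source" is the sketch below. It maps a source URI to a backend name and derives a separate index file for that source under .stratafs/. The URI schemes, backend names, and file-naming rule are all hypothetical; the page does not document how StrataFS names things internally.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Source:
    kind: str      # backend name, e.g. "s3"
    location: str  # bucket, directory, or project identifier
    index_db: str  # path of this source's own SQLite file


# Hypothetical scheme-to-backend mapping; real identifiers may differ.
BACKENDS = {
    "file": "local",
    "s3": "s3",
    "gs": "gcs",
    "azure": "azure",
    "sharepoint": "sharepoint",
    "gdrive": "drive",
    "jira": "jira",
}


def open_source(uri: str) -> Source:
    """Factory: pick a backend from the URI scheme and give it an isolated index file."""
    parsed = urlparse(uri)
    scheme = parsed.scheme or "file"  # bare POSIX paths are treated as local directories
    if scheme not in BACKENDS:
        raise ValueError(f"unsupported storage scheme: {scheme!r}")
    kind = BACKENDS[scheme]
    location = (parsed.netloc + parsed.path).strip("/")
    slug = location.replace("/", "_") or "root"
    return Source(kind, location, f".stratafs/{kind}-{slug}.db")


for uri in ("/home/me/code", "s3://acme-docs", "jira://ACME"):
    print(open_source(uri))
```

Keeping each source in its own database file means one source can be dropped or re-indexed without touching the others.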

Read the storage docs →

Get started

Install in 30 seconds. Index in minutes.

Pick your package manager. stratafs config init writes a sensible default config; stratafs serve brings up the REST API on port 8080 and the MCP server on port 8081.

All install methods →

    npm install -g stratafs
    stratafs config init
    stratafs serve

macOS · Linux · Windows

How it compares

StrataFS vs. the alternatives.

Feature StrataFS Elasticsearch Pinecone grep / ripgrep
Self-hosted, MIT licensed
Runs entirely offline
Full-text + vector in one query ~
Indexes Local + S3 + GCS + Azure ~
Native MCP server for AI agents
Zero ops: SQLite on disk
Real-time file change indexing ~
FUSE / Spotlight / Explorer integration

Comparison reflects features as of June 2026. See individual /compare pages for nuance.

Latest writing

Field notes from semantic-search land.

Tutorial · 11 min read

Index S3, GCS, and Azure Blob buckets with semantic search

Indexing cloud storage used to mean pipelines, copies, and a separate search service. StrataFS reads buckets in place with read-only credentials and exposes a hybrid search across all of them.

s3 · gcs · azure · cloud-storage

Read all articles →

Questions

Frequently asked

What is StrataFS, in one sentence?

StrataFS is an open-source semantic filesystem that indexes Local, S3, GCS, Azure, SharePoint, Google Drive, and Jira sources, runs hybrid full-text + vector search over them, and exposes the result as a Model Context Protocol server for AI agents.

Who is StrataFS for?

Developers who want better-than-grep search over their codebase, AI engineers who need a Model Context Protocol-native retrieval layer, and enterprise teams who want semantic search across SharePoint, Drive, and Jira without a SaaS vendor in the data path.

What's the license?

MIT. The full source is on GitHub at github.com/neul-labs/stratafs. No usage limits, no telemetry, no 'we may change this license later' clause.

How mature is the project?

Version 0.2.1 as of June 2026, under active development. It is production-ready for single-user and small-team deployments; enterprise RBAC and at-rest encryption are in progress for the next release.

What's hybrid search and why does it matter?

Hybrid search fuses BM25 (full-text), vector cosine similarity (semantic), and metadata signals (recency, filename match, file-type) into one ranking. It catches the queries pure full-text misses (synonyms, intent) AND the queries pure vector misses (exact identifiers, error codes). StrataFS does this in a single SQL query.
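To make the fusion idea concrete, here is a minimal sketch in plain Python rather than SQL. It combines a normalized full-text score, cosine similarity, and a filename-match boost into one weighted score. The weights (0.5, 0.4, 0.1) and the normalization are assumptions chosen for illustration; StrataFS's actual formula is not given on this page.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(candidates, query_vec, query_terms, w_fts=0.5, w_vec=0.4, w_meta=0.1):
    """Rank candidates by a weighted sum of full-text, vector, and metadata signals.

    Each candidate is a dict with 'path', 'bm25', and 'vec'. 'bm25' is assumed to be
    higher-is-better here; SQLite FTS5's bm25() returns lower values for better
    matches, so a real pipeline would flip the sign first.
    """
    max_bm25 = max((c["bm25"] for c in candidates), default=0.0) or 1.0
    ranked = []
    for c in candidates:
        fts = c["bm25"] / max_bm25                       # scale to 0..1
        vec = max(cosine(c["vec"], query_vec), 0.0)      # clamp negatives to 0
        meta = 1.0 if any(t in c["path"].lower() for t in query_terms) else 0.0
        ranked.append((w_fts * fts + w_vec * vec + w_meta * meta, c["path"]))
    return sorted(ranked, reverse=True)


docs = [
    {"path": "auth/refresh.go", "bm25": 2.0, "vec": [1.0, 0.0]},
    {"path": "docs/readme.md", "bm25": 4.0, "vec": [0.0, 1.0]},
]
for score, path in hybrid_rank(docs, [1.0, 0.0], ["refresh"]):
    print(f"{score:.2f}  {path}")
```

In this toy run the file with the weaker keyword score still ranks first, because its semantic match and filename hit outweigh the other document's full-text lead.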

Can I do FTS-only or vector-only search?

Yes. The 'mode' parameter on /search switches between 'hybrid' (default), 'fts' (BM25 only), and 'vector' (cosine only). Use 'fts' when you know the exact phrase, 'vector' when you want semantic exploration.
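As a hedged illustration, the helper below builds a /search URL for each mode. The `mode` parameter and its three values come from the answer above and port 8080 is the REST port named elsewhere on this page; the `localhost` host, the `q` parameter name, and the use of a GET query string are assumptions.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080"  # host is an assumption; 8080 is the documented REST port
MODES = ("hybrid", "fts", "vector")


def search_url(query: str, mode: str = "hybrid") -> str:
    """Build a /search URL; 'q' is a hypothetical name for the query parameter."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")
    return f"{BASE_URL}/search?" + urlencode({"q": query, "mode": mode})


for mode in MODES:
    print(search_url("JWT refresh", mode))
```

With `stratafs serve` running, a URL like these could be fetched with any HTTP client, assuming the endpoint accepts query-string parameters.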

Index your storage. Search it like a human.

MIT-licensed. Multi-storage. Native MCP server for AI agents. Local, S3, GCS, Azure, SharePoint, Google Drive, and Jira.