SDK Enhancements & Core Development Alignment
Date: October 11, 2025
Status: Action Items for SDK v3.0
Source: Grok Analysis + Codebase Research
This document outlines concrete enhancements to the VDFS SDK specification based on:
- Deep codebase analysis of Spacedrive Core
- Grok's critique against whitepaper and project status
- Grounding in actual implementation (87% core complete, extensions 60%)
Executive Summary
The Core Insight:
Core provides primitives. Extensions provide experiences.
Spacedrive Core handles generic data extraction (OCR, embeddings, metadata) and stores results in the Virtual Sidecar System (VSS). Extensions consume this pre-computed intelligence and add domain-specific behavior (Photos face detection, Chronicle research analysis, Ledger receipt parsing).
The Goal: Refine the SDK spec to create optimal separation between Core responsibilities and Extension capabilities, enabling:
- Photos extension that adds faces/places without bloating core
- CRM extension that adapts the UI without photos features
- Chronicle extension that leverages core OCR without re-processing
Part 1: Critical Spec Updates
1.1. Rename #[app] → #[extension]
Rationale: Avoid confusion with Spacedrive client apps (iOS/macOS at 65%). Extensions are plugins, not standalone applications.
// BEFORE
#[app(id = "com.spacedrive.chronicle")]
struct Chronicle;
// AFTER
#[extension(id = "com.spacedrive.chronicle")]
struct Chronicle;
Files to update:
- docs/design/SDK_SPEC.md - All occurrences
- docs/design/SDK_SPEC_GROUNDED.md - All occurrences
1.2. Add Core Dependency Declaration
Rationale: Extensions need to declare minimum core version and required features to prevent runtime failures.
#[extension(
id = "com.spacedrive.chronicle",
name = "Chronicle Research Assistant",
version = "1.0.0",
// NEW: Declare dependencies on core features
min_core_version = "2.0.0",
required_features = [
"ai_models", // Needs AI model loader
"semantic_search", // Needs VSS embeddings
"ocr_sidecars", // Needs OCR from analysis pipeline
],
permissions = [
Permission::ReadEntries(glob = "**/*.pdf"),
Permission::ReadSidecars(kinds = ["ocr", "embeddings"]),
Permission::WriteTags,
Permission::UseModel(category = "llm", preference = "local"),
]
)]
struct Chronicle;
Implementation Note:
Core's PluginManager validates min_core_version and required_features before loading the extension.
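A minimal sketch of what that validation could look like; the ExtensionManifest fields and the semver-based check shown here are assumptions, not the current PluginManager code:
use semver::{Version, VersionReq};
use std::collections::HashSet;

// Hypothetical manifest fields relevant to dependency validation.
struct ExtensionManifest {
    min_core_version: VersionReq,   // e.g. ">=2.0.0"
    required_features: Vec<String>, // e.g. ["ai_models", "semantic_search"]
}

enum LoadError {
    CoreTooOld { required: VersionReq, current: Version },
    MissingFeatures(Vec<String>),
}

fn validate_dependencies(
    manifest: &ExtensionManifest,
    core_version: &Version,
    enabled_features: &HashSet<String>,
) -> Result<(), LoadError> {
    // Refuse to load if the running core is older than the extension requires.
    if !manifest.min_core_version.matches(core_version) {
        return Err(LoadError::CoreTooOld {
            required: manifest.min_core_version.clone(),
            current: core_version.clone(),
        });
    }
    // Collect any declared features this core build does not provide.
    let missing: Vec<String> = manifest
        .required_features
        .iter()
        .filter(|f| !enabled_features.contains(*f))
        .cloned()
        .collect();
    if !missing.is_empty() {
        return Err(LoadError::MissingFeatures(missing));
    }
    Ok(())
}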
1.3. Extension-Triggered Analysis (On-Demand, User-Scoped)
DECISION: No automatic extraction hooks. Extensions define jobs that run on-demand on user-scoped locations.
Rationale:
- Core should NOT extract faces from every screenshot (wasteful)
- Extensions control when/where specialized analysis happens
- User decides which locations get analyzed (privacy + performance)
The Photos Extension Pattern:
#[model]
struct Photo {
#[entry(filter = "*.{jpg,png,heic}")]
file: Entry,
// Core provides EXIF automatically (part of indexing)
#[metadata] exif: Option<ExifData>,
// Extension-generated sidecars (stored in VSS extensions/ folder)
#[sidecar(kind = "faces", extension = "photos")]
faces: Option<Vec<FaceDetection>>,
#[sidecar(kind = "scene", extension = "photos")]
scene_tags: Option<Vec<String>>,
}
// User-initiated job when they enable Photos on a location
#[job(trigger = "user_initiated")]
async fn analyze_location(
ctx: &JobContext,
location: SdPath,
) -> JobResult<()> {
ctx.progress(Progress::indeterminate("Finding photos..."));
// Get all images in user-selected location
let photos = ctx.vdfs()
.query_entries()
.in_location(location)
.of_type::<Image>()
.collect()
.await?;
for photo in photos.progress(ctx) {
// Skip if already analyzed
if ctx.sidecar_exists(photo.content_uuid(), "faces")? {
continue;
}
// Face detection
let faces = ctx.ai()
.from_registered("face_detection") // Model registered on install
.detect_faces(&photo)
.await?;
// Save detailed data to sidecar
ctx.save_sidecar(
photo.content_uuid(),
"faces",
extension_id = "photos",
&faces
).await?;
ctx.check_interrupt().await?; // Checkpoint
}
// Bulk generate tags from face sidecars
ctx.run(generate_face_tags, (location,)).await?;
Ok(())
}
// Follow-up job: Sidecar → Tags for indexing/search
#[job]
async fn generate_face_tags(
ctx: &JobContext,
location: SdPath,
) -> JobResult<()> {
let photos = ctx.vdfs()
.query_entries()
.in_location(location)
.with_sidecar("faces") // Only photos with face data
.collect()
.await?;
for photo in photos {
let faces: Vec<FaceDetection> = ctx.read_sidecar(
photo.content_uuid(),
"faces"
).await?;
// Generate tags from sidecar data
for face in faces {
if let Some(person_id) = face.identified_as {
ctx.vdfs()
.add_tag(photo.metadata_id(), &format!("#person:{}", person_id))
.await?;
}
}
}
Ok(())
}
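For reference, a plausible shape for the FaceDetection records written to the faces sidecar above; the exact fields are up to the Photos extension, so treat this as illustrative:
use serde::{Deserialize, Serialize};

// Illustrative sidecar payload; the real schema is owned by the Photos extension.
#[derive(Serialize, Deserialize, Clone)]
pub struct FaceDetection {
    /// Bounding box in pixels: (x, y, width, height).
    pub bbox: (u32, u32, u32, u32),
    /// Detector confidence in the 0.0..=1.0 range.
    pub confidence: f32,
    /// Set once the user (or a clustering job) links the face to a person.
    pub identified_as: Option<String>,
    /// Model that produced the detection, so tags can be regenerated on upgrade.
    pub model_version: String,
}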
The Flow:
- User installs Photos extension
- User enables Photos on the /My Photos location (scoping)
- Photos extension dispatches the analyze_location job
- Job processes photos, saves detailed face data to sidecars
- Job generates searchable tags from sidecar data
- User can now search "#person:alice" using core tag system
Why This Works:
- On-demand - no wasted computation
- User-scoped - only analyzes chosen locations
- Sidecars for details - face coords, confidence scores
- Tags for indexing - searchable via core
- Versioned - re-run job on model upgrade
1.4. Agent Trail for Debugging (Not Memory)
CLARIFICATION: Agent trail is for tracing/debugging, not cognitive memory.
Rationale:
- Memory = Extension-defined (Temporal/Associative/Working) for reasoning
- Trail = Debug logs showing agent's decision flow
- Audit Log = VDFS mutations only (tag writes, file moves, etc.)
#[agent]
#[agent_trail(
level = "debug", // Standard log level
format = "jsonl",
rotation = "daily",
// Stored in: .sdlibrary/logs/extension/{id}/
)]
impl Chronicle {
async fn on_paper_analyzed(&self, ctx: &AgentContext) -> AgentResult<()> {
// Debug trail (for developers/troubleshooting)
ctx.trace("Received paper analysis event");
ctx.trace("Checking memory for similar papers");
// Agent memory (cognitive system - extension-defined)
let memory = ctx.memory().read().await;
let similar = memory.papers_related_to("neural networks").await;
ctx.trace(format!("Found {} similar papers", similar.len()));
// VDFS mutation (goes to audit log, not trail)
ctx.vdfs()
.add_tag(paper_id, "#machine-learning")
.await?;
Ok(())
}
}
Three Separate Systems:
- Agent Trail: .sdlibrary/logs/extension/{id}/trace.jsonl (debug only)
- Agent Memory: .sdlibrary/sidecars/extension/{id}/memory/ (cognitive state)
- VDFS Audit Log: .sdlibrary/audit.db (mutations only)
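To make the trail concrete, one record of trace.jsonl might look like this when serialized with serde_json (the field set is an assumption; chrono is used here with its serde feature):
use chrono::{DateTime, Utc};
use serde::Serialize;

// One line of .sdlibrary/logs/extension/{id}/trace.jsonl (illustrative fields).
#[derive(Serialize)]
struct TrailRecord<'a> {
    timestamp: DateTime<Utc>,
    level: &'a str,           // "debug", "info", ...
    agent: &'a str,           // e.g. "chronicle"
    message: &'a str,         // whatever ctx.trace() was called with
    span_id: Option<&'a str>, // correlates related decisions in one handler run
}

fn to_jsonl_line(record: &TrailRecord<'_>) -> serde_json::Result<String> {
    // Each record is serialized as a single JSON object per line.
    serde_json::to_string(record)
}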
1.5. Simple "Work With What's Available" Philosophy
DECISION: No explicit progressive query modes. Extensions just work with available data and trigger jobs for gaps.
Rationale:
- Keeps query API simple
- Extensions naturally get better as more data is analyzed
- If data missing, extension triggers its own analysis job
impl ChronicleMind {
async fn papers_about(&self, topic: &str) -> Vec<PaperAnalysisEvent> {
// Just query what exists
self.history
.query()
.where_semantic("summary", similar_to(topic))
.limit(20)
.collect()
.await
.unwrap_or_default()
}
}
// If extension needs data that doesn't exist yet
#[agent]
impl Chronicle {
async fn ensure_paper_analyzed(&self, paper: Paper, ctx: &AgentContext) -> AgentResult<()> {
// Check if OCR sidecar exists
if paper.full_text.is_none() {
// Trigger analysis job to generate it
ctx.jobs()
.dispatch(analyze_paper, paper)
.await?;
return Ok(()); // Pending: will retry when the OCR sidecar is ready
}
// Data available - proceed
Ok(())
}
}
Why This Works:
- Extensions query what's available
- Missing data → trigger job → retry later
- No complex progressive modes
- Natural async improvement over time
1.6. User-Scoped Permission Model
KEY INSIGHT: Extensions request permissions. Users scope them to specific locations/paths.
The Model:
// Extension declares broad permissions (in manifest)
#[extension(
id = "com.spacedrive.photos",
permissions = [
// Extension REQUESTS these capabilities
Permission::ReadEntries,
Permission::ReadSidecars(kinds = ["exif"]),
Permission::WriteSidecars(kinds = ["faces", "places"]),
Permission::WriteTags,
Permission::UseModel(category = "face_detection"),
]
)]
struct Photos;
User scopes during setup:
[User installs Photos extension]
Spacedrive UI:
┌─────────────────────────────────────────┐
│ Photos Extension Setup │
├─────────────────────────────────────────┤
│ │
│ This extension requests: │
│ ✓ Read image files │
│ ✓ Write face detection sidecars │
│ ✓ Add tags │
│ ✓ Use face detection AI model │
│ │
│ Grant access to: │
│ [x] /My Photos │
│ [x] /Family Photos │
│ [ ] /Documents (not relevant) │
│ │
│ [ Advanced: Restrict by file type ] │
│ │
│ [Cancel] [Grant Access] │
└─────────────────────────────────────────┘
Runtime Enforcement:
// Core enforces scope on every operation
impl WasmHost {
async fn vdfs_add_tag(&self, metadata_id: Uuid, tag: &str) -> Result<()> {
// 1. Check permission granted
if !self.has_permission(Permission::WriteTags) {
return Err(PermissionDenied);
}
// 2. Check entry is in user-scoped locations
let entry = self.db.get_entry(metadata_id).await?;
if !self.in_granted_scope(&entry.path()) {
return Err(OutOfScope {
path: entry.path(),
granted: self.granted_scopes.clone(),
});
}
// 3. Execute
self.db.add_tag(metadata_id, tag).await
}
}
Permission Types:
pub enum Permission {
// Entry access (scoped by user to locations)
ReadEntries,
WriteEntries,
DeleteEntries,
// Sidecars (scoped by user to locations)
ReadSidecars { kinds: Vec<String> },
WriteSidecars { kinds: Vec<String> },
// Metadata (scoped by user to locations)
ReadTags,
WriteTags,
WriteCustomFields { namespace: String },
// Jobs
DispatchJobs,
// AI & Models
UseModel {
category: String, // "face_detection", "ocr", "llm"
preference: ModelPreference, // Local, API, or Bundled
},
RegisterModel {
category: String,
max_memory_mb: u64,
},
// Network (requires explicit user consent per-call)
AccessNetwork {
domains: Vec<String>,
purpose: String, // "Download model weights"
},
}
pub enum ModelPreference {
LocalOnly, // Only local models (Ollama)
ApiAllowed, // Can use APIs (user provides keys + grants consent)
BundledWithExtension, // Extension ships model weights
}
Key Principle:
Extension requests capabilities. User grants and scopes to specific data. Core enforces at runtime.
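The in_granted_scope check used in the enforcement sketch earlier in this section is only named; a minimal sketch of how Core could match an entry path against rows loaded from the extension_scopes table, assuming glob matching via the globset crate:
use globset::{Glob, GlobSet, GlobSetBuilder};
use std::path::{Path, PathBuf};

/// One row of extension_scopes, resolved to an absolute location root.
struct GrantedScope {
    location_root: PathBuf,
    path_pattern: Option<GlobSet>, // optional sub-path restriction
}

fn in_granted_scope(scopes: &[GrantedScope], entry_path: &Path) -> bool {
    scopes.iter().any(|scope| {
        // The entry must live under a granted location...
        let Ok(relative) = entry_path.strip_prefix(&scope.location_root) else {
            return false;
        };
        // ...and, if a sub-path pattern was configured, match it too.
        match &scope.path_pattern {
            Some(set) => set.is_match(relative),
            None => true,
        }
    })
}

fn build_pattern(glob: &str) -> Result<GlobSet, globset::Error> {
    let mut builder = GlobSetBuilder::new();
    builder.add(Glob::new(glob)?);
    builder.build()
}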
Part 2: Core Development Priorities
2.1. AI Model Registration & Loader System (P0 - Blocks AI Extensions)
DECISION: Models are registered with Core on extension install, then accessed by name.
Storage: Models live in root data dir (not library - no sync needed)
~/.spacedrive/
└── models/
├── face_detection/
│ ├── photos_v1.onnx # Registered by Photos extension
│ └── premium_v2.onnx # Registered by Photos (premium)
├── ocr/
│ └── tesseract.onnx # Could be registered by Core or extension
└── llm/
└── llama3.gguf # Ollama-managed
The Model Manager:
// In core/src/ai/manager.rs (NEW MODULE)
pub struct ModelManager {
registry: HashMap<String, RegisteredModel>,
root_dir: PathBuf, // ~/.spacedrive/models/
}
impl ModelManager {
/// Extensions register models on install
pub async fn register_model(
&self,
category: &str,
name: &str,
source: ModelSource,
) -> Result<ModelId> {
match source {
ModelSource::Bundled(bytes) => {
// Extension includes model in WASM
self.save_to_disk(category, name, bytes).await?;
}
ModelSource::Download { url, sha256 } => {
// Extension provides download URL
self.download_and_verify(category, name, url, sha256).await?;
}
ModelSource::Ollama(model_name) => {
// Defer to Ollama
self.register_ollama(category, name, model_name).await?;
}
}
Ok(ModelId::new(category, name))
}
/// Load registered model for inference
pub async fn load(&self, model_id: &ModelId) -> Result<LoadedModel> {
let path = self.root_dir.join(&model_id.category).join(&model_id.name);
match model_id.category.as_str() {
"llm" => self.load_ollama(model_id).await,
_ => self.load_onnx(&path).await,
}
}
}
pub enum ModelSource {
Bundled(Vec<u8>), // Included in extension
Download { url: String, sha256: String }, // Downloaded on install
Ollama(String), // Managed by Ollama
}
Extension Usage:
// On extension install (via manifest or #[on_install] hook)
#[on_install]
async fn install(ctx: &InstallContext) -> InstallResult<()> {
// Register face detection model
ctx.models()
.register(
"face_detection",
"photos_basic",
ModelSource::Download {
url: "https://models.spacedrive.com/photos/faces-v1.onnx",
sha256: "abc123...",
}
)
.await?;
Ok(())
}
// Later, in jobs
#[job]
async fn detect_faces(ctx: &JobContext, photo: Photo) -> JobResult<Vec<Face>> {
let faces = ctx.ai()
.from_registered("face_detection:photos_basic") // Category:name
.detect_faces(&photo.file)
.await?;
Ok(faces)
}
Host Functions Needed:
// NEW host functions
fn model_register(category_ptr, name_ptr, source_ptr) -> u32;
fn model_load(model_id_ptr) -> u32;
fn model_infer(model_id, input_ptr) -> u32;
Timeline: 3-4 weeks
2.2. Complete Extension API Surface (P0)
Current Status: 30% VDFS API complete
Missing Host Functions:
// In core/src/infra/extension/host_functions.rs (EXPAND)
#[link(wasm_import_module = "spacedrive")]
extern "C" {
// Implemented
fn vdfs_query_entries(filter_ptr: u32, filter_len: u32) -> u32;
fn vdfs_read_sidecar(uuid_ptr: u32, kind_ptr: u32) -> u32;
// Partially implemented
fn vdfs_dispatch_job(job_ptr: u32, job_len: u32) -> u32;
// Missing - MUST IMPLEMENT
fn vdfs_write_tag(metadata_id: u32, tag_ptr: u32, tag_len: u32) -> u32;
fn vdfs_write_custom_field(metadata_id: u32, key_ptr: u32, value_ptr: u32) -> u32;
fn event_subscribe(event_type_ptr: u32) -> u32;
fn event_next(subscription_id: u32) -> u32;
fn model_load(category_ptr: u32, name_ptr: u32) -> u32;
fn model_infer(model_id: u32, input_ptr: u32) -> u32;
fn job_checkpoint(job_id: u32, state_ptr: u32, state_len: u32) -> u32;
}
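On the guest side, the SDK would wrap these raw imports in safe Rust. A sketch for vdfs_write_tag, treating the u32 metadata argument as a host-side handle and non-zero return codes as errors (both assumptions):
// Guest-side wrapper the SDK could expose over the raw import (sketch).
#[link(wasm_import_module = "spacedrive")]
extern "C" {
    fn vdfs_write_tag(metadata_id: u32, tag_ptr: u32, tag_len: u32) -> u32;
}

#[derive(Debug)]
pub struct HostError(u32);

pub fn write_tag(metadata_handle: u32, tag: &str) -> Result<(), HostError> {
    // The tag bytes stay alive for the duration of the call, so passing a
    // pointer into guest linear memory is sufficient.
    let ptr = tag.as_ptr() as u32;
    let len = tag.len() as u32;
    // SAFETY: the host only reads `len` bytes starting at `ptr`.
    let code = unsafe { vdfs_write_tag(metadata_handle, ptr, len) };
    if code == 0 {
        Ok(())
    } else {
        Err(HostError(code)) // e.g. permission denied or out-of-scope
    }
}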
Priority Order:
1. vdfs_write_tag - Unblocks basic extension functionality
2. event_subscribe / event_next - Enables agent event handlers
3. job_checkpoint - Enables resumable extension jobs
4. model_load / model_infer - Enables AI extensions
5. vdfs_write_custom_field - Enables extension-specific metadata
Timeline: 3-4 weeks
2.3. VSS Extension Storage Layout (P1)
DECISION: Extension data in library VSS. Models in root data dir (no sync).
Storage Layout:
~/.spacedrive/ # Root data dir
├── models/ # Models (NOT in library)
│ ├── face_detection/
│ │ └── photos_v1.onnx
│ ├── ocr/
│ │ └── tesseract.onnx
│ └── llm/
│ └── llama3.gguf # Ollama-managed
│
└── libraries/
└── my-library.sdlibrary/
├── database.db # Core database
├── sidecars/
│ ├── content/{h0}/{h1}/{content_uuid}/
│ │ ├── ocr/ocr.json # Core-generated
│ │ ├── thumbs/grid@2x.webp # Core-generated
│ │ └── extensions/
│ │ └── {extension_id}/
│ │ ├── faces.json # Extension sidecar
│ │ └── receipt.json
│ │
│ └── extension/{extension_id}/
│ ├── memory/
│ │ ├── history.db # TemporalMemory
│ │ └── knowledge.vss # AssociativeMemory
│ └── state.json # Extension state
│
├── logs/
│ └── extension/{extension_id}/
│ └── trace.jsonl # Agent trail (debug)
│
└── virtual/ # Optional: Persisted virtual entries
└── {extension_id}/
└── {uuid}.json # Email, Note, etc.
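A sketch of how the extension sidecar path could be derived from this layout; the {h0}/{h1} shard prefixes are assumed here to be the first two byte pairs of the content UUID's hex form:
use std::path::{Path, PathBuf};
use uuid::Uuid;

// Builds .../sidecars/content/{h0}/{h1}/{content_uuid}/extensions/{extension_id}/{kind}.json
fn extension_sidecar_path(
    library_root: &Path,
    content_uuid: Uuid,
    extension_id: &str,
    kind: &str,
) -> PathBuf {
    let hex = content_uuid.simple().to_string(); // 32 lowercase hex chars
    let (h0, h1) = (&hex[0..2], &hex[2..4]);
    library_root
        .join("sidecars")
        .join("content")
        .join(h0)
        .join(h1)
        .join(hex.as_str())
        .join("extensions")
        .join(extension_id)
        .join(format!("{kind}.json"))
}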
Database Schema:
-- Extend sidecars table
ALTER TABLE sidecars ADD COLUMN extension_id TEXT;
CREATE INDEX idx_sidecars_extension ON sidecars(extension_id, content_uuid, kind);
-- Extension scope grants (user-defined)
CREATE TABLE extension_scopes (
id INTEGER PRIMARY KEY,
extension_id TEXT NOT NULL,
location_id INTEGER REFERENCES locations(id),
path_pattern TEXT, -- For sub-path scoping
granted_at TIMESTAMP
);
Timeline: 1-2 weeks
2.4. Memory System Foundation (P1)
Base Traits for Memory Types:
// In core/src/infra/memory/mod.rs (NEW MODULE)
#[async_trait]
pub trait TemporalMemory<T>: Send + Sync
where
T: Serialize + DeserializeOwned + Clone,
{
/// Append event to temporal log
async fn append(&mut self, event: T) -> Result<()>;
/// Query builder for temporal queries
fn query(&self) -> TemporalQuery<T>;
}
#[async_trait]
pub trait AssociativeMemory<T>: Send + Sync
where
T: Serialize + DeserializeOwned + Clone,
{
/// Add knowledge to associative memory
async fn add(&mut self, knowledge: T) -> Result<()>;
/// Semantic query builder
fn query_similar(&self, query: &str) -> AssociativeQuery<T>;
}
#[async_trait]
pub trait WorkingMemory<T>: Send + Sync
where
T: Serialize + DeserializeOwned + Clone + Default,
{
/// Read current state
async fn read(&self) -> T;
/// Transactional update
async fn update<F>(&mut self, f: F) -> Result<()>
where
F: FnOnce(T) -> Result<T> + Send;
}
Concrete Implementations:
// TemporalMemory backed by SQLite FTS5
pub struct SqliteTemporalMemory<T> {
db_path: PathBuf,
_phantom: PhantomData<T>,
}
// AssociativeMemory backed by VSS Vector Repository
pub struct VssAssociativeMemory<T> {
vss_path: PathBuf,
embedding_model: String,
_phantom: PhantomData<T>,
}
// WorkingMemory backed by JSON file
pub struct JsonWorkingMemory<T> {
state_path: PathBuf,
_phantom: PhantomData<T>,
}
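As one concrete example, a minimal sketch of the JSON-backed working memory, shown as inherent methods rather than the trait impl; file locking and richer error handling are elided:
use serde::{de::DeserializeOwned, Serialize};
use std::path::PathBuf;
use tokio::fs;

pub struct JsonWorkingMemory<T> {
    state_path: PathBuf,
    _phantom: std::marker::PhantomData<T>,
}

impl<T> JsonWorkingMemory<T>
where
    T: Serialize + DeserializeOwned + Clone + Default,
{
    /// Read current state, falling back to T::default() if nothing was persisted yet.
    pub async fn read(&self) -> T {
        match fs::read(&self.state_path).await {
            Ok(bytes) => serde_json::from_slice(&bytes).unwrap_or_default(),
            Err(_) => T::default(),
        }
    }

    /// Transactional-style update: read, transform, then persist atomically by
    /// writing to a temp file and renaming it over the old state.
    pub async fn update<F>(&mut self, f: F) -> anyhow::Result<()>
    where
        F: FnOnce(T) -> anyhow::Result<T> + Send,
    {
        let next = f(self.read().await)?;
        let tmp = self.state_path.with_extension("json.tmp");
        fs::write(&tmp, serde_json::to_vec_pretty(&next)?).await?;
        fs::rename(&tmp, &self.state_path).await?;
        Ok(())
    }
}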
Timeline: 2-3 weeks
1.7. UI Integration via Manifest (Not Rust)
DECISION: Use ui_manifest.json for UI integration, not Rust attributes.
Rationale:
- Keep UI separate from business logic
- Manifest can be passed directly to frontend
- No bundling React/UI code in Rust WASM
- Cleaner separation of concerns
Extension Package Structure:
photos.wasm
manifest.json # Extension metadata
ui_manifest.json # UI integration points
prompts/
└── describe_photo.jinja
assets/
└── icon.svg
ui_manifest.json Example:
{
"sidebar": {
"section": "Photos",
"icon": "assets/icon.svg",
"views": [
{
"id": "albums",
"title": "Albums",
"component": "grid",
"query": "list_albums"
},
{
"id": "people",
"title": "People",
"component": "cluster_grid",
"query": "list_people"
}
]
},
"context_menu": [
{
"action": "create_album",
"label": "Add to Album...",
"icon": "plus",
"applies_to": ["image/*"],
"keyboard_shortcut": "cmd+shift+a"
}
],
"file_viewers": [
{
"mime_types": ["image/*"],
"component": "photo_viewer",
"supports_slideshow": true
}
]
}
Frontend Rendering:
// Frontend (React/React Native) parses ui_manifest.json
function renderExtensionSidebar(extension: Extension) {
const uiManifest = extension.uiManifest;
return (
<SidebarSection title={uiManifest.sidebar.section}>
{uiManifest.sidebar.views.map(view => (
<ExtensionView
key={view.id}
component={view.component}
data={useExtensionQuery(extension.id, view.query)}
/>
))}
</SidebarSection>
);
}
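On the extension side, the query names referenced by ui_manifest.json ("list_albums", "list_people") would resolve to handlers exported from the WASM module. A hypothetical sketch of one such handler; the #[query] attribute, QueryContext, and AlbumSummary type are illustrative, not part of the current spec:
use serde::Serialize;

#[derive(Serialize)]
struct AlbumSummary {
    id: String,
    title: String,
    cover_thumb: Option<String>, // path into VSS thumbs or extension assets
    photo_count: u64,
}

// Hypothetical: useExtensionQuery("com.spacedrive.photos", "list_albums") on the
// frontend resolves to this handler and renders the result with the "grid" component.
#[query(name = "list_albums")]
async fn list_albums(ctx: &QueryContext) -> QueryResult<Vec<AlbumSummary>> {
    let albums = ctx
        .vdfs()
        .query_models::<Album>()
        .order_by_desc("updated_at")
        .collect()
        .await?;

    Ok(albums
        .into_iter()
        .map(|a| AlbumSummary {
            id: a.id.to_string(),
            title: a.title,
            cover_thumb: a.cover_thumb,
            photo_count: a.photo_count,
        })
        .collect())
}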
Why This Works:
- No Rust UI code
- Frontend handles rendering generically
- Extensions provide data via queries
- Can bundle custom assets
- Manifest updates don't require recompilation
1.8. Virtual Entries with Optional Persistence
NEW CAPABILITY: Extensions can create virtual entries (emails, notes, tasks) with optional disk persistence.
#[model]
#[persist_strategy = "user_preference"] // User decides if persisted
struct Email {
// No #[entry] - this is a virtual entry
// No physical file on disk (unless user enables persistence)
#[sync(shared)] from: String,
#[sync(shared)] to: Vec<String>,
#[sync(shared)] subject: String,
#[sync(shared)] body: String,
#[sync(shared)] received_at: DateTime<Utc>,
// Extension can optionally persist to disk
#[persist_to = "virtual/{extension_id}/{uuid}.json"]
persisted: bool,
}
// Extension creates virtual entry
#[job]
async fn import_emails(ctx: &JobContext, imap: ImapConfig) -> JobResult<()> {
let emails = fetch_from_imap(&imap).await?;
for email in emails {
// Create virtual entry
let email_model = Email {
from: email.from,
to: email.to,
subject: email.subject,
body: email.body,
received_at: email.date,
persisted: ctx.config().persist_virtual_entries,
};
// Save to VDFS (in database + optionally on disk)
ctx.vdfs().create_virtual_entry(email_model).await?;
}
Ok(())
}
User Control:
// Extension settings
{
"persist_virtual_entries": false, // User choice
"backup_includes_virtual": true
}
Storage:
- Database: Always (for queries and sync)
- Disk: Optional (.sdlibrary/virtual/{extension_id}/{uuid}.json)
- Benefits: Virtual entries can sync across devices even if not persisted to disk
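Whether or not a virtual entry is persisted to disk, it remains queryable from the database. A hypothetical sketch of querying imported emails; the query_virtual and where_eq builder names are assumptions:
// Hypothetical query over virtual entries stored in the library database.
#[job]
async fn recent_emails_from(ctx: &JobContext, sender: String) -> JobResult<Vec<Email>> {
    let emails = ctx
        .vdfs()
        .query_virtual::<Email>()      // virtual entries need no backing file
        .where_eq("from", &sender)
        .order_by_desc("received_at")
        .limit(50)
        .collect()
        .await?;
    Ok(emails)
}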
1.9. The Sidecar → Tags Pattern
KEY PATTERN: Sidecars store detailed extraction results. Tags make them searchable.
Rationale:
- Sidecars = Source of truth (detailed, versioned JSON)
- Tags = Index for search (lightweight, core primitive)
- Regenerate tags when sidecar model upgrades
// Step 1: Extension saves detailed data to sidecar
#[job]
async fn detect_objects(ctx: &JobContext, photo: Photo) -> JobResult<()> {
let detections = ctx.ai()
.from_registered("object_detection:yolo_v8")
.detect(&photo.file)
.await?;
// Save detailed sidecar (coords, confidence, etc.)
ctx.save_sidecar(
photo.content_uuid(),
"objects",
extension_id = "photos",
&ObjectDetectionResult {
objects: detections.iter().map(|d| ObjectBox {
class: d.class.clone(),
confidence: d.confidence,
bbox: d.bbox,
}).collect(),
model_version: "yolo_v8_v1",
}
).await?;
Ok(())
}
// Step 2: Bulk generate tags from sidecars
#[job]
async fn generate_tags_from_objects(ctx: &JobContext, location: SdPath) -> JobResult<()> {
let photos = ctx.vdfs()
.query_entries()
.in_location(location)
.with_sidecar("objects")
.collect()
.await?;
for photo in photos {
let objects: ObjectDetectionResult = ctx.read_sidecar(
photo.content_uuid(),
"objects"
).await?;
// Generate tags from detailed sidecar
for obj in objects.objects {
if obj.confidence > 0.8 {
ctx.vdfs()
.add_tag(photo.metadata_id(), &format!("#object:{}", obj.class))
.await?;
}
}
}
Ok(())
}
// On model upgrade
#[job]
async fn regenerate_tags_after_model_upgrade(
ctx: &JobContext,
location: SdPath,
) -> JobResult<()> {
// Delete old sidecars
ctx.vdfs()
.delete_sidecars_by_model_version(
location,
"objects",
old_version = "yolo_v8_v1"
)
.await?;
// Re-run analysis with new model
ctx.run(detect_objects, (location,)).await?;
// Regenerate tags
ctx.run(generate_tags_from_objects, (location,)).await?;
Ok(())
}
Why This Works:
- Sidecars preserve detail (bounding boxes, confidence)
- Tags enable search ("show me all photos with dogs")
- Versionable (sidecar tracks model version)
- Regenerable (on model upgrade, redo analysis + tags)
- Bulk operations (tags generated efficiently in batches)
Part 3: Core/Extension Boundary Clarification
3.1. What Core MUST Provide (The Perception Layer)
Generic Data Extraction (Core's Responsibility)
DECISION: Core only does generic extraction useful to ALL extensions.
| Capability | Core Does It? | Status | Storage |
|---|---|---|---|
| File Indexing | Always | 95% | entries table |
| Content Identity | Always | 100% | content_identity |
| EXIF/Media Metadata | Always | 95% | media_data JSON |
| Thumbnails | Always | 90% | VSS thumbs/*.webp |
| OCR (Documents) | Generic documents | 70% | VSS ocr/ocr.json |
| Embeddings | For semantic search | 0% | VSS embeddings/*.json |
| Object Detection | Extension-triggered | 0% | Extension sidecar |
| Face Detection | Extension-triggered | 0% | Extension sidecar |
| Transcription | Extension-triggered | 0% | Extension sidecar |
| Receipt Parsing | Extension-triggered | 0% | Extension sidecar |
The Rule:
- Core does: Basic, universally useful extraction (EXIF, OCR for docs, embeddings for search)
- Extensions do: Specialized extraction (faces, receipts, custom analysis)
Infrastructure
- Event Bus - Event::EntryCreated, Event::JobCompleted, etc.
- Job System - Durable, resumable background work
- Sync System - HLC timestamps, CRDTs, transitive sync
- Tags/Collections - Generic organization primitives
- Model Loaders - Local (Ollama), API (OpenAI), Custom (ONNX)
- VSS - Sidecar storage and deduplication
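A sketch of how an extension agent might consume these core events, assuming an events() subscription API layered on top of the event_subscribe / event_next host functions:
// Hypothetical event loop built on the event_subscribe / event_next host functions.
#[agent]
impl Chronicle {
    async fn watch_new_pdfs(&self, ctx: &AgentContext) -> AgentResult<()> {
        let mut events = ctx.events().subscribe(EventFilter::EntryCreated).await?;

        while let Some(event) = events.next().await {
            if let Event::EntryCreated { entry_id, path, .. } = event {
                // Only react to PDFs inside the user-granted scope.
                if path.extension().map_or(false, |e| e == "pdf") {
                    ctx.jobs().dispatch(analyze_paper, entry_id).await?;
                }
            }
        }
        Ok(())
    }
}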
3.2. What Extensions PROVIDE (The Experience Layer)
Domain-Specific Models
// Photos Extension
#[model] struct Album { ... }
#[model] struct Person { faces: Vec<Face> }
#[model] struct Place { geo: GeoLocation }
// CRM Extension
#[model] struct Contact { ... }
#[model] struct Company { ... }
#[model] struct Deal { ... }
// Chronicle Extension
#[model] struct Paper { ... }
#[model] struct ResearchProject { ... }
Specialized Agents & Memory
// Photos agent remembers faces, places, moments
#[agent_memory]
struct PhotosMind {
faces: AssociativeMemory<FaceCluster>,
places: AssociativeMemory<Location>,
events: TemporalMemory<PhotoEvent>,
}
// CRM agent remembers interactions, deals
#[agent_memory]
struct CrmMind {
contacts: AssociativeMemory<Contact>,
interactions: TemporalMemory<Interaction>,
pipeline: WorkingMemory<SalesPipeline>,
}
Custom Extraction (via #[extractor])
- Face detection (Photos)
- Receipt parsing (Ledger)
- Contact extraction from emails (CRM)
- Citation parsing from PDFs (Chronicle)
UI Adaptations
- Photos: Album grid, face clusters, map view
- CRM: Contact list, pipeline board, interaction timeline
- Chronicle: Knowledge graph, reading list, gap analysis
Part 4: Detailed Implementation Specifications
4.1. Model Manager Implementation Details
Core Module: core/src/ai/manager.rs (NEW)
Full Implementation:
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use tokio::fs;
use serde::{Serialize, Deserialize};
pub struct ModelManager {
root_dir: PathBuf, // ~/.spacedrive/models/
registry: HashMap<ModelId, RegisteredModel>,
loaded_models: HashMap<ModelId, Box<dyn LoadedModel>>,
}
#[derive(Hash, Eq, PartialEq, Clone)]
pub struct ModelId {
category: String, // "face_detection", "ocr", "llm"
name: String, // "photos_basic", "tesseract", "llama3"
}
struct RegisteredModel {
id: ModelId,
source: ModelSource,
format: ModelFormat,
memory_mb: u64,
registered_by: String, // Extension ID
}
pub enum ModelSource {
Bundled(Vec<u8>),
Download { url: String, sha256: String },
Ollama(String),
}
pub enum ModelFormat {
Onnx,
SafeTensors,
Gguf,
Ollama,
}
impl ModelManager {
pub async fn register(
&mut self,
category: &str,
name: &str,
source: ModelSource,
registered_by: &str,
) -> Result<ModelId> {
let id = ModelId {
category: category.to_string(),
name: name.to_string(),
};
// Save to disk based on source (borrow, so `source` can still be stored below)
match &source {
ModelSource::Bundled(bytes) => {
let path = self.root_dir
.join(category)
.join(format!("{}.onnx", name));
fs::create_dir_all(path.parent().unwrap()).await?;
fs::write(&path, bytes).await?;
}
ModelSource::Download { url, sha256 } => {
self.download_model(&id, url, sha256).await?;
}
ModelSource::Ollama(model_name) => {
// Verify Ollama has this model
self.verify_ollama_model(model_name).await?;
}
}
// Register in memory
self.registry.insert(id.clone(), RegisteredModel {
id: id.clone(),
source,
format: self.detect_format(&id)?,
memory_mb: self.estimate_memory(&id).await?,
registered_by: registered_by.to_string(),
});
Ok(id)
}
pub async fn load(&mut self, id: &ModelId) -> Result<&dyn LoadedModel> {
if !self.loaded_models.contains_key(id) {
let model = match self.registry.get(id).unwrap().format {
ModelFormat::Onnx => self.load_onnx(id).await?,
ModelFormat::Ollama => self.load_ollama(id).await?,
_ => return Err(anyhow!("Unsupported format")),
};
self.loaded_models.insert(id.clone(), model);
}
Ok(self.loaded_models.get(id).unwrap().as_ref())
}
}
Host Functions:
// In core/src/infra/extension/host_functions.rs
#[no_mangle]
pub extern "C" fn model_register(
category_ptr: u32,
category_len: u32,
name_ptr: u32,
name_len: u32,
source_ptr: u32,
source_len: u32,
) -> u32 {
// Deserialize from WASM memory
// Call core.models.register()
// Return model ID
}
#[no_mangle]
pub extern "C" fn model_infer(
model_id_ptr: u32,
input_ptr: u32,
input_len: u32,
) -> u32 {
// Load model
// Run inference
// Return result pointer
}
4.2. Context Window & Prompt Construction
Extension-Managed: Each extension controls how it builds prompts for AI models.
#[agent]
impl Chronicle {
/// Extension defines how to build context from memory
async fn build_research_context(&self, ctx: &AgentContext<ChronicleMind>) -> String {
let memory = ctx.memory().read().await;
// Get recent papers from temporal memory
let recent = memory.history
.query()
.where_variant(ChronicleEvent::PaperAnalyzed)
.since(Duration::days(7))
.limit(10)
.collect()
.await
.unwrap_or_default();
// Get relevant concepts from associative memory
let concepts = memory.knowledge
.query_similar("current research focus")
.top_k(5)
.collect()
.await
.unwrap_or_default();
// Build context string for LLM
format!(
"Recent papers: {}\nKey concepts: {}\nCurrent plan: {}",
recent.iter().map(|p| &p.title).join(", "),
concepts.iter().map(|c| &c.name).join(", "),
memory.plan.read().await.priority_topics.join(", ")
)
}
/// Use context in Jinja template
#[on_query("suggest next paper")]
async fn suggest(&self, ctx: &AgentContext<ChronicleMind>) -> AgentResult<String> {
let context = self.build_research_context(ctx).await;
#[derive(Serialize)]
struct PromptCtx { research_context: String }
let suggestion = ctx.ai()
.from_registered("llm:llama3")
.prompt_template("suggest_paper.jinja")
.render_with(&PromptCtx { research_context: context })?
.generate_text()
.await?;
Ok(suggestion)
}
}
Key Point: Extensions are responsible for:
- Querying their own memory
- Building context that fits the model's context window
- Managing prompt construction
None of this is Core's concern.
Part 5: Whitepaper Updates Needed
5.1. Update Section 6 (AI Architecture)
Add new subsection:
\subsubsection{Multi-Agent Extension Architecture}
Spacedrive's AI layer is deliberately modular to enable specialized
intelligence without core bloat. The architecture separates:
\paragraph{Core AI Responsibilities}
\begin{itemize}
\item \textbf{Model Loaders}: Unified interface for local (Ollama),
API (OpenAI), and custom (ONNX) models
\item \textbf{Index Observation}: Event bus for real-time index changes
\item \textbf{Generic Extraction}: OCR, embeddings, object detection
\item \textbf{Memory Primitives}: Base traits for temporal and associative memory
\end{itemize}
\paragraph{Extension AI Responsibilities}
\begin{itemize}
\item \textbf{Specialized Agents}: Domain-specific reasoning (Photos,
CRM, Research)
\item \textbf{Custom Models}: Pre-trained models for specific tasks
\item \textbf{Memory Structures}: Enum-based memories for multi-domain
knowledge graphs
\item \textbf{Experience Adaptation}: UI components and workflows
\end{itemize}
This enables a Photos extension to deploy a media organization agent
while a CRM extension deploys a contact management agent, both
leveraging the same VDFS primitives but providing distinct experiences.
5.2. Update Section 4.2.5 (Analysis Queueing Phase)
Extend to mention extension extractors:
\item \textbf{Analysis Queueing Phase}: After a file's content and
type are identified, this phase dispatches specialized jobs:
\paragraph{Core Analysis Jobs}
- \texttt{OcrJob} for document text extraction
- \texttt{EmbeddingJob} for semantic search preparation
- \texttt{ThumbnailJob} for preview generation
- \texttt{MediaAnalysisJob} for EXIF/codec metadata
\paragraph{Extension Extractor Jobs}
Extensions can register custom extractors that hook into this phase:
- Photos extension: \texttt{FaceDetectionJob}
- Ledger extension: \texttt{ReceiptParsingJob}
- Chronicle extension: \texttt{CitationExtractionJob}
Extension extractors run after core analysis and can depend on
core-generated sidecars (e.g., face detection uses object detection
results).
Part 6: Implementation Roadmap (Prioritized)
Week 1-2: Core API Completion (P0)
- Implement missing WASM host functions:
  - vdfs_write_tag()
  - vdfs_write_custom_field()
  - event_subscribe() / event_next()
- Extend VSS schema for extension sidecars
- Update SDK_SPEC.md with the #[extension] rename
Week 3-4: AI Model Loader (P0)
- Design the ModelLoader trait
- Implement local loader (Ollama integration)
- Implement API loader (OpenAI/Anthropic)
- Implement custom model loader (ONNX runtime)
- Add Permission::UseModel enforcement
Week 5-6: Memory System (P1)
- Implement TemporalMemory trait + SQLite backend
- Implement AssociativeMemory trait + VSS backend
- Implement WorkingMemory trait + JSON backend
- Add enum support with .where_variant()
- Add progressive query support
Week 7-8: Extractor System (P1)
- Design the #[extractor] macro
- Modify indexer Analysis Queueing phase
- Extension extractor registration API
- Dependency checking for extractors
- Test with Photos face detection example
Week 9-10: Polish & Examples (P2)
- UI integration system
- Agent trail implementation
- Complete Chronicle extension example
- Complete Photos extension example
- Documentation and tutorials
Part 7: Design Decisions to Make
Decision 1: Extension Job Persistence
Question: Where do extension jobs persist?
Option A: Shared jobs.db (simpler)
-- Single jobs.db with extension_id column
CREATE TABLE jobs (
id TEXT PRIMARY KEY,
extension_id TEXT, -- NULL for core jobs
status TEXT,
state BLOB
);
Option B: Separate per-extension (more isolated)
.sdlibrary/sidecars/extension/{id}/jobs.db
Recommendation: Option A - simpler, unified job monitoring UI.
Decision 2: Model Packaging
Question: How do extensions ship custom models?
Option A: Bundled in WASM
// model.onnx compiled into WASM binary
ctx.ai().from_custom_model(include_bytes!("model.onnx"))
Option B: Downloaded on-demand
// Extension manifest specifies model URL
"models": [{
"name": "face_detection",
"url": "https://models.spacedrive.com/photos/faces-v1.onnx",
"sha256": "abc123..."
}]
Recommendation: Support both - bundled for small models, downloaded for large.
Decision 3: Memory Sync
Question: Should extension memory sync across devices?
Option A: Sync everything
- TemporalMemory syncs via CRDT log
- AssociativeMemory syncs via vector merge
- Heavy bandwidth, full experience
Option B: Device-local only
- Each device has own agent memory
- No sync overhead
- Inconsistent across devices
Option C: Selective sync
#[agent_memory]
struct ChronicleMind {
#[sync(shared)] // Syncs across devices
knowledge: AssociativeMemory<Concept>,
#[sync(device_owned)] // Device-local
plan: WorkingMemory<ResearchPlan>,
}
Recommendation: Option C - let developers choose per-field.
Part 8: Validation Strategy
Test Against Real Core Systems
Create integration tests:
// tests/extension_integration_test.rs
#[tokio::test]
async fn test_extension_reads_core_ocr() -> anyhow::Result<()> {
let core = setup_test_core().await;
let library = core.create_library("test").await?;
// Core indexes a PDF
let pdf = library.add_file("paper.pdf").await?;
// Wait for OcrJob to complete
wait_for_sidecar(&pdf, "ocr").await?;
// Extension reads OCR
let extension_ctx = create_extension_context("chronicle", &library);
let ocr = extension_ctx.read_sidecar(pdf.content_uuid(), "ocr").await?;
assert!(ocr.contains("machine learning"));
Ok(())
}
Map to existing test infrastructure:
- Leverage core/tests/sync_integration_test.rs (1,554 LOC, passing)
- Add extension-specific scenarios
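One extension-specific scenario worth adding, sketched with the same hypothetical test helpers used above plus assumed add_location/grant_scope APIs: verify that a tag write outside the user-granted scope is rejected.
#[tokio::test]
async fn test_out_of_scope_tag_write_is_rejected() -> anyhow::Result<()> {
    let core = setup_test_core().await;
    let library = core.create_library("test").await?;

    // Two locations, but the extension is only granted access to /granted.
    let granted = library.add_location("/granted").await?;
    let denied = library.add_location("/denied").await?;
    let file = library.add_file_at(&denied, "secret.pdf").await?;

    let ctx = create_extension_context("chronicle", &library);
    ctx.grant_scope(&granted).await?;

    // Core's runtime enforcement should refuse the write with OutOfScope.
    let result = ctx.vdfs().add_tag(file.metadata_id(), "#research").await;
    assert!(matches!(result, Err(ExtensionError::OutOfScope { .. })));
    Ok(())
}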
Key Refined Decisions (James + Grok Synthesis)
Confirmed Approaches
-
On-Demand Analysis, Not Automatic
- No #[extractor] hooks that fire on every file
- User-initiated jobs on scoped locations
- Extensions control when/where processing happens
- No
-
Model Registration, Not Raw Loading
- Not ctx.ai().from_custom_model(include_bytes!()), but ctx.models().register() on install + ctx.ai().from_registered()
- Models in root dir (~/.spacedrive/models/), not library
-
User-Scoped Permissions
- Extensions request broad capabilities
- Users grant and scope to specific locations
- Runtime enforcement on every operation
-
UI via Manifest, Not Rust
- Not #[ui_integration(...)] attributes, but ui_manifest.json parsed by frontend
- Cleaner separation, no UI code in WASM
-
Sidecars → Tags Pattern
- Sidecars store detailed results (source of truth)
- Tags generated from sidecars (for search/indexing)
- Regenerable on model upgrades
-
Virtual Entries with Optional Persistence
- Extensions can create virtual entries (emails, notes)
- Database always (for sync and queries)
- Disk optional (user preference)
-
Three Separate Logging Systems
- Agent Trail: Debug/tracing logs (
.sdlibrary/logs/extension/) - Agent Memory: Cognitive state (
.sdlibrary/sidecars/extension/memory/) - VDFS Audit Log: Filesystem mutations (
.sdlibrary/audit.db)
- Agent Trail: Debug/tracing logs (
-
Core Does Generic, Extensions Do Specialized
- Core: OCR for docs, EXIF, thumbnails, embeddings for search
- Extensions: Face detection, receipt parsing, citation extraction
- No automatic face detection on every screenshot
Rejected Approaches
- Automatic #[extractor] hooks - Too automatic, wasteful
- Progressive query modes - Unnecessary complexity
- UI integration in Rust - Keep it in the manifest
- Models in library - Root dir only (no sync)
- Object/face detection in Core - Extension responsibility
Summary: Next Actions
For You (Immediate):
- Create SDK_ENHANCEMENTS.md (this document)
- Update SDK_SPEC.md:
  - Rename #[app] → #[extension]
  - Remove automatic #[extractor] - replace with on-demand jobs
  - Add user-scoped permission model
  - Add virtual entry persistence
  - Add sidecar → tags pattern
  - Add UI manifest approach
- Update whitepaper Section 6 (multi-agent architecture)
For Core Development (3-4 Weeks):
- Week 1-2: Complete WASM host functions (tags, events, custom fields)
- Week 3-4: Build model loader system
- Week 5-6: Implement memory trait foundations
- Week 7-8: Add extractor hook system
Key Principle:
Core does the expensive, generic work once. Extensions consume and specialize.
This keeps Core lean (87% → 100%) while enabling infinite adaptability through extensions. Photos, CRM, and Chronicle all leverage the same OCR/embeddings but create completely different experiences.
Ready to implement. This is your November 2025 alpha roadmap.