Public alpha — AgentRobbi is in active development. v3.0.0-alpha.5 is stable for daily use; expect rapid iteration.
LOCAL-FIRST AI WORKSTATION · WINDOWS

Your personal AI,
running on your machine.

AgentRobbi is a complete AI workstation for Windows. One installer. CPU, GPU, and Cloud in a single binary. Multi-GPU offload, agentic runners, encrypted vault, and a 50+ integration catalogue — all without sending your data anywhere.

Local by default. Cloud is optional and BYO key. One-time licence. Optional $5/mo Updates Pass.
100% Local processing
3 Built-in runners
50+ Integrations
Built for power users

Everything you need to run AI locally — and nothing you don't.

One Rust binary, one installer. No Python. No Node.js on the host. No Docker. No WSL.

🖥️

CPU, GPU & Cloud

Switch with one click. CPU works on any Windows box. CUDA is an optional ~600 MB download triggered by the UI when an NVIDIA card is detected. Cloud is BYO key — OpenRouter, Groq, Cerebras, OpenAI, Kimi, or any custom OpenAI-compatible endpoint.

Multi-GPU offload

Automatic VRAM pooling across every CUDA device, or pin the work to one GPU, or hand-tune the tensor split. The plan_n_gpu_layers heuristic uses your total VRAM (e.g. 24 GB + 8 GB = 32 GB) so 27B IQ4 weights fit on rigs neither card could handle alone.
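In sketch form, the pooling idea looks like this. Everything below is illustrative, not the shipped implementation: the per-layer sizes, the reserve, and the exact arithmetic inside the real plan_n_gpu_layers are assumptions; only the "sum the VRAM, fit as many layers as possible" idea comes from the description above.

```python
# Hedged sketch of a VRAM-pooling layer planner, in the spirit of the
# plan_n_gpu_layers heuristic described above. Numbers are illustrative.

def plan_n_gpu_layers(vram_per_gpu_gb, model_layers, gb_per_layer, reserve_gb=1.0):
    """Offload as many layers as the pooled VRAM can hold."""
    # Pool VRAM across every detected CUDA device (e.g. 24 + 8 = 32 GB),
    # keeping a small reserve per card for KV cache and scratch buffers.
    pooled = sum(max(v - reserve_gb, 0.0) for v in vram_per_gpu_gb)
    fit = int(pooled // gb_per_layer)
    return min(fit, model_layers)

# A hypothetical 27B IQ4 model (~62 layers at ~0.45 GB each) on a 24 GB + 8 GB rig:
print(plan_n_gpu_layers([24, 8], model_layers=62, gb_per_layer=0.45))  # all 62 layers fit
```

The point of the example: neither card alone clears 28 GB of weights, but the pooled plan does.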

🤖

Agentic runners

Three built-in personas — Research, Coding, and Writing — each with focused prompts, tool catalogues, and workflows. Drop your own .md runners into Documents/AgentRobbi/runners/ and they show up instantly.

🧠

Persistent memory

SQLite + FTS5 conversation store, full-text searchable across every chat. Local. No cloud sync. No telemetry. The DB lives at Documents/AgentRobbi/memory.db — open it in any SQLite browser if you want.
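Because it's plain SQLite + FTS5, the search pattern is standard. The real memory.db schema is AgentRobbi's own; the table name and columns below are assumed for illustration.

```python
import sqlite3

# Hedged sketch of the SQLite + FTS5 pattern the store is built on.
# Table name and columns are assumptions, not the real memory.db schema.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE messages USING fts5(role, content)")
db.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [("user", "how do I pool VRAM across two GPUs?"),
     ("assistant", "enable the Auto strategy in Settings")],
)
# Full-text search across every stored chat turn, ranked by relevance.
rows = db.execute(
    "SELECT role, content FROM messages WHERE messages MATCH ? ORDER BY rank",
    ("VRAM",),
).fetchall()
print(rows)
```

Point the same query at Documents/AgentRobbi/memory.db in any SQLite browser and you're searching your real history.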

🔐

Encrypted vault

API keys, OAuth tokens, and secrets live in vault.enc — PBKDF2-SHA256 key derivation plus AES-256-GCM encryption. Unlocked with your master password on first use; never written to disk in plaintext.
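The key-derivation half of that scheme can be sketched with the standard library. Only the algorithm names (PBKDF2-SHA256 feeding a 256-bit AES-GCM key) come from the description above; the iteration count and salt size are illustrative assumptions.

```python
import hashlib, os

# Hedged sketch of the vault's key-derivation step. Iteration count and
# salt size are assumptions; the derived 32-byte key is what AES-256-GCM
# would then encrypt the vault contents with.
def derive_vault_key(master_password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    # dklen=32 -> a 256-bit key, the size AES-256-GCM expects.
    return hashlib.pbkdf2_hmac("sha256", master_password.encode(), salt, iterations, dklen=32)

salt = os.urandom(16)
key = derive_vault_key("correct horse battery staple", salt)
print(len(key))  # 32
```

The slow, salted derivation is the point: a stolen vault.enc can't be brute-forced cheaply, and the same password always reproduces the same key for decryption.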

🔌

50+ integrations

A catalogue of SaaS connectors — Gmail, GitHub, Linear, HubSpot, Stripe, Zendesk, Slack, Notion, Cloudflare, AWS, and more — dispatched through a single integration_action tool. Dry-run by default; opt in per action.

📡

Curl-able everything

Every capability is an HTTP endpoint on 127.0.0.1:11434. If the UI breaks, you can still chat from PowerShell. A single OpenAI-compatible /v1/chat/completions endpoint plus 50+ admin and tool endpoints.
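Any HTTP client works. Here's a minimal Python sketch: the host, port, and endpoint path come from the description above; the payload fields are the standard OpenAI chat-completions shape, assumed to apply here, and the model name is a placeholder.

```python
import json
from urllib import request

# Hedged sketch of a local chat call. "local" is a placeholder model name.
def chat_payload(prompt: str, model: str = "local") -> bytes:
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode()

def chat(prompt: str) -> str:
    req = request.Request(
        "http://127.0.0.1:11434/v1/chat/completions",
        data=chat_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    # Requires AgentRobbi running locally.
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the shape is OpenAI-compatible, existing SDKs and curl one-liners pointed at 127.0.0.1:11434 should also work unchanged.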

🎨

Browser-as-window

AgentRobbi.exe launches Edge or Chrome in --app= mode — a borderless, native-feeling window without bundling a 200 MB browser. The UI is plain HTML + ~6 K lines of vanilla JS — no React, no bundler, no build step.

Step by step

From download to first chat in under 5 minutes.

The AgentRobbi installer is one Windows .exe with no admin prompt. Per-user install. Uninstall = delete two folders.

1

Download & install

Run AgentRobbi-Setup-3.0.0-alpha.5.exe. NSIS handles the rest — no admin rights required for the per-user install. Total install size: ~30 MB before first model download.

AgentRobbi installer running with progress bar — no admin prompt
2

Onboarding

AgentRobbi detects your CPU, RAM, and every GPU you have. It picks a model that fits your hardware and downloads the GGUF weights from HuggingFace in the background.

Hardware detection panel showing CPU, RAM, multi-GPU pooling, and recommended model download
3

Pick a runner

Research, Coding, or Writing — each with focused prompts, tools, and workspace conventions. Switch on the fly. Drop your own runners into the folder and they show up instantly.

Runner picker with Research selected; Coding and Writing tiles alongside
4

Chat & iterate

Streaming responses, tool calls inline, full chat history searchable across sessions. Switch CPU ↔ GPU ↔ Cloud per turn or globally. Multi-GPU pooling happens automatically.

Chat view streaming a response with CPU/GPU/Cloud toggle and inline tool call card
System requirements

Runs on the laptop you already have.

Minimum (CPU only)

  • Windows 10 or 11, 64-bit
  • 8 GB RAM
  • 5 GB free disk + ~2 GB per model
  • Internet for first model download

Recommended (single GPU)

  • Windows 11, 64-bit
  • 16–32 GB RAM
  • NVIDIA RTX with 8 GB+ VRAM
  • SSD with 10 GB+ free

Power user (multi-GPU)

  • Windows 11, 64-bit
  • 32 GB+ RAM
  • Two NVIDIA cards (e.g. 24 GB + 8 GB) or larger
  • Auto, Single, or custom tensor-split

Not required

  • No Docker, WSL, or Python
  • No Node.js on the host
  • No admin rights for install
  • No accounts. No telemetry.
Simple pricing

One-time licence. Optional Updates Pass.

You buy AgentRobbi once. The Updates Pass is optional and gives you new model catalogues and feature drops as they ship. Cancel anytime — your install keeps working.

+ Updates Pass

Only needed if you want ongoing model catalogue refreshes and new features.

$5 / month USD

+ tax. Cancel anytime — your install keeps working with the catalogue you had at cancellation.

  • Monthly model catalogue refresh — new GGUF weights as they're tested
  • New agentic runners as they ship
  • New integrations as they're added
  • Feature releases & new tools
  • Priority email support
  • Stacks with the $49 software licence

Payments processed by Stripe. Card details never touch our servers. Receipts emailed instantly. Sales tax / GST calculated by Stripe at checkout.

FAQ

The questions everybody asks.

Is my data private? Does AgentRobbi phone home?

Yes, private. No, it doesn't. The local engine (CPU or GPU) runs entirely on your machine. Conversations, memory, and vault all live under %USERPROFILE%\Documents\AgentRobbi\ and never leave your computer.

The only network calls AgentRobbi makes are: (1) the first-run model download from HuggingFace, (2) the catalogue refresh from marcgough.github.io (gated by Updates Pass), (3) the licence check (once on launch + once every 24 h with a 7-day offline grace period), and (4) any cloud provider you explicitly configured with your own API key.

What's the difference between Local, Cloud, and Auto modes?

Local runs the model on your CPU or GPU via the bundled llama-server. Cloud bypasses local entirely and routes to your configured provider (OpenRouter / Groq / Cerebras / OpenAI / Kimi / custom). Auto picks GPU if the CUDA runtime is installed, otherwise CPU, otherwise Cloud as a last resort.

You can override per-turn from the chat UI without changing the global setting.
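The Auto fallback order reduces to a few lines. The predicate names below are illustrative, not AgentRobbi's real internals; the ordering is the part taken from the answer above.

```python
# Hedged sketch of the Auto backend fallback: GPU if the CUDA runtime is
# installed, else CPU, else Cloud. Predicate names are illustrative.
def pick_backend(cuda_runtime_installed: bool, cpu_model_available: bool,
                 cloud_key_configured: bool) -> str:
    if cuda_runtime_installed:
        return "gpu"
    if cpu_model_available:
        return "cpu"
    if cloud_key_configured:
        return "cloud"
    raise RuntimeError("no usable backend")
```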

How does multi-GPU work?

Three strategies in Settings:

  • Auto (default) — llama.cpp pools VRAM across every detected CUDA device using its layer-wise tensor split, proportional to free VRAM. On a 3090 (24 GB) + RTX 4000 (8 GB) rig you get a 32 GB pool.
  • Single — pin the work to one GPU. Emits both --main-gpu N and a zero-weighted --tensor-split mask so the other devices are excluded from offload entirely.
  • Custom — supply a comma-separated weight string yourself (e.g. 0.75,0.25) for hand-tuned splits.
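The three strategies boil down to different --tensor-split arguments. The --main-gpu and --tensor-split flags are real llama.cpp server options named in the bullets above; the function, its signature, and the exact weight strings it emits are illustrative assumptions.

```python
# Hedged sketch mapping the three Settings strategies to llama-server
# flags. Only --main-gpu / --tensor-split come from the docs above.
def tensor_split_args(strategy: str, free_vram_gb, main_gpu: int = 0,
                      custom: str = "") -> list:
    if strategy == "auto":
        # Weights proportional to free VRAM, e.g. [24, 8] -> "24,8".
        return ["--tensor-split", ",".join(str(v) for v in free_vram_gb)]
    if strategy == "single":
        # Pin to one device; zero-weight the rest so they get no layers.
        mask = ["1" if i == main_gpu else "0" for i in range(len(free_vram_gb))]
        return ["--main-gpu", str(main_gpu), "--tensor-split", ",".join(mask)]
    if strategy == "custom":
        return ["--tensor-split", custom]  # e.g. "0.75,0.25"
    raise ValueError(strategy)
```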
Why $49 once and $5/month? Why not just $49 forever?

The $49 software licence is yours forever — buy it once, run it forever, never pay another cent. The model catalogue and the runners that drive AgentRobbi, however, evolve fast. New 7B / 14B / 27B models drop monthly, llama.cpp gets faster, integrations get added.

The Updates Pass funds that ongoing work. If you're happy with the catalogue you have today, you don't need it. If you want the latest weights and features as they ship, $5/month gets you everything new, and you can cancel anytime — your install keeps working with the catalogue you had at cancellation.

What's your refund policy?

14 days, no questions asked. Email support@agentrobbi.com within 14 days of purchase and we'll refund you in full. Your licence is revoked at the same time and the app stops working on the next licence check (within 24 h, or immediately if you're online).

How do I uninstall?

Apps & Features → AgentRobbi → Uninstall. That removes the binary at %LOCALAPPDATA%\Programs\AgentRobbi\.

Your data stays at %USERPROFILE%\Documents\AgentRobbi\ (models, conversations, vault, settings) so you can come back to it. Delete that folder manually for a fully clean slate.

Is it really local? What about the licence check?

Yes. The licence check is a single HTTPS POST to agentrobbi.com/api/license/check with your licence token (an ARBI-… string), once on launch and once every 24 h. It returns {"valid": true|false} plus your plan and Updates-Pass status. No telemetry, no model usage data, no chat content — just "is this licence still valid?".

If your network is offline, the cached "last good" check is honoured for 7 days before the app gates. Privacy-conscious users who block the licence endpoint at the firewall get a 7-day grace period every time.
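The grace-period logic described above fits in a few lines. Field names and the function shape are illustrative; the 7-day window and "honour the cached last good check" behaviour are the parts taken from the answer.

```python
from datetime import datetime, timedelta

# Hedged sketch of the offline grace window: trust the server when it
# answers, otherwise honour the cached "last good" check for 7 days.
GRACE = timedelta(days=7)

def licence_ok(last_good_check: datetime, now: datetime, online_result) -> bool:
    if online_result is not None:          # server reachable: trust its answer
        return online_result
    return now - last_good_check <= GRACE  # offline: 7-day grace window
```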

What models can I run?

Any GGUF-format model that llama.cpp supports — Qwen, Llama, Gemma, Mistral, Phi, DeepSeek, Yi, and more. The catalogue ships hand-curated picks for each hardware tier (4 GB → 8 GB → 12 GB → 24 GB+ VRAM); you can also drop arbitrary .gguf files into Documents/AgentRobbi/models/ and they'll appear in the picker.

Mac and Linux support?

AgentRobbi v3 is Windows-first. Mac and Linux builds are deferred — the binary itself is portable Rust + portable llama.cpp, but the installer, multi-GPU detection, and the browser-launch UX are Windows-shaped today. They'll come.

Is it open source?

No. AgentRobbi is proprietary commercial software (see the licence). It statically links a number of open-source Rust crates and llama.cpp; full attribution is in third-party licences.

There are no GPL, LGPL, MPL, AGPL, CDDL, or SSPL components in the shipped Windows build.

What changed from RobiClaw?

v3 is a ground-up rewrite. RobiClaw was a Tauri + React + Node app driving llamafile; AgentRobbi is a single Rust binary driving llama.cpp directly with an embedded HTML UI — no Tauri, no React, no Node.js on the host. Smaller installer, faster cold start, multi-GPU support, agentic tool-calling, encrypted vault, integration catalogue.

The brand was renamed during alpha.5 — same author (marcgough), same broad mission, but a much faster and more capable engine.