May 25, 2026

Why I rebuilt my whole assistant stack on AWS Bedrock

Notes from a year of Betty / OpenClaw — multi-modal personal AI, fully self-hosted.
AI Engineering
By CLIMB IT Solutions, Inc., Manuel Bautista

For about a year I bounced between ChatGPT, Claude.ai, and the various AI features inside Cursor. They were all good. None of them were mine.

That's the gap I built Betty — my agent on top of OpenClaw — to fill.

What I actually wanted

Three things that none of the SaaS products gave me at once:

  1. Persistent memory across sessions and modalities. When I tell my assistant something about a client on a Tuesday call, I want her to remember it on a Thursday Slack thread. The shared-state problem is solvable, but no SaaS product solves it the way I want — they default to per-context isolation for understandable privacy reasons.

  2. Real tool use, not "agent" theatre. I wanted an assistant that could actually read my Odoo CRM, schedule into my Outlook, search my Notion notebooks, and pull from my Slack history. Not a vendor's curated approximation. Direct API integrations to my systems, owned by me.

  3. Ingress wherever I am. I'm rarely at a desk. The assistant needs to take input from Telegram, WhatsApp, Slack, and email — and respond in the same channel. The "chat UI" of the major AI products is the wrong shape for how I work.

What I built

The simplest possible architecture that solves the three problems above.

  • Compute runs on a small DigitalOcean droplet — the orchestrator, ingress handlers, and tool-call routing. I deliberately did not put any model weights here.
  • Models run through AWS Bedrock — Claude Sonnet 4.6 as the default — with OpenRouter (Sonnet again) as the fallback if Bedrock has an issue. The Bedrock account gives me a single billing/audit surface and lets me swap models without touching the orchestrator. Embeddings are OpenAI's text-embedding-3-small via OpenRouter.
  • Storage is plain files on the droplet for the workspace, plus the memory store below.
  • Memory is SQLite with sqlite-vec — hybrid retrieval that blends BM25 text search (30%) with vector similarity (70%) over a markdown corpus (long-term MEMORY.md, daily notes, and a hand-curated persistent-notes.md). The retrieval pipeline is dumb on purpose. Anything fancier I tried hurt more than it helped.
  • Tools are a set of MCP-style adapters into MS365 (Graph), Odoo (XML-RPC), Notion, GitHub, and Slack. Each tool is a thin wrapper around an existing API with its own auth and its own audit log.
  • Ingress is Telegram (primary), WhatsApp (via wacli on the droplet), and Slack (workspace bot). Each ingress sends to the same orchestrator endpoint.
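The default-plus-fallback model layer above can be sketched as an ordered chain of providers. This is a minimal illustration, not Betty's actual code: `complete_with_fallback`, `AllProvidersFailed`, and the two stub functions are hypothetical names, and the real wrappers around Bedrock and OpenRouter are left out so the example runs without credentials.

```python
from typing import Callable, Sequence

# A provider is a label plus a function that turns a prompt into a reply.
Provider = tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain raised an exception."""


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure moves to the next provider
            errors.append(f"{name}: {exc!r}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for the real Bedrock and OpenRouter clients.
def bedrock_stub(prompt: str) -> str:
    raise TimeoutError("simulated Bedrock outage")


def openrouter_stub(prompt: str) -> str:
    return f"echo: {prompt}"


CHAIN = [("bedrock", bedrock_stub), ("openrouter", openrouter_stub)]
```

Calling `complete_with_fallback("hi", CHAIN)` skips the failing first stub and returns the second stub's reply along with its name. Because the orchestrator only sees the chain, swapping or reordering models is a change to one list.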

What I learned

Self-hosting beats SaaS for any AI you use heavily. Not for the cost — Bedrock isn't cheaper than a ChatGPT subscription at my volume — but for the fit. I can change anything about how my assistant works without waiting for a vendor.

Persistent memory is the killer feature. Once your assistant actually remembers what you told it three weeks ago, it stops being a search engine and starts being staff.
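In practice, "remembering" here means retrieval, and the 30/70 blend from the architecture section is simple enough to sketch. The weights come from the post; everything else is an assumption for illustration, including the min-max normalization, the function names, and the idea that both inputs arrive as "higher is better" scores keyed by document id. (SQLite FTS5's `bm25()` returns values where lower is better, so those would need negating first.)

```python
TEXT_WEIGHT = 0.3    # BM25 share, as stated in the post
VECTOR_WEIGHT = 0.7  # vector-similarity share, as stated in the post


def min_max(scores):
    """Rescale a {doc_id: score} map to [0, 1] so the two signals are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(bm25_scores, vector_scores, top_k=5):
    """Blend normalized text and vector scores; a doc missing from one side scores 0 there."""
    text = min_max(bm25_scores)
    vec = min_max(vector_scores)
    blended = {
        doc: TEXT_WEIGHT * text.get(doc, 0.0) + VECTOR_WEIGHT * vec.get(doc, 0.0)
        for doc in set(text) | set(vec)
    }
    # Highest blended score first, truncated to the top_k results.
    return sorted(blended.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```

A document that matches strongly on both signals ranks first, and one that only appears in the vector results can still outrank a weak keyword hit because of the heavier vector weight.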

Tool calls are mostly boring API work. The hype around "agents" obscures a simple truth: 80% of the value comes from connecting an LLM to the three or four systems where your data already lives. I'd take that any day over a smarter model with no hands.
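To show how boring a "thin wrapper with its own audit log" can be, here is a minimal sketch under stated assumptions. `AuditedTool`, the JSONL log format, and `fake_crm_search` are all hypothetical; the stub stands in for a real Odoo XML-RPC call and returns canned data.

```python
import json
import time
from pathlib import Path
from typing import Any, Callable


class AuditedTool:
    """Wrap one API call so every invocation is appended to a per-tool JSONL audit log."""

    def __init__(self, name: str, call: Callable[..., Any], log_dir: Path):
        self.name = name
        self._call = call
        self._log_path = Path(log_dir) / f"{name}.audit.jsonl"
        self._log_path.parent.mkdir(parents=True, exist_ok=True)

    def __call__(self, **kwargs: Any) -> Any:
        entry = {"tool": self.name, "args": kwargs, "ts": time.time()}
        try:
            result = self._call(**kwargs)
            entry["ok"] = True
            return result
        except Exception as exc:
            entry["ok"] = False
            entry["error"] = repr(exc)
            raise
        finally:
            # Written on success and failure alike, one JSON object per line.
            with self._log_path.open("a", encoding="utf-8") as fh:
                fh.write(json.dumps(entry, default=str) + "\n")


def fake_crm_search(query: str) -> list[str]:
    # Hypothetical stand-in for a real CRM lookup.
    return [f"lead matching {query!r}"]
```

Usage would look like `crm = AuditedTool("odoo_search", fake_crm_search, Path("audit"))` followed by `crm(query="acme")`. A real adapter would also need to keep secrets out of the logged arguments.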

There's a longer technical writeup of the architecture coming. If you're building something similar, I'd like to hear about it.

— Manuel
