The Structured-First Executive Assistant

Cortex stores calendar events, tasks, projects, and reminders as typed Postgres rows. When the AI reasons about my day, it gets a structured JSON payload assembled from those rows: what's due, what's overdue, where my free time is, what I've been neglecting. The model's job is to say something useful about that context. Not to reconstruct it from chat history.

What it does

Cortex is a web-based personal executive assistant. It syncs and manages my Google Calendar, handles tasks and projects natively, generates a daily plan every morning, runs proactive background checks throughout the day, and sends Discord notifications for things worth interrupting me for. It learns habits over time and adjusts accordingly.

The key word is proactive. Most of what it does happens without me asking. Background jobs run on schedule, detect notable states, and decide whether to surface something or stay quiet.

Structure first, AI second

Calendar events, tasks, projects, reminders: these live in Postgres as properly typed rows. When the AI reasons about my day, it isn't reading raw text from a notes file. It gets a structured JSON payload assembled from the database:

{
  "date": "2026-04-20",
  "calendar": [...],
  "free_blocks": [...],
  "tasks": [...],
  "projects": [...],
  "tendencies": [...]
}

The AI's job is to say something useful about that structure. Not to guess what structure might exist.

This matters for quality. When I ask "what should I work on today?", a chatbot guesses based on the last few messages. Cortex assembles my actual state: what's due, what's overdue, where my free time is, what my calendar looks like, what I've been neglecting. It hands all of that to the model. The model's response quality is bounded by the quality of its inputs.

It also matters for reliability. A system that stores tasks and events properly can answer questions about them correctly, every time. A system that relies on the AI to remember what tasks you mentioned in chat will eventually forget or confuse them.

Four AI roles

The AI layer is split into four logical roles rather than one monolithic assistant.

Planner runs once a day: daily and weekly planning, task prioritization, scheduling suggestions. Premium model, slow call, high quality output.

Monitor runs every 30-60 minutes during active hours: background evaluation, detecting risk states, deciding whether to notify. Cheap model, because most runs should produce nothing. It only escalates to a real LLM call when rule-based detectors first find something worth surfacing.

Memory curator runs nightly: extracts stable patterns from repeated observations. It's responsible for turning "user moved the suggested block to morning three times" into a formal tendency that affects future planning.

Chat assistant runs on demand: answers questions and takes actions. It has the full tool set.

Each role gets scoped tool access. The monitor can only read and propose notifications. The planner can read and propose schedule blocks. The chat assistant can read, write, and act. Scoping tool access per role reduces the chance that a proactive wake-up accidentally does something I didn't ask for.

The think → act loop

A single request like "clean up my afternoon" might require several steps: read calendar, identify low-priority events, check task urgency, find free windows, propose a rearrangement. That's not one LLM call.

Cortex uses a think → act loop. The model gets called with its tool set. If it returns tool use blocks, those tools execute, results come back, the loop continues. When the model produces a final text response, the loop exits.

Turn cap: every run has a maxTurns limit (25 for chat, 15 for proactive wake-ups). Exceeding it aborts cleanly and logs the outcome.
Time cap: wall-clock timeout enforced at the orchestration layer, not just per LLM call.
Parallel tool calls: read tools execute concurrently. Writes serialize.
Loop detection: if the same tool fires with the same input repeatedly, the system injects a warning into the next tool result, then hard-stops at a threshold.
Transactionality: actions with real consequences, like sending a Discord message or moving a Google Calendar event, queue as proposals during the loop and only commit after it finishes successfully. Unless I've whitelisted that tool category as auto-act.

The proactive wake-ups use the same loop as chat. The only differences are the trigger source and which tools are in scope.

Spam vs. usefulness

The assistant that notifies me about everything is worse than the one that says nothing. Every unnecessary notification makes the next one less likely to be read.

Cortex uses deterministic rule detectors before any LLM calls on proactive runs. Examples:

Task due within 24h, unscheduled, estimated duration greater than available free time: flag deadline risk
Meeting in 2h, marked important, no prep block: flag prep reminder
Three overdue tasks, no focus block today: flag backlog problem

Only when a rule fires does the system spend tokens on a real LLM call. The LLM's job on a proactive run is to produce human-readable reasoning for something that already passed a rule check. Not to invent reasons to interrupt me.

The monitor also has explicit cooldowns, importance scores, and per-channel rate limits. The morning daily plan goes to Discord once. A deadline risk that fired yesterday doesn't fire again unless the situation got worse.

Learning that doesn't trust too fast

Cortex learns habits: preferred work times, task duration accuracy, how I respond to scheduling suggestions. But inferred tendencies require evidence before they affect behavior.

The threshold: at least three similar observations over 14+ days before a pattern is promoted. A single data point doesn't change the model. Repeated ones do.

I can see and edit everything the system has learned. Every inferred tendency is visible. Any tendency can be deleted, corrected, or promoted to an explicit preference. The system is transparent about what it thinks it knows.

The design came from a real concern: a system that confidently adapts to wrong inferences will erode trust fast. Conservative learning and visible memory are both answers to the same problem: the system should earn trust over time, not assume it from the start.

What it's not

Cortex is not a raw chatbot with a calendar API strapped to it. It's not an autonomous agent that reorganizes my life without asking. In V1 it doesn't send emails, take over my browser, or make decisions that can't be reviewed.

What it is: a system that combines structured data, proactive orchestration, memory, and careful action-taking. Any single piece of it (a chatbot, a task manager, a calendar app) is a solved problem. Connecting them, making them proactive, and making the AI layer understand the structure underneath: that's the actual product.