
AEGIS

provenance:github:MihneaTeodorStoica/AEGIS
WHAT THIS AGENT DOES

AEGIS is an assistant that runs locally on your computer and accepts voice or text input. It can perform desktop tasks such as moving windows, taking screenshots, and automating repetitive actions, with each action checked for risk and approved where required. It is aimed at professionals who need to automate workflows, manage multiple applications, or quickly find information on their computer.

README
# AEGIS

AEGIS is a local CLI-first AI desktop agent for Python 3.12.3.

It is intentionally not a web app. The core design is:

- Assistant: voice-facing/text-facing conversational agent
- Agent: heavy reasoning and guarded desktop execution agent
- MCP server: local typed tool surface for the computer
- Policy gate: local risk classification and approval enforcement
- Orchestrator: durable task IDs, status, logs, approvals, and pause-resume

## Architecture

```text
User
  -> aegis CLI
    -> Assistant layer (voice/text)
      -> explicit handoff request
        -> Orchestrator
          -> Agent runtime
            -> local MCP server
              -> policy gate
                -> local computer tools
```

Key implementation choices:

- Assistant uses the OpenAI Agents SDK as the conversational manager.
- Voice mode uses the Realtime agent path (`RealtimeRunner` + `RealtimeAgent`).
- Agent uses the OpenAI Agents SDK with a local `MCPServerStdio` subprocess.
- MCP tool calls are policy-gated locally before execution.
- Approval pauses are durable through the Agents SDK `RunState` snapshot.
- Task, approval, and event history is persisted in SQLite under `AEGIS_HOME`.

## Features

- `aegis chat` for Assistant text sessions
- `aegis` as the default terminal chat entrypoint
- readline-backed chat history, arrow-key recall, `Ctrl+R` history search, and tab-completed slash commands
- `aegis voice` for realtime Assistant voice sessions
- `aegis run "..."` for direct Agent tasks
- `aegis approvals`, `aegis approve`, `aegis deny`
- `aegis status`, `aegis logs`, `aegis cancel`
- `aegis mcp` to run the local MCP server
- `aegis config` to inspect or validate runtime configuration

## Tool Surface

AEGIS MCP v1 tools:

- `get_screenshot`
- `get_active_window_screenshot`
- `get_active_window`
- `get_window_screenshot`
- `get_window_region_screenshot`
- `list_open_windows`
- `find_windows`
- `get_display_layout`
- `get_desktop_snapshot`
- `run_safe_action_chain`
- `run_interactive_action_chain`
- `move_mouse`
- `move_mouse_relative`
- `move_mouse_to_window`
- `click_mouse`
- `click_in_window`
- `focus_text_input_in_window`
- `double_click`
- `double_click_in_window`
- `right_click`
- `right_click_in_window`
- `mouse_down`
- `mouse_up`
- `drag_mouse_to`
- `drag_mouse_relative`
- `drag_in_window`
- `drag_scrollbar_in_window`
- `scroll`
- `horizontal_scroll`
- `scroll_in_window`
- `page_scroll_in_window`
- `type_text`
- `type_text_in_window`
- `clear_and_type_in_window`
- `press_key`
- `press_key_in_window`
- `press_key_sequence`
- `press_key_sequence_in_window`
- `key_down`
- `key_up`
- `hotkey`
- `hotkey_in_window`
- `launch_app`
- `open_url`
- `search_browser`
- `wait`
- `move_window`
- `resize_window`
- `snap_window`
- `maximize_window`
- `minimize_window`
- `restore_window`
- `run_shell_command_guarded`
- `finish_task`
- `get_mouse_position`
- `focus_window`
- `window_bounds`
- `read_clipboard_metadata`
- `read_file_metadata`

## Policy Model

Every MCP action passes through a local policy gate.

- Low risk: auto-allowed
- Medium risk: approval-gated
- High risk: approval-gated or blocked

AEGIS also supports an explicit full-access mode:

- `AEGIS_ACCESS_MODE=full`
- `aegis --full-access`
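
The risk tiers and access mode above can be pictured as a small decision function. This is a sketch under stated assumptions, not the project's policy code: the `Risk` and `Decision` enums, the `decide` function, the `hard_blocked` flag, and the `"guarded"` name for the non-full mode are all hypothetical. It also assumes that hard-blocked high-risk actions stay blocked in `full` mode, which the README does not confirm.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    BLOCK = "block"


def decide(risk: Risk, access_mode: str = "guarded", hard_blocked: bool = False) -> Decision:
    """Map a classified action to a gate decision."""
    # Assumption: some high-risk actions are blocked outright, in every mode.
    if risk is Risk.HIGH and hard_blocked:
        return Decision.BLOCK
    # Full-access mode bypasses approvals.
    if access_mode == "full":
        return Decision.ALLOW
    # Low-risk actions are auto-allowed; everything else waits for approval.
    if risk is Risk.LOW:
        return Decision.ALLOW
    return Decision.NEEDS_APPROVAL
```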

The guarded shell tool is intentionally narrow:

- read-only diagnostics like `neofetch` are allowed with approval
- destructive, publishing, install, or shell-composition patterns are blocked

Examples of blocked shell patterns:

- `rm`, `mv`, `cp`, `dd`
- `sudo`
- package installs/removals
- `git push`
- `curl`/`wget`
- shell chaining or redirection
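
A rough sketch of how such a block list might be checked is shown below. It is illustrative only: the function name `is_blocked`, the set of package managers, and the choice of shell metacharacters are assumptions, and the real guard may classify commands differently. The sketch is deliberately conservative, so it rejects every invocation of a listed package manager, not just installs and removals.

```python
import os
import shlex

BLOCKED_COMMANDS = {"rm", "mv", "cp", "dd", "sudo", "curl", "wget"}
BLOCKED_SUBCOMMANDS = {("git", "push")}
# Assumption: an illustrative, incomplete list of package managers.
PACKAGE_MANAGERS = {"apt", "apt-get", "dnf", "pacman", "pip", "npm"}
# Characters that enable chaining, redirection, or substitution.
SHELL_META = set("|&;<>`$")


def is_blocked(command: str) -> bool:
    """Return True if the command matches a blocked pattern."""
    if any(ch in SHELL_META for ch in command):
        return True
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unbalanced quotes: refuse rather than guess
    if not tokens:
        return True
    head = os.path.basename(tokens[0])  # treat /bin/rm like rm
    if head in BLOCKED_COMMANDS or head in PACKAGE_MANAGERS:
        return True
    return (head, *tokens[1:2]) in BLOCKED_SUBCOMMANDS
```

For example, `is_blocked("neofetch")` is false, while `is_blocked("git push origin main")` and `is_blocked("ls | wc -l")` are both true.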

In `full` access mode, approvals are bypassed for the exposed local tools, including the guarded shell tool. Use that mode only when you want unrestricted local execution.

## Observation Strategy

AEGIS now uses a cheap observation-first loop for the Agent:

- fresh tasks start with an initial structured desktop snapshot
- most action tools automatically return a lightweight post-action `desktop_snapshot`
- screenshots remain opt-in for cases where layout or text meaning is ambiguous

This is meant to approximate an `observe -> act -> observe` loop without paying screenshot-token costs every turn.
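
The loop can be sketched as follows. The function `run_loop` and its three callables are hypothetical stand-ins for the Agent runtime; the point is only that each action returns a fresh lightweight snapshot, so no separate screenshot call is needed between steps.

```python
from typing import Callable, Optional

Snapshot = dict
Action = dict


def run_loop(
    observe: Callable[[], Snapshot],
    choose_action: Callable[[Snapshot], Optional[Action]],
    act: Callable[[Action], Snapshot],
    max_turns: int = 5,
) -> list[tuple[Action, Snapshot]]:
    """Observe once, then alternate act -> observe using post-action snapshots."""
    history: list[tuple[Action, Snapshot]] = []
    snapshot = observe()  # initial structured desktop snapshot
    for _ in range(max_turns):
        action = choose_action(snapshot)
        if action is None:  # nothing left to do
            break
        snapshot = act(action)  # the action returns the post-action snapshot
        history.append((action, snapshot))
    return history
```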

Cost controls:

- `AEGIS_AUTO_OBSERVE_AFTER_ACTIONS=true|false`
- `AEGIS_AUTO_OBSERVE_WINDOW_LIMIT=<n>`
- `AEGIS_ACTION_CHAIN_MAX_STEPS=<n>`
- `AEGIS_MCP_CLIENT_TIMEOUT_SECONDS=<seconds>`
- `AEGIS_WINDOW_WAIT_TIMEOUT_SECONDS=<seconds>`

Recommended defaults:

- keep auto-observe enabled
- keep the window limit low, usually `3` to `5`
- keep action chains short, usually `3` to `6` steps
- rely on `get_desktop_snapshot` for cheap refreshes
- request window or region screenshots before falling back to a full-screen screenshot
- increase the MCP/client timeout if desktop apps are slow to launch on your machine

Action-chain guidance:

- default to `run_safe_action_chain` or `run_interactive_action_chain` when the next `2` to `5` steps are already known
- use `run_safe_action_chain` for deterministic low-risk batches like `launch_app -> wait_for_window -> focus_window -> snap_window`
- use `run_interactive_action_chain` for deterministic UI batches like `focus_window -> focus_text_input_in_window -> clear_and_type_in_window -> press_key_in_window`
- do not chain workflows that require reading or interpreting the screen after each step
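
To make the guidance concrete, here is a sketch of assembling the interactive chain mentioned above and enforcing a step limit before submission. The tool names are taken from the tool list, but the payload shape, the `Step` class, the `build_chain` helper, and all argument names (`title`, `text`, `key`) are hypothetical; the real `run_interactive_action_chain` schema may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    tool: str
    args: dict = field(default_factory=dict)


def build_chain(steps: list[Step], max_steps: int = 5) -> list[dict]:
    """Serialize a chain, refusing batches longer than the configured limit."""
    if not steps:
        raise ValueError("a chain needs at least one step")
    if len(steps) > max_steps:
        raise ValueError(f"chain has {len(steps)} steps, limit is {max_steps}")
    return [{"tool": s.tool, "args": s.args} for s in steps]


# Hypothetical arguments; only the tool names come from the tool list.
chain = build_chain(
    [
        Step("focus_window", {"title": "Firefox"}),
        Step("focus_text_input_in_window", {"title": "Firefox"}),
        Step("clear_and_type_in_window", {"title": "Firefox", "text": "hello"}),
        Step("press_key_in_window", {"title": "Firefox", "key": "Enter"}),
    ]
)
```

Rejecting over-long chains up front keeps a batch from running many steps without an intermediate observation.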

Browser/chat guidance:

- use `get_active_window_screenshot`, `get_window_screenshot`, or `get_window_region_screenshot` to inspect browser content without paying for a full-screen image
- use semantic anchors such as `chat-input`, `content-center`, `content-top`, and `scrollbar-right` instead of brittle raw coordinates when possible
- use `page_scroll_in_window` or `drag_scrollbar_in_window` for long chat/document panes when wheel scrolling is unreliable
- use `focus_text_input_in_window` before typing into chat-style web apps when the input box is easy to miss

## Setup

1. Ensure Python `3.12.3`.
2. Create or reuse the local `.venv/`.
3. Install the project in editable mode:

```bash
./.venv/bin/python -m pip install -e '.[dev,voice]'
```

4. Create an environment file:

```bash
cp .env.example .env
```

5. Set `OPENAI_API_KEY` in `.env`.
6. Validate configuration:

```bash
./.venv/bin/aegis config --validate
```

If you rely on the repo-local `.env`, run `aegis` from this project directory or one of its subdirectories. If you want to launch it from anywhere, export `OPENAI_API_KEY` in your shell instead.

## Usage

Start a text Assistant session:

```bash
aegis chat
```

The default entrypoint also starts chat:

```bash
aegis
```

Launch chat in full-access mode:

```bash
aegis --full-access
```

Run a direct Agent task:

```bash
aegis run "Take a screenshot and tell me what's on screen."
```

Open Firefox and navigate to GitHub:

```bash
aegis run "Open Firefox and go to GitHub."
```

Voice mode:

```bash
aegis voice
```

Inspect pending approvals:

```bash
aegis approvals
```

Approve or deny:

```bash
aegis approve <approval_id>
aegis deny <approval_id> --reason "Don't run shell commands right now."
```

If there is only one pending request, `aegis approve` can approve the latest one without an explicit ID.

Inside chat, the same flow works with:

```text
/approve
/deny
/clear
```

The chat prompt also supports terminal navigation features through readline:

- Up/Down for command history
- `Ctrl+R` for reverse history search
- `Tab` for slash-command completion
- standard cursor shortcuts like `Ctrl+A` and `Ctrl+E`

Status and logs:

```bash
aegis status
aegis logs
```

Run the MCP server directly:

```bash
aegis mcp --transport stdio
```

## Golden Paths

### 1. Screenshot summary

```bash
aegis
```

Then ask:

```text
Take a screenshot and tell me what's on screen.
```

### 2. Open Firefox and go to GitHub

```bash
aegis run "Open Firefox and go to GitHub"
```

### 3. Voice request with approval

```bash
aegis voice
```

Then say:

```text
Open terminal and run neofetch.
```

The Agent pauses for the gua

[truncated…]

PUBLIC HISTORY

First discovered: Mar 22, 2026

IDENTITY

inferred

Identity inferred from code signals. No PROVENANCE.yml found.


METADATA

platform: github
first seen: Mar 21, 2026
last updated: Mar 21, 2026
last crawled: 26 days ago
