ratchet

provenance:github:mahsumaktas/ratchet

WHAT THIS AGENT DOES

Ratchet is an autonomous code improvement engine for Claude Code. It makes small, atomic changes to a codebase, measures the impact of each one, and keeps only the changes that measurably improve the code, which avoids the risk of large, sweeping refactors. It can be used to fix bugs, drive lint and type errors down, and run security scans. Bash enforcement hooks block invalid actions, and a self-review engine adjusts the strategy based on the results of past experiments.

PROBLEM IT SOLVES

AI-driven changes to a codebase are often large, hard to verify, and hard to revert. Ratchet makes one change at a time and requires frozen metrics and a guard command to confirm each change before it is committed. Anything that fails is reverted.

CAPABILITIES & CONSTRAINTS

TECH & STACK

claudecode, automation, bash, metrics, autoresearch, security, linting

USE CASES

Automation, Testing, Code Review

README

# Ratchet

**Autonomous code improvement engine for Claude Code.**

Ratchet makes small, atomic changes to your codebase, measures the impact, and keeps only what improves. Like a ratchet wrench — it only turns forward, never back.

Inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch): 126 experiments, 11% improvement, zero human intervention.

---

## Why Ratchet?

Most AI coding tools make large, risky changes. Ratchet takes the opposite approach:

| Traditional AI Coding | Ratchet |
|----------------------|---------|
| Big refactors that might break things | One change at a time |
| "Trust me, it's better" | Frozen metrics prove it |
| Can't undo easily | Git commit or revert, nothing in between |
| Runs once, done | Runs for hours, compounds improvement |
| LLM decides quality | **Scripts decide quality** — deterministic, not probabilistic |

**The key insight:** Claude Code follows markdown instructions, but can forget or skip steps during long sessions. Ratchet adds **enforcement hooks** — bash scripts that mechanically prevent invalid actions. Claude can't commit without passing validation. Can't edit protected files. Can't skip metrics.

## Quick Start

```bash
# Clone and install
git clone https://github.com/mahsumaktas/ratchet.git
cd ratchet
bash install.sh

# Go to your project
cd ~/your-project

# Start Claude Code and run
claude
> /autoresearch            # general improvement
> /autoresearch fix 20     # fix 20 lint/type errors
> /autoresearch security   # security audit
```

Or just tell Claude naturally:
```
> projeyi iyileştir        # Turkish: "improve the project"
> uyurken çalış            # Turkish: "work while I sleep"
> analyze and improve this codebase
```

## Features

### 6 Modes

| Mode | Command | What it does |
|------|---------|-------------|
| **run** | `/autoresearch` | General improvement: bugs, lint, types, dead code |
| **fix** | `/autoresearch fix [N]` | Drive lint/type/test errors to zero |
| **debug** | `/autoresearch debug` | Scientific bug hunting: reproduce → 5 Whys → TDD fix |
| **security** | `/autoresearch security` | OWASP Top 10 + STRIDE threat scan |
| **predict** | `/autoresearch predict` | 5-persona analysis (no changes, just prioritized findings) |
| **plan** | `/autoresearch plan` | Interactive wizard → frozen metric config |

### Enforcement Hooks

Unlike markdown-only instructions, Ratchet uses real bash scripts that **mechanically enforce** the rules:

| Hook | Type | What it enforces |
|------|------|-----------------|
| `ar-state-enforcer.sh` | PreToolUse | Blocks edits outside MAKE_CHANGE state, blocks commits outside COMMIT state |
| `ar-boundary-guard.sh` | PreToolUse | Blocks edits to `never_touch` files (*.lock, node_modules, .env, etc.) |
| `ar-metrics-collector.sh` | PostToolUse | Auto-runs frozen metrics + guard + decision engine after every edit |
| `ar-session-restore.sh` | SessionStart | Restores state after Claude Code restart |
| `ar-compact-inject.sh` | PostCompact | Injects state after context compaction |
| `ar-stop-summary.sh` | Stop | Shows progress summary + triggers self-review |

**Performance:** When Ratchet is not active, all hooks exit in <1ms (single file existence check).
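
To make the boundary check concrete, here is a minimal sketch of how a path could be matched against `never_touch` patterns. It is not the real `ar-boundary-guard.sh`: the function name `is_protected`, the variable `NEVER_TOUCH`, and the exact pattern list are assumptions for illustration, and the part that reads the tool input from Claude Code is left out.

```bash
# Hypothetical sketch, not the real ar-boundary-guard.sh.
# The pattern list is an assumption based on the examples in the table above.
NEVER_TOUCH='*.lock node_modules/* */node_modules/* .env */.env'

# is_protected PATH: succeeds (exit 0) when PATH matches a never_touch pattern.
is_protected() {
  candidate=$1
  set -f    # keep the patterns literal while the list is split into words
  for pattern in $NEVER_TOUCH; do
    case "$candidate" in
      $pattern) set +f; return 0 ;;   # unquoted, so it is used as a glob pattern
    esac
  done
  set +f
  return 1
}

for f in yarn.lock web/node_modules/react/index.js .env src/index.ts; do
  if is_protected "$f"; then echo "BLOCK $f"; else echo "allow $f"; fi
done
```

In a `case` pattern, `*` also matches `/`, so `*.lock` covers lock files in any directory.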

### Comprehensive Logging (v2)

Every hook call, every experiment, every decision is logged to project-local JSONL files:

```
.autoresearch/logs/
├── events.jsonl        — hook-level micro events (state transitions, boundary checks, guard runs)
├── experiments.jsonl    — full experiment records (metrics before/after, decision, strategy, duration)
└── insights.jsonl       — self-review learnings (accumulated across sessions)
```

Log rotation: `events.jsonl` auto-rotates at 5MB. Experiments and insights are preserved for analysis.
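
Size-based rotation of this kind can be sketched in a few lines. The function name `rotate_if_large`, the `.1` suffix for the rotated copy, and reading 5MB as 5 × 1024 × 1024 bytes are all assumptions; the README does not say how Ratchet names or retains rotated files.

```bash
# Hypothetical sketch of size-based rotation; not Ratchet's actual code.
# rotate_if_large FILE [MAX_BYTES]: move FILE aside once it reaches MAX_BYTES.
rotate_if_large() {
  file=$1
  max_bytes=${2:-5242880}            # assumption: 5MB means 5 * 1024 * 1024 bytes
  [ -f "$file" ] || return 0         # nothing to rotate yet
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -ge "$max_bytes" ]; then
    mv "$file" "$file.1"             # assumption: one rotated copy, overwritten each time
    : > "$file"                      # start a fresh, empty log
  fi
  return 0
}

# Usage (path taken from the layout above):
# rotate_if_large .autoresearch/logs/events.jsonl
```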

### Self-Review Engine (v2)

Ratchet analyzes its own performance and auto-adjusts configuration:

| Trigger | When | What it does |
|---------|------|-------------|
| **Session end** | Every stop | Analyzes all experiments, detects patterns |
| **Threshold** | 5 consecutive discards or every 20 experiments | Mid-session course correction |
| **Manual** | `/autoresearch review` | On-demand analysis |

**Auto-detected patterns:**

| Pattern | Action |
|---------|--------|
| Strategy < 10% success (5+ experiments) | Remove from rotation |
| File with 3+ consecutive failures | Add to `never_touch` |
| Last 10 experiments all discarded (local minimum) | Reset strategies + add `discovery` |
| Strategy > 70% success | Prioritize (move to front) |

**Safety:** Self-review can modify `strategy_rotation` and `never_touch`, but **never** touches `guard_command`, `frozen_commands`, `mode`, or `max_experiments`. A backup of the config is taken before every change.
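
The two success-rate rules in the pattern table can be illustrated with a small sketch. The function name `classify_strategy` and its output words are hypothetical. The README gives no minimum sample size for the 70% rule, so this sketch applies it to any non-zero count, which is an assumption.

```bash
# Hypothetical sketch of the success-rate rules; not Ratchet's self-review code.
# classify_strategy KEPT TOTAL: prints remove, prioritize, or keep.
classify_strategy() {
  kept=$1
  total=$2
  # Cross-multiplication avoids integer-division rounding:
  #   kept/total < 10%  <=>  kept*10 < total
  #   kept/total > 70%  <=>  kept*10 > total*7
  if [ "$total" -ge 5 ] && [ $(( kept * 10 )) -lt "$total" ]; then
    echo remove        # below 10% success over 5+ experiments
  elif [ "$total" -gt 0 ] && [ $(( kept * 10 )) -gt $(( total * 7 )) ]; then
    echo prioritize    # above 70% success
  else
    echo keep
  fi
}

classify_strategy 0 6    # prints: remove
classify_strategy 8 10   # prints: prioritize
classify_strategy 3 10   # prints: keep
```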

### State Machine

Every experiment follows a strict state machine. Invalid transitions are impossible:

```
BOOTSTRAP → SELECT_TARGET → READ_FILE → MAKE_CHANGE → VALIDATE → DECIDE
                 ↑                                                   |
                 |                                          COMMIT ←─┤ (KEEP)
                 |                                          REVERT ←─┘ (DISCARD)
                 |                                            |
                 └──────────── LOG ←──────────────────────────┘
```
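
One way to read the diagram is as a fixed table of allowed transitions. The sketch below encodes that reading: `DECIDE` goes to `COMMIT` or `REVERT`, both go to `LOG`, and `LOG` loops back to `SELECT_TARGET`. The function name `valid_transition` is hypothetical, and how `ar-state-enforcer.sh` actually stores and checks state is not shown in this README.

```bash
# Hypothetical sketch of a transition check, based on one reading of the diagram.
# valid_transition FROM TO: succeeds (exit 0) only for edges drawn above.
valid_transition() {
  case "$1:$2" in
    BOOTSTRAP:SELECT_TARGET)      return 0 ;;
    SELECT_TARGET:READ_FILE)      return 0 ;;
    READ_FILE:MAKE_CHANGE)        return 0 ;;
    MAKE_CHANGE:VALIDATE)         return 0 ;;
    VALIDATE:DECIDE)              return 0 ;;
    DECIDE:COMMIT|DECIDE:REVERT)  return 0 ;;   # KEEP or DISCARD
    COMMIT:LOG|REVERT:LOG)        return 0 ;;
    LOG:SELECT_TARGET)            return 0 ;;   # next experiment
    *)                            return 1 ;;
  esac
}

if valid_transition READ_FILE COMMIT; then echo allowed; else echo blocked; fi   # prints: blocked
if valid_transition DECIDE REVERT; then echo allowed; else echo blocked; fi      # prints: allowed
```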

### Guard Commands

Main metrics measure improvement. Guard commands prevent regression:

```
Optimize lint → lint errors decrease ✓
But guard (npm test) fails → DISCARD ✗
```

The guard ensures you never break tests while fixing lint, never break the build while adding types.
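
As a rough illustration, a guard can be reduced to a command's exit status. The helper name `guard_status` is hypothetical, and `true` and `false` stand in for a real guard such as `npm test` so the sketch runs anywhere.

```bash
# Hypothetical sketch; not Ratchet's guard runner.
# guard_status COMMAND: runs COMMAND in a child shell and prints PASS or FAIL
# depending on its exit status.
guard_status() {
  if sh -c "$1" >/dev/null 2>&1; then echo PASS; else echo FAIL; fi
}

guard_status "true"     # stand-in for a passing guard, prints: PASS
guard_status "false"    # stand-in for a failing guard, prints: FAIL
# In a real project the argument would be the configured guard, e.g. "npm test".
```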

### Cross-run Learning (v2.1)

Ratchet remembers what worked and what didn't across sessions:

```
.autoresearch/lessons.jsonl
```

- **50-entry cap** with FIFO eviction
- **30-day time-decay** — old lessons auto-pruned at bootstrap
- **Auto-populated** — KEEP decisions record strategy + file + reason
- **Loaded into CHECKPOINT.md** — Claude sees previous learnings on restart
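
The 50-entry FIFO cap can be sketched with `tail`. The function name `append_lesson` and the JSON fields in the usage comment are hypothetical. The 30-day pruning is not covered, because the README does not describe the timestamp format.

```bash
# Hypothetical sketch of a capped JSONL file; not Ratchet's actual code.
# append_lesson FILE JSON_LINE [CAP]: append one line, then keep only the
# newest CAP lines so the oldest entries are evicted first (FIFO).
append_lesson() {
  file=$1
  line=$2
  cap=${3:-50}
  printf '%s\n' "$line" >> "$file"
  tail -n "$cap" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Usage (field names are made up for illustration):
# append_lesson .autoresearch/lessons.jsonl '{"strategy":"default","file":"src/a.ts","reason":"lint"}'
```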

### Token & Cost Tracking (v2.1)

Track spending per ratchet session:

```bash
# During session — auto-tracked per experiment
# At session end — summary in stop hook:
# [RATCHET] 12 experiments (8 kept / 4 discarded) | Cost: 45K tokens, ~$0.32

# Budget enforcement:
# Set in config: "max_budget_usd": 5.00
# Ratchet stops when budget exceeded
```
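
A budget check like this needs a decimal comparison, which plain shell arithmetic cannot do. The sketch below delegates it to `awk`. The function name `over_budget` is hypothetical, and how Ratchet tracks the amount spent is not shown here.

```bash
# Hypothetical sketch of a budget check; not Ratchet's actual code.
# over_budget SPENT_USD MAX_USD: succeeds (exit 0) when SPENT_USD exceeds MAX_USD.
over_budget() {
  # "+ 0" forces a numeric comparison instead of a string comparison.
  awk -v spent="$1" -v max="$2" 'BEGIN { exit (spent + 0 > max + 0) ? 0 : 1 }'
}

if over_budget 0.32 5.00; then echo "stop"; else echo "continue"; fi   # prints: continue
if over_budget 5.01 5.00; then echo "stop"; else echo "continue"; fi   # prints: stop
```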

### Environment Probing (v2.1)

At bootstrap, `ar-probe.sh` auto-detects:

| Category | Detected |
|----------|----------|
| Languages | Node.js, TypeScript, Python, Rust, Go, Ruby, Java, Shell |
| Test runners | jest, vitest, mocha, pytest, cargo test, go test |
| Linters | eslint, biome, ruff, clippy, go vet, rubocop |
| Type checkers | tsc, mypy, pyright |
| Frameworks | Next.js, Nuxt, Vite, Angular, Svelte, Django, Flask |
| CI | GitHub Actions, GitLab CI, Jenkins, CircleCI |
| Monorepo | lerna, pnpm workspaces, Cargo workspaces, nx |

Results are saved to `state.json` under the `environment` field and used for smarter strategy selection.
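
Detection of this kind usually relies on well-known marker files. The sketch below covers the language row only, using conventional markers (`package.json`, `Cargo.toml`, `go.mod`, and so on). It is an assumption about the approach, not the contents of `ar-probe.sh`, and the function name `probe_languages` is hypothetical.

```bash
# Hypothetical sketch of marker-file detection; not the real ar-probe.sh.
# probe_languages [DIR]: prints one detected language per line.
probe_languages() {
  dir=${1:-.}
  if [ -f "$dir/package.json" ]; then echo "Node.js"; fi
  if [ -f "$dir/tsconfig.json" ]; then echo "TypeScript"; fi
  if [ -f "$dir/pyproject.toml" ] || [ -f "$dir/requirements.txt" ]; then echo "Python"; fi
  if [ -f "$dir/Cargo.toml" ]; then echo "Rust"; fi
  if [ -f "$dir/go.mod" ]; then echo "Go"; fi
  if [ -f "$dir/Gemfile" ]; then echo "Ruby"; fi
  if [ -f "$dir/pom.xml" ] || [ -f "$dir/build.gradle" ]; then echo "Java"; fi
  return 0
}

# Usage:
# probe_languages ~/your-project
```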

### Mechanical Decision Engine

The keep/discard decision is made by `ar-decide.sh` — a deterministic script, not an LLM judgment:

| Metrics | Guard | Decision |
|---------|-------|----------|
| Improved | PASS | **KEEP** |
| Same, code shorter | PASS | **KEEP** |
| Same, code same size | PASS | **KEEP** |
| Same, code longer | PASS | **DISCARD** |
| All errored | Any | **DISCARD** |
| Improved | FAIL | **DISCARD** |
| Worsened | Any | **DISCARD** |

Code size is measured via `git diff --stat` (insertions minus deletions) — only checked when all metrics are unchanged.
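
The table translates almost directly into a `case` statement. This is a minimal sketch, not the real `ar-decide.sh`: the function names, argument order, and status words are hypothetical. The table has no row for unchanged metrics with a failing guard, so the sketch discards in that case, which is an assumption. The size helper uses `git diff --numstat`, a machine-readable relative of the `--stat` output mentioned above.

```bash
# Hypothetical sketch of the decision table; not the real ar-decide.sh.
# decide METRICS GUARD [SIZE_DELTA]
#   METRICS:    improved | same | worsened | errored
#   GUARD:      PASS | FAIL
#   SIZE_DELTA: insertions minus deletions (only used when METRICS is "same")
decide() {
  metrics=$1
  guard=$2
  size_delta=${3:-0}
  case "$metrics" in
    errored|worsened) echo DISCARD; return 0 ;;   # guard result is irrelevant
  esac
  if [ "$guard" != "PASS" ]; then
    echo DISCARD                                  # assumption for the "same + FAIL" case
    return 0
  fi
  case "$metrics" in
    improved) echo KEEP ;;
    same)     if [ "$size_delta" -le 0 ]; then echo KEEP; else echo DISCARD; fi ;;
    *)        echo DISCARD ;;                     # unknown status: play safe
  esac
}

# size_delta: insertions minus deletions in the working tree.
# Swapped-in technique: --numstat instead of parsing --stat.
size_delta() {
  git diff --numstat | awk '{ ins += $1; del += $2 } END { print ins - del }'
}

decide improved PASS       # prints: KEEP
decide same PASS 12        # prints: DISCARD
decide improved FAIL       # prints: DISCARD
```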

### Context-Reset Proof

Ratchet survives Claude Code restarts and context compaction:
- `state.json` — machine state, persisted to disk after every transition
- `CHECKPOINT.md` — self-contained document for zero-context resume
- `SessionStart` hook — auto-restores state on restart
- `PostCompact` hook — auto-injects state after compaction

### Strategy Rotation

When stuck (5 consecutive discards), Ratchet automatically switches strategy:

```
default → low-hanging-fruit → deep-refactor → security-sweep → dead-code-cleanup → discovery-driven
```
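
The rotation order above can be sketched as a lookup for the next entry in a list. The names `STRATEGIES`, `next_strategy`, and `should_rotate` are hypothetical. Wrapping around to `default` after `discovery-driven` is an assumption, since the README excerpt does not say what follows the last strategy.

```bash
# Hypothetical sketch of strategy rotation; not Ratchet's actual code.
STRATEGIES='default low-hanging-fruit deep-refactor security-sweep dead-code-cleanup discovery-driven'

# should_rotate CONSECUTIVE_DISCARDS: succeeds once the streak reaches 5.
should_rotate() {
  [ "$1" -ge 5 ]
}

# next_strategy CURRENT: prints the strategy that follows CURRENT in the list.
next_strategy() {
  current=$1
  first=
  take_next=
  for s in $STRATEGIES; do
    if [ -z "$first" ]; then first=$s; fi
    if [ -n "$take_next" ]; then echo "$s"; return 0; fi
    if [ "$s" = "$current" ]; then take_next=1; fi
  done
  echo "$first"    # assumption: wrap around (also used for unknown names)
}

if should_rotate 5; then next_strategy default; fi   # prints: low-hanging-fruit
```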

After 10 consecutive discard

[truncated…]

PUBLIC HISTORY

First discovered: Mar 25, 2026

IDENTITY

inferred

Identity inferred from code signals. No PROVENANCE.yml found.

METADATA

platform: github

first seen: Mar 24, 2026

last updated: Mar 24, 2026

last crawled: 1 month ago

version: none listed
