Optimize Code Context and Token Usage
ZipAI is a context-window-aware skill that applies prompt caching, log compression, surgical edits, and telegraphic output techniques for AI assistants.
Why it matters
Enhance developer productivity by intelligently managing context windows and optimizing token usage for code analysis, debugging, and review tasks.
Outcomes
What it gets done
Triage requests for context-window-aware processing.
Filter and parse logs, code, and tool outputs efficiently.
Provide concise, technical output with adaptive verbosity.
Handle ambiguity with targeted clarification questions.
Install
Add it to your toolbox
Run in your project directory:
curl -fsSL https://spark.entire.vc/get/ag-zipai-optimizer | bash Capabilities
What this skill does
Analyzes code for bugs, style issues, and improvements.
Traces errors to their root cause and suggests fixes.
Pulls structured data fields from unstructured text.
Condenses long documents or threads into key takeaways.
Overview
ZipAI: Context & Token Optimizer
What it does
A prompt engineering skill that structures AI assistant interactions using seven compression and caching rules to reduce redundancy and maintain prompt efficiency.
How it connects
Use when working with technical outputs, error logs, large codebases, or long assistant sessions where context management and concise communication are priorities.
Source README
ZipAI: Context & Token Optimizer
When to Use
Use this skill when the request needs context-window-aware triage, prompt caching optimizations, concise technical output, ambiguity handling, or selective reading of logs, source files, JSON/YAML payloads, VCS output, or MCP tool results.
Rules
Rule 1 - Adaptive Verbosity (No Filler)
- Fixes: technical only. ZERO filler (e.g., "Certainly", "I understand", "Here is", "Sure").
- Analysis: full reasoning allowed.
- Direct Ask: max 15 words in ultra-dense telegraphic style. Omit grammatical helper constructs.
- Long Sessions: never re-summarize past thread context.
- Reviews: use structured headers:
[ISSUE],[SUGGESTION],[NITPICK].
Rule 2 - Ambiguity-First Execution
- Ask exactly ONE question if 2+ interpretations exist. Never stack questions.
- Default to minimal intervention for minor changes.
- Scope ambiguous requests to narrowest boundary.
Rule 3 - Prompt Caching & Prefix Stability
- Static-First Ordering: Structure prompts to place invariant components (system instructions, core rules, static tool schemas) at the top of the prompt.
- Isolate Dynamic Context: Append dynamic and volatile elements (active conversation history, recently read file contents, CLI execution outputs) at the very end of the prompt to protect and reuse the cached prefix.
- Prefix Integrity: Avoid interleaving new queries or dynamic variables inside static system blocks. Keep the static instructions strictly invariant.
- Cached Files Reuse: Reuse already loaded file contents present in the conversation history; do not re-read files unless explicitly updated.
Rule 4 - Semantic Input Pruning & Log Compression
- Traceback Extraction: When handling error or build outputs, parse and filter logs using grep/regex to extract only tracebacks, error statements, and a maximum of 3-5 lines of context around them. Strip all info logs, successful build tasks, and redundant progress messages.
- Skeletal Code Viewing (AST): For large files (>300 lines), do not view the full file. Use
grep -nE "^(class|def|async def|function|const|let|var).*="(or language equivalents) to view class and function headers first, then target specific ranges withview_file. - Smart JSON/YAML Crusher: Minify structured inputs. Strip pretty-printing whitespaces, comments, and unused fields from JSON/YAML payloads before placing them in context. Convert large arrays to dense CSV or key-value listings if they are queried.
Rule 5 - Surgical & Compact Output
- Local Replacements: Perform edits using surgical tools (
str_replaceor single-hunk diffs). Never reprint unchanged surrounding code or perform full-file reprints. - Batch Modifies: Consolidate multiple non-contiguous edits in a single file into a single multi-replace chunk operation, ordered from leaf dependencies upward.
- Differential Output: Limit conversational responses to the exact modified blocks, avoiding conversational code repetition.
Rule 6 - Telegraphic Grammar & Density
- Syntax Compression: Strip articles ("a", "an", "the"), redundant helper verbs ("to be", "to have", "do"), and politeness/softening modifiers ("please", "simply", "just", "easy").
- Structure: Format output blocks into dense semantic mappings (
key: val), short bullet lists, and compact tables. Avoid paragraphs of text.
Rule 7 - Token-Budget Reasoning (CoT Optimization)
- Direct Mode: Skip long planning/thinking cycles for trivial, deterministic edits (typos, formatting, import adjustments).
- Abbreviated Thoughts: Keep thought blocks compact. Never reprint code snippets or copy-paste file blocks inside thoughts. Reference files via path and lines (e.g.
file.py#L12-18).
Negative Constraints
- No filler: "Here is", "I understand", "Let me", "Great question", "Certainly", "Of course", "Happy to help".
- No blind truncation of stacktraces or error logs.
- No full-file reads on large files.
- No re-reading files already in context.
- No multi-question clarification dumps.
- No silent bundling of unrelated changes.
- No full git diff ingestion on large changesets - extract hunks only.
- No git log beyond 20 entries unless a specific range is requested.
- No full MCP object inspection when field-level access suffices.
- No MCP mutations without prior read of current resource state.
- No SHA reuse across sessions for file updates.
Limitations
- Brainstorming: disable during creative/open-ended design phases.
- Grep Blindness: key context may fall outside filter boundaries.
- Overshadowing: aggressive pruning may drop micro-variables in long sessions.
Discussion
Questions & comments · 0
Sign In Sign in to leave a comment.