Stop Paying for Slop: A Deterministic Middleware for LLM Token Optimization

Source: DEV Community
Context windows are getting huge, but token budgets are tightening. Every time your agent iterates in an autonomous loop, you're potentially sending a massive, bloated prompt filled with conversational filler, redundant whitespace, and low-entropy "slop."

Today, I've merged the Prompt Token Rewriter into the Skillware registry (v0.2.1). It's a deterministic middleware that aggressively compresses prompts by 50-80% before they ever hit the LLM.

Why does this matter?

- Lower costs: Pay only for the "signal," not the "noise."
- Faster inference: Fewer tokens mean less time spent on KV caching and long generations.
- Deterministic behavior: Because it uses heuristics rather than another expensive LLM call, your agent's behavior stays stable and repeatable.

Three Levels of Aggression

The rewriter includes three presets depending on your use case:

- Low: Normalizes whitespace and line breaks (safe for strict code).
- Medium: Strips conversational fillers ("please," "could you," "ensure that").
- High: Aggre
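To make the heuristic approach concrete, here is a minimal sketch of what a deterministic rewriter with "low" and "medium" presets could look like. The function name, filler list, and preset handling are illustrative assumptions, not the actual Skillware implementation:

```python
import re

# Hypothetical filler phrases; the real rewriter's list may differ.
FILLERS = re.compile(
    r"\b(please|could you|ensure that|kindly)\b[ ]*",
    re.IGNORECASE,
)

def rewrite_prompt(prompt: str, level: str = "low") -> str:
    """Compress a prompt with deterministic heuristics.

    low:    normalize whitespace and line breaks (safe for strict code)
    medium: additionally strip conversational fillers
    """
    # Low: collapse runs of spaces/tabs, cap blank lines at one,
    # and trim trailing whitespace on each line.
    text = re.sub(r"[ \t]+", " ", prompt)
    text = re.sub(r"\n{3,}", "\n\n", text)
    text = "\n".join(line.strip() for line in text.splitlines()).strip()

    if level in ("medium", "high"):
        # Medium: drop low-signal conversational fillers.
        text = FILLERS.sub("", text)
        text = re.sub(r"[ \t]{2,}", " ", text)

    return text

# Usage:
# rewrite_prompt("Please   could you   fix this\n\n\n\nbug", "medium")
# → "fix this\n\nbug"
```

Because every transformation is a fixed string rewrite, the same input always produces the same output, which is what keeps agent loops repeatable.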