Fit Context Window
Clean HTML and truncate to fit an LLM token budget
How it works
1. html-to-text: Convert HTML to plain text, preserving line breaks
2. squeeze: Collapse extra whitespace and blank lines
3. token-truncate: Truncate text to fit a token budget (real BPE)
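The three steps above can be sketched in Python as a rough illustration. This is not the implementation behind the tools; the function names `html_to_text`, `squeeze`, and `token_truncate`, and the `BLOCK_TAGS` and `CELL_TAGS` sets, are hypothetical. The real token-truncate step uses BPE tokens, whereas this sketch counts whitespace-delimited words as a stand-in, so its counts will differ from a real tokenizer's.

```python
import re
from html.parser import HTMLParser

# Tags that start or end a line in this sketch (an assumed set, not the tool's).
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "table", "article", "section",
              "h1", "h2", "h3", "h4", "h5", "h6"}
# Table cells are separated by a space so adjacent values do not fuse.
CELL_TAGS = {"td", "th"}


class _TextExtractor(HTMLParser):
    """Collects text nodes and emits newlines around block-level tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")
        elif tag in CELL_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html: str) -> str:
    """Step 1: drop tags, keep text, preserve line breaks."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def squeeze(text: str) -> str:
    """Step 2: collapse runs of spaces/tabs and runs of blank lines."""
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    collapsed = re.sub(r"\n{3,}", "\n\n", "\n".join(lines))
    return collapsed.strip()


def token_truncate(text: str, max_tokens: int) -> str:
    """Step 3 (stand-in): keep the first max_tokens whitespace-delimited words.

    A real BPE tokenizer would split text differently; this only
    approximates the idea of cutting at a token budget.
    """
    if max_tokens < 0:
        raise ValueError("max_tokens must be non-negative")
    tokens = re.findall(r"\S+\s*", text)  # each word plus its trailing whitespace
    return "".join(tokens[:max_tokens]).rstrip()


def fit_context_window(html: str, max_tokens: int) -> str:
    """Chain the three steps in the same order as the pipeline."""
    return token_truncate(squeeze(html_to_text(html)), max_tokens)
```

Calling `fit_context_window(page_html, 4096)` mirrors the pipeline order. Swapping the word-based stand-in for a real BPE tokenizer would change only `token_truncate`.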
$ html-to-text | squeeze | token-truncate --max=4096

Examples
Clean a documentation page to fit a 30-token LLM prompt
Usage
"<article><h2>Getting Started with React</h2><p>React is a Ja..." | html-to-text | squeeze | token-truncate --max=4096

Strip blog HTML and extra whitespace for an API summary call
Usage
"<div class="post"><span class="meta">Posted by admin • ..." | html-to-text | squeeze | token-truncate --max=4096

Convert a status page table to clean text for an incident prompt
Usage
"<table><tr><th>Name</th><th>Status</th></tr><tr><td>API Gate..." | html-to-text | squeeze | token-truncate --max=4096