Pick up an Indian tax statute — the kind we spend our days indexing — and look at a single section. It is not prose. It is a small, precise machine.
A typical section has a numbered head clause, two or three sub-clauses, an explanation block, and a "Provided that..." proviso carving out an exception only meaningful in the context of the head. Sometimes a rate table follows. Sometimes an amendment marker substitutes a phrase ("twenty per cent" for "fifteen per cent") by reference to a later Act. None of these are decorative. Each changes what the section means.
Now imagine a fixed-window splitter walking through that text every 1000 characters. The proviso lands in a different chunk from the section it modifies. The explanation gets stranded from the term it defines. A retriever that finds the proviso cannot tell you what it's a proviso to. A retriever that finds the head clause cannot tell you the carve-out exists.
This is the failure mode that taught us, on document one, that we could not reuse anyone's off-the-shelf chunker.