(2 min read)
LLM Structured output and real-time parsers
The standardization of JSON mode has been a psychologically limiting thing for developers. There's so much more that can be done around structured outputs. I made this library a year ago to parse JSON streams and achieve things like this:
When I say JSON streams, I mean taking something like this: {"title": "This is an unfinishe
and parsing it into a valid object: {"title": "This is an unfinishe"}, while preserving and
accounting for the context that the rest of the object, including that unfinished string, is yet
to arrive.
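To make the idea concrete, here's a minimal sketch of how such best-effort parsing can work: scan the buffer, track the bracket stack and whether we're inside a string, then append the missing closers and hand the result to a normal JSON parser. This is not the library's actual implementation, and it ignores edge cases like trailing commas and partially emitted literals; it only shows the core trick.

```python
import json

def parse_partial_json(chunk: str):
    """Best-effort parse of a truncated JSON stream by closing any
    unfinished strings, arrays, and objects. A sketch, not the
    library's real algorithm: trailing commas and half-written
    literals (e.g. `tru`) are not handled."""
    stack = []          # closers we still owe, innermost last
    in_string = False
    escaped = False
    for ch in chunk:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]":
            stack.pop()
    repaired = chunk
    if in_string:
        repaired += '"'          # terminate the unfinished string
    repaired += "".join(reversed(stack))  # close open containers
    return json.loads(repaired)
```

Calling this on every incoming chunk gives you a complete (if provisional) object at each step, which is what lets a UI render the structure while the model is still generating it.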
The neat part about this is how the graph structure in that example goes recursively
deep. Most data is linear and much simpler to work with, but if you were to generate
that graph node-by-node (so that each one could be parsed individually), it would not
only cost a lot more, the quality of the output would also drop quite significantly.
Here's a more complex example from something I've been working on more recently:
It uses an entirely custom format inspired by QML, and the parsing logic here is quite
a bit more complex than in the previous example.
There's simply no way to achieve something like this using JSON/YAML/whatever while
keeping the same cost and speed (low token count) and output quality (the model is
far more familiar with generating UI in a UI-like format than as key-value pairs).
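To make the token-count argument concrete, here's a purely hypothetical comparison. The post doesn't show the actual format, so both snippets below are made up: the same tiny two-element UI as JSON key-value pairs versus a QML-like, UI-shaped syntax. Character count is only a rough proxy for token count, but the gap is the point.

```python
# Hypothetical: neither string is the post's real format.
# Same UI, two notations.
json_ui = (
    '{"type":"Column","children":['
    '{"type":"Text","text":"Hello"},'
    '{"type":"Button","label":"OK"}]}'
)
qml_ui = 'Column { Text { text: "Hello" } Button { label: "OK" } }'

# The UI-shaped form drops the repeated "type"/"children" scaffolding,
# so it's shorter before a tokenizer ever sees it.
print(len(json_ui), len(qml_ui))
```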