No more bad outputs with structured generation: Remi Louf

10.7K views · Oct 14, 2024 · 15:31 min

Takeaway

Constrained decoding via logit masking took Mistral-7B-v0.1 to 99.9% valid JSON at essentially zero inference-time overhead; there is no reason left to YOLO unstructured outputs in production.

Summary

Outlines (Python library, 87+ contributors, used under the hood by vLLM and TGI for function calling) constrains LLM outputs to a regex, a JSON schema, a Pydantic model or a grammar
Works by masking logits at every step: any token that would violate the structure has its logit set to -inf, so it can never be sampled. The trick is doing this efficiently, which dottxt's index data structure solves
Compared to unconstrained Mistral-7B-v0.1, Outlines pushes the valid-JSON rate from 17% to 99.9% without prompt optimization
Adds essentially zero inference-time overhead (unlike Guidance, whose overhead grows with token count) and can actually accelerate generation, since structural tokens (brackets, field names) need not be sampled
Supports vision-model JSON extraction, custom types like airport codes, and works with Transformers, llama.cpp, mlx-lm; just `pip install outlines`
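
The masking-plus-index idea from the summary can be sketched in a few lines of plain Python. This is a toy illustration and not Outlines' actual implementation: the vocabulary, the hand-written automaton for the pattern `\{"age":[0-9]+\}`, the `fake_logits` stand-in for a model, and every function name below are assumptions made up for the example. The index is built once, mapping each automaton state to the tokens that are legal there, so each decoding step only needs a dictionary lookup before masking.

```python
import math

# Toy vocabulary with multi-character tokens (hypothetical, for illustration).
VOCAB = ["hello", "{", '{"', "age", '":', '"', ":", "4", "42", "}", "<eos>"]
EOS = VOCAB.index("<eos>")

LITERAL = '{"age":'          # fixed prefix of the target pattern \{"age":[0-9]+\}
DIGITS_START = len(LITERAL)  # literal consumed, a first digit is required
DIGITS = DIGITS_START + 1    # at least one digit seen
ACCEPT = DIGITS + 1          # closing brace seen, output is complete


def step_char(state, ch):
    """Character-level automaton transition; None means `ch` is illegal here."""
    if state < DIGITS_START:
        return state + 1 if ch == LITERAL[state] else None
    if state in (DIGITS_START, DIGITS) and ch in "0123456789":
        return DIGITS
    if state == DIGITS and ch == "}":
        return ACCEPT
    return None


def build_index():
    """Precompute {state: {token_id: next_state}} for every legal token."""
    index = {}
    for state in range(ACCEPT + 1):
        allowed = {}
        for tok_id, tok in enumerate(VOCAB):
            if tok_id == EOS:
                continue
            cur = state
            for ch in tok:  # walk the token's characters through the automaton
                cur = step_char(cur, ch)
                if cur is None:
                    break
            if cur is not None:
                allowed[tok_id] = cur
        if state == ACCEPT:
            allowed[EOS] = ACCEPT  # stopping is only legal once the pattern is complete
        index[state] = allowed
    return index


def mask_logits(logits, allowed):
    """Set the logit of every disallowed token to -inf."""
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def fake_logits(_prefix):
    """Stand-in for a model: fixed scores that prefer the invalid token 'hello'."""
    return [5.0, 1.0, 2.0, 1.5, 1.0, 0.5, 0.5, 1.0, 3.0, 3.5, 0.0]


def decode(constrained, max_steps=8):
    """Greedy decoding, with or without the structural mask."""
    index = build_index()
    state, out = 0, []
    for _ in range(max_steps):
        logits = fake_logits(out)
        if constrained:
            logits = mask_logits(logits, index[state])  # one lookup per step
        tok_id = max(range(len(logits)), key=logits.__getitem__)
        if tok_id == EOS:
            break
        out.append(VOCAB[tok_id])
        if constrained:
            state = index[state][tok_id]
    return "".join(out)


print(decode(constrained=False))  # hellohellohellohellohellohellohellohello
print(decode(constrained=True))   # {"age":42}
```

In this toy, some states (for example the one reached after `{"`) admit exactly one token. A real implementation could emit that token without running the model at all, which is one way to read the summary's point that structural tokens need not be sampled.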

structured-generation · outlines · json-schema

Original description

JSON, prompt formatting, hallucinations. If you feel uncomfortable, you have probably had to implement complex solutions to circumvent these problems. What if there was a way to increase the reliability of Large Language Models at no cost? Enter structured generation. In this talk we will explore how the output of models can be steered at no extra cost, why this improves their efficiency and accuracy significantly and makes generation less sensitive to the specifics of the prompt. By the end of the talk, we'll understand how we can start benefiting from this breakthrough today by using the open source library Outlines.

Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World's Fair in 2025! Get your tickets today at https://ai.engineer/2025

About Rémi
Rémi is the co-founder and CEO of .txt. After studies in Philosophy and Physics, Rémi explored NLP and Bayesian statistics for 7 years. He works from a French castle, enjoys long walks in the nearby forests, reading poetry, playing guitar and spending time with his children.