Compile your guide, share it on GitHub or arXiv, and join the community building LLMs one line of code at a time.

A language model assigns a probability to a sequence of tokens:
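One standard way to write this uses the chain rule of probability: P(w_1, ..., w_n) = P(w_1) × P(w_2 | w_1) × ... × P(w_n | w_1, ..., w_{n-1}). The sketch below illustrates the idea with a count-based bigram model, which is a simplifying assumption and not how a neural LLM computes these terms: each token is conditioned only on the one before it, and the first token is taken as given. The function names `train_bigram` and `sequence_log_prob` are hypothetical, chosen for illustration.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Estimate P(next | prev) from raw counts (assumption: no smoothing)."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    prev_counts = Counter(tokens[:-1])
    return {
        (prev, nxt): count / prev_counts[prev]
        for (prev, nxt), count in pair_counts.items()
    }


def sequence_log_prob(model, tokens):
    """Sum log P(w_t | w_{t-1}) over the sequence, given the first token."""
    total = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        p = model.get((prev, nxt), 0.0)
        if p == 0.0:
            # An unseen pair gets probability zero under this unsmoothed model.
            return float("-inf")
        total += math.log(p)
    return total


corpus = "the cat sat on the mat".split()
model = train_bigram(corpus)
log_p = sequence_log_prob(model, ["the", "cat", "sat"])
print(math.exp(log_p))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together, which is why the function returns a log probability and not the raw product.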

If you built a 15-million-parameter model and trained it on the complete works of Jane Austen, the output would likely start as gibberish ("asdio fjkl qwep"), but after roughly 5,000 steps it should begin to produce real English words. After roughly 50,000 steps, it may write passages that echo Austen's prose style.



© 2023. A Matt Cone project. CC BY-NC-SA 4.0. Made with 🌶️ in New Mexico.