Compile your guide, share it on GitHub or arXiv, and join the community building LLMs one line of code at a time.
A language model assigns a probability to a sequence of tokens.
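One common way to make this concrete is the chain rule: the probability of a whole sequence is the product of each token's probability given the tokens before it, P(x_1, ..., x_T) = P(x_1) * P(x_2 | x_1) * ... * P(x_T | x_1, ..., x_{T-1}). The sketch below is an illustration under stated assumptions, not a full language model: it uses a toy bigram model that conditions only on the single previous token, and the names `train_bigram`, `sequence_log_prob`, and the `<s>` start marker are hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical start-of-sequence marker


def train_bigram(tokens):
    """Estimate P(next | prev) from raw counts (no smoothing)."""
    counts = defaultdict(Counter)
    prev = START
    for tok in tokens:
        counts[prev][tok] += 1
        prev = tok
    # Normalize each row of counts into a probability distribution.
    return {
        prev: {tok: c / sum(nxt.values()) for tok, c in nxt.items()}
        for prev, nxt in counts.items()
    }


def sequence_log_prob(model, tokens):
    """Chain rule in log space: sum of log P(x_t | x_{t-1})."""
    log_p = 0.0
    prev = START
    for tok in tokens:
        p = model.get(prev, {}).get(tok, 0.0)
        if p == 0.0:
            # Unseen transition: the unsmoothed model assigns zero probability.
            return float("-inf")
        log_p += math.log(p)
        prev = tok
    return log_p


corpus = "the cat sat on the mat".split()
model = train_bigram(corpus)
print(math.exp(sequence_log_prob(model, ["the", "cat"])))
print(sequence_log_prob(model, ["cat", "the"]))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied. A neural language model replaces the count table with a network that conditions on the full preceding context, but the sequence probability is assembled the same way.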
If you built a 15-million-parameter model and trained it on the complete works of Jane Austen, the output would start as gibberish ("asdio fjkl qwep"), but after 5,000 steps it should produce real English words. After 50,000 steps, it may write sentences that read like Austen's prose.