Depth is All You Need

A short post collecting essential reading on the basics and depths of LLMs and VLMs


A review of LLM and VLM content. I have been working with the basics for a while, but now is a good time to explore the depths and refresh.

Here is a list of resources I have gone through:

The top three get us up to date on data manipulation, the basic architecture of attention, and tokenisation.
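To make the attention piece concrete, here is a minimal sketch of scaled dot-product attention in PyTorch. The function name, toy shapes, and causal mask are my own illustration, not taken from any of the resources here.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, head_dim)
    d_k = q.size(-1)
    # Each query scores every key, scaled by sqrt(head_dim) for stability
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    if mask is not None:
        # Causal mask: zero entries become -inf so softmax ignores the future
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return weights @ v

# Toy usage: one head over a sequence of 4 tokens
q = k = v = torch.randn(1, 1, 4, 8)
causal = torch.tril(torch.ones(4, 4))
print(scaled_dot_product_attention(q, k, v, mask=causal).shape)  # (1, 1, 4, 8)
```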

Then we get into actually training a model, including llm.c, which is worth the deep dive.
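For a feel of what that training involves, here is the shape of a next-token-prediction loop, sketched in PyTorch with a hypothetical toy model standing in for a real transformer; llm.c hand-writes roughly this same loop in plain C.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup: an embedding + linear head standing in for a
# real transformer, trained on next-token prediction over random ids.
vocab_size, d_model, seq_len = 256, 64, 32
model = nn.Sequential(nn.Embedding(vocab_size, d_model),
                      nn.Linear(d_model, vocab_size))
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab_size, (8, seq_len + 1))  # fake batch of token ids
inputs, targets = tokens[:, :-1], tokens[:, 1:]          # shift targets by one

for step in range(100):
    logits = model(inputs)  # (batch, seq, vocab)
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```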

And some extra bits we can get into once the basics are complete, especially around modern techniques such as QLoRA, quantisation, evals, MoE, ViTs, and Flash Attention (a toy quantisation sketch follows the list below).

  • Decoding Strategies in Large Language Models, by mlabonne
  • How to Make LLMs Go Fast, by vgel
  • A Visual Guide to Quantization, by Maarten Grootendorst
  • The Novice’s LLM Training Guide, by alpin
  • A Survey on Evaluation of Large Language Models (paper)
  • Mixture of Experts Explained, on Hugging Face
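The quantisation sketch promised above: a minimal absmax round-trip to int8, one common scheme that visual guides like Maarten's walk through. The helper names are mine.

```python
import torch

def absmax_quantize(w: torch.Tensor):
    # Scale so the largest-magnitude weight maps to 127, then round to int8
    scale = w.abs().max() / 127
    return torch.round(w / scale).to(torch.int8), scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(4, 4)
q, scale = absmax_quantize(w)
print((w - dequantize(q, scale)).abs().max())  # small rounding error
```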

Some notes on ViTs, CLIP, and PaliGemma
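As a taster for those notes, a minimal sketch of the first ViT step: cutting an image into flattened patches ready for a transformer. The function name and sizes are my own illustration.

```python
import torch

def patchify(images: torch.Tensor, patch_size: int = 16) -> torch.Tensor:
    # (batch, channels, H, W) -> (batch, num_patches, patch_size**2 * channels)
    b, c, h, w = images.shape
    p = patch_size
    patches = images.unfold(2, p, p).unfold(3, p, p)  # (b, c, h/p, w/p, p, p)
    return patches.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * p * p)

imgs = torch.randn(2, 3, 224, 224)
print(patchify(imgs).shape)  # torch.Size([2, 196, 768])
```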

Some further extra reading (less evergreen, but still useful)