In a recent 3Blue1Brown video series on transformer models, Grant Sanderson posed a fascinating question: how can a relatively modest embedding space (12,288 dimensions in the case of GPT-3) accommodate millions of distinct real-world concepts?
The answer lies at the intersection of high-dimensional geometry and a remarkable mathematical result known as the Johnson–Lindenstrauss lemma.
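Before stating the result, it helps to see the geometry numerically. Here is a minimal sketch in plain NumPy (the sample size and random seed are arbitrary choices for this demo, not anything from the video) that draws random unit vectors in a 12,288-dimensional space and measures how close to perpendicular they are:

```python
import numpy as np

# Illustrative parameters: 12,288 matches GPT-3's embedding width;
# the vector count and seed are arbitrary demo choices.
DIM = 12_288
N_VECTORS = 1_000

rng = np.random.default_rng(seed=0)

# Random Gaussian directions, normalized to unit length.
vecs = rng.standard_normal((N_VECTORS, DIM))
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)

# For unit vectors, cosine similarity is just the dot product.
sims = vecs @ vecs.T
off_diag = np.abs(sims[~np.eye(N_VECTORS, dtype=bool)])

print(f"mean |cos similarity|: {off_diag.mean():.4f}")
print(f"max  |cos similarity|: {off_diag.max():.4f}")
# Expect a mean near 0.007 and a max around 0.04-0.05:
# every pair of random directions is within a few degrees
# of exactly perpendicular.
```

Even across a thousand random directions, the largest pairwise overlap is only a few percent: in high dimensions, randomly chosen vectors are almost always nearly perpendicular. That near-orthogonality is the geometric slack that lets far more than 12,288 roughly independent concept directions coexist in the same space, and it is exactly what the lemma quantifies.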