I write about machine learning, deep learning and lately more about NLP.
Shadows, Angles, and the Geometry of Sameness
Measuring the gap - Rulers of ML
The Building Blocks – Vectors, Basis & Spans
A detailed description of tokenizers in LLMs.
A guide on MoEs: their history, their implementation in transformers, and the losses associated with them.
A guide on KV-Caching and Sliding Window Attention, ending with MQA and GQA.
A guide on BERT models and their architecture.
Implementation and training of a Transformer for the En-Hi machine translation task.
A guide on Seq2Seq models, which changed the machine translation task.
Some tips for working with PyTorch for NLP tasks.