> paepper.com/blog


Paper

2024

FlashAttention - optimizing GPU memory for more scalable transformers Jul 20
LoRA - low rank adaption explained in three minutes Jan 28

2023

Understanding the difference between weight decay and L2 regularization Sep 17

2022

DINO - Emerging properties in self-supervised vision transformers Mar 13
Rethinking Batch in BatchNorm Feb 28

2021

P-Diff: Learning Classifier with noisy labels based on probability difference distributions Oct 17
Meta-learning from noisy labels Jun 25

2020

Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition Nov 29
End-to-End object detection with transformers Aug 30
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour Jun 28

© Marc Päpper 2018-2025
