paepper.com/blog
Paper
2024
FlashAttention - optimizing GPU memory for more scalable transformers
Jul 20
LoRA - low-rank adaptation explained in three minutes
Jan 28
2023
Understanding the difference between weight decay and L2 regularization
Sep 17
2022
DINO - Emerging properties in self-supervised vision transformers
Mar 13
Rethinking Batch in BatchNorm
Feb 28
2021
P-Diff: Learning Classifier with noisy labels based on probability difference distributions
Oct 17
Meta-learning from noisy labels
Jun 25
2020
Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition
Nov 29
End-to-End object detection with transformers
Aug 30
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
Jun 28