Soroush Mehraban @UCCCzAbwp5De5wfiP7oGJtBQ@youtube.com

4K subscribers

10:04  Prompt-to-Prompt (P2P) Image Editing - Method Explained
30:57  Denoising Diffusion Null-Space Model (DDNM) - Method Explained
21:44  Autoregressive Image Generation without Vector Quantization
18:28  Diffusion Models (DDPM & DDIM) - Easily explained!
10:46  GLIGEN (CVPR2023): Open-Set Grounded Text-to-Image Generation
09:09  The Entropy Enigma: Success and Failure of Entropy Minimization
13:16  Tent: Fully Test-time Adaptation by Entropy Minimization
09:44  VPD (ICCV2023): Unleashing Text-to-Image Diffusion Models for Visual Perception
30:13  TokenHMR (CVPR2024): Advancing Human Mesh Recovery with a Tokenized Pose Representation
22:26  SHViT (CVPR2024): Single-Head Vision Transformer with Memory Efficient Macro Design
22:17  InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation
14:10  FastV: An Image is Worth 1/2 Tokens After Layer 2
28:39  GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
32:22  PoseGPT (ChatPose): Chatting about 3D Human Pose
09:13  MotionAGFormer (WACV2024): Enhancing 3D Human Pose Estimation with a Transformer-GCNFormer Network
35:08  HD-GCN (ICCV2023): Skeleton-Based Action Recognition
08:25  ST-GCN: Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition
13:08  Graph Convolutional Networks (GCN): From CNN point of view
21:12  DINO: Self-Supervised Vision Transformers
31:03  MoCo (+ v2): Unsupervised learning in computer vision
22:30  ViTPose: 2D Human Pose Estimation
28:40  TrackFormer: Multi-Object Tracking with Transformers
10:59  MetaFormer is Actually What You Need for Vision
21:00  ConvNet beats Vision Transformers (ConvNeXt) - Paper explained
21:32  Swin Transformer V2 - Paper explained
15:20  Masked Autoencoders (MAE) - Paper Explained
23:13  Relative Position Bias (+ PyTorch Implementation)
19:59  Swin Transformer - Paper Explained
06:41  Vision Transformer (ViT) - Paper Explained
07:05  Convolutional Block Attention Module (CBAM) - Paper Explained
09:11  Squeeze-and-Excitation Networks (SENet) - Paper explained
12:18  Faster R-CNN: Faster than Fast R-CNN!
08:11  Receptive Fields: Why the 3x3 conv layer is the best
38:37  Fast R-CNN: Everything you need to know from the paper
18:32  R-CNN: Clearly Explained!