pytorch-image-models by huggingface

Description: The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT,...

Summary Information

Updated 36 minutes ago

Added to GitGenius on December 4th, 2024

Created on February 2nd, 2019

Open Issues & Pull Requests: 62 (+0)

Number of forks: 5,169

Total Stargazers: 36,964 (+0)

Total Subscribers: 321 (+0)

Issue Activity (beta)

Open issues: 44

New in 7 days: 1

Closed in 7 days: 1

Avg open age: 943 days

Stale 30+ days: 42

Stale 90+ days: 42

Recent activity

Opened in 7 days: 1

Closed in 7 days: 1

Comments in 7 days: 1

Events in 7 days: 8

Top labels

bug (402)
enhancement (347)
help wanted (21)
good first issue (1)

Most active issues this week

#2712 [BUG] reg_token excluded from group_matcher()/no_weight_decay() in Eva and VisionTransformer - 5 events / 0 comments
#2711 [FEATURE] Add MogaNet (ICLR 2024) backbone - 3 events / 1 comment

Repository Insights (GitGenius)

Median issue/PR response: 0.0 hours

Mean response time: 336.3 days

90th percentile: 1319.9 days

Tracked items: 285

Most active contributors

rwightman - 602 events, 211 issues
conceptofmind - 11 events, 1 issue
MichaelMonashev - 9 events, 3 issues
TheDarkKnight-21th - 9 events, 4 issues
chenyanting1 - 9 events, 3 issues

Detailed Description

PyTorch Image Models is the largest collection of PyTorch image encoders and backbones. It is maintained primarily by rwightman, who accounts for 602 tracked events, with additional community contributions. The repository serves as a model zoo covering a wide range of architectures, including ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer variants, MobileNetV4, MobileNet-V3 and V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, and ConvNeXt. Beyond model definitions, it provides training, evaluation, inference, and export scripts alongside pretrained weights, making it a practical resource for computer vision practitioners.

GitGenius tracks 285 issues and pull requests for the repository. The median response latency of 0.0 hours masks a mean latency of 8070.1 hours (about 336.3 days): some items receive immediate attention, while others wait far longer for a response. Bug reports constitute the largest issue category with 131 items, followed by 116 enhancement requests and 5 help-wanted items. The codebase is classified across multiple computer vision domains, including semantic segmentation, object detection, image classification, and feature extraction, reflecting its applicability to a broad range of vision tasks.
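The gap between a 0.0-hour median and a mean of several thousand hours is what a heavily skewed distribution looks like. The short sketch below first converts the reported mean from hours to days, then shows the same effect on a small set of response times. The sample values are hypothetical and chosen only for illustration; they are not repository data.

```python
from statistics import mean, median

HOURS_PER_DAY = 24

# Figure reported on this page.
mean_latency_hours = 8070.1
# Converting to days reproduces the reported 336.3 days after rounding.
mean_latency_days = mean_latency_hours / HOURS_PER_DAY

# Hypothetical response times in hours (illustrative only):
# most items get an immediate reply, a few wait for years.
sample_hours = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 12.0, 9000.0, 20000.0, 31000.0]

sample_median = median(sample_hours)  # the middle of the sorted values
sample_mean = mean(sample_hours)      # pulled upward by the long tail

print(f"mean latency: {mean_latency_days:.1f} days")
print(f"sample median: {sample_median} h, sample mean: {sample_mean} h")
```

Because the median ignores how extreme the slow items are, it is the better indicator of a typical response, while the mean and the 90th percentile describe the long tail.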

Recent development activity shows an intensive focus on Vision Transformer variants and emerging architectures. In May 2026, the repository added EUPE ViT models with DINOv3-style training and ConvNeXt variants, along with TIPSv2 model definitions for DINOv2-style Vision Transformers. Earlier updates in 2025 introduced DINOv3 support for both ConvNeXt and ViT models, MobileCLIP-2 vision encoders, MetaCLIP-2 Worldwide ViT weights, and SigLIP-2 NaFlex ViT encoders. The repository also integrated support for Naver ROPE-ViT models and added MobileNetV5 backbone variants designed for Google Gemma 3n image encoding.

The codebase maintains active optimization efforts across multiple fronts. Recent releases introduced the Muon optimizer with customizations for convolutional weights and fallback mechanisms, alongside improvements to AdaMuon and NAdaMuon variants. Security enhancements include improved pickle checkpoint handling with weights_only=True as default and safe_global support for argument parsing. The repository added device and dtype factory keyword argument support across all models and modules, enabling flexible initialization strategies including meta-device model creation.
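The reason a weights_only=True default matters is that ordinary unpickling can resolve and call arbitrary globals named in the file. The sketch below shows the general allow-list idea using only the Python standard library. It is not PyTorch's or timm's implementation, and the names ALLOWED_GLOBALS, AllowListUnpickler, and safe_loads are hypothetical names introduced for this example.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allow-list for this sketch; PyTorch maintains its own list.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that only resolves globals named in ALLOWED_GLOBALS."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowed")


def safe_loads(data):
    """Deserialize bytes while rejecting any global outside the allow-list."""
    return AllowListUnpickler(io.BytesIO(data)).load()


# A checkpoint-like payload made of plain containers and numbers loads fine.
checkpoint = OrderedDict(epoch=3, weights=[0.1, 0.2])
restored = safe_loads(pickle.dumps(checkpoint))

# A payload that references an arbitrary callable is rejected at load time.
malicious = pickle.dumps(print)
try:
    safe_loads(malicious)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(restored, blocked)
```

In this model, registering a type as a safe global corresponds to adding an entry to the allow-list, so a checkpoint that stores extra objects such as parsed arguments can still load without re-enabling arbitrary unpickling.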

Benchmark coverage has expanded significantly, with new inference timing results added for RTX Pro 6000, 5090, and 4090 graphics cards using PyTorch 2.9.1. The repository maintains compatibility across PyTorch versions from 1.13 through 2.9.1 and Python versions from 3.10 through 3.13. Recent architectural additions include differential attention mechanisms, pooling modules such as LsePlus and SimPool, and normalization variants including Fp32 LayerNorm and RMSNorm options. GitGenius also relates the repository to microsoft/vscode, microsoft/typescript, and rust-lang/rust through overlapping contributors; this indicates shared contributors rather than any code-level integration between the projects.
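For readers unfamiliar with the RMSNorm option mentioned above, the following sketch contrasts it with LayerNorm on a plain Python list. It follows the commonly published formulas and is not timm's implementation; the function names rms_norm and layer_norm and the eps default are assumptions made for this example.

```python
import math


def rms_norm(values, weight=None, eps=1e-6):
    """Scale values by the reciprocal of their root mean square (no centering)."""
    mean_square = sum(v * v for v in values) / len(values)
    scale = 1.0 / math.sqrt(mean_square + eps)
    if weight is None:
        weight = [1.0] * len(values)
    return [v * scale * w for v, w in zip(values, weight)]


def layer_norm(values, eps=1e-6):
    """Subtract the mean, then divide by the standard deviation."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [(v - mu) * scale for v in values]


features = [3.0, 4.0]
rms_out = rms_norm(features)   # keeps the sign and relative size of each value
ln_out = layer_norm(features)  # centered, so the outputs sum to roughly zero

print(rms_out, ln_out)
```

RMSNorm skips the mean subtraction, which saves a reduction per call. An fp32 variant of either norm would run these reductions in 32-bit floats even when the surrounding model uses a lower-precision dtype.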
