scikit-learn by scikit-learn

Description: scikit-learn: machine learning in Python

Summary Information

Updated 2 hours ago

Added to GitGenius on April 24th, 2023

Created on August 17th, 2010

Open Issues & Pull Requests: 2,101 (+0)

Number of forks: 27,163

Total Stargazers: 66,646 (+1)

Total Subscribers: 2,120 (+0)

Issue Activity (beta)

Open issues: 949

New in 7 days: 3

Closed in 7 days: 2

Avg open age: 1,402 days

Stale 30+ days: 870

Stale 90+ days: 775

Recent activity

Opened in 7 days: 2

Closed in 7 days: 1

Comments in 7 days: 14

Events in 7 days: 51

Top labels

Bug (982)
New Feature (695)
Documentation (587)
Needs Triage (583)
help wanted (317)
Enhancement (263)
Needs Decision (158)
Build / CI (154)

Most active issues this week

#34437 The PyPI wheel doesn't bundle the right `libgomp` on Linux - 9 events / 2 comments
#21391 add sklearn.metrics Display class to plot Precision/Recall/F1 for probability thresholds - 6 events / 4 comments
#32393 ⚠️ CI failed on Wheel builder (last failure: Jul 06, 2026) ⚠️ - 4 events / 0 comments
#34029 LinearSVC crammer_singer reports a garbage n_iter_ - 4 events / 2 comments
#34200 ⚠️ CI failed on Unit tests Linux x86-64 pylatest_free_threaded (last failure: Jul 06, 2026) ⚠️ - 4 events / 0 comments

Repository Insights (GitGenius)

Median issue/PR response: 1569.4 days

Mean response time: 1698.1 days

90th percentile: 3482.8 days

Tracked items: 679

Most active contributors

lesteve - 1,587 events, 493 issues
ogrisel - 1,572 events, 467 issues
adrinjalali - 1,382 events, 579 issues
jeremiedbb - 858 events, 400 issues
glemaitre - 856 events, 361 issues

Detailed Description

scikit-learn is a Python machine learning library built on top of SciPy and distributed under the 3-Clause BSD license. The project was initiated in 2007 by David Cournapeau as a Google Summer of Code project and has since grown through contributions from numerous volunteers. It is currently maintained by a volunteer team and provides a comprehensive toolkit for machine learning tasks in Python.

The library encompasses a broad range of machine learning functionality across multiple domains. GitGenius classification identifies the repository as covering regression, ensemble methods, model selection, datasets, data mining, supervised learning, classification, preprocessing, dimensionality reduction, clustering, unsupervised learning, feature extraction, and data analysis. This breadth reflects scikit-learn's position as a general-purpose machine learning framework rather than a specialized tool for any single technique or problem domain.

The project is under active development with significant community engagement. GitGenius reports a median response latency of 0.0 hours across 2,227 tracked issues and pull requests. That figure conflicts with the 1,569.4-day median over 679 tracked items listed under Repository Insights, so it should not be read as evidence of rapid triage; the two values presumably come from different snapshots or metric definitions. The most frequent issue categories in this count are Bug reports (618 items), New Feature requests (425), and Documentation improvements (341). The most active contributors are lesteve with 1,587 tracked events, ogrisel with 1,569, and adrinjalali with 1,378.

The repository's technical infrastructure reflects production-grade standards. The codebase uses GitHub Actions for unit testing, CircleCI for continuous integration, and Codecov for coverage tracking. Code quality is enforced through Ruff for style checking, and nightly wheel builds ensure compatibility across platforms. The project maintains a benchmark suite tracked through asv, allowing performance monitoring across releases.

Dependencies are declared with minimum versions. The project requires Python 3.11 or later, NumPy 1.24.1 or higher, and SciPy 1.10.0 or higher. Further listed minimums are joblib 1.4.0, Narwhals 2.0.1, and threadpoolctl 3.5.0, along with Matplotlib 3.6.1, scikit-image 0.22.0, pandas 1.5.0, seaborn 0.13.0, and Plotly 5.22.0 (used for plotting and examples) and pytest 7.1.2 (used for running the tests). Together these reflect the library's integration with the broader Python scientific computing ecosystem.
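As a rough illustration of what these minimum versions mean in practice, the sketch below checks an environment against a few of the listed values using only the standard library. The helper names (`parse_release`, `meets_minimum`, `check_environment`) are hypothetical and not part of scikit-learn, and the comparison only looks at the leading numeric components, so a pre-release such as `2.0.0rc1` is treated like `2.0.0`.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the description above (a subset).
MINIMUM_VERSIONS = {
    "numpy": "1.24.1",
    "scipy": "1.10.0",
    "joblib": "1.4.0",
    "threadpoolctl": "3.5.0",
}


def parse_release(text):
    """Turn '1.24.1' or '2.0.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings by their numeric release components."""
    return parse_release(installed) >= parse_release(minimum)


def check_environment(minimums=MINIMUM_VERSIONS):
    """Return {package: (installed_version_or_None, ok)} for each package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            # Package is not installed at all.
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        print(f"{name}: installed={installed} ok={ok}")
```

For real dependency resolution, pip or conda performs this check during installation; the sketch is only meant to make the "X or higher" wording concrete.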

The repository shares contributors with other major projects, including Microsoft's VS Code, pandas-dev/pandas, and matplotlib/matplotlib, which points to close ties with the wider Python data science community. Installation is available through pip or conda, and documentation is published for both stable releases and development versions. The project welcomes new contributors of all experience levels and provides development guides covering code contribution, documentation, and testing procedures. For reproducible test runs, random number seeding can be controlled through the SKLEARN_SEED environment variable.
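The general pattern behind an environment-controlled seed such as SKLEARN_SEED can be sketched as follows. This is an assumption-laden illustration, not scikit-learn's test code: `seed_from_env` is a hypothetical helper, and the standard-library `random` module stands in for whatever random number generator a test suite actually uses.

```python
import os
import random


def seed_from_env(var_name="SKLEARN_SEED", environ=None):
    """Return an integer seed from the environment, or a random one if unset.

    Hypothetical helper: it shows the pattern only and is not taken from
    scikit-learn.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(var_name)
    if raw is None:
        # No seed requested: pick one at random so runs differ.
        return random.randrange(2**31)
    return int(raw)


# Two runs that see the same seed value draw the same random numbers.
fake_env = {"SKLEARN_SEED": "42"}
first_run = random.Random(seed_from_env(environ=fake_env))
second_run = random.Random(seed_from_env(environ=fake_env))
first_values = [first_run.random() for _ in range(3)]
second_values = [second_run.random() for _ in range(3)]
assert first_values == second_values
```

Passing a dictionary instead of reading `os.environ` directly keeps the example free of side effects; a real test run would set the variable in the shell before invoking the test runner.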
