Description: 🐢 Open-Source Evaluation & Testing library for LLM Agents
View giskard-ai/giskard-oss on GitHub ↗
Giskard-OSS is an open-source platform that helps developers and data scientists build trustworthy and reliable machine learning models. It addresses model explainability, fairness, and robustness, going beyond simple accuracy metrics to give a fuller picture of model behavior. The platform offers a user-friendly interface and a suite of tools for model evaluation, debugging, and monitoring across the entire model lifecycle, from development to production.
At its core, Giskard enables responsible AI practices. It lets users assess model performance along several dimensions, including accuracy, fairness, and robustness. The platform can detect and help mitigate biases in model predictions, so that models do not systematically disadvantage specific demographic groups, and it evaluates robustness by simulating adversarial attacks and probing sensitivity to input perturbations. This shows how a model behaves under different conditions and surfaces potential weaknesses.
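To make these two checks concrete, here is a minimal sketch in plain Python of what a fairness gap (largest per-group accuracy difference) and a perturbation-robustness score measure. This is an illustration of the concepts only, not Giskard's actual API; the function names are hypothetical.

```python
def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def fairness_gap(preds, labels, groups):
    """Largest accuracy difference between any two demographic groups."""
    per_group = {}
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        per_group[g] = accuracy([preds[i] for i in idx],
                                [labels[i] for i in idx])
    return max(per_group.values()) - min(per_group.values())

def robustness(model, inputs, labels, perturb):
    """Fraction of originally-correct predictions that survive a perturbation."""
    correct = [(x, y) for x, y in zip(inputs, labels) if model(x) == y]
    if not correct:
        return 0.0
    return sum(model(perturb(x)) == y for x, y in correct) / len(correct)
```

A large fairness gap or a low robustness score flags exactly the kind of weakness the platform is designed to surface, e.g. a threshold classifier `lambda x: int(x > 0.5)` that flips predictions under a small input shift.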
Giskard's key features include a model catalog for organizing and managing models, a testing framework for defining and running automated tests, and a debugging environment for investigating model failures.

The model catalog stores and tracks model versions together with their metadata and performance metrics. The testing framework lets users write custom tests that encode specific requirements and business rules; these tests can run automatically, providing continuous monitoring and alerting users to potential issues. The debugging environment offers tools to analyze model predictions, trace the root causes of errors, and understand the factors behind model decisions. This includes feature importance analysis, which highlights the features that most influence a model's predictions, and counterfactual explanations, which show how input features would need to change to alter the model's output.
Furthermore, Giskard integrates with common machine learning frameworks and deployment environments. It offers connectors for popular libraries such as scikit-learn, TensorFlow, and PyTorch, so it fits into existing workflows, and it provides options for deploying models and monitoring their performance in production. Users can track model accuracy, fairness, and robustness over time and spot any degradation; built-in alerts and notifications let them address emerging issues proactively.
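The degradation-alert idea can be illustrated with a small rolling-window monitor. This is a conceptual sketch under assumed design choices (the `AccuracyMonitor` class, window size, and tolerance are all hypothetical), not Giskard's monitoring API.

```python
from collections import deque

class AccuracyMonitor:
    """Track rolling accuracy and alert when it falls below baseline - tolerance."""

    def __init__(self, baseline, tolerance=0.05, window=100):
        self.baseline = baseline
        self.tolerance = tolerance
        self.window = deque(maxlen=window)  # recent (prediction == label) outcomes
        self.alerts = []

    def record(self, prediction, label):
        self.window.append(prediction == label)
        acc = sum(self.window) / len(self.window)
        if acc < self.baseline - self.tolerance:
            self.alerts.append(f"accuracy degraded to {acc:.2f}")
        return acc
```

In practice such alerts would be routed to a notification channel; the key design choice is comparing live accuracy against a baseline captured at deployment time rather than a fixed absolute threshold.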
In essence, Giskard-OSS empowers data scientists and developers to build and deploy trustworthy machine learning models through a single platform for evaluation, debugging, and monitoring. By focusing on explainability, fairness, and robustness, it helps users understand and control their models' behavior throughout the lifecycle, and its user-friendly interface, automated testing, and framework integrations make it a practical tool for anyone building reliable, responsible AI systems.