heretic by p-e-w

Description: Fully automatic censorship removal for language models


Summary Information

Updated 51 minutes ago

Added to GitGenius on February 18th, 2026

Created on September 21st, 2025

Open Issues & Pull Requests: 72 (+0)

Number of forks: 2,814

Total Stargazers: 25,966 (+1)

Total Subscribers: 106 (+0)

Issue Activity (beta)

Open issues: 48

New in 7 days: 1

Closed in 7 days: 0

Avg open age: 56 days

Stale 30+ days: 38

Stale 90+ days: 27

Recent activity

Opened in 7 days: 1

Closed in 7 days: 0

Comments in 7 days: 1

Events in 7 days: 2

Top labels

bug (18)
enhancement (13)
question (2)
wontfix (2)

Most active issues this week

#401 Important announcement concerning the future of the Heretic project - 2 events / 1 comments


Repository Insights (GitGenius)

Median issue/PR response: 3.4 hours

Mean response time: 3.1 days

90th percentile: 43.9 hours

Tracked items: 194

Most active contributors

p-e-w - 501 events, 174 issues
Vinay-Umrethe - 70 events, 38 issues
kabachuha - 38 events, 20 issues
accemlcc - 34 events, 7 issues
rigin - 19 events, 6 issues

Related by overlapping contributors

ggml-org/llama.cpp
oobabooga/textgen
comfy-org/comfyui

Detailed Description

Heretic is a Python tool that automatically removes safety alignment and censorship from transformer-based language models without requiring expensive post-training. GitGenius classifies it as a lightweight CLI tool that supports both local development and web server use.

The core functionality of Heretic combines an advanced implementation of directional ablation, commonly referred to as abliteration, with a TPE-based parameter optimizer powered by Optuna. This combination enables the tool to work completely automatically, finding high-quality abliteration parameters by co-minimizing the number of refusals and KL divergence from the original model. The approach requires no understanding of transformer internals, making it accessible to users who simply know how to run command-line programs.
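To make the two objectives concrete, the following is a minimal, dependency-free sketch of how a refusal count and a KL divergence could be computed and how candidate parameter sets could be compared on both at once. It is not Heretic's code and does not use Optuna. The function names, the refusal marker list, and the direction of the divergence (original model's distribution first) are assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as probability lists.

    Assumption: p is the original model's next-token distribution and
    q is the modified model's distribution over the same vocabulary.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def count_refusals(responses, markers=("i can't", "i cannot", "i won't")):
    """Count responses containing any refusal marker (hypothetical list)."""
    return sum(any(m in r.lower() for m in markers) for r in responses)


def pareto_front(candidates):
    """Keep (refusals, kl) pairs that no other candidate beats on both.

    A candidate is dominated when another, different candidate is at
    least as good on both objectives.
    """
    front = []
    for c in candidates:
        dominated = any(
            o != c and o[0] <= c[0] and o[1] <= c[1] for o in candidates
        )
        if not dominated:
            front.append(c)
    return front


# Four hypothetical trial results as (refusals, KL divergence) pairs.
trials = [(10, 0.1), (2, 0.5), (2, 0.9), (12, 0.2)]
best = pareto_front(trials)
```

Because fewer refusals usually cost some divergence from the original model, there is rarely a single best trial. The non-dominated set is what a multi-objective optimizer leaves the user to choose from.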

Heretic supports a broad range of model architectures, including most dense models, many multimodal models, several different mixture-of-experts architectures, and hybrid models like Qwen3.5. However, pure state-space models and certain research architectures are not yet supported out of the box. The tool's effectiveness is demonstrated through comparative metrics showing that Heretic-generated models achieve comparable refusal suppression to manually created abliterations while maintaining significantly lower KL divergence, indicating less damage to the original model's capabilities.

According to GitGenius activity tracking across 193 issues and pull requests, the repository has a median response latency of 3.7 hours and a mean of 74.3 hours. The most common issue labels are bug (18 occurrences) and enhancement (13 occurrences). Primary contributor p-e-w has logged 501 events, followed by Vinay-Umrethe with 70 events and kabachuha with 38 events. The repository shares contributors with ggml-org/llama.cpp, oobabooga/textgen, and comfy-org/comfyui, which suggests it sits within a broader ecosystem of language model tools.

The tool operates through a parametrized variant of directional ablation that identifies and orthogonalizes relevant matrices in transformer components with respect to refusal directions. Refusal directions are computed as difference-of-means between first-token residuals for harmful and harmless example prompts. The ablation process is controlled by several optimizable parameters that users can configure through command-line options or a configuration file.
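The two steps described above, computing a difference-of-means direction and orthogonalizing a matrix against it, can be sketched in plain Python on toy data. This is an illustration of the general technique under stated assumptions, not Heretic's implementation. The function names, the toy residuals, and the single `scale` parameter (a stand-in for the optimizable ablation parameters, which the source does not enumerate) are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(harmful, harmless):
    """Unit-length difference of means between two sets of residuals."""
    diff = [b - g for b, g in zip(mean_vector(harmful), mean_vector(harmless))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def ablate(weight, direction, scale=1.0):
    """Return W' = W - scale * r (r^T W).

    Assumption: `weight` is a list of rows whose outputs are written into
    the residual stream, so it has len(direction) rows. With scale=1.0 the
    result can no longer write anything along `direction`.
    """
    rows, cols = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along the direction.
    coeffs = [
        sum(direction[i] * weight[i][j] for i in range(rows))
        for j in range(cols)
    ]
    return [
        [weight[i][j] - scale * direction[i] * coeffs[j] for j in range(cols)]
        for i in range(rows)
    ]


# Toy first-token residuals in a 2-dimensional residual space.
harmful = [[2.0, 0.0], [4.0, 0.0]]
harmless = [[0.0, 0.0], [0.0, 0.0]]
r = refusal_direction(harmful, harmless)

W = [[1.0, 2.0], [3.0, 4.0]]
W_ablated = ablate(W, r)
```

A `scale` below 1.0 removes only part of the component along the direction, which is one simple way an ablation could be made tunable rather than all-or-nothing.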

Heretic includes research-focused features for interpretability studies, available through an optional research extra. These include plots of residual vectors, produced by a PaCMAP projection from residual space to 2D space, with animated GIF output showing how residuals transform between layers. The tool also provides quantitative analysis of residual geometry through detailed metrics tables.

The project maintains a homepage at heretic-project.org and provides community engagement through Discord and Matrix channels. The community has created and published over 4000 models using Heretic, available on Hugging Face. The tool supports model quantization with bitsandbytes to reduce VRAM requirements and includes built-in evaluation functionality, allowing users to reproduce benchmark results and compare decensored models against baseline versions.
