Description: A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
`micrograd` by Andrej Karpathy is a concise implementation of a tiny autograd engine, designed to demystify the core mechanics of backpropagation and automatic differentiation. It serves as an educational tool, allowing users to build a foundational understanding of how modern deep learning frameworks like PyTorch or TensorFlow compute gradients. The repository's primary goal is to illustrate the principles of a computational graph and the chain rule from first principles, using only scalar values and plain Python. It strips away the complexity of production frameworks, presenting the essence of gradient computation in a few lines of code.
The `Value` object is `micrograd`'s core. Each `Value` instance encapsulates a single scalar number (`data`) and its corresponding gradient (`grad`), which accumulates during the backward pass. Crucially, every `Value` maintains references to its direct inputs (`_prev`) and the operation (`_op`) that produced it. This forms the explicit computational graph. For example, when two `Value` objects are added, a new `Value` is created, storing the sum, and its `_prev` set points back to the two original operands, along with a tag indicating the `+` operation.
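A minimal sketch of this structure, modeled on `micrograd`'s `Value` class but simplified (only `__add__` is shown, and the field layout is illustrative rather than the repo's exact code):

```python
class Value:
    """Sketch of micrograd's core object: a scalar plus graph bookkeeping."""
    def __init__(self, data, _children=(), _op=''):
        self.data = data                # the raw scalar
        self.grad = 0.0                 # filled in during the backward pass
        self._prev = set(_children)     # the Values that produced this one
        self._op = _op                  # tag for the producing operation

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # the new node points back at both operands, tagged '+'
        return Value(self.data + other.data, (self, other), '+')

a, b = Value(2.0), Value(3.0)
c = a + b
print(c.data, c._op)    # 5.0 +
```

Note that `c` carries not just the result `5.0` but also the edges (`_prev`) and the tag (`_op`) that let a backward pass later retrace how it was computed.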
The forward pass in `micrograd` is straightforward: mathematical operations like addition, multiplication, exponentiation, or activation functions (e.g., `tanh`) are overloaded for `Value` objects. When these operations are performed, they not only compute the result but also implicitly construct the computational graph. Each new `Value` object created during this process "remembers" its parents and the specific operation that generated it. This graph, built dynamically as computations unfold, is the essential structure upon which the backward pass operates.
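The following sketch (again a simplification, not the repo's exact code; `nodes_of` is a hypothetical helper) shows how a single forward expression leaves a complete graph behind:

```python
import math

class Value:
    """Overloaded operators that record the graph as a side effect."""
    def __init__(self, data, _children=(), _op=''):
        self.data = data
        self._prev = set(_children)
        self._op = _op

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), '+')

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), '*')

    def tanh(self):
        return Value(math.tanh(self.data), (self,), 'tanh')

def nodes_of(v, seen=None):
    """Walk _prev pointers to collect every node reachable from an output."""
    seen = set() if seen is None else seen
    if v not in seen:
        seen.add(v)
        for child in v._prev:
            nodes_of(child, seen)
    return seen

a, b, c = Value(1.0), Value(2.0), Value(4.0)
e = ((a + b) * c).tanh()    # one forward expression...
print(len(nodes_of(e)))     # ...yields 6 graph nodes: a, b, c, +, *, tanh
```

No explicit "build graph" call is ever made: evaluating the expression is what constructs the graph.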
The `backward()` method is where `micrograd`'s real power lies. When called on the final output `Value` of a computation, it initiates gradient propagation. First, the output's gradient is initialized to 1.0. Then, `micrograd` performs a topological sort of the computational graph and iterates over the `Value` objects in reverse order, from the output back to the inputs. For each `Value`, it invokes a `_backward()` function attached to that `Value`. This function applies the chain rule for the specific operation that created the `Value`, distributing the incoming gradient (`grad`) to its immediate predecessors (`_prev`).
This elegant mechanism allows `micrograd` to compute gradients for arbitrarily complex expressions. The repository demonstrates this by building and training a tiny multi-layer perceptron (MLP). It shows how to define parameters as `Value` objects, perform a forward pass to get a loss, and then call `loss.backward()` to compute all necessary gradients. Subsequently, these gradients are used to update the parameters via simple gradient descent, illustrating the complete training loop of a neural network from first principles.
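The same loop can be demonstrated on a deliberately tiny problem: fitting y = 2x with a single weight. This is a toy stand-in for the repo's MLP demo, using the sketch engine from above rather than `micrograd`'s actual `nn` module; the names `w`, `xs`, and `ys` are illustrative.

```python
class Value:
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# Training data for y = 2x, and one trainable parameter
xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
w = Value(0.0)
for step in range(30):
    loss = Value(0.0)
    for x, y in zip(xs, ys):
        diff = w * x + (-y)          # prediction minus target
        loss = loss + diff * diff    # squared error
    w.grad = 0.0                     # zero the grad before each backward pass
    loss.backward()                  # populate w.grad via the chain rule
    w.data -= 0.01 * w.grad         # gradient descent step
print(round(w.data, 3))              # converges toward 2.0
```

Zeroing `w.grad` before each `backward()` call matters: gradients accumulate by design, so stale values from the previous step would otherwise corrupt the update.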
Ultimately, `micrograd` is an educational masterclass. Building this system from scratch offers profound insights into deep learning frameworks, revealing how the chain rule is systematically applied across a computational graph for efficient gradient computation—the bedrock of modern machine learning.