Description: Multi-framework responsive image component
View ascorbic/unpic-img on GitHub ↗
Unpic-img is a responsive image component published for multiple front-end frameworks, including React, Vue, Svelte, Solid, Preact, Qwik, Astro, Lit, Angular, and WebC. It renders a single `<img>` element carrying the attributes that responsive images need, so the browser can pick an appropriately sized file for the current viewport and screen density. It addresses the common problem that correct `srcset` and `sizes` attributes are tedious to write by hand and easy to get wrong.
The project's primary strength lies in its use of the companion `unpic` library, which recognizes image URLs from many image CDNs and content platforms (Imgix, Cloudinary, and Shopify among them) and rewrites them to request a particular width, height, or format. Because the CDN that already hosts the image does the resizing, no build step or server-side image processing is required. The component only emits markup; it adds no wrapper elements and no client-side JavaScript runtime around the image.
Each framework has its own package (for example `@unpic/react` or `@unpic/vue`) that exports an `Image` component. Users pass the `src` URL together with dimensions and a `layout`: `fixed` keeps the image at the given size, `constrained` lets it shrink with its container but never grow beyond the given width, and `fullWidth` stretches it across the container. Supplying `width` and `height`, or one of them plus `aspectRatio`, lets the component reserve the right amount of space and avoid layout shift while the image loads.
Beyond sizing, the component sets loading behavior. By default images use native lazy loading and asynchronous decoding, which suits offscreen content. Setting the `priority` prop switches to eager loading with a high fetch priority, which suits above-the-fold images such as a hero banner. A `background` prop can supply a placeholder that is shown until the full image has loaded.
The repository is organized as a set of per-framework packages with documentation and examples for each. Unpic-img is a useful tool for anyone who already serves images from a supported CDN or CMS and wants responsive, layout-stable images with the same API across frameworks.
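The Description above calls this a responsive image component, and the core job of such a component is deriving `srcset` and `sizes` from a source URL and a display width. The following is a minimal, framework-free sketch of that idea, not unpic-img's own code. Every name in it (`resizeUrl`, `candidateWidths`, `imgAttributes`, `BREAKPOINTS`) is hypothetical, and it assumes the CDN accepts a `w` query parameter for width, as Imgix-style URLs do.

```typescript
// Hypothetical breakpoint list; a real component would choose its own.
const BREAKPOINTS: number[] = [320, 640, 960, 1280, 1920, 2560];

// Assumption: the CDN resizes when given a `w` query parameter.
function resizeUrl(src: string, width: number): string {
  const separator = src.indexOf("?") === -1 ? "?" : "&";
  return `${src}${separator}w=${width}`;
}

// Candidate widths up to 2x the display width, so high-density
// screens still get a sharp image. Sorted and de-duplicated.
function candidateWidths(displayWidth: number): number[] {
  const max = displayWidth * 2;
  const widths = BREAKPOINTS.filter((w) => w < max);
  widths.push(displayWidth, max);
  return widths
    .filter((w, i, all) => all.indexOf(w) === i)
    .sort((a, b) => a - b);
}

interface ImgAttributes {
  src: string;
  srcset: string;
  sizes: string;
  width: number;
  height: number;
  loading: "lazy" | "eager";
  decoding: "async";
}

// Attributes for an image that may shrink with its container
// but never grows beyond `width` CSS pixels.
function imgAttributes(
  src: string,
  width: number,
  height: number,
  priority: boolean = false
): ImgAttributes {
  const srcset = candidateWidths(width)
    .map((w) => `${resizeUrl(src, w)} ${w}w`)
    .join(", ");
  return {
    src: resizeUrl(src, width),
    srcset,
    sizes: `(min-width: ${width}px) ${width}px, 100vw`,
    width,
    height,
    loading: priority ? "eager" : "lazy",
    decoding: "async",
  };
}
```

Calling `imgAttributes("https://images.example.com/photo.jpg", 800, 600)` yields attributes that can be spread onto an `<img>` element. Keeping `width` and `height` in the output is what lets the browser reserve space before the file arrives.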