This repository has been archived by the owner on Mar 16, 2024. It is now read-only.

Web Extension for Firefox/Chrome/MS Edge and CLI tool to save a faithful copy of an entire web page in a single HTML file


grey-box/Project-Scraper_SingleFile-Old


SingleFile

SingleFile is a Web Extension (and a CLI tool) compatible with Chrome, Firefox (Desktop and Mobile), Microsoft Edge, Vivaldi, Brave, Waterfox, Yandex browser, and Opera. It lets you save a complete web page as a single HTML file.

Demo

Demo.SingleFile.mp4

Install

SingleFile can be installed on:

You can also download the zip file (https://github.com/gildas-lormeau/SingleFile/archive/master.zip) of the project and install it manually by unzipping it somewhere on your disk and following these instructions:

Getting started

  • Wait until the page is fully loaded.
  • Click on the SingleFile button in the extension toolbar to save the page.
  • You can click again on the button to cancel the action when processing a page.

Additional notes

  • Open the context menu by right-clicking the SingleFile button in the extension toolbar or on the webpage. It allows you to save:
    • the current tab,
    • the selected content,
    • the selected frame.
  • You can also process multiple tabs in one click and save:
    • the selected tabs,
    • the unpinned tabs,
    • all the tabs.
  • Select "Annotate and save the page..." in the context menu to:
    • highlight text,
    • add notes,
    • remove content.
  • The context menu also allows you to activate the auto-save of:
    • the current tab,
    • the unpinned tabs,
    • all the tabs.
  • With auto-save active, each page is saved automatically once it has loaded (or just before it is unloaded, if it has not finished loading).
  • Right-click on the SingleFile button and select "Manage extension" (Firefox) / "Options" (Chrome) to open the options page.
  • Enable the option "Destination > save to Google Drive" or "Destination > upload to GitHub" to upload pages to Google Drive or GitHub respectively.
  • Enable the option "Misc. > add proof of existence" to prove the existence of saved pages by linking the SHA256 of the pages into the blockchain.
  • You can use the customizable keyboard shortcut Ctrl+Shift+Y to save the current tab or the selected tabs. To change it in Firefox, go to about:addons and select "Manage extension shortcuts" in the cogwheel menu. To change it in Chrome, go to chrome://extensions/shortcuts.
  • The default save folder is the download folder configured in your browser, cf. about:addons in Firefox and chrome://settings in Chrome.
  • See the extension help in the options page for more detailed information about the options and technical notes.
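
The "add proof of existence" note above refers to the SHA256 of a saved page. As a minimal sketch, assuming you simply want to compute that digest for a file on disk and compare it with a value recorded elsewhere, the following Python function hashes a saved page. The function name `sha256_of_file` is hypothetical, and this README does not describe exactly which bytes the extension hashes or which service records the digest, so treat the comparison as an illustration only.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that large saved pages do not have
    to fit in memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of_file("saved-page.html")` (a placeholder file name) returns a 64-character hexadecimal string that can be compared with a previously recorded digest.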

FAQ

See https://github.com/gildas-lormeau/SingleFile/blob/master/faq.md

Release notes

See https://addons.mozilla.org/firefox/addon/single-file/versions/

Known Issues

  • All browsers:
    • For security reasons, you cannot save pages hosted on https://chrome.google.com, https://addons.mozilla.org and some other Mozilla domains. When this happens, 🛇 is displayed on top of the SingleFile icon.
    • For security reasons, SingleFile is sometimes unable to save the image representation of canvas and snapshots of video elements.
    • The last saved path cannot be remembered by default. To circumvent this limitation, disable the option "Misc > save pages in background".
    • The following characters are replaced with _ in file names: ~, +, \, ?, %, *, :, |, ", <, >
  • Chromium-based browsers:
    • You must enable the option "Allow access to file URLs" in the extension page to display the infobar when viewing a saved page, and to save or to annotate a page stored on the filesystem.
    • If the file name of a saved page looks like "56833935-156b-4d8c-a00f-19599c6513d3.html", disable the option "Misc > save pages in background". Reinstalling the browser may also fix this issue. You can find more info about this bug here.
    • Disabling the option "File name > open the "Save as" dialog to confirm the file name" will work if and only if the option "Ask where to save each file before downloading" is disabled in chrome://settings/downloads.
  • Firefox:
    • The "File name > file name conflict resolution" option does not work if set to "prompt for a name"
    • Sometimes, SingleFile is unable to save the contents of sandboxed iframes because of this bug.
    • When processing a page from the filesystem, external resources (e.g. images, stylesheets, fonts, etc.) will not be embedded into the saved page. You can find more info about this bug here. Mozilla has closed this bug as "WontFix", but there is a simple workaround proposed here.
  • Waterfox Classic:
    • User interface elements displayed in the page (progress bar, logs panel) won't be displayed unless dom.webcomponents.enabled is enabled in about:config.
    • When opening pages saved with the option "Images > group duplicate images together" enabled, some duplicate images might not be displayed. It is recommended to disable this option.
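
The file-name rule listed under "All browsers" above (certain characters are replaced with _) can be sketched as follows. This is a Python illustration of the rule as stated in this README, not the extension's own code; the names `REPLACED_CHARS` and `sanitize_file_name` are hypothetical.

```python
# Characters listed in "Known Issues" as being replaced with "_".
REPLACED_CHARS = '~+\\?%*:|"<>'

# Translation table mapping each listed character to an underscore.
_TRANSLATION = str.maketrans({char: "_" for char in REPLACED_CHARS})


def sanitize_file_name(name):
    """Replace every listed character in `name` with an underscore."""
    return name.translate(_TRANSLATION)
```

For example, a page title such as `What is 2+2?: a guide` would become `What is 2_2__ a guide` under this rule.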

Troubleshooting unknown issues

Please follow these steps if you find an unknown issue:

  • Save the page in incognito mode.
  • If saving the page in incognito mode did not fix the issue, reset the SingleFile options.
  • If resetting the options did not fix the issue, restart the browser.
  • If restarting the browser did not fix the issue, try disabling all other extensions to see whether there is a conflict.
  • If there is a conflict, try to determine which extension(s) cause it.
  • Please report the issue with a short description on how to reproduce it here: https://github.com/gildas-lormeau/SingleFile/issues.

Command Line Interface

You can save web pages to HTML from the command line interface. See here for more info: https://github.com/gildas-lormeau/SingleFile/blob/master/cli/README.MD.

Integration with user scripts

You can execute a user script just before (and after) SingleFile saves a page. For more info, see https://github.com/gildas-lormeau/SingleFile/wiki/How-to-execute-a-user-script-before-a-page-is-saved.

SingleFileZ

SingleFileZ is a fork of SingleFile that allows you to save a webpage as a self-extracting HTML file. This HTML file is also a valid ZIP file which contains the resources (images, fonts, stylesheets and frames) of the saved page. This ZIP file can be unzipped on the filesystem in order, for example, to view the page in a browser that would not support pages saved with SingleFileZ.

More info here: https://github.com/gildas-lormeau/SingleFileZ
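
Because a page saved with SingleFileZ is described above as both an HTML file and a valid ZIP file, a generic ZIP reader should be able to list and extract its resources. The sketch below uses Python's standard `zipfile` module, which tolerates data placed before the archive. The function names are hypothetical, the entry names inside a real SingleFileZ file are not documented here, and whether every SingleFileZ output opens this way is an assumption rather than something this README confirms.

```python
import zipfile


def list_embedded_resources(path):
    """Return the names of the entries stored in a ZIP-compatible file."""
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not readable as a ZIP archive")
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def extract_page(path, target_dir):
    """Unzip every entry of the file at `path` into `target_dir`."""
    with zipfile.ZipFile(path) as archive:
        archive.extractall(target_dir)
```

After extraction, the page can be opened from `target_dir` in a browser that does not support the self-extracting format.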

File format comparison

| | HTML (SingleFile) | HTML (SingleFileZ) | MAFF | MHTML | Webarchive (Safari) | HTML+folder |
|---|---|---|---|---|---|---|
| Pages are saved as a single file | ✓ | ✓ | ✓ | ✓ | ✓ | |
| HTML and styles are minified | ✓ | ✓ | | | | |
| Unused HTML and styles are removed from files | ✓ | ✓ | | | | |
| Binary resources are not encoded in base 64 | | ✓ | ✓ | | ✓ | ✓ |
| Files are compressed | | ✓ | ✓ | | | |
| Files can be viewed without installing any extension | ✓ | ✓¹ | | ✓² | ✓³ | ✓ |
| Files can be viewed without running JavaScript | ✓ | | ✓ | ✓ | ✓ | ✓ |
| Files can be unzipped to extract resources and view pages | | ✓ | ✓ | | | n/a |
| Files contain the text of the page (plain or formatted) which can be indexed | ✓ | ✓⁴ | | ✓ | ✓ | ✓ |

Footnotes:

¹ A switch must be passed from the command line in Chromium-based browsers, and an option must be enabled in Safari.

² Only in Chromium-based browsers, and Internet Explorer.

³ Only in Safari.

⁴ An option must be enabled in the extension.
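
The "Binary resources are not encoded in base 64" row matters for file size: base64 represents every 3 bytes of binary data as 4 text characters. The Python sketch below shows what embedding a resource as a base64 `data:` URI looks like and how the encoded length grows. It assumes, as that row implies, that a plain single-file HTML page embeds its binary resources this way; the function names are hypothetical and this is not the extension's code.

```python
import base64


def to_data_uri(data, mime_type):
    """Embed raw bytes as a base64 data: URI usable in HTML or CSS."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def base64_length(byte_count):
    """Return the number of characters base64 needs for `byte_count` bytes.

    Each group of up to 3 input bytes becomes 4 output characters,
    so the encoded form is roughly one third larger than the original.
    """
    return 4 * ((byte_count + 2) // 3)
```

Under this arithmetic, a 300-byte image needs 400 characters once encoded, before counting the `data:` prefix.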

Projects using SingleFile

Privacy Policy

See https://github.com/gildas-lormeau/SingleFile/blob/master/privacy.md

Contributors

Code derived from third party projects

Icons

License

SingleFile is licensed under AGPL. Code derived from third-party projects is licensed under MIT. Please contact me at gildas.lormeau <at> gmail.com if you are interested in licensing the SingleFile code for a commercial service or product.

Suggestions are welcome :)
