Sushy

A wiki/blogging engine with a static file back-end, full-text indexing and multiple markup support.

This was formerly the site engine for taoofmac.com circa 2015 until I decided to switch back to pure Python for maintainability.

Status

Many years later, I've decided to at least clean up the legacy codebase and bring it up to date. Once done, it should again be deployable to piku/Dokku-alt/Dokku/Heroku.

The goal is to make it run on the 1.0.0 version of Hylang, which was finally released on September 22nd, 2024.

Roadmap

  • Switch as much as possible to aiohttp so we can leverage uvloop fully.
  • Fix all the various breaking syntax changes that the Hy project has gone through in the past few years
  • A little more documentation (there's never enough)
  • Blog archive and partial feature parity with the current taoofmac.com site engine
  • End-to-end syntax and linting checks
  • Fix link and image handling, which require some tweaks
  • Working decorators and HTTP serving with the 2023 versions of Hy
  • Removed *earmuffs* in favor of standard Python constants, because Hy now handles those differently
  • (Mostly) working indexing with the 2023 versions of Hy
  • Page aliasing (i.e., multiple URLs for a page)
  • Image thumbnailing
  • Friendlier search results
  • More CSS tweaks
  • Atom feeds
  • piku deployment
  • Blog homepage/prev-next navigation
  • Preliminary support for rendering IPython notebooks
  • Closest-match URLs (i.e., fix typos) (removed for performance concerns on large sites)
  • HTTP caching (Etag, Last-Modified, HEAD support, etc.)
  • Sitemap
  • OpenSearch support (search directly from the omnibar on some browsers)
  • CSS inlining for Atom feeds
  • multiprocessing-based indexer (in feature/multiprocessing, disabled for ease of profiling)
  • SSE (Server-Sent Events) support (in feature/server-events) for notifying visitors a page has changed
  • New Relic Support
  • Internal link tracking (SeeAlso functionality, as seen on Yaki)
  • Multiple theme support (only the one theme for now)
  • Automatic insertion of image sizes in img tags
  • Deployable under Dokku-alt
  • Run under uWSGI using gevent workers
  • Full-text indexing and search
  • Syntax highlighting for inline code samples
  • Ink-based site layout and templates (replaced by a new layout in the feature/blog branch)
  • Baseline markup rendering (Textile, Markdown and ReST)

Stuff that will never happen:

  • Site thumbnailing (for taking screenshots of external links) - moved to a separate app
  • Web-based UI for editing pages (you're supposed to do this out-of-band)
  • Revision history (you're supposed to manage your content with Dropbox or git)
  • Comment support

Principles of Operation

  • All your Textile, Markdown or ReStructured Text content lives in a filesystem tree, with a folder per page
  • Sushy grabs and renders those on demand with fine-tuned HTTP headers (this is independent of whether or not you put Varnish or CloudFlare in front for caching)
  • It also maintains a SQLite database with a full-text index of all your content - updated live as you add/edit content.
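To make the indexing idea concrete, here is a minimal sketch in plain Python (not Hy, and not the actual Sushy indexer). It assumes a SQLite build with the FTS5 extension enabled; the table layout and the function names `open_index`, `index_page` and `search` are made up for illustration.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the index database. The schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS pages "
        "USING fts5(name UNINDEXED, title, body)"
    )
    return db


def index_page(db, name, title, body):
    """Replace any existing entry for a page, so edits show up immediately."""
    with db:  # commit both statements as one transaction
        db.execute("DELETE FROM pages WHERE name = ?", (name,))
        db.execute(
            "INSERT INTO pages (name, title, body) VALUES (?, ?, ?)",
            (name, title, body),
        )


def search(db, query, limit=10):
    """Return (name, title) pairs for matching pages, best match first."""
    rows = db.execute(
        "SELECT name, title FROM pages WHERE pages MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


db = open_index()
index_page(db, "blog/hello", "Hello", "First post about sushi")
index_page(db, "blog/hello", "Hello", "First post about wikis")  # an edit
print(search(db, "wikis"))
```

Re-indexing a page deletes its old row first, so a search only ever sees the latest version of the content.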

Markup Support

Sushy supports plaintext, HTML and Textile for legacy reasons, and Markdown as its preferred format. ReStructured Text is also supported, but since I don't use it for anything (and find it rather a pain to read, let alone write), I can't make any guarantees as to its reliability. Work is ongoing for supporting Jupyter notebooks (which have no metadata/frontmatter conventions).

All markup formats MUST be preceded by "front matter" handled like RFC2822 headers (see the pages folder for examples and test cases). Sushy uses the file extension to determine a suitable renderer, but that can be overridden if you specify a Content-Type header (see config.hy for the mappings).
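As a rough illustration of how that works, the sketch below parses a page with Python's standard `email` module, which understands RFC2822-style headers. It is plain Python rather than Hy and is not the code Sushy uses; the `EXTENSION_TYPES` mapping, the MIME type names and the `parse_page` helper are assumptions for the example (the real mappings are in config.hy).

```python
from email import message_from_string
from pathlib import Path

# Hypothetical extension-to-type mapping; the real one lives in config.hy.
EXTENSION_TYPES = {
    ".md": "text/x-markdown",
    ".textile": "text/x-textile",
    ".rst": "text/x-rst",
    ".html": "text/html",
    ".txt": "text/plain",
}


def parse_page(filename, text):
    """Split a page into (headers, body, content_type)."""
    message = message_from_string(text)
    # Header names are case-insensitive, so normalise them to lowercase.
    headers = {name.lower(): value for name, value in message.items()}
    body = message.get_payload()
    # An explicit Content-Type header wins over the file extension.
    content_type = headers.get("content-type") or EXTENSION_TYPES.get(
        Path(filename).suffix.lower(), "text/plain"
    )
    return headers, body, content_type


sample = "Title: Hello\nContent-Type: text/x-textile\n\nh1. Hello\n"
headers, body, content_type = parse_page("index.md", sample)
print(headers["title"], content_type)
```

Here the file is named index.md, but the Content-Type header asks for the Textile renderer, so the header takes precedence.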

FAQ

Why?

I've been running a classical, object-oriented Python Wiki (called Yaki) for the better part of a decade. It works, but is comparatively big and has become unwieldy and cumbersome to tweak. So I decided to rewrite it. Again. And again.

And I eventually decided to make it smaller -- my intention is for the core to stop at around 1000 lines of code excluding templates, so this is also an exercise in building tight, readable (and functional) code.

Why Hy?

Because I've been doing a lot of Clojure lately for my other personal projects, and both the LISP syntax and functional programming style are quite natural to me.

I thought long and hard about doing this in Clojure instead (and in fact have been poking at an implementation for almost a year now), but the Java libraries for Markdown and Textile have a bunch of irritating little corner cases and I wanted to make sure all my content would render fine the first time, plus Python has an absolutely fantastic ecosystem that I am deeply into.

Then Hy came along, and I realized I could have my cake and eat it too.

Can this do static sites?

I've used a fair number of static site generators, and they all come up short on a number of things (namely trivially easy updates that don't involve re-generating hundreds of tiny files and thrashing the filesystem) -- which, incidentally, is one of the reasons why Sushy relies on a single SQLite file for temporary data.

But there's no reason why this can't be easily modified to pre-render and save the HTML content after indexing runs -- pull requests to do that are welcome.

Requirements

Thanks to Hy, this should run just as well under Python 2 and Python 3. My target environment is 2.7.8/PyPy, though, so your mileage may vary. Check the requirements.txt file - I've taken pains to make sure dependencies are there for a reason and not just because they're trendy.


Deployment

This repository should be deployable on piku (my featherweight version of Heroku), and also used to be deployable to Dokku -- this was removed in the 2023 refactoring since I don't use it anymore.

As is (for development) the content ships with the code repo. Changing things to work off a separate mount point (or a shared container volume) is trivial.

Configuration

In accordance with the 12 Factor approach, runtime configuration is taken from environment variables:

  • DEBUG - Enable debug logs
  • PROFILER - Enable cProfile statistics (will slow things down appreciably)
  • CONTENT_PATH - the folder your documents live in
  • THEME_PATH - path under which static assets (JS/CSS/etc.) and templates/views are stored
  • BIND_ADDRESS - IP address to bind the development server to
  • PORT - TCP port to bind the server to

These are set in the Makefile (which I use for a variety of purposes).
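For reference, reading these variables could look something like the sketch below. It is plain Python rather than Hy and is not Sushy's actual configuration code; the `load_settings` helper, the accepted truthy strings and every default value are assumptions for illustration.

```python
import os

# Strings treated as "on" for boolean flags (an assumption, not Sushy's rule).
TRUTHY = {"1", "true", "yes", "on"}


def load_settings(environ=os.environ):
    """Read runtime settings from the environment; all defaults are hypothetical."""
    return {
        "debug": environ.get("DEBUG", "false").lower() in TRUTHY,
        "profiler": environ.get("PROFILER", "false").lower() in TRUTHY,
        "content_path": environ.get("CONTENT_PATH", "pages"),
        "theme_path": environ.get("THEME_PATH", "themes/default"),
        "bind_address": environ.get("BIND_ADDRESS", "127.0.0.1"),
        "port": int(environ.get("PORT", "8080")),
    }


settings = load_settings({"DEBUG": "True", "PORT": "9000"})
print(settings["debug"], settings["port"])
```

Passing the environment in as a parameter keeps the function easy to test without touching the real process environment.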


Trying it out

Make sure you have libxml and libxslt headers, as well as the JPEG library - the following is for Ubuntu 14.04:

sudo apt-get install libxml2-dev libxslt1-dev libjpeg-dev
# install dependencies
make deps
# run the indexing daemon (updates upon file changes)
make index-watch &
# run the standalone server (or uwsgi)
make serve