Goblog: a homemade static blog engine in Go

I love writing: debugging examples, experimenting, analyzing. What I don't like is fiddling with formatting, uploading pictures, checking the layout, and so on.

Out of laziness, I started with Blogspot. It offers a sea of templates, all kinds of widgets, instant indexing by Google, various statistics, for some time now even threaded comments, and other bells and whistles. Everything would be fine, but, alas, Blogspot's editor is not designed for programming posts. As soon as you need to insert code or tables, the torment begins. For my other blog, "Boiled eggs, sir!", which is not about programming, Blogspot's features are quite enough.

I also want to keep the original posts in a normal form rather than as crappy HTML. It turned out that my blog materials were scattered around the computer in several copies. First you simply write the text in an editor, only splitting it into paragraphs, without links or pictures, and save the almost finished document. Then the HTML layout begins, during which, besides the markup itself, corrections are made to the text. By then you are too lazy to update the original file, so it stays in its "raw" form, and the only finished version is HTML garbage. But that is not the end of the story. Often, after publication, you notice a typo, go into Blogspot and fix it right on the page. Again, the very first original and its local HTML version remain uncorrected. As a result, the current versions of the posts exist only on Blogspot itself. Of course, you can automate a backup of the entire blog, but again, everything will be in HTML only.
Some time ago, I started using ReST, and life got somewhat easier. ReST lets you write text in a more or less predictable markup (paragraphs, links, code) and then generates HTML from it, which is pasted (again manually) into Blogspot. My attempts to automate post preview through googlecl essentially failed. The old problem also remained: after fixing a typo on the page, the original ReST document became outdated. In addition, ReST did not solve the problem of pictures. They had to be uploaded somewhere in advance to get a full preview.

I cannot explain why, but the idea of dynamic engines like WordPress somehow scared me. Keeping posts in a database seems like overkill to me.

I almost settled on an intermediate solution: DokuWiki, as used, for example, on vak.ru. The engine is dynamic, but page contents are stored in files, and there is versioning. DokuWiki can be used as an engine for a whole site, not just a blog. The design is clumsy, but pictures and arbitrary attachments are supported out of the box.

There was another option that I also almost signed up for: a blog based on TiddlyWiki. TiddlyWiki is my favorite note-taking tool on Windows. I have already written about it. Why only on Windows? Because on the Mac I simply keep notes in plain text files, putting them by topic into Documents or on the desktop, and Spotlight, which indexes everything on the computer, instantly searches by word fragments. So the key feature of TiddlyWiki, instant search, no longer matters much. But I digress.

It turns out there are enthusiasts who have turned TiddlyWiki into a blogging platform, a kind of static-dynamic mutant.

An example of a blog on this engine is Rich Signell's Work Log. Esoteric, in my opinion. For example, it is not clear how to bolt on comments, even something like Disqus. But if anyone is interested, there is even public hosting: tiddlyspot.

And then I got really excited about the idea of purely static engines. The beauty is that you can host such a blog anywhere: you need neither a database nor server-side scripting. And there is more. GitHub or Heroku can not only host static sites but also let you manage the content through git.

For example, there is the static engine Jekyll. In Jekyll, posts are written in Markdown or Textile markup. You can also add arbitrary files to the project, which are copied unchanged when the site is generated. In fact, it is a site engine in which some of the files can also be presented as a blog.

Comments, the main "dynamic" part of a blog, can be implemented through, for example, Disqus. By the way, there are aesthetes of static blogs with the highest degree of Zen: static comments (to me, the phrase itself is an oxymoron). The approach is this: the bottom of the post has a section with previously entered comments rendered statically, and next to it a form for entering a new one. You enter a comment, and it is sent to the blog's author. He approves it (or not), clicks something, and the comment is placed as a file in the static blog project; everything is rebuilt and published. Clearly this is nowhere near real time; it is more like pre-moderated comments where the moderator shows up once a week.

I really value discussion, so this approach is not for me, and I continue to use Disqus. By the way, you can easily export the comment database from Disqus and, for example, turn it into static pages if you ever have to leave.

But back to Jekyll. GitHub Pages supports Jekyll directly (its author is a co-founder of GitHub) and can render Jekyll projects (although you can also render them yourself locally). You push a Jekyll project with git, and the site becomes visible on GitHub Pages.

At Heroku, the idea is a bit different. Heroku hosts Ruby applications, so a static site on Heroku is the pages themselves plus a web server program that serves them. It sounds scary, but in Ruby such a server is very compact, like this:

    require 'bundler/setup'
    require 'sinatra/base'

    class SinatraStaticServer < Sinatra::Base
      # Serve every requested path from the public directory, or answer 404.
      get(/.+/) do
        send_sinatra_file(request.path) {404}
      end

      def send_sinatra_file(path, &missing_file_block)
        file_path = File.join(File.dirname(__FILE__), 'public', path)
        # Paths without a file extension are treated as directories.
        file_path = File.join(file_path, 'index.html') unless file_path =~ /\.[a-z]+$/i
        File.exist?(file_path) ? send_file(file_path) : missing_file_block.call
      end
    end

    run SinatraStaticServer

Oddly enough, hosting on Heroku is generally simpler than on GitHub. Also, on Heroku the blog's git repository stays private, while on GitHub it becomes public, like all other projects. Although keeping a blog project (in fact, a site) closed sounds strange to me: it is all out on the web anyway.

By the way, both GitHub Pages and Heroku let you attach a regular second-level domain, if you have one.

So, I chose Jekyll with hosting on Heroku. Alas, if you take plain Jekyll, you have to design the styles and page layout from scratch yourself. If you are too lazy for that, you can take Octopress.

Octopress is a static blog engine based on Jekyll that comes with a nice HTML5 page layout, a set of convenient plugins, and automated deployment to GitHub Pages and Heroku.

So, I took Octopress, played with it, tried several posts, tested blog rendering locally, and deployed it to Heroku and GitHub Pages. Everything seemed to be in order.

Next came the most tedious part of the Merlaison ballet: moving the posts over from my beloved Blogspot. In practice, I had to do it manually with cut-and-paste. About three weeks of torment, and I had processed my poor three hundred posts.

Everything was ready for launching my new static blog. But here the main disappointment awaited me. The precious Jekyll, written in Ruby, took (attention!) 15 minutes to render my poor three hundred posts (on a Mac Air). And as you know, at the beginning you have to experiment a lot: rebuild, try again, rebuild again. A full rebuild time like that was completely unacceptable.

By trial and error, I found the bottleneck in the Jekyll / Octopress engine: the lion's share of those 15 minutes was spent generating atom.xml, the RSS feed. In the original templates, for some reason, only the last twenty posts were included in this file. But my blog is small, so I had included all the posts there, and generating this file turned into the fifteen-minute build of the entire blog.

All this seemed somewhat absurd to me (with all my love for Ruby). After a little reflection (by that time I more or less understood the insides of Jekyll), and not wanting to mangle Jekyll in attempts to speed it up, I wondered: why not write my own static engine based on a similar idea? After all, it is only work with files, text, and perhaps templates. Besides, Jekyll has no multilingual support of any kind, and I had planned to add it. With my own engine, my hands are completely free, and everything can be made neat and clean.

What to write it in? One could do it like a real man: in C++ / Boost. It would run very fast, but it is boring. I decided on Go. Native and very fast compilation (in effect, I have no separate compilation phase, since it is combined with the launch), convenient work with strings and the file system, simplified memory management (garbage collection), regular expressions, arrays, hashes, a template library, and a library for Markdown. Everything but the last comes "out of the box." There should be no performance problems at all. Go 1 had just been released, and there were now proper distributions for Windows and Mac.

So, three evenings later, my reinvented bicycle was born: Goblog. The whole project is open. The site and its source code live together.

How it works

There are two main places: the project and the assembled blog site. The first holds the source files. During the build, files from the project are copied to the assembled site, preserving the local directory structure. By default, files are copied unchanged, as binary. If a file has the extension html, xml or js, it is run through the Go template system. Files with the markdown extension are additionally processed by the Markdown library before the templates.
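
The extension rules above can be sketched as a small dispatch function. This is only an illustration of the described behaviour, not the code from main.go; the names Action, classify and the three constants are made up for the example.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Action describes how a project file reaches the assembled site.
// The type and its constants are hypothetical names for this sketch.
type Action int

const (
	CopyAsIs         Action = iota // binary copy, no processing
	Template                       // run through the Go template system
	MarkdownTemplate               // Markdown library first, then templates
)

// classify picks an action from the file extension, following the rules
// described in the text: html, xml and js are templated, markdown files get
// the extra Markdown step, everything else is copied unchanged.
func classify(path string) Action {
	switch strings.ToLower(filepath.Ext(path)) {
	case ".html", ".xml", ".js":
		return Template
	case ".markdown":
		return MarkdownTemplate
	default:
		return CopyAsIs
	}
}

func main() {
	for _, p := range []string{"index.html", "atom.xml", "about.markdown", "images/logo.png"} {
		fmt.Println(p, classify(p))
	}
}
```

A generator built this way would walk the project tree and call something like classify for every file before deciding whether to copy or render it.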

Directories:

<root> - The assembled site, as seen at http://demin.ws/ .
<root>/_engine - The project: the sources and the site generator. Technically, this directory is also visible through the web.

Subdirectories and files in the _engine directory:

_includes - Files that can be inserted through the {{include "filename"}} macro.
_layouts - Layout files (see below).
_site - The directories and files of the site proper. This directory is the root of the future site. During the build, files from it are copied to the assembled site. Some are processed by templates.
_posts - Post sources. These files are processed specially. Besides going through templates, they are renamed according to the blog structure, where the date is part of the URL: " /blog/////-/ "
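
The {{include "filename"}} macro from the list above maps naturally onto a custom function of Go's text/template package. Below is a minimal sketch under that assumption; renderWithInclude and demo are hypothetical helpers, and the real generator may implement the macro differently.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"text/template"
)

// renderWithInclude renders src with an "include" template function that
// returns the contents of a file from includesDir (hypothetical helper).
func renderWithInclude(src, includesDir string) (string, error) {
	funcs := template.FuncMap{
		"include": func(name string) (string, error) {
			data, err := os.ReadFile(filepath.Join(includesDir, name))
			return string(data), err
		},
	}
	tmpl, err := template.New("page").Funcs(funcs).Parse(src)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, nil); err != nil {
		return "", err
	}
	return out.String(), nil
}

// demo creates a throwaway includes directory with one file and renders a
// page that pulls it in.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "includes")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	footer := filepath.Join(dir, "footer.html")
	if err := os.WriteFile(footer, []byte("<footer>bye</footer>"), 0o644); err != nil {
		return "", err
	}
	return renderWithInclude(`<body>{{include "footer.html"}}</body>`, dir)
}

func main() {
	out, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because the function returns an error as its second value, a missing include file aborts the rendering instead of silently producing an empty fragment.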

Posts are Markdown files with a special header and name. They are placed in a separate /blog directory with date-based subdirectories. Information about posts is collected in special variables that are made visible to templates. An inverted index for the search is also built from the posts.
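
For readers unfamiliar with the term, an inverted index maps each word to the posts that contain it. The sketch below shows the general technique only; the post identifiers, the tokenization rule and the buildIndex name are assumptions, not details of Goblog's search.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// buildIndex maps every lowercased word to the sorted list of post IDs
// whose text contains it. posts maps a post ID to its plain text.
func buildIndex(posts map[string]string) map[string][]string {
	seen := map[string]map[string]bool{}
	for id, text := range posts {
		// Split on anything that is not a letter or a digit.
		words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
			return !unicode.IsLetter(r) && !unicode.IsDigit(r)
		})
		for _, w := range words {
			if seen[w] == nil {
				seen[w] = map[string]bool{}
			}
			seen[w][id] = true
		}
	}
	index := make(map[string][]string, len(seen))
	for w, ids := range seen {
		for id := range ids {
			index[w] = append(index[w], id)
		}
		sort.Strings(index[w])
	}
	return index
}

func main() {
	posts := map[string]string{
		"goblog": "Static blog engine in Go",
		"jekyll": "Jekyll is a static engine in Ruby",
	}
	index := buildIndex(posts)
	fmt.Println(index["static"]) // prints [goblog jekyll]
	fmt.Println(index["go"])     // prints [goblog]
}
```

Since the splitting relies on unicode.IsLetter, the same function handles Russian and English text, so separate indexes per language can be built simply by passing in different sets of posts.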

Layouts

The idea of layouts is inherited from Jekyll. If a post or page has a layout attribute in its header, the specified layout template (from the _layouts directory) is loaded for rendering; the body of the post or page is inserted at a specific place in the layout (I use the placeholder Page.child), and then everything is rendered together. This allows groups of similar pages (for example, posts) to be styled uniformly. Layouts can be nested.
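
Nested layouts can be pictured as wrapping the rendered body repeatedly, from the innermost layout outwards. The sketch below assumes text/template and the Page.child placeholder mentioned above; the Layout type, the in-memory layouts map and the render function are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Layout is a template plus the name of an optional parent layout.
type Layout struct {
	Parent string // empty when the layout is not nested
	Source string // template text with a {{.Page.child}} placeholder
}

// render wraps body in the named layout, then in that layout's parent,
// and so on until a layout without a parent is reached.
func render(body, layout string, layouts map[string]Layout) (string, error) {
	seen := map[string]bool{}
	for layout != "" {
		if seen[layout] {
			return "", fmt.Errorf("layout cycle at %q", layout)
		}
		seen[layout] = true
		l, ok := layouts[layout]
		if !ok {
			return "", fmt.Errorf("unknown layout %q", layout)
		}
		tmpl, err := template.New(layout).Parse(l.Source)
		if err != nil {
			return "", err
		}
		data := map[string]interface{}{
			"Page": map[string]interface{}{"child": body},
		}
		var out strings.Builder
		if err := tmpl.Execute(&out, data); err != nil {
			return "", err
		}
		body, layout = out.String(), l.Parent
	}
	return body, nil
}

func main() {
	layouts := map[string]Layout{
		"default": {Source: "<html>{{.Page.child}}</html>"},
		"post":    {Parent: "default", Source: "<article>{{.Page.child}}</article>"},
	}
	out, err := render("Hello", "post", layouts)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints <html><article>Hello</article></html>
}
```

An unknown layout name or a cycle between layouts is reported as an error, which fits the engine's habit of stopping the build on any inconsistency.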

Generator

And now the generator itself: main.go.

All I do to build (in the _engine directory) is:

 make

The following output is displayed:

    _engine$ make
    gofmt -w=true -tabs=false -tabwidth=2 main.go
    go run main.go
    Go static blog generator
    Copyright (C) 2012 by Alexander Demin
    Words in russian index: 18452
    Words in english index: 3563
    15.672979s
    Processed 344 posts.

If all is well, the files ready for publishing are generated in the project root (the .. directory relative to _engine). On my Mac Air, the build takes 15 seconds (hello, Jekyll / Octopress, and goodbye). Since everything is under git, it is always clear which files appeared, disappeared or changed.

Then you can check the site locally (see below).

If everything is ready, you can add the modified files (both the sources from _site/ and the assembled files) to the local repository:

git add ../*
git commit -m "New post about ..."

And publish to GitHub Pages:

git push

Almost immediately after the push, the files appear on demin.ws .

There are several additional commands in the Makefile to make life easier.

Local testing

To run the site locally, I temporarily add " 127.0.0.1 demin.ws " to /etc/hosts and start a mini web server. Remember how it looked in Ruby? Small, right? And now the Go version ( server.go ):

    package main

    import "net/http"

    func main() {
        // Serve the assembled site (the parent directory) on port 80.
        panic(http.ListenAndServe(":80", http.FileServer(http.Dir(".."))))
    }

So:

 go run server.go

And you can test the site locally (you may have to run it through sudo to bind to port 80).

In principle, you can leave /etc/hosts alone and use the address localhost:80, but the RSS feed file atom.xml contains absolute links with the domain, so if you need to test RSS, you cannot do without changing the address.

Syntax highlighting

As a Markdown extension, I have a special tag for inserting blocks of code:

    {% codeblock lang:xxx %}
    ...
    {% endcodeblock %}

I inherited this tag from Octopress. Markdown already has syntax for the code:

    ``` xxx
    ...
    ```

where xxx is a language.

But my tag makes it easier to add attributes, for example, to turn on line numbers, tab conversion, etc.

Next, the question of syntax highlighting had to be solved. I tried several libraries that highlight code directly on the page through JavaScript, but each had some small problem, so I decided to highlight the code statically.

The first thing that came to mind was Pygments. Everything is good, but, being written in Python, it works extremely slowly. The full build time grew from 15 seconds to two minutes, most of it spent on highlighting the code. I started thinking about a cache of already highlighted fragments and other nonsense, but after a short search the problem was solved radically.

All it took was a highlighter written in the right language for the task. Two alternatives turned up: Source-highlight and Highlight. Both are written in C++, so they work almost instantly.

For example, someone has compared the performance of Pygments and syntax-highlight.

I liked Highlight more. It supports more languages (for example, GNU Source-highlight does not even support Go). After switching to Highlight, the full build time returned to ~15-16 seconds, and I was satisfied.

The highlighter is called from a regular expression callback that handles the {% codeblock %} tag (the highlight() function).
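
A regular expression callback of this kind could look roughly like the sketch below. The pattern and expandCodeblocks are assumptions, and the highlight function here is only a stand-in that escapes the code and wraps it in a pre element; the real one calls the external Highlight program, whose invocation is not shown in the article.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
)

// codeblockRe matches {% codeblock lang:xxx %} ... {% endcodeblock %}.
// (?s) lets "." span newlines; the body match is non-greedy so that
// several blocks in one post are handled separately.
var codeblockRe = regexp.MustCompile(`(?s)\{% codeblock lang:(\S+) %\}(.*?)\{% endcodeblock %\}`)

// highlight is a stand-in for the real colorizer call: it only escapes the
// code and wraps it in a <pre> element tagged with the language.
func highlight(lang, code string) string {
	return fmt.Sprintf(`<pre class="%s">%s</pre>`, lang, html.EscapeString(code))
}

// expandCodeblocks replaces every codeblock tag in src with highlighted HTML.
func expandCodeblocks(src string) string {
	return codeblockRe.ReplaceAllStringFunc(src, func(match string) string {
		parts := codeblockRe.FindStringSubmatch(match)
		return highlight(parts[1], parts[2])
	})
}

func main() {
	post := "Before {% codeblock lang:go %}x < y{% endcodeblock %} after"
	fmt.Println(expandCodeblocks(post))
}
```

ReplaceAllStringFunc passes only the matched text to the callback, so the sketch runs the pattern a second time inside the callback to get at the language and the code body.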

Editors for Markdown

There are plenty of Markdown editors with preview. I use MarkdownPad on Windows and Marked on the Mac.

Post tags (categories)

I decided not to have tags at all. From my own experience, I realized that I never use tags, either on my blog or on others. Moreover, views on how to categorize information change over time, and sometimes, purely for compatibility with the past, you have to assign tags in which you no longer see the point. What, for example, is the meaning of the c++ tag on my blog? Has anyone ever used it?

But minimalism should not make life harder. On the contrary. Personally, I am constantly looking for something in my old posts. On Blogspot, I just went to the main page, pressed ⌘-F (oh, sorry, CTRL-F) and searched for word fragments in the titles. For this purpose, I had started displaying links to almost all informative posts in the right column.

In the new blog, everything "works" the same way, right on the first page with the catalog of posts. When moving the posts, I changed some of the titles to make them more informative and searchable.

But! None of this matters anymore, since full-featured context search now works on the blog.

Checks

One of the annoying inconveniences of Jekyll is the lack of any checks of anything. I experienced this in full while moving posts from Blogspot: broken links, incorrect dates, forgotten quotes, unspecified languages and other post attributes, and much more. Therefore Goblog checks everything wherever possible: formats, links, semantics, etc. If there is an error anywhere, the build stops. When I added the check_links() function, which checks all local links in all files of the already assembled site, I caught a fair number of dead links.
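
A local link check can be approximated as follows: pull the href values that start with a slash out of a page and verify that each resolves to a file under the site root. This is a simplified sketch, not the actual check_links() code; the regular expression, brokenLinks and demo are assumptions, and a directory link is assumed to resolve to its index.html.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"regexp"
	"strings"
)

// hrefRe captures the path part of links that start with "/".
var hrefRe = regexp.MustCompile(`href="(/[^"#?]*)`)

// brokenLinks returns the local links found in page that do not resolve to
// an existing file under root.
func brokenLinks(root, page string) []string {
	var broken []string
	for _, m := range hrefRe.FindAllStringSubmatch(page, -1) {
		link := m[1]
		if strings.HasPrefix(link, "//") {
			continue // protocol-relative link to another host
		}
		target := filepath.Join(root, filepath.FromSlash(link))
		if strings.HasSuffix(link, "/") {
			target = filepath.Join(target, "index.html")
		}
		if _, err := os.Stat(target); err != nil {
			broken = append(broken, link)
		}
	}
	return broken
}

// demo builds a tiny site in a temporary directory and checks one page.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "site")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	if err := os.MkdirAll(filepath.Join(root, "blog"), 0o755); err != nil {
		return nil, err
	}
	index := filepath.Join(root, "blog", "index.html")
	if err := os.WriteFile(index, []byte("ok"), 0o644); err != nil {
		return nil, err
	}
	page := `<a href="/blog/">posts</a> <a href="/missing.html">dead</a> <a href="http://example.com/">external</a>`
	return brokenLinks(root, page), nil
}

func main() {
	broken, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("broken links:", broken)
}
```

In a generator that stops on errors, a non-empty result from such a check would simply abort the build with the list of dead links.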

Two languages

There was one more problem that, I think, was solved very elegantly: bilingualism. I need the blog and the site in two languages. But I really did not want to hard-code "transparent" support for Russian and English; besides, the versions in different languages can differ radically, and it is not hard for me to maintain their templates independently. As a result, I simply have the notion of a language for each processed file (or post), specified in its header. Goblog knows nothing about languages. It just makes the language of the file or post available to templates, and I decide myself where the files go. For example, everything Russian starts from the root of the site, and everything English has the prefix " /english ".
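
To illustrate the idea that the engine only exposes the language and the template makes the decision, here is a small sketch. The variable name Page.lang, the language value "english" and the renderHomeLink helper are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// homeLink is a template fragment that chooses the site prefix from the
// page language: English pages live under /english, the rest at the root.
const homeLink = `<a href="{{if eq .Page.lang "english"}}/english{{end}}/">home</a>`

// renderHomeLink renders the fragment for a page in the given language.
// The generator itself only passes the language through to the template.
func renderHomeLink(lang string) (string, error) {
	tmpl, err := template.New("link").Parse(homeLink)
	if err != nil {
		return "", err
	}
	data := map[string]interface{}{
		"Page": map[string]interface{}{"lang": lang},
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	for _, lang := range []string{"russian", "english"} {
		link, err := renderHomeLink(lang)
		if err != nil {
			panic(err)
		}
		fmt.Println(lang, link)
	}
}
```

All knowledge about the two languages stays in the template text, so adding a third language would not require touching the Go code.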

For example, the Russian title page and the English title page .

What I'm not happy with

I do not like web programming (JavaScript, CSS, HTML), and even less web design, which I do not know how to do. But here I still had to dig into it (with Octopress it was easier). I took the site of Jekyll's author as a basis and made everything minimalist and simple. Besides, most people still read via RSS and visit the site only to leave a comment. So what matters is that RSS works and that the post page is comfortable to read (which to me means simple, without fancy fonts and strange formatting).

Moral

Do you think I am now going to persuade you to use my engine? Not at all. Although I tried to make the engine as flexible as possible and not tied specifically to my blog, I had to move old posts and their comments, support two languages, etc. As a result, there are pieces of the code tailored specifically to my blog (especially around Disqus links to comments on old posts).

I can only recommend writing the static engine for a personal site or blog yourself. Why? First, the problem is solved in a few evenings. Second, the engine will contain only what you really need (you will be too lazy to program the rest). I am sure all of this could have been done in Ruby, Python, PHP, etc. But it would have been silly to miss the chance to practice a new language on a real task.

■

PS This was written over almost a week, in fragments. In parallel, I was writing the search. Suddenly I realized how incredibly convenient it is to work on a blog with git. You write a post in the background in one branch and add functionality in another. When something is ready, it is merged into master and pushed to GitHub. Beautiful.

Source: https://habr.com/ru/post/142287/
