Saturday, August 25, 2012

Automatic GitHub Pages generation from Sphinx documentation

Sphinx is a widely used documentation tool which gobbles up reStructuredText and other free-form markup formats and outputs great-looking HTML, PDF and other formats. It is well suited to reference manuals and API documentation because it integrates well with source code (especially Python). The libuv book is written using Sphinx, which is why it looks good with minimal effort.

Sphinx uses make to generate the HTML, which is great. The only problem is that deploying the result to GitHub Pages takes multiple commands: switch to the gh-pages branch, pull in the source text, build, then clean up the working copy and switch back to master. This gets boring after the first time, so I automated it, and I think other projects can benefit from it as well. Once you follow the instructions, running:

make gh-pages

will take the latest commit, switch to the gh-pages branch, generate the HTML, push it to GitHub, then clean everything up and switch back to master.

NOTE: You need to commit or revert any working copy modifications before running this.
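If you would rather not rely on memory for this, a small guard can refuse to continue when the working copy is dirty. The following is only a minimal sketch: the function name require_clean_tree is made up for this example, and it assumes it is run from inside the repository.

```shell
# Hypothetical helper: succeed only if the working copy has no modified,
# staged or untracked files.
require_clean_tree() {
    # `git status --porcelain` prints one line per changed or untracked path
    # and nothing at all when the working copy is clean.
    changes=$(git status --porcelain) || return 2   # git failed, e.g. not a repository
    if [ -n "$changes" ]; then
        echo "Working copy is not clean; commit or revert your changes first." >&2
        return 1
    fi
    return 0
}
```

It could be called as require_clean_tree && make gh-pages, so that the target never starts on a dirty working copy.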

One-time commands

These steps only need to be run once, the first time you set up GitHub Pages for the repository. First, set up the branch to have no parents. Let me stress again the importance of making sure all changes are committed! Otherwise they’ll be lost.

$ cd repo
$ git checkout --orphan gh-pages
$ git rm -rf .
$ echo "First commit" > index.html
$ git add .
$ git commit -m "Just to create the branch."
$ git push origin gh-pages

Now the gh-pages branch is set up. We can start generating the actual pages instead of the current index.html.
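Since these commands should run only once, it can be handy to check whether the branch already exists before repeating them. A minimal sketch, assuming a local check is enough; branch_exists is a hypothetical helper name and it does not look at the remote:

```shell
# Hypothetical helper: succeed if a local branch with the given name exists.
branch_exists() {
    # --verify requires an exact ref name; --quiet suppresses all output,
    # so only the exit status reports whether the ref was found.
    git show-ref --verify --quiet "refs/heads/$1"
}
```

For example, branch_exists gh-pages || echo "run the one-time commands first" prints the reminder only when the branch is missing.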

Set the source files

Edit the Sphinx Makefile and add a variable GH_PAGES_SOURCES. This should be a list of the files and directories that contain the documentation sources. Usually this is just source, which holds the Sphinx reST docs, but if you are embedding external code or images, those directories have to be listed as well. The Makefile itself also has to be in the list, because make html is run on the gh-pages branch, which does not otherwise contain it. For the libuv book it is:

GH_PAGES_SOURCES = source code libuv Makefile
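A typo in this list only shows up later, when the checkout of those paths from master fails halfway through the target. As a rough sanity check, something like the sketch below can be run on master first. The helper name missing_sources is invented for this example, and it only looks at the working copy, not at what is actually committed.

```shell
# Hypothetical helper: print every argument that is not an existing
# file or directory in the current working copy.
missing_sources() {
    for path in "$@"; do
        [ -e "$path" ] || echo "$path"
    done
}
```

Running missing_sources source code libuv Makefile from the repository root prints nothing when every entry exists, and otherwise lists the entries that need fixing.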

Add the target

Create a target gh-pages with the following commands (remember to use TABs in Makefiles):

gh-pages:
	git checkout gh-pages
	rm -rf build _sources _static
	git checkout master $(GH_PAGES_SOURCES)
	git reset HEAD
	$(MAKE) html
	mv -fv build/html/* ./
	rm -rf $(GH_PAGES_SOURCES) build
	git add -A
	git commit -m "Generated gh-pages for `git log master -1 --pretty=short --abbrev-commit`" && git push origin gh-pages ; git checkout master

Here is how it goes. The first checkout simply switches branches. Then we remove all the old output to prevent any rebuilding artifacts. Since the gh-pages branch won’t have any of your original data, but only the HTML output, we need to pull the sources from the master branch. Checking out paths from another branch also stages them, so git reset HEAD unstages them again. Then we generate the HTML and move it from the build folder to the top level. Next we remove all the sources and the now empty build folder, and stage all the changes. Finally, the last line commits the gh-pages changes with a message built from the short log entry of the latest commit on master (abbreviated hash, author and subject line) and pushes to GitHub.

The git checkout master command is on the same line, separated by a semicolon, for a reason: if the commit or push were to fail (network error, DAG inconsistency), I still want to be returned to master in my working copy. If you don’t want this to happen, feel free to move it to a new line of its own.
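The difference between && and ; on that last line is easy to try out without touching a repository. In the sketch below the two arguments are placeholder commands standing in for the commit and the push, and the echo lines stand in for the real effects; deploy_chain is a made-up name for illustration only.

```shell
# Stand-in for the last line of the gh-pages target. The two arguments are
# placeholder commands (for example `true` or `false`) playing the roles of
# the commit and the push; nothing here runs git.
deploy_chain() {
    commit_cmd=$1
    push_cmd=$2
    # `&&` runs the next command only if the previous one succeeded;
    # whatever follows `;` runs regardless of what happened before it.
    "$commit_cmd" && "$push_cmd" && echo "pushed" ; echo "back on master"
}
```

deploy_chain true true prints "pushed" followed by "back on master", while deploy_chain true false (a failed push) and deploy_chain false true (a failed commit) print only "back on master".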

Done!

You can now get back to your main task - writing great documentation. Whenever you have an urge to show it to the world, simply run make gh-pages and your latest documentation is served fresh!

5 comments:

  1. Thanks for the procedure. I am new to git, and your simple directive helped me figure out how GitHub handles gh-pages. I employed your Makefile directive as a simple shell script for my two projects, http://vmlaker.github.io/mpipe and http://vmlaker.github.io/pythonwildmagic (the scripts are in doc directories.) Many thanks and best regards,
    -Vel

  2. sphinx-deployment project that I'm working on simplifies the github pages deployment of sphinx, please have a look at https://github.com/teracy-official/sphinx-deployment Enjoy :)

  3. First off, thank you so much for this post. It was clear, concise and worked beautifully.

    The one fly in the ointment for me is that the GitHub-hosted page fails to load the CSS files generated by Sphinx's HTML build. They live in the _static folder, but if one inspects the elements on the hosted page, they don't load. However, if I simply load index.html in my local browser, everything loads fine and the style looks great.

    Any guidance you had on this would be phenomenal. Thank you!

  4. Are the generated CSS file paths in the HTML relative or absolute? They should be '_static/(theme).css'

  5. Awesome guide, thanks!

    The issue with the _static folder not being visible has to do with GitHub Pages running the site through Jekyll, which by default skips files and directories whose names begin with an underscore. Disable it by adding a ".nojekyll" file in the root dir: https://github.com/blog/572-bypassing-jekyll-on-github-pages
