Switching from WordPress to Jekyll
Over the course of a little less than a week, I migrated this blog from WordPress 3.8 to Jekyll. I used WordPress for years, probably since 2005 or 2006 when I switched away from Movable Type. I finally decided I wanted something more lightweight than WordPress, and in particular I wanted a blog that would load faster. Jekyll gives me both of those. The parts of the migration that took the longest were customizing the new layout, which is based on Dbyll, and formatting the code examples in my posts.
I wasn’t happy with how slowly my site loaded when I used WordPress. I had the latest version, and I used plugins like WP Super Cache, but every page still took several seconds before anything even started to appear. I switched from WordPress to Jekyll for my portfolio a while back, but that is a much simpler site. All I really need there is a single page listing my major projects and introducing myself. 3till7.net has always been mainly a blog, and it has several years’ worth of posts, but despite this, Jekyll still does what I need. Jekyll might have been a bad choice if I wanted to be able to post when I’m away from my laptop, but that’s never the case. My posts are usually long enough that I work on them for a while from my laptop; short, quick content goes to my Tumblr, Twitter, or other social networks.
Importing Posts
I first tried to use a plugin to download a copy of my WordPress database from within the WordPress admin dashboard. It seemed to download okay, but when I ran my Jekyll WordPress importer script (see below), it failed partway through with a duplicate key error. Not wanting to fuss with it, I went to phpMyAdmin and exported a copy of my database from there. Again, the Jekyll importer failed, but this time because it said it couldn’t find the `wp_users` table. Strange, since that table was definitely listed in phpMyAdmin. I imported the SQL dump into a local database and, sure enough, there was no `wp_users` table. A few other tables were missing, too. For whatever reason, I had to download another dump of my database from phpMyAdmin, this time with just the last few tables in the list included. I loaded both SQL dumps into a local database and this time the Jekyll importer finished successfully.
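A check along these lines can catch an incomplete dump before the importer runs. It is only a sketch: the helper names are made up for illustration, and the list of required tables is an assumption about what an import of posts, authors, comments, categories, and tags would read, so adjust it for your own setup.

```ruby
# Returns the names of all tables that a SQL dump creates.
def tables_in_dump(sql)
  sql.scan(/^CREATE TABLE (?:IF NOT EXISTS )?`?(\w+)`?/i).flatten
end

# Returns the required tables that the dump does not create.
def missing_tables(sql, required)
  required - tables_in_dump(sql)
end

# Assumed list of tables, using the "wp_" prefix; not taken from the importer's docs.
REQUIRED_TABLES = %w[
  wp_posts wp_users wp_comments
  wp_terms wp_term_taxonomy wp_term_relationships
]

# A tiny stand-in for a real dump file; normally you would File.read the .sql file.
sample_dump = <<~SQL
  CREATE TABLE IF NOT EXISTS `wp_posts` (
    `ID` bigint(20) unsigned NOT NULL
  );
  CREATE TABLE `wp_comments` (
    `comment_ID` bigint(20) unsigned NOT NULL
  );
SQL

puts "Missing tables: #{missing_tables(sample_dump, REQUIRED_TABLES).join(', ')}"
```

Run against a dump like my first phpMyAdmin export, this would have listed `wp_users` among the missing tables right away.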
Jekyll WordPress importer
```ruby
require "jekyll-import"

# I replaced the dbname, user, and password in my local copy. You can do
# that or pass them as command-line parameters to this script.
JekyllImport::Importers::WordPress.run({
  "dbname"         => ARGV.shift,
  "user"           => ARGV.shift,
  "password"       => ARGV.shift,
  "host"           => "localhost",
  "prefix"         => "wp_",
  "clean_entities" => true,
  "comments"       => true,
  "categories"     => true,
  "tags"           => true,
  "more_excerpt"   => true,
  "more_anchor"    => true,
  "status"         => ["publish"]
})
```
To get `jekyll serve` to complete successfully, I had to go through the imported posts and adjust formatting on some of them. I have a lot of code examples in my blog and those had previously been styled inside `<pre>` tags. That was one of the reasons I wanted to move to Jekyll: writing posts wasn’t fun in the WordPress admin interface. It’s a nice interface, but I prefer to be in Sublime Text, like I am most of the day anyway for work and side projects. With Jekyll, not only am I writing inside my favorite editor, but I get to write in Markdown and skip a lot of HTML cruft.
Anyway, `jekyll serve` did not like posts that had a bunch of code within `<pre>` and `<code>` tags instead of indented or wrapped in backticks. So I fixed the most egregious formatting errors enough to get Jekyll to generate my site.
Disqus Comments
Then I wanted to import my comments. Comments were previously stored in my WordPress database, but since Jekyll produces static HTML pages, you need to use some external commenting system. I chose Disqus and used the WordPress admin dashboard to export all my WordPress comments into a big XML file. Disqus imported that file pretty quickly and all my comments were in. Despite adding their JavaScript to my post template in Jekyll, I wasn’t seeing comments for all my entries. This turned out to be caused by a couple of problems:
Disqus matches URLs
Within the JavaScript I added to make Disqus comments appear, you can provide a `disqus_url` variable that tells Disqus which comments go with the post. My WordPress URLs looked like `http://www.3till7.net/2014/01/13/post-title/` while my Jekyll URLs were `http://www.3till7.net/2014/01/13/post-title/index.html`. I created a custom filter to strip out the trailing `index.html`:
```ruby
module Jekyll
  module Filters
    def disqus_url full_url
      full_url.sub(/index\.html$/, '')
    end
  end
end
```
Then I used that filter in my post.html template:
```html
<div class="comments">
  <div id="disqus_thread"></div>
  <script type="text/javascript">
    /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
    var disqus_shortname = 'mySiteShortname';
    var disqus_url = 'http://www.3till7.net{{ page.url | disqus_url }}';

    /* * * DON'T EDIT BELOW THIS LINE * * */
    (function() {
      var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
      dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
      (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
    })();
  </script>
  <noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
  <a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
</div>
```
Jekyll had different dates for some posts
I used `permalink: /:year/:month/:day/:title/index.html` for my permalinks in Jekyll so I could preserve the same URLs as what I had in my WordPress blog, so I was surprised when some post URLs weren’t matching up. It turned out that when I ran the Jekyll importer, an additional property was set on each imported post:
date: 2012-09-16 22:36:49.000000000 -04:00
This was a date in addition to the date that was part of the post file name. I noticed that when the date property was at hour 19 or later, the URL generated for that post would be on the next day. So the file name might be 2012-09-16-my-post.markdown, and the date might be 2012-09-16 22:36 like above, but the URL in my generated Jekyll site would be /2012/09/17/my-post. I ended up writing a script (see below) to go through all my Jekyll posts and ensure that each one mapped to an existing URL on my WordPress site. I used this script before I took down the WordPress site and deployed Jekyll, of course. Once I had identified all posts whose permalinks were off by one day, I manually went through those posts and changed their `date` property so that its hour, minute, and second were 00:00:00. I left the file names alone, since they were correct all along.
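One plausible cause, which is an assumption and not something confirmed above, is a time zone conversion: if the timestamp is converted to UTC before the permalink is built, a late-evening local time lands on the next calendar day. With a -05:00 offset, 19:00 local is exactly midnight UTC, which would line up with the hour-19 threshold. The sketch below shows that shift using the example timestamp, plus a hypothetical helper that applies the same midnight fix to a front matter line. Both function names are made up for illustration.

```ruby
require "time"
require "date"

# Returns the calendar date of a timestamp after converting it to UTC.
def utc_date(timestamp)
  Time.parse(timestamp).utc.to_date
end

# Rewrites a front matter "date:" line so the time of day is midnight,
# leaving the day, fractional seconds, and UTC offset untouched.
def reset_time_of_day(date_line)
  date_line.sub(/\A(date: \d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2}/, '\1 00:00:00')
end

imported = "2012-09-16 22:36:49.000000000 -04:00"
puts utc_date(imported)                                # prints 2012-09-17
puts utc_date("2012-09-16 00:00:00.000000000 -04:00")  # prints 2012-09-16
puts reset_time_of_day("date: #{imported}")
```

Setting the time to midnight keeps the UTC date on the same day for any negative offset, which is consistent with the manual fix working.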
Code Formatting
After I got comments showing up and permalinks matching, I went through my posts and adjusted their formatting further. I had to strip out WordPress shortcodes like `[caption]` because they just show up as plain text in Jekyll. I also had to change how code was presented, which accounted for the majority of the formatting issues I had to fix. In WordPress, I would use `<pre>` tags with the `lang` attribute specifying which syntax highlighting rules to use. In Jekyll, you use the `highlight` block to add syntax highlighting to your code. I discovered you still need to indent your code by four spaces, even within the highlight block. I had the worst time trying to convert some posts with sample PHP code before I figured this out.
Having to indent code within the highlight block caused the code to appear indented by four spaces in the final generated HTML page. To get around this, I added `highlight_indent_fix.rb` to `_plugins/`:
```ruby
# See https://gist.github.com/zerowidth/5334029
Jekyll::Tags::HighlightBlock.module_eval do
  def render context
    code = strip_leading_space_from super
    if context.registers[:site].pygments
      render_pygments context, code
    else
      render_codehighlighter context, code
    end
  end

  def strip_leading_space_from code
    code.split("\n").map do |line|
      # Use sub rather than sub!: sub! returns nil when nothing is replaced,
      # which would blank out any line that isn't indented by four spaces.
      line.sub(/^\s{4}/, '')
    end.join("\n")
  end
end
```
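Most of the cleanup described in this section could be scripted. Below is a rough sketch, not the process actually used for this migration, that strips `[caption]` shortcodes and turns `<pre lang="...">` blocks into `highlight` blocks with the four-space indent. The regular expressions assume a particular shortcode and attribute shape, and the function names are hypothetical.

```ruby
# Assumed shape: [caption ...attributes...]content[/caption]
CAPTION_SHORTCODE = %r{\[caption[^\]]*\](.*?)\[/caption\]}m

# Assumed shape: <pre lang="php">code</pre> (single or double quotes)
PRE_WITH_LANG = %r{<pre\s+lang=["']([^"']+)["']\s*>(.*?)</pre>}m

# Removes the shortcode wrapper but keeps whatever it wrapped.
def strip_caption_shortcodes(text)
  text.gsub(CAPTION_SHORTCODE) { Regexp.last_match(1).strip }
end

# Replaces each <pre lang="..."> block with a highlight block whose
# non-blank lines are indented by four spaces.
def pre_to_highlight(text)
  text.gsub(PRE_WITH_LANG) do
    lang = Regexp.last_match(1)
    code = Regexp.last_match(2).sub(/\A\n+/, "").rstrip
    indented = code.lines.map do |line|
      line.strip.empty? ? "" : "    #{line.chomp}"
    end.join("\n")
    "{% highlight #{lang} %}\n#{indented}\n{% endhighlight %}"
  end
end

post = <<~'POST'
  [caption id="attachment_1" width="300"]<img src="a.png" /> My photo[/caption]
  <pre lang="php">
  <?php
    echo 1;
  ?>
  </pre>
POST

puts pre_to_highlight(strip_caption_shortcodes(post))
```

This does not decode HTML entities such as `&lt;` inside the old blocks, so converted posts would still need a read-through.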
Archives
I wanted to have a page listing all my tags and categories, and then separate pages for each tag and each category to list the posts in that tag/category. I found a category archive generator plugin and used that as well as adapting a second copy for tags. I found a tag cloud tag that I used on my archive page, too.
Search
Search was something provided in WordPress that I wanted to have with Jekyll, too. Since it’s all static files, I went with a JavaScript solution. Jekyll + lunr.js is what powers the search bar right now. It generates a gigantic JSON file of your site’s content that it searches through.
Generating Assets
For styling my site, I didn’t want to write plain CSS or JavaScript, so I set up LESS and CoffeeScript plugins. For LESS, I added `gem 'therubyracer'` and `gem 'jekyll-less'` to my Gemfile, ran `bundle`, and added bundler.rb to my `_plugins/` directory.
Deploying
For deployment, I started to set up a Git post-receive hook on my server, but that means the server has to have Ruby, Pygments, Bundler, and all the gems I use installed. There’s no sense in that since I generate a copy of my site every time I test it locally. I followed Nathan Grigg’s advice about using rsync and set up the following deploy.sh script:
```bash
#!/usr/bin/env bash
# See http://nathangrigg.net/2012/04/rsyncing-jekyll/
rsync --compress --itemize-changes --recursive --checksum --delete _site/ myuser@mydomain.com:my_public_html_dir/
```
I made sure `~/.ssh/authorized_keys` on my server had a copy of my public key. Now any time I want to deploy, I run `jekyll serve` locally to get an up-to-date copy of my `_site` folder. Then I run `./deploy.sh` and my site gets updated, with only modified files getting uploaded. The script prints out a list of which files it uploads.
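Because `--delete` removes anything on the server that isn’t in `_site`, it can be worth previewing a deploy first. Here is a small Ruby sketch, assuming you’d rather drive rsync from Ruby than from bash. `rsync_command` is a hypothetical helper, and `--dry-run` is rsync’s standard flag for listing changes without making them.

```ruby
# Builds the same rsync invocation as deploy.sh, as an argument list.
# Pass dry_run: true to add --dry-run so nothing is uploaded or deleted.
def rsync_command(source, destination, dry_run: false)
  args = %w[rsync --compress --itemize-changes --recursive --checksum --delete]
  args << "--dry-run" if dry_run
  args + [source, destination]
end

# Print the preview command instead of running it.
preview = rsync_command("_site/", "myuser@mydomain.com:my_public_html_dir/", dry_run: true)
puts preview.join(" ")
```

To actually run it, the argument list can be handed to `system(*preview)`; dropping `dry_run: true` gives the real deploy.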
WordPress Uploads
The easiest way to add images and other media to your WordPress posts is by using the media uploader built into the WordPress dashboard. I wanted to preserve all the files I had linked from my posts when I transitioned to Jekyll. I downloaded my wp-content/uploads directory and put it in my Jekyll site as assets/uploads. I did a search and replace across all my posts to update links. Since I had pruned down which posts I was migrating from my WordPress blog to the new Jekyll blog, though, a lot of files in the uploads directory were no longer used. I wrote the following script to delete files from assets/uploads that were not mentioned in any post:
```ruby
class UnusedUploadFinder
  attr_reader :root_dir, :upload_dir, :posts_dir, :deleted_file_count,
              :deleted_dir_count

  def initialize
    @root_dir = File.expand_path(File.dirname(__FILE__))
    @upload_dir = File.join(@root_dir, 'assets', 'uploads')
    @posts_dir = File.join(@root_dir, '_posts')
    @deleted_file_count = 0
    @deleted_dir_count = 0
  end

  # Plain substring match, so characters like "." or "(" in a file name
  # are not interpreted as regular expression syntax.
  def file_contains? path, str
    File.read(path).include?(str)
  end

  # Returns the path of the first post that mentions str, or false.
  def post_contains? str
    Dir.glob(@posts_dir + '/**/*.markdown') do |post_path|
      return post_path if file_contains?(post_path, str)
    end
    false
  end

  def delete_unused_files
    Dir.glob(@upload_dir + '/**/*') do |item|
      if File.file?(item)
        # Path relative to the upload directory (keeps the leading slash)
        file_name = item.sub(/\A#{Regexp.escape(@upload_dir)}/, '')
        if post_path = post_contains?(file_name)
          post_name = File.basename(post_path)
          puts "Post #{post_name} references #{file_name}"
        else
          puts "No post references #{file_name}, deleting it"
          File.delete item
          @deleted_file_count += 1
        end
      end
    end
  end

  def delete_empty_directories
    Dir.glob(@upload_dir + '/**/*') do |item|
      if File.directory?(item) && Dir[item + '/*'].empty?
        puts "Deleting empty directory #{item}"
        Dir.delete item
        @deleted_dir_count += 1
      end
    end
  end

  def process
    delete_unused_files
    delete_empty_directories
  end
end

finder = UnusedUploadFinder.new
finder.process
puts "----------------------"
puts "Deleted #{finder.deleted_file_count} files"
puts "Deleted #{finder.deleted_dir_count} directories"
```
URL Mapper
This URL mapper script can be used to check if your new Jekyll site will have the same URLs as the existing site you’re replacing. It expects you to have a sitemap.xml file in your Jekyll site whose entries point to your existing domain. That is, if you’re developing your Jekyll site on localhost:4000, the sitemap.xml file should not have http://localhost:4000 entries, but instead entries at your existing site URL. You can use Michael Levin’s sitemap generator.
```ruby
require 'net/http'
require 'rexml/document'

class URLMapper
  attr_reader :sitemap_path, :jekyll_urls, :missing_urls, :total_urls

  def initialize
    @jekyll_urls = []
    @missing_urls = []
    @total_urls = 0
    @sitemap_path = File.join(File.expand_path(File.dirname(__FILE__)),
                              '_site', 'sitemap.xml')
  end

  def process
    extract_jekyll_urls
    check_jekyll_urls
    print_summary
  end

  private

  def extract_jekyll_urls
    xml_str = File.read(@sitemap_path)
    doc = REXML::Document.new(xml_str)
    doc.elements.each('urlset/url/loc') do |element|
      # Turn Jekyll-style permalinks into the permalinks used on WordPress
      @jekyll_urls << element.text.sub(/index\.html$/, '')
    end
  end

  def check_jekyll_urls
    @jekyll_urls.each do |url|
      print url
      uri = URI.parse(url)
      print '... '
      request = Net::HTTP.new(uri.host, uri.port)
      response = request.request_head(uri.path)
      if response.code == '200'
        puts "valid!"
      else
        puts "error #{response.code}!"
        @missing_urls << url
      end
      @total_urls += 1
    end
  end

  def print_summary
    puts '---------------------------'
    count = @missing_urls.size
    plural = count == 1 ? '' : 's'
    puts "Found #{count} unmapped URL#{plural} out of #{@total_urls}:"
    @missing_urls.each do |url|
      puts "\t#{url}"
    end
  end
end

URLMapper.new.process
```