No More Bundle Exec

September 6th 2012 | Tags: programming ruby

Bundler is pretty darn good. Installing all your gems globally sucks. bundle install --path does a great job of fixing that but it means you need to bundle exec any shell commands you want to run, which again sucks. There are lots of attempts to fix this, but they're all fairly convoluted.

I'm a fan of simpler solutions wherever possible. I use zsh as my shell, which has a handler you can hook into if the command you're trying to run is not found. It's a simple matter to hook that into a custom shell script from your ~/.zshrc:

function command_not_found_handler() {
    ~/bin/command-not-found "$@"
}

I know bash supports this kind of handler (Ubuntu uses it to provide command helpers for not-yet-installed programs) but I don't know the exact details. Alas, my favorite shell ever, fish, only provides the executable to its corresponding helper, so while it can suggest an alternate command, it can't auto-correct it.

My script happens to be in Ruby, but it could just as easily be a standard shell script as all I'm doing is some file existence tests:

#!/usr/bin/env ruby

# ARGV is the entire command we wanted to run, but we
# really only care about the actual executable for fallbacks
command = ARGV.first

def run(cmd)
  $stderr.puts "Running #{cmd.inspect} instead"
  system(cmd)
end

case
when File.exist?("./.bundle/config") && File.exist?("./bin/#{command}")
  run("bundle exec #{ARGV.join(' ')}")

else
  exit 127
end

Now, as long as you're being sure to bundle install --binstubs it should Just Work. And because it only functions if you're in a directory that's been bundled, you don't run into the security risks that you would by trying to get ./bin added to your $PATH directly.

Lastly, using a case statement instead of an if is a bit redundant in the simple example above, but I've actually got a few more filters for things like isolate and git. Don't forget to quote anything that might need space literals:

# Paste git repo url to clone it
when command =~ /^git(@|:\/\/).*\.git$/
  run("git clone #{command.inspect}")

# paste compressed url to download+extract it
when command =~ /^(?:ftp|https?):\/\/.+\.t(?:ar\.)?gz$/
  run("curl #{command.inspect} | tar xzv")

when File.exist?("./tmp/isolate/ruby-1.8/bin/#{command}")
  run("rake isolate:sh['#{ARGV.join(' ')}']")
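On the quoting point, here is a minimal sketch of one way to build the command string, assuming Ruby's standard Shellwords module. The bundle_exec_command helper name is made up for illustration and is not part of the script above:

```ruby
require 'shellwords'

# Hypothetical helper: builds the fallback command with each argument
# shell-escaped, so arguments containing spaces survive the trip
# through system().
def bundle_exec_command(argv)
  "bundle exec #{Shellwords.join(argv)}"
end

puts bundle_exec_command(["rspec", "spec/my file_spec.rb"])
```

The plain ARGV.join(' ') version would hand the shell three words here instead of two.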

Joining Git Repositories

May 15th 2012 | Tags: git

At work, we provide an API for our app and maintain web-based documentation for said API. We originally had the documentation in a separate git repo, but as it makes much more sense to maintain the docs directly alongside the code it documents we wanted to merge the two repositories. This was done in two steps.

Moving files

First, we need to prep the docs repo such that the content is in a reasonable location, rather than the root directory. This is done fairly easily with git filter-branch (zsh):

% mkdir -p doc/api_docs
% git checkout -b for_transplant # work in a branch for safety
% for file in <files/dirs to move>;
      do git filter-branch -f --tree-filter \
        "mkdir -p doc/api_docs; test -e $file && mv $file doc/api_docs || echo skip" \
        HEAD;
      done

This goes through our commit history, and runs through each commit moving old content into a subdirectory. It's slow in that it does a full history pass for each file, but I didn't care to figure out how to move everything except the docs directory.

Merging the repos

First, for clarity, let's make a new branch based off of the original commit in our destination repo:

% git log --oneline | tail -n 1
e7c9feb Initial commit
% git checkout e7c9feb
% git checkout -b doc_import

Next, prepare the main repo for the incoming transplant by cloning a bare copy (git refuses to pull an external repo into a normal checkout).

% git clone --bare git@github.com:example/myapp.git myapp-bare

We can then pull in the commit objects from the other repository:

% cd myapp-bare
% git fetch -f ../api_doc_site for_transplant:api_docs

The -f tells git to ignore the different initial commits, and then we explicitly specify the remote and local branches to move commits to. You might need to resolve a merge conflict at this point, if for example both repositories committed a different .gitignore in their initial commit.

We can now go back to our regular repo and pull those branches in:

% cd ../myapp
% git remote add bare ../myapp-bare
% git pull bare api_docs

Finally, merging that branch into master gets things all up-to-date, and we can start unified work while maintaining the full original history of the documentation.


Parsing JSON in SQL

November 19th 2011 | Tags: ruby rails sql

The Problem: You have a database column with some data serialized as JSON in it that you'd like to pull out into its own column to index it.

The Solution: Run a data migration to pull the value out. Table has 5 million rows and you don't want to round trip all that data through ActiveRecord? Just parse the json directly with some SQL:

def json(key, field='params')
  key_json = "\"#{key}\":"

  # key start/end locations, including ""
  k_a = "LOCATE('#{key_json}', #{field})"
  k_z = "LOCATE('\"', #{field}, #{k_a}+1)" # this is terminating "

  # is there a space after colons?
  spad = "IF(LOCATE('\": ', #{field}), 1, 0)"

  # is value a string?
  val_string = "LOCATE(CONCAT('#{key_json}', IF(#{spad},' ',''), '\"'), #{field}, #{k_a})"
  qpad = "IF(#{val_string}, 1, 0)"

  # value start/end locations, excluding "" if present
  v_a = "(#{k_z}+1 + 1 + #{spad} + #{qpad})" # 1 for colon, spad for optional space, qpad for possible quote

  end_if_string = "LOCATE('\"', #{field}, #{v_a})"
  end_if_not_string = "IF(LOCATE(',', #{field}, #{v_a}), LOCATE(',', #{field}, #{v_a}), LOCATE('}', #{field}, #{v_a}))"

  v_z = "IF(#{val_string}, #{end_if_string}, #{end_if_not_string})"

  value_string = "SUBSTRING(#{field} FROM #{v_a} FOR (#{v_z} - #{v_a}))"
  "IF(#{k_a}, #{value_string}, NULL)"
end

up do
  execute "
    UPDATE model_table
    SET status = #{json('status')}
  "
end

The generated SQL looks pretty gnarly, but MySQL ran through it stupidly fast. I shudder to think how long it'd take ActiveRecord to load and update each record individually.
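To sanity-check what the SQL should pull out of a given row, it can help to mirror the same string arithmetic in plain Ruby and try it on a few sample values first. This is only a sketch: extract_json_value is a hypothetical helper, and it checks for the optional space right after the colon rather than anywhere in the field, so it is not an exact port of the LOCATE logic:

```ruby
# Hypothetical pure-Ruby mirror of the SQL extraction: find "key":,
# skip an optional space and opening quote, then read up to the
# closing quote (string values) or the next , or } (bare values).
def extract_json_value(text, key)
  key_json = "\"#{key}\":"
  k_a = text.index(key_json)
  return nil if k_a.nil?

  v_a = k_a + key_json.length
  v_a += 1 if text[v_a] == ' '        # optional space after the colon

  if text[v_a] == '"'                 # string value: stop at closing quote
    v_a += 1
    v_z = text.index('"', v_a)
  else                                # bare value: stop at , or }
    v_z = text.index(',', v_a) || text.index('}', v_a)
  end

  v_z ? text[v_a...v_z] : nil
end

p extract_json_value('{"status":"active","id":5}', 'status')
p extract_json_value('{"id": 5, "status": "ok"}', 'id')
```

Like the SQL, this is naive string matching: escaped quotes inside values or nested objects will trip it up.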


Isolating Rails

January 19th 2011 | Tags: rails ruby

Rails 3 is now very friendly with regards to dropping Bundler support, only loading it if it's installed and a Gemfile exists. Since Isolate is so awesome, I thought I'd just drop a quick script in here to convert an existing Rails app to use Isolate instead of Bundler.

#!/usr/bin/env ruby

require 'fileutils'

File.open("Isolate", 'w') do |isolate|
  File.readlines("Gemfile").each do |line|
    next if line =~ /^\s*#/   # skip comment lines
    next if line =~ /^source/
    next if line =~ /^\s*$/   # skip blank lines

    line.sub!(/, :require.*(,|$)/, '\1')
    line.sub!(/^([ \t#]*)group/, '\1environment')

    if line =~ /:git/
      line = "# Don't use git, build it as a gem\n# " + line
    end

    isolate.puts line
  end
end

File.open("config/boot.rb", 'a') do |boot|
  boot.puts
  boot.puts("require 'isolate/now'")
end

FileUtils.rm('Gemfile')

This should convert an existing Gemfile to an Isolate file, remove the Gemfile (so that rails won't try to load it), and update the app to load Isolate appropriately.

I'm basically only guessing that the group/environment setup is correct, so if anyone has any corrections to this let me know and I'll update it.
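For a feel of what the two substitutions do, here is a small sketch that applies them to a couple of sample Gemfile lines. The convert_line helper and the sample lines are made up for illustration, and whether the resulting environment block is exactly what Isolate expects is still the same guess as above:

```ruby
# Hypothetical helper wrapping the two substitutions from the script.
def convert_line(line)
  line = line.dup
  line.sub!(/, :require.*(,|$)/, '\1')          # drop Bundler's :require option
  line.sub!(/^([ \t#]*)group/, '\1environment') # group -> environment
  line
end

puts convert_line("gem 'haml', :require => 'haml'\n")
puts convert_line("group :test do\n")
```

Note that the :require pattern is greedy, so any options that follow :require on the same line are dropped as well.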


Migrating Disqus

December 22nd 2009 | Tags: blog

In changing this blog over to jekyll, my urls changed (there's now a trailing slash). It's easy enough to tell Google about it (just set up redirects), but there's no easy way to tell Disqus about it so that my comments migrate over.

The good news is that it's pretty straightforward using their API. The only bad news is that I can't delete the new threads auto-generated for the new urls, so I'm just moving them out of the way.

I'm using the HTTParty gem to wrap API access, like so:

require 'rubygems'
require 'httparty'
require 'json'

class Disqus
  include HTTParty
  base_uri 'disqus.com'
  format :json

  def initialize(key, version='1.1')
    @key = key
    @version = version
  end

  def auth
    {:user_api_key => @key, :api_version => @version}
  end

  def get(action, opts={})
    result = self.class.get("/api/#{action}/", :query => opts.merge(auth))
    result["message"]
  end

  def post(action, opts={})
    result = self.class.post("/api/#{action}/", :body => opts.merge(auth).to_params)
    p result
    result["message"]
  end
end

Do note that I'm adding trailing slashes to the api calls to avoid a redirect. Doesn't matter for the GET, but the redirect on POST was causing issues.

With this in hand, I'm grabbing my forum, looping through the threads, and renaming any that have comments (a whopping 3 of them).

key = "secret" # get yours at http://disqus.com/api/get_my_key/
disqus = Disqus.new(key)

forum = disqus.get(:get_forum_list).first # I just have one
forum_api_key = disqus.get(:get_forum_api_key, :forum_id => forum["id"])

start = 0 # manual pagination, eww
loop do
  threads = disqus.get(:get_thread_list, :forum_id => forum["id"], :start => start)
  break if threads.empty?

  threads.each do |thread|
    posts = disqus.get(:get_thread_posts, :thread_id => thread["id"])
    next if posts.empty? # only threads that have comments need moving

    target_url = thread["url"]+"/"

    # There's another thread in the way...
    if other_thread = disqus.get(
      :get_thread_by_url, 
      :forum_api_key => forum_api_key,
      :url => target_url
    )
      # free up the url we want to use
      disqus.post(
        :update_thread,
        :forum_api_key => forum_api_key,
        :thread_id => other_thread["id"],
        :url => target_url + 'old'
      )
    end

    # update thread url
    disqus.post(
      :update_thread,
      :forum_api_key => forum_api_key,
      :thread_id => thread["id"],
      :url => target_url
    )
  end

  start += 25
end

Et voilà, old comments are in the right place now.


Jekyll: Custom Liquid Tags

December 4th 2009 | Tags: blog

The base install of Jekyll at the moment doesn't let you run any arbitrary ruby code. This is so that they can use it for github pages and not need to worry about making a super-secure sandbox just to generate some HTML.

Unfortunately, that means we're out of luck for creating custom liquid filters. The most annoying deficiency for me is tags. The way the default liquid map filter works isn't friendly with @site.tags, so to generate my Tags page I had to do some really crazy stuff with capture:

<div id="articles">
  <table>
    {% for tag_ in @site.tags %}
      {% capture tag %}{{ tag_ | first }}{% endcapture %}
      <tr><th>{{ tag }}</th>
          <th><a name="{{ tag }}" class="anchor">&nbsp;</a></th></tr>
      {% for post in @site.posts %}
        {% if post.tags contains tag %}
          <tr><td>{{ post.date | date: '%b %e, %Y' }}</td>
              <td><a href="{{ post.url }}">{{ post.title }}</a></td></tr>
        {% endif %}
      {% endfor %}
    {% endfor %}
  </table>
</div>

Fortunately, it wasn't too hard to make a fork, and in my fork I added a super simple code loading option. Now, I can add a quick extension in _lib/filters.rb like so:

module Jekyll
  module Filters
    def keys(input)
      input.keys
    end

    def tagged(input, tag)
      input.select{|post| post.tags.include? tag}
    end
  end
end

Now tags.html looks like this:

<div id="articles">
  <table>
    {% for tag in @site.tags|keys|sort %}
      <tr><th>{{ tag }}</th>
          <th><a name="{{ tag }}" class="anchor">&nbsp;</a></th></tr>
      {% for post in @site.posts|tagged:tag %}
        <tr><td>{{ post.date | date: '%b %e, %Y' }}</td>
            <td><a href="{{ post.url }}">{{ post.title }}</a></td></tr>
      {% endfor %}
    {% endfor %}
  </table>
</div>

There's a bit of trickery on lines 3 and 6 that liquid doesn't document very well: in the second half of the for block you can chain filters on the collection you're iterating over. The short format used is something along the lines of collection|filter:arg,arg,arg|filter...

Similarly, I had some ugly code in my regular archive page to group by year and put headings in:

{% for post in site.posts %}
  {% unless post.next %}
    <tr><th>{{ post.date | date: '%Y' }}</th><th>&nbsp;</th></tr>
  {% else %}
    {% capture year %}{{ post.date | date: '%Y' }}{% endcapture %}
    {% capture nyear %}{{ post.next.date | date: '%Y' }}{% endcapture %}
    {% if year != nyear %}
      <tr><th>{{ post.date | date: '%Y' }}</th><th>&nbsp;</th></tr>
    {% endif %}
  {% endunless %}

  ...
{% endfor %}

Now, a few extra liquid filters later, it looks like this:

{% for post in site.posts %}
  {% if post|last_of_year? %}
    <tr><th>{{ post.date | date: '%Y' }}</th><th>&nbsp;</th></tr>
  {% endif %}

  ...
{% endfor %}
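The last_of_year? filter itself isn't shown here, so the following is only one plausible shape for it, mirroring the unless/else logic of the older template. The Post struct is a hypothetical stand-in for a Jekyll post so the sketch runs on its own:

```ruby
require 'date'

# Hypothetical stand-in for a Jekyll post: a date plus a link to the next post.
Post = Struct.new(:date, :next)

module Jekyll
  module Filters
    # True when there is no following post, or the following post
    # falls in a different year.
    def last_of_year?(post)
      post.next.nil? || post.date.year != post.next.date.year
    end
  end
end

newer  = Post.new(Date.new(2010, 1, 5), nil)
older  = Post.new(Date.new(2009, 12, 22), newer)
oldest = Post.new(Date.new(2009, 12, 4), older)

helper = Object.new.extend(Jekyll::Filters)
[newer, older, oldest].each do |post|
  puts "#{post.date}: #{helper.last_of_year?(post)}"
end
```

Dropped into _lib/filters.rb (minus the Post struct and the sample data), this would sit alongside the keys and tagged filters above.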

If you want to get easy extensions in your own project, rather than maintaining Yet Another Jekyll Fork, please vote up my merge request on github.


As a side note, blogging about liquid is a pain. The least painful approach I've found so far is to use liquid to output the leading open brace for all tags. It looks like garbage in my text editor, but it gets the job done:

{{'{'}}{ post.title }}

Blogging about blogging about liquid (as above) I leave as an exercise to the reader.