Building an Object Graph in Rails

June 4th 2015 | Tags: ruby rails

I needed to do some object cleanup in our Rails app the other day and purge some malformed objects, so I put together a quick script using some ActiveRecord reflection to walk the object chain.

# Squelch SQL logs, if you're running from rails console
ActiveRecord::Base.logger.level = 1

# Output formatters. Trailing ':' on both, plus the staggered indent
# of 4n and 4n+2 makes the output valid YAML, if automated analysis
# is called for.
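def puts_node(node, indent)
  # Label format (Class#id) is a reconstruction; adjust to taste
  puts "    "*indent + node.class.name + "#" + node.id.to_s + ":"
end

def puts_assoc(assoc, indent)
  puts "    "*indent + "  " + assoc.to_s + ":"
end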

def puts_tree(node, seen=[], indent=0)
  puts_node(node, indent)

  unless seen.include? node
    # seen maintains a list of nodes to avoid mutual recursion
    seen << node
    node.class.reflections.keys.each do |assoc|
      # To see all locations an object is referenced, get rid of the
      # "- seen" here. I only cared about which objects were present
      # anywhere in the tree, so this was fine.
      associated = Array(node.send(assoc)) - seen
      next if associated.empty?

      # Print an entry for the association, then recurse
      puts_assoc(assoc, indent)
      associated.each do |subnode|
        puts_tree(subnode, seen, indent+1)
      end
    end
  end

  # Also outputs a footer listing all seen objects once, take it or leave it
  if indent == 0
    seen.each {|seen_node| puts_node(seen_node, indent)}
  end

  nil # return nil to avoid flooding terminal in rails console
end

# Usage: call with the root node for the object graph

Worked like a charm, and made it easy to compare my bad object with other good ones. Just be careful of any global objects (a common shared subscription package, for instance, that has_many :users) that could lead to traversing your entire database, or logging associations that could overwhelm your output on older or heavily used objects. Subtracting a blacklist from node.class.reflections.keys in puts_tree would do the trick there.
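
As a rough sketch of that blacklist idea: the helper below filters a list of reflection keys against a skip list. The names SKIPPED_ASSOCIATIONS and walkable_associations are made up for illustration, and the association names are only examples.

```ruby
# Hypothetical skip list; these association names are examples only.
SKIPPED_ASSOCIATIONS = %w[users audit_logs].freeze

# Drops the skipped associations from a model's reflection keys.
# Keys are normalised to strings, since they may be strings or symbols
# depending on the Rails version.
def walkable_associations(reflection_keys, skipped = SKIPPED_ASSOCIATIONS)
  reflection_keys.map(&:to_s) - skipped
end

# Stand-in for node.class.reflections.keys
p walkable_associations([:users, :comments, "audit_logs"])
```

Inside puts_tree, the loop would then iterate over walkable_associations(node.class.reflections.keys) instead of the raw keys.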

No More Bundle Exec

September 6th 2012 | Tags: programming ruby

Bundler is pretty darn good. Installing all your gems globally sucks. bundle install --path does a great job of fixing that, but it means you need to bundle exec any shell commands you want to run, which again sucks. There are lots of attempts to fix this, but they're all fairly convoluted.

I'm a fan of simpler solutions wherever possible. I use zsh as my shell, which has a handler you can hook into if the command you're trying to run is not found. It's a simple matter to hook that into a custom shell script from your ~/.zshrc:

function command_not_found_handler() {
    # "$@" passes the original arguments through unchanged
    ~/bin/command-not-found "$@"
}

I know bash supports this kind of handler (Ubuntu uses it to provide command helpers for not-yet-installed programs) but I don't know the exact details. Alas, my favorite shell ever, fish, only provides the executable to its corresponding helper, so while it can suggest an alternate command, it can't auto-correct it.
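
For what it's worth, the bash hook (bash 4.0 and later) is a function named command_not_found_handle, without the trailing "r". A minimal sketch for ~/.bashrc, assuming the same ~/bin/command-not-found script path as the zsh version:

```shell
# bash calls this function, if it is defined, when a command is not
# found in $PATH. The script path mirrors the zsh example; adjust it
# to wherever your fallback script lives.
command_not_found_handle() {
    "$HOME/bin/command-not-found" "$@"
}
```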

My script happens to be in Ruby, but it could just as easily be a standard shell script as all I'm doing is some file existence tests:

#!/usr/bin/env ruby

# ARGV is the entire command we wanted to run, but we
# really only care about the actual executable for fallbacks
command = ARGV.first

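def run(cmd)
  $stderr.puts "Running #{cmd.inspect} instead"
  exec cmd # replace this process with the fallback command
end

case
when File.exist?("./.bundle/config") && File.exist?("./bin/#{command}")
  run("bundle exec #{ARGV.join(' ')}")
else
  # nothing matched: report "command not found" like the shell would
  exit 127
end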

Now, as long as you bundle install --binstubs, it should Just Work. And because it only functions if you're in a directory that's been bundled, you don't run into the security risks that you would by trying to get ./bin added to your $PATH directly.

Lastly, the case statement is overkill for the simple example above, but I've actually got a few more filters for things like isolate and git. Don't forget to quote anything that might need space literals:

# Paste git repo url to clone it
when command =~ /^git(@|:\/\/).*\.git$/
  run("git clone #{command.inspect}")

# paste compressed url to download+extract it
when command =~ /^(?:ftp|https?):\/\/.+\.t(?:ar\.)?gz$/
  run("curl #{command.inspect} | tar xzv")

when File.exist?("./tmp/isolate/ruby-1.8/bin/#{command}")
  run("rake isolate:sh['#{ARGV.join(' ')}']")
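
For the quoting, Ruby's standard library ships Shellwords, which escapes each argument for a POSIX shell. Here is a small sketch of building the command strings with it. The shell_command helper is a made-up name for illustration and isn't part of the script above.

```ruby
require 'shellwords'

# Hypothetical helper: builds a shell-safe command line from a program
# name and its arguments, escaping spaces and other special characters.
def shell_command(program, *args)
  Shellwords.join([program, *args])
end

puts shell_command("bundle", "exec", "rake", "my task")
puts shell_command("git", "clone", "git@example.com:me/my repo.git")
```

The result can be handed straight to run, so an argument containing a space stays a single argument when the shell parses it again.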

Joining Git Repositories

May 15th 2012 | Tags: git

At work, we provide an API for our app and maintain web-based documentation for said API. We originally had the documentation in a separate git repo, but as it makes much more sense to maintain the docs directly alongside the code it documents we wanted to merge the two repositories. This was done in two steps.

Moving files

First, we need to prep the docs repo such that the content is in a reasonable location, rather than the root directory. This is done fairly easily with git filter-branch (zsh):

% mkdir -p doc/api_docs
% git checkout -b for_transplant # work in a branch for safety
% for file in <files/dirs to move>;
      do git filter-branch -f --tree-filter \
        "mkdir -p doc/api_docs; test -e $file && mv $file doc/api_docs || echo skip" \
        HEAD
      done

This walks our commit history, moving the old content into a subdirectory in each commit. It's slow in that it does a full history pass for each file, but I didn't care to figure out how to move everything except the docs directory.

Merging the repos

First, for clarity, let's make a new branch based off of the original commit in our destination repo:

% git log --oneline | tail -n 1
e7c9feb Initial commit
% git checkout e7c9feb
% git checkout -b doc_import

Next, prepare the main repo for the incoming transplant by cloning a bare copy (git refuses to pull an external repo into a normal checkout).

% cd ..
% git clone --bare myapp myapp-bare

We can then pull in the commit objects from the other repository:

% cd myapp-bare
% git fetch -f ../api_doc_site for_transplant:api_docs

The -f forces the update of the local api_docs branch even when it isn't a fast-forward (the two repos have different initial commits), and then we explicitly specify the remote and local branches to move commits to. You might need to resolve a merge conflict once the histories are actually merged, if for example both repositories committed a different .gitignore in their initial commit.

We can now go back to our regular repo and pull those branches in:

% cd ../myapp
% git remote add bare ../myapp-bare
% git pull bare api_docs

Finally, merging that branch into master gets things all up-to-date, and we can start unified work while maintaining the full original history of the documentation.
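
On git 2.9 or later, merging histories that share no common ancestor needs an explicit --allow-unrelated-histories flag. The sketch below shows that in a temporary directory, with throwaway repos named after the ones above (the file contents and commit messages are made up). It fetches directly and skips the bare-clone detour, so it illustrates the merge on a newer git rather than the procedure used above.

```shell
# Throwaway sandbox: two tiny repos standing in for myapp and api_doc_site.
work=$(mktemp -d) && cd "$work" || exit 1
git init -q myapp
git init -q api_doc_site

# Identity via environment so the demo doesn't depend on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

(cd myapp && echo app > README && git add README &&
  git commit -q -m "Initial commit")
(cd api_doc_site && mkdir -p doc/api_docs &&
  echo docs > doc/api_docs/index.html && git add doc &&
  git commit -q -m "Docs commit")

cd myapp || exit 1
# Fetch the docs history into a local branch, then merge the two roots.
git fetch -q ../api_doc_site HEAD:api_docs
git merge -q --allow-unrelated-histories -m "Import API docs" api_docs
git log --oneline
```

The final log shows the merge commit plus both original commits, so the docs history is preserved.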

Parsing JSON in SQL

November 19th 2011 | Tags: ruby rails sql

The Problem: You have a database column with some data serialized as JSON in it that you'd like to pull out into its own column to index it.

The Solution: Run a data migration to pull the value out. Table has 5 million rows and you don't want to round trip all that data through ActiveRecord? Just parse the JSON directly with some SQL:

def json(key, field='params')
  key_json = "\"#{key}\":"

  # key start/end locations, including ""
  k_a = "LOCATE('#{key_json}', #{field})"
  k_z = "LOCATE('\"', #{field}, #{k_a}+1)" # this is terminating "

  # is there a space after colons?
  spad = "IF(LOCATE('\": ', #{field}), 1, 0)"

  # is value a string?
  val_string = "LOCATE(CONCAT('#{key_json}', IF(#{spad},' ',''), '\"'), #{field}, #{k_a})"
  qpad = "IF(#{val_string}, 1, 0)"

  # value start/end locations, excluding "" if present
  v_a = "(#{k_z}+1 + 1 + #{spad} + #{qpad})" # 1 for colon, spad for optional space, qpad for possible quote

  end_if_string = "LOCATE('\"', #{field}, #{v_a})"
  end_if_not_string = "IF(LOCATE(',', #{field}, #{v_a}), LOCATE(',', #{field}, #{v_a}), LOCATE('}', #{field}, #{v_a}))"

  v_z = "IF(#{val_string}, #{end_if_string}, #{end_if_not_string})"

  value_string = "SUBSTRING(#{field} FROM #{v_a} FOR (#{v_z} - #{v_a}))"
  "IF(#{k_a}, #{value_string}, NULL)"
end

up do
  execute "
    UPDATE model_table
    SET status = #{json('status')}
  "
end

The generated SQL looks pretty gnarly, but MySQL ran through it stupidly fast. I shudder to think how long it'd take ActiveRecord to load and update each record individually.
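
If you happen to be on MySQL 5.7 or later, the built-in JSON functions can do the same extraction. Treat the following as a sketch under that assumption. The json_extract name is hypothetical (it just mirrors the json helper), and it assumes the column holds valid JSON and the key is a plain top-level name.

```ruby
# Hypothetical counterpart to json() for MySQL 5.7+.
# JSON_EXTRACT pulls the value at the given path; JSON_UNQUOTE strips
# the surrounding quotes from string values.
def json_extract(key, field = 'params')
  "JSON_UNQUOTE(JSON_EXTRACT(#{field}, '$.#{key}'))"
end

puts "UPDATE model_table SET status = #{json_extract('status')}"
```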

Isolating Rails

January 19th 2011 | Tags: rails ruby

Rails 3 is now very friendly with regard to dropping Bundler support: it only loads Bundler if it's installed and a Gemfile exists. Since Isolate is so awesome, I thought I'd just drop a quick script in here to convert an existing Rails app to use Isolate instead of Bundler.

#!/usr/bin/env ruby

require 'fileutils'

File.open("Isolate", 'w') do |isolate|
  File.readlines("Gemfile").each do |line|
    next if line =~ /^\s*#/   # skip comment lines
    next if line =~ /^source/ # skip the gem source declaration
    next if line =~ /^\s*$/   # skip blank lines

    line.sub!(/, :require.*(,|$)/, '\1')
    line.sub!(/^([ \t#]*)group/, '\1environment')

    if line =~ /:git/
      line = "# Don't use git, build it as a gem\n# " + line
    end

    isolate.puts line
  end
end

# Remove the Gemfile so that Rails won't try to load Bundler
FileUtils.rm("Gemfile")

File.open("config/boot.rb", 'a') do |boot|
  boot.puts("require 'isolate/now'")
end

This should convert an existing Gemfile to an Isolate file, remove the Gemfile (so that rails won't try to load it), and update the app to load Isolate appropriately.

I'm basically only guessing that the group/environment setup is correct, so if anyone has any corrections to this let me know and I'll update it.
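
To make the rewriting rules easier to poke at, here is the per-line logic pulled into a standalone function and run over a few sample Gemfile lines. The convert_gemfile_line name and the sample lines are made up for illustration, and it leaves out the :git handling.

```ruby
# Hypothetical helper: returns the Isolate version of one Gemfile line,
# or nil if the line should be dropped (comments, source, blank lines).
def convert_gemfile_line(line)
  return nil if line =~ /^\s*#/ || line =~ /^source/ || line =~ /^\s*$/

  line = line.sub(/, :require.*(,|$)/, '\1')     # Isolate has no :require
  line.sub(/^([ \t#]*)group/, '\1environment')   # group -> environment
end

sample = [
  "source :rubygems\n",
  "gem 'rack', :require => false\n",
  "group :test do\n",
  "  gem 'rspec'\n",
  "end\n",
]
sample.each { |line| print convert_gemfile_line(line).to_s }
```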

Migrating Disqus

December 22nd 2009 | Tags: blog

In changing this blog over to jekyll, my URLs changed (there's now a trailing slash). It's easy enough to tell Google about it, just set up redirects, but there's no easy way to tell Disqus about it so my comments migrate over.

The good news is that it's pretty straightforward using their API. The only bad news is that I can't delete the new threads auto-generated for the new URLs, so I'm just moving them out of the way.

I'm using the HTTParty gem to wrap API access, like so:

require 'rubygems'
require 'httparty'
require 'json'

class Disqus
  include HTTParty
  base_uri ''
  format :json

  def initialize(key, version='1.1')
    @key = key
    @version = version
  end

  # Credentials sent along with every request
  def auth
    {:user_api_key => @key, :api_version => @version}
  end

  def get(action, opts={})
    result = self.class.get("/api/#{action}/", :query => opts.merge(auth))
  end

  def post(action, opts={})
    result = self.class.post("/api/#{action}/", :body => opts.merge(auth).to_params)
    p result
  end
end

Do note that I'm adding trailing slashes to the api calls to avoid a redirect. Doesn't matter for the GET, but the redirect on POST was causing issues.

With this in hand, I'm grabbing my forum, looping through the threads, and renaming any that have comments (a whopping 3 of them).

key = "secret" # get yours at
disqus = Disqus.new(key)

forum = disqus.get(:get_forum_list).first # I just have one
forum_api_key = disqus.get(:get_forum_api_key, :forum_id => forum["id"])

start = 0 # manual pagination, eww
loop do
  threads = disqus.get(:get_thread_list, :forum_id => forum["id"], :start => start)
  break if threads.empty?

  threads.each do |thread|
    posts = disqus.get(:get_thread_posts, :thread_id => thread["id"])
    next if posts.empty? # only threads with comments need moving

    target_url = thread["url"]+"/"

    # There's another thread in the way...
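    # Action names here follow the Disqus 1.1 API (get_thread_by_url, update_thread)
    if other_thread = disqus.get(:get_thread_by_url,
      :forum_api_key => forum_api_key,
      :url => target_url
    )
      # free up the url we want to use
      disqus.post(:update_thread,
        :forum_api_key => forum_api_key,
        :thread_id => other_thread["id"],
        :url => target_url + 'old'
      )
    end

    # update thread url
    disqus.post(:update_thread,
      :forum_api_key => forum_api_key,
      :thread_id => thread["id"],
      :url => target_url
    )
  end

  start += 25
end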

Et voilà, old comments are in the right place now.
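
If the manual pagination bothers you as much as it did me, it can be factored into a small helper. This is only a sketch: each_item is a made-up name, and the fetcher lambda stands in for the real disqus.get(:get_thread_list, ...) call with an in-memory list, so it runs without hitting the API.

```ruby
# Hypothetical helper: repeatedly calls fetch_page with a start offset
# and yields each item until an empty page comes back.
def each_item(fetch_page, page_size = 25)
  return enum_for(:each_item, fetch_page, page_size) unless block_given?

  start = 0
  loop do
    page = fetch_page.call(start)
    break if page.empty?

    page.each { |item| yield item }
    start += page_size
  end
end

# Stand-in for the API: 60 fake threads served 25 at a time.
fake_threads = (1..60).to_a
fetch = ->(start) { fake_threads[start, 25] || [] }

each_item(fetch) { |thread| puts thread }
```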