Snippet: col

#!/usr/bin/env ruby
col = ARGV.pop.to_i-1
while line = gets
  puts line.chomp.split(/\s+/)[col]
end

For when you just want a list of filenames from version control, hg st | grep '?' | col 2

And because I can never remember the standard unix tool that does the same thing, and awk is awkward.
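
For trying the same logic in irb, here is a minimal sketch of it as a method. The name `column` and the sample status lines are made up for illustration. Like the script, it counts columns from 1, and a line with leading whitespace yields an empty first field because of how split(/\s+/) behaves.

```ruby
# Hypothetical helper: return the col-th (1-based) whitespace-separated
# field of each line, mirroring what the col script does per line.
def column(lines, col)
  lines.map { |line| line.chomp.split(/\s+/)[col - 1] }
end

# Invented sample input in the shape of `hg st` output.
status = ["? lib/new_file.rb\n", "M lib/changed.rb\n"]

# Keep only the untracked entries, then pull out the filename column.
puts column(status.grep(/\?/), 2)
```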

Spec Timing

As an update to a previous article,

if $specs_timed.nil? && ENV.has_key?('SLOW')
  $specs_timed = true
  $timings = []

  Spec::Example::ExampleGroup.prepend_before do
    @start = Time.now
  end
  Spec::Example::ExampleGroup.append_after do
    elapsed = Time.now - @start
    if elapsed > ENV['SLOW'].to_f
      $timings << [elapsed, "#{self.class.description} #{description}"]
    end
  end

  at_exit do
    puts "\nSlow Specs:"
    $timings.sort{|a,b| a.first <=> b.first}.each do |time, name|
      puts " %7.4f #{name}" % time
    end
    puts "  None!" if $timings.empty?
  end
end

Then, simply run

rake SLOW=0.1

Two gotchas if you're using Rails though: instead of hooking S::E::ExampleGroup, you'll need to hook Spec::Rails::Example::RailsExampleGroup. Second, if you have any spec failures the timings don't seem to get output, since spec/rails aborts execution after failing.

RejectConf Talk - RCov Hack

It seems that there were people making both audio and video recordings of talks at RejectConf this year.

I gave a short talk on the RCov hack I've been working on (mentioned previously) which seemed to go well. The quality of the video isn't that great, but Geoff's recording of just the audio is good. It should go along well with the slides (pdf) if anyone's curious.

Ruby of the Future

Pat Eyler's April contest asks what changes could improve Ruby without losing the feel of the language. I've got four ideas. Even I would consider them quite radical, and don't think they're likely to occur, but if in the future Matz decided that sweeping changes were in order and that backwards compatibility wasn't an issue, they'd be on my wish list.

First, trim Ruby's core. Go through the whole shooting match and pull out anything that isn't likely to be used by at least 80% of the population, and move it into the standard library. Pay special attention to anything written in pure Ruby. CSV, Generator, PrettyPrint, RSS, RUNIT, Rinda, SOAP, and others could be pushed out of Ruby's core.

Similarly, anything with redundancies probably should be moved out too - Ruby's current core has getopts.rb, GetOptLong, OptionParser, Options, and there's a handful of 3rd party libraries available to parse commandline options. Do these all really need to be in core? (Heck, do they all need to be in StdLib for that matter? getopts.rb has as its only comment that it is deprecated and to use another library instead.)

Cleaning up Ruby's core would allow alternate implementations to focus more specifically on what is needed to get Ruby running, without the distractions of extra bundled libraries. It would also provide a smaller base of knowledge for those new to Ruby to learn before they can claim to "know" the language.

My second wish would be to improve the OO-ness of Ruby. Yes, we're miles ahead of Python and its large number of global functions, but a split from some of the old Perlisms could possibly result in an improvement of maintainability. If a number of the magic global variables were scrapped (or at least replaced with the English module) someone who doesn't muck about with them on a day-to-day basis can at least have a chance of figuring out what's going on. I know that I still need to look up $$, $_, $' and others when I see them, and I've been working in Ruby for about a year and a half now.

If everyone were forced to use English-style accessors, and actually go through the OO interface for information (i.e., Regexp.last_match instead of magic globals), it should lead to more self-documenting code.
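
As a rough illustration of the difference, here is a small sketch using the English library that ships with Ruby. The sample string and regexp are invented; the aliases shown are the ones English defines for the globals mentioned above.

```ruby
require 'English'

"hello world" =~ /o w/

# Each cryptic global has a readable alias once English is loaded.
puts $PROCESS_ID == $$        # true
puts $POSTMATCH == $'         # true
puts $LAST_MATCH_INFO == $~   # true

# The OO route asks Regexp directly instead of reading a global.
match = Regexp.last_match
puts match.pre_match    # hell
puts match[0]           # o w
puts match.post_match   # orld
```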

Third, I'd like to see what could be done to trim down the syntax a little. Do we really need ?A to provide us the same information as "A"[0]? I'm loath to suggest %w() and friends for trimming as I find them useful fairly often, but that's also something that could be looked at. Having the language specified with a parser generator means that alternative implementations can be almost guaranteed to be matching the spec with little effort. Making the core language simpler, with fewer gotchas, also improves the speed at which it can be learned.

Lastly, I'd like to see a step taken towards standardizing on some kind of Unicode, and updating the core language (strings, regexps, and all) to make it as seamless to use as possible. The Rails guys have got things started with multibyte strings in ActiveSupport, but I think it would need to be baked in to the language to be as comprehensive as possible.

All four of these changes would require a lot of work, and likely result in incompatibilities with current versions of Ruby (excepting possibly the last one). However, I think that taking these as a base for streamlining the language could result in a Ruby that is easier to learn, easier to work with, and easier to duplicate or extend, while still retaining the core features that make it what it is.

Comments

Best entry so far in my opinion. I agree with all of your points, particularly the first two.

  • Peter Cooper, at 19:01, Apr 28 2007

Finally got the comment problem fixed, looks like an older version of Mephisto was doing weird things with the page cache. If anyone happens to notice that comments are back up, you can join in now :)

  • Jamie, at 19:13, Apr 30 2007

RPlug 0.2

RPlug has a new release up, which should now be useful for the world at large, as it has gained support for projects in subversion.

The update process now preserves the .svn turds rather than breaking the working copy, which is possible now that SourceControl has taught svn (and svk) how the manifest command should be implemented (11 lines of ruby).

I should probably do a check after I've done the export and cull any now-empty directories from the plugin dir, but that'll come in time, I'm sure.

Problems with Hoe

RPlug and SourceControl now officially have Gems out. SourceControl is probably useless for anybody at the moment, but if you are working on a rails repository under SVK and want to manage SVN-backed plugins, RPlug should handle it just fine. Just gem install rplug -y. More compatibility to come in the future.

[Updates below]

I've been having problems getting SourceControl deployed, which turns out (unsurprisingly) to be user error - I'm new to this whole rubyforge/gem scene.

So, for the record, prior to releasing a gem using Hoe, one needs to get rubyforge configured. For me, this wound up being:

$ rubyforge setup
$ rubyforge config rplug
$ rubyforge config sourcecontrol

After all that, SourceControl is deploying just fine.

I'm presuming that the initial problem was that the gem (and internal file structure) is source_control, but due to limitations on rubyforge the project name is sourcecontrol - somewhere along the way that confusion stopped it from working.

Today, I went mucking around with the packages for it, removed the old one named 'sourcecontrol' and added 'source_control' - removing ~/.rubyforge/auto-config.yml and re-running the rubyforge setup/config picked up the new package id, and everything seems to run just fine now.

RPlug Up and Running

Well, it's got the basic functionality it needs, so I'm about to put out a 0.1.0 gem for RPlug. It has a dependency on SourceControl, which I think only deserves a 0.0.5 release because it only does the bare minimum to support RPlug at the moment.

Both projects are entirely up in subversion if anyone wants to check them out, but they're not quite ready for public consumption at the moment.

Example usage and output follows.

% rplug install exception_logger http://svn.techno-weenie.net/projects/plugins/exception_logger svn
Recorded exception_logger, run 'rplug update' to pull the latest revision

% rplug update
Working in project dir /home/jamie/dev/redvase
Updating exception_logger...
  upgrading to revision 2733
  updating local repository
  Done.
Updating mocha...
  Done.
Updating helper_test...
  Done.
Updating arts...
  Done.
Updating liquid...
  Done.

% rplug status
Working in project dir /home/jamie/dev/redvase
Managing the following plugins:
  arts, revision 70
  exception_logger, revision 2
  helper_test, revision 85
  liquid, revision 140
  mocha, revision 99
Not Managing the following plugins:
  test_timer

% rplug update -p exception_logger -r 2563
Working in project dir /home/jamie/dev/redvase
Updating exception_logger...
  upgrading to revision 2563
  updating local repository
  Done.

For those new to the blog, I'm currently reinventing a few wheels here - RPlug is a replacement for Piston that stores meta-info in config/plugins.yml rather than the version control system, and which does not tie itself directly to Subversion even when given a compatible system (like SVK). It does this by using SourceControl (itself intended as a replacement for RSCM) to handle the interface to the SCM system.

How to crash Ruby

  class BrokenError < StandardError
    def backtrace
      raise(StandardError.new)
    end
  end

  begin
    raise BrokenError.new
  rescue => e
    puts 'rescued'
  end

Because of the exception in the backtrace generation, processing just dies. If you have an at_exit block, it will still be run, so I suppose I'm not really crashing the ruby interpreter, but it comes close.

Found this one out migrating a rails app from 1.1.6 to 1.2. Instead of doing this:

  render 'controller/action'

the deprecation warning suggests the following:

  render :file => 'controller/action'

Unfortunately, this causes the error if you're still trying to run in 1.1.6. A more complete fix is to make sure to add :use_full_path to the render call to prevent an older TemplateError from horking, like so:

  render :file => 'controller/action', :use_full_path => true

Recursiveness

One of the big things I learned at University was that while "Recursion is a Wonderful Thing" (Thank you, Dr. Roelants), sometimes the performance can really hurt. Those times, it can pay to spend the effort turning that recursive function into a simple loop. Sure, it might not be as clean, or as elegant, or as natural to understand, but we're looking at performance here, right?

Ryan Davis recently posted about using RubyInline to optimize a recursive factorial method. He ended with a caveat that sometimes you need to look at other things than just moving the code into C for speed. His idea was to cache the data as it goes along. There are times when that won't help you in the long run (for example, generating a stats graph where caching as you draw helps, but the cached values will be stale the next time you need to do it) but changing it around to iterative can sometimes give you a further speedup.

def fib_iter(n)
  return 1 if n < 3
  f = f1 = 1
  # fib(1) and fib(2) are already covered, so start stepping at 3
  (3..n).each do
    f, f1 = (f+f1), f
  end
  f
end

The benchmarking speaks for itself. (Same parameters as Ryan's benching, 10,000 runs doing fib(15)):

                      user     system      total        real
fib-ruby         21.180000   3.640000  24.820000 ( 24.989140)
fib-hash-reset    0.510000   0.070000   0.580000 (  0.609976)
fib-cache-reset   0.510000   0.050000   0.560000 (  0.570715)
fib-iter          0.160000   0.020000   0.180000 (  0.209565)
fib-hash          0.020000   0.000000   0.020000 (  0.034616)
fib-cached        0.020000   0.010000   0.030000 (  0.035222)

Benchmarks for fib-ruby and fib-cached come from Ryan's post. fib-iter and fib-hash are mine.

The two "-reset" methods are indicative of times when global caching won't help you, yet they are still a significant speedup over the uncached version. (For fib(15), uncached will need ~610 method calls, compared to ~15.) The iterative method takes about 1/3 of their time, but when you can globally cache you can get huge gains - if I increased the number of runs in the benchmark, the discrepancy between fib-iter and fib-cached would increase even more.

So once again, it seems that there's a different best solution for two different problems.
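
For anyone wanting to try a comparison like this, here is a minimal sketch using the standard Benchmark library. The recursive version is my guess at the shape of a plain recursive fib, not code taken from Ryan's post, and the method names and the reduced run count are chosen just for this example.

```ruby
require 'benchmark'

# Assumed shape of the plain recursive version (not from Ryan's post).
def fib_recursive(n)
  return 1 if n < 3
  fib_recursive(n - 1) + fib_recursive(n - 2)
end

# Loop version: walk up from fib(2), keeping only the last two values.
def fib_loop(n)
  return 1 if n < 3
  prev = curr = 1
  (3..n).each do
    prev, curr = curr, prev + curr
  end
  curr
end

RUNS = 1_000 # fewer than the 10,000 runs above, to keep the sketch quick

Benchmark.bm(10) do |x|
  x.report('recursive') { RUNS.times { fib_recursive(15) } }
  x.report('loop')      { RUNS.times { fib_loop(15) } }
end
```

Timings will differ a lot from the table above on a modern interpreter, so treat the absolute numbers as incidental and look at the ratio.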

And the fib-hash benchmark? It's not significantly faster than Ryan's fib-cached method, but it bumps the fib logic from a method that uses a hash into the hash itself. It's a neat trick I picked up a while ago, but probably too ugly to make significant use of unless your benchmarking tells you otherwise - it's really hard to read at first glance:

def hashfib(n)
  return 1 if n <= 1
  h = Hash.new{|h,k| h[k] = h[k-1] + h[k-2] }
  h[1] = 1
  h[2] = 1
  h[n]
end

The cached version uses @@h instead of h, and ||=s it.
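
Spelled out, the cached variant might look roughly like this. It is a sketch rather than the exact code I benchmarked: FibCache is a made-up name, and the module wrapper is there because Ruby 3.0 and later raise an error on class variable access at the top level.

```ruby
module FibCache
  def self.fib(n)
    return 1 if n <= 1
    # ||= builds the hash only on the first call; later calls reuse
    # whatever values it has already filled in.
    @@h ||= Hash.new { |hash, k| hash[k] = hash[k - 1] + hash[k - 2] }
    @@h[1] = 1
    @@h[2] = 1
    @@h[n]
  end
end

puts FibCache.fib(15) # 610
```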

Rapid Prototyping with OpenStruct

I was doing a bit of data processing the other night. A little copying here, a bit of typing there, formatting into YAML, then loaded into a Ruby script. Loop through the hashes YAML loaded, and try to make some sense out of it.

I'm happy to say that I wound up doing the most comfortable thing for munging the data, and it turned out pretty well: OpenStruct. For those who don't know about it (require 'ostruct'), OpenStruct is exactly as the name says. It's a struct, in that it just holds data, but it is open for extending after you've created it. One can almost treat it like a Hash, but with method calls instead of indexing. (In fact, this week's RubyQuiz was converting YAML-loaded Hashes to OpenStructs)

What I was doing was looping through the Hashes, and creating OpenStructs on the fly to hold the data. At the same time, I was back-referring to previous OpenStructs and appending data to them. I didn't think much of it until I thought to myself that I needed to do some calculations on the data, and the most logical spot for it was in one of my OpenStruct objects.

I was disappointed for a moment because I knew the methods didn't fit in OpenStruct itself, when I realized that it was just time to refactor a bit - take the OpenStructs that were holding the data, promote them to instances of a concrete class, and fit the logic in there.

A quick class def, a handful of attr_accessors, rename the OpenStruct instantiation to my new class, and I was off again, none the worse for wear. Ahh, duck typing, I couldn't have done it without you.
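
The promotion step can be sketched as below. The Player record, its fields, and the average calculation are all invented for illustration (my actual data was different); the point is that the calling code barely changes when the OpenStruct becomes a real class.

```ruby
require 'ostruct'

# Prototype stage: an OpenStruct grows fields as they are needed.
player = OpenStruct.new(:name => 'alice')
player.scores = [10, 20]
player.scores << 30

# Refactored stage: same accessors, plus a home for the calculation.
class Player
  attr_accessor :name, :scores

  # Accept a hash like OpenStruct.new does, so call sites stay the same.
  def initialize(attrs = {})
    attrs.each { |key, value| send("#{key}=", value) }
  end

  def average
    scores.inject(0) { |sum, s| sum + s } / scores.size.to_f
  end
end

# Only the class name changes at the point of instantiation.
player = Player.new(:name => 'alice')
player.scores = [10, 20]
player.scores << 30
puts player.average # 20.0
```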