Byte Friendly

Random thoughts on programming and related topics.

How to Automatically Open New Octopress Post in Editor?


Recently I discovered Octopress and instead of writing another article on how to migrate to it from Wordpress, I decided to do something else. And, fortunately enough, the topic presented itself. :)

So, Octopress is not your conventional blog engine. You create new posts by running a command in the terminal. I’m quite comfortable with creating new posts with a rake task. What I didn’t like is that I needed to run another command to actually open the post in my editor. Here’s my fix for that: it patches the new_post rake task by adding a new optional parameter, :open_in_editor. If you pass true (or any other truthy value), the post will be opened in my default $EDITOR (which is TextMate at the moment). Now creating a new post can look like this:

rake new_post['How to automatically open new post in editor?',:open]
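
For reference, here’s roughly what that patch can look like. This is a minimal sketch rather than the actual Octopress Rakefile: the real new_post task builds the filename and front matter itself, and the helper logic here is only illustrative.

# Rakefile (sketch, not the stock Octopress task)
desc "Create a new post and optionally open it in $EDITOR"
task :new_post, [:title, :open_in_editor] do |t, args|
  title = args.title || 'new-post'
  slug = title.downcase.gsub(/[^\w]+/, '-')
  filename = "source/_posts/#{Time.now.strftime('%Y-%m-%d')}-#{slug}.markdown"
  File.open(filename, 'w') do |post|
    post.puts '---'
    post.puts "title: \"#{title}\""
    post.puts '---'
  end
  # any truthy second argument opens the freshly created post
  system("#{ENV['EDITOR']} #{filename}") if args.open_in_editor
end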

It Is All Relative


I remember reading my first book on JavaScript. It covered JavaScript 1.0 and included the changes in JavaScript 1.1. I thought then: “Wow, that’s advanced stuff!” If I had been told that things like jQuery or Node.js would someday exist, I wouldn’t have believed it. And if tomorrow I fell into a time vortex and traveled to the 90s, I would probably kill myself. It’s like the stone age of technology. Better to end up in the real stone age. At least the air was still clean. :)

Can I Haz a Gem?


Back when I was just starting to program, we used to joke about Delphi coders.

A Delphi coder wants to build a tool for cheating in games (read/write memory of another process). So, before writing a single line of code he goes to a forum and asks: “Are there ready-made components for building game cheating tools?”

I guess this happens with every widely adopted technology that has plugins/libraries. Here’s an example:

(Screenshot, 2012-04-05 07:17)

Connect to Multiple MySQL Servers


You know how everyone is obsessed these days with scalability? Well, I am too. There’s a project of mine where I need to connect to multiple MySQL servers, depending on the current client id. The simplest implementation looks like this:

def self.use_table_for_app aid
  config = get_connection_config aid
  ActiveRecord::Base.establish_connection config
end

This works, but it creates a new connection on every call. Let’s suppose we’re processing a queue and handling events from different apps (customers). The cost of setting up a connection can easily outweigh the actual work, so we need to cache it somehow.

Now, ActiveRecord::Base has a method called connection_config that returns, well, the configuration of the current connection. We can compare that to what we have on hand and, if they match, skip reconnecting. Here’s how our code looks now:

def self.use_table_for_app aid
  config = get_connection_config aid

  unless can_use_connection?(config) && ActiveRecord::Base.connection.active?
    ActiveRecord::Base.establish_connection config
  end
end

def self.can_use_connection? config
  current_config = ActiveRecord::Base.connection_config

  # current config can have more keys than our config (or vice versa), so simple hash comparison doesn't work.
  config.each do |k, v|
    # if even a single parameter is different - can't reuse this connection.
    if config[k] != current_config[k]
      # TODO: log warning
      return false
    end
  end
  true
end

Looks almost good. Now let’s extract this into a module for ease of reuse. Here’s the final code.

module SqlModelHelpers
  def self.included klass
    klass.send :extend, ClassMethods
  end

  module ClassMethods
    def setup_connection aid
      # Reuse the current connection when the requested config matches it
      # (see can_use_connection? below) and the connection is still open;
      # otherwise establish a new one.
      config = get_connection_config aid

      unless can_use_connection?(config) && ActiveRecord::Base.connection.active?
        ActiveRecord::Base.establish_connection config
      end
    end

    def can_use_connection? config
      current_config = ActiveRecord::Base.connection_config

      config.each do |k, v|
        if config[k] != current_config[k]
          # TODO: log warning
          return false
        end
      end
      true
    end
  end
end

class MyModel < ActiveRecord::Base
  include SqlModelHelpers

  def self.use_table_for_app aid
    setup_connection aid

    table_name
  end
end

Now, if we want to use this functionality in some other models, we just include the module. This code can certainly be improved further; post your suggestions. :)
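
For completeness, here’s a hypothetical usage sketch: draining a queue of events that belong to different apps. The events and event names are illustrative, not from the actual project, and get_connection_config is assumed to be implemented as in the code above.

# hypothetical queue-draining loop
events.each do |event|
  # reconnects only when the target config differs from the current one
  MyModel.use_table_for_app(event.app_id)
  MyModel.create!(payload: event.payload)
end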

Define Module Programmatically


module Foo
end

Given a module Foo, how do you define a nested module Bar?

module Foo
    module Bar
    end
end

There are a number of ways. The first, most obvious one is to eval a string.

module Foo
end

name = 'Bar'

Foo.class_eval <<RUBY
  module #{name}
  end
RUBY

Foo::Bar # => Foo::Bar

While this certainly works and gets the job done, it has some flaws. First, it’s a string, so some editors and IDEs will get confused and lose syntax highlighting/completion. Second, there’s no validation of the module name. In the best case, you’ll get a syntax error; in the worst case, hard-to-track bugs. Here’s another way, using Module.new and const_set:

module Foo
end

bar = Module.new
Foo.const_set(:Bar, bar)

Foo::Bar # => Foo::Bar

This one is better. You clearly state that you’re setting a constant, and the module name is validated for you (const_set raises a NameError on an invalid constant name).
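
As a side note, Module.new also accepts a block, so the module you attach doesn’t have to be empty. A small sketch (the greet method is purely illustrative):

module Foo
end

# the block is module_eval'd, so `self` inside it is the new module
bar = Module.new do
  def self.greet
    'hello from Foo::Bar'
  end
end
Foo.const_set(:Bar, bar)

Foo::Bar.greet # => "hello from Foo::Bar"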

Are there other ways to do this?

Ruby: For..In Loop


You live and learn. Today I discovered the for..in loop in Ruby. When I saw it in a question on Stack Overflow, I was like “Hey, dude, this is Ruby, not JavaScript!” in my head. But, apparently, it’s legal Ruby. :)

arr = [1, 2, 3]

for a in arr
  puts "element #{a}"
end

#=> element 1
#=> element 2
#=> element 3  

You can also put ranges in there.

for a in (1..10)
  puts a
end

I usually write such loops with .each, but it’s good to know there’s another way.
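
For comparison, the .each versions of the loops above look like this:

[1, 2, 3].each do |a|
  puts "element #{a}"
end

(1..10).each { |a| puts a }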

Fun With Classes: Find Which Class Is the Biggest :)


Did you know that in Ruby you can compare classes with ‘less-than’ and ‘greater-than’ operators? I did not (until today). Observe:

class A
end

class B < A
end

class C < A
end

class D < B
end

A < B # false
A > B # true
D < A # true
D < B # true
D < C # nil
C > D # nil

For related classes this operator returns a boolean value (and nil for unrelated ones, as seen above), so you can write code like this:

puts "B inherits from A" if B < A

Note how it resembles the class definition syntax. I think that’s simply brilliant (though not very intuitive; I wouldn’t have thought of writing code like this).

Now that we know this fact, let’s try to sort the classes :)

class A
end

class B < A
end

class C < A
end

class D < B
end

class E < A
end

class Class
  # Module#<=> returns nil for unrelated classes, which breaks sorting, so we redefine it here.
  def <=> other
    return 0 if self == other
    return -1 if self < other
    1
  end
end

klasses = [A, B, C, D, E]

klasses.sort # [E, C, D, B, A]

Now this class hierarchy is sorted, with children first and parents last. Homework: come up with a practical application for this :)

How to Dump Your MongoDB Database Partially (Only Selected Tables)


Let’s say you want to dump your MongoDB database. There’s a handy tool that does just that, mongodump.

mongodump

If executed without any arguments, it will try to connect to localhost:27017 and dump all databases. You can specify a single database that you’re interested in, and it will dump just that database.

mongodump -d mydb

But in some cases you don’t want to dump the whole database. In my case, it’s an analytics application and 99% of the data is in raw event collections (events20120320, events20120321, …). I’m interested only in a small number of “important” collections. But mongodump doesn’t provide an option to specify several collections; you can only dump one collection at a time. If you don’t mind some typing, that’s easy.

mongodump -d mydb -c mycoll1
mongodump -d mydb -c mycoll2
mongodump -d mydb -c mycoll5

But we’re all programmers, so let’s automate this stuff. I, personally, had never used bash loops before, and this seems like a good use case for them. Let’s do it.

colls=( mycoll1 mycoll2 mycoll5 )

for c in ${colls[@]}
do
  mongodump -d mydb -c $c
done

The first line defines a bash array literal. Don’t use commas to delimit array elements; they’ll become part of the elements. The ${colls[@]} expression expands to all array elements and can be used anywhere a variable is expected. The rest is pretty straightforward, I think.
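
And if you’d rather stay in Ruby, the same loop is just as short there (collection names are the made-up ones from above):

# dump_collections.rb
%w[mycoll1 mycoll2 mycoll5].each do |coll|
  system('mongodump', '-d', 'mydb', '-c', coll)
end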

Deploying With Sinatra + Capistrano + Unicorn


Today we’ll be deploying a simple Sinatra app with Capistrano, using Unicorn as our web server. First things first: let’s think of a stupid name for this project. What about “sincapun”? Any objections? Good, let’s proceed.

mkdir sincapun
cd sincapun

Minimal runnable version:

# Gemfile
gem 'sinatra'


# sincapun.rb
require 'sinatra'

get '/' do
    "Hello world"
end

Let’s see if we can run it…

$ bundle exec ruby sincapun.rb
[2012-03-09 16:04:17] INFO  WEBrick 1.3.1
[2012-03-09 16:04:17] INFO  ruby 1.9.3 (2012-02-16) [x86_64-darwin11.3.0]
== Sinatra/1.3.2 has taken the stage on 4567 for development with backup from WEBrick
[2012-03-09 16:04:17] INFO  WEBrick::HTTPServer#start: pid=14871 port=4567

So far, so good. Now let’s convert it to a modular app and create a proper rackup file.

# sincapun.rb
require 'sinatra'

class Sincapun < Sinatra::Base
  get '/' do
    "Hello world"
  end
end

# config.ru
require './sincapun'
run Sincapun

At this point our app is runnable by any Rack-compatible server (Thin, Unicorn, …). Let’s add some Capistrano to it.

# Gemfile
source :rubygems

gem 'sinatra'
gem 'unicorn'
gem 'capistrano'
$ bundle install
Using highline (1.6.11)
Using net-ssh (2.3.0)
Using net-scp (1.0.4)
Using net-sftp (2.0.5)
Using net-ssh-gateway (1.1.0)
Using capistrano (2.11.2)
Using kgio (2.7.2)
Using rack (1.4.1)
Using rack-protection (1.2.0)
Using raindrops (0.8.0)
Using tilt (1.3.3)
Using sinatra (1.3.2)
Using unicorn (4.2.0)
Using bundler (1.0.22)
Your bundle is complete! Use `bundle show [gemname]` to see where a bundled gem is installed.

$ capify .
[add] writing './Capfile'
[add] making directory './config'
[add] writing './config/deploy.rb'
[done] capified!

Let’s improve our Capistrano config a little bit.

# config/deploy.rb
# We're using RVM on a server, need this.
$:.unshift(File.expand_path('./lib', ENV['rvm_path']))
require 'rvm/capistrano'
set :rvm_ruby_string, '1.9.3'
set :rvm_type, :user

# Bundler tasks
require 'bundler/capistrano'

set :application, "sincapun"
set :repository,  "git@github.com:stulentsev/sincapun.git"

set :scm, :git

# do not use sudo
set :use_sudo, false
set(:run_method) { use_sudo ? :sudo : :run }

# This is needed to correctly handle sudo password prompt
default_run_options[:pty] = true

set :user, "myname"
set :group, user
set :runner, user

set :host, "#{user}@myhost" # We need to be able to SSH to that box as this user.
role :web, host
role :app, host

set :rails_env, :production

# Where will it be located on a server?
set :deploy_to, "/srv/#{application}"
set :unicorn_conf, "#{deploy_to}/current/config/unicorn.rb"
set :unicorn_pid, "#{deploy_to}/shared/pids/unicorn.pid"

# Unicorn control tasks
namespace :deploy do
  task :restart do
    run "if [ -f #{unicorn_pid} ]; then kill -USR2 `cat #{unicorn_pid}`; else cd #{deploy_to}/current && bundle exec unicorn -c #{unicorn_conf} -E #{rails_env} -D; fi"
  end
  task :start do
    run "cd #{deploy_to}/current && bundle exec unicorn -c #{unicorn_conf} -E #{rails_env} -D"
  end
  task :stop do
    run "if [ -f #{unicorn_pid} ]; then kill -QUIT `cat #{unicorn_pid}`; fi"
  end
end

We need a config file for Unicorn. Here is what it may look like:

# define paths and filenames
deploy_to = "/srv/sincapun"
rails_root = "#{deploy_to}/current"
pid_file = "#{deploy_to}/shared/pids/unicorn.pid"
socket_file= "#{deploy_to}/shared/unicorn.sock"
log_file = "#{rails_root}/log/unicorn.log"
err_log = "#{rails_root}/log/unicorn_error.log"
old_pid = pid_file + '.oldbin'

timeout 30
worker_processes 2 # increase or decrease
listen socket_file, :backlog => 1024

pid pid_file
stderr_path err_log
stdout_path log_file

# make forks faster
preload_app true

# make sure that Bundler finds the Gemfile
before_exec do |server|
  ENV['BUNDLE_GEMFILE'] = File.expand_path('../Gemfile', File.dirname(__FILE__))
end

before_fork do |server, worker|
  defined?(ActiveRecord::Base) and
      ActiveRecord::Base.connection.disconnect!

  # zero downtime deploy magic:
  # if unicorn is already running, ask it to start a new process and quit.
  if File.exists?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      # someone else did our job for us
    end
  end
end

after_fork do |server, worker|

  # re-establish activerecord connections.
  defined?(ActiveRecord::Base) and
      ActiveRecord::Base.establish_connection
end

That should do it. Now you can deploy your app, assuming you have RVM on the server, can SSH into it, and can write to the /srv directory.

cap deploy:setup
cap deploy

The deploy should spit a lot of text into the console, and there should be no errors. Verify that the unicorns launched correctly by logging into the server and running this:

$ ps aux | grep sincapun
myuser   24851  2.0  0.1  88480 21024 ?        Sl   11:42   0:00 unicorn master -c /srv/sincapun/current/config/unicorn.rb -E production -D
myuser   24854  0.1  0.1  88480 19732 ?        Sl   11:42   0:00 unicorn worker[0] -c /srv/sincapun/current/config/unicorn.rb -E production -D
myuser   24857  0.1  0.1  88480 19732 ?        Sl   11:42   0:00 unicorn worker[1] -c /srv/sincapun/current/config/unicorn.rb -E production -D

To access these unicorns from the internet, you need to put a reverse proxy in front of them. But that is another story.

You can get a full copy of this code from the GitHub repo.

How to Keep Your System Clock Synchronized on Ubuntu?


Your server’s clock isn’t perfectly accurate. It may run faster or slower (in my experience it was always slower), so it’s important to synchronize it every so often, or else you might encounter some unexpected bugs. There’s a command in Ubuntu that synchronizes the clock against NTP time servers. It’s called ntpdate:

$ sudo ntpdate ntp.ubuntu.com
 9 Mar 03:59:12 ntpdate[7225]: step time server 91.189.94.4 offset 179.440440 sec

This one was three minutes behind. Could be worse, though. So now the clock is more or less accurate. To keep it this way, let’s add an hourly cron job. Create a file called ‘ntpdate’ (for example) in ‘/etc/cron.hourly’ with this content:

#! /bin/sh

ntpdate ntp.ubuntu.com

We don’t need sudo here, because these jobs are run with root privileges. Now make that file executable.

sudo chmod +x /etc/cron.hourly/ntpdate

We’re all set now. Come back a few days later and verify that the clock doesn’t drift as much anymore.