Helpful commands

As a follow-up to my last post, here are some commands that I use throughout the day. They are admittedly nothing special but they help me out.

.bashrc aliases:

alias grc='git rebase --continue'
alias gca='git commit -a --amend'
alias gs='git status -sb'
alias gd='git diff'
alias gsn='git show --name-only'

The one worth explaining is gca. It stages every change to tracked files and folds it into the previous commit. I use this constantly to keep adding stuff to my WIP commits. One thing to watch out for: it will mess up a merge conflict fix inside a rebase operation, because you’ll end up putting everything in the commit before the conflict. You want to stage the fix and do grc instead.

scripts:

force_push — I use this to automate the process of updating my remote branch and most importantly to prevent me from force pushing the one branch that I must NEVER force push.

#!/usr/bin/env bash
# Refuse to force push master; otherwise confirm, then force push the current branch.
CURRENT_BRANCH=$(git rev-parse --abbrev-ref HEAD)
if [ "$CURRENT_BRANCH" = "master" ]; then
  echo "YOU DO NOT WANT TO DO THAT"
  exit 1
fi

echo "git push origin $CURRENT_BRANCH --force"
read -r -p "Are you sure? [Yn] "
if [ "$REPLY" != "n" ]; then
  git push origin "$CURRENT_BRANCH" --force
fi

rebase_branch — There’s not really a lot to this, but I use it reflexively before I do anything.

#!/usr/bin/env bash
git fetch
git rebase -i origin/master

merge_to_master — I do this when I’m done with a branch. This makes sure that there will be a clean fast-forward push. Notice how it reuses rebase_branch.

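#!/usr/bin/env bash
# Record the branch name first: during an unfinished rebase HEAD is detached.
CURRENT_BRANCH=$(git rev-parse --abbrev-ref HEAD)
# Stop here if the rebase did not finish cleanly (e.g. a conflict).
rebase_branch || exit 1
echo "git checkout master"
git checkout master
echo "git pull origin master"
git pull origin master
echo "git merge $CURRENT_BRANCH"
git merge "$CURRENT_BRANCH"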

git-vim — this one is still a bit of a work in progress, but the idea is to grab the files you’ve changed in Git and open them in separate tabs inside Vim. You can then run it with git vim which I alias as gv.

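#!/usr/bin/env ruby

# uncommitted files
files = `git diff HEAD --name-only`.split("\n")
if files.empty?
  # WIP files: the empty --pretty format suppresses the commit header,
  # so only the file names are left
  files = `git show --name-only --pretty=format:`.split("\n").reject(&:empty?)
end

# Pass each file as its own argument so names with spaces survive
system("vim", "-p", *files)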

Of course, all these scripts need to be put somewhere in your executable path. I put them in ~/bin and include this location in my path.

So my workflow would look like this

git checkout -b new_branch
# hack hack hack
git commit -a
# hack hack hack
gca
# hack hack hack
gca
# all done now
rebase_branch
# whoops a merge conflict
# resolve it
git add .
grc
# Time to get this code reviewed on Github
force_push
# Code accepted, gonna merge this
merge_to_master

Git workflow

In my last post I described how at my work we use code review feedback to iteratively improve code. I want to describe how Git fits into this process, because this is probably the biggest change I had to make to my preexisting workflow. Basically I had to relearn how to use Git. The new way of using it (that is, it was new to me) is extremely powerful and in a strange way extremely satisfying, but it does take a while to get used to.

Importance of rebasing

I would describe my old approach and understanding as “Subversion, but with better merging”[1]. I was also aware of the concept of rebasing from having submitted a pull request to an open source project at one point, but I didn’t use it very often for reasons I’ll discuss later. As it turns out, understanding git rebase is the key to learning how to use Git as more than a ‘better Subversion’.

For those who aren’t familiar with this command, git rebase <branch> takes the commits that are unique to your branch and places them “on top” of another branch. You typically want to do this with master, so that all your commits for your feature branch will appear together as the most recent commits when the feature branch is merged into master.

Here’s a short demonstration. Let’s say this is your feature branch, which you’ve been developing while other unrelated commits are being added to master:

Feature branch with ongoing mainline activity

If you merge without rebasing you’ll end up with a history like this:

History is all jacked up!

Here is the process with rebasing:

# We're on `feature_branch`
git rebase master # Put feature_branch's commits 'on top of' master's
git checkout master
git merge feature_branch

This results in a clean history:

Feature branch commits on top

Another benefit of having done a rebase before merging is that there’s no need for an explicit merge commit like you see at the top of the original history. This is because (and this is a key insight) the feature branch is exactly like the master branch but with more commits added on. In other words, when you merge it’s as though you had never branched in the first place. Because Git doesn’t have to ‘think’ about what it’s doing when it merges a rebased branch it performs what is called a fast forward. In this case it moved the HEAD[2] from 899bdb (More mainline activity) to 5b475e (Finished feature branch).

The above is the basic use case for git rebase. It’s a nice feature that keeps your commit history clean. The greater significance of git rebase is the way it makes you think about your commits, especially as you start to use the interactive rebase features discussed below.

Time travel

When you call git rebase with the interactive flag, e.g. git rebase -i master, git will open up a text file that you can edit to achieve certain effects:

Interactive rebase menu

As you can see there are several options besides just performing the rebase operation described above. Delete a line and you are telling Git to disappear that commit from your branch’s history. Change the order of the commit lines and you are asking Git to attempt to reorder the commits themselves. Change the word ‘pick’ to ‘squash’ and Git will squash that commit together with the commit on the preceding line. Most importantly, change the word ‘pick’ to ‘edit’ and Git will pause the rebase just after applying the selected commit, so that you can amend it.

I think of these abilities as time travel. They enable you to go back in the history of your branch and make code changes as well as reorganize code into a different configuration of commits.

Let’s say you have a branch with several commits. When you started the branch out you thought you understood the feature well and created a bunch of code to implement it. When you opened up the pull request the first feedback you received was that the code should have tests, so you added another commit with the tests. The next round of feedback suggested that the implementation could benefit from a new requirement, so you added new code and tests in a third commit. Finally, you received feedback about the underlying software design that required you to create some new classes and rename some methods. So now you have 4 commits with commit messages like this:

A messy commit history
  1. Implemented new feature
  2. Tests for new feature
  3. Add requirement x to new feature
  4. Changed code for new feature

This history is filled with useless information. Nobody is going to care in the future that the code had to be changed from the initial implementation in commit 4 and it’s just noise to have a separate commit for tests in commit 2. On the other hand it might be valuable to have a separate commit for the added requirement.

To get rid of the tests commit all you have to do is squash commit 2 into commit 1, resulting in:

  1. Implemented new feature
  2. Add requirement x to new feature
  3. Changed code for new feature

New commit 3 has some code that belongs in commit 1 and some code that belongs with commit 2. To keep things simple, the feature introduced in commit 1 was added to file1.rb and the new requirement was added to file2.rb. To handle this situation we’re going to have to do a little transplant surgery. First we need to extract the part of commit 3 that belongs in commit 1. Here is how I would do this:

# We are on HEAD, i.e. commit 3
git reset HEAD^ file1.rb
git commit --amend
git stash
git rebase -i master
# ... select commit 1 to edit
git stash apply
git commit -a --amend
git rebase --continue

It’s just that easy! But seriously, let’s go through each command to understand what’s happening.

  1. The first command, git reset, is notoriously hard to explain, especially because there’s another command, git checkout, which seems to do something similar. The diagram at the top of this Stack Overflow page is actually extremely helpful. The thing about Git to repeat like a mantra is that Git has a two-step commit process: staging file changes and then actually committing. Basically, when you run git reset REF on a file it stages the file for committing as it was at that ref. In the case of the first command, git reset HEAD^ file1.rb, we’re saying “stage the file as it looked before HEAD’s change”; in other words, revert the changes we made in the last commit.
  2. The second command, git commit --amend commits what we’ve staged into HEAD (commit 3). The two commands together (a reset followed by an amend) have the effect of uncommitting the part of HEAD’s commit that changed file1.rb.
  3. The changes that were made to file1.rb aren’t lost, however. They were merely uncommitted and unstaged. They are now sitting in the working directory as an unstaged diff, as if they’d never been part of HEAD. So just as you could do with any diff you can use git stash to store away the diff.
  4. Now I use interactive rebase to travel back in time to commit 1. Rebase drops me right after commit 1 (in other words, the temporary rebase HEAD is commit 1).
  5. I use git stash apply to get my diff back (you might get a merge conflict at this point depending on the code).
  6. Now I add the diff back into commit 1 with git commit --amend -a (-a automatically stages any modified files, skipping the git add . step).
  7. Finally, git rebase --continue replays the remaining commits on top of the amended commit 1.

This is the basic procedure for revising your git history (at least the way I do it). There are a couple of other tricks that I’m not going to go into detail about here, but I’ll leave some hints. Let’s say the changes for the feature and the new requirement were both on the same file. Then you would need to use git add --patch file1.rb before step 2. What if you wanted to introduce a completely new commit after commit 1? Then you would use interactive rebase to travel to commit 1 and then add your commits as normal, and then run git rebase --continue to have the new commits inserted into the history.

Caveats

One of the reasons I wasn’t used to this workflow before this job was that I thought rebasing was only useful for the narrow case of making sure that the branch commits are grouped together after a merge to master. My understanding was that other kinds of history revision were to be avoided because of the problems that they cause for collaborators who pull from your public repos. I don’t remember the specific blog post or mailing list message, but I took away the message that once you’ve pushed something to a public repo (as opposed to what’s on your local machine) you are no longer able to touch that history.

Yes and no. Rebasing and changing the history of a branch that others are pulling from can cause a lot of problems. Basically any time you amend a commit message, reorder commits or alter a commit you actually create a new object with a new SHA reference. If someone else naively pulls from your branch after having pulled the pre-revised history they will get a weird set of duplicate code changes and things will get worse from there. In general if other people are pulling from your public (remote) repository you should not change the history out from under them without telling them. Linus’ guidelines about rebasing here are generally applicable.

On the other hand, in many Git workflows it’s not normal for other people to be pulling from your feature branch and if they are they shouldn’t be that surprised if the history changes.  In the Github-style workflow you will typically develop a feature branch on your personal repository and then submit that branch as a pull request to the canonical repository. You would probably be rebasing your branch on the canonical repository’s master anyway. In that sense even though your branch is public it’s not really intended for public consumption. If you have a collaborator on your branch you would just shoot them a message when you rebase and they would do a “hard reset” on their branch (sync their history to yours) using git reset --hard remote_repo/feature_branch. In practice, in my limited experience with a particular kind of workflow, it’s really not that big a deal.

Don’t worry

Some people are wary of rebase because it really does alter history. If you remove a commit you won’t see a note in the commit history that so and so removed that commit. The commit just disappears. Rebase seems like a really good way to destroy your own and other people’s work. In fact you can’t actually screw up too badly using rebase because every Git repository keeps a log, called the reflog, of the changes that have been made to the repository’s history. Using git reflog you can always back out misguided recent history changes by returning to a point before you made the changes.

Hope this was helpful!

---

  1. Not an insignificant improvement, since merging in Subversion sucks.
  2. Which I always think about as a hard drive head, which I in turn think about as a record player needle.

Installing Ubuntu Pangolin on Beagle Bone

Just a quick note if you’re like me and you want to put Ubuntu on your Beagle Bone. The Beagle Bone is a sweet little palm-sized motherboard/processor guy that’s nice for little hardware projects. It comes with the Angstrom operating system loaded onto the SD card. This operating system was fine for initial development but I ran into an issue where I couldn’t leave a server to run and log out. If I checked back the server was always dead no matter what I did to detach it (maybe it was just pegging itself and restarting). So I wanted to see if Ubuntu would handle any better.

Anyways, I kept trying to just flash an image from the official Ubuntu page onto an SD card and boot it, but it wasn’t working. Then I saw that there is a script that does all the work for you mentioned on this page:

http://elinux.org/BeagleBoardUbuntu#Canonical.2FUbuntu_Images

Works like a charm, and Ubuntu does seem to be more stable than Angstrom on the Beagle Bone. Thanks to the author of the script, Robert C. Nelson.

Rails and Ext non-Ajax Signup Form with Password Confirmation

This is, uh, a technical post.

Probably there are others who want to do the same somewhat senseless thing: use Ext to do form validation while keeping a boring non-Ajax post-and-response. The bottom line is that Ext favors doing it the Ajax way, and the Ajax way isn’t that hard to set up with Rails (just handle the form submission as normal but return JSON or XML to signal success or failure). But if you’re like me and working on a deadline, there can be a cognitive burden to switching to Ajax posting that you might want to avoid. Paradoxically, you might find yourself wasting a lot of time trying to figure out how to do it the “old-fashioned” way. Well, here’s one working standard-submission Signup Form, with fancy validations and all the kinks worked out.

Here’s the top half of the file users/new.html.erb, which is nearly the same as the code generated by restful-authentication:

<% @user.password = @user.password_confirmation = nil %>
<%= error_messages_for :user %>
<div id="no-js-form">
    <% form_for :user, :url => users_path, :html => {:id => "signup-form"} do |f| -%>
    <p>
        <label for="login">
            Real Name
        </label>
        <br/>
        <%= f.text_field :name, :id => "signup_name_field" %>
    </p>
    <p>
        <label for="login">
            User Name
        </label>
        <br/>
        <%= f.text_field :login, :id => "signup_login_field" %>
    </p>
    <p>
        <label for="email">
            Email
        </label>
        <br/>
        <%= f.text_field :email, :id => "signup_email_field" %>
    </p>
    <p>
        <label for="password">
            Password
        </label>
        <br/>
        <%= f.password_field :password, :id => "signup_password_field" %>
    </p>
    <p>
        <label for="password_confirmation">
            Confirm Password
        </label>
        <br/>
        <%= f.password_field :password_confirmation, :id => "signup_password_confirmation_field" %>
    </p>
    <p>
        <label for="password_confirmation">
            Role
        </label>
        <br/>
        <%= f.select :role, [["consumer","consumer"],["vendor","vendor"]], :id => "signup_role_field" %>
    </p>
    <p>
        <%= submit_tag 'Sign up', :id => "signup_submit_button" %>
    </p>
    <% end -%>
</div>
<div id="js-form-panel">
</div>

The only differences are a div wrapping the form (“no-js-form”) and the “js-form-panel” at the end. You’re going to laugh at me, but this form is buzzword-friendly; it’s unobtrusive in an ugly way. If JavaScript is turned off, the plain form will still work, and the following will simply never run:

<script type="text/javascript">
    /* 
     Thanks to:
     http://www.extjswithrails.com/2008_03_01_archive.html for standardSubmit tip (hard to find!)
     http://extjs.com/forum/showthread.php?t=23068 for password confirmation
     Anyone else I stole semantics from
     */
    // Look, I'm copying over the authenticity token to send in the JS-generated form. LOL!
    var authenticity_token = document['forms'][0]['authenticity_token'].value;
 
    Ext.onReady(function(){
        $('no-js-form').hide();
 
        var myForm;
 
        function submitHandler(){
            var form = myForm.getForm();
            var form_as_dom = form.getEl().dom;
            form_as_dom.action = form.url;
            form_as_dom.submit();
        }
        myForm = new Ext.form.FormPanel({
            monitorValid: true,
            standardSubmit: true,
            url: "/users",
            applyTo: "js-form-panel",
            title: "Signup as a New User",
            width: 310,
            autoHeight: true,
            items: [new Ext.form.TextField({
                allowBlank: false,
                msgTarget: 'side',
                name: "user[name]",
                id: 'js_signup_name_field',
                fieldLabel: "Real Name"
            }), new Ext.form.TextField({
                allowBlank: false,
                vtype: 'alphanum',
                msgTarget: 'side',
                name: "user[login]",
                id: 'js_signup_login_field',
                fieldLabel: "Username"
            }), new Ext.form.TextField({
                allowBlank: false,
                vtype: 'email',
                msgTarget: 'side',
                name: "user[email]",
                id: 'js_signup_email_field',
                fieldLabel: "Email"
            }), new Ext.form.TextField({
                allowBlank: false,
                inputType: 'password',
                vType: 'password',
                msgTarget: 'side',
                name: "user[password]",
                id: 'js_signup_password_field',
                fieldLabel: "Password"
            }), new Ext.form.TextField({
                allowBlank: false,
                inputType: 'password',
                name: "user[password_confirmation]",
                initialPasswordField: 'signup_password_field',
                vType: 'password',
                msgTarget: 'side',
                id: 'js_signup_password_confirmation_field',
                fieldLabel: "Confirm Password",
                validator: function(value){
                    return (value == document.getElementById("js_signup_password_field").value) || "Your passwords do not match";
                }
            }), new Ext.form.Hidden({
                name: "authenticity_token",
                value: authenticity_token
            }), new Ext.form.Hidden({
                name: "user[role]",
                value: "consumer"
            })],
            buttons: [{
                handler: submitHandler,
                text: "Signup",
                formBind: true
            }]
        });
 
    });
 
</script>

The noteworthy steps are: first, I hide the ‘no-js-form’, then I copy the authenticity_token that gets generated by a Rails form to put in the JS-generated form. Then, standardSubmit: true is the config option that makes a FormPanel not submit as an XMLHttpRequest. The funny code in the submitHandler is getting the underlying form object and calling submit on it, but as I write this it doesn’t make sense why both would be necessary. Finally, formBind: true causes the submit button to be deactivated while there are failing validations, and there’s some handy code for making sure that the password_confirmation matches password (totally lifted from somewhere else, see above).

Apple Cocoa Cavil

I’m going to try to sound more like Andy Rooney[1] up here on this blog. Also, how about I indicate when the boooring technical notes begin and end with technical and interesting.

This is one of my favorite xkcd comics. It really speaks to my experience. Usually I can pull away before I’ve finished registering for comments. Sometimes I’m halfway through a closely reasoned argument when I realize how perfectly pointless and non-personal-goal-advancing my actions are. Then, in the worst case scenario, there I am mixing it up with the other comment-warriors. Here’s me windmilling my way through a post about dolphin killing on Japanprobe. This used to be the url for a pitched brawl in which I interjected a few uninformed comments. Etcetera.

Anyway, I thought I’d write this post at a more meta level to dissuade myself from commenting elsewhere. So here goes (technical):

Have you ever noticed that Objective-C is really, really weird? Like, they just took all the C- and C++-style conventions and changed them? Me too. And on top of that it’s compiled and you do memory management and the engineers make APIs that have objects called NSCamelCasedFactoryMethodObjectFacilitator[2]. Okay, so then someone makes a script-y dynamic thing for managing the Objective-C stuff, good idea. And when designing this scripting interface they make the following language syntax design decisions:

Finally, the instruction separator is a dot, like in English sentences:
myString := ‘hello’.

The following example shows how to send a message to an object:
myString class

See, this is funny, because it’s completely different from every other programming language[3]. That is all.

Umm, but there is a somewhat interesting take-away. Both Apple and Microsoft have designed really sucky APIs (in terms of intuitability rather than functionality), compared to which GTK is fairly sane (it gets a bit clunky when dealing with “GtkIter” operations). But the MacOS developers follow Apple’s improvements of this API, cooing over the increased simplicity afforded by the new NSMakesYourToastRegistry. It’s the same with new C#/Windows API developments. So (this is actually the interesting part) the lesson is that when people work within a “closed” development system, they lose their sense about good and bad design![4]

Here’s the idea. Closed development systems don’t get good feedback and don’t have good change mechanisms, so even very good engineers (probably Apple’s are some of the best) end up working in the dark a little. It gets all culty, because there’s an elect that makes the design decisions and a laity that passively learns the new scripture. And everyone’s straining so hard to understand what the design class hath laid down that they’re no longer perceiving the design objectively. And proprietary lock-in helps, because it leads to fatalism (“what can I do, switch to Windows?”). There are all these weird little island communities where the natives are effectively locked-in to a platform because they’ve already invested the energy to understand its weird design. This isn’t even necessarily a proprietary vs. opensource thing. There are strange over-designed opensource projects that aren’t particularly open because of this class division (and most opensource projects rely on only 1-3 main contributors, it seems). All I’m saying is that bad APIs / development languages happen when designers aren’t being influenced in the right way by the end-user developers, and I’m speculating that this has to do with particular attitudes and processes associated with proprietary code and also a kind of design elitism. I mean, doesn’t Objective-C code (as code) suck?

---

  1. I include this link because I think this already marginal reference will become incomprehensible in ten years.
  2. Yeah, I’ve got their number all right.
  3. Actually, these are pretty interesting design decisions. The := assignment syntax is wack, but probably necessary for named arguments or something. The dot on the end is okay, but you’re moving the OO-messaging operator into the generally useless semicolon position. By using the space for messaging, you’re now saying “subject verbs(args)” instead of “subject.verb object, args” (in Ruby you can omit the parens for a function).
  4. So I sort of believe that. Mainly I’m bitter because I can’t get some code to work on MacOS.

Setup for Alexandria Development: Part II

(…after too much grief today installing Mephisto and mucking with Apache virtualhosts; I’ll get Part I back from the ether eventually) Update: Done. Update: This is a post moved over from the short-lived Mephisto blog, and ported back in time.

First of all, the alexandria binary is just a Ruby script that does a require 'alexandria' and runs Alexandria.main.

Alexandria.main is a method on the Alexandria ‘module’ that is used throughout the code (modules are ‘namespaces’ to avoid naming conflicts). This method is found in lib/alexandria.rb:

As you should be able to see, this method isn’t doing anything but setting up some global variables (like $DEBUG) and logging, and doing something weird with http_proxy. The real line is Alexandria::UI.main. That’s in lib/alexandria/ui.rb:

module Pango
  def self.ellipsizable?
    @ellipsizable ||= Pango.constants.include?('ELLIPSIZE_END')
  end
end
 
module Alexandria
  module UI
    def self.main
      Gnome::Program.new('alexandria', VERSION).app_datadir =
        Config::MAIN_DATA_DIR
      Icons.init
      MainApp.new
      Gtk.main
    end
  end
end

Gtk.main is the main loop of a gtk program. You set up your windows and widgets before running it, and it makes them all spin until you exit. So, after Icons.init runs (guess what that does), MainApp.new does all the work from now on.

The Pango code above this is interesting for seeing some Ruby syntax and features. Pango is a text-rendering and layout library used by gtk. The code is adding an ellipsizable? “question” method (returns true/false) to the Pango module. def self.ellipsizable? means that it’s defining a method on the module itself (the equivalent of a class method), one that doesn’t depend on instance data. ||= is a way of saying, “set the variable to this unless it’s already been set to something else (i.e., it’s neither nil nor false)”.
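Here is a minimal sketch of the same two ideas outside of Alexandria. The Demo module and its constant are made up for illustration, and the lookup uses a symbol because on current Rubies constants returns symbols rather than the strings the code above checks for.

```ruby
# Hypothetical stand-in for the Pango module; ELLIPSIZE_END is a made-up value.
module Demo
  ELLIPSIZE_END = 3

  # A "question" method defined on the module itself, memoized with ||=.
  def self.ellipsizable?
    @ellipsizable ||= constants.include?(:ELLIPSIZE_END)
  end
end

cache = nil
cache ||= "first"    # cache was nil, so it gets assigned
cache ||= "second"   # cache is already truthy, so nothing happens

flag = false
flag ||= true        # false counts as "not set" too, so flag is reassigned
```

One consequence of the last line: when the memoized answer is false, ||= recomputes it on every call, which is harmless for a cheap check like this one.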

Unfortunately, MainApp.new is in the massive MainApp class at lib/alexandria/ui/main_app.rb. This class does a lot (too much). The main thing it does is handle all the callbacks from the main window and its widgets. Let’s just take a look at the top:

 
module Alexandria
  module UI
    class MainApp < GladeBase
      attr_accessor :main_app, :actiongroup, :appbar
      include Logging
      include GetText
      GetText.bindtextdomain(Alexandria::TEXTDOMAIN, nil, nil, "UTF-8")
 
      module Columns
        COVER_LIST, COVER_ICON, TITLE, TITLE_REDUCED, AUTHORS,
        ISBN, PUBLISHER, PUBLISH_DATE, EDITION, RATING, IDENT,
        NOTES, REDD, OWN, WANT, TAGS = (0..16).to_a
      end
 
      # The maximum number of rating stars displayed.
      MAX_RATING_STARS = 5
 
      def initialize
        super("main_app.glade")
        @prefs = Preferences.instance
        load_libraries
        initialize_ui
        on_books_selection_changed
        restore_preferences
      end
    #... snip
    end
    # ... snip
  end
end

A couple of points here. MainApp inherits from GladeBase. The attr_accessor is a declaration that makes the @main_app, @actiongroup and @appbar instance variables publicly readable and settable. super("main_app.glade") calls the initialize method on GladeBase with the glade file that contains the definitions for all the widgets Alexandria uses. The names of the methods tell you what they do (good!). Because these methods need to know what the user’s preferences are, @prefs has been made available before they are called.
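As a rough sketch of attr_accessor and super working together, with made-up BaseWindow and MainWindow classes standing in for GladeBase and MainApp:

```ruby
# Hypothetical stand-ins for GladeBase and MainApp.
class BaseWindow
  attr_reader :filename

  def initialize(filename)
    @filename = filename
  end
end

class MainWindow < BaseWindow
  # Generates both appbar and appbar= for the @appbar instance variable.
  attr_accessor :appbar

  def initialize
    super("main_app.glade")   # runs BaseWindow#initialize with this argument
    @appbar = "status bar"
  end
end

window = MainWindow.new
window.appbar = "new status bar"   # the writer generated by attr_accessor
```

After this runs, window.filename holds the value the parent class stored, and window.appbar holds the value set from outside the object.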

To understand what MainApp is doing, it seems like we need to understand what GladeBase is.

module Alexandria
  module UI
    class GladeBase
      def initialize(filename)
        file = File.join(Alexandria::Config::DATA_DIR, 'glade', filename)
        glade = GladeXML.new(file, nil, Alexandria::TEXTDOMAIN) { |handler| method(handler) }
        glade.widget_names.each do |name|
          begin
            instance_variable_set("@#{name}".intern, glade[name])
          rescue
          end
        end
      end
    end
  end
end

So GladeBase is using GladeXML to get the widgets out of the XML file and load them into memory. It then iterates through them, adding them to MainApp (instance_variable_set is doing the work). So if there’s a widget called main_menu, MainApp will get a @main_menu variable to work with. These widgets work exactly as though they had been created “by hand”.
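The instance_variable_set trick is easier to see in isolation. In this sketch a plain Hash plays the role of the GladeXML object, and the WidgetHost class and widget names are invented for the example:

```ruby
# Hypothetical stand-in for GladeBase: a Hash maps widget names to "widgets".
class WidgetHost
  def initialize(widgets)
    widgets.each do |name, widget|
      # Creates @main_menu, @toolbar, ... on this instance at runtime.
      instance_variable_set("@#{name}", widget)
    end
  end

  # Ordinary methods can use the variables as if they had been assigned by hand.
  def describe_menu
    "menu is #{@main_menu}"
  end
end

host = WidgetHost.new("main_menu" => "a menu widget", "toolbar" => "a toolbar widget")
```

The instance variables never appear in the class body, which is also why a reader of MainApp has to go to the glade file to find out which ones exist.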

If you’ve been following, take a look at load_libraries and see if the code there makes sense. Here’s a short snippet:

      def load_libraries
        completion_models = CompletionModels.instance
        if @libraries
          @libraries.all_regular_libraries.each do |library|
            if library.is_a?(Library)
              library.delete_observer(self)
              completion_models.remove_source(library)
            end
          end
          @libraries.reload
        else
          #On start
 
          @libraries = Libraries.instance
          @libraries.reload
# ...

This is where things start to get confusing. load_libraries is also being used to reload libraries, so first it checks to see if @libraries has been defined already (refactoring opportunity). In the normal case, Libraries gets created by invoking Libraries.instance. To understand this, you have to know that Libraries uses a factory class method to make sure that Libraries only gets created once (making the Libraries instance a “singleton”).
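A small sketch of what the Singleton module from Ruby's standard library provides. The Registry class is made up; only the include Singleton and the .instance call mirror what Libraries does:

```ruby
require 'singleton'

# Hypothetical class; only the Singleton usage mirrors Libraries.
class Registry
  include Singleton

  attr_reader :items

  def initialize
    @items = []
  end
end

first = Registry.instance
second = Registry.instance    # same object as first, not a new one
first.items << "a library"
```

Because both variables point at the same object, the item added through first is visible through second. Singleton also makes Registry.new private, so the instance method is the only way in.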

At the bottom of load_libraries is some interesting code:

# ...
        @libraries.all_regular_libraries.each do |library|
          library.add_observer(self)
          completion_models.add_source(library)
        end
# ...

This is telling each library in @libraries (the Libraries singleton) to add self as an “observer”. What does this mean? It means that class Library is “observable”. To see what that means you have to look at Library. First let’s look at Libraries, in lib/alexandria/library.rb:

  class Libraries
    attr_reader :all_libraries, :ruined_books
 
    include Observable
    include Singleton
 
# ... snip
 
    #######
    private
    #######
 
    def initialize
      @all_libraries = []
    end
 
    def notify(action, library)
      changed
      notify_observers(self, action, library)
    end
  end
end

Libraries is including the Observable and Singleton modules to give it special methods (in Python these are called “mixins”). Singleton gave it the instance method. Observable is giving it the notify_observers method. What this method does is “call up” all the observers of this instance by calling their update methods.
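To make the “calling up” concrete, here is a minimal sketch using the observer library that has long shipped with Ruby. Shelf and Watcher are invented names; the changed / notify_observers pair is the same one used in the notify method above.

```ruby
require 'observer'

# Hypothetical observable object.
class Shelf
  include Observable

  def add(book)
    changed   # mark the object as changed; without this nobody is notified
    notify_observers(self, :book_added, book)
  end
end

# Hypothetical observer: anything with an update method will do.
class Watcher
  attr_reader :events

  def initialize
    @events = []
  end

  # Receives whatever arguments were given to notify_observers.
  def update(_source, action, book)
    @events << [action, book]
  end
end

shelf = Shelf.new
watcher = Watcher.new
shelf.add_observer(watcher)
shelf.add("Dune")
```

The observable never needs to know what its observers are. It only relies on each of them responding to update.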

Libraries has many Librarys (it’s a little weird to give a class a plural name). Each library is an observer of Libraries. Library is also Observable:

 
  class Library < Array
    include Logging
# ...
    include Observable

As we saw above, MainApp adds itself as an observer to each library. If you look on MainApp you’ll see that it has an update method:

  def update(*ary)
# ...
  end

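*ary means that it accepts any number of arguments, which Ruby collects into a single array called ary. This method gets called indirectly from many places in Library, whenever a library notifies its observers, like this: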

        source_library.notify_observers(source_library,
                                        BOOK_REMOVED,
                                        book)
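To make the *ary signature concrete, here is a tiny sketch. DemoObserver and the values passed in are made up for the example; the point is only that the three arguments given to notify_observers arrive in update as one three-element array.

```ruby
# Hypothetical observer whose update has the same signature as MainApp's.
class DemoObserver
  def update(*ary)
    ary # return the collected arguments so we can look at them
  end
end

observer = DemoObserver.new

# Called with no arguments, ary is an empty array.
p observer.update

# Called with three arguments, ary holds all three, in order.
p observer.update("my library", :book_removed, "Dune")

# The array can be unpacked again on the receiving side.
source, kind, book = observer.update("my library", :book_removed, "Dune")
puts "#{kind}: #{book} (from #{source})"
```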

That’s all for now. To learn more about Observers read this.

Setup for Alexandria Development: Part I

This is the first in a series of brain-dumps of my knowledge about Alexandria and related development issues. Be warned, the approach I will take in these posts will be to discuss boring and perhaps obvious details as they occur to me. You are advised to skim.

Getting the code

First things first, you should be able to check out a copy of Alexandria from Subversion. You can find instructions here, but unless you want to pull down the entire tree this is the actual URL you want:

svn co svn+ssh://method@rubyforge.org/var/svn/alexandria/trunk/alexandria

Btw, this is worth looking at if you want to play around with code without committing to a central repository.

Initial setup

Let’s look at the directory structure of the checked out copy (called the working directory).

(alexandria root)
alexandria.desktop.in   (Used to add Alexandria to the Gnome menu)
Rakefile                (The `rake` command looks for this)
/spec                   (Specs go in here)
alexandria.xcodeproj    (MacOS XCode project file)
/data                   (Configuration files go here)
/lib                    (Alexandria code libraries are here)
tasks.rb                (Rakefile uses this file)
/bin                    (Actual system-wide alexandria command goes here)
/debian                 (Contains templates needed to create debs)
/tests                  (For old 'test/unit' tests)
/doc                    (Docs go here)
/po                     (Language files go here)
/schemas                (Used in gconf, configuration file like Windows registry)

You will need to get a copy of rubygems. For some reason, the Ubuntu packaged rubygem never seems to actually work, so you should just compile and install rubygems from here. On Ubuntu or Debian, you should run sudo apt-get install build-essential ruby1.8-dev because some gems will need to build “extensions”. You can use either your distro’s rake or install rake from gem. You install gems with:

sudo gem install (package)

You should install rake, rspec, rcov and zentest (autotest):

sudo gem install rake rspec rcov zentest

To work on the website you will also need staticmatic.

Rake and Testing

In the root of your working directory you should now be able to type rake -T and you will see a long list of rake “tasks” defined in the Rakefile and tasks.rb. The most important tasks for development purposes are sudo rake install to install to your system (it installs in /usr/lib/ so be careful) and rake spec, for running the test suite.

Rspec is super cool, but you’ll have to study the tutorials to learn how to use it. A great way to learn Ruby and Rspec at the same time is to ‘spec out’ basic Ruby types! For example, if you’re unsure about how an array method works, you can do this:

describe Array do
   it "should sort strings alphabetically" do
      ["b", "a", "c"].sort.first.should == "a"
   end
end

Just don’t get confused by the pattern of writing specs to cover code that’s already been written. The basic idea behind Behavior-Driven Development is that you write tests that show how your code will behave before writing the code. The only way to really learn how to do this is to force yourself to write some code this way.

Because BDD is supposed to happen before you write code, Alexandria has very poor test “coverage” at the moment, and it’s not easy to add specs to the code the way it is now. Still, it’s good practice to try and understand the behavior of a method on a class and write a spec for it. Take a look at the files in spec/alexandria for examples.

When a project has good test coverage it’s possible to work according to a very fast “red-to-green” development cycle. Autotest is a tool that will run ‘rake spec’ every time you change a file that’s being monitored. This is great because, again if the test suite is good, you can know the second you break the code! It’s even better if you use desktop notifications with Autotest. This is the version I use with Ubuntu Gutsy. One note: the file he links to is only good for Gentoo, you want this one.

That’s all for now. I’ll do another one tomorrow.

What’s this?

I just got this at the top of a search for “ruby rake” on Google.

Ruby — Rake: 4
According to http://jimweirich.umlcoop.net/index.cgi/Tech/Ruby - More sources »

The url under “More sources” goes here. All I can figure is that this is some kind of authority thing, or like the wtf feature on Technorati. jimweirich is a 4 or something. Maybe this is nothing, or maybe this is the beginning of semantic categorization on Google!!! ??? Why is this important? Well, if you search for Martin Luther King, one of the top links goes to a white supremacist hate page. It may be that Google is moving away from its raw algorithm, which can be gamed, and toward a trustweb system. Actually, it just occurred to me that that result could be from the Google search results tagging system that is already in place. So, is this old news?

religion.

12:00 PM me: Does [your company] use whitespace or tabs?

12:01 PM Ian: you mean spaces?

everyone uses spaces.

four spaces, in fact.

It’s Guido gospel.

me: But spaces suck.

12:02 PM Ian: not even remotely.

me: I know that’s the gospel, but it doesn’t make sense.

Ian: It makes excellent sense.

Easier to deal with. Only one kind of whitespace.

me: Do Windows and Linux use different tab characters?

Ian: no.

12:03 PM me: Dude, two-space tabs.

Google uses two-space whitespace, btw.

Ian: well, nobody else does.

me: I know. It drives me crazy.

12:04 PM

Ian: I like four. Everything lines up properly.


def myfunc():
____blah

me: Eh. I use two-space tabs in Ruby, and I don’t like to change when I program in Python. Gajim uses tabs, though.

Ian: We were never told this, it’s just the general rule.

12:05 PM me: Well, it’ll break if you mix them.

Ian: I am aware.

me: That’s retarded.

Ian: Not really. It has to break.

12:06 PM me: I know, but it’s still retarded.

12:07 PM

Ian: I mean, it’s been the standard forever. Tabs are bloody annoying, since they look like spaces but aren’t.

me: But tabs are semantic! Just turn on printer’s symbols if it bothers you.

12:08 PM What’s annoying is backspacing and it goes back…one…character…at...a..time.

12:10 PM I swear, future generation will look back on this as utter madness.


6 minutes

12:17 PM Ian: well. I don’t have to do that.

Vim does tht for me.

12:18 PM me: I thought so.

Ian: it backspaces a tab at a time if appropriate, otherwise space. It’s perfectly natural.

me: Well, that’s not so bad.

Ian: but my code will always render in exactly the same way on everyone’s machine. Lines will have the same length.

12:19 PM if it’s 79 chars, it won’t wrap on somebody else’s editor who has their tabs set to 8 or something

me: I’m right, though. But it is utter gibbering insanity.

What is sacred in web pages is verboten in code. This is ridiculous to me.

12:20 PM Ian: what is sacred in web pages?

whitespace is ignored.

me: Tab means indent!

Ian: tab doesn’t mean a damn thing in a web page

me: User sets the indent!

I know. Using space is like using <br /> in webpages.

12:21 PM You’re trying to control display.

And you call it a virtue.

Ian: well, yeah. html isn’t for content.

me: Madness.

Ian: indentation is set in CSS

me: Yes!

That’s my point.

12:22 PM Tab means <indent />

Ian: But it doesn’t.

In a web page, “beginning of paragraph” means <indent/>

12:23 PM there’s no tabbing.

You can’t artificially insert a tab character.

me: If someone said, don’t use <p>, use <br />, some users change the margins on paragraphs, you’d say he was an idiot.

Ian: You can’t double-tab.

no users change the margins on paragraphs. My own CSS does.

me: I understand. I’m saying tab means indent, a semantic element. It means level of scope in Python.

12:24 PM Ian: but it doesn’t. whitespace means level of scope.

me: But if they wanted to, they could. Then it wouldn’t display properly. Best to use <br />

Ian: no, they couldn’t.

me: Ahh!!!

12:25 PM Yes, they could. They could change the default stylesheet, and make it !important.

Ian: The end user doesn’t control the display of a web page, except for text size.

me: Ugh.

They have a degenerative sight disorder that requires the paragraphs to be widely spaced.

12:27 PM I’m saying the principle that is sacred in web pages is considered a liability in code, and only really in Python and shell scripts, because indentation is just for looks in C++, Ruby, Java, etc.

Ian: But no one will ever do that. I don’t understand how this is at all relevant. Code display has nothing to do with layout. The goal is to do it the same way as everybody else.

and that sacred principle is…?

i still don’t get it.

Since there are no tab characters in web pages.

12:28 PM me: Let the user determine presentation. That’s the principle. If they want to apply another stylesheet that makes your page look stupid, so be it.

Ian: But that isn’t a sacred principle in web pages.

me: Yes it is.

It’s why we don’t use tables and <br /> for everything. It’s why we don’t compose web pages in Word.

12:29 PM Ian: No, it isn’t.

We don’t do it that way because it’s extremely limited.

And it won’t display the way /we/ want it to.

me: Dude, wtf? Use flash if you want to control display.

12:30 PM Ian: But that’s totally wrong! That’s warped!

me: I understand that the user usually views a page the way you want him to.

But he doesn’t have to.

Ian: Always. Unless they’re hacking it.

In which case I don’t care.

12:31 PM Build the page to deal with big text and small viewports, but otherwise whatever.

me: What are you talking about? They can view a page in Lynx, or with a screen reader, or using a Greasemonkey script, or whatever.

12:32 PM

Ian: There aren’t other variations, except for the extreme outliers where people hack your CSS.

me: If it’s important to have code displayed with a certain size tab, you could include a hint at the top.

Ian: People using greasemonkey scripts know the page will be fucked up. Lynx doesn’t apply, since it strips CSS. Screen readers are a completely different thing.

12:33 PM me: I am horrified.

Ian: I dunno where you get this insane idea.

me: I don’t know why you’re fighting me on this. The whitespace thing, sure. But not this principle.

12:34 PM Ian: You can’t account for all users. Especially not if they are making up their own CSS.

me: <br />This is a paragraph.<br /> See, it’s better? Works every time, no matter what the user does.

Ian: It’s impossible to predict that.

Except you can’t do anything. That’s idiotic.

12:35 PM me: Yes, because it’s attempting to define display with markup.

Ian: but <p> tags aren’t for the benefit of the user

they are boxes with default CSS that you, the designer, change.

12:36 PM They’re roughly semantic, but you don’t use them wherever you have text.

me: Okay, I get you.

But a screen reader would use the paragraphs to know where to pause, for example.

Ian: They certainly don’t mean “paragraph,” and they’re only indented if you explicitly set text-indent.

If it’s a screen reader, you have a different style sheet

12:37 PM me: Yes!

Ian: and you use pause-before:blah

in the CSS

me: Do you define a css audio stylesheet for your pages?

Ian: Hell no.

me: So they use the default settings.

Ian: Certainly not for [my company].

12:38 PM me: It’s whatever they want.

And you can override stylesheet settings with !important.

Ian: Also it strips out all layout, so it’s irrelevant.

me: Huh? That’s layout. It doesn’t read them in any order.

12:39 PM Ian: WHO can?

The blind greasemonkey users?

me: Yes.

Ian: I will never, ever design a page for a blind greasemonkey user.

me: Argh.

Please see the analogy.

12:40 PM

Ian: I see what you’re getting at, but I think you’re totally wrong.

The user /can/ define presentation, but only by /breaking/ the original code and rewriting it.

Or using an application that discards certain things, like a screen reader.

me: “As god is my witness, I will never allow another programmer to view my code at anything but four spaces to an indent level. I would rather die.”

Ian: Or lynx.

12:41 PM So if you really want to, you can, before editing any code, translate all spaces into tabs, then do your editing, then retranslate and save.

That is roughly comparable.

It’s a simple greasemonkey script.

me: You’re saying it’s something freaky, because it’s rare. But it’s just rare. It’s something that’s built in to html.

12:42 PM

Ian: if you just have to have your indentation be a certain width, you can. But who the hell cares? The end user of code is the computer.

You make it useful for future coders, of course

me: You do know that all the CSS Zen Garden sheets refer to the same page, right?

Ian: Make it readable and whatnot

Yes.

It’s a basic HTML structure.

12:44 PM divs with some ps and uls

me: Anyways, I can’t change the whitespace to tabs. People would yell at me.

Ian: Well, then you change it back, before saving.

me: Whywhywhy?

Ian: Because code isn’t written for you.

It’s written for everyone.

12:45 PM I take that back: it isn’t written for anyone.

it’s written to be run.

You make it readable, not pretty

more to the point: you make it /editable/

12:46 PM (which web pages aren’t)

me: What’s so bad about tabs??? They only occur at the beginning of the line. If there’s one, it means one level of indent, two two levels, etc.

If the user chooses to view them at 4 spaces per tab, they display like that, if 2, then that.

12:47 PM

Ian: Nothing in particular, except it’s a whole nother character to deal with. If “whitespace=space” it’s easier.

From a coding perspective.

I don’t have to wonder if there are tabs anywhere, because they’re all spaces.

12:48 PM me: The thing is, it doesn’t even matter in Ruby! I can write the whole script without any beginning of line spaces at all! It’s only Python that cares! And Guido bases it on the C++ coding standard, where it also doesn’t matter!

Ian: If I want to indent only one space, I can.

If I want to line up my dictionary values, I can.

12:49 PM me: In Gedit, tabs are arrows and spaces are dots.

Ian: If you turn that shit on. But most people don’t. Most people use emacs and vim.

me: Well, okay, there’s something.

12:50 PM Things only get out of whack if you mix tabs and spaces, it’s true.

Ian: mostly it’s just annoying to have arrows and dots scattered throughout your code.

me: It makes it clear for me.

12:51 PM I don’t understand why “knowing if whitespace is a tab or a space” is more important than knowing that you haven’t accidentally backspaced and set a line to three space indent instead of four.

That happens all the time.

12:52 PM Ian: that never happens.

I have autoindentation on.

me: It’s happened to me. It’s happened in code that I’ve downloaded.

Ian: Then someone wrote it poorly.

12:53 PM That happened to me when I used gedit, which is a stupid application.

or notepad or something.

me: All this effort for a marginal problem of “knowing whether a character is a whitespace or tab” when it introduces another marginal problem.

Ian: But there are no problems.

12:54 PM My code is always clean, no matter who looks at it.

me: Just like there were no problems with the five year plans!

Umm.

12:55 PM Ian: Unless they have their line width set to something short. But then they would be an iiot.

idiot

me: Google uses two spaces! Four spaces is too much!

12:56 PM Ian: Google uses two spaces because fewer spaces translate into less downloaded.

me: Let the programmer decide!

Ian: Any web programmer worth his or her salt packs their code before uploading.

12:57 PM me: No, because code shouldn’t be nested beyond more than two or three levels anyway.

So it should be easy enough to read at two spaces.

Ian: ?

12:58 PM I mean, yeah, code rarely gets that deep

Except not really, when you have vars inside functions inside functions inside classes.

me: Most Ruby code uses two spaces and it’s easy to read.

12:59 PM About four or five levels.

Ian: Well, if Ruby takes over the world, perhaps other people will do it that way.