Tuesday, December 7, 2010

Thou Shalt Not Lie: git rebase, amend, squash, and other lies

As I've used git more, and used more advanced features, my opinions about the merit and uses of certain features have changed. At work we're constantly re-evaluating our git workflow in light of the problems we encounter with the way we're doing things. I may yet re-evaluate these opinions, but rest assured they come not from some theoretical nit-pick, but from real experience.

Rewriting History is a Lie

Using git to rewrite history is a sin. It's called lying. Don't do it. Lying with git comes in several forms, but usually you know you're lying if you break the functionality of `git branch --contains some-branch`, `git blame`, `git bisect`, or if you have to use `git push -f origin some-branch`.

Squash merging is a lie

A squash merge (`git merge --squash some-branch`) takes all the commits from a topic branch, combines them into a single commit, and applies that commit. The history now looks as if you had a flash of brilliance, made a ton of changes, and did everything right the first time. It makes you look good, but it's a lie.

Since you have not actually merged any of the work you did on the topic branch, but instead have merged another totally new commit, `git branch --contains some-branch` cannot tell you that you have merged your branch anywhere. When a QA person or your boss says, "Hey, is some-feature {merged into QA, deployed}?" you have to resort to `git log` spelunking.
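To see why the squash breaks the containment check, here is a toy model in Python. It is an illustration only, not how git is implemented, and the commit ids (`base`, `t1`, `t2`, `m`, `s`) are made up. It treats history as a map from each commit to its parents, and treats "this branch contains that commit" as "that commit is reachable from the branch tip."

```python
# Toy model of a commit graph: each commit id maps to its list of parent ids.
# Illustration only; this is not how git stores or walks history.

def is_ancestor(parents, commit, tip):
    """Return True if `commit` is reachable from `tip` by following parents."""
    stack, seen = [tip], set()
    while stack:
        current = stack.pop()
        if current == commit:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False

# Shared history: base <- t1 <- t2 is the topic branch (hypothetical ids).
history = {"base": [], "t1": ["base"], "t2": ["t1"]}

# A real merge commit records the topic tip as its second parent.
merged = dict(history, m=["base", "t2"])

# A squash merge makes a brand-new commit whose only parent is the old tip.
squashed = dict(history, s=["base"])

topic_tip = "t2"
real_merge_contains = is_ancestor(merged, topic_tip, "m")
squash_contains = is_ancestor(squashed, topic_tip, "s")
print(real_merge_contains)  # True
print(squash_contains)      # False
```

In the squashed history nothing points back at `t2`, so no amount of walking from `s` will ever find the topic branch.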

Also, anytime you are creating a new commit with the same changes as another commit, you are destroying `git blame`'s ability to tell you who to flog publicly. And as we all know, public floggings are the lifeblood of software development teams.

Finally, when you squash merge, the totally awesome rerere feature will not help you re-resolve conflicts. Instead you will have to re-resolve conflicts when you cherry-pick the squash commit, or when you squash merge your topic branch to the UA branch or to master. Why? Because rerere depends on being able to detect that you are resolving conflicts between the same SHAs as before, but every time you squash merge it creates a totally new commit. It creates a totally new commit even if you squash merge, `git reset --hard HEAD^`, and then immediately squash merge again.

Rebasing is a lie

Rebasing using `git rebase foo` allows you to rebase your topic branch on foo, instead of whatever it was based on before. This makes it look like you were working from foo the whole time. However, each commit to your topic branch was birthed in a context and by a sequence of events that was unique to that time and that topic branch. You are yanking those commits out of their context and putting them into a totally new context.

Rebasing is effectively a retroactive merge. It is pretending that you "merged" foo into your topic branch 3 days ago, but you didn't. You merged it in today, and you are lying to everyone.

In fact, the commits from your topic branch don't exist anymore, because every single rebased commit gets remade into a totally new commit with a totally different SHA. Pick any one of those rebased commits: Does the commit message make sense anymore? Does the code compile? Do the tests pass?

The answer to these last two questions will tell you whether you can use the totally awesome bisect feature of git. This feature does a binary search through your history to help you discover exactly where some bug or problem was introduced into your code. If you cannot compile and run tests on every commit, then you cannot use `git bisect`, which is a shame.
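A rough sketch of the idea behind bisecting, in Python. This is a simplified binary search, not git's actual algorithm, and the commit names `c0` to `c15` and the broken commit `c9` are invented for the example. It assumes the first commit is good, the last is bad, and the bug stays once introduced.

```python
def first_bad_commit(commits, is_bad):
    """Binary search for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and that is_bad can be
    evaluated for every commit in between.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], steps

commits = [f"c{i}" for i in range(16)]

def is_bad(commit):
    # Hypothetical test run: the bug appears in c11 and stays afterwards.
    return int(commit[1:]) >= 11

culprit, steps = first_bad_commit(commits, is_bad)
print(culprit, steps)  # c11 4

def is_bad_with_broken_build(commit):
    # Hypothetical: c9 does not compile, so there is no test result for it.
    if commit == "c9":
        raise RuntimeError(f"{commit} does not build, so it cannot be tested")
    return is_bad(commit)

try:
    first_bad_commit(commits, is_bad_with_broken_build)
    search_completed = True
except RuntimeError as error:
    search_completed = False
    print(error)
```

Sixteen commits take only four test runs when every commit can be tested. The second search lands on the commit that does not build and has no answer to go on.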

Being able to compile and run tests for each commit is also useful for resolving conflicts. Say I'm trying to decide whether your change will work with my code, or my change with your code, or maybe some totally different code that "resolves" the conflict. How can I know "resolved" means resolved? The most basic thing I can do (though it's not foolproof) is compile the code and run the tests, but if your code doesn't even compile or pass the tests by itself, then I can't intelligently resolve the conflict. Your lie has hurt me and the rest of the team.

Finally, if for some reason you wanted to cherry-pick a commit, but it doesn't compile or run the tests, then cherry-picking it into another branch would break that branch. Thanks for that! Your lie has now become destruction of property, which is a misdemeanor.

Amending a commit is a lie

Amending a commit (`git commit --amend`) to add changes, or to change the commit message is a lie. It breaks `git branch --contains some-branch`. It forces you to do a `git push -f origin some-branch`. It's a sin. See above arguments, enough said.
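The reason amending (and rebasing) always produces a new commit is that git derives a commit's id by hashing the commit's contents, which include its parent ids and its message. The Python below is a deliberately simplified stand-in for that: the `commit_id` helper, the payload layout, and the ids `aaa111` and `bbb222` are all made up for illustration and do not match git's real object format.

```python
import hashlib

def commit_id(parent_id, snapshot, message):
    """Toy commit id: a hash over the parent id, the content, and the message."""
    payload = f"parent {parent_id}\n{snapshot}\n{message}".encode("utf-8")
    return hashlib.sha1(payload).hexdigest()

# The same change, committed once, then amended, then rebased (hypothetical).
original = commit_id("aaa111", "foo.clj v2", "Fix foo")
amended = commit_id("aaa111", "foo.clj v2", "Fix foo properly")  # new message
rebased = commit_id("bbb222", "foo.clj v2", "Fix foo")           # new parent

# Every commit built on top of a rewritten commit changes as well.
child_of_original = commit_id(original, "bar.clj v2", "Fix bar")
child_of_rebased = commit_id(rebased, "bar.clj v2", "Fix bar")

print(original != amended)                    # True
print(original != rebased)                    # True
print(child_of_original != child_of_rebased)  # True
```

Because each id depends on its parent's id, rewriting one commit gives new ids to everything after it, which is why the old branch tip no longer appears in anyone else's history.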

Selective, retroactive commits are lies

In this last form of lying git is not an accomplice, because you are lying to git. What happens here is that you have been working for a while (20 minutes, 30 minutes, an hour, whatever), and you decide to commit your work, but instead of committing all of your work as you have come to it naturally, you decide to break your work up into several small, "logical" commits. This makes you look good, but it's a lie.

You've changed both "foo.clj" and "bar.clj" and you think there is no dependency between them, so you commit "foo.clj" separately from "bar.clj". The problem is that you don't really know this unless you compile the code and run the tests. Since you don't know for sure that these selective commits can compile and pass the tests, you've broken `git bisect`, undermined trust in conflict resolution, and set me up to break my stuff if I cherry-pick your commit.

You may think that your commits are "logical", but I would say if you are not committing your work the way that you come to it naturally, then it is not logical. Sure, you started solving problem A and then discovered problem B. If they are related and you have to solve problem B in order to solve A, then commit the changes together. They are related, so it's logical to commit them at the same time.

If you want to solve them separately, then make a WIP commit or a stash, create another branch and solve B, then come back to your original branch and rebase it on B (gasp, see below for explanations of this inconsistency!).

Conclusion

There are several awesome features of git: blame, branch --contains, rerere, and bisect. These features are borked if you lie. If you're going to lie, you may as well use SVN, since these awesome features don't exist there anyway. But don't do that! Let's just all agree not to lie.

Epilogue

Some people may get the impression that I do not think you should ever use `git rebase`, `git commit --amend`, or `git merge --squash`, but that would not be true. These are powerful tools that may be necessary sometimes, but should be used sparingly and judiciously. Perhaps it's OK to lie sometimes. If someone comes up to you and says, "How does this make me look?" you choose your words carefully.

One case for history rewriting is integration branches. At work we have integration branches for qa, ua, etc. We have a workflow where we move a single feature through the pipeline kanban style, so we have to be able to merge a feature to an integration branch, rebase it out, etc. In this case no one has an expectation that they should be able to see a clean history for these branches, and no one expects to base their work on one of them.

Other than integration branches, a good rule of thumb is that you should not rewrite history for things that are already pushed out into the world. This would limit these rewriting tools to local use, to "fix" things up before pushing them. I still cannot whole-heartedly advise use of rewriting in this case, but if you need to, and the rewriting you're doing is limited enough that you're sure the new commits will still compile and pass tests, then fine. Just do it carefully, and don't make it habitual. The End.

Saturday, July 3, 2010

The game of Risk for children

We had family game night last night, and my 4yo son wanted to play Risk. He likes the little army men, and before we had just played with the army men on the board, but I decided to invent a simplified game for him to play. I just made it up as we went along, but I was pleased with how it turned out.

Each person takes a turn drawing a card from the pile. If a person draws a wild card, then they can place their army on any empty country, or if there are no empty countries, they can pick any inhabited country and do "battle." During a battle each player rolls one die, and the player with the higher number wins (in the case of a tie you roll again). The winner either keeps the country, or kicks out whoever was inhabiting the country.
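For the curious, the battle rule is small enough to sketch in Python. This is just one way to write it down; the `battle` function, the player names, and the seed are invented for the example.

```python
import random

def battle(attacker, defender, rng):
    """Each player rolls one die; the higher roll wins, and ties are re-rolled."""
    while True:
        attack_roll = rng.randint(1, 6)
        defend_roll = rng.randint(1, 6)
        if attack_roll != defend_roll:
            return attacker if attack_roll > defend_roll else defender

rng = random.Random(2010)  # fixed seed so repeated runs give the same result
rounds = 10_000
wins = sum(battle("me", "son", rng) == "me" for _ in range(rounds))
win_rate = wins / rounds
print(win_rate)
```

Since both players roll the same die and ties are simply re-rolled, neither side has an edge, so the win rate should come out close to one half.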

When a player picks a country card, if the country on the card is empty, then they place an army on the country. If the country is not empty they do battle.

Each player keeps the cards they draw, and during their turn if they have three of a kind, then they get to place an extra army on any empty country they choose, or if there are no empty countries, then they can choose an inhabited country and do battle.

The game ends when the pile of cards is exhausted, and there is an army on every country. My thought was that at the end we would count up the number of countries that each player had, and the one with the most would win. We didn't get to do that last night, though, because my son wanted to drive a cannon through all the countries on the board making all the armies fly everywhere (who can blame him!).

I don't think there is any strategy to this game. You are basically subject to the randomness of the cards you draw, and to the dice in the case of battle. You could assign extra points for holding an entire continent at the end of the game, but I'm not sure there are enough opportunities to choose on which country you'd like to place an army.

So there may be room for improvement. However, it was surprisingly entertaining, and it kept the attention of my son. :)

Thursday, June 10, 2010

The Flerb Paradox

Chas Emerick has ranted against Emacs, er, conducted a survey of Clojure usage. That's right, in the middle of a friggin survey of Clojure usage he took the time to bash Emacs, and I'm kinda fed up with Chas' Emacs hate. I'm a heavy Emacs user, and I think it's the bee's knees, and I would like to correct some errors in Chas' article.

Fact #1: Learning Emacs is not necessary for learning Clojure

No one has ever said it is. It's a flat-out false statement. Chas does not use Emacs to write Clojure. Neither do 30% of Clojure users according to his survey results, and when you look at the comments on Brian Carper's article, you see many people saying they gave up trying to learn Emacs, and went with another editor.

Fact #2: No one is pushing Emacs

The clojure.org homepage does not say it is a syntax error to write Clojure code in an editor other than Emacs. In fact, I remember a time when I almost unsubscribed from the Clojure mailing list because there was so much talk about VimClojure (not that I hate vim, it was just noise to me). Contrary to Chas' belief, people are not running around thumping their Emacs. Chas probably just sees Emacs mentioned with Clojure in blogs and tweets and IRC and gets the impression that people are being bigots, but as revealed by his survey there is a simple explanation for this: 70% of us are using Emacs. In fact, if anyone is obsessed with Emacs it's Chas; he seems incapable of talking about Clojure IDEs (or perhaps just Clojure) without ranting against Emacs.

Fact #3: There are alternatives

Frankly, if I were the developer of Counterclockwise, Enclojure, La Clojure, or VimClojure (that's right, four other, non-Emacs editing environments for Clojure!), I would probably be hurt by Chas' comments. He looks at Emacs, which he finds too difficult to be worth his time, and concludes that there is no decent editing environment for Clojure. He even goes so far as to offer to pay someone to develop a green-field environment for him.

Fact #4: People hate Clojure's syntax more than any particular editor

Clojure's Lisp syntax and abundance of parentheses are mentioned as a weakness/blind spot in the survey results just as many times as IDE complaints, if not more. Many people think syntax is scaring off newbies more often than not. I think that is a natural reaction. I too was put off by the parentheses at first, but once Lisp clicks for you, you realize that the parentheses are necessary for its power. Someone who puts off learning Clojure because its syntax looks weird and different is missing out on the power that is enabled by that "weirdness." You have to be willing to try something different, completely different, and go through some pain to gain this incredible power.

The Flerb Paradox

That leads me to my last point. I'm not sure how to say this without it coming across as me being a jerk. Let me just say that I'm not out to attack Chas or anyone else personally. I don't think Chas is an idiot. I don't hate him. I just wish he'd stop riding his anti-Emacs hobby horse. However, what I'm about to say is controversial: editing environments vary in their productivity enhancement. This is a corollary to the Blub Paradox that I call the Flerb Paradox.

Editing environments lie along a continuum of productivity enhancement from a plain text editor on up. The Flerb editing environment is a fictional environment that is somewhere in the middle of the continuum. A user of the Flerb editing environment looks down one end of the continuum at Notepad and balks at its lack of feature x (syntax highlighting, code completion, etc.). He knows he is looking down the continuum and his editor makes him more productive than those below it. Otherwise, why wouldn't he just use Notepad?

However, when he looks up the power continuum at Emacs he doesn't realize he's looking up the continuum. He only sees something strange and confusing. He doesn't understand why anyone would want to use something like Emacs, because Emacs can do everything Flerb can do, but it has all this hairy stuff thrown in for no reason. It has different key bindings and Meta keys and buffers! It seems to be designed to confuse its users. This is the Flerb paradox.

Now, am I saying that Emacs is the best editor that could possibly be invented? No. What I'm saying is that it's the most productive editor that is available. There's still room for other editing environments. Perhaps one of them will turn out to be more productive than Emacs. There are plenty of people out there who would rather use something other than Emacs, and I'm fine with that. I think they won't be as productive as possible, but that's their choice.

Learning Emacs was tough. It was painful. And just like most tough things in life, it was worth it. But this is coming from someone who switched to the Dvorak keyboard layout, and taught himself to use the mouse with his left hand. I like to do things for the challenge, because it keeps me sharp. I fear the day I become dull and complacent.

So, while I said Clojure developers aren't Emacs bigots, I've turned into one, and it's Chas' fault :-), just like superheroes create supervillains that create superheroes. If Chas hadn't railed against Emacs so much I wouldn't have turned into the supervillain I am today. >;-)

Tuesday, March 16, 2010

Hiring a Rails Lead

My company (Sonian) is hiring for a Rails Lead position. This is a remote, work-from-home position, and it's an awesome situation...I love it! It is a team of excellent developers who are fun to work with.

Preferably an applicant:

  • has been a team lead before
  • has worked on a distributed team before
  • has worked in a pair programming environment before

This is a full-time position, for US-based individuals only, at a well-funded startup that develops and sells an e-mail archiving solution with several paying customers. In addition to Rails, we're using modern tech like Git, Chef, Amazon AWS, Clojure, and distributed file systems. We're indexing terabytes of data for searching. It's a cool and serious application!

If you are interested in applying, please send a resume to jobs@sonian.com. If you have any other specific questions about Sonian or the position, feel free to contact me at paul@stadig.name or call me at 703-634-9339.

Wednesday, February 3, 2010

Eva Cassidy

I'm probably late to the game with this one, since most of her recordings date from the late 1990s, and her album skyrocketed to number 1 in England around 2000, but I have just discovered an amazing musician.

Eva Cassidy grew up just outside Washington DC in Bowie, MD. She struggled with being extremely shy, and didn't like to be in the spotlight. She sang as a background singer for other musicians, and occasionally played at Blues Alley in DC. She recorded demos that she sent to labels, but her talents were too wide-ranging. Record labels had a hard time imagining how to sell an artist who excelled at jazz, folk, pop/rock, R&B, and gospel.

In 1996, after some success collaborating with Chuck Brown on "The Other Side" and releasing a live album from some gigs she did at Blues Alley, she moved to Annapolis, MD and took a job painting murals at elementary schools. The summer of 1996 she was experiencing hip pain that she assumed was from using the stepladder at work. X-rays revealed she had cancer throughout her body. A few short months later, at the age of 33, she died.

Eva had a beautiful, clear voice, and for being extremely shy and afraid of the spotlight, she poured herself and her passion into her performances. Just search YouTube for videos, watch them, and you'll see what an amazing talent she had.

Her fame and success have come posthumously. When a DJ in England played her song on the radio, her album became a smash hit, and more and more people have been discovering her music. Perhaps part of the attention her music has gotten is because of her tragic story, but I believe her music stands on its own, and I'm glad to have discovered it.