Tag Archives: code

Parentheses in Ruby

I’ve been doing a lot of “heart” and “mind” posts lately, and “code” is starting to feel neglected. If you’re not a programmer, feel free to give this one a miss.

I put off writing this post til the last minute because I want to talk about parentheses in Ruby, and I’d like to give you a definitive “do this” answer, but the reality is I just don’t know, so I’m gonna throw out some thoughts and let you guys tell me what you think.

Ruby Makes Parentheses Optional

Okay, that’s not news. But a lot of people come to Ruby from languages where parens are required, and they see that they’re optional in Ruby, and so they keep them in out of habit. And then they complain that Ruby is dumb for not having first-class functions.

What is often overlooked here is that optional parens are important. Matz chose to sacrifice first-class functions just so he could make parentheses optional. So when I started programming in Ruby, I made an effort to eschew them when possible. Now, “optional” means optional–it doesn’t mean banned and it doesn’t mean required. So there’s room for wiggle here.
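To make that tradeoff concrete, here is a minimal sketch (the method and variable names are mine, purely for illustration). Because a bare method name is already a call, you can't hand the method around just by naming it; you have to ask for a Method object or reach for a lambda instead.

```ruby
def greeting
  "hello"
end

# A bare method name is a call, not a reference to the method.
result = greeting

# To pass the method around, ask for a Method object explicitly.
handle = method(:greeting)
via_handle = handle.call

# Lambdas are the usual stand-in for first-class functions.
shout = ->(text) { text.upcase }
shouted = shout.call(greeting)
```

In a language with required parens, `greeting` alone could mean "the function itself". In Ruby it always means "call it".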

Some Strong Styles Have Emerged

There are some very strong idioms that have emerged, however. Well, mainly just one: Never use empty parens. This is actually two rules: never put parens at the end of a method definition if the method takes no arguments, and never use parens when sending a message that takes no arguments. There are a lot of really good reasons for doing this, and I won’t go into them here.
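A quick sketch of both halves of that rule, using a made-up Account class:

```ruby
class Account
  def initialize(balance)
    @balance = balance
  end

  # No parens on a definition that takes no arguments...
  def overdrawn?
    @balance.negative?
  end
end

account = Account.new(-5)

# ...and none when sending a message that takes no arguments.
status = account.overdrawn?
```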

Generally Favoring Parens

I know some smart programmers who favor parens in ruby. Their general argument is simply “it makes the code more readable”.

My experience has been that this argument is indistinguishable from “it makes the code look like the other languages I’ve spent years learning to read”. This is not necessarily wrong. I think both styles have tradeoffs, and there’s a mental cost to be paid to learn to read a new style. Since this cost has to be paid up front, the tradeoff is really imbalanced in the short-term. But is this a false economy in the long run? I’m asking because I don’t know.

Generally Eschewing Parens

On the far other side of the spectrum lies “Seattle.rb style”. Josh Susser originally coined the term “Seattle style” to mean “not using parens in method definitions”, but Ryan Davis and Seattle.rb have taken it and run with it to mean “never use parentheses unless the compiler requires it”.

I spent a couple weeks experimenting with this style. It definitely had an effect on my code. I never did get to where I like omitting parens from method definitions, and maybe I am subject to my own argument above about embracing the tradeoff. One thing I did find, however, was that trying to avoid parens everywhere else had a profoundly rewarding effect on how I felt about my code. The original argument from Matz about making parens optional was that parens are often just noise, and getting rid of this noise is important.

Parens and Readability

When I talked about this with the rest of the Ruby Rogues, I quickly found myself in a 3-against-1 battle (Katrina wasn’t available). Josh made the argument that omitting parens decreases readability. You have to read this entire line of code to know when the first method call is actually done:

puts array.delete hash.fetch :foo

And I have to agree. That line of code is horrible.

But… it’s not the lack of parens that makes it horrible. That line of code is horrible all by itself. I don’t think this is really any better:

puts array.delete(hash.fetch(:foo))

(Another strong convention in Ruby is to never use parens with calls to puts; otherwise another layer of parens could be added, but they would be gratuitous.)

Now, some programmers will find that line of code more readable. My point is that this is a problem. The line looks a bit more comforting, but you still actually have to read the whole line to know what’s happening. This line is doing two manipulations on unrelated primitives followed by a side effect call to puts. This line desperately needs some intention-revealing variables, methods, and selectors.
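Here is one way that refactoring might go, staying paren-free. The waiting list and settings hash are hypothetical stand-ins for the anonymous array and hash in the original line, so treat this as a sketch of the idea rather than the one true fix:

```ruby
# Hypothetical data standing in for the original `array` and `hash`.
waiting_list = ["alice", "bob", "carol"]
settings = { next_up: "bob" }

# Each step gets a name that says why it exists.
next_customer = settings.fetch :next_up
served = waiting_list.delete next_customer

puts served  # prints "bob"
```

Once every line does one thing, there is nothing left for parens to disambiguate.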

I see code like this all the time in parenful code. It sort of gets a pass with the parens stuck on. We say “yeah, this could be refactored, but it’s readable enough for now.”

But scroll back up to the version without parens. That line is unforgivable. It has to go.

How valuable is that? I’m asking; I feel like it’s a lot, but I had to spend a few weeks learning to read a whole new style before I got that value. So I don’t know the answer. I don’t know if it’s worth it. What I do know is that by eschewing parens, I never ever write lines like the above. The code just won’t let me. It bugs me, and I refactor it, and then my code feels better to look at, and the whole program becomes much more readable. That’s immensely valuable to me. But I don’t know how to communicate that it’s worth the trouble–or even if it’s worth it at the end of the day.

Readability For Other Developers

Another good question is if one developer hasn’t learned this style, and another one has, should parens be the default to maximize readability? I grind my teeth whenever people start making “lowest common denominator” arguments because it’s really hard sometimes to see the line between “this is overcomplicated” and “let’s do something more stupid because it’s easier”.

So part of the experiment this week has been paying attention to how my coworkers react to my code. It’s hardly been a scientifically rigorous study, but given a sample size of one week and two code reviews, I have one data point. It comes from a coworker who strongly favors parentheses, but has been tolerant of my style. His exact words were:

“Your code is a pleasure to read.”

I am confident that he was not talking about my parenthesis usage, but rather about the clean, refactored, expressive code that resulted from me not liking the way some lines of code looked without parens stuck on.

It’s hardly a conclusive argument, but it’s one that encourages me to continue researching. I’m not ready to stand up and proclaim that parentheses are nothing but noise that covers up code smells, but I am giving serious thought to secretly believing it. 🙂

Weirich-Style Kung Fu

A hybrid style is emerging from these discussions, and it’s structurally based on “The Weirich Rule”, which is about blocks rather than parentheses, but the similarities are definitely there and I find the analogy appealing. In Ruby you can write blocks with braces or with do…end; the Weirich Rule states that if the block returns a value, using braces signals that intent to the reader. If the block exists for its side effects, using do…end signals this intent.
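A small sketch of the Weirich Rule in action (the data is made up):

```ruby
prices = [5, 12, 8]

# Braces: the value the block returns is the whole point.
doubled = prices.map { |price| price * 2 }

# do...end: the block runs for its side effect on `log`.
log = []
prices.each do |price|
  log << "saw #{price}"
end
```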

A Weirich-inspired rule for parentheses, then, looks like this:

If you care about the return value, send the message using parens.
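As I understand the rule, it plays out something like this. The inventory and audit names are invented for the example:

```ruby
inventory = { apples: 3 }
audit = []

# The return value matters, so the message is sent with parens.
count = inventory.fetch(:apples)

# These are sent only for their effect, so no parens.
audit.push "counted apples"
inventory.store :pears, 7
```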

The people I have talked to–well, listened to–about this rule are passionate about it and convinced that it increases readability and communicates intent. I want to get behind this rule, I really do. It’s a clear, bright-line rule for when to use parentheses or not. It’s the sort of rule that might get included in a book about ruby, for example.

But…

But I still have a problem with this style: when I went full-on no-parens mode, I was forced by the ugliness of my code to refactor it to be simpler and cleaner. When I use this paren style, I can feel that pressure evaporate. I’ve tried this rule for a week, and I can feel my code quality suffering. I’m not convinced it’s a good rule.

Maybe I just need more engineering discipline. On the other hand, however, I have found that coding styles that require more discipline tend to embarrass me when I don’t step up. I gravitate much more strongly to coding styles that encourage and support me in having more engineering discipline, because they result in me writing better code.

But that brings me back to square one, which is trying to convince everyone to try giving up parentheses all over the place, and I don’t know if I have enough tin foil hats.

So that’s where I’m at. Thoughts?

Teach Yourself a New Programming Language in 21 Minutes (Or 2-3 Years, It Depends)

You’re sitting at work, grinding out a bug in the legacy system, when your boss comes in and tells the team that you finally get the chance to rewrite the whole system–and even better, you get to do it in Clojure! (Or Scala or Erlang or Rust or Dart or some other Language You Only Know A Little About But Have Secretly Wanted To Learn For A While Now.)

Or maybe you’re happy with the language you’re using, but your VP of Software Architecture just spent $150,000 on a suite of Enterprise Tools which includes a module that will let your project scale infinitely into the cloud… all you have to do is learn Clojure. (Or Scala or Erlang or Rust or Dart or some other Language You’ve Only Heard a Few Mutterings About But Desperately Want To Avoid Learning.)

Either way, you’ve got a problem: you need to ramp up in this new language. And whether you want to become a super expert guru ninja rockstar in the language, or just learn enough of it to make it go away, you want to do it fast–and that means you want to avoid making the same mistakes I made learning Ruby and JavaScript over the years. Mistakes which I have learned to fix, mind you, and so without further ado I present:

Teach Yourself A New Programming Language In 21 Minutes (Or 2-3 Years, It Depends)

All you need to do to learn a new language is learn:

  • How the language encapsulates data
  • When the language invokes execution of code
  • Where do the semicolons and braces go

I’m not entirely kidding. This was my strategy for a decade and to this day if I need to get something bashed out quickly in a new language I’ll skim a language reference and let the compiler tell me when I make syntax errors.

In general, as long as you’re staying largely inside the world of what I call “BOLS” (Block-Oriented, Lexically-Scoped) Languages, such as C, C++, VB, Java, C#, PHP, Lua, Python, Ruby or Perl, you can in fact learn enough pidgin to get by very very quickly with this method. Granted, in those last three languages it will be obvious to experienced programmers that you’re writing inelegant code. But you can get by, is what I’m saying.

If you’re the second kind of programmer I mentioned, you might be done. Just read this next section for a caveat and then you can hopefully stop there.

Don’t Stop There If You Can’t Stop There

As you’re learning the new programming language, ask yourself the two vital tradeoff questions:

  • Do I really want to learn this language bad enough to actually learn this language?
  • Can I afford the time and energy needed to fumble around being bad at this language?

See, this strategy is especially useful if you know you don’t plan to ever actually learn the language. I have written some pretty arcane bash scripts in my day, but to be honest I wrote my first bash script 10 years ago and in that time I’ve written less than 10,000 lines of bash scripting code. It’s just not worth learning to me, so I keep some files around with examples of the most obvious kinds of things I want to do, and when I need a new bash script–usually about twice a year–I have all the pieces I need right there. BAM. Ignorance is bliss, and laziness is, occasionally, brilliant time management.

But this strategy is especially awful if you know you don’t plan to ever actually learn the language, but you turn out to be wrong. It ends up that you find yourself using it on a regular basis, and hilariously, you don’t even notice this for years and years. This is true for me of elisp, the flavor of lisp used to program emacs. I’ve written elisp for years longer than I have bash, but maybe only twice as many lines of code. I find myself needing an elisp tweak on a weekly basis, and end up spending an hour researching how to do it. And two or three times a year I find a problem that I could solve elegantly in elisp, if only I knew how to express what I was thinking as lisp code. But I don’t solve the problem. I merely sigh, and learn to live without whatever cool new feature I was thinking of.

I wrote that paragraph in present tense because I still haven’t figured out that I really do need to actually learn that language. Shut up.

Sometimes work and politics can affect your decision as well. If you and your boss agree that the Next Big Thing will be written in Language Y, then you have the need and your manager has the afford.

(And sometimes these two forces are in conflict. I could write an entire blog post on the political machinations involved when you and your boss disagree on the do/don’t want or can/can’t afford questions. Skunkworking a cool language or shirking a lame one is a topic for another post, one I’ll probably never write, but basically it would be all about office politics. I’ve seen people get fired for a successful skunkworks project and others get promoted for sandbagging a project. People sure are complicated!)

Okay, NOW if you’re the second type of programmer, you can safely stop reading. If not, you’re pretty much out of luck for the “21 Minutes” part of learning a new language. But keep reading, because the 2-3 years bit isn’t until the very end. Most of the mistakes I’ve made learning a new language I have made in the first few days.

Truly Embracing A Language Takes Time, But You’re In A Hurry, So…

If you want to embrace a language, it’s going to take time. You’re going to have to internalize the language’s entire approach to solving problems. You’re going to have to learn its idiosyncrasies and its warts, and you’re going to have to learn its power moves and elegant applications. That all takes time, but if you’ll permit me to point out my favorite pitfalls and some less-traveled paths around them, I can maybe show you a trick or two for leveraging your learning.

But first, good news/bad news. I’m writing this assuming you already know a programming language or three or seven. The good news is, the more languages you know, the faster you can recognize the basic syntax patterns and logic structures of a new language. But the bad news is: the more languages you know, the more they tend to blur into a common model of computation in your head. This can make you blind to the elegant weirdnesses of your new language. Resist the urge to judge weird things quickly; they often turn out to be the most powerful features of the language once you “go native”. If you see something you can’t stand, remind yourself that you haven’t seen everything there is to see, and give the new idiom a chance. Try it out and learn its tradeoffs. It’s okay to discard a bad idea in JavaScript once you understand why it’s a bad idea in JavaScript… but it’s never a good idea to discard a feature of a new language based on your instincts–because your instincts come from other languages, not this one. (Read up on The Blub Paradox if you haven’t heard of it before.)

TL;DR This Is Mostly About Your Blind Spots

I should have put that TL;DR up at the top, but if you don’t know by now that I’m that kind of jerk, you must be new. Welcome to my blog! Anyway, here’s the list. I’ve included illustrations about Ruby, Python and JavaScript, because those are the three languages where I stunted my own growth unnecessarily the longest.

  • As you start learning the core principles of the language, listen hard for hints and clues from its culture. Ruby has block syntax, but a rubyist often cares more about naming than blocks vs. Procs. Python has list comprehensions, but a pythonista often cares more about being able to quickly uncover all the working parts than to have a slick but magical-looking API. JavaScript has objects and polymorphism, but listen to a good JavaScript programmer and you’ll find them more interested in the functions themselves–and their prototypes.
  • Be ready to try totally new ways of thinking. Be ready to abandon bottom-up provability for Ruby’s top-down “programming by wishful thinking” approach. Be ready to trade off a more efficient algorithm in Python for one that is more readable and maintainable. Be alert to the pains you’ll feel trying to write object-oriented code in JavaScript–that’s JavaScript’s prototype system refusing to be hammered completely into an OOP-based inheritance model.
  • Learn where the minefields are. Ruby is a memory hog. Python’s primitives aren’t actually objects. All numbers in JavaScript are floating-point numbers–there are no integers.
  • Learn which minefields you can ignore or work around. Entire models of webservice design have been rethought and reinvented to compensate for Ruby’s apache-unfriendly execution model. Python isn’t THAT slow to begin with, but if you really need bare-metal speed it’s easy to write a C extension. Many JavaScript libraries provide “polyfills”–bits of code for old versions of JavaScript that implement features added to newer versions of the language so that you can write code against a stable JavaScript version and still have a prayer of it working in most browsers.
  • Most importantly, learn which minefields you CAN’T ignore. Treat them like minefields that you must commute through daily. Stop and pay close attention. Map them out carefully. For example, metaprogramming in Ruby is a very dangerous feature, but it’s not a defect; it can be used responsibly. Python’s whitespace enforcement makes it difficult to express certain ideas succinctly, but that whitespace enforcement produces a predictable rhythm to the trained pythonista’s eye, and it is most definitely a feature–don’t let your code fight it; learn to restructure your thinking. And while almost everybody considers JavaScript’s semicolon insertion rules to be eccentric bordering on insane, you MUST learn them if you want to avoid having your program suddenly stop working just because you deleted a comment or swapped the load order of two completely unrelated files.
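To give the Ruby metaprogramming point above some shape, here is one sketch of what “used responsibly” might look like. The Settings class is hypothetical, and this is just one opinion about where the line sits: generate methods from a fixed, visible list rather than swallowing every unknown message with method_missing.

```ruby
class Settings
  KEYS = %i[host port].freeze

  def initialize(values)
    @values = values
  end

  # One reader per known key. Anything not in KEYS still raises
  # NoMethodError, so typos fail loudly instead of returning nil.
  KEYS.each do |key|
    define_method(key) { @values.fetch(key) }
  end
end

config = Settings.new(host: "localhost", port: 3000)
port = config.port
```

The generated methods show up in `respond_to?` and in documentation tools, which is most of what makes this flavor of metaprogramming easy to live with.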

You can make good inroads to these blind spots in a few days or weeks if you’re just aware of them. So here’s the hard part: Last of all, learn the idioms. From here on out it’s all about learning to think in the language. That’s going to take you a bit longer, and there’s nothing for it but to talk to other programmers, find some online references, maybe buy a cookbook… but mostly, it’s going to take writing code. Lots and lots of code. In my experience (both personal and observing coworkers and clients) this last part’s a doozy–plan on it taking a couple years or more.

Whaaat. You did say you really wanted to learn this language, didn’t you?

Software and Chicken Entrails

I had a collection of epiphanies today about Informational Software.

“Informational Software” is a term I use to describe software that helps you understand and make decisions about information. It is not a product and does not make your business money, but it can be used to help you understand your business and therefore, in theory, help you make money. For example, analytics software does not make you money, but you can use it to understand your traffic and hopefully to then minimize your costs and maximize your revenues. (Note that if you are selling an analytics package, this definition is still true, because in that case the software is a product, not an information tool*.)

Informational Software comes in two types: software that interprets trends and data and makes decisions for the user, and software that collects and reveals data so the user can make their own decisions. Expert systems are an example of the former, analytics packages are an example of the latter.

Let’s call this mess of data the chicken entrails. We want to read them and predict the future, right? Sure we do. That’s what chicken entrails are for.

Okay, enough definition. Here are the epiphanies:

First: If your users are untrained and the data is simple, your system can advise the user and/or make decisions for them. If they cannot read the entrails, or the entrails are too simple to bother the entrail readers, do not let/make them read the entrails.

Second: (Pay attention, this is the important bit) If your users are highly trained and the data is very complex, do not attempt to interpret the entrails for them. Just show them the entrails. Highly trained users who have asked you for information software do not want you to do their job, they want a tool to help them do their job better.

Third: My instinct is always to write software that interprets entrails, regardless of the complexity and regardless of user knowledge. Learn to stop and figure out what is really needed.

Fourth: Interpreting complex data is really, really hard. On a small, knock-it-out project, it is almost certainly doomed to fail. Now reread the 2nd epiphany: On small, knock-em-out projects, it is completely unnecessary.

Right now I’m working on software that reveals entrails to some truly arcane masters of entrail reading. They have become masters because the information system currently available to them is literally designed to protect the data from their eyes. I have found two pieces of data, correlated them, and put them into a report, and they think I’m a super genius. Not because my software is smart, but because it is smart enough to be dumb.

I really like epiphanies where I suddenly realize that it is not necessary to be doomed to failure.

* And if you’re smart enough to reason “but what if you’re using your analytics tool to analyze the sales of your analytics tool”, I congratulate you on your cleverness. Can you also see the flaw in this reasoning?

Two Questions

Recently I was talking with a friend about coaching and specifically the act of helping younger developers improve themselves. I had a sort of microepiphany when I realized that I’ve been improving myself for over two decades with the same pair of questions, originally unconsciously and only recently in my active consciousness. The next time you do something you want to get better at, ask yourself these two questions:

What about this makes me feel good? This is a VERY specific question, and it is NOT “what do I like about this?”. It’s often hard to answer. You are not allowed to say “I don’t know”, and you are not allowed to settle for answering the much easier question “what about this do I like?”—although that can be a great guide into discovering what it is that makes you feel good. If you wrote an elegant passage of code, or did something clever, or shipped a really nasty hack but saved the company (thus buying them time to refactor your nasty hack) by shipping on time, that’s what you like. But go beyond this. What about that makes you feel good? Did it make you feel smart? Did it make you feel artistic? Did it make you feel like a hero? Did it make you feel like somehow, against all the odds, you might just be starting to “get it” as a programmer?

Take a moment and really just let yourself feel good about what you did. If you can find that and tap into it, you have just found a well inside yourself that you will return to again and again in the future. Congratulations, you’ve just found the reason you’re going to spend the rest of your life getting better at this.

If you can’t answer this question, don’t sweat it. But don’t be surprised if your life ends up going a different direction. Find something else that makes you feel good, and do that instead.

What about this could I do better? Most days, you’ll think of something right off. There was some duplication, or a lack of symmetry in the code, or your variable names were kind of awkward.

Other days it’s a bit harder. “Writing this bit felt a bit grindy, like I was pushing out lots of boilerplate. I don’t see how to fix it, but does it really have to hurt this much?”

The best days are the days when you try and try and just can’t answer it. Important: this doesn’t mean you did something perfect. Far from it, and far better: it means you’ve actually managed to see your blind spot. “This”, your brain is telling you, “this empty space, here… is where more knowledge will fit.” Those are the days that herald “getting it” on a whole new level.

So, them’s my questions for you. What made you feel good? What could you do better?

Felt any good or done any better recently?

Donkeypunching Ruby Koans

Do you want instant enlightenment? Sure, we all do.

And now you can have it!

Tonight I presented Ruby Koans at URUG. It started out simple enough, but then we got on a weird quirk about trying to make the Koan tests pass without actually satisfying the test requirements. We monkeypatched Fixnum, then started playing with patching Object#to_s… basically we were looking for TMAETTCPW: The Most AEvil Thing That Could Possibly Work. I spelled Evil AEvil because it’s extra evilly.

Mike Moore had the bright idea to just break off all the assert methods in Test::Unit; after that it just became a challenge to discover how to get the rest of the koans to run at all.

With sincere apologies to Matz, Jim and Joe, here is the result:

https://gist.github.com/1108850
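The gist has the real thing. For the flavor of the trick, here is a self-contained sketch that does not touch Test::Unit at all: TinyAssertions and Koan are made-up stand-ins, and the “patch” simply reopens the module and replaces the assertion with one that always passes.

```ruby
# Hypothetical stand-in for a test framework's assertion module.
module TinyAssertions
  def assert_equal(expected, actual)
    raise "expected #{expected.inspect}, got #{actual.inspect}" unless expected == actual
    true
  end
end

class Koan
  include TinyAssertions

  # This "koan" is wrong on purpose: 2 + 2 is not 5.
  def test_arithmetic
    assert_equal 5, 2 + 2
  end
end

# The AEvil part: reopen the module and break off the assertion.
module TinyAssertions
  def assert_equal(_expected, _actual)
    true
  end
end

enlightened = Koan.new.test_arithmetic  # no longer raises
```

Please don’t do this to a test suite you actually care about.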

Tourbus 2 is Out!

I just released Tourbus 2.0! You can get it by cloning the tourbus repo in that link, or by simply installing the gem from rubyforge.

What’s TourBus?

TourBus is a ruby framework for stress-testing a website. You define “Tourist” classes that “tour” their way through your site, and then tell tourbus to send a load of them at your site.

What’s New

Better Syntax, and tested support for most rubies. TourBus 2.0 has been tested and found worky on:

  • JRuby 1.6.0 <– strongly recommended, as it has better threading
  • MRI 1.8.7p334
  • MRI 1.9.2p180
  • REE 1.8.7-2011.03

Upgrading from Tourbus 0.9

  • Your tour classes will change; they are now called tourists and they go on tours, instead of being called tours who run tests (which really never made sense anyway)
  • Open your tour class, and change it to inherit from Tourist instead of Tour.
  • Change before_tests and after_tests to before_tours and after_tours.
  • Rename all your test_ methods to be tour_ methods. E.g. “def test_simple” => “def tour_simple”
  • That’s it! Tourbus should now run normally.

Quick and Easy Setup

gem install tourbus

Okay, let’s say you have a website running at localhost:3000 and you want to test that home.html includes the text “hi there” even when being pounded by hundreds of visitors at once. Let’s install and set up everything all at once! cd into your project folder, and do the following:

mkdir tours
 
echo 'class Simple tours/simple.rb

That’s it, you now have a tourist ready to wander over to your site and request the home page. Let’s run him and see that everything’s okay:

tourbus

You should see a clean run followed by a text report showing what happened. If that worked, let’s make your tourist go through the website 10 times in a row. But let’s ALSO make 100 different tourists do the same 10 laps with him, all at once:

tourbus -n 10 -c 100

Happy server stressing! Check out the README for more info.

Bonus: Isolating Tourbus

Here’s how I like to install tourbus. I cd into my development folder, and then do:

rvm install jruby-1.6.0
rvm use jruby-1.6.0
rvm gemset create tourbus
rvm gemset use tourbus
git clone git://github.com/dbrady/tourbus.git
cd tourbus
gem install bundler
bundle install
gem build tourbus.gemspec
gem install tourbus-2.0.1.gem # (update version if it changes)

Next I cd into my project and do

echo 'rvm use jruby-1.6.0@tourbus' > .rvmrc

This lets me run tourbus under jruby and its own gemset, so even if my website is running rails on MRI, I can still get the lovely JVM native threads when tourbussing my site.

James Edward Gray: Associative Arrays and Ruby Hashes

Yesterday I put out a little screencast showing some ways of Creating Ruby Hashes. James Edward Gray II pinged me on Twitter and basically said “Great screencast! Ooh, but you forgot this! Ooh, and this! And this!” and so of course there was nothing to do for it but invite him to do a pairing screencast with me.

This video is a bit of a weird hybrid. You get 7 minutes of podcall, then 18 minutes of screencast, then another 12 minutes of podcall. James shows off some of the “hot new awesomeness” of Ruby 1.9, and then points out that this awesomeness has been around for a couple of years and nobody’s using it, in spite of it having been in the current Pickaxe for nearly as long. Along the way we talk about regular expressions, testing dogma, and the importance of never squashing creativity in the open source community. All in all, an incredibly fun time for me. James threatened to come back and do another one with me on regular expressions, and I’m mentioning it here in writing so that everybody knows I plan on taking him up on that offer.
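If you can’t watch the video right now, here are a few ways of creating hashes in Ruby 1.9 and later. These are my own picks for illustration and not necessarily the ones covered in the screencast:

```ruby
# Ruby 1.9 literal syntax for symbol keys.
point = { x: 1, y: 2 }

# Hash.new with a block computes and stores a default on first access.
squares = Hash.new { |hash, n| hash[n] = n * n }
sixteen = squares[4]

# Hash[] builds a hash from an array of key/value pairs.
pairs = Hash[[[:a, 1], [:b, 2]]]

# Since 1.9, hashes remember insertion order.
order = { z: 1, a: 2 }.keys
```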

No podcast, because half of it is us typing into a shared screen session. But here’s the video. You may need to watch it on Vimeo or download it to see the font clearly.

Associative Arrays with James Edward Gray II from David Brady on Vimeo.

SVN Users: Why You Should Switch To Git

Recently a coworker of mine told me he was happy with SVN, and had been for years. Why should he and his team switch to git if they were productive and happy? I posted this to our internal message board, but I think the answer is broad enough to merit posting here on my blog. Enjoy. Which vcs do you use, and why do you like it? Are there any ex-git users out there who prefer something else?

Just my $0.02, but I hear this concern from satisfied svn users a lot. I used to be one myself. There is a compelling answer, but unfortunately I don’t know how to articulate it. Almost without exception, every svn user I have seen switch to git has slapped their forehead and said, “My goodness, why didn’t you tell me the world wasn’t flat?!?”

I think the problem is threefold. First, git was very hard to use when it first came out, which turned a lot of people off. Second, it was kind of a hipster trendy thing, which turned even more people off. But most importantly, every advantage that git provides over svn is something that svn users have learned to live without, and so when you say “git can do this”, svn users say “Yeah, but we don’t need or use that.”

You need those things. They will make you happy. Take it on faith until you begin to enjoy the fruits yourself. 🙂 Git offers a ton of incredible things over svn. I’ll mention just my top three favorites.

First, you can branch in git, and you don’t do that in svn. I know what you’re thinking: you CAN branch in svn. That’s not what I said. I said you DON’T. Because it’s such a pain to do and merging is such a nightmare, I’ve only ever met one team that used branching heavily in svn. They were a company with 500+ developers, however, and had IT staff on hand full-time to enforce the engineering discipline to keep their branches under control, and once a week the dev team stopped and had a “merge day” when branches were folded back into the mainline. In contrast, git’s merging tools are so freakishly powerful that branching becomes nearly a zero-cost operation. In the past week, I have created or worked in no fewer than ten different branches across three projects. Each feature, each bugfix, isolated in its own branch. All of the code is changed and updated, and pushed up to the server. Some of the branches were merged immediately, some are still awaiting QA testing before they can be deployed. So that’s feature number one: Git makes branching and merging so easy that you’ll use them all the time.

Second, and this is a huge implication, because branching and merging are so easy, you no longer have this problem where everybody is syncing and merging with trunk, where every feature change gets deployed to production as soon as you finish it. You might be tempted to lump this in with my first point, but as somebody who occasionally gets dragged back into svn from git, this is totally a separate concern. You can’t do exploratory branches easily, so you don’t do them. With git, you can fork a branch, make some changes, forget about the branch, go back and work in master (git’s word for “trunk”) for a month, then come back to your exploratory branch and type “rebase” and it will MOVE your changes forward in time, updating the trunk and then “playing your changes back” over the new trunk, making it as though you had forked yesterday instead of a month ago. If you’ve ever made a bugfix and then had to hold off pushing your commit because QA was still testing trunk for a deploy, you need to switch to git.

Thirdly, git is distributed. Everybody gets the obvious implication of this, that you could be pushing your code to multiple servers. And big deal, right? You could be backing up your svn repo just as easily. But everybody misses the subtle implications of this, which are earth-shattering: one, what you call your sandbox, git considers to be just another repo. Which means you can be on a plane with no internet access, and you can check out old revisions, commit code to a feature branch, fix a bug in master, and start two new exploratory branches, all without being connected to the main repo. What svn calls a commit, git calls a push, and it syncs your “local” repo with the remote one. (What git calls a commit is just storing a change from your sandbox to your local repo database to be pushed later.) And two, because you have a full copy of the repo in your sandbox, you can play amazing games with the commit history. Checked in a file you shouldn’t have? Go back into your repo’s history and remove it from the commit stream before you push it to the server. Wrote the wrong bug number on your checkin? Amend your commit message. Pulled down latest code only to discover that 12 files are in conflict and you just want the version from two days ago? You can jump over to that commit and grab them.

That’s my $0.02, which I guess on a per-word basis appears to be quite the bargain. Sorry. TL;DR: git takes your version control game to a whole new level that you didn’t even know existed. If you’re happy with svn, you don’t NEED to use git. But if you want to STAY happy with svn, trust me: don’t ever switch. You WON’T be able to go back.

(Well, actually, you will. git has a svn emulation module that lets you have a git repo locally and push commits to a svn server. It still has the problem of “the dev team are all committing to trunk”, but features 1 and 3, of branching and distributing, still shine through. It makes working with subversion… bearable.)

Yes You Should Test Private Methods (Sometimes)

Lasse Koskela recently wrote Test Everything, But Not Private Methods on his blog. Thank you for writing this, Lasse. This is one of the most intelligent, albeit still wrong, posts I’ve read on the topic. Lasse specifically addresses one of my greatest concerns, which is simply that “if it can possibly break, it should be tested.”

It turns out that on the face of it, we are in agreement: you should not leave complicated code untested. Lasse’s answer (which is repeated by several folks, including Michael Feathers and Mike Bria) is simply to not write complicated private methods. If you have a complicated private method, you should make it public and test it. If it doesn’t belong on the public interface of the class, then move it to another class or create a new one where it can be public. As Michael Feathers puts it in his book Working Effectively with Legacy Code: “the real answer is that if you have the urge to test a private method, the method shouldn’t be private; if making the method public bothers you, chances are, it is because it is part of a separate responsibility: it should be on another class.”
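To make that advice concrete, here’s a rough Ruby sketch of what “move it to another class where it can be public” looks like. Everything in it (Order, ShippingCost, the pricing rule) is made up for illustration; it isn’t from Lasse’s post or Feathers’ book.

```ruby
# Hypothetical example: the shipping calculation used to be a private
# method on Order. Following the "extract class" advice, it now lives in
# its own class, where it is public and can be tested directly.
class ShippingCost
  BASE_RATE = 5      # assumed flat rate for the first kilogram
  PER_EXTRA_KG = 2   # assumed rate for each additional kilogram

  def for_weight(kilograms)
    return 0 if kilograms <= 0
    BASE_RATE + (kilograms.ceil - 1) * PER_EXTRA_KG
  end
end

class Order
  def initialize(weight, shipping_cost = ShippingCost.new)
    @weight = weight
    @shipping_cost = shipping_cost
  end

  # Order delegates instead of hiding the calculation in a private method.
  def shipping
    @shipping_cost.for_weight(@weight)
  end
end
```

Note that ShippingCost is now a name every other class in the project can see and call, which is exactly the tradeoff I’m about to complain about.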

I’m a huge fan of Michael Feathers, bordering on Fanboy lunacy, so I’m a bit hesitant to disagree with him. Also, to be fair, I agree with Lasse one hundred percent–about 90% of the time. However, Mike and Lasse present “extract method to class” as a panacea when it is in itself merely a tradeoff with serious consequences of its own.

Extracting a class dilutes and adds complexity to your namespace. Code should struggle very hard to earn a spot at this level. I spend a lot of mental energy on a project trying to understand the class hierarchy, so much so that it is worth my effort to spend time trying to keep it well-organized to assist this understanding. A method that was meant to be private and to serve a class at a specific point in time suddenly becomes a public object. Care must be taken to either make the new class generic enough and provably safe to be used by arbitrary clients, or to prevent other classes from accessing the class at all. If you find yourself trying to figure out how to control access to the class, chances are the code you are trying to extract really should be private after all.

If you are going to extract the private method to a class, considerable care should be taken that it does not, in fact, still fall under the responsibility of the class that contained it. If your new class ends up with an “-er” name, especially “-Manager” or “-Controller”, it’s probably just a Functor–a method call masquerading as an object. If the moved method has many side effects, especially ones that modify instance variables, the moved method will end up taking many arguments and returning many values that the original class must then use to modify itself. You may just end up passing in the original class to the method. Now what you’ve done is replaced the “this” keyword with a variable named “that”. This is a code smell; it’s called Feature Envy, and the solution is to move the method into the envied class. If the new class has the original class’s name in its own, you’ve all but conceded that the new class is useless in isolation from its progenitor; all you’ve really accomplished in this case is cluttering up the object namespace in order to achieve one outcome: making the private method testable. Lasse eschews inelegant workarounds like reflection nonsense or package hackery, but complicating the object map is just as inelegant if it achieves nothing more than making a private method public.
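Here’s a small Ruby sketch of that “this becomes that” smell. The class names are hypothetical and deliberately bad: AccountFeeApplier has both the “-er” suffix and the original class’s name in its own.

```ruby
# Hypothetical example of an extraction that went wrong.
class Account
  # These accessors exist only so the extracted class can reach in and
  # modify state that used to be touched by a private method.
  attr_accessor :balance, :fees_charged

  def initialize(balance)
    @balance = balance
    @fees_charged = 0
  end

  def close_month
    AccountFeeApplier.new.apply(self) # hand "this" over to be used as "that"
  end
end

# The method call masquerading as an object.
class AccountFeeApplier
  LOW_BALANCE_LIMIT = 1000 # assumed threshold, for illustration only
  LOW_BALANCE_FEE = 5      # assumed fee, for illustration only

  def apply(that)
    fee = that.balance < LOW_BALANCE_LIMIT ? LOW_BALANCE_FEE : 0
    that.balance -= fee
    that.fees_charged += fee
  end
end
```

The fee logic is testable now, but Account had to open up its balance to the whole world to get there, and AccountFeeApplier is useless without an Account to operate on.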

Another counterargument is–and though this is a situation that I wish did not actually exist, every developer has in fact faced it–extract class is not a free refactoring, and on large legacy projects it may be prohibitively expensive. Suddenly “change the design” becomes “they shouldn’t have designed it that way in the first place” and the only solution is to invent a time machine, go back in time, and not have done it wrong. This isn’t practical.

Sometimes you have a method that is, and should be, both private and complex enough to require testing. When that happens, you should test it.
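In Ruby, one way to do this (a minimal sketch, not the only option) is Object#send, which invokes a method by name and ignores its visibility. The Invoice class and its discount rule below are invented for the example.

```ruby
# Hypothetical example: a private method that is complex enough to break
# but has no business being on the public interface.
class Invoice
  def initialize(line_items)
    @line_items = line_items
  end

  def total
    subtotal = @line_items.sum
    subtotal - volume_discount(subtotal)
  end

  private

  # Tiered discount with made-up thresholds: nothing under 100,
  # 10% under 1000, 20% from 1000 up (integer arithmetic).
  def volume_discount(subtotal)
    return 0 if subtotal < 100
    return subtotal / 10 if subtotal < 1000
    subtotal / 5
  end
end

invoice = Invoice.new([60, 50])

# Calling invoice.volume_discount(500) directly would raise NoMethodError.
# send bypasses the visibility check, so a test can hit each tier:
small  = invoice.send(:volume_discount, 99)
medium = invoice.send(:volume_discount, 500)
large  = invoice.send(:volume_discount, 2000)
```

The method stays private for every real caller; only the test reaches around the fence, and it does so in a way that is obvious to anyone reading it.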

I agree with Lasse and Mike about the problems inherent with testing private methods; I just don’t think their proposed solution is always workable.

This post is getting long, so I’m going to break it into two separate entries. Next up: I want to present some cases when I think I should, even using TDD, design and write private methods that should be tested. I’m also going to address what I think is the primary concern of anti-private-testers, that testing private methods is testing implementation. (Teaser: I completely agree–but I hope to show you why, in some cases, that’s not a problem or at least an acceptable tradeoff.)