27 Sep

Why Stack Overflow’s reputation system is broken

I find it rather ironic that the author of the blog entry from which this excerpt is taken:

It seems like any time you try to measure the performance of knowledge workers, things rapidly disintegrate, and you get what Robert D. Austin calls measurement dysfunction. His book Measuring and Managing Performance in Organizations is an excellent and thorough survey of the subject. Managers like to implement measurement systems, and they like to tie compensation to performance based on these measurement systems. But in the absence of 100% supervision, workers have an incentive to “work to the measurement,” concerning themselves solely with the measurement and not with the actual value or quality of their work.

is also one of the faces behind a programmer website which does exactly what he is railing against.

I’m talking about the Stack Overflow reputation and badge system. Granted, it was more Jeff Atwood’s idea than Joel’s (Jeff took his inspiration for it from the Xbox 360), but the big problem is that when you try to turn a serious system that is supposed to be all about Getting Things Done into a game, people just game the system and turn it into an unusable mess that is not fit for purpose.

If you want to see what I mean, just take a look at this question, which I asked yesterday afternoon. I’ve been looking for a bug tracker system which can work as an integrated system for both developers and project managers for a while now, and none of the ones I’ve looked at so far have the particular feature I’m asking for.

The first so-called answer came within seconds and didn’t answer the question properly, which isn’t surprising, since you would need at least two or three minutes just to read the question in the first place. It was followed by a string of ten or so responses over the next half hour, very few of which made much effort to read the question, let alone answer it. Most people seemed to treat it as asking “What is your favourite issue tracker?”, and one busybody even tagged it as “subjective” when I was asking for something very specific. So far, nobody has reported any success or failure in using an issue tracker of any description to integrate both the developer’s-eye view and the project manager’s-eye view.

This is a BIG problem with Stack Overflow, and I’ve seen it to an extent on other questions too. The system doesn’t favour good answers, correct answers, or even answers that make any attempt to answer the question; it favours quick answers. Being first off the mark with something that at least looks like a plausible answer means you’re most likely to get voted up. Getting voted up means appearing at the top of the list of answers, which is self-perpetuating: you then get more votes, each vote earns you ten reputation points, and if you collect enough reputation points, you automatically become the Stack Overflow equivalent of a Wikipedia administrator.
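To put rough numbers on that feedback loop, here is a minimal sketch in Python. It is a toy model, not a description of how Stack Overflow actually ranks or how people really vote: the `simulate` function, the `POSITION_WEIGHT` attention shares and the three example answers are all my own assumptions. The model deliberately ignores answer quality, and assumes one vote's worth of attention per minute, split by position on the page.

```python
REP_PER_UPVOTE = 10  # points per upvote, as quoted above

# Hypothetical attention model: the share of each minute's vote that goes to
# the answer in 1st, 2nd and 3rd place. Lower places get nothing.
POSITION_WEIGHT = (0.6, 0.3, 0.1)


def simulate(answers, minutes=60, position_weight=POSITION_WEIGHT):
    """Return expected votes per answer after `minutes`.

    `answers` is a list of (name, minute_posted) pairs. Quality is not
    modelled at all: votes depend only on position in the list.
    """
    posted_at = dict(answers)
    votes = {name: 0.0 for name in posted_at}
    for minute in range(minutes):
        # Only answers that already exist can be seen and voted on.
        visible = [name for name in posted_at if posted_at[name] <= minute]
        # Highest-voted first; ties go to whichever answer was posted earlier.
        visible.sort(key=lambda name: (-votes[name], posted_at[name]))
        # Hand out this minute's attention by position.
        for weight, name in zip(position_weight, visible):
            votes[name] += weight
    return votes


# Hypothetical thread: a snap reply, a plausible one, and a late correct one.
ANSWERS = [("quick", 0), ("plausible", 5), ("correct", 30)]

for name, count in simulate(ANSWERS).items():
    print(f"{name}: {count:.1f} expected votes, "
          f"{count * REP_PER_UPVOTE:.0f} reputation")
```

Under these assumptions the answer posted first never loses the top slot, so the late answer can never catch up, however good it is. Real voters are not this blind, but the direction of the bias is the point.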

The result is that you get a whole lot of knuckleheads gaming the system to pimp their reputation. They put up a response that looks fairly plausible and seems right to other knuckleheads but which either (a) doesn’t answer the question or (b) is plain wrong. If the person asking the question is also a knucklehead, that response gets marked as the accepted answer, which means even more reputation points. In the meantime, someone who arrives several days or weeks later with the correct answer doesn’t get any attention, because their answer is buried among all the others with zero votes. It’s particularly worrying because it’ll be the knuckleheads who end up running the show and deciding what goes and what doesn’t.

It’s as broken as lines of code per day, and it really, really annoys me.

It really annoys a lot of other Stackers too: a request to fix it is the most popular user request on the Stack Overflow UserVoice forums, though there is no consensus about what needs to be done to stop it. I do hope they come up with some fix for it; otherwise the site could end up with no more value than its arch-nemesis, Experts Exchange.

31 May

Productivity metrics: garbage in, garbage out

I came across this article today when I was googling for a link for another blog entry. I was flabbergasted to see that it was written by someone with a PhD, that it appears in a professional engineering journal, and that it is currently linked from their home page:

Over time, there have been many attempts to define metrics that effectively measure software development productivity. Most of the ones that I have seen are amazingly complicated and very difficult to apply.

I think there is a simpler productivity metric which should be used across the industry: the total number lines of code in the organization divided by the number of people who are working on that code (including QA as well as development). For short, I will call this metric the LOC per head.

I propose that this measurement is an excellent representation of the development organization’s true productivity. If the number rises, it means that the development organization is more productive. If it decreases, it means that the organization is less productive.

Ah, the old lines of code chestnut again. For some reason, managers seem to love it. The only problem is, it’s totally brain-dead. Like government targets, any formal productivity metric can and will be gamed — usually with disastrous results, as Joel Spolsky points out.

You want lines of code? Be prepared for your code base to be poisoned with endless copy-and-paste code and needless repetition, which, as any competent developer will tell you, is a nightmare to maintain. Or you may even end up with a joker on your team who decides to script the process and produces a million lines of code a second without even turning up at the office.
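To make the point concrete, here is a small Python sketch of the “LOC per head” metric from the article quoted above, applied to two versions of the same function. Everything in it is illustrative: `loc_per_head`, `count_lines`, `load_total` and the two snippets are names and examples I made up, and counting raw lines is only one of several ways you could count.

```python
def loc_per_head(total_lines, headcount):
    """Total lines of code divided by the number of people working on it."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    return total_lines / headcount


def count_lines(source):
    """Count raw lines, blank or not (an assumption; tools differ on this)."""
    return len(source.splitlines())


def load_total(source):
    """Execute a snippet and return the `total` function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["total"]


# Two hypothetical implementations of the same thing.
CONCISE = '''\
def total(values):
    return sum(values)
'''

PADDED = '''\
def total(values):
    result = 0
    result = result + values[0]
    result = result + values[1]
    result = result + values[2]
    result = result + values[3]
    return result
'''

sample = [1, 2, 3, 4]
for label, source in (("concise", CONCISE), ("padded", PADDED)):
    score = loc_per_head(count_lines(source), headcount=1)
    print(label, "result:", load_total(source)(sample), "LOC per head:", score)
```

Both versions return the same answer for a four-element list, but the copy-and-paste one scores several times higher on the metric, and it breaks as soon as the list has a different length. The metric rewards exactly the code you least want to maintain.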

Besides, some frameworks such as Ruby on Rails or jQuery allow you to accomplish much more with far fewer lines of code. The first release of 37signals’ Ta-Da List, a full-blown commercial product, contained fewer than 600 lines of Ruby code. So does that make DHH and colleagues unproductive? Of course not! On the contrary, it makes them brilliant.

You want lots of check-ins to source control? Fine, you’ll end up with dozens of them just to correct a single spelling mistake — and as a side effect, a version history that leaves everyone totally confused as to exactly what’s been going on.

You want lots of bug fixes in the issue tracker? Expect your developers to deliberately write bugs into their code so that they can “fix” them.

You want to compensate for this by penalising bug reports? You’re asking your developers to mislead your testers about what functionality is actually in the code base so they’ll pick up on fewer bugs.

And so on, and so on.

As the old computing adage goes, garbage in, garbage out.