ScrumButs

scrumbut [skruhmbut] noun.
1. A person engaged in only partial Agile project management or development methodologies
2. One who engages in either semi-agile or quasi-waterfall development methodologies.
3. One who adopts only SOME tenets of the SCRUM methodology.
4. In general, one who uses the word “but” when answering the question “Do you do SCRUM?”

“ScrumButs are the best part of SCRUM” makes a great argument that “optimal behavior of a team cannot be predicted” and so retrospectives may be used to tweak SCRUM to optimize it for a particular team. At last, a refreshing break from the SCRUM zealots who believe that any deviation from SCRUM is bad and that if you are not doing pure SCRUM then you are not doing SCRUM at all.

It is worth highlighting the emphasis on using retrospectives to drive positive optimizations. Pure SCRUM should be adhered to for an iteration or two, and retrospectives should then be used to adjust as necessary for a particular team. The need for an adjustment highlights an aspect of the team’s behavior that may differ from that envisaged by SCRUM. The team should decide whether that behavior is good and should be kept (via a ScrumBut), or whether it is bad and the team should think about changing it. Either way, the focus is on optimizing the team’s behavior, and that is usually a good thing.

Hindsight and the 80/20 rule

Good perspective on the 80/20 rule and the obvious benefit of hindsight!

It’s easy to be cynical about the 80/20 rule. There are too many hucksters selling books and consulting services that boil down to saying “concentrate on what’s most productive.” Thanks. Never would have thought of that. Let me write you a check.

At one extreme is the belief that everything is equally important, or at least equally likely to be important. At the other extreme is the belief that 80/20 principles are everywhere and that it is possible to predict the “20” part. Reality lies somewhere between these extremes, but I believe it is often closer to the latter than we think. In many circumstances acting as if everything were equally important is idiocy or sloth.

You can improve your chances of correctly guessing which activities are going to be most productive. Nobody is going to write a book that tells you how to do this in your particular circumstances. It takes experience and hard work. But you can get better at it over time.

Can you predict the “20” in 80/20?

Update: STL vs Gnulib Performance

Following up on a previous post about the relative performance of STL vs gnulib, this post extends and completes the analysis to include the std::vector and std::deque containers. These containers provide a more direct comparison against the gnulib gl_array_list and gl_carray_list implementations. The results are shown below. Please refer to the previous post on this subject for details on how the results were collected, including source code.
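
For reference, the std::vector variant of the two loops presumably looks like the sketch below. This is an assumption based on the std::list benchmark code in the previous post, reusing the same INSERT_ITERATIONS and SEARCH_ITERATIONS constants; the std::deque version is identical apart from the container type.

#include <vector>
#include <algorithm>   // std::find
#include <ctime>       // clock()
#include <cstdint>     // uint32_t

std::vector<uint32_t> v;   // std::deque<uint32_t> for the deque run
clock_t start, stop;

// Push back (insert at end)
start = clock();
for(uint32_t i=0; i<INSERT_ITERATIONS; ++i)
    v.push_back(i);
stop = clock();

// Unsorted search
start = clock();
for(uint32_t i=0; i<SEARCH_ITERATIONS; ++i)
    std::find(v.begin(), v.end(), i);
stop = clock();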

DISCLAIMER: Please note that the intent of this post is not to suggest that gnulib should be used instead of the STL in C++ programs, but rather to point out that, although the C language specification includes no equivalent, gnulib offers C programmers a set of containers whose performance is comparable to that of the STL containers available to C++ programmers.

[Figure: Push back (insert at end) performance]

[Figure: Search (unsorted) performance]

Note that the vertical axis represents seconds and the lower the bar for each algorithm, the better.

Based on the above results, the performance of std::list is comparable to the gnulib gl_linked_list for both insertions (at the end) and unsorted searching. The gnulib gl_array_list is comparable in implementation to std::vector but proves to be almost 2x faster for insertions and ~1.5x slower for searching in this test. std::deque is also comparable to the gnulib gl_array_list and proves to be slightly faster for insertions and just under 3x slower for searching.

As a final note, this analysis compared only two operations – push back and unsorted find. The results are an indication of performance rather than a complete characterization, and they may well differ if other operations were chosen.

STL vs Gnulib Performance

The STL (Standard Template Library) adds containers and algorithms to the C++ language. There is no equivalent library in the C specification, but there are a number of available libraries that provide common data structure and algorithm implementations for use in C programs. One such library is gnulib, which provides, amongst other things, implementations of common container data structures such as lists and trees. This post provides a quick comparison of the performance of a number of the gnulib list implementations against the performance of the STL list implementation. For this analysis, gcc v4.0.1 with -O2 optimization and the gnulib source from June 13th, 2009 were used.

The Benchmark
A very simple benchmark was chosen for the performance characterization. Two operations were measured. The first was the insertion of a large number of integers into a list, with the integers being added one at a time to the end of the list. The second was searching for elements within the list. It is acknowledged that these are very simple tests and that any results are somewhat inconclusive in the absence of a larger and more thorough suite of performance tests.

STL list benchmarking source code:


#include <list>
#include <algorithm>   // std::find
#include <ctime>       // clock()
#include <cstdint>     // uint32_t

std::list<uint32_t> l;   // the list under test
clock_t start, stop;     // timing variables

// The insert code: append integers one at a time to the end of the list
start = clock();
for(uint32_t i=0; i<INSERT_ITERATIONS; ++i)
    l.push_back(i);
stop = clock();

// The search code: linear search for each value in turn
start = clock();
for(uint32_t i=0; i<SEARCH_ITERATIONS; ++i)
    std::find(l.begin(), l.end(), i);
stop = clock();

Gnulib list benchmarking source code:

#include <time.h>        /* clock() */
#include "gl_list.h"     /* generic gnulib list API */

gl_list_t list;          /* created beforehand via gl_list_create_empty() for
                            the implementation under test (see sketch below) */
clock_t start, stop;     /* timing variables */
size_t i;

// The insert code: append elements one at a time to the end of the list
start = clock();
for(i=0; i<INSERT_ITERATIONS; ++i)
    gl_list_add_last(list, (void*)i);
stop = clock();

// The search code: look up each value in turn
start = clock();
for(i=0; i<SEARCH_ITERATIONS; ++i)
    gl_list_search(list, (void*)i);
stop = clock();

Both STL and gnulib provide a number of container implementations. For this analysis the focus is on the list container implementations. Although gnulib provides a number of “list” implementations, only the gl_linked_list container is a true linked list, and the comparison between it and the STL list is the most relevant one. However, to illustrate the comparative performance of the other gnulib list implementations as well, four were selected for benchmarking (a creation sketch follows the list below):

  • gl_linked_list
  • gl_linkedhash_list
  • gl_array_list
  • gl_rbtree_list
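
For reference, the gnulib list implementation is selected at creation time. Below is a minimal creation sketch, assuming the current gl_list.h API (the exact gl_list_create_empty signature may differ slightly in the June 2009 snapshot used here); the NULL callbacks mean elements are compared by pointer value, which suits the integer-valued void* elements used in the benchmark.

#include "gl_list.h"
#include "gl_linked_list.h"      /* GL_LINKED_LIST */
#include "gl_linkedhash_list.h"  /* GL_LINKEDHASH_LIST */
#include "gl_array_list.h"       /* GL_ARRAY_LIST */
#include "gl_rbtree_list.h"      /* GL_RBTREE_LIST */

/* One list per implementation under test; the arguments are the implementation,
   the equals/hashcode/dispose callbacks (NULL for the pointer-based defaults),
   and whether duplicates are allowed. */
gl_list_t linked     = gl_list_create_empty(GL_LINKED_LIST,     NULL, NULL, NULL, true);
gl_list_t linkedhash = gl_list_create_empty(GL_LINKEDHASH_LIST, NULL, NULL, NULL, true);
gl_list_t array      = gl_list_create_empty(GL_ARRAY_LIST,      NULL, NULL, NULL, true);
gl_list_t rbtree     = gl_list_create_empty(GL_RBTREE_LIST,     NULL, NULL, NULL, true);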

The comparative performance of the STL containers most similar to the remaining gnulib list containers (those other than the gl_linked_list) will be provided in a follow-up post.

Finally, the tests were run on a MacBook running OS X 10 with a 2.4GHz Intel Core 2 Duo processor, 3MB of L2 cache, and 2GB of RAM.

Results
The results for each container were normalized with respect to the STL list results. With this normalization, a value less than one means the container is faster than the STL list implementation for that test, and a value greater than one means it is slower.
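
For illustration, the normalization presumably amounts to a simple ratio of elapsed times, along the lines of the sketch below (hypothetical variable names, not measured data):

// Hypothetical illustration of the normalization; not measured data.
double stl_list_seconds = (double)(stl_stop - stl_start) / CLOCKS_PER_SEC;
double gnulib_seconds   = (double)(gl_stop  - gl_start)  / CLOCKS_PER_SEC;
double normalized       = gnulib_seconds / stl_list_seconds;   // < 1.0 means faster than the STL list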

The normalized results are shown in the graph below.

[Figure: list_perf – results for each gnulib list implementation, normalized to the STL list]

Based on the above results, the STL list implementation and the gnulib gl_linked_list implementation are very similar in terms of performance, with the gnulib implementation being very slightly faster. The gl_array_list implementation offers the best insertion performance (>5x faster than the STL implementation) and the gl_linkedhash_list implementation offers the best search performance (~5000x faster than the STL implementation).

Conclusions
The two pure linked list implementations, the STL list and the gnulib gl_linked_list, are almost identical in performance for both tests.

While the above results show the relative performance of the STL list against the other gnulib container implementations, this is not a fair and complete comparison, as the STL offers other sequence containers whose implementations are similar to some of the gnulib containers analyzed here. A follow-up post will fill out the comparison started here and bring STL vectors and sets into it.

EDIT: The original intent of this post, to compare an STL list against the full suite of gnulib list implementations, was deeply flawed. The gnulib “list” interface is closer to the more generic STL sequence interface, and a more correct and complete comparison would be to compare the gnulib containers against the STL containers. This will be the focus of a follow-up post. I’ve edited this post to clarify the comparison and avoid confusion. While I realized my mistake shortly after publishing the post, I was happy to see that it was also picked up by several readers – so many thanks to those of you who provided comments for keeping me honest!

The Unexpected Upgrade Experience

A couple of months ago I upgraded my hosting package during its annual renewal. Up to then I had had an excellent experience with my provider, so when they offered to move me to their new hosting package for a similar price to my old one it seemed like a reasonable thing to do. My thinking was that since the old package was no longer available to new customers, it would only be a matter of time before support for it was reduced and eventually discontinued.

Since the migration to the new package, I have been plagued with issues. Nothing catastrophic, but raising support tickets and the resultant email tag with the support team gets tiresome after a ticket or two. The root of the problem is that the majority of the help items and howtos relate to the old package and not the new one. That would be fine if they both operated similarly – but they don’t!

I’ll use my most recent issue as an example. On the old package, the WordPress application was installed for a domain using the application manager, and the application was installed into the httpdocs directory area for that domain. I stupidly assumed that the same would be true for the new package – especially since the application manager offered a WordPress installation. So, I installed the application, imported the posts from my old package, and everything seemed to be fine. That was until I tried to change the permalink format for the posts, at which point I got a .htaccess permissions error. It turns out there were two things wrong. The first was that the WordPress files in my domain’s httpdocs area were actually a copy from my old package, copied across by my hosting provider during the migration, and were completely independent of the current WordPress installation. The second was that the application manager had installed WordPress into a hidden directory tree that was so hidden you could not browse to it without knowing the exact path beforehand – and that exact path contained a five-digit object ID as the hidden part of its pathname. Once the support team figured this out, we easily sorted it out.

This sort of thing should not have happened in the first place. Where differences in hosting packages exist, there should be a cheat sheet available to help people migrate between packages. The howto documentation should also cover both packages, not just the old one. Having good tech support is great but having a great product that doesn’t require tech support is so much better!

Hopefully all my problems are finally resolved. I’m on the fence as to whether or not I’ll stick with my provider … one more issue will push me over!

Frequent Informal Code Reviews

The traditional code review model is to write a bunch of code and then gather the troops to review it once it’s done and ready. Following the review, the author then spends a couple of hours or days in rework.
In contrast, one of the benefits of pair programming is the just-in-time scrutiny of a second pair of eyes while the code is being developed. This just-in-time scrutiny helps pick up problems early, and the cost of rework is minimal. Pair programming does, however, have some pitfalls and is not as prevalent as its supporters would like.
There is a middle ground that involves team members spending a few minutes each day informally reviewing any changes that were committed to the source code repository in the previous day. The informal review involves reading the changes and posting questions and comments to the author. This provides “almost in time” feedback and keeps everyone in tune with the changes and direction of the code base.

This approach has the following characteristics:

  • Collective code ownership
  • Frequent informal code reviews
  • Requires regular commits to the repository
  • Requires discipline from team members to frequently review new changes – otherwise they just build up and we are back to where we began
  • Has the effect of improving code quality – people are less likely to commit rubbish with the intent of cleaning it up later since they know they will receive instant feedback

Note that some shops mandate a code review prior to each checkin. While this approach does help keep up the quality of committed code, it does introduce delays and bottlenecks in the checkin process.

Ensuring Success at Code Reviews

Volumes have been written on the subject of code reviews and the various processes for conducting them. Most processes describe particular code review roles, such as reader and moderator, and formats for conducting the code review meeting. These roles are certainly important to keep the code review meeting on track and to avoid the review degenerating into religious wars over commenting styles and magic numbers. However, I have witnessed many code reviews that had brilliant moderators and readers but failed to find any significant defects. It wasn’t because the code under review was defect free; serious defects were often found during testing. At one point a team was seriously considering scrapping code reviews due to the low return for a huge effort. So what went wrong?

I believe there were two general problems. The first was that the areas under review were rather specialist areas involving complex design and code. In general, most of the reviewers were not familiar with either the code or the details of the design prior to the review. The second problem was that in many cases the people reviewing the code did not have sufficient experience in the area under review to be considered experts in it. On average, more than 50% of the reviewers were new developers ramping up on the technology area and, in some cases, even on the programming language itself. The code reviews were used as a ramp-up task for these developers, and while they certainly provided an opportunity for these developers to ask questions about various parts of the code, the newcomers were not in a position to review the code critically.

So what can we learn from this? My key takeaways are:

  1. Ensure all reviewers are sufficiently familiar with the design and the code prior to the review. A one-hour presentation on the design is usually not enough to provide this familiarization.
  2. Ensure sufficient experts are involved in the review.
  3. Using code reviews as part of a newbie’s ramp-up is a good idea, but the newbies shouldn’t be relied upon to provide the critical feedback for the review. As a guideline, newbies should not comprise more than 25% of the reviewers.
  4. If any of the above are not satisfied then postpone the review until they can be satisfied.

SCRUM and Distributed Teams

Good paper on how to increase the chance of success when using SCRUM and agile techniques with distributed teams. Interesting use of the term “project gravity”. From the abstract:

Distributed agile can work, but it is risky. Even more than traditional agile implementations, teams tend to fall into two failure modes: “command and control” or team fragmentation. Distribution between teams or team members has two distinct dimensions: separation and project gravity. Separation, including physical distance, is the most commonly recognized barrier to effective distributed teams. Gravity is the other, less recognized dimension of distribution and is some stable center which teams can organize around. This can take the form of a stable product backlog, a clear mission, or an inspiring product owner. Failing to recognize where your project lies along the dimensions of distribution will make it more prone to the two failure modes. Both failure modes are discussed in terms of the dimensions of distribution, with suggestions for remedial actions. In either case, leadership must be aware of both dimensions of a team/project’s distribution, and act accordingly.