Archive for June, 2009

Update: STL vs Gnulib Performance

Sunday, June 28th, 2009

Following up on a previous post about the relative performance of STL vs gnulib, this post extends and completes the analysis to include the stl::vector and stl::deque containers. These containers provide a more direct comparison against the Gnulib array-list and carray-list implementations. The results are shown below. Please refer to the previous post on this subject for details on how the results were collected including source code.

DISCLAIMER: Please note that the intent of the post is not to suggest that gnulib should be used instead of STL in C++ programs, but rather to point out that although not part of the C language specification, gnulib offers C programmers a set of containers whose performance is comparable to the STL container performance available to C++ programmers.

Push back (insert at end) performance

Push back (insert at end) performance

Search Performance

Search (unsorted) Performance

Note that the vertical axis represents seconds and the lower the bar for each algorithm, the better.

Based on the above results, it can be seen that the performance of the stl::list is comparable to the gnulib linked_list for both insertions (at the end) and unsorted searching. Gnulib array_list is comparable in implementation to the stl::vector but proves to be almost 2x faster for insertions and ~1.5x slower for searching for this test. The stl::deque is also comparable to the gnulib array_list and proves to be slightly faster for insertions and just under 3x slower for searching.

As a final note, this analysis only performed comparisons based on two tests – push back and unsorted find. This is just an indication of performance and far from complete. The results may well be different if different tests were chosen.

STL vs Gnulib Performance

Saturday, June 20th, 2009

STL (Standard Template Library) adds containers and algorithms to the C++ language. There is no equivalent library in the C specification but there are a number of available libraries that provide common data structure and algorithm implementations which may be used in C programs. Once such library is gnulib which provides, amongst other things, implementations for common container data structures such as lists and trees. This post provides a quick comparison of the performance of a number of the gnulib list implementations against the performance of the STL list implementation. For this analysis, gcc v4.0.1 using -O2 optimization and the gnulib source from June 13th, 2009 were used.

The Benchmark
A very simple benchmark was chosen for the performance characterization. Two operations were measured. The first was the insertion of a large number of integers into a list, with the integers being added one at a time to the end of the list. The second was the searching for elements within the list. It is acknowledged that these are very simple tests and that any results are somewhat inconclusive in the absence of a larger and more thorough suite of performance tests.

STL list benchmarking source code:

// The insert code
start = clock();
for(uint32_t i=0; i<INSERT_ITERATIONS; ++i)
stop = clock();

// The search code
start = clock();
for(uint32_t i=0; i<SEARCH_ITERATIONS; ++i)
    find(l.begin(), l.end(), i);
stop = clock();

Gnulib list benchmarking source code:

// The insert code
start = clock();
for(i=0; i<INSERT_ITERATIONS; ++i)
    gl_list_add_last(list, (void*)i);
stop = clock();

// The search code
start = clock();
for(i=0; i<SEARCH_ITERATIONS; ++i)
    gl_list_search(list, (void*)i);
stop = clock();

Both STL and gnulib provide a number of container implementations. For this analysis the focus is on the list container implementations. Although gnulib provides a number of “list” implementations only the gl_linked_list container is a true linked list and the comparison between this and the STL list is the most relevant comparison. However, for illustrating the comparative performance of the other gnulib list implementations, four gnulib list implementations were selected for benchmarking:

  • gl_linked_list
  • gl_linkedhash_list
  • gl_array_list
  • gl_rbtree_list

The comparative performance of the relevant STL containers that are most similar to the gnulib list containers other than the gl_linked_list will be provided in a follow-up post.

Finally, the tests were run on a Macbook running OS X 10 with a 2.4GHz Intel Core 2 Duo processor with 3MB L2 cache and 2GB of RAM.

The results for each algorithm were normalized with respect to the STL list results. So, with this normalization, a number less than one means that this is faster than the STL list implementation and a number greater than one means that this is slower than the STL list implementation.

The normalized results are shown in the graph below.


Based on the above results, the STL list implementation and the gnulib gl_linked_list implementation are very similar in terms of performance with the gnulib implementation being very slightly faster. The gl_array_list implementation offers the best insertion performance (>5x faster than the STL implementation) and the gl_linkedhash_list implementation offers the best search performance (~5000x faster than the STL implementation).

The pure linked list implementations of both STL list and the gl_linked_list are almost identical in performance for both tests.

While the above results show the relative performance of the STL list against the other gnulib container implementations, this is not a fair and complete comparison as the STL offers other sequence containers that have similar implementations to some of the gnulib containers analyzed here. A follow up post will fill out the comparison started here and include STL vectors and sets into the comparison.

EDIT: The original intent of this post to compare an STL list against the full suite of gnulib list implementations was deeply flawed. The gnulib “list” interface is closer to the more generic STL sequence interface and a more correct and complete comparison would be to compare the gnulib containers against the STL containers. This will be a focus of a follow-up post. I’ve edited this post to clarify the comparison in order to avoid confusion. While I realized my mistake shortly after publishing the post, I was happy to see that this was also picked up by several readers – so many thanks to those of you who provided comments for keeping me honest!

The Unexpected Upgrade Experience

Sunday, June 14th, 2009

A couple of months ago I upgraded my hosting package during its annual renewal. Up to then I had an excellent experience with my provider and so when they made me an offer to move to their new hosting package for a similar price as my old package it seemed like a reasonable thing to do. My thinking was that since the old package was no longer available to new customers then it would only be a matter of time before support for the old package was reduced and eventually discontinued.

Since the migration to the new package, I have been plagued with issues. Nothing catastrophic but having to raise support tickets and the resultant email tag with the support team gets tiresome after a ticket or two. The root of the problem is that the majority of the help items and howtos relate to the old package and not the new one. That would be fine if they both operated similarly – but they don’t!

I’ll use my most recent issue as an example. On the old package, the WordPress application was installed for a domain using the application manager and the application was installed into the httpdocs directory area for that domain. I stupidly assumed that the same would be true for the new package – especially since the application manager offered a WordPress installation. So, I installed the application and imported the posts from my old package and everything seemed to be fine. That was until I tried to change the permalink format for the posts. When I tried this I got a .htaccess permissions error. It turns out there were two things wrong. The first was that the WordPress files in my domain’s httpdocs area were actually a copy from my old package that were copied across by my hosting provider during the migration and were completely independent of the current WordPress installation. The second was that the application install for WordPress installed into a hidden directory tree that was so hidden that you could not browse to it without knowing the exact path before hand – and that exact path contained a five digit object id as the hidden part of its pathname. Once the support team figured this out, we easily sorted it out. 

This sort of thing should not have happened in the first place. Where differences in hosting packages exist, there should be a cheat sheet available to help people migrate between packages. The howto documentation should also cover both packages, not just the old one. Having good tech support is great but having a great product that doesn’t require tech support is so much better!

Hopefully all my problems are finally resolved. I’m on the fence as to whether or not I’ll stick with my provider … one more issue will push me over!

Frequent Informal Code Reviews

Saturday, June 13th, 2009

The traditional code review model is to write a bunch of code and then gather the troops to review it once its done and ready. Following the review the author then spends a couple of hours/days in rework.
In contrast one of the benefits of pair programming is the just in time scrutiny of a second pair of eyes while the code is being developed. This just in time scrutiny helps pick up problems early and the cost of rework is minimal. Pair programming does however have some pitfalls and is not as prevalent as it supporters would like.
There is a middle ground that involves team members spending a few minutes each day informally reviewing any changes that were committed to the source code repository in the previous day. The informal review involves reading the changes and posting questions and comments to the author. This provides “almost in time” feedback and keeps everyone in tune with the changes and direction of the code base.

This approach has the following characteristics:

  • Collective code ownership
  • Frequent informal code reviews
  • Requires regular commits to the repository
  • Requires discipline from team members to frequently review new changes – otherwise thy just build up and we are back to where we began
  • Has effect of improving code quality  – people are less likely to commit rubbish with the intent of cleaning it up later since they know they will receive instant feedback

Note that some shops mandate a code review prior to each checkin. While this approach does help keep up the quality of committed code, it does introduce delays and bottlenecks in the checkin process.