Post-Increment vs Pre-Increment

January 29th, 2010

There is a well known C++ BKM for preferring pre-increment (++i) over post-increment (i++). The rationale is that when post-incrementing and object, the object must increment itself and then return a temporary containing its old value. While this makes perfect sense for complex types (e.g. STL iterators) I’ve always wondered if it holds true for built-in types such as int. Let’s find out.

First let’s create a trivial example to examine:

#include <iostream>

using namespace std;

int main()
{
	// post-imcrement
	for(int i=0; i<10; i++)
	{
		cout << "[" << i << "]" << endl;
	}

	// pre-imcrement
	for(int i=0; i<10; ++i)
	{
		cout << "[" << i << "]" << endl;
	}

	return 0;
}

Now using the Visual Studio 2008 C++ compiler, let’s take a look at the generated assembly code. First let’s keep things simple and disable compiler optimizations (/Od):

	// post-imcrement
	for(int i=0; i<10; i++)
004114D3  mov         dword ptr [i],0
004114D5  jmp         main+30h (4114E0h)
004114D7  mov         eax,dword ptr [i]
004114DA  add         eax,1
004114DD  mov         dword ptr [i],eax
004114E0  cmp         dword ptr [i],0Ah
004114E4  jge         main+86h (411536h)
	{

<... snip ...>

	// pre-imcrement
	for(int i=0; i<10; ++i)
00411536  mov         dword ptr [i],0
0041153D  jmp         main+98h (411548h)
0041153F  mov         eax,dword ptr [i]
00411542  add         eax,1
00411545  mov         dword ptr [i],eax
00411548  cmp         dword ptr [i],0Ah
0041154C  jge         main+0EEh (41159Eh)
	{

The relevant parts of the assembly code are highlighted. As you can see, there is no difference between the generated code for post-increment and pre-increment.

Now, with compiler optimizations enabled (\O2) and the relevant parts highlighted:

	// post-imcrement
	for(int i=0; i<10; i++)
00401008  xor         esi,esi
0040100A  lea         ebx,[ebx]
	{
<... snip ...>
0040104A  inc         esi
0040104B  cmp         esi,0Ah
0040104E  jl          main+10h (401010h)
	}

	// pre-imcrement
	for(int i=0; i<10; ++i)
00401050  xor         esi,esi
	{
<... snip ...>
0040108C  inc         esi
0040108D  cmp         esi,0Ah
00401090  jl          main+52h (401052h)
	}

Again there is no difference in the generated assembly between post-increment and pre-increment.

So, based on this quick experiment, there is no difference between pre-increment and post-increment for int types when using the Visual Studio 2008 C++ compiler with optimizations enabled or disabled.

Note that this doesn’t change the BKM to prefer pre-increments over post-increments when using integer types. The above experiment was only conducted with a single compiler – other compilers may behave differently and so it is best to err on the side of caution and use pre-increments where possible. Also, the use of pre-increment adds some future proofing – if the type of the variable being incremented changed to a more complex type then pre-increment then becomes the optimal choice.

Post to Twitter Tweet This Post Post to Digg Digg This Post Post to Reddit Reddit Post to StumbleUpon Stumble This Post

countloc v0.4 released

December 18th, 2009

In the absence of independent user feedback, a good way to approximate independent feedback is to write something, not look at it for a while and then try to read/use it. I was reminded of this recently when I tried to use my countloc app to gather some project LOC metrics. I had written this particular version about a year ago and hadn’t used it in a while. After dusting off the mothballs and invoking the help, it took me two attempts to get it to work properly. I was not impressed. I was also worried that the hundred or so other people that downloaded it had a similar experience of finding that it was not as usable as it should be.

I’ve hopefully fixed the usability problems in the latest version – countloc v0.4. I’ve also fixed a few bugs and added in some feature requests that I had received via the RubyForge page.

If you do give this app a twirl please provide feedback via the RubyForge area on any issues encountered.

Note: The app is implemented in Ruby and the download also contains the source code. I would be interested in hearing from any Ruby gurus out there on any suggestions for improving the code. I’ve set up a thread in the forum area for providing feedback.

Post to Twitter Tweet This Post Post to Digg Digg This Post Post to Reddit Reddit Post to StumbleUpon Stumble This Post

CodeKata – Kata Eighteen Solution

November 8th, 2009

Kata Eighteen requires some code to calculate transitive dependencies, i.e. how dependencies propagate between things. Here is my solution.

class Dependencies

  def initialize()
    @direct = Hash.new
  end

  def add_direct(item, dependants)
    @direct[item] = dependants
  end

  def dependencies_for(item)
    dependants = []
    toBeProcessed = @direct[item].clone
    while not toBeProcessed.empty?
      x = toBeProcessed.shift
      if x != item and not dependants.include?(x)
        dependants << x
        toBeProcessed.concat(@direct[x]) if @direct.include?(x)
      end
    end
    return dependants.sort!
  end

end

Post to Twitter Tweet This Post Post to Digg Digg This Post Post to Reddit Reddit Post to StumbleUpon Stumble This Post

CodeKata – Kata Five Solution

November 4th, 2009

Kata five requires the implementation of a Bloom filter. Here is my solution in Ruby using a non-optimized bitmap implementation and a reduced MD5 digest for the hashing functions.

class BloomFilter

  def initialize(dictionary, num_bits, num_hashes)
    @bitmap = Bitmap.new(num_bits)
    @hashes = Array.new(num_hashes) { |i| BloomHash.new(i, num_bits-1) }

    # compute the hash values for each word in the dictionary
    # and set the appropriate bits in the bitmap
    File.new(dictionary, 'r').each_line do |word|
      @hashes.each { |h| @bitmap[h.compute(word.chomp)] = 1 }
    end
  end

  def search(word)
    @hashes.each { |h| return false if @bitmap[h.compute(word)] == 0 }
    return true
  end

end

class BloomHash

  def initialize(seed, max_value)
    @offset = seed
    @num_digits = 0
    @max_value = max_value
    while(max_value > 0) do
      @num_digits += 1
      max_value >>= 4
    end
  end

  def compute(word)
    digest = Digest::MD5.hexdigest(word)
    return digest[@offset, @num_digits].to_i(16) % @max_value
  end

end

class Bitmap < Array

  def initialize(size)
    super(size)
    self.fill(0)
  end

end

Post to Twitter Tweet This Post Post to Digg Digg This Post Post to Reddit Reddit Post to StumbleUpon Stumble This Post

Multi-thread scaling issues with Ruby [update]

November 3rd, 2009

In my previous post on Multi-thread scaling issues with Python and Ruby, I had described the threading implementation within Ruby as using green threads. It seems that this has changed in Ruby 1.9 …

Ruby 1.9 threads do map to native threads, unfortunately the 1.9 interpreter forces user created threads to acquire a global mutex lock before executing. The upshot is that 1.9 thread execution is serialized and unable to benefit from multiple cores.

See Ruby in a multicore world for further discussion.

Post to Twitter Tweet This Post Post to Digg Digg This Post Post to Reddit Reddit Post to StumbleUpon Stumble This Post

CodeKata – Kata Two Solution

November 3rd, 2009

Dave Thomas maintains a blog of “Code Kata’s”. He describes a code kata as being a practice session for practicing coding skills:

Each is a short exercise (perhaps 30 minutes to an hour long). Some involve programming, and can be coded in many different ways. Some are open ended, and involve thinking about the issues behind programming. These are unlikely to have a single correct answer.

Here are two of my solutions for kata two using Ruby. The first uses recursion and the second uses an iterative approach.

Recursive solution …

# Recursive implementation
def chop(target, values)
  # Special handling for zero and single element arrays
  return -1 if values.empty?

  return ((target == values[0]) ? 0 : -1) if values.length == 1

  # Try the bottom half first
  pos = chop(target, values[0, values.length/2])
  return pos if pos != -1

  # Then the upper half ... remember that the returned
  # position is relative to the middle of the array.
  pos = chop(target, values[values.length/2, values.length-1])
  return pos + (values.length/2) if pos != -1

  # Didn't find what we were looking for
  return -1
end

Iterative solution …

# Iterative implementation
def chop(target, values)
  start = 0
  stop = values.length  # stop indexes one past the end of the array

  # loop until something is found or until we
  # run out of elements to search
  while start != stop
    # Find the middle
    mid = start + (stop - start) / 2

    # Check to see if we got lucky
    return mid if values[mid] == target

    # Not this time ... so lets see which half it might be in
    if target < values[pos]
      stop = mid
    else
      start = mid + 1
    end
  end

  # Didn't find what we were looking for
  return -1
end

Post to Twitter Tweet This Post Post to Digg Digg This Post Post to Reddit Reddit Post to StumbleUpon Stumble This Post

ScrumButs

September 4th, 2009

scrumbut [skruhmbut] noun.
1. A person engaged in only partial Agile project management or development methodologies
2. One who engages in either semi-agile or quasi-waterfall development methodologies.
3. One who adopts only SOME tenants of the SCRUM methodology.
4. In general, one who uses the word “but” when answering the question “Do you do SCRUM?”

ScrumButs are the best part of SCRUM makes a great argument that “optimal behavior of a team cannot be predicted” and so retrospectives may be used to tweak SCRUM to optimize it for a particular team. At last a refreshing break from the SCRUM zealots who believe that any deviation from SCRUM is bad and that if you are not doing pure SCRUM then you are not doing SCRUM.

It is worth highlighting that there is a great emphasis on using retrospectives to drive positive optimizations. Pure SCRUM should be adhered to for an iteration or two and then use retrospectives to adjust as necessary for a particular team. The need for an adjustment highlights an aspect of the team behavior that may be different from that envisaged by SCRUM. The team should decide if that behavior is good and should be kept (via a ScrumBut) or if the behavior is bad and if the team should think about a change. Either way the focus is on optimizing the team behavior and this is usually good.

Post to Twitter Tweet This Post Post to Digg Digg This Post Post to Reddit Reddit Post to StumbleUpon Stumble This Post

Why I like Ruby …

August 19th, 2009
2.times { deck.shuffle }

is much more expressive than

for(i=0; i<2; ++i)
deck.shuffle

Post to Twitter Tweet This Post Post to Digg Digg This Post Post to Reddit Reddit Post to StumbleUpon Stumble This Post

Hindsight and the 80/20 rule

July 31st, 2009

Good perspective on the 80/20 rule and the obvious benefit of hindsight!

It’s easy to be cynical about the 80/20 rule. There are too many hucksters selling books and consulting services that boil down to saying “concentrate on what’s most productive.” Thanks. Never would have thought of that. Let me write you a check.

At one extreme is the belief that everything is equally important, or at least equally likely to be important. At the other extreme is the belief that 80/20 principles are everywhere an that it is possible to predict the “20″ part. Reality lies somewhere between these extremes, but I believe it is often closer to the latter than we think. In many circumstances acting as if everything were equally important is idiocy or sloth.

You can improve your chances of correctly guessing which activities are going to be most productive. Nobody is going to write a book that tells you how to do this in your particular circumstances. It takes experience and hard work. But you can get better at it over time.

Can you predict the “20″ in 80/20?

Post to Twitter Tweet This Post Post to Digg Digg This Post Post to Reddit Reddit Post to StumbleUpon Stumble This Post

Update: STL vs Gnulib Performance

June 28th, 2009

Following up on a previous post about the relative performance of STL vs gnulib, this post extends and completes the analysis to include the stl::vector and stl::deque containers. These containers provide a more direct comparison against the Gnulib array-list and carray-list implementations. The results are shown below. Please refer to the previous post on this subject for details on how the results were collected including source code.

DISCLAIMER: Please note that the intent of the post is not to suggest that gnulib should be used instead of STL in C++ programs, but rather to point out that although not part of the C language specification, gnulib offers C programmers a set of containers whose performance is comparable to the STL container performance available to C++ programmers.

Push back (insert at end) performance

Push back (insert at end) performance

Search Performance

Search (unsorted) Performance

Note that the vertical axis represents seconds and the lower the bar for each algorithm, the better.

Based on the above results, it can be seen that the performance of the stl::list is comparable to the gnulib linked_list for both insertions (at the end) and unsorted searching. Gnulib array_list is comparable in implementation to the stl::vector but proves to be almost 2x faster for insertions and ~1.5x slower for searching for this test. The stl::deque is also comparable to the gnulib array_list and proves to be slightly faster for insertions and just under 3x slower for searching.

As a final note, this analysis only performed comparisons based on two tests – push back and unsorted find. This is just an indication of performance and far from complete. The results may well be different if different tests were chosen.

Post to Twitter Tweet This Post Post to Digg Digg This Post Post to Reddit Reddit Post to StumbleUpon Stumble This Post