Skip to main content

How NOT To Write Code / How To Decipher Bad Code

Related image
Confusing code is bad code.


In this post I'll be talking about one of the worst coding practices: using nondescript variables. As an example, we'll be looking at a simple problem from Project Euler (#28). I'll give you an example using really bad code and then we'll look at how we can decipher it and make it more clear and understandable.

First, here's the problem:

Starting with the number 1 and moving to the right in a clockwise direction a 5 by 5 spiral is formed as follows:
21 22 23 24 25
20  7  8  9 10
19  6  1  2 11
18  5  4  3 12
17 16 15 14 13
It can be verified that the sum of the numbers on the diagonals is 101.
What is the sum of the numbers on the diagonals in a 1001 by 1001 spiral formed in the same way?

Ok, so now check out this solution:


            int i = 1; int j = 1; int s = 0;
            int m = 0; int e = 0;
            while (j - i + 1 < 1002)
            {
                s += j; i = j;
                e += 1; if (e == 5) { e = 1; }
                m = (e == 1) ? m + 1 : m;
                j = i + 2 * m;
            }
            stopwatch.Stop();

            Console.WriteLine("s: " + s);

Do you see the problem with this code? Maybe you're some kind of genius who can immediately understand what this method is doing, but most people can't. I'm even including myself in that group, and I'm the one who wrote this. The problem is, I wrote it two or three months ago, and now those five variables have zero meaning to me. Actually, I can tell what the s is, because it's giving me the answer to the problem, so it has to be the sum of the numbers on the diagonals in a 1001 by 1001 spiral. 

By the way, the problem with the code above isn't mathematical. The method produces the correct answer. But if we ever wanted to use that code again and tweak it to make it work in a wider variety of situations, we would first have to go through and try to understand it again. If the method had been written better the first time, we would have saved ourselves the headache of deciphering it in the future. 

So now let's figure out what the method does and see if we can make it easier to read. I'm going to start by methodically dissecting this mess, which hopefully will give you some pointers about how to interpret bad code, but if you want to skip to our finished code, just go to the last section.  

We've already said that s is the sum of the numbers on the diagonals. The initial value of s is 0. Each loop of the while loop, s gets incremented by j. Let's rename this variable sumDiagonalNums. Some people might shorten this to sumDiagonals or something like that. It's up to you, just make sure the name clearly conveys the purpose of the variable. 

Now we need to know what j represents. The initial value of j is 1. Each loop of the while loop, j is reset to equal i + 2*m. We can assume what j represents based on the fact that sumDiagonalNums is incremented by j every loop. Therefore, j must be the value of the next number laying on one of the spiral's diagonals. Let's rename this variable diagonalNum.

diagonalNum equals i + 2*m. Let's try to figure out what i and m are. The initial value of i is 1 and each while loop, it gets reset to equal the previous loop's diagonalNum. So, i can be renamed lastDiagonalNum. What about m? The initial value of m is 0. For each while loop, if e = 1, then m is incremented by 1. Otherwise, it retains its current value. We also know that 2*m is equal to diagonalNum - lastDiagonalNum. Therefore, m = (diagonalNum - lastDiagonalNum)/2. You might expect m to be a fraction sometimes, if it is always exactly half the difference between the two diagonals, but in fact if you look at the spiral pattern in the problem example, you will see that the difference between each diagonal number is an even number. This enables us to define m as an integer instead of a float. Let's rename this variable halfDiagonalDifference

All that remains is that pesky e. This variable starts at 0, gets incremented by 1 each while loop, and once it reaches 5, it gets reset to 1. In other words, e is constantly cycling 1-2-3-4-1... If e = 1, then halfDiagonalDifference is incremented by 1. This only makes sense if we look again at the example spiral that was provided us in the problem. After every 4 corners/edges in the spiral, the difference between the diagonal numbers increases by 2, or, half the difference increases by 1. Let's rename e as edge. 

We now have the following code, which is a little easier to understand: 

            int lastDiagonalNum = 1;
            int diagonalNum = 1;
            int sumDiagonalNums = 0;
            int halfDiagonalDifference = 0;
            int edge = 0;
            while (diagonalNum - lastDiagonalNum + 1 < 1002)
            {
                sumDiagonalNums += diagonalNum;
                lastDiagonalNum = diagonalNum;
                edge += 1;
                if (edge == 5) { edge = 1; }
                halfDiagonalDifference = (edge == 1) ?
                halfDiagonalDifference + 1 : halfDiagonalDifference;
                diagonalNum = lastDiagonalNum + 2 * halfDiagonalDifference;
            }
            stopwatch.Stop();
            Console.WriteLine("Sum of Diagonal Numbers: " + sumDiagonalNums);

Maybe you're like me though, and this still looks a little messy. We can clean this up by working with the full diagonal difference instead of half, reducing the while statement, reducing some of our conditional statements, rearranging a couple other statements and changing the initial values of a couple of our variables. 

The final product can look something like this: 

            int lastDiagonalNum = 1;
            int diagonalNum = 1;
            int sumDiagonalNums = 1;
            int diagonalDifference = 2;
            int edge = 1;


            while (diagonalDifference < 1001)
            {
                diagonalNum = lastDiagonalNum + diagonalDifference;
                sumDiagonalNums += diagonalNum;
                lastDiagonalNum = diagonalNum;
                edge += 1;


                if (edge == 5)
                {
                    edge = 1;
                    diagonalDifference += 2;
                }        

     
            }


            stopwatch.Stop();
            Console.WriteLine("Sum of Diagonal Numbers: " + sumDiagonalNums);

By the way, not only does this last version of our method look cleaner, it runs about 25% faster than the original. One side take-away is that reducing the number of conditional statements and reducing the number/complexity of calculations performed will always give us faster performance. My guess is that the reducing the number of conditional statements from two to one accounts for the majority of the acceleration.



Comments

Popular posts from this blog

Book Review: Learn WPF MVVM - XAML, C# and the MVVM pattern by Arnaud Weil

Last week I was looking for inexpensive ways to learn about applying the MVVM pattern to WPF programming (I still am; please leave any tips in the comments section) and ran across Arnaud Weil's $10 digitial book on Amazon. It is brief, less than two hundred pages, but obviously economical and I was impressed with how interactive it was. Pros: The book teaches the user how to design a XAML page, how to use data binding, and how to set up the Model-View-View Model structure to conform to commonly accepted standards. The feature that impressed me the most was that Weil has provided a number of interactive exercises to teach you how to create a WPF-MVVM application. After each block of material, there are "Now it's your turn to code:" sections in which you are expected to create a XAML control, write a method, bind a control to a data object, etc. Afterward there is a "Solution" section in which Weil leads you step-by-step through the assignment, in ca...

Project Euler Problem #19 Counting Sundays

Spoiler Alert: As the About page of Project Euler says, "Real learning is an active process and seeing how it is done is a long way from experiencing that epiphany of discovery." If you are currently attempting to solve this problem in the archives, please do not continue to read this post. I personally worked through all the problems I post on this blog and it has brought me immense satisfaction.  Problem: You are given the following information, but you may prefer to do some research for yourself. 1 Jan 1900 was a Monday. Thirty days has September, April, June and November. All the rest have thirty-one, Saving February alone, Which has twenty-eight, rain or shine. And on leap years, twenty-nine. A leap year occurs on any year evenly divisible by 4, but not on a century unless it is divisible by 400. How many Sundays fell on the first of the month during the twentieth century (1 Jan 1901 to 31 Dec 2000)? Solution:  Although this problem sounds a little ...