![]() |
Confusing code is bad code. |
In this post I'll be talking about one of the worst coding practices: using nondescript variables. As an example, we'll be looking at a simple problem from Project Euler (#28). I'll give you an example using really bad code and then we'll look at how we can decipher it and make it more clear and understandable.
First, here's the problem:
21 22 23 24 25
20 7 8 9 10
19 6 1 2 11
18 5 4 3 12
17 16 15 14 13
It can be verified that the sum of the numbers on the diagonals is 101.What is the sum of the numbers on the diagonals in a 1001 by 1001 spiral formed in the same way?
Ok, so now check out this solution:
int i = 1; int j = 1; int s = 0;
int m = 0; int e = 0;
while (j - i + 1 < 1002)
{
s += j; i = j;
e += 1; if (e == 5) { e = 1; }
m = (e == 1) ? m + 1 : m;
j = i + 2 * m;
}
stopwatch.Stop();
Console.WriteLine("s: " + s);
Do you see the problem with this code? Maybe you're some kind of genius who can immediately understand what this method is doing, but most people can't. I'm even including myself in that group, and I'm the one who wrote this. The problem is, I wrote it two or three months ago, and now those five variables have zero meaning to me. Actually, I can tell what the s is, because it's giving me the answer to the problem, so it has to be the sum of the numbers on the diagonals in a 1001 by 1001 spiral.
By the way, the problem with the code above isn't mathematical. The method produces the correct answer. But if we ever wanted to use that code again and tweak it to make it work in a wider variety of situations, we would first have to go through and try to understand it again. If the method had been written better the first time, we would have saved ourselves the headache of deciphering it in the future.
So now let's figure out what the method does and see if we can make it easier to read. I'm going to start by methodically dissecting this mess, which hopefully will give you some pointers about how to interpret bad code, but if you want to skip to our finished code, just go to the last section.
We've already said that s is the sum of the numbers on the diagonals. The initial value of s is 0. Each loop of the while loop, s gets incremented by j. Let's rename this variable sumDiagonalNums. Some people might shorten this to sumDiagonals or something like that. It's up to you, just make sure the name clearly conveys the purpose of the variable.
Now we need to know what j represents. The initial value of j is 1. Each loop of the while loop, j is reset to equal i + 2*m. We can assume what j represents based on the fact that sumDiagonalNums is incremented by j every loop. Therefore, j must be the value of the next number laying on one of the spiral's diagonals. Let's rename this variable diagonalNum.
diagonalNum equals i + 2*m. Let's try to figure out what i and m are. The initial value of i is 1 and each while loop, it gets reset to equal the previous loop's diagonalNum. So, i can be renamed lastDiagonalNum. What about m? The initial value of m is 0. For each while loop, if e = 1, then m is incremented by 1. Otherwise, it retains its current value. We also know that 2*m is equal to diagonalNum - lastDiagonalNum. Therefore, m = (diagonalNum - lastDiagonalNum)/2. You might expect m to be a fraction sometimes, if it is always exactly half the difference between the two diagonals, but in fact if you look at the spiral pattern in the problem example, you will see that the difference between each diagonal number is an even number. This enables us to define m as an integer instead of a float. Let's rename this variable halfDiagonalDifference.
All that remains is that pesky e. This variable starts at 0, gets incremented by 1 each while loop, and once it reaches 5, it gets reset to 1. In other words, e is constantly cycling 1-2-3-4-1... If e = 1, then halfDiagonalDifference is incremented by 1. This only makes sense if we look again at the example spiral that was provided us in the problem. After every 4 corners/edges in the spiral, the difference between the diagonal numbers increases by 2, or, half the difference increases by 1. Let's rename e as edge.
We now have the following code, which is a little easier to understand:
int lastDiagonalNum = 1;
int diagonalNum = 1;
int sumDiagonalNums = 0;
int halfDiagonalDifference = 0;
int edge = 0;
while (diagonalNum - lastDiagonalNum + 1 < 1002)
{
sumDiagonalNums += diagonalNum;
lastDiagonalNum = diagonalNum;
edge += 1;
if (edge == 5) { edge = 1; }
halfDiagonalDifference = (edge == 1) ?
halfDiagonalDifference + 1 : halfDiagonalDifference;
diagonalNum = lastDiagonalNum + 2 * halfDiagonalDifference;
}
stopwatch.Stop();
Console.WriteLine("Sum of Diagonal Numbers: " + sumDiagonalNums);
Maybe you're like me though, and this still looks a little messy. We can clean this up by working with the full diagonal difference instead of half, reducing the while statement, reducing some of our conditional statements, rearranging a couple other statements and changing the initial values of a couple of our variables.
The final product can look something like this:
int lastDiagonalNum = 1;
int diagonalNum = 1;
int sumDiagonalNums = 1;
int diagonalDifference = 2;
int edge = 1;
while (diagonalDifference < 1001)
{
diagonalNum = lastDiagonalNum + diagonalDifference;
sumDiagonalNums += diagonalNum;
lastDiagonalNum = diagonalNum;
edge += 1;
if (edge == 5)
{
edge = 1;
diagonalDifference += 2;
}
}
stopwatch.Stop();
Console.WriteLine("Sum of Diagonal Numbers: " + sumDiagonalNums);
By the way, not only does this last version of our method look cleaner, it runs about 25% faster than the original. One side take-away is that reducing the number of conditional statements and reducing the number/complexity of calculations performed will always give us faster performance. My guess is that the reducing the number of conditional statements from two to one accounts for the majority of the acceleration.
Comments
Post a Comment