Whiteboarding, Algorithms, Strategy, and Simplified Time Complexity

Blake Rogers
12 min read · Sep 11, 2019

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” — Donald Knuth

In my work as a consultant I wear a lot of hats, from CTO to recruiter to developer. I have to stay sharp, and one way is practicing coding computer science problems.

I went to a Whiteboarding and Algorithms Workshop Meetup at Noisebridge Hackerspace, a workshop for job-seekers on coding at a whiteboard. Whiteboarding is part of the technical interviewing process at many top technology companies, alongside skill-assessment cloud tools like LeetCode and HackerRank.

The purpose of the whiteboarding workshop is to help job-seekers with their technical interviews, virtual or in-person, by improving their public speaking skills and their ability to code an algorithm at a whiteboard while building rapport with their audience. The other programmers can then discuss each solution and recommend improvements.

The “whiteboard” in a real technical interview is often a cloud-based coding quiz in an environment like HackerRank or LeetCode, or just a shared Google Doc. One of the purposes of HackerRank, other than programming competitions, is algorithm coding practice. So far I’ve solved around 75 programming problems on HackerRank, and I’m proud to be ranked #21,696.

Most attendees were programmers, there to improve their game. There was a recruiter from Lyft; the secret is out. I recommend that hiring managers, project managers, and recruiters frequent programming workshops to connect with talent and learn from the discussion how algorithms, performance, and optimization apply to their work. I was there to meet the competition and see the talent, often the same people.

Fig. 1: Computer scientist whiteboarding at Microsoft

Source: https://www.microsoft.com/en-us/research/wp-content/uploads/2014/03/Lamport-at-a-whiteboard.jpg

What’s an Algorithm?

An algorithm can be considered a recipe that the computer follows to do something. The algorithm runs at a certain speed, and performance is something we can all agree on. Each of our solutions, once deemed functional, is judged like a race car on how fast it runs. When code runs on a computer, it takes a certain amount of time to process each command. The code’s memory footprint, how much memory space the code takes up, is also part of performance. The goal is the fastest, leanest code, and code runs fast when the programmer is skilled at choosing the right tools for the job.

Algorithms are Strategic

Managers need to make decisions about performance goals that the engineers then try to meet. This could be processing a certain number of users per second, a page-loading time, or searching through ten million records. The engineers can then test their code and see if they are meeting the requirements.

Along the way, managers have to be able to decide whether improving an algorithm is worthwhile, based upon input from the engineers. If something critical can be doubled in speed, that’s a good thing, but not necessarily meaningful over the long haul. What if it can go even faster? The code’s performance metrics (numbers) have to be understood on both sides of the table. This topic is called time complexity, and I’ll discuss some easy shortcuts for ascertaining an algorithm’s performance.

Software performance includes tradeoffs: sometimes you code something that runs fast when there aren’t very many users, and then it crashes when it scales. Other code is built for high capacity, and is only efficient and worth that capacity when there actually are lots of users. Elastic cloud providers like Amazon’s Elastic Compute Cloud (EC2) are a great solution, because capacity scales with your requirements.

Whiteboarding at the hackerspace

The meetup organizer selected programming problems from LeetCode. LeetCode questions are compiled from actual questions asked at companies like Facebook, Google, Apple, and Uber. According to one recruiter at the workshop, candidates upload the questions after their interviews.

The engineers at the hackerspace meetup were probably all good. They seemed pretty fluent in computer science and math. One knew statistics. I didn’t know who were regulars in the whiteboarding meetup, or who were newcomers like myself.

I was nervous about presenting, not because presenting itself rattles me but because it was my first time there. Probably others were nervous about presenting too; after all, one of the goals of the workshop was to improve confidence at the whiteboard. The worst scenario would be that you really bombed on a problem, and the other programmers would look at you sadly, like you couldn’t program your way out of a wet paper bag.

We moved to the coding and code criticism pretty quickly, I noticed. Improving our presentation style and explaining what we were thinking mostly went out the window. It’s a common theme, actually, to jump in quickly:

The conversation

Technical Recruiters: Talk me through your code.

Programmers: I know the answer, I’ll just charge through it, or I’ll figure it out as I go.

There were a few memorable characters. A young, confident guy worked in embedded systems, meaning devices with small processors and little memory. He was low-level and had thought of every use case and edge case. No one criticized his approach, but I don’t know if everyone understood him. A woman arrived and said she wanted to do Red-Black Trees, harder than the problems we were working on.

I didn’t whiteboard anything in the workshop; others volunteered, and I was ambivalent about jumping in right away. I took the first couple of coding problems home and worked on them there. I had seen some other solutions, so I had a small advantage.

It was a sunny Sunday, and we were near Dolores Park. I gave the low-level guy my card and left the workshop.

Problem: Finding two numbers

To keep it simple I’ll only go over the Two Sum problem, which is a helpful example because my first version was slow, and I made a large performance optimization.

Two Sum: Find two numbers in a list that add up to some target number

Fig. 2 Two Sum — Find two numbers where X+Y=Z

So you can see what kind of terminology and specifics we are dealing with in the problem definition itself. The input is a list of numbers, and we want to find two numbers in it that add up to another number they give us.

A methodology

“The 80/20 Rule says 80% of the failures come from not reading 20% of the instructions.” — Me

When I am off, it’s usually because I misunderstood the problem, transposed something, or didn’t read the instructions well enough. Here are four steps I made up, as hints for moving through the process: hear, evaluate, reflect, and solve (HERS).

  1. Hear the problem definition and listen to the details. Re-read the instructions if it’s a written problem. What are the edge cases?
  2. Evaluate the problem and think to yourself after reading and listening. Is there a pattern? Have you seen other problems like it? Is it a search problem? What is its signature? What does the input/output look like?
  3. Reflect, confirm, clarify and pseudocode your solution. Pseudocode is your normal language description of the process.
  4. Solve the problem — translate your pseudocode into real code

Pseudocode Example

  1. Look through all the numbers in the list, and for each number do the following:
  2. Search for a second number (the complement) in the same list as the first, starting at the location of the first number.
  3. If the complement is found, quit searching and give us the locations of both numbers in the list.
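
Translated into Python, the pseudocode might look like the sketch below. The function name is mine, and I’m reading “starting at the location of the first number” as starting just after it, since the same element can’t be used twice. Note that the outer loop keeps running even after a match is found (more on that in a moment).

```python
# A rough translation of the pseudocode; names are mine, not LeetCode's template.
def two_sum_v1(nums, target):
    result = []
    for i in range(len(nums)):               # 1. look through all the numbers
        for j in range(i + 1, len(nums)):    # 2. search for the complement
            if nums[i] + nums[j] == target:
                result = [i, j]              # 3. record both locations...
                break                        # ...and quit the inner search
    return result

print(two_sum_v1([2, 7, 11, 15], 9))  # [0, 1]
```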

Initial Results: program accepted

Fig 3: Two Sum Question Results

My first approach succeeded and met LeetCode’s minimum performance, but it was still at the low end of accepted solutions, in the 14th percentile, whereas my memory footprint was at the 98th percentile. See my LeetCode Runtime Distribution chart:

https://leetcode.com/submissions/detail/249537193/

There is one fairly obvious flaw in my program: my code breaks out of the inner loop, but the outer loop keeps going until it has gone through all the numbers. Fixing that improves my LeetCode ranking by 10 percent. The code is still slow, and my LeetCode ranking confirms it.
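
My submitted code isn’t shown here, so this is only a sketch of that kind of fix: returning as soon as the pair is found exits both loops at once, instead of letting the outer loop run to the end of the list.

```python
def two_sum_v2(nums, target):
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return [i, j]  # return exits both loops immediately
    return []                  # no pair found

print(two_sum_v2([2, 7, 11, 15], 9))  # [0, 1]
```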

Simplified Time Complexity

As the saying goes, there is more than one way to skin a cat.

Right now would be a good time to discuss my application’s dismal performance, and I’ll do that using the science of Time Complexity, simplified.

Don’t be intimidated by Time Complexity!

To be on the safe side, the model presumes maximum looking: the worst case, where the algorithm finds a match only at the very end.

For a list of 10 numbers, each of the 10 passes through the outer loop makes up to 10 inner comparisons: 10 + 10 + 10, ten times over, for 100 comparisons.

Because an input of 10 results in 100 comparisons (10 times 10), the algorithm can be said to have N squared runtime.

Pseudocode

“Search for a second number (the complement) in the same list as the first, starting at the location of the first number.”

This means that if the algorithm is on the third number in the list, it only compares numbers after the third number. So the algorithm doesn’t go back and inspect numbers it has already compared, which cuts the total comparisons roughly in half. They can be counted manually:

Comparison Iterations

  1. First pass, number one: compares against remaining 9 numbers.
  2. Second pass, number two: compares against remaining 8 numbers.
  3. Third pass, number three: compares against remaining 7 numbers.

The comparisons as the algorithm runs are 9+8+7+6+5+4+3+2+1, for a total of 45 (round to 50), which looks like twice as fast as the example above with 100 comparisons. You can see how the first pass takes the longest, and the last pass, when almost all the numbers have been checked, takes almost nothing because it’s only one comparison. The average number of comparisons happens in the middle of the list, when the program has checked exactly half the numbers.
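
A quick sanity check of those counts (my own snippet, not something from the workshop):

```python
n = 10
full_scan = n * n                              # the 10 x 10 model above
no_revisit = sum(n - i - 1 for i in range(n))  # 9 + 8 + ... + 1 + 0
print(full_scan)         # 100
print(no_revisit)        # 45
print(n * (n - 1) // 2)  # 45, the closed form N(N-1)/2
```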

N squared has more power

We calculated that the first algorithm, checking all the possible pairs for a list of 10, makes 100 comparisons, or N squared. Based on the count of 45 (call it 50) above, and because 50 is half of 100, one would conclude with some basic math that this version has a Time Complexity of N squared/2 (10*10/2). But mysteriously, that is not the case. Cutting the work in half turns out not to be statistically meaningful at scale. Why?

Because, to make a mathematical pun: N squared has more power. We want the factor with the strongest correlation to performance as load increases.

Basic runtime: maximum comparisons

Input -> Runtime (magnitude increase over previous runtime)

10 -> 100

100 -> 10,000 (100X = N squared)

1,000 -> 1,000,000 (100X = N squared)

Slightly optimized

Input -> Runtime (magnitude increase over previous runtime)

10 -> 50

100 -> 5,000 (100X = N squared)

1,000 -> 500,000 (100X =N squared)
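
If you would rather not do this arithmetic by hand, a few lines of Python reproduce both tables:

```python
# Worst-case comparison counts for the two brute-force variants.
for n in (10, 100, 1_000):
    print(n, n * n, n * n // 2)  # input, N squared, N squared / 2
# 10 100 50
# 100 10000 5000
# 1000 1000000 500000
```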

Time Complexity analysis can help predict an algorithm’s performance at scale, and shows how a linear increase in inputs causes a quadratic increase in running time: while inputs increase 10 times, the time required to run the comparisons increases 100 times.

https://en.wikipedia.org/wiki/Big_O_notation

Premature Optimization

Donald Knuth’s rule says that 3 percent of an application is performance critical. There’s also the 80/20 rule, which says 80 percent of effects come from 20 percent of the causes; by that logic, 20 percent of the application is performance critical and 80 percent is not. So whether it’s 3 percent or 20 percent, we still want to focus on the parts of the application that are performance critical.

Improving the Two Sum solution

There is much room for improvement, but how can I turbocharge my code? First I tweak it a little, breaking out of a loop earlier, which improves my ranking by 10 percent, to the 23rd percentile. Something is still wrong with my code: it works, but it is still a brute-force approach, and brute force looks through all the possibilities, or close to it. Still, my solution was accepted, and around half of the submissions were rejected. That makes me feel better about the 23rd percentile.

I’m not going to hit the books just yet. The number one method for finding a better answer is to Google somebody else’s example. Here’s a solution on YouTube that its author claims is in the 99th percentile. I take the hint and do something like it, which you can see below.

https://www.youtube.com/watch?v=Aql6zHkONek

Optimization Results

My runtime ranking improves to close to the 99th percentile.

Fig 4: Two Sum Question Results (optimized version, technique hint from YouTube video)

Why is the second version faster? Because this solution uses a hashmap structure to keep a map of the locations and values from the list.

The coder does this using a trick: the key and value are swapped, so that the second number (the one the algorithm is searching for) goes in the key position and its index in the value position (Fig. 4, line 12). This way the second number can be retrieved from the hashmap (Fig. 4, lines 7–8).

The code has a Time Complexity of Big O(N), which means the algorithm goes through the list of N numbers only once. The trick is that the input numbers are stored in the hashmap as the program goes through the list, so at each step it can check the hashmap for a complement that was put in earlier. Let’s look at how that works, with finding a match for the third number.

Comparisons

  1. Number one: checks the hashmap for a match; none found, so puts an entry in the hashmap.
  2. Number two: checks the hashmap for a match; none found, so puts an entry in the hashmap.
  3. Number three: checks the hashmap for a match, and finds one.

In this case, we’ve found the value during the third pass. To be on the safe side, the Time Complexity model presumes maximum looking, that the algorithm finds a match at the end. Even then, the algorithm goes through the list only once, doing a maximum of N (10) comparisons.
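
The Fig. 4 code itself isn’t reproduced in this post, so here is a sketch of the technique as described: the complement goes in the key position, the index in the value position, and each number is checked against the map before an entry is added.

```python
# A sketch of the hashmap approach (not the exact code from Fig. 4).
# Key = the complement still being searched for; value = the index
# of the number that needs it.
def two_sum_hashmap(nums, target):
    needed = {}                      # complement -> index that needs it
    for i, num in enumerate(nums):
        if num in needed:            # an earlier number was waiting for this one
            return [needed[num], i]
        needed[target - num] = i     # record the complement we now need
    return []

print(two_sum_hashmap([2, 7, 11, 15], 9))  # [0, 1]
```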

The first version of the algorithm potentially had to make close to the maximum of 100 comparisons, or N squared.

Optimized version with single array loop and hash map

Input -> Runtime (magnitude increase)

10 -> 10

100 -> 100 (no increase in magnitude)

1,000 -> 1,000 (no increase in magnitude)

Predict and compare algorithm performance

Comparing algorithms now becomes easier, because we can compare their Time Complexity and then predict their performance profile and the load on the system. Time Complexity analysis helps predict an algorithm’s performance at scale; for Big O(N), for instance, it shows how a linear increase in inputs causes only a linear increase in running time.
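
To see that prediction play out, here is a quick benchmark sketch of my own (the helper names are mine, and the absolute numbers will vary by machine). An impossible target forces both versions to do their maximum amount of looking:

```python
import random
import timeit

def brute(nums, target):                   # the N squared version
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return [i, j]
    return []

def hashed(nums, target):                  # the Big O(N) hashmap version
    needed = {}
    for i, num in enumerate(nums):
        if num in needed:
            return [needed[num], i]
        needed[target - num] = i
    return []

nums = random.sample(range(10_000_000), 10_000)
target = -1                                # no pair of non-negative numbers sums to -1
for fn in (brute, hashed):
    seconds = timeit.timeit(lambda: fn(nums, target), number=1)
    print(fn.__name__, f"{seconds:.3f}s")  # brute takes seconds, hashed milliseconds
```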

LeetCode’s Sample Solutions

https://leetcode.com/problems/two-sum/solution/

Criticisms of Whiteboarding

Whether or not you’re used to having a manager or lead engineer look over your shoulder while you code, whiteboarding is tough because it’s done under conditions in which the programmer doesn’t normally work. There is an intentional anxiety aspect to coding in real time, which is obviously there to see how you do under pressure.

While whiteboarding, engineers don’t have the advantage of working in an integrated development environment (IDE) with code lookup, they can’t look up functions on the web, and most importantly they can’t test the code while writing it. This is a good reason to simulate whiteboarding by coding in a text editor instead of an IDE.

Whiteboarding has been criticized for not being relevant to many enterprise integration roles, for example in Fintech, and for being part of a strategy to hire recent computer science graduates. You can throw a dart and hit a news article about how the face of Silicon Valley is overwhelmingly young and male.

Fig 5: Median employee age at top tech companies (Statista)

Source https://www.businessinsider.com/median-tech-employee-age-chart-2017-8

Whiteboard Discontent

Should we let older programmers off the hook for not remembering their computer science from their college years, and losing out to hard-hitting recent grads?

https://news.efinancialcareers.com/uk-en/287595/hackerrank-tests-banking/

Are whiteboarding questions ageist programming calisthenics? What if you’ve been building enterprise web applications for years, and never needed to write a binary search or merge sort from scratch?

More discontent

https://news.ycombinator.com/item?id=15182952

https://news.ycombinator.com/item?id=8974477

Benefits of Whiteboarding

Whiteboarding tests candidates’ computer-science knowledge and problem-solving ability, and gives companies a quantitative ranking of candidates.

Whiteboarding in person also gives engineers practice sharing ideas with groups, and can help them overcome the anxiety of public speaking and working under pressure.

Doing programming problems on HackerRank or LeetCode is good for anyone wanting to maintain or improve their CS chops. One side benefit is that it makes your technique more robust, because HackerRank and LeetCode run your code against substantial test data.
