Fundamentals of Algorithm Design

According to Merriam-Webster an algorithm is:

“a procedure for solving a mathematical problem (as of finding the greatest common divisor) in a finite number of steps that frequently involves repetition of an operation
broadly : a step-by-step procedure for solving a problem or accomplishing some end”.

Which is, in essence, close enough. So, algorithm design would be the process of devising such a solution or procedure, which in itself is a branch of our development process. But how do we approach such a process? Where do we start?

A quick disclaimer – this post is meant to lay down what is required when we approach a situation that needs to be solved by a sequence of calculations or actions – the algorithm – rather than the calculations themselves, so in that sense I am not going to discuss the different solutions for the traveling salesman problem here.
We must make a clear distinction between the cases where the Algo is the product and where the Algo is another systematic need. When the Algo is the product, for example an LLM (large language model) or an advanced image processing method, the systematic limitations would be the Algo itself and we would have to build the rest of the system’s sub-modules around the Algo. These are unique cases – it is not every day we come up with algorithms that are on their own a product. Most of this post’s content refers to the other and more common cases where the Algo is there to solve another step in the engineering path of the technical solution.

Algorithm Development Flow

A correct development flow for an algorithm is as follows:

Requirements & Definitions
Resource and Data-Flow Identification
Theoretical Algorithm Design & Proof of Concept
Data Collection
Initial Implementation
Loop Data Collection ↔ Fine Tuning of the Algo

1 – Requirements & Definitions

As in every R&D project or task, we must know what the task at hand is, where it should be implemented and why, and what timeline we have to commit to. See below about the problem statement, as this is sometimes not clear enough when dealing with different kinds of algorithms.

2 – Resource and Data-Flow Identification

If it is a chemical model algorithm you would probably need a chemist by your side, and if the data for the algorithm design has to be collected in a clinic with human subjects, this has to be mapped out in advance to keep the development flow smooth. See below about resources and data-flow. Note that the review of steps 1 & 2 may be combined into the same review.

3 – Theoretical Algorithm Design & Proof of Concept (POC)

Share with the external team what your plan for this Algo is and, if possible, also run a quick POC and share the results with others. Let everyone get a feel for what the solution is going to look like. This serves two purposes: first, to verify that the requirements are indeed what the product needs, and second, to catch fundamental scientific errors. For example, I once reviewed an Algo proposal with an engineer only to find out that he was looking for a signal in a sensor that the signal could not optically reach.

4 – Data Collection

That one goes without saying. Building an Algo requires data to test. In the AI and neural-network realm the data-collection is even more crucial since the training of the models is based on the quality and amount of data provided for the development.

5 – Initial Implementation

This is where the bulk of the development work is done. After a few months of work, it is time to review what was done.

6 – Loop Data Collection ↔ Fine Tuning of the Algo

As professional and proficient as you may be at your work, nothing is perfect on the first try. There may be use cases you did not cover in your implementation, new data that presented itself, or requirements tweaked by the System Engineer or the Product Manager. This is a loop of data-collection and fine-tuning of the algorithm that may well continue even after the product has been released.

Now I know it may look like a long and tedious list, especially for small and “obvious” algorithms. However, even for the most complicated and difficult Algo development processes, I have seen the first 3 steps done in less than a week when the Algo and System team members know how to make things happen. Note that you may combine a family of algorithms into a general development process to save review overheads and make it even more efficient. If we go a bit deeper, here is a list of topics to take into consideration during the Algo development process. If they are taken care of, most of the flow described previously should go smoothly and uninterrupted:

Clear Problem Statement

A problem statement in the Algo world means what our input is and what our desired output should look like. Sometimes we may discover while formulating the problem statement that our input data is, a priori, insufficient, which in turn would change the problem statement. Input may be in the form of a series of data points, a digital image or any other kind of information, together with its operating conditions. Operating conditions are paramount: I was once part of a team that tried to solve a very difficult scenario with a very complex Algo, only to realize that we were solving a theoretical case with conditions that could not possibly occur in the final product when used by the client.
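One way to keep the input, the output and the operating conditions together is to write them down as a small data structure that the team can review. The Python sketch below is only an illustration: the class names, the fields and the pressure-to-flow example are all hypothetical and do not come from a real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingConditions:
    """Hypothetical operating envelope the Algo is required to handle."""
    min_temperature_c: float
    max_temperature_c: float
    min_sample_rate_hz: float


@dataclass(frozen=True)
class ProblemStatement:
    """Input, desired output, and the conditions under which both apply."""
    input_description: str
    output_description: str
    conditions: OperatingConditions

    def covers(self, temperature_c: float, sample_rate_hz: float) -> bool:
        """Return True if a scenario falls inside the agreed operating conditions."""
        c = self.conditions
        return (c.min_temperature_c <= temperature_c <= c.max_temperature_c
                and sample_rate_hz >= c.min_sample_rate_hz)


# Illustrative values only.
statement = ProblemStatement(
    input_description="series of pressure samples",
    output_description="estimated flow rate in litres per minute",
    conditions=OperatingConditions(
        min_temperature_c=0.0,
        max_temperature_c=45.0,
        min_sample_rate_hz=100.0,
    ),
)

# A scenario inside the envelope, and one that the product will never see.
print(statement.covers(temperature_c=25.0, sample_rate_hz=200.0))
print(statement.covers(temperature_c=80.0, sample_rate_hz=200.0))
```

Checking a proposed test scenario against such a definition early is a cheap way to notice that the team is about to solve a case outside the product's real operating conditions.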

Quantitative vs. Qualitative

The output of an Algo may be formulated in a quantitative way or a qualitative way; the former is easier to handle. For example, when we have a velocity calculation between a rocket launcher and a moving target, the velocity is a quantitative value measured in distance per unit of time. A qualitative output example would be something like a preventive maintenance alert based on performance. In this case the Algo input would be all the machine’s sensors and performance outputs and the Algo output would be alert/no alert. However, the Algo definition does not clearly set the path between the input and the output, since the machine is operating inside its performance limits and does not suffer from malfunctions.

Definitive vs. Non-Definitive

Another limitation that an algorithm may have, especially nowadays with AI inserted everywhere, is definitive (deterministic) vs. non-definitive (probabilistic) output. A definitive output requirement means that, for a certain input, the Algo must produce the same result 100% of the time: if we repeat the input we must get the exact same answer each time. A probabilistic Algo may give a different output each time, even for the exact same input (try entering “Write me a 5-line song about love” into Claude AI a few times and see what you get :-)). The limitations on non-definitive algorithms sometimes do not come up at the very beginning and surface only when the Algo is already in design review, mainly because nobody knew it was a limitation. If you are about to integrate a probabilistic section in your Algo, check in advance whether such an algorithm is permitted in your application. In medical devices, for example, the integration of non-definitive algorithms is problematic from a regulatory standpoint, so verify with a regulatory affairs expert before starting a deep-dive development.
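The difference can be shown with a toy example. The Python sketch below is an illustration only: the two functions and the repeatability check are made up for this post, and the "probabilistic" function simply adds Gaussian noise as a stand-in for a real probabilistic model.

```python
import random


def definitive_scale(x: float) -> float:
    """Definitive: the same input always yields the same output."""
    return 2.0 * x + 1.0


def probabilistic_scale(x: float, rng: random.Random) -> float:
    """Probabilistic: added Gaussian noise makes repeated calls differ."""
    return 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)


def is_repeatable(func, x, runs=5):
    """Call func(x) several times and report whether all outputs are identical."""
    outputs = [func(x) for _ in range(runs)]
    return all(o == outputs[0] for o in outputs)


unseeded = random.Random()
print(is_repeatable(definitive_scale, 3.0))
print(is_repeatable(lambda x: probabilistic_scale(x, unseeded), 3.0))

# Fixing the seed makes the noisy function reproducible from run to run,
# which may or may not be acceptable for your application.
first = probabilistic_scale(3.0, random.Random(42))
second = probabilistic_scale(3.0, random.Random(42))
print(first == second)
```

A check like `is_repeatable` can be part of the acceptance tests when the requirements call for a definitive output. Whether a fixed seed is enough to satisfy a regulator is a separate question for the regulatory affairs expert.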

Mathematical and Scientific Correctness

The Algo must be correct mathematically (or scientifically, if it is more than just a dataset). If there are cases where the Algo is proven to fail, on paper or experimentally, these must be communicated and explicitly agreed upon as exclusions from the algorithm’s performance scenarios. Neglecting to agree on such exceptions is dangerous when eventually assessing whether the Algo satisfies its requirements, and may even cause the whole project to fail at crunch time because the product cannot be delivered to the clients with those exceptions.

If there are assumptions regarding anything in the algorithm, its inputs, its data-flow, its environment or anything else, these assumptions must be stated as part of the Algo definitions.
The product, in our case the algorithm, must first and foremost work (must be correct)! All the rest is important but is second to the Algo’s correctness.

Run-Time and Throughput Limitations

The essence of this section is how much time we have to finish the Algo calculations before they become obsolete for the system’s data flow. This requirement limits the number of calculations we are allowed to perform over a set of data. Not every Algo definition includes this section, usually for one of two possible reasons: either it is not really a limitation for the system or the System Engineer has overlooked that limitation in the system’s overall budget.
In real-time applications meeting the run-time limit is an absolute must, but if you’re planning an optimization algorithm that is to run overnight, one minute more or one minute less will probably not make a significant difference.
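A run-time requirement is easiest to discuss once it is measured. The Python sketch below is a rough illustration: `fits_time_budget`, the moving average used as a stand-in calculation and the 100 ms budget are all hypothetical. Timing on a development PC only gives a first indication, and the real check has to be done on the target hardware.

```python
import time


def fits_time_budget(func, args, budget_s, repeats=20):
    """Return (ok, worst_s): whether the slowest of `repeats` calls met budget_s."""
    worst_s = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        worst_s = max(worst_s, time.perf_counter() - start)
    return worst_s <= budget_s, worst_s


def moving_average(samples, window):
    """Toy stand-in for the real calculation."""
    return [sum(samples[i:i + window]) / window
            for i in range(len(samples) - window + 1)]


samples = [float(i % 17) for i in range(1000)]

# Hypothetical budget: the result is needed within 100 ms of the data arriving.
ok, worst = fits_time_budget(moving_average, (samples, 8), budget_s=0.1)
print(f"within budget: {ok}, worst case: {worst * 1e3:.3f} ms")
```

The sketch tracks the worst case rather than the average because a real-time system has to meet its deadline on every cycle, not only on a typical one.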

Available Hardware

Resources are always limited, even if sometimes we have significantly more than required (over-budgeted requirements). We must know how many resources are available for our Algo so that we do not develop a 1MB look-up table only to find out eventually that the chip has only 128kB available for us.
There is a significant difference between the complexity and size of an Algo that may run on an NVIDIA Jetson Nano vs. the NVIDIA V100S for PCIe (472 gigaFLOPS vs. 16.4 teraFLOPS; memory of 4GB vs. 32GB etc.), so when planning an Algo we should know whether we have a small pistol or a cannon in our hands.
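The look-up table case is simple arithmetic that is worth doing before any development starts. The Python sketch below is illustrative only: the entry counts, the 4-byte entry size and the 20% reserve for everything else on the chip are assumptions, and the sizes are treated as binary units (1 MiB and 128 KiB).

```python
KIB = 1024


def lut_size_bytes(entries: int, bytes_per_entry: int) -> int:
    """Memory footprint of a flat look-up table."""
    return entries * bytes_per_entry


def fits_in_memory(required_bytes: int, available_bytes: int,
                   reserve_fraction: float = 0.2) -> bool:
    """True if the table fits after keeping a reserve for stack, buffers, etc."""
    usable_bytes = available_bytes * (1.0 - reserve_fraction)
    return required_bytes <= usable_bytes


available = 128 * KIB  # memory the chip leaves for the Algo

full_table = lut_size_bytes(entries=262_144, bytes_per_entry=4)   # 1 MiB
coarse_table = lut_size_bytes(entries=16_384, bytes_per_entry=4)  # 64 KiB

print(full_table // KIB, "KiB fits:", fits_in_memory(full_table, available))
print(coarse_table // KIB, "KiB fits:", fits_in_memory(coarse_table, available))
```

In this made-up case the full-resolution table is eight times larger than the whole memory, while a 16 times coarser table fits with room to spare. The team can then discuss whether the coarser resolution still meets the requirements.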

Remember that available hardware requirements usually go hand-in-hand with the throughput requirements. Be open-minded about it. Sometimes it is effortless and BOM-price insignificant to increase the available resource to accommodate an Algo that otherwise would have become very limited.

Algo Run Location – Locally/Externally Executed Calculations

Although this is a subset of the available hardware for our Algo, I would like to dwell on this particular case a bit. When considering different calculation methods for our solution, it makes a huge difference whether the Algo is going to run locally on our machine or externally on another machine. Running algorithms on external machines whose sole role in the system is to serve as calculators opens the door to quite a few features, such as constant SW updates (without the local machine being interrupted), usually stronger hardware and access to bigger databases; most of the time, though, these calculations will not be real-time. Running everything locally has the advantage of having all the data-flow in real-time and of course the ability to serve the machine in real-time as well.

In many cases the chosen option is derived from the algorithm’s implementation itself and is not given to us as a requirement. I have been in two different projects: in one, the conscious decision was made to do all the image-processing offline on a separate and very powerful PC, while in the other it was decided to use a minimized algorithm with reduced image-processing features, only to keep the algorithm running locally on the main machine without adding another set of hardware to the system.

Synchronization & Data-Flow

In multi-disciplinary systems timing matters. We covered the throughput of the algorithm itself in an earlier section; however, the data-flow into and out of the algorithm is no less important. If an Algo needs input from two sensors, where one arrives at a frequency of 10Hz and the other at 1Hz, we have to find a way for our Algo to deal with this mismatch between the frequencies. Furthermore, this data-flow into the algorithm may limit the actual throughput of the Algo if a fresh pair of samples from sensors A & B has to be taken for each separate calculation. Another case could be when both sensors A and B have the same frequency of 1Hz but their signals arrive at the calculating unit with a 500ms lag between them.
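One simple way to handle such a mismatch is to pair each sample of the slow sensor with the most recent sample of the fast sensor, and to reject the pair if the fast sample is too old. The Python sketch below shows that idea only. The function name, the millisecond timestamps and the age limits are assumptions for illustration, and other strategies (interpolation, buffering, hardware triggering) may fit a given system better.

```python
from bisect import bisect_right


def pair_latest(slow, fast, max_age_ms):
    """Pair each slow sample with the most recent fast sample not newer than it.

    slow, fast: lists of (timestamp_ms, value), sorted by timestamp.
    Pairs whose fast sample is older than max_age_ms are dropped.
    """
    fast_times = [t for t, _ in fast]
    pairs = []
    for t_slow, v_slow in slow:
        idx = bisect_right(fast_times, t_slow) - 1
        if idx < 0:
            continue  # no fast sample has arrived yet
        t_fast, v_fast = fast[idx]
        if t_slow - t_fast <= max_age_ms:
            pairs.append((t_slow, v_slow, v_fast))
    return pairs


# 10Hz sensor (one sample every 100 ms) and 1Hz sensor (one every 1000 ms).
fast = [(i * 100, i) for i in range(30)]
slow = [(k * 1000 + 50, 100 + k) for k in range(3)]
print(pair_latest(slow, fast, max_age_ms=100))

# Two 1Hz sensors whose samples arrive 500 ms apart.
sensor_a = [(0, "a0"), (1000, "a1")]
sensor_b = [(500, "b0"), (1500, "b1")]
print(pair_latest(sensor_a, sensor_b, max_age_ms=100))  # too strict: no pairs
print(pair_latest(sensor_a, sensor_b, max_age_ms=500))
```

Note that in the first case only one calculation per second is possible, since every calculation waits for a fresh sample from the slow sensor. In the second case the maximum allowed age decides whether any calculation can run at all, so that limit belongs in the Algo definitions.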

Real-Time and Offline Algorithms

Real-time algorithms usually have more limitations and, on average, tend to be simpler than offline algorithms.
If we are building a stock-exchange buying/selling app, any smart algorithm we may have must be a real-time one, otherwise we miss the market price; a long-term stock portfolio advisor application, on the other hand, may run its calculations even after stock-exchange working hours. I assume there would be a lot of similarities between two such algorithms; however, the in-depth calculations that we would surely get when running the portfolio advisor app cannot exist in the buying/selling application.

In most cases it is clear without even noting it whether it is an offline or a real-time Algo but not always. If it does not appear in the official requirements, make sure to verify that with the System Engineer or the Technical Lead of the module.

Availability of Resources for Development

Do we have enough data to start the Algo development? Do we have an available machine or computing unit for our experiments and POC?
This is more of a project management and planning item but it may impact the solution. More than once the algorithmic solutions were divided into sub-procedures where the first to be implemented were the ones with available resources and the following sub-procedures were developed when the resource became available for the project.

One must identify the critical resources for the development process, and this includes human resources. If we are developing a laser power feedback loop, you would want your physicist with you when characterizing the activation energy of the laser; when planning an image-processing algorithm for your camera-based system, you would probably want to have the machine’s camera next to you, with some representative samples to take images of.

To sum up, make sure that every calculation in the system is getting the attention it deserves from the simplest exponential factorization up to the most complicated multi-parameter multi-step neural network. In this post I have tried to give you a taste of the absolute fundamentals of the development flow when we need a certain calculation or a series of calculations in our system – an Algo development. I hope it will help you to organize your thoughts when approaching your next algorithmic endeavor.