Computational feasibility

Your experiments should be computationally feasible. If you ignore these issues, you will probably not get any results.

Intended for: BSc, MSc, PhD

You should learn to first start small, and then gradually scale up. The latter requires you to quickly estimate the expected computational load of your experiments, and (possibly) to find your required computational resources.

Start small

Estimate the expected computational cost

Computational resources

Start small

If your method does not work in a small task, it will certainly not work in a big task.

You always start too big.

A typical experiment:

You just came up with a new idea. You are enthusiastic, and you want to solve an interesting, complex problem.
You start to write a ton of code, and fire it at your complex problem.
You then find out that your method takes ~24 hours to run a reasonable number of steps. To make it more problematic: your method shows no performance at all.
You suddenly have no clue how to continue. It might be a bug in your code. Your idea might be flawed. Your problem might simply be too big to solve. Etc.
You don't know, and you don't know how to figure it out.

Why you need to start small.

The solution to the above problem is really simple: always start with small scale experiments, and only gradually scale up. There are two reasons:

Investigation: Your method may simply be flawed. You forgot to think about a certain aspect that hinders performance. To find out what goes wrong, you will need to log output during execution, maybe visualize aspects of the algorithm. This is only possible in small problems, where you can accurately track what happens.
Debugging: Your code may have a bug. Debugging is an iterative process, and you usually need many cycles. Imagine you need 30 cycles to track down a certain mistake. If a single run takes 1 minute, you could be done in half an hour. However, if every run takes 24 hours, those 30 cycles take 30 days, and you will not be done by the end of the month.

Note: Starting with a small task may seem boring, but (as I know from personal conversations) even the most high-end AI publications were first tested on small problems.

Find the right small task.

With the rise of deep learning it sometimes seems as if only high-dimensional problems matter. However, tasks can be difficult in various ways, and dimensionality is only one of them. Try to find a low-dimensional task that still captures the problem you are trying to solve. Examples:

- Long-range dependencies: Maybe you want to investigate the ability of your model to learn long-range dependencies in a stream of data. Keep the dimensionality low, but the dependency long-range, and see if your method works.
- Exploration: Maybe you want to investigate a new exploration method for sparse rewards (in reinforcement learning). Design a task with a small observation space, but make sure the agent needs to take a very specific sequence of actions to find a non-zero reward.
- Representation: Maybe you want to investigate whether your new loss function learns more disentangled representations. You start on the (relatively low-dimensional) MNIST dataset, and investigate what representations your method finds.
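
As an illustration of the exploration example above, a small sparse-reward task could be sketched as follows. This is only a sketch under assumptions: the class name SequenceTask, the target sequence, and the random baseline are all hypothetical and chosen for illustration, not taken from any library.

```python
import random


class SequenceTask:
    """Hypothetical toy task: reward 1 only after one specific action sequence.

    The observation is the number of correct actions entered so far, so the
    observation space has only len(target) + 1 values.
    """

    def __init__(self, target=(1, 0, 1, 1, 0), n_actions=2, max_steps=50):
        self.target = tuple(target)
        self.n_actions = n_actions
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        self.progress = 0  # number of target actions matched so far
        self.steps = 0
        return self.progress

    def step(self, action):
        self.steps += 1
        if action == self.target[self.progress]:
            self.progress += 1
        else:
            self.progress = 0  # a wrong action sends the agent back to the start
        solved = self.progress == len(self.target)
        reward = 1.0 if solved else 0.0  # sparse: non-zero only when solved
        done = solved or self.steps >= self.max_steps
        return self.progress, reward, done


def random_agent_success_rate(env, episodes=200, seed=0):
    """Fraction of episodes in which a uniformly random agent finds the reward."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(episodes):
        env.reset()
        done = False
        while not done:
            _, reward, done = env.step(rng.randrange(env.n_actions))
            successes += int(reward > 0)
    return successes / episodes


if __name__ == "__main__":
    env = SequenceTask()
    print("Random agent success rate:", random_agent_success_rate(env))
```

A single episode runs in well under a second, so you can log every step and inspect what your exploration method does. Making the target sequence longer makes the task harder without increasing the dimensionality of the observations.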

Scale up.

Once you have validated your method in a relatively small task, you may scale up to a bigger task to prove its feasibility.
However, make sure to get an estimate of the total expected computational cost as soon as possible (see below).

Estimate the expected computational cost

A crucial step in any experimental cycle is getting an estimate of the expected total computational cost of your experiments. This way you ensure that your project is feasible.

Single run: Perform a single run of your experiment as quickly as possible. Observe how long it takes (on your hardware) to get some learning behaviour.

- Note: Your code does not need to be polished yet, or you could alternatively run public code of a comparable method on a similar task. You just want a time estimate to judge whether the task is feasible.
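
One rough way to get such a time estimate is to time a short trial and extrapolate. The sketch below assumes that every step costs about the same, which ignores warm-up effects and steps that become more expensive later in training. The function train_step is a hypothetical stand-in for one step of your own method.

```python
import time


def train_step(step):
    """Hypothetical stand-in for one training step of your method."""
    return sum(i * i for i in range(10_000))  # dummy work


def estimate_runtime_hours(step_fn, trial_steps, total_steps):
    """Time `trial_steps` steps and linearly extrapolate to `total_steps`."""
    start = time.perf_counter()
    for step in range(trial_steps):
        step_fn(step)
    elapsed = time.perf_counter() - start
    seconds_per_step = elapsed / trial_steps
    return seconds_per_step * total_steps / 3600


if __name__ == "__main__":
    hours = estimate_runtime_hours(train_step, trial_steps=100, total_steps=1_000_000)
    print(f"Estimated runtime for a single run: {hours:.2f} hours")
```

The number of steps you need before you see learning behaviour is something you have to guess or take from a comparable method, so treat the result as an order-of-magnitude estimate.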

Multiply: You need to tune hyperparameters, and you will need to run repetitions of your experiment.

- Therefore, as a rule of thumb, multiply the runtime of a single run by ~30-50x to get the total expected time for your experiment.
- For example, suppose you plan to tune at least two hyperparameters (e.g., learning rate and exploration parameter) over 3 possible values each, and you repeat each configuration 5 times. This already requires 3 x 3 x 5 = 45 runs.
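
The count above can be checked by enumerating the grid explicitly, which also gives you the list of runs to launch. The hyperparameter values below are purely illustrative assumptions.

```python
from itertools import product

# Illustrative values only; pick the ranges that make sense for your method.
learning_rates = [1e-4, 1e-3, 1e-2]
exploration_params = [0.01, 0.1, 1.0]
seeds = range(5)  # one seed per repetition

# Every combination of (learning rate, exploration parameter, seed) is one run.
runs = list(product(learning_rates, exploration_params, seeds))

print(len(runs))  # 3 * 3 * 5 = 45
```

Adding a third hyperparameter with 3 values would triple this number, which is why the total grows so quickly.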

Example:

A single experiment shows learning progress after 4 hours.
Multiplying by ~50, you expect your total runtime to be ~200 hours for this experiment.
You cannot run this experiment on your own laptop, so you look into third-party compute resources (see below).
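
The arithmetic in this example can be written as a small helper that applies the ~30-50x rule of thumb to a single-run estimate. The function name and the conversion to days of sequential compute on one machine are assumptions added for illustration.

```python
def total_hours_range(hours_per_run, low_factor=30, high_factor=50):
    """Apply the ~30-50x rule of thumb to a single-run time estimate."""
    return hours_per_run * low_factor, hours_per_run * high_factor


low, high = total_hours_range(4)
print(f"{low}-{high} hours, i.e. {low / 24:.1f}-{high / 24:.1f} days on one machine")
```

For a 4-hour run this gives 120 to 200 hours, so the ~200 hours above corresponds to the upper end of the rule of thumb.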

Computational resources

Quite soon your experiments become infeasible to run on your own device. There are several third-party options:

University compute cluster: Usually your university has a compute cluster, to which you may have access.
- At Leiden University, this includes the DSLab and the ALICE computer cluster.
National computer cluster: Sometimes there are also national resources available for all students.
- In the Netherlands, this is offered by SURF, which for example includes the Lisa and Snellius compute clusters. You need to write a project proposal to get access (mostly intended for PhDs).
Commercial compute resources: Companies also offer cloud computing resources. As a student, you often get your first credits for free.
- Google Colab allows for computation in your browser, with free access to GPUs.
- Google Cloud and AWS are two popular cloud compute options with a free student start budget.
- These companies usually also have dedicated academic research grants for students, for which you can apply (Example).
