Software



Good software packages may make your experimental cycle (and life) a lot easier.


Intended for: BSc, MSc, PhD

Back-up & Version control

The first thing you need to set up is version control and a back-up of all your relevant documents. You don't want to lose progress because you accidentally deleted a file or folder.
For your code, use Git (with GitHub), even if you work on a project alone. It naturally allows you to branch off in new experimental directions, and later merge or discard these branches.
For all other files, I advise using a Dropbox account (the first 2 GB is free, which is definitely enough for a few projects if you mostly store text files and images). Install the Dropbox folder on all your devices, and work on all your relevant documents from inside this folder. You then always have a back-up, and your files are automatically up to date on all of your devices (so you do not have to carry a laptop around all the time).


Managing Experiments

Managing experiments can be a lot of hassle: you need to start multiple experiments with different hyperparameters and repetitions, log all their results, generate learning curves from them, etc. A few years back, researchers would all write this code themselves (which often took more time than the actual experiment code). Luckily, these days there are very convenient packages that help you launch experiments, control them, log all your results, and directly visualize them. The best option is Weights & Biases.
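To give a feeling for the bookkeeping such a package takes off your hands, here is a minimal hand-written sketch in plain Python. It is not how Weights & Biases works internally. The function `run_experiment`, its toy loss dynamics, the hyperparameter names, and the log file name are all hypothetical placeholders for your own training code.

```python
import csv
import itertools
import random
import statistics
import tempfile
from pathlib import Path


def run_experiment(learning_rate, batch_size, seed, n_steps=20):
    """Hypothetical stand-in for a real training run: returns a fake learning curve."""
    rng = random.Random(seed)
    loss = 1.0
    curve = []
    for _ in range(n_steps):
        # Toy dynamics: the loss decays at a rate set by the learning rate,
        # plus noise whose scale shrinks with larger batches.
        loss = loss * (1.0 - learning_rate) + rng.gauss(0.0, 0.01 / batch_size)
        curve.append(loss)
    return curve


def run_grid(grid, n_repetitions, log_path):
    """Run every hyperparameter combination n_repetitions times, logging every step."""
    names = sorted(grid)
    with open(log_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(names + ["seed", "step", "loss"])
        for values in itertools.product(*(grid[name] for name in names)):
            config = dict(zip(names, values))
            for seed in range(n_repetitions):
                curve = run_experiment(seed=seed, **config)
                for step, loss in enumerate(curve):
                    writer.writerow(list(values) + [seed, step, loss])


def summarize(log_path):
    """Average the final-step loss over repetitions for each configuration."""
    with open(log_path, newline="") as f:
        rows = list(csv.DictReader(f))
    last_step = max(int(row["step"]) for row in rows)
    finals = {}
    for row in rows:
        if int(row["step"]) == last_step:
            key = (int(row["batch_size"]), float(row["learning_rate"]))
            finals.setdefault(key, []).append(float(row["loss"]))
    return {key: statistics.mean(losses) for key, losses in finals.items()}


# Example usage: a 2 x 2 grid with 3 repetitions (seeds) per configuration.
grid = {"learning_rate": [0.1, 0.3], "batch_size": [16, 64]}
log_path = Path(tempfile.gettempdir()) / "experiment_log.csv"
run_grid(grid, n_repetitions=3, log_path=log_path)
summary = summarize(log_path)
for config, mean_loss in sorted(summary.items()):
    print(config, round(mean_loss, 4))
```

Even this small sketch already needs a grid loop, a seed loop, a log format, and an aggregation step, and it does not yet plot anything. That is the kind of code an experiment management package replaces.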


Deep Learning Libraries

To implement and train your neural network, you want to use a deep learning library. These packages are essentially 'automatic differentiation engines': they let you build a network (a computational graph) and then automatically differentiate a loss with respect to its variables. Examples are:
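Whichever library you pick, the core idea is the same. As a rough illustration of what an automatic differentiation engine does, the sketch below implements reverse-mode differentiation for scalars in plain Python. The `Value` class and its attributes are hypothetical names chosen for this example and are not the API of any real library. Real libraries do the same on tensors, with many more operations and GPU support.

```python
class Value:
    """A scalar node in a computation graph that remembers how it was computed."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # nodes this value was computed from
        self._local_grads = local_grads  # d(self)/d(parent), one entry per parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def __sub__(self, other):
        # a - b is expressed as a + (b * -1), so no separate gradient rule is needed.
        return self + (other * -1.0)

    def backward(self):
        """Backpropagate: accumulate d(self)/d(node) into .grad of every ancestor."""
        order, visited = [], set()

        def visit(node):
            # Depth-first traversal that lists parents before the nodes built from them.
            if id(node) not in visited:
                visited.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule.
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


# Example usage: squared error of a one-parameter linear model on one data point.
w, b = Value(2.0), Value(1.0)
x, y = 3.0, 10.0
error = w * x + b - y
loss = error * error
loss.backward()
print(loss.data, w.grad, b.grad)
```

You can check the result by hand: with w = 2, b = 1, x = 3 and y = 10 the error is -3, so the loss is 9, the derivative with respect to w is 2 * (-3) * 3 = -18, and the derivative with respect to b is 2 * (-3) = -6.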


Datasets

For supervised learning experiments, you will need a dataset to train on. There is a huge variety of available datasets, which differ in the type of challenge they pose and in their difficulty.


Reinforcement Learning Environments

For reinforcement learning experiments, you usually need environments to test on. Most environments follow the Environment class template introduced in Gym. Many researchers have written new environments (and environment packages) following the same template, of which you may find examples in the lists below.
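To show what this template looks like, here is a minimal sketch of an environment with the usual `reset` and `step` methods, written in plain Python so it runs without Gym installed. The `CorridorEnv` class, its reward scheme, and the `run_episode` helper are hypothetical examples. The five return values of `step` follow the convention of recent Gymnasium releases; older Gym versions returned four values (observation, reward, done, info), so check which one your package expects.

```python
class CorridorEnv:
    """Hypothetical 1-D corridor: start at the left end, reach the right end."""

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps_taken = 0

    def reset(self):
        """Start a new episode and return the first observation plus an info dict."""
        self.position = 0
        self.steps_taken = 0
        return self.position, {}

    def step(self, action):
        """Apply an action: 1 moves right, any other value moves left."""
        move = 1 if action == 1 else -1
        # Clip the position so the agent cannot leave the corridor.
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps_taken += 1
        terminated = self.position == self.length - 1  # goal reached
        truncated = self.steps_taken >= self.max_steps and not terminated  # time limit
        reward = 1.0 if terminated else 0.0
        return self.position, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Generic interaction loop that works for any environment with this interface."""
    observation, info = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        done = terminated or truncated
    return total_reward


# Example usage: a policy that always moves right reaches the goal.
env = CorridorEnv()
print(run_episode(env, policy=lambda observation: 1))
```

Because the agent code only talks to `reset` and `step`, you can swap in any other environment written in the same template without changing your algorithm.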


Parallelization

When computation is a bottleneck, you may want to parallelize your code. However, read the considerations below first:

If computation outside of the network operations remains the bottleneck, then you may want to parallelize your Python code. The best option is to use Ray.


LaTeX documents

You can of course work with a local TeX installation (I usually still do). However, an alternative is to use an online LaTeX editor, such as Overleaf. Two important benefits are i) automatic version control and ii) easy sharing with supervisors/collaborators.


References

To keep track of your references, it can be helpful to use a reference manager. For a bachelor's or master's thesis this is probably not necessary, but as a PhD student it might be useful. For a paper, you can simply export the required bibliography file from your reference manager. Some popular options:


Presentations & Diagrams

For presentations and diagrams, I mostly use online tools. You can make very neat presentations with Google Slides, and all your progress is automatically stored. For conceptual drawings, you can of course write scripts, but that takes a lot of time. I usually prefer the online tool Diagrams.net, which provides fast prototyping and a clean layout.