A UCL project to solve the problem of tedious digital pathology labeling

In this article, I am going to talk about one of my dearest and best projects: my undergraduate thesis at University College London. My thesis was about using unsupervised machine learning to break down huge Whole Slide Images and label the resulting patches without supervision. This proved to be a much more difficult task than I originally thought. I honestly can't believe that if you search for "unsupervised digital pathology" on Google, the first result is my paper!

Cancer is the second leading cause of death worldwide, responsible for around 15% of deaths[1]…

Machine Learning, Optimization

Automating everything about the model architecture from A to Z

Photo by Luca Bravo on Unsplash

What does the appropriate neural network look like for a given problem? How deep should it be? Which types of layers should be used? Would LSTMs be enough, or would Transformer layers be better? Or maybe a combination of the two? Would ensembling or distillation boost performance? These tricky questions become even more challenging in machine learning (ML) domains where practitioners have less intuition and a shallower understanding than in others.

Source: Google

The above questions are quite tricky. As data scientists, our current approach is just to experiment with the possibilities that make the most sense, evaluate, make another…

Putting an AI system into the world’s oldest eye hospital and the biggest one in Europe and North America — Moorfields Eye Hospital

Photo by v2osk on Unsplash

If you have been following me for a while, you know that I am heavily invested in medical AI. I have been reading tons of amazing papers achieving tremendous performance over the last few months and years, and I have written reviews of some of them. However, a few days ago I found myself wondering: were those papers actually used, or did they just become history? That’s why I decided to review this paper, one that actually made it from a research lab into production.

Not too long ago, DeepMind released an AI system that automates the diagnosis of…

TPUs are over 20× faster than state-of-the-art GPUs… But how?

TPUs are hardware accelerators specialized in deep learning tasks. In this code lab, you will see how to use them with Keras and Tensorflow 2. Cloud TPUs are available in a base configuration with 8 cores and also in larger configurations called “TPU pods” of up to 2048 cores. The extra hardware can be used to accelerate training by increasing the training batch size.
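To make the setup concrete, here is a minimal sketch of connecting Keras to a Cloud TPU with TensorFlow 2 and scaling the batch size with the number of cores. The `make_strategy` helper and the fallback behaviour are my own illustration, not code from the lab itself; on a machine without a TPU attached it simply falls back to the default strategy.

```python
import tensorflow as tf

def make_strategy():
    # Try to connect to a Cloud TPU; fall back to the default
    # (CPU/GPU) strategy when no TPU is attached.
    try:
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
        tf.config.experimental_connect_to_cluster(resolver)
        tf.tpu.experimental.initialize_tpu_system(resolver)
        return tf.distribute.TPUStrategy(resolver)
    except Exception:
        return tf.distribute.get_strategy()

strategy = make_strategy()

# Scale the global batch size with the number of replicas:
# a base TPU has 8 cores, a TPU pod up to 2048.
per_replica_batch = 16
global_batch = per_replica_batch * strategy.num_replicas_in_sync

# Build and compile the model inside the strategy scope so its
# variables are placed on the accelerator.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(10, activation="softmax", input_shape=(784,)),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy")
```

The only TPU-specific parts are the resolver and the strategy; the Keras model code itself is unchanged, which is exactly why this workflow is so convenient.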

Source: Colab

We all know that GPUs are faster than CPUs when it comes to machine learning. And over the last few years, we have seen new chips being developed by giants in…

My story of moving to the UK when I was a teenager

Photo by NASA on Unsplash

When I was doing my A-levels, one of my friends suggested that we both study abroad together. My first comeback was, “I don’t even prepare my own lunch; do you expect me to move across the globe and live on my own?” It was quite an uncomfortable idea to me. I was someone who always stuck to his comfort zone: I hated changing schools or changing my childhood neighborhood, let alone changing countries. However, I still did it!

I decided to take a leap of faith and travel with my friend from Egypt to study Computer Science at University College London…

NFNets are faster than EfficientNets and they don’t use normalization

Photo by Boitumelo Phetla on Unsplash

Our smaller models match the test accuracy of an EfficientNet-B7 on ImageNet while being up to 8.7× faster to train, and our largest models attain a new state-of-the-art top-1 accuracy of 86.5%.

Source: arxiv

One of the most annoying things about training a model is the time it takes to train and the amount of memory needed to fit the data and the model. …

Transformers have been dominating NLP and image recognition, and now object detection

Photo by Samule Sun on Unsplash

Most of the great recent machine learning papers are based on transformers. They are powerful, effective machine learning models that have proven they are worth the time and effort to optimize. Recently, Facebook published a new paper that uses transformers to outperform state-of-the-art Faster R-CNNs in object detection.

Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components. The new model is conceptually simple and does not require a specialized library. We show that it significantly outperforms competitive baselines. Training code and pretrained models are available at https://github.com/facebookresearch/detr.

Source: arxiv

The paper examines the weakness of…

Achieving impressive performance with self-supervision using Transformers and Contrastive learning

Photo by Harlie Raethel on Unsplash

COVID prognosis and screening is not an easy task, especially with the lack of data. Solving this problem with AI would go a long way toward proving its effectiveness, since it would help with one of the worst pandemics in history. The only issue is that AI relies on tons of data, while the world needs relief from the excessive strain on hospitals in a short time. Simply put, there isn’t going to be a huge, useful dataset any time soon.

If you are thinking about what is unique about this paper, it is the fact that they have explored all of…

Tired of having your ML data scattered all over? Check out the “GitHub” of ML

Photo by Markus Spiske on Unsplash

Data scientists deserve to browse, preview, share, fork, and merge data & models alongside code. DAGsHub Storage is a DVC remote that requires zero configuration (works out of the box), includes team and organization access controls, and offers easy visibility.

Source: DAGsHub

If you were to ask most data scientists about the worst two things about machine learning projects, I bet their answer would be handling the ML pipelines and managing the sheer amounts of data. These two issues are messy and honestly quite annoying.

Unlike most software products, where you only have to track code, here you have to track the…

Optimizing self-attention and feed-forward transformer blocks to achieve higher performance with lower compute time

Photo by Christian Wiediger on Unsplash

The PAR Transformer needs 35% lower compute time than Transformer-XL achieved by replacing 63% of the self attention blocks with feed-forward blocks, and retains the perplexity on WikiText-103 language modelling benchmark.

Source: PAR paper on arxiv

The transformer optimization streak continues. Google released its Switch Transformers, Microsoft released ZeRO-Offload, Facebook released Data-efficient image Transformers, and now NVIDIA releases the PAR Transformer, which optimizes the use of attention in transformers.

The reason these companies are concentrating on optimizing transformers is the huge success transformers have achieved in NLP and, more recently, in image processing. …
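To make the idea behind PAR concrete, here is a minimal NumPy sketch (my own illustration, not the paper’s code) of a transformer-style stack in which most self-attention blocks are replaced by cheaper position-wise feed-forward blocks, with the mix controlled by a simple pattern list. Layer norm, biases, and multi-head attention are omitted to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, seq_len = 8, 32, 5

def feed_forward_block(x, w1, w2):
    # Position-wise feed-forward sub-layer (ReLU between two linear
    # maps) with a residual connection.
    return x + np.maximum(x @ w1, 0.0) @ w2

def self_attention_block(x, wq, wk, wv):
    # Single-head scaled dot-product self-attention with a residual.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = (q @ k.T) / np.sqrt(x.shape[-1])
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return x + weights @ v

def par_style_stack(x, pattern):
    # `pattern` mixes the two block types, e.g. a few "attn" entries
    # among mostly "ff" ones — mirroring the paper's idea of replacing
    # most self-attention blocks with feed-forward blocks.
    for kind in pattern:
        if kind == "attn":
            wq, wk, wv = (rng.standard_normal((d_model, d_model)) * 0.1
                          for _ in range(3))
            x = self_attention_block(x, wq, wk, wv)
        else:
            w1 = rng.standard_normal((d_model, d_ff)) * 0.1
            w2 = rng.standard_normal((d_ff, d_model)) * 0.1
            x = feed_forward_block(x, w1, w2)
    return x

x = rng.standard_normal((seq_len, d_model))
out = par_style_stack(x, ["attn", "ff", "ff", "ff"])  # shape preserved: (5, 8)
```

Both block types map a `(seq_len, d_model)` input to an output of the same shape, which is what makes them interchangeable within the stack; the feed-forward block simply avoids the quadratic-in-sequence-length cost of attention.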

Mostafa Ibrahim

Programmer. University College London Computer Science Graduate. ARM Full Stack Web Dev. Passionate about Machine Learning in Healthcare.
