100DaysToOffload

How to unfold the aliased pixels in SENSE

This post explains how to unfold the undersampled, aliased image onto the corresponding indices of the reconstruction matrix in SENSE. In short, I sketch how to place the mapping window for acceleration factors 2, 3, and 4; you can see the aliasing windows wrap around for the even acceleration factors. (Figure: explanation of mapping wrapped pixels.) SENSE (SENSitivity Encoding) [1] is an image-domain parallel imaging technique in MRI. For more information, see the links in the full post.
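As a rough sketch of the mapping (my illustration here, not the figures from the post), the snippet below lists which full-FOV rows fold onto a given row of the aliased image for acceleration factor R. The centering offset is just one common convention; deciding where to place that window is exactly what the sketches in the post discuss.

```python
# Minimal sketch: which full-FOV rows alias onto a reduced-FOV row in SENSE.
# Assumes N rows in the full FOV, N divisible by R, and a centered reduced FOV;
# the exact offset convention varies between implementations.
def folded_sources(y_alias, N, R):
    n_reduced = N // R             # rows in the undersampled (aliased) image
    offset = (N - n_reduced) // 2  # placement of the mapping window
    # Each aliased row is a coil-weighted sum of R equally spaced full-FOV rows,
    # wrapped around the image boundary (the wrap-around mentioned above).
    return [(y_alias + offset + k * n_reduced) % N for k in range(R)]

if __name__ == "__main__":
    for R in (2, 3, 4):
        print(R, folded_sources(0, N=12, R=R))
```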

Data prefetching in Deep Learning

In the previous post I explained that, after a certain point, improving the efficiency of the data loaders or increasing their number has only a marginal impact on deep learning training speed. Today I will explain a method that can further speed up your training, provided that you have already achieved sufficient data loading efficiency. Data prefetcher: in a typical deep learning pipeline, each batch must be copied from the CPU to the GPU before the model can be trained on it.
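As a minimal sketch of the idea (assumed PyTorch code, not the exact implementation from the post), a prefetcher copies the next batch to the GPU on a side CUDA stream while the current batch is still being trained on. It assumes a DataLoader that yields (inputs, targets) CPU tensors, pin_memory=True on the loader, and a CUDA device.

```python
import torch

class DataPrefetcher:
    """Overlaps host-to-device copies of the next batch with compute on the current one."""

    def __init__(self, loader, device="cuda"):
        self.loader = iter(loader)
        self.device = device
        self.stream = torch.cuda.Stream()   # side stream for asynchronous copies
        self._preload()

    def _preload(self):
        try:
            self.next_inputs, self.next_targets = next(self.loader)
        except StopIteration:
            self.next_inputs = self.next_targets = None
            return
        with torch.cuda.stream(self.stream):
            # non_blocking=True only helps if the host tensors are in pinned memory
            self.next_inputs = self.next_inputs.to(self.device, non_blocking=True)
            self.next_targets = self.next_targets.to(self.device, non_blocking=True)

    def __iter__(self):
        return self

    def __next__(self):
        if self.next_inputs is None:
            raise StopIteration
        # make the default stream wait until the async copy has finished
        torch.cuda.current_stream().wait_stream(self.stream)
        inputs, targets = self.next_inputs, self.next_targets
        self._preload()                      # start copying the following batch
        return inputs, targets
```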

Visualizing data loaders to diagnose deep learning pipelines

Have you wondered why some of your training scripts halt every n batches, where n is the number of loader processes? This likely means your pipeline is bottlenecked by data loading, as shown in the following animation. (Animation: training is bottlenecked by data loading time.) In the animation, the mean loading time for each batch is 2 seconds and there are 7 loader processes, but the forward+backward pass for each batch takes only 100ms.
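A quick back-of-envelope check with those numbers shows why the stalls appear roughly every n batches, assuming the workers return their batches in near lock-step:

```python
# Numbers stated above: 7 workers, 2 s per batch to load, 0.1 s per training step.
num_workers = 7
load_time = 2.0       # seconds a worker needs to prepare one batch
step_time = 0.1       # seconds for forward + backward on one batch

# Each worker delivers one batch every `load_time` seconds, so the loader
# sustains num_workers / load_time batches per second; training consumes
# 1 / step_time batches per second.
supply_rate = num_workers / load_time   # 3.5 batches/s
demand_rate = 1.0 / step_time           # 10  batches/s
print(f"supply {supply_rate:.1f} vs demand {demand_rate:.1f} batches/s")

# If the workers finish in lock-step, the GPU burns through the num_workers
# ready batches in num_workers * step_time seconds, then waits for the next burst.
stall = load_time - num_workers * step_time
print(f"expected stall of ~{stall:.1f} s every {num_workers} batches")
```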

Applying Theory of Mind to the Hanabi Challenge

Can reinforcement learning agents generate and benefit from conventions when cooperating with each other in an imperfect-information game? This is the question that led to my course project in the “Advancing AI through cognitive science” course at the NYU Center for Data Science. In summary, I applied theory-of-mind modeling to the Hanabi challenge [1] and observed an improvement. Hanabi is a card game created in 2010 that can be understood as cooperative solitaire.

Review of 'Processing Megapixel Images with Deep Attention-Sampling Models'

‘Processing Megapixel Images with Deep Attention-Sampling Models’ (referred to as ‘ATS’ below) [1] proposes a model that avoids unnecessary computation compared to Deep MIL [2]. The authors first compute an attention map over all possible patch locations by feeding a downsampled image to a shallow CNN with few pooling operations. They then sample a small number of patches from the attention distribution and show that feeding these sampled patches to the MIL classifier yields an unbiased, minimum-variance estimator of the prediction made with all patches.
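As a small numerical illustration of that claim (my sketch, not the authors' code): if the model output is the attention-weighted sum of per-patch features, then sampling patch indices from the attention distribution and averaging their features estimates the same quantity without visiting every patch.

```python
import numpy as np

rng = np.random.default_rng(0)
num_patches, feat_dim, num_samples = 10_000, 8, 64

features = rng.normal(size=(num_patches, feat_dim))  # stand-in per-patch features f_i
attention = rng.random(num_patches)
attention /= attention.sum()                          # attention weights a_i, sum to 1

exact = attention @ features                          # full attention-weighted sum over all patches

idx = rng.choice(num_patches, size=num_samples, p=attention)  # sample i ~ a
estimate = features[idx].mean(axis=0)                 # plain mean of the sampled features

# The Monte Carlo mean is an unbiased estimate of the exact weighted sum.
print("max abs error:", np.abs(exact - estimate).max())
```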

Breaking Numerical Reasoning in NLI

Are natural language inference (NLI) deep learning models capable of numerical reasoning? This is the question that led to my course project in the “Natural Language Understanding and Computational Semantics” course at the NYU Center for Data Science. In summary, my teammates and I used adversarial data augmentation and a modified numerical word embedding to show that some of the SoTA NLI architectures at the time could not perform numerical reasoning that involves adding multiple number words.

Web app: My UCI class is full

(Screenshot of the web app.) My UCI class is full is a web application that helped 1400+ students enroll in 5300+ courses from Fall 2016 to Fall 2019. The app sent 52K+ alerts to students when a class without a waitlist had a spot open for them. The app has been discontinued as of December 2019. Developing the first version: at HackUCI 2015, Yang Jiao and I developed an app to help students enroll in classes without a waitlist.

Review of 'Evaluating Weakly Supervised Object Localization Methods Right'

The main claims of the paper [1]: a certain level of localization labels is inevitable for WSOL; in fact, prior works that claim to be weakly supervised use strong supervision implicitly. The authors therefore standardize a protocol in which models are allowed to use pixel-level masks or bounding boxes to a limited degree. Under this protocol and their proposed evaluation method, they observe no improvement in WSOL performance since CAM (2016).