Deep Learning

Artificial Intelligence System Reduces False-Positive Findings in the Interpretation of Breast Ultrasound Exams

Ultrasound is an important imaging modality for the detection and characterization of breast cancer. Though consistently shown to detect mammographically occult cancers, especially in women with dense breasts, breast ultrasound has been noted to have …

Investigating and Simplifying Masking-based Saliency Methods for Model Interpretability

Saliency maps that identify the most informative regions of an image for a classifier are valuable for model interpretability. A common approach to creating saliency maps involves generating input masks that mask out portions of an image to maximally …

Reducing false-positive biopsies with deep neural networks that utilize local and global information in screening mammograms

Breast cancer is the most common cancer in women, and hundreds of thousands of unnecessary biopsies are done around the world at a tremendous cost. It is crucial to reduce the rate of biopsies that turn out to be benign tissue. In this study, we …

An artificial intelligence system for predicting the deterioration of COVID-19 patients in the emergency department

During the COVID-19 pandemic, rapid and accurate triage of patients at the emergency department is critical to inform decision-making. We propose a data-driven approach for automatic prediction of deterioration risk using a deep neural network that …

Data Prefetching in Deep Learning

In the previous post I explained that, after a certain point, improving the efficiency of the data loaders or increasing their number has only a marginal impact on deep learning training speed. Today I will explain a method that can further speed up your training, provided that you have already achieved sufficient data loading efficiency. Comparison of training pipelines with and without a prefetcher. In a typical deep learning pipeline, one must move each batch from CPU to GPU before the model can be trained on that batch.
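The idea of prefetching is to overlap loading the next batch with training on the current one. Here is a minimal, simulated sketch in plain Python (the function names, the queue-based background loader, and the sleep-based timings are illustrative assumptions, not the post's actual implementation; a real pipeline would copy tensors to the GPU on a separate CUDA stream):

```python
import queue
import threading
import time

def loader(num_batches, out_q, load_time):
    # Simulate loading batches in the background; in a real pipeline
    # this is the I/O plus the host-to-device copy for the NEXT batch,
    # issued while the GPU trains on the current one.
    for i in range(num_batches):
        time.sleep(load_time)  # pretend data loading + transfer
        out_q.put(i)
    out_q.put(None)  # sentinel: no more batches

def train_with_prefetch(num_batches, load_time=0.005, step_time=0.005):
    q = queue.Queue(maxsize=2)  # prefetch depth of 2
    t = threading.Thread(target=loader, args=(num_batches, q, load_time))
    t.start()
    steps = 0
    while True:
        batch = q.get()        # next batch is (usually) already waiting
        if batch is None:
            break
        time.sleep(step_time)  # pretend forward + backward pass
        steps += 1
    t.join()
    return steps
```

Because the loader thread runs ahead of the training loop, the per-step data wait shrinks toward zero once loading and compute overlap.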

Visualizing data loaders to diagnose deep learning pipelines

Have you ever wondered why some of your training scripts halt every n batches, where n is the number of loader processes? This likely means your pipeline is bottlenecked by data loading, as shown in the following animation: training is bottlenecked by data loading time. In the animation above, the mean loading time for each batch is 2 seconds and there are 7 loader processes, but the forward+backward pass for each batch takes only 100 ms.
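A simple way to diagnose this is to time how long each iteration blocks waiting for data versus how long it spends computing. A minimal sketch (the function names and the iterator-based measurement are illustrative assumptions, not the post's actual tooling):

```python
import time

def profile_pipeline(batches, step_fn):
    """Record, per iteration, the time spent blocked on the data
    loader versus the time spent in the training step."""
    waits, computes = [], []
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)   # time spent blocked on the loader
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)         # forward + backward pass
        t2 = time.perf_counter()
        waits.append(t1 - t0)
        computes.append(t2 - t1)
    return waits, computes
```

If the wait times dwarf the compute times, and spike once every n batches, data loading is the bottleneck.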

Applying Theory of Mind to the Hanabi Challenge

Can reinforcement learning agents generate and benefit from conventions when cooperating with each other in an imperfect-information game? This is the question that led to my course project in the “Advancing AI through cognitive science” course at the NYU Center for Data Science. In summary, I applied theory-of-mind modeling to the Hanabi challenge [1] and observed an improvement. Hanabi is a card game created in 2010 that can be understood as cooperative solitaire.

Paper Review: 'Processing Megapixel Images with Deep Attention-Sampling Models'

‘Processing Megapixel Images with Deep Attention-Sampling Models’ (referred to as ‘ATS’ below) [1] proposes a new model that avoids unnecessary computation in Deep MIL [2]. The authors first compute an attention map over all possible patch locations in an image by feeding a downsampled version of the image to a shallow CNN with few pooling operations. They then sample a small number of patches from the attention distribution and show that feeding these sampled patches to the MIL classifier yields an unbiased minimum-variance estimator of the prediction made with all patches.
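The estimator can be sketched in a few lines: when patches are sampled in proportion to their attention weights, the plain average of the sampled patch scores is an unbiased Monte Carlo estimate of the attention-weighted sum over all patches. A minimal sketch, assuming a flat attention map and scalar per-patch scores (the function names are illustrative, not the paper's code):

```python
import random

def sample_patches(attention, k):
    """Sample k patch indices from a normalized attention map
    (here a flat list of probabilities summing to 1)."""
    return random.choices(range(len(attention)), weights=attention, k=k)

def ats_estimate(attention, patch_scores, k, seed=0):
    """Monte Carlo estimate of the attention-weighted prediction
    sum_i a_i * f(x_i), using only k sampled patches: because
    sampling is proportional to attention, the unweighted mean of
    the sampled scores is an unbiased estimate of that sum."""
    random.seed(seed)
    idx = sample_patches(attention, k)
    return sum(patch_scores[i] for i in idx) / k
```

In the paper the per-patch scores are features from a deeper network, so the saving comes from running that network on only the k sampled patches instead of all of them.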

Breaking Numerical Reasoning in NLI

Are natural language inference (NLI) deep learning models capable of numerical reasoning? This is the question that led to my course project in the “Natural Language Understanding and Computational Semantics” course at the NYU Center for Data Science. In summary, my teammates and I tried adversarial data augmentation and modified numerical word embeddings to show that some state-of-the-art (SoTA) NLI architectures at the time could not perform numerical reasoning that involves adding multiple number words.

Paper Review: 'Evaluating Weakly Supervised Object Localization Methods Right'

The main claims of the paper [1]: a certain level of localization labels is inevitable for WSOL; in fact, prior works that claim to be weakly supervised use strong supervision implicitly. The authors therefore propose standardizing a protocol in which models are allowed to use pixel-level masks or bounding boxes to a limited degree. Under their proposed evaluation method, they observe no improvement in WSOL performance since CAM (2016).