Recurrent Neural Networks: Classifying Diagnoses with Long Short-Term Memory

by Sophia Turol, September 7, 2016

Learn about the challenges of training a recurrent neural network, such as vanishing gradients, and ways to address them with long short-term memory.

Training a recurrent neural network comes with several challenges. Long short-term memory (LSTM) networks address some of them and have proven effective for learning from sequence data. At a recent TensorFlow meetup in Los Angeles, attendees learned how to use an LSTM network to model clinical data.

Addressing the issues of training a model with RNN

Dave Kale started his session with an introduction to computational phenotyping, which is used in predictive diagnostics to analyze observable characteristics or traits of an organism. According to Dave, the challenge lies in assigning the correct diagnosis based on all the available data. He listed several classical approaches to classifying diagnoses:

windowing (i.e., Markov property)
feature engineering
hidden (state) Markov models
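
As a rough illustration of the windowing approach listed above, the sketch below slices a time series into fixed-width windows, each of which could then be passed to a standard classifier. The function name `sliding_windows` and the heart-rate readings are hypothetical and are not taken from the talk.

```python
def sliding_windows(series, width):
    """Return every contiguous window of `width` observations from `series`."""
    if width <= 0:
        raise ValueError("width must be positive")
    # A series shorter than `width` yields no windows at all.
    return [series[i:i + width] for i in range(len(series) - width + 1)]


# Hypothetical hourly heart-rate readings for one patient.
heart_rate = [88, 90, 95, 102, 110, 108]
windows = sliding_windows(heart_rate, width=3)
```

Each window is treated independently, so anything that happened before the start of a window is invisible to the classifier. This limitation is one reason to consider models that carry state across the whole sequence.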

Dave then explored recurrent neural networks (RNNs) as another option. He also discussed the following challenges of training an RNN:

backpropagation through time (the chain rule applied across the unrolled time steps)
gradients that vanish (or explode) in deep neural nets
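
To see why gradients vanish or explode, consider a minimal sketch with a scalar RNN, h_t = tanh(w * h_{t-1} + x_t). Backpropagation through time multiplies one factor per time step, so the gradient of the last hidden state with respect to the first is a product of as many terms as there are steps. The weights and the sequence length below are arbitrary illustrative choices, not values from Dave's session.

```python
import math


def gradient_through_time(w, inputs, h0=0.0):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h = h0
    grad = 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule for one step: d h_t / d h_{t-1} = w * (1 - h_t ** 2).
        grad *= w * (1.0 - h * h)
    return grad


steps = 50
zero_inputs = [0.0] * steps

# With zero inputs the hidden state stays at 0, so each factor is exactly w.
small_weight_grad = gradient_through_time(0.5, zero_inputs)
large_weight_grad = gradient_through_time(1.5, zero_inputs)
```

In this simplified setting the gradient equals w raised to the number of steps: on the order of 1e-15 for w = 0.5 and on the order of 1e8 for w = 1.5. Real networks use weight matrices rather than a single scalar, but the same repeated multiplication is at work.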

Then, Dave moved on to long short-term memory (LSTM) networks as a means of addressing the vanishing gradient problem during training. He also demonstrated how target replication and auxiliary targets can help with training.
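
This summary does not spell out how target replication is computed, so the following is a sketch under an assumed formulation: the sequence-level label is copied to every time step, and the loss blends the average per-step loss with the loss at the final step. The mixing weight `alpha`, the function names, and the example predictions are all assumptions made for illustration.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy between a predicted probability p and a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def target_replication_loss(step_predictions, target, alpha=0.5):
    """Blend the mean per-step loss with the final-step loss (assumed form)."""
    if not step_predictions:
        raise ValueError("need at least one time step")
    # The same sequence-level target is replicated at every time step.
    step_losses = [binary_cross_entropy(p, target) for p in step_predictions]
    mean_loss = sum(step_losses) / len(step_losses)
    final_loss = step_losses[-1]
    return alpha * mean_loss + (1.0 - alpha) * final_loss


# Hypothetical per-step probabilities that a diagnosis applies (true label: 1).
predictions = [0.5, 0.6, 0.8, 0.9]
loss = target_replication_loss(predictions, target=1, alpha=0.5)
```

Setting `alpha` to 0 recovers the usual loss on the final prediction only, while larger values give the network an error signal at every step, which shortens the path gradients must travel.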

Watch the video for more detail.

TensorFlow concepts

Sam Abrahams treated the audience to an introductory session on TensorFlow, answering the following questions:

What makes TensorFlow unique?
Where might the library be heading in the future?
What are the guts of TensorFlow?

In his talk, Sam mentioned TensorFlow Serving and the distributed runtime. TensorFlow Serving runs a server that makes it easy to swap models in and out or to run online training. The distributed runtime lets users run TensorFlow in parallel across heterogeneous hardware with minimal changes to code.

While reviewing TensorFlow basics, Sam highlighted some of the core definitions and moved on to:

What a data flow graph is and why to use it
How to build a graph
TensorFlow placeholders and variables
Running a TensorFlow session
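
The ideas in this list can be illustrated without TensorFlow itself. The following toy, pure-Python sketch separates building a graph from running it. The `Node`, `Placeholder`, and `Variable` classes and the `run` function are hypothetical stand-ins that mimic the concepts; they are not TensorFlow's API.

```python
import operator


class Node:
    """One operation in the graph; it stores how to compute, not a value."""

    def __init__(self, op=None, inputs=()):
        self.op = op
        self.inputs = tuple(inputs)


class Placeholder(Node):
    """An input slot whose value is supplied only when the graph is run."""

    def __init__(self, name):
        super().__init__()
        self.name = name


class Variable(Node):
    """A node that holds state (for example, a weight) between runs."""

    def __init__(self, value):
        super().__init__()
        self.value = value


def run(node, feed):
    """Evaluate `node` by recursively evaluating the nodes it depends on."""
    if isinstance(node, Placeholder):
        return feed[node.name]
    if isinstance(node, Variable):
        return node.value
    args = [run(dep, feed) for dep in node.inputs]
    return node.op(*args)


# Build the graph for y = w * x + b. Nothing is computed at this point.
x = Placeholder("x")
w = Variable(3.0)
b = Variable(1.0)
y = Node(operator.add, [Node(operator.mul, [w, x]), b])

# Running is a separate step that feeds a value into the placeholder.
first = run(y, {"x": 2.0})

# Variables keep state, so updating one changes later runs of the same graph.
w.value = 5.0
second = run(y, {"x": 2.0})
```

Because the graph is described before it is executed, a framework is free to analyze it, differentiate it, or split it across devices before any numbers flow through it.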

To get more detail, watch the video from the meetup below.

Join the meetup group to get informed about the upcoming events.

About the experts

Dave Kale is a fourth-year PhD student in computer science at the University of Southern California. He works on a variety of topics in machine learning, including deep learning, active learning, and time series analysis. Dave is affiliated with the Whittier Virtual PICU Lab of Children’s Hospital LA, where he previously worked as Lead Data Scientist. Dave is also a co-founder of the annual Meaningful Use of Complex Medical Data (MUCMD) Symposium, the preeminent forum for research on applying machine learning to clinical data.

Sam Abrahams is a freelance data scientist and engineer, specializing in deep learning. He holds a bachelor’s degree in mathematical economics from the University of Richmond. Sam is skilled at building strategies to answer business challenges with available data and making recommendations for further data collection. He also has experience in teaching technical concepts to people of varied skill and knowledge.