Announcing GPU-Enabled TensorFlow Instances on AWS
At Altoros, we are working on multi-tenant TensorFlow for AWS. We believe that TensorFlow has the potential to become the engine behind the most successful consumer and industrial products of the next 10–20 years.
Today, we are excited to share with you the first primitive—an Amazon Machine Image for AWS GPU instances.
Why are we doing it?
We want to make TensorFlow even more accessible. Without this AMI, you have to register on the NVIDIA portal, submit an application, and wait for about a day to get the cuDNN library. Yes, we know… Then, you bake your own image, configure it, debug it, and, hopefully, end up with a working instance of TensorFlow on an AWS GPU virtual machine. To move forward with our own work, however, we needed a one-click deployment experience.
So, over the past weeks, we’ve worked with our colleagues at NVIDIA to bundle the cuDNN library with the AMI. The good news is that the NVIDIA and Google TensorFlow teams are working together. What does this mean? My guess is that in the next few weeks you can expect TensorFlow to ship with cuDNN baked in as a dependency.
Moving forward, we are making a commitment to support and maintain this machine image for the benefit of the community.
A friend of mine asked me why we are working with TensorFlow. Because I believe that everyone deserves an answer.
A better question is how, and in what new ways, can we:
- train more computers to teach themselves by sifting through massive amounts of data coming from all sorts of sources
- make these computers available through an API to provide high-quality answers to anyone in need
Because everyone deserves an answer! So, we’ve been thinking: why not standardize and democratize the field with and around TensorFlow, and help to change the world?
As we develop multi-tenant capabilities around TensorFlow, I’m looking forward to advancing our collective knowledge of TensorFlow, its use cases, and applications.
What’s next?
We will be working on clustering—that is, enabling autoscaling of TensorFlow on multi-node, multi-GPU clusters of instances—so that you can train your model in hours instead of days or weeks. Next up: Azure, SoftLayer, vCloud, and OpenStack.
If you need any assistance with proofs of concept, deployment, or support through a monthly subscription, drop us a line. We would be happy to provide you and your team with the support you may need to take advantage of TensorFlow.
Everyone deserves an answer. Join the movement. Let’s change the world. 🙂
Further reading
- How to Set Up a GPU-Enabled TensorFlow Instance on AWS
- Getting Started with a CPU-Enabled TensorFlow Instance on AWS
- TensorFlow in the Cloud: Accelerating Resources with Elastic GPUs
- Performance of Distributed TensorFlow: A Multi-Node and Multi-GPU Configuration