You recently published a book called “First Contact with TensorFlow”.
First of all, why is this book freely available online?
I hope this book adds some value to this world of education that I love so much. I think that knowledge is liberation and should be accessible to all. For this reason, the content of this book will be available on this website completely free. Of course, readers who prefer a paper copy can purchase the book through Amazon or Lulu.
What is your book about?
As the title indicates, it is a first contact with TensorFlow, a way to get started with Deep Learning programming. A few months ago, Google released its Machine Learning engine, TensorFlow, to the community as open source, so that developers and researchers who want to incorporate Machine Learning in their projects and products can use it. The book is practical in nature, so I reduced the theoretical part as much as possible, assuming that the reader has some basic understanding of Machine Learning.
Why is it important to incorporate advanced analytics in projects and products?
Today we find ourselves immersed in a new process in which “things” pass from the physical world to the digital world and become accessible from any electronic device. Cloud Computing is what makes it possible for digital technology to penetrate every corner of our economy and society. This new scenario not only allows users to be connected to this coming digital world through their mobile devices, but also allows any object or device to be connected. This will cause a deluge of digital information, known as Big Data, and computing systems should include advanced analytics tools to offer predictive models.
Given your background, why did you write a book about Machine Learning technology?
Actually, my research focus is gradually moving from supercomputing architectures and runtimes to execution middleware for big data workloads, and more recently to platforms for Machine Learning on massive data. Precisely because I am an engineer, not a data scientist, I think I can contribute with this introductory approach to the subject, and that it can be helpful for many engineers in the early stages; after that, it will be their choice to go deeper into what they need.
What is Machine Learning?
We can consider Machine Learning as a field of Computer Science that evolved from the study of pattern recognition and computational learning theory in Artificial Intelligence. Its goal is to give computers the ability to learn without being explicitly programmed. For this purpose, Machine Learning uses mathematical and statistical techniques to construct models from a set of observed data, rather than having the user enter a specific set of instructions that define the model for that data.
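To make the idea of constructing a model from observed data concrete, here is a minimal sketch in plain Python. The observations are made up for illustration, and ordinary least squares is just one simple technique chosen as an example; it is not taken from the book. Nobody writes the rule "multiply by 2 and add 1" into the program: the slope and intercept are estimated from the data.

```python
# Hypothetical observations: (x, y) pairs that follow y = 2x + 1.
observations = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def fit_line(points):
    """Estimate slope and intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# The "model" is nothing more than the two numbers learned from the data.
slope, intercept = fit_line(observations)


def predict(x):
    """Apply the learned model to an input that was not in the data."""
    return slope * x + intercept


print(predict(4.0))
```

With these observations the learned model predicts a value of about 9 for x = 4, an input it has never seen.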
Why Deep Learning?
Conventional Machine Learning techniques were limited in their ability to process natural data in their raw form; for example, to identify objects in images or transcribe speech into text. Increasingly, these applications make use of a class of techniques called Deep Learning. These methods are based on “deep” multi-layer neural networks with many hidden layers that attempt to model high-level abstractions in data. Research on diverse types of Deep Learning networks is currently under way. In any case, Deep Convolutional Nets have brought breakthroughs in processing images and speech, whereas Recurrent Nets have done a good job on sequential data such as text and speech (we are doing research on these networks).
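As a rough illustration of what "multi-layer" means, the following plain-Python sketch pushes an input through two hidden layers and an output layer. The weights and biases are hypothetical, hand-picked numbers; a real network learns them from data and is far larger. ReLU is assumed as the activation function purely for the example.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in vector]


def dense(inputs, weights, biases):
    """One fully connected layer: `weights` has one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hypothetical hand-picked parameters: (weights, biases) for each layer.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # hidden layer 1
    ([[1.0, 1.0], [-1.0, 1.0]], [0.0, 0.5]),  # hidden layer 2
    ([[1.0, 2.0]], [0.0]),                    # output layer
]


def forward(inputs):
    """Feed the input through every layer in turn."""
    activation = inputs
    for index, (weights, biases) in enumerate(layers):
        activation = dense(activation, weights, biases)
        if index < len(layers) - 1:  # no nonlinearity on the output layer
            activation = relu(activation)
    return activation


print(forward([2.0, 1.0]))
```

Each hidden layer transforms the output of the previous one, which is how stacked layers can build progressively higher-level representations of the raw input.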
Why is Deep Learning taking off now?
It is known that many of the ideas used in Deep Learning have been around for decades. One of the key drivers of its current progress is clearly the huge deluge of data available today. Thanks to the advent of Big Data, these models can be “trained” by exposing them to large data sets that were previously unavailable. Another, no less important, driver is the computational power available today. For example, with the rise of GPUs, the Deep Learning community started shifting to them.
And what about using supercomputers like MareNostrum?
Now I’m shifting to High Performance Computing/Supercomputing (HPC) too. Not only Deep Learning but Machine Learning in general should embrace HPC, because there are many opportunities.
Is that what is called High-Performance Big-Data Analytics?
Deriving new insights from the massive amounts of available data requires new developments in High Performance Computing systems that enable the convergence of advanced analytic algorithms with big data technologies, which abstract the manipulation and processing of data and so facilitate the analysis of very large data sets. We use the term High-Performance Big-Data Analytics (HPBDA) to refer to the integration of the best of Analytics knowledge with new Big Data middleware and the awesome power of emerging computational hardware systems, in support of the new era of Cognitive Computing.
What do you mean by ‘Cognitive Computing’?
I’m convinced that Machine Learning is in the process of becoming automated and commoditized. As a result, computing systems will include advanced analytics tools in their system middleware (adding a cognitive layer to their software stack), giving us “wiser” computers. The general idea is that instead of instructing a computer what to do, we are going to simply throw data at the problem and tell the computer to figure it out itself. Some companies say that computers are taking on functions of the human brain such as inference, prediction or abstraction, hence the name Cognitive Computing. Here, projects like Microsoft Oxford, IBM Watson, Google DeepMind, Baidu Minwa and Facebook, among others, play a big role.
What is the next challenge for Cognitive Computing?
Most machine learning systems are based on a large collection of examples depicting how the system is expected to perform. This is what is known in research as supervised learning, where the software is trained with labeled data (generally labeled by humans), for example, images tagged with the names of the objects they depict. Supervised learning works really well when we have a large and clean dataset, but such datasets are costly and scarce. On the other hand, most of the data available is not associated with any annotation and still holds a high potential that remains to be unlocked. Processing unstructured and unannotated data is addressed by a second family of techniques named unsupervised learning. These research avenues are expected to play a key role in building realistic intelligent systems. For example, most human and animal learning is unsupervised learning, which requires a very small number of training examples. We need to solve the unsupervised learning problem before we can even think of getting to true Cognitive Computing systems.
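The difference between the two families can be sketched in a few lines of plain Python. The data and labels below are hypothetical, and the two techniques (a nearest-centroid classifier and a two-cluster k-means) are simple stand-ins picked for illustration. The supervised function needs a human-provided label for every value; the unsupervised one finds the same two groups from the values alone, although it cannot name them.

```python
# Hypothetical 1-D measurements that form two groups.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
# Annotations that a human would have to provide (supervised case only).
labels = ["small", "small", "small", "large", "large", "large"]


def mean(xs):
    return sum(xs) / len(xs)


def fit_supervised(values, labels):
    """Nearest-centroid classifier: one centroid per human-provided label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: mean(members) for label, members in groups.items()}


def classify(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def fit_unsupervised(values, iterations=10):
    """Two-cluster k-means: groups the values without ever seeing a label."""
    centroids = [min(values), max(values)]  # simple deterministic start
    for _ in range(iterations):
        clusters = [[], []]
        for value in values:
            nearest = min((0, 1), key=lambda i: abs(centroids[i] - value))
            clusters[nearest].append(value)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            mean(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


labeled_model = fit_supervised(values, labels)
print(classify(labeled_model, 1.1))  # uses the human labels
print(fit_unsupervised(values))      # two group centers, no labels needed
```

Both approaches end up with centers near 1 and 5 on this toy data, but only the supervised one required someone to annotate every example beforehand.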
Is this book already available?
Yes, it is available on the web page www.JordiTorres.Barcelona/TensorFlow, and the paper version is available from Lulu and Amazon. A Spanish version is also available on this website, with its paper version available from Amazon. A course presentation based on the book is also available here (very soon I will post more documentation).
First version of this post published on 8 Feb 2016 (before publishing the book)
Updated version of this post published on 8 May 2016 (after publishing the book)