TensorFlow.js: What is possible with Machine learning in the browser?

19 Feb


When you talk about machine learning and Google's TensorFlow, most people think of Python and specialized hardware rather than JavaScript and a browser. This article explains what TensorFlow.js can do and why it makes sense to run machine learning in the browser.

TensorFlow.js is a JavaScript library that runs in the browser as well as with Node.js on a server. In this article, however, we are only interested in its use in the browser. The interface of TensorFlow.js is strongly based on Keras, TensorFlow’s high-level API. Keras code can often be told apart from TensorFlow.js code only at second glance. Most differences stem from the different language constructs that Python and JavaScript offer for configuration parameters.

Machine learning with every GPU

TensorFlow.js allows you to build machine learning projects from scratch. If the necessary data is available, models can be trained and executed directly in the browser. For this, TensorFlow.js uses the computer's graphics card (GPU) via the browser's WebGL API. This costs some performance, because WebGL needs a few tricks to make it execute the matrix multiplications that TensorFlow.js requires. These multiplications are essential because TensorFlow.js, as a machine learning approach, mainly supports neural networks, which map very well onto matrix multiplications both during training and during prediction. Here we already see the first advantage of TensorFlow.js over TensorFlow: while TensorFlow currently only supports NVIDIA GPUs via CUDA, TensorFlow.js works with any graphics card. Listing 1 contains the code that uses the high-level API to create a sequential neural network in the browser. If you know TensorFlow’s Keras API, it will look familiar very quickly. Tutorials can be found on tensorflow.org.

 

Listing 1

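// create a sequential model
const model = tf.sequential();

// add a convolutional layer to work on a monochrome 28x28 pixel image with 8
// filter units; as the first layer it is the one that declares the input shape
model.add(tf.layers.conv2d({
  inputShape: [28, 28, 1],
  filters: 8,
  kernelSize: 3, // required by conv2d; 3 is just an example value
  activation: 'relu'
}));

// flatten the feature maps so that a fully connected layer can follow
model.add(tf.layers.flatten());

// add a fully connected layer with 10 units (neurons), one per class
model.add(tf.layers.dense({units: 10, activation: 'softmax'}));

// compile the model like you would do in Keras
// the API speaks for itself
model.compile({
  optimizer: 'adam',
  loss: 'categoricalCrossentropy',
  metrics: ['accuracy']
});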
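
The fully connected layer in Listing 1 is a good place to see why neural networks map so well onto matrix multiplications. The following is a minimal sketch in plain JavaScript, not the way TensorFlow.js implements it: the function name `denseForward` and all numbers are made up for illustration. It computes the output of a dense layer as `x · W + b`, leaving out the activation function.

```javascript
// Forward pass of a fully connected layer without activation: y = x · W + b.
// x: [batch][inputs], weights: [inputs][units], bias: [units]
function denseForward(x, weights, bias) {
  return x.map(row =>
    bias.map((b, j) =>
      // start each sum at the bias of unit j, then add input * weight
      row.reduce((sum, xi, i) => sum + xi * weights[i][j], b)));
}

// Made-up numbers: one example with 3 inputs, a layer with 2 units.
const x = [[1, 2, 3]];
const weights = [[1, 0], [0, 1], [1, 1]];
const bias = [0.5, -1];

const y = denseForward(x, weights, bias);
console.log(y); // [ [ 4.5, 4 ] ]
```

In TensorFlow.js the same arithmetic is expressed on tensors, for example with `tf.matMul`, and it is this kind of operation that gets pushed to the GPU through WebGL.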

Interact with all browser APIs

Addressing hardware interfaces on different operating systems and devices can still be a painful experience. Not so when developing a browser-based application. Even access to hardware as complex as a camera or microphone is anchored in the HTML standard and supported by all current browsers. The browser is also designed for interaction by nature, which works in your favor here. Interactive applications with a share of machine learning are therefore easier to build than ever.
As an example, take the simple game Scavenger Hunt, which also runs in the browser of a mobile phone and is the most fun there. As shown in Figure 1, you have to quickly find an object in the real world that matches the displayed emoji. For this, the game uses the built-in camera and a trained neural network that can detect the matching objects. Any JavaScript developer can use such a model, even without machine learning skills.

 

Machine learning without the need for installation on each computer

TensorFlow.js also allows you to deploy a model created in advance with TensorFlow. This model may already have been fully or partly trained on powerful hardware. In the browser, it is then either only applied or trained further. Figure 2 shows a Pac-Man variant that can be controlled by different poses. It is based on a pre-trained network that is trained further in the browser on the player's own poses. This is known as transfer learning.

The model is converted with a supplied converter program and can then be loaded asynchronously with a line similar to the following:

 

// called tf.loadModel in TensorFlow.js versions before 1.0
const model = await tf.loadLayersModel('model.json');

 

After that, the model is no longer distinguishable from a model created directly in the browser. So it can, for example, be easily used for prediction, which in turn is executed asynchronously on the GPU:

const example = tf.tensor([[150, 45, 10]]);
const prediction = model.predict(example);
const value = await prediction.data();
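
`prediction.data()` resolves to a flat typed array of numbers. If the model is a classifier, which is an assumption here since the article does not say what this model predicts, the usual next step is to pick the entry with the highest score. The sketch below does this in plain JavaScript; the helper `argMax`, the class names, and the scores are all hypothetical.

```javascript
// Return the index of the largest entry in an array or typed array.
function argMax(values) {
  let best = 0;
  for (let i = 1; i < values.length; i++) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}

// Hypothetical class names and scores, standing in for the result of
// `await prediction.data()` of a three-class classifier.
const labels = ['cat', 'dog', 'bird'];
const scores = Float32Array.from([0.1, 0.7, 0.2]);

const top = argMax(scores);
console.log(labels[top], scores[top].toFixed(2)); // dog 0.70
```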

In addition to entertainment through games, even more useful applications are conceivable here. Navigation or interaction through gestures can be helpful for people with disabilities as well as for people in special situations. And as already mentioned, you get all this without any installation, simply by loading a website.
Another example of such pose detection is PoseNet, shown in Figure 3. It has been pre-trained to the point that it can recognize the position of face, arms and legs even with several people in the picture. Here, too, there is potential for everything from gaming to controlling serious applications, even from a certain distance. Using PoseNet is again quite simple and does not even require basic knowledge of machine learning. Listing 2 briefly outlines how easy it is.

Listing 2
import * as posenet from '@tensorflow-models/posenet';
import * as tf from '@tensorflow/tfjs';

// load the posenet model
const model = await posenet.load();

// get the poses from a video element linked to the camera
const poses = await model.estimateMultiplePoses(video);

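// each entry in poses contains
// - score: the overall confidence for that pose
// - keypoints: one entry per body part with its name (part),
//   its own confidence score and its x, y position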
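
To give an idea of how such poses can drive a game or an application, here is a small sketch in plain JavaScript. It assumes the pose shape that PoseNet returns (`score`, plus `keypoints` with `part`, `score` and `position`). The helper names `confidentKeypoints` and `isHandRaised`, the threshold of 0.5, and the sample numbers are made up for illustration.

```javascript
// Keep only the keypoints the model is reasonably sure about.
function confidentKeypoints(pose, minScore = 0.5) {
  return pose.keypoints.filter(k => k.score >= minScore);
}

// True if the given wrist is above the nose.
// Image coordinates grow downwards, so "above" means a smaller y.
function isHandRaised(pose, wrist = 'rightWrist', minScore = 0.5) {
  const byPart = Object.fromEntries(
    confidentKeypoints(pose, minScore).map(k => [k.part, k.position]));
  return Boolean(
    byPart[wrist] && byPart.nose && byPart[wrist].y < byPart.nose.y);
}

// Hand-written sample in the shape of a PoseNet pose; the numbers are made up.
const samplePose = {
  score: 0.9,
  keypoints: [
    {part: 'nose', score: 0.98, position: {x: 120, y: 80}},
    {part: 'rightWrist', score: 0.85, position: {x: 60, y: 40}},
    {part: 'leftWrist', score: 0.2, position: {x: 180, y: 30}}
  ]
};

console.log(isHandRaised(samplePose, 'rightWrist')); // true
console.log(isHandRaised(samplePose, 'leftWrist'));  // false, score too low
```

Filtering by score first matters in practice, because body parts that are hidden or outside the picture still get a position, just with a low confidence.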

 

User data does not need to leave the browser

Especially now that data protection in accordance with the GDPR is becoming more and more important, people think twice about whether they want a particular cookie on their computer, or whether they want to send usage statistics to a software manufacturer to improve their user experience. But what about turning this around? The manufacturer provides a general model of how the software is used, and, similar to the Pac-Man game described above, this model is adapted to the individual user through transfer learning in the browser. Not much has happened here yet, but the potential is big. Let’s wait and see what develops.


Conclusion

Machine learning in a browser does not seem to make much sense to many developers at first. But if you take a closer look into the subject, there are application possibilities that no other platform can offer:

  • 1. Education: You can interact directly with machine learning concepts and learn by experimenting.
  • 2. Development: If you already have a JavaScript application, or want or need to build one, you can use or train machine learning models directly in it.
  • 3. Games: Real-time pose estimation using only the camera (that is, how the people in front of the camera are currently moving) or image recognition can be coupled directly with games. There are some very cool examples of what you can do with it. However, the possibilities go well beyond gaming.
  • 4. Deployment: You already have a machine learning model and wonder how to put it into production. The browser is one solution. Even finished models can be integrated into your own application without deeper knowledge of machine learning.
  • 5. Interactive visualizations: For interactive clustering or even artistic projects.

As Figure 4 shows, TensorFlow.js still has some performance drawbacks compared to TensorFlow on the same hardware. The comparison runs on a GTX 1080 GPU and measures the time for a prediction with MobileNet, which is also used for the examples shown here. In this case, TensorFlow runs three to four times faster than TensorFlow.js; however, both times are very low in absolute terms. The WebGPU standard, which will allow more direct access to the GPU, gives hope for better performance.
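
If you want to get a rough feeling for prediction times on your own hardware, a small timing helper is enough. The sketch below is not the benchmark behind Figure 4; the name `averageMs`, the number of runs, and the stand-in workload are assumptions. It relies on a global `performance.now()`, which is available in browsers and in recent Node.js versions.

```javascript
// Average wall-clock time per call of an async function, in milliseconds.
async function averageMs(fn, runs = 10, warmup = 2) {
  // The first calls often include one-time setup costs, so they are not timed.
  for (let i = 0; i < warmup; i++) await fn();
  const start = performance.now();
  for (let i = 0; i < runs; i++) await fn();
  return (performance.now() - start) / runs;
}

// Stand-in workload that just waits about 5 ms. In a real page this would be
// something like () => model.predict(input).data(), so that the result is
// actually read back before the clock stops.
let calls = 0;
const fakePredict = () =>
  new Promise(resolve => { calls++; setTimeout(resolve, 5); });

const result = averageMs(fakePredict, 5, 1).then(ms => {
  console.log(`${ms.toFixed(1)} ms per call over 5 runs`);
  return ms;
});
```

Awaiting the data inside the timed function is the important design choice here: the call to `predict` alone can return before the GPU has finished its work.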
