|Tag

Jan 
7, 
2021

Real-time anomaly detection with Kafka and Isolation Forests

Anomalies - or outliers - are ubiquitous in data. Be it due to measurement errors of sensors, unexpected events in the environment or faulty behaviour of a machine. In many cases, it makes sense to detect such anomalies in real time in order to be able to react immediately. The data streaming platform Apache Kafka and the Python library scikit-learn provide us with the necessary tools for this.
Dec 
16, 
2020

Let’s visualize the coronavirus pandemic

Since February, we have been inundated in the media with diagrams and graphics on the spread of the coronavirus. The data comes from freely accessible sources and can be used by everyone. But how do you turn the source data into a data set that can be used to create something visual like a dashboard? With Python and modules like pandas, this is no magic trick.
Apr 
14, 
2020

Continuous Delivery for Machine Learning

In modern software development, we’ve grown to expect that new software features and enhancements will simply appear incrementally, on any given day. This applies to consumer applications such as mobile, web, and desktop apps, as well as modern enterprise software. We’re no longer tolerant of big, disruptive software deployments. ThoughtWorks has been a pioneer in Continuous Delivery (CD), a set of principles and practices that improve the throughput of delivering software to production in a safe and reliable way.
Jan 
28, 
2020

Deep Learning: Not only in Python

Although there are powerful and comprehensive machine learning solutions for the JVM with frameworks such as DL4J, it may be necessary to use TensorFlow in practice. This can, for example, be the case if a certain algorithm exists only in a TensorFlow implementation and the effort to port the algorithm into another framework is too high. Although you interact with TensorFlow via a Python API, the underlying engine is written in C++. Using the TensorFlow Java wrapper library, you can train and inference TensorFlow models from the JVM without having to rely on Python. Existing interfaces, data sources, and infrastructures can be integrated with TensorFlow without leaving the JVM.

Behind the Tracks