Jan 7, 2021
Real-time anomaly detection with Kafka and Isolation Forests
Anomalies, or outliers, are ubiquitous in data, whether they stem from sensor measurement errors, unexpected events in the environment, or the faulty behaviour of a machine. In many cases, it makes sense to detect such anomalies in real time so that you can react immediately. The data streaming platform Apache Kafka and the Python library scikit-learn provide the necessary tools for this.
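As a rough illustration of the idea, here is a minimal sketch of scoring incoming readings with scikit-learn's `IsolationForest`. It is not the code from the article: the training data is synthetic, the names `train`, `stream`, and `is_anomaly` are hypothetical, and a plain Python list stands in for the messages a Kafka consumer would deliver.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Assumption: synthetic "normal" sensor readings centred on 20.0.
rng = np.random.default_rng(42)
train = rng.normal(loc=20.0, scale=0.5, size=(500, 1))

# Fit the forest on historical data that is assumed to be mostly normal.
model = IsolationForest(n_estimators=100, contamination="auto", random_state=42)
model.fit(train)


def is_anomaly(reading: float) -> bool:
    """Return True if the model labels the reading as an outlier (-1)."""
    return bool(model.predict([[reading]])[0] == -1)


# Stand-in for a message stream; in a real setup each value would
# come from a consumer loop reading a Kafka topic.
stream = [20.1, 19.9, 35.0, 20.0]
flags = [is_anomaly(value) for value in stream]
```

Training offline and only calling `predict` per message keeps the per-event cost low, which is one plausible way to fit such a model into a streaming pipeline.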
Dec 16, 2020
Let’s visualize the coronavirus pandemic
Since February, we have been inundated in the media with diagrams and graphics on the spread of the coronavirus. The data comes from freely accessible sources and can be used by everyone. But how do you turn the source data into a data set that can be used to create something visual like a dashboard? With Python and modules like pandas, this is no magic trick.
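To give a flavour of what such a preparation step might look like, the following sketch reshapes a small wide-format table into a tidy one with pandas. The layout (one column per date) and all values are invented placeholders for illustration, not real case numbers and not necessarily the structure of the sources the article uses.

```python
import pandas as pd

# Assumption: hypothetical wide-format input with one column per date.
# The numbers are placeholders, not real data.
wide = pd.DataFrame(
    {
        "country": ["A", "B"],
        "2020-03-01": [10, 0],
        "2020-03-02": [15, 2],
        "2020-03-03": [25, 5],
    }
)

# Turn the date columns into rows: one row per country and date.
tidy = wide.melt(id_vars="country", var_name="date", value_name="cumulative_cases")
tidy["date"] = pd.to_datetime(tidy["date"])
tidy = tidy.sort_values(["country", "date"]).reset_index(drop=True)

# Derive daily new cases from the cumulative counts per country;
# the first day of each country keeps its cumulative value.
tidy["new_cases"] = (
    tidy.groupby("country")["cumulative_cases"]
    .diff()
    .fillna(tidy["cumulative_cases"])
    .astype(int)
)
```

A long table like `tidy` is usually easier to feed into plotting or dashboard tools than the wide original, because each row is a single observation.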
May 5, 2020
Explainability – a promising next step in scientific machine learning
With the emergence of deep neural networks, the question has arisen of how machine learning models can be not only accurate but also explainable. In this article, you will learn what explainability is and which elements it consists of, and why we need expert knowledge to interpret machine learning results so that we avoid making the right decisions for the wrong reasons.
Continuous Delivery for Machine Learning
In modern software development, we’ve grown to expect that new software features and enhancements will simply appear incrementally, on any given day. This applies to consumer applications such as mobile, web, and desktop apps, as well as modern enterprise software. We’re no longer tolerant of big, disruptive software deployments. ThoughtWorks has been a pioneer in Continuous Delivery (CD), a set of principles and practices that improve the throughput of delivering software to production in a safe and reliable way.
Feb 11, 2020
Tutorial: Explainable Machine Learning with Python and SHAP
Machine learning models often behave like a “black box”: we don’t always know exactly how they arrive at their predictions. This may lead to unwanted consequences. In the following tutorial, Natalie Beyer will show you how to use the SHAP (SHapley Additive exPlanations) package in Python to get closer to explainable machine learning results.
Jan 28, 2020
Deep Learning: Not only in Python
Although there are powerful and comprehensive machine learning solutions for the JVM with frameworks such as DL4J, it may be necessary to use TensorFlow in practice. This can be the case, for example, if a certain algorithm exists only in a TensorFlow implementation and porting it to another framework would take too much effort. Although you interact with TensorFlow via a Python API, the underlying engine is written in C++. Using the TensorFlow Java wrapper library, you can train TensorFlow models and run inference with them from the JVM without having to rely on Python. Existing interfaces, data sources, and infrastructures can be integrated with TensorFlow without leaving the JVM.