ML Basics & Principles


Address Matching with NLP in Python

Discover the power of address matching in real estate data management with this comprehensive guide. Learn how to leverage natural language processing (NLP) techniques using Python, including open-source libraries like spaCy and fuzzywuzzy, to parse, clean, and match addresses. From breaking down data silos to geocoding and point-in-polygon searches, this article provides a step-by-step approach to creating a Source-of-Truth Real Estate Dataset. Whether you're in geospatial analysis, real estate data management, logistics, or compliance, accurate address matching is the key to unlocking valuable insights.

AI is a Human Endeavor

As AI advances, calls for regulation are increasing. But viable regulatory policies will require a broad public debate. We spoke with Mhairi Aitken, Ethics Fellow at the British Alan Turing Institute, about the current discussions on risks, AI regulation, and visions of shiny robots with glowing brains.

AI Alignment

At least since the arrival of ChatGPT, many people have become fearful that we are losing control over technology and can no longer anticipate the consequences it may have. AI Alignment deals with this problem and the technical approaches to solving it.

Scalable Programming

Java continuously introduces new, useful features. For instance, Java 8 introduced the Stream API, one of the biggest highlights of the past few years. But is aggregating data with the Stream API a panacea? In this article, I’d like to explore whether there’s a better alternative for certain cases from a complexity perspective.

Why are we doing this anyway?

Modularization is frequently discussed, but after some time, the people discussing it realize that they don’t mean the same thing. Over the last fifty years, computer science has given us a number of good explanations of what modularization is all about, but is that really enough to come to the same conclusions and arguments?

On pythonic tracks

Python has established itself as a quasi-standard in the field of machine learning over the last few years, in part due to the broad availability of libraries. Understandably, Oracle did not enjoy watching this trend: after all, Java has to be widely used if Oracle wants to earn serious money with its product. Some time ago, Oracle released its own library, Tribuo, under an open source license.

Behind the Tracks