BERT is a state-of-the-art natural language processing (NLP) model that is pretrained on unlabelled text and then fine-tuned, via transfer learning, for a variety of NLP tasks. Because of its promising novel ideas and impressive performance, we chose it as a core component of a new natural language generation product. Reading a paper and perhaps following a tutorial with example code is, however, a very different thing from putting a working piece of software into production.
In this session, we will tell you how we trained a custom version of the BERT network and integrated it into a natural language generation (NLG) application. You will hear how we arrived at the decision to use BERT and what other approaches we tried. We will tell you about the failures and the mistakes we made so you do not have to repeat them, but also about the surprises, successes and lessons learned.