Two great speakers will give presentations: Nils Reimers from co:here on "Incorporating New Knowledge Into Language Models" and Matthias Richter from ML6 on "Building a Domain-Specific Search Engine"!
The doors will open at 6:30pm and the talks will start at 7:00pm. To coordinate registration, there are two separate meetup pages: this page is for the registration of on-site participants, while remote participants should sign up on the separate page to join via Zoom.
Language models work well for many NLP tasks, but they have one big weakness: with each day that passes since they were pre-trained or fine-tuned, their knowledge becomes more and more obsolete. For example, the BERT model still thinks that Barack Obama is the current US president. This is a big issue especially in semantic search, as we often search for the most recent events. In this talk, I will give an overview of how to incorporate new knowledge into language models like BERT, with a special focus on search. I will then present Generative Pseudo Labeling (GPL), an efficient method to adapt semantic search models to new domains and datasets.
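At its core, GPL distills a cross-encoder teacher into a dense retriever by matching score *margins* between positive and mined negative passages (MarginMSE). The sketch below illustrates only that training signal with hypothetical hand-picked scores; a real pipeline would generate queries with a model like T5, mine negatives with a dense retriever, and score pairs with a cross-encoder (e.g. via the sentence-transformers library).

```python
# Toy sketch of the GPL training signal (MarginMSE). All scores below are
# hypothetical stand-ins for real cross-encoder / dense-retriever outputs.

def margin_mse_loss(teacher_scores, student_scores):
    """Mean squared error between teacher and student (pos - neg) margins."""
    losses = []
    for (t_pos, t_neg), (s_pos, s_neg) in zip(teacher_scores, student_scores):
        teacher_margin = t_pos - t_neg
        student_margin = s_pos - s_neg
        losses.append((teacher_margin - student_margin) ** 2)
    return sum(losses) / len(losses)

# Pseudo-labeled triples (query, positive passage, mined negative passage),
# represented here only by their (score_pos, score_neg) pairs.
teacher = [(9.0, 2.0), (7.5, 4.5)]   # soft labels from the cross-encoder teacher
student = [(8.0, 3.0), (6.0, 5.0)]   # scores from the dense retriever in training

print(margin_mse_loss(teacher, student))  # → 4.0
```

Matching margins rather than absolute scores is what makes the soft labels robust: the student only has to rank passages like the teacher does, not reproduce its score scale.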
Semantic search engines are attracting more and more attention, and at ML6 we work with many different domains and datasets. In this talk, I will give some insights into practical use cases where semantic search beats classical lexical search engines. The latest version of the Haystack framework already integrates an implementation of Generative Pseudo Labeling (GPL). I will demonstrate how you can easily use GPL to adapt a dense retriever to any domain-specific dataset and build a semantic search engine on top. To this end, I will showcase a small demo that compares the results of different search approaches.
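The lexical-vs-semantic contrast the demo will explore can be sketched in a few lines. The toy embeddings below are hypothetical stand-ins for vectors from a trained dense retriever (such as one adapted with GPL in Haystack); a term-overlap count stands in for BM25-style lexical matching.

```python
# Minimal sketch: lexical search matches words, semantic search matches meaning.
import math

def lexical_score(query, doc):
    """Simple term-overlap count, a stand-in for BM25-style lexical matching."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical embeddings: paraphrases land close together in vector space.
embeddings = {
    "cheap flights to Berlin":     [0.9, 0.1, 0.2],
    "low-cost airfare for Berlin": [0.85, 0.15, 0.25],
    "map of flights to Berlin":    [0.3, 0.2, 0.9],
}

query = "cheap flights to Berlin"
docs = ["low-cost airfare for Berlin", "map of flights to Berlin"]

best_lexical = max(docs, key=lambda d: lexical_score(query, d))
best_semantic = max(docs, key=lambda d: cosine(embeddings[query], embeddings[d]))

print(best_lexical)   # → map of flights to Berlin   (shares 3 query words)
print(best_semantic)  # → low-cost airfare for Berlin (same meaning, no overlap beyond "Berlin")
```

The word-overlap score is fooled by the document that merely repeats query terms, while the embedding-based score retrieves the paraphrase — the gap that domain-adapted dense retrievers aim to exploit.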