February 26-28, 2025
Montreal, Canada

Efficient Code Searching for Large Codebases

This session covers how embedding models can retrieve relevant method matches more effectively than traditional search. We’ll explore real-world examples, such as migrating identity providers, where embeddings can identify all HTTP authorization headers across libraries. We’ll also cover how top-k selection improves search speed and accuracy by leveraging code-as-data models built on Lossless Semantic Trees.
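To make the idea of top-k selection over embeddings concrete, here is a minimal sketch in Python. It is not the speaker's or Moderne's implementation: the method names and the three-dimensional vectors are hypothetical, hand-made stand-ins for what a real embedding model would produce, and the sketch does not model Lossless Semantic Trees at all. It only shows ranking candidate methods by cosine similarity to a query vector and keeping the k best.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k):
    """Return the k (score, method_name) pairs most similar to query_vec."""
    scored = ((cosine(query_vec, vec), name) for name, vec in index.items())
    return heapq.nlargest(k, scored)


# Hypothetical index: method name -> made-up embedding (assumption, not real model output).
index = {
    "HttpClient.setAuthorizationHeader": [0.9, 0.1, 0.0],
    "RequestBuilder.addBearerToken": [0.8, 0.2, 0.1],
    "StringUtils.padLeft": [0.0, 0.1, 0.9],
    "Cache.evict": [0.1, 0.9, 0.2],
}

# Made-up embedding standing in for a query such as "HTTP authorization header".
query = [1.0, 0.1, 0.0]

results = top_k(query, index, k=2)
for score, name in results:
    print(f"{score:.3f}  {name}")
```

With these toy vectors, the two authorization-related methods rank first. Using `heapq.nlargest` avoids sorting the full candidate list, which is one simple reason top-k selection stays cheap as the number of indexed methods grows.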


Justine Gehring

Moderne

Justine Gehring is a researcher in the field of Machine Learning for Code (ML4Code) and Graph Neural Networks (GNNs). Justine obtained her master's from McGill and Mila, where her research focused on generating code under challenging circumstances, such as library-specific code. Presently, Justine is a research engineer at Moderne, focusing on leveraging AI for large-scale code refactoring and impact analysis. She also oversees the partnership with Mila.
