DUENDERS

Case Study

Semantic Search with LLMs

Overview

Humans think in ideas, not just words. Ever had that moment when a thought or idea is right on the tip of your tongue, but the exact word just won't come? Computers and the internet, however, have long operated in a way where you need to know the exact word or phrase to find what you're looking for. It's like trying to use a search bar but coming up blank because you couldn't hit the exact keyword.

But what if our tech could think more like us? That's where semantic search jumps in. It's like a search upgrade that allows us to retrieve similar concepts, even if we don't use the exact right word.

This case study describes a project we delivered for a recruiting firm. We discuss how semantic search, powered by Large Language Models (the technology behind chatbots like ChatGPT), is changing how candidates are matched to job descriptions.

Problem

In recruitment, Boolean search has long been treated as the only option. However, its inherent limitations have become increasingly apparent as the landscape of job titles, skills, and industries evolves.


Inflexible Query Size: Crafting a Boolean string that captures the ideal candidate pool is challenging. Using multiple OR operators often yields a flood of candidates, making the process overwhelming. On the flip side, adding more AND operators to refine the search can result in an overly narrow scope, potentially sidelining valuable candidates.


Over-Specification Issues: Aiming to get precise results, recruiters often layer their Boolean search with multiple conditions. However, this can become counterproductive. The more refined the search, the higher the risk of missing out on qualified profiles that don't match the exact string.


Literal Limitations: Boolean searches are unforgivingly exact. They don't understand synonyms, related skills, or variations in job titles. Thus, recruiters often miss candidates who may be using different terminology or phrasing in their CVs.


Lack of Prioritization: One significant challenge with Boolean search results is the absence of ranking. Recruiters get a vast list of candidates without any clear indication of who might be the most relevant, forcing a manual, time-consuming review.


Platform-Specific Challenges: Boolean search isn’t uniformly supported. Each job platform may have its unique requirements, forcing recruiters to tweak their search strings. This inconsistency isn't just frustrating; it's time-consuming.


Inconsistent Boolean Support: Even if platforms support Boolean search, there’s often a lack of standardization in how the search strings are constructed. This variance means recruiters must often reconfigure their approach for each platform, adding to the complexity.


Examples Highlighting Issues:

  • Imagine looking for a "frontend developer" using a Boolean search. A recruiter might miss out on potential candidates who label themselves as "software engineers" or "coders" specializing in the frontend.

  • A job description might prioritize experience with Azure. However, Boolean search wouldn’t understand that a candidate with AWS or GCP knowledge could also be a valuable addition, given the parallel competencies in cloud platforms.

  • Similarly, if a banking role is being filled, Boolean search might not consider a candidate with experience in the insurance sector, despite both industries being closely related under the financial services umbrella.
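To make the literal-matching problem concrete, here is a minimal sketch of a Boolean filter in Python. The function `boolean_match` and the sample CVs are hypothetical, written only for illustration; real platforms implement Boolean search with their own syntax and rules.

```python
def boolean_match(cv_text: str, all_of: list[str], any_of: list[str]) -> bool:
    """Literal Boolean filter: every AND phrase and at least one OR phrase must appear."""
    text = cv_text.lower()
    has_all = all(phrase.lower() in text for phrase in all_of)
    has_any = any(phrase.lower() in text for phrase in any_of) if any_of else True
    return has_all and has_any


# Hypothetical CV snippets, all describing frontend work in different words.
cvs = {
    "cv_1": "Frontend developer with 5 years of React and TypeScript.",
    "cv_2": "Software engineer specializing in the frontend: React, CSS, accessibility.",
    "cv_3": "Coder focused on UI work with React and Vue.",
}

# Equivalent to the string: "frontend developer" AND (React OR Vue)
matches = [
    cv_id
    for cv_id, text in cvs.items()
    if boolean_match(text, all_of=["frontend developer"], any_of=["React", "Vue"])
]
print(matches)  # ['cv_1']
```

Only the CV that uses the exact phrase is returned, even though all three candidates describe frontend work.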

Approach

Leveraging Semantic Search with LLMs.

Semantic search aims to understand the intent and contextual meaning of search phrases. When combined with LLMs like ChatGPT, this approach can discern nuanced relationships between terms, understand industry overlaps, and prioritize profiles.


Key Components:

  1. Large Language Models (LLMs): These models, trained on vast amounts of text data, understand context and can link related concepts. For example, they can recognize that 'software engineer' can be a synonym for 'developer'.

  2. ChatGPT: As a prime example of LLMs, ChatGPT can be incorporated to generate or understand content in the context of semantic search.

  3. Vector Stores: These aid in storing and retrieving high-dimensional vectors, often representing words or documents. They assist in fast and efficient similarity searches for semantic search tasks.
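As a rough illustration of what a vector store does, the sketch below keeps vectors in memory and ranks them by cosine similarity. The class `InMemoryVectorStore` and the three-dimensional vectors are hypothetical and hand-written; a production system would use embeddings produced by a model and a dedicated vector database, which this case study does not name.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Toy vector store: a list of (id, vector) pairs scanned linearly on search."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, item_id: str, vector: list[float]) -> None:
        self._items.append((item_id, vector))

    def search(self, query: list[float], top_k: int = 3) -> list[tuple[str, float]]:
        scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in self._items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


# Hand-made vectors standing in for real embeddings (assumption for illustration).
store = InMemoryVectorStore()
store.add("software engineer", [0.90, 0.10, 0.00])
store.add("accountant", [0.05, 0.10, 0.95])

developer_vector = [0.85, 0.20, 0.05]
results = store.search(developer_vector, top_k=2)
print(results[0][0])  # software engineer
```

Real vector stores replace the linear scan with approximate nearest-neighbor indexes so that search stays fast across many vectors.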

Solution

Data Preprocessing: Extract key information from CVs, converting them into structured formats and vectors.

We connect to the APIs of existing applicant tracking systems (ATS) and HR tools.


Boolean Search: While we've reiterated that Boolean search has its limitations, it's not without merit. Used in conjunction with other methods, it becomes a powerful tool for precision. For instance, it can pinpoint candidates in a specific location when a job has on-site requirements.
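One way to combine the two methods is to apply the exact condition as a hard filter and then order the survivors by semantic similarity. The sketch below assumes each candidate already carries a similarity score from a semantic search step; the `Candidate` class, its fields, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    location: str
    similarity: float  # assumed to come from a prior semantic search step


def filter_then_rank(candidates: list[Candidate], required_location: str) -> list[Candidate]:
    """Keep only candidates in the required location, then sort by similarity (best first)."""
    wanted = required_location.strip().lower()
    on_site = [c for c in candidates if c.location.strip().lower() == wanted]
    return sorted(on_site, key=lambda c: c.similarity, reverse=True)


candidates = [
    Candidate("Ana", "Madrid", 0.71),
    Candidate("Ben", "Lisbon", 0.93),
    Candidate("Carla", "Madrid", 0.88),
]

shortlist = filter_then_rank(candidates, required_location="madrid")
print([c.name for c in shortlist])  # ['Carla', 'Ana']
```

The exact filter handles the non-negotiable requirement, while the similarity score decides the order in which the remaining profiles are reviewed.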


Load Embeddings: We segment various pieces of text and information from each CV. Each segment is transformed into a numerical vector, known as an embedding, and then placed in a vector store.
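The loading step can be sketched as follows. Two assumptions are worth flagging: CVs are split on blank lines, and `toy_embed` is a hashed bag-of-words stand-in for a real embedding model. The toy function captures word overlap only, not meaning, and exists here solely so the example runs without external services. All names are hypothetical.

```python
import hashlib
import math
import re

DIM = 64  # vector size of the toy embedding (arbitrary choice for the sketch)


def toy_embed(text: str) -> list[float]:
    """Stand-in for a model embedding: hash each token into a bucket, then L2-normalize."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9+#]+", text.lower()):
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def segment_cv(cv_text: str) -> list[str]:
    """Split a CV into segments on blank lines (assumed segmentation rule)."""
    return [part.strip() for part in re.split(r"\n\s*\n", cv_text) if part.strip()]


def load_cv(store: list[dict], cv_id: str, cv_text: str) -> None:
    """Embed every segment of a CV and append it to the store with its metadata."""
    for index, segment in enumerate(segment_cv(cv_text)):
        store.append(
            {"cv_id": cv_id, "segment": index, "text": segment, "vector": toy_embed(segment)}
        )


sample_cv = """Senior frontend developer, 6 years of experience.

Skills: React, TypeScript, CSS, accessibility.

Education: BSc in Computer Science."""

vector_store: list[dict] = []
load_cv(vector_store, "cv_1", sample_cv)
print(len(vector_store))  # 3
```

Storing the CV identifier alongside each vector lets a later search map a matching segment back to the full profile.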


Semantic Search: To search for a candidate, the recruiter inputs any text, for example by pasting the full job description. The system extracts multiple queries from this text. Each query is converted into an embedding, and the CVs most similar to that query are fetched. Each retrieved CV comes with a similarity score relative to the query. This allows recruiters to start their review with the closest matches, as determined by the AI, rather than working through the arbitrary order returned by a basic Boolean search.
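The retrieval flow can be sketched end to end under some loud assumptions. Here `embed` is a tiny hand-written concept map standing in for a learned embedding model, `extract_queries` simply splits the job description into lines, and each CV's final score is the average of its best segment score per query. None of these choices is confirmed as the method used in the project; they only make the idea runnable.

```python
import math
import re

# Hand-written concept groups standing in for what a learned model would infer.
CONCEPTS = {
    "frontend": 0, "react": 0, "ui": 0,
    "azure": 1, "aws": 1, "gcp": 1, "cloud": 1,
    "banking": 2, "insurance": 2, "finance": 2,
}


def embed(text: str) -> list[float]:
    """Toy embedding: count how many tokens fall into each concept group."""
    vec = [0.0, 0.0, 0.0]
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in CONCEPTS:
            vec[CONCEPTS[token]] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract_queries(job_description: str) -> list[str]:
    """Assumed query extraction: one query per non-empty line."""
    return [line.strip() for line in job_description.splitlines() if line.strip()]


def rank_cvs(job_description: str, cvs: dict[str, list[str]]) -> list[tuple[str, float]]:
    """Score each CV by its best-matching segment per query, averaged over all queries."""
    query_vectors = [embed(q) for q in extract_queries(job_description)]
    if not query_vectors:
        return []
    scores = {}
    for cv_id, segments in cvs.items():
        segment_vectors = [embed(s) for s in segments]
        best_per_query = [
            max((cosine(q, s) for s in segment_vectors), default=0.0) for q in query_vectors
        ]
        scores[cv_id] = sum(best_per_query) / len(best_per_query)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


job = """Frontend role using React
Experience with Azure
Background in banking"""

cvs = {
    "cv_a": ["UI engineer, React", "Deployed services on AWS", "Worked for an insurance company"],
    "cv_b": ["Backend Java engineer", "Deployed on GCP"],
    "cv_c": ["Pastry chef", "Managed a bakery"],
}

ranking = rank_cvs(job, cvs)
print([cv_id for cv_id, _ in ranking])  # ['cv_a', 'cv_b', 'cv_c']
```

Note that `cv_a` ranks first without containing the words "frontend", "Azure", or "banking", which is the behavior a literal Boolean string cannot provide.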

Results

Enhanced Candidate Discovery: Our approach broadened the search, considering profiles that would have been missed in a classic Boolean search.


Prioritized Profiles: Instead of a flat list, recruiters now received a ranked list, enabling them to focus on the most promising candidates first.


Quality Matches: The semantic search yielded candidates that were more in line with the holistic requirements of the job, rather than just ticking keyword boxes.


Recruiters saved 50-70% of the time they previously spent searching and reviewing CVs. That left more time for what brings value to the company: building human relationships with candidates and hiring managers, which in turn brings more business and revenue to the recruiting firm.

Semantic Search can be used pretty much anywhere.

Do you want to query all your Word documents, PDFs, and PowerPoint presentations?

Do you want to search across your project management tools (Confluence, Jira, Asana, etc.)?

Do you want to search and summarize meeting transcripts?

Possibilities are endless. Book a free meeting to discuss how Semantic Search can make your current processes more efficient.
