An end-to-end AI pipeline that extracts skills from a resume PDF, generates semantic embeddings, and ranks job listings by similarity — automating the job search process.
Job searching is broken. Candidates spend hours reading job descriptions to work out whether their skills match, and keyword scanning by applicant tracking systems (ATS) is noisy and misses semantic relationships between skills.
The goal: build an AI-powered tool that reads a resume, understands its skill profile at a semantic level, and automatically surfaces the most relevant job listings — ranked by actual fit, not keyword overlap.
The solution is a Python pipeline with four distinct stages: document parsing, NLP skill extraction, semantic embedding generation, and cosine similarity ranking. It outputs a list of job listings ranked by match score.
The key insight is to represent both the resume and the job descriptions as dense vectors in the same embedding space, then find the jobs whose vectors lie closest to the resume's vector, measured by cosine similarity.
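To make the geometric idea concrete, here is a minimal sketch of the ranking step using only the standard library. The function names (`cosine_similarity`, `rank_jobs`), the job titles, and the 3-dimensional toy vectors are illustrative assumptions, not the project's actual code; in a real run the vectors would come from the embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)

def rank_jobs(resume_vec, job_vecs):
    """Return (title, score) pairs sorted from best match to worst."""
    scored = [(title, cosine_similarity(resume_vec, vec))
              for title, vec in job_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for real embeddings (hypothetical data).
resume_embedding = [1.0, 1.0, 0.0]
job_embeddings = {
    "Graphic Designer": [0.0, 0.0, 1.0],  # orthogonal to the resume
    "ML Engineer": [2.0, 2.0, 0.0],       # same direction, larger magnitude
    "Data Analyst": [1.0, 0.0, 0.0],      # partial overlap
}

ranking = rank_jobs(resume_embedding, job_embeddings)
for title, score in ranking:
    print(f"{title}: {score:.3f}")
```

Because cosine similarity depends only on direction, the "ML Engineer" vector scores highest even though its magnitude differs from the resume's. That property is one reason cosine similarity is a common choice for comparing embeddings of texts with different lengths.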
The pipeline is fully modular — each stage is a separate Python module with a clean interface, making it easy to swap components (e.g., replace the embedding model with a newer one).
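One way such a swappable stage could look is sketched below with `typing.Protocol`. The `Embedder` interface, the two toy embedder classes, and `embed_all` are hypothetical names invented for illustration; the project's real module boundaries may differ, and the toy embedders are stand-ins for an actual embedding model.

```python
from typing import List, Protocol, Sequence

class Embedder(Protocol):
    """Hypothetical interface: anything with embed(text) -> vector fits."""
    def embed(self, text: str) -> Sequence[float]: ...

class VowelCountEmbedder:
    """Toy embedder: one dimension per vowel, counting occurrences."""
    def embed(self, text: str) -> Sequence[float]:
        lowered = text.lower()
        return [float(lowered.count(ch)) for ch in "aeiou"]

class LengthEmbedder:
    """Alternative toy embedder: word count and character count."""
    def embed(self, text: str) -> Sequence[float]:
        return [float(len(text.split())), float(len(text))]

def embed_all(embedder: Embedder, texts: Sequence[str]) -> List[List[float]]:
    """Downstream code depends only on the interface, not a concrete model."""
    return [list(embedder.embed(t)) for t in texts]

# Swapping the embedding stage is a one-argument change for the caller.
vowel_vectors = embed_all(VowelCountEmbedder(), ["Python"])
length_vectors = embed_all(LengthEmbedder(), ["data engineer"])
```

The ranking stage never needs to know which embedder produced the vectors, as long as the resume and the job descriptions are embedded by the same one.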