Job Searcher
重點摘要
Back to Articles Job Searcher Team Article Published June 6, 2026 Upvote 2 Emre emrekuruu Follow build-small-hackathon Job hunting as a new grad is a full-time job by itself. You sift through hundreds of postings every week to find a handful worth applying to. You click "Easy Apply" until your eyes hurt. You write the same cover letter forty times. By month two of a search, you're applying to roles you wouldn't take, in industries you don't care about, because at that point the cost of thinking about each listing is higher than the cost of submitting to one. Watch the short tour: drop a resume, watch the queries stream, read the per-job reasoning. How it works A run has three steps. Queries. The student reads the resume and the preferences you set (job type, work modality, location, free-f
Back to Articles Job Searcher Team Article Published June 6, 2026 Upvote 2 Emre emrekuruu Follow build-small-hackathon Job hunting as a new grad is a full-time job by itself. You sift through hundreds of postings every week to find a handful worth applying to. You click "Easy Apply" until your eyes hurt. You write the same cover letter forty times. By month two of a search, you're applying to roles you wouldn't take, in industries you don't care about, because at that point the cost of thinking about each listing is higher than the cost of submitting to one. Watch the short tour: drop a resume, watch the queries stream, read the per-job reasoning. How it works A run has three steps. Queries. The student reads the resume and the preferences you set (job type, work modality, location, free-form notes) and drafts a small set of LinkedIn-shaped search queries, reasoning out loud as it goes. Search. Those queries hit LinkedIn through JobSpy, one at a time. Scoring. For each posting, the model reads the (resume, job) pair and writes a five-dimension fit score: skills match experience relevance education and certifications industry / domain fit seniority alignment Figure 1. End-to-end steps of the framework. What you get back isn't a list of fifty roles. It's a small shortlist with defensible reasoning. You can read why the model thinks the second-ranked job beats the third. Technical Details Dataset Curation - The teacher and the student The teacher is DeepSeek V4 Pro. Strong at structured reasoning, willing to follow a strict output schema, cheap enough to run once over a large corpus offline. It is used as a label generator, not as an inference-time dependency. The student is Qwen3-8B. Small enough to fit on a single ZeroGPU slice once quantized to Q4_K_M, large enough to absorb the teacher's structured judgement. The corpus came from a closed loop, resume-aware end-to-end: Resumes. 2,500, built on Divyaamith/Kaggle-Resume. Queries. The teacher first drafted LinkedIn-shaped search queries from each resume. Jobs. JobSpy then scraped LinkedIn for what those queries actually returned. About 10,000 postings, every one of them surfaced by a query the teacher itself wrote for that specific resume. Labels. The teacher then scored every resulting (resume, job) pair across the same five dimensions used at inference, with one sentence of reasoning per dimension. Everything ships in four foreign-key-clean configs at build-small-hackathon/job-search-distill. Training (Modal) Two LoRA SFT runs on a single A100 via Modal, one per task: Adapter. Rank 16, alpha 16, dropout off, attention plus MLP projections. Schedule. One epoch per task. Mid-epoch checkpoints every 200 steps so a partial run could be sanity-checked before the full one finished. Output. Safetensors at build-small-hackathon/job-searcher-qwen3-8B, and a Q4_K_M base plus LoRA-GGUF sidecars at build-small-hackathon/job-searcher-qwen3-8B-gguf for the llama.cpp serving path. LoraConfig( r=16, lora_alpha=16, task_type="CAUSAL_LM", target_modules=[ "q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", ], ) The Space - Inference (llama.cpp) The Space runs llama-cpp-python with the pre-built CUDA wheel on a HuggingFace ZeroGPU Space. Two design choices that matter: Llama inside @spaces.GPU. ZeroGPU recycles the CUDA context per call, so a module-level instance would hold a dead context on the second use. One GPU call per submission, not per job. All fit evaluations for one submission run inside a single @spaces.GPU call. The model loads once and yields events for every job, instead of paying a fresh cold start and a fresh proxy-token request per posting. Streaming uses the OpenAI-shaped create_chat_completion(stream=True) so the reasoning lands in the UI token by token. The live demo is at build-small-hackathon/job-search-assistant. The traces The entire Claude Code session that built this Space is published as an HuggingFace agent-traces dataset at build-small-hackathon/job-search-assistant-agent-trace. Raw JSONL events, native HuggingFace trace viewer, every dead end and recovery on the record. Useful if you want to see how this thing actually came together rather than read the cleaned-up version of it. Try it Drop your resume at huggingface.co/spaces/build-small-hackathon/job-search-assistant. Stop sifting. What I learned Two adapters beat one. I tried folding query generation and fit evaluation into a single LoRA. The model leaked formatting both ways, JSON on the query task and prose on the eval. Splitting them into two heads on the same base, hot-swapped per call, killed the whole class of bugs. The teacher's prompt mattered more than the student's size. Rewriting the teacher's labelling prompt to score against specific resume details ("four years of Rust; the role asks for five" instead of "strong technical match") propagated through distillation. The student picked up the same habit. Models mentioned in this article 2 Datasets mentioned in this article 3 Spaces mentioned in this article 1 More from this author Thousand Token Wood: shipping a multi-agent economy on a 3B model 1 June 5, 2026 Community EditPreview Upload images, audio, and videos by dragging in the text input, pasting, or clicking here. Tap or paste here to upload images Comment · Sign up or log in to comment Upvote 2 Models mentioned in this article 2 Datasets mentioned in this article 3 Spaces mentioned in this article 1
Related
相關文章

Token成本算盤打響,Seedance開始駛向“五環外”
這篇消息聚焦「Token成本算盤打響,Seedance開始駛向“五環外”」。原始導語提到:視頻AI的決勝場,不在模型本身。 從 AI 情報角度來看,這類內容值得關注其背後的技術進展、產品落地、產業競爭與後續市場影響。

Pixel 10 手機用戶反饋谷歌 AI“搶鏡”問題,Gmail 無法正常回復郵件
科技媒體 Android Authority 昨日(6 月 18 日)發佈博文,報道稱 Pixel 10 系列手機遭遇 AI“搶鏡”問題,用戶在 Gmail 回覆郵件時無法彈出輸入法鍵盤,優先顯示 Help me write 功能。

DeepSeek 識圖模式正式上線 App 和網頁端
DeepSeek 多模態研究員 Xiaokang Chen 今日表示,DeepSeek 的識圖模式已在網頁和 App 端正式上線。IT之家測試,目前 DeepSeek 的 App 端識圖模式依然提示“圖片理解功能內測中”,網頁端沒有這項提示。

微信、豆包之後,消息稱阿里將推“千問輸入法”
千問團隊將推出名為“千問輸入法”的獨立 App,與 PC 端的千問語音輸入法有一定區別,AI 功能、鍵盤會更貼合手機端操作,填補千問在移動端 AI 輸入法賽道的空白,產品已開發完成,擇日上線各大應用商店。
Kimi Work 迎重大升級:推出“目標模式”並打通外部應用插件
月之暗面旗下 Kimi 電腦客戶端近日煥新升級,為 Kimi Work(Beta 版)引入兩項重磅新特性:目標模式實現連續自主工作 24 小時,插件中心正式對接多家主流辦公軟件,提升工作流效率。為加速用戶深度體驗,官方同步推出限時優惠,2026 年 6 月全月,使用 Work 模式的會員額度消耗直接打 5 折,帶來實惠。
網易雲音樂旗下AI情感陪伴App“妙時”宣佈7月14日停運
網易雲音樂旗下“妙時”(含AI奇遇)AI情感陪伴應用發佈停運公告,將於7月14日0時全面停止服務。客服迴應屬正常業務調整,不影響其他產品。目前已停止新用戶註冊和充值,用戶可在8月14日前申請退還剩餘代幣和會員費,並導出AI戀人聊天記錄。