Careers
Edxperimental Labs is early. The first roles are for people who can research AI deeply, turn messy signals into structured analysis, and help clients choose systems with evidence.
Turns model news, benchmark papers, and pricing changes into buyer-readable research with source trails.
Can convert a messy workflow into a task packet, scoring rubric, repeatable harness, and inspection page.
Can listen to a client problem, scope the smallest benchmark sprint, and explain technical tradeoffs clearly.
Role scorecards
Early roles are scoped around visible artifacts, not job-title theatre. The best signal is a piece of work a technical buyer can inspect.
Open to exceptional candidates
Publish two research notes, one market map, and one benchmark-methodology memo.
Project-based first
Ship one new benchmark suite with trace fixtures, scoring ledger, and Playwright verification.
Selective outreach
Build three diagnostic briefs, one proposal template, and a handoff workflow for Saujas-led discovery.
Operating rhythm
The work alternates between public research, benchmark harnesses, Studio product demos, and consulting delivery. People who enjoy ambiguous work but insist on producing concrete proof tend to fit.
Read primary sources, extract the buyer implication, publish a chart or table, and link the claim to evidence.
Write the task packet before running models, capture traces, score with a rubric, and expose failure modes (a minimal sketch of this loop follows the list).
Start with workflow risk, define the first sprint, route ownership, and deliver a decision artifact quickly.
Turn repeated consulting work into Studio demos, public articles, and reusable benchmark surfaces.
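For the benchmark loop, here is a minimal sketch of what "task packet first" can look like in practice. The packet fields, the run_model stand-in, and the toy scoring below are illustrative assumptions, not our production harness:

    # Minimal sketch of the "task packet first" loop: packet, run, trace, score.
    def run_model(prompt: str) -> str:
        """Stand-in for a real model API call."""
        return "Refunds close after 14 days, per policy section 4.2."

    task_packet = {
        "task": "Answer a refund-policy question from the linked policy document",
        "expected_output": "the refund window in days, with a policy citation",
        "evidence_requirement": "must quote the policy section it relied on",
        "failure_modes": ["invented policy section", "missing citation"],
    }

    def score(output: str) -> dict:
        """Toy rubric: did the output cite a section and name a window in days?"""
        return {
            "cites_section": "section" in output.lower(),
            "names_window": any(ch.isdigit() for ch in output),
        }

    trace = {"packet": task_packet, "output": run_model(task_packet["task"])}
    trace["scores"] = score(trace["output"])
    trace["failed"] = not all(trace["scores"].values())
    print(trace["scores"], "| failed:", trace["failed"])

Writing the packet before any model runs is the point: the expected output, evidence requirement, and failure modes exist independently of whatever the model happens to produce.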
Application packet
Benchmark note
A short critique of SWE-bench, LiveBench, BrowserGym, HELM, or another benchmark with one improvement proposal.
Cost model
A small spreadsheet or code snippet estimating cost per accepted output for a real AI workflow; a toy calculation is sketched after this list.
Workflow task
One realistic task packet with expected output, evidence requirement, scoring rubric, and failure mode.
Studio demo
A tiny interface, notebook, or script that makes an AI evaluation easier to inspect.
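For the cost-model option, here is a toy calculation of cost per accepted output. Every price, token count, and rate below is made up for illustration:

    # Toy cost-per-accepted-output model; every number below is hypothetical.
    price_per_1k_input_tokens = 0.003   # USD, illustrative
    price_per_1k_output_tokens = 0.015  # USD, illustrative

    runs = 200                 # attempts over a week of a drafting workflow
    avg_input_tokens = 1_800
    avg_output_tokens = 600
    acceptance_rate = 0.55     # share of outputs a reviewer actually accepted
    review_minutes_per_run = 2
    reviewer_hourly_cost = 40  # USD, illustrative

    model_cost = runs * (
        avg_input_tokens / 1000 * price_per_1k_input_tokens
        + avg_output_tokens / 1000 * price_per_1k_output_tokens
    )
    review_cost = runs * review_minutes_per_run / 60 * reviewer_hourly_cost
    accepted = runs * acceptance_rate

    print(f"cost per accepted output: ${(model_cost + review_cost) / accepted:.2f}")

The useful part of the exercise is usually the denominator: dividing by accepted outputs rather than raw runs surfaces how much review labor and rejection waste dominate the headline token price.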
Candidate packet
The generated packet gives candidates and reviewers the same operating surface: role prompts, artifact options, a rubric, and a structured schema for future hiring automation.
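A minimal sketch of what that schema could look like; the field names below are assumptions, not the actual packet format:

    # Illustrative shape for the candidate packet; names are assumptions.
    from dataclasses import dataclass

    @dataclass
    class CandidatePacket:
        role_prompt: str             # which role the candidate is answering
        artifact_options: list[str]  # e.g. benchmark note, cost model
        rubric: dict[str, str]       # dimension -> guiding question
        submission_url: str = ""     # filled in by the candidate

    packet = CandidatePacket(
        role_prompt="AI research analyst",
        artifact_options=["benchmark note", "cost model", "workflow task", "Studio demo"],
        rubric={"evidence taste": "primary sources cited?"},
    )
    print(packet)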
Careers Operating Plan
The generated operating plan keeps hiring practical: artifact screen, technical conversation, client-simulation screen, and a trial artifact before any broad role commitment.
At a glance: 4 stages, 4 90-day windows, 5 reviewer questions.
Artifact screen (Sanjay Prasad): Does the artifact show evidence taste, builder speed, and buyer clarity?
Technical conversation (Sanjay Prasad): Can the candidate reason from evidence instead of reciting AI discourse?
Client-simulation screen (Saujas): Can the candidate translate technical details into a scoped business decision?
Trial artifact (Sanjay Prasad and Saujas): Does the work fit the publishing, Studio, benchmark, or consulting operating loop?
90-day windows: days 1-15, days 16-45, days 46-75, days 76-90.
Reviewer questions
What primary source would change your conclusion?
Which claim in your artifact is weakest, and what proof would fix it?
How would this become a Studio product, benchmark task, article, or consulting deliverable?
What would you remove if a buyer had only five minutes?
What should Sanjay review and what should Saujas review?
Evaluation rubric
The review process favors artifacts that a technical buyer can inspect in minutes. We care about source quality, decision clarity, and whether the work could become a product, article, benchmark, or consulting deliverable.
Evidence taste
Does the work cite primary sources, inspect failure modes, and avoid vague AI claims?
Builder speed
Can the candidate turn an idea into a page, harness, model, or memo quickly?
Buyer clarity
Would a founder, engineering lead, or operator know what decision to make after reading it?
Scope discipline
Does the work stay focused without pretending to solve the entire field?
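Because this rubric should eventually live in the same kind of scoring ledger we ask candidates to build, here is one way it could be encoded as data; the weights are hypothetical:

    # The four dimensions above, encoded as data; weights are hypothetical.
    RUBRIC = {
        "evidence taste":   {"question": "Primary sources, failure modes, no vague claims?", "weight": 0.3},
        "builder speed":    {"question": "Idea to page, harness, model, or memo quickly?", "weight": 0.2},
        "buyer clarity":    {"question": "Does a reader know what decision to make?", "weight": 0.3},
        "scope discipline": {"question": "Focused, without solving the entire field?", "weight": 0.2},
    }

    def overall(scores: dict[str, float]) -> float:
        """Weighted average of 0-1 scores per dimension."""
        return sum(RUBRIC[d]["weight"] * scores[d] for d in RUBRIC)

    print(overall({"evidence taste": 1.0, "builder speed": 0.5,
                   "buyer clarity": 0.8, "scope discipline": 1.0}))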
Careers Role Catalog
The generated catalog explains what each early role owns, what proof signals matter, what first artifacts should look like, and which trial project would prove fit.
At a glance: 4 roles, 4 pipeline stages, 5 rubric dimensions.
Sanjay Prasad
Open to exceptional candidates
Turn primary sources, model releases, benchmark papers, and pricing changes into buyer-readable research.
Trial project: Publish a short research note on an AI benchmark, model release, or cost curve with one original chart and a weakest-claim note.
Sanjay Prasad
Project-based first
Convert messy workflows into task packets, trace schemas, scoring rubrics, replay artifacts, and inspection pages.
Trial project: Design one benchmark task and artifact bundle for an agent, browser, support, security, or Indian-workflow suite.
Saujas
Selective outreach
Translate buyer problems into the smallest benchmark sprint, demo path, and evidence request that can answer the decision.
Trial project: Write a discovery memo for a hypothetical buyer and convert it into a one-week benchmark sprint with owner routing.
Sanjay Prasad
Project-based first
Turn repeated research and consulting workflows into usable Studio demos, calculators, dashboards, and product packets.
Trial project: Build a small Studio interface that makes one benchmark, cost curve, or service decision easier to inspect.
Application form
This form turns the candidate packet into a structured application record: role track, work sample, artifact link, source trail, availability, and routing owner. A toy version of the record and its routing rule is sketched after the routing notes below.
Research and benchmark roles
Routed to Sanjay for evidence, benchmark, and systems review.
Sales engineering
Routed to Saujas for discovery, solution framing, and client-facing judgment.
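A minimal sketch of the application record and routing rule described above; the field names are assumptions, not the live form schema:

    # Illustrative application record and routing rule; names are assumptions.
    from dataclasses import dataclass

    @dataclass
    class Application:
        role_track: str          # "research", "benchmark", or "sales engineering"
        work_sample: str         # short description of the submitted artifact
        artifact_link: str
        source_trail: list[str]  # primary sources behind the artifact
        availability: str

    def route(app: Application) -> str:
        """Research and benchmark tracks go to Sanjay; sales engineering to Saujas."""
        return "Saujas" if app.role_track == "sales engineering" else "Sanjay Prasad"

    app = Application(
        role_track="benchmark",
        work_sample="SWE-bench critique with one improvement proposal",
        artifact_link="https://example.com/benchmark-note",
        source_trail=["SWE-bench paper"],
        availability="4 weeks",
    )
    print(route(app))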
How to stand out
We care more about analytical taste and execution speed than a formal application. Good work should make a technical buyer smarter within five minutes.