Careers

Help build India’s independent AI analysis layer.

Edxperimental Labs is early. The first roles are for people who can research AI deeply, turn messy signals into structured analysis, and help clients choose systems with evidence.

AI research analyst

Turns model news, benchmark papers, and pricing changes into buyer-readable research with source trails.

Benchmark engineer

Converts a messy workflow into a task packet, scoring rubric, repeatable harness, and inspection page.

Sales engineer

Listens to a client problem, scopes the smallest benchmark sprint, and explains technical tradeoffs clearly.

Role scorecards

What early candidates should be able to ship

Early roles are scoped around visible artifacts, not job-title theatre. The best signal is a piece of work a technical buyer can inspect.

AI research analyst · Open to exceptional candidates

Publish two research notes, one market map, and one benchmark-methodology memo.

Artifacts: Model/provider comparison · Source-backed chart · Benchmark critique

Benchmark engineer · Project-based first

Ship one new benchmark suite with trace fixtures, scoring ledger, and Playwright verification (a minimal verification sketch follows these cards).

Artifacts: Eval harness · Trace schema · Scorecard UI

Sales engineer · Selective outreach

Build three diagnostic briefs, one proposal template, and a handoff workflow for Saujas-led discovery.

Artifacts: Discovery memo · Demo walkthrough · Sprint proposal
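For candidates wondering what "Playwright verification" looks like in practice, here is a minimal sketch of the kind of check we mean. The URL, selectors, and data-testid names are hypothetical placeholders, not an actual test suite.

```python
# Minimal sketch of a Playwright check for a benchmark inspection page.
# The URL and data-testid selectors are hypothetical placeholders.
from playwright.sync_api import sync_playwright

def verify_inspection_page(url: str) -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        # Every scored run should expose a trace link and a rubric total.
        assert page.locator("[data-testid='trace-link']").count() > 0
        total = page.locator("[data-testid='rubric-total']").inner_text()
        assert total.strip(), "scorecard rendered without a rubric total"
        browser.close()

verify_inspection_page("http://localhost:3000/benchmarks/example-suite")
```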

Operating rhythm

Small team, high evidence standard.

The work alternates between public research, benchmark harnesses, Studio product demos, and consulting delivery. People who enjoy ambiguous work but insist on concrete proof tend to fit.

1. Research loop

Read primary sources, extract the buyer implication, publish a chart or table, and link the claim to evidence.

2. Benchmark loop

Write the task packet before running models, capture traces, score with a rubric, and expose failure modes.

3. Client loop

Start with workflow risk, define the first sprint, route ownership, and deliver a decision artifact quickly.

4. Product loop

Turn repeated consulting work into Studio demos, public articles, and reusable benchmark surfaces.

Application packet

Send one strong artifact and a short note.

Benchmark note

A short critique of SWE-bench, LiveBench, BrowserGym, HELM, or another benchmark with one improvement proposal.

Cost model

A small spreadsheet or code snippet estimating cost per accepted output for a real AI workflow (a minimal sketch follows this list).

Workflow task

One realistic task packet with expected output, evidence requirement, scoring rubric, and failure mode (see the packet sketch after this list).

Studio demo

A tiny interface, notebook, or script that makes an AI evaluation easier to inspect.
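As a reference point for the cost model option above, here is a minimal sketch of a cost-per-accepted-output estimate. Every price, token count, and acceptance rate in it is a placeholder assumption, not a measured figure.

```python
# Sketch of a cost-per-accepted-output estimate for an AI workflow.
# All prices, token counts, and acceptance rates are placeholder
# assumptions, not measured figures.

PRICE_PER_1M_INPUT = 3.00    # USD per million input tokens (assumed)
PRICE_PER_1M_OUTPUT = 15.00  # USD per million output tokens (assumed)

def cost_per_accepted_output(
    input_tokens: int,       # avg prompt size per attempt
    output_tokens: int,      # avg completion size per attempt
    acceptance_rate: float,  # share of attempts a reviewer accepts
    review_cost: float,      # human review cost per attempt (USD)
) -> float:
    model_cost = (input_tokens * PRICE_PER_1M_INPUT
                  + output_tokens * PRICE_PER_1M_OUTPUT) / 1_000_000
    cost_per_attempt = model_cost + review_cost
    # Rejected attempts still cost money, so divide by the acceptance rate.
    return cost_per_attempt / acceptance_rate

# Example: 6k-token prompts, 1k-token outputs, 70% acceptance, $0.40 review.
print(f"${cost_per_accepted_output(6_000, 1_000, 0.70, 0.40):.3f} per accepted output")
```

The denominator is the point of the exercise: rejected attempts still burn model and review spend, so halving the acceptance rate roughly doubles the true unit cost.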
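For the workflow task option, here is a minimal packet sketch with the four named fields. The contract-review task, rubric weights, and failure mode are invented examples, not a production schema.

```python
# Sketch of a minimal task packet with the four fields the application
# packet names. The task, rubric weights, and failure mode are invented
# examples for illustration only.
task_packet = {
    "task": "Extract the payment terms from this vendor contract "
            "and flag any clause that conflicts with a net-30 policy.",
    "expected_output": "A list of payment clauses with page references "
                       "and a conflict verdict for each.",
    "evidence_requirement": "Every verdict must quote the exact clause text.",
    "scoring_rubric": [
        {"criterion": "All payment clauses found", "weight": 0.4},
        {"criterion": "Verdicts match the net-30 policy", "weight": 0.4},
        {"criterion": "Quotes are verbatim and page-referenced", "weight": 0.2},
    ],
    "failure_mode": "Model paraphrases a clause instead of quoting it, "
                    "so the verdict cannot be checked against the source.",
}
```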

Candidate packet

Download the work-sample packet.

The generated packet gives candidates and reviewers the same operating surface: role prompts, artifact options, a rubric, and a structured schema for future hiring automation.

Careers Operating Plan

A small-team hiring loop built around visible work.

The generated operating plan keeps hiring practical: artifact screen, technical conversation, client-simulation screen, and a trial artifact before any broad role commitment.

4 stages · 4 90-day windows · 5 reviewer questions

1. Artifact screen (Sanjay Prasad)

Does the artifact show evidence taste, builder speed, and buyer clarity?

2. Technical conversation (Sanjay Prasad)

Can the candidate reason from evidence instead of reciting AI discourse?

3. Client-simulation screen (Saujas)

Can the candidate translate technical details into a scoped business decision?

4. Trial artifact (Sanjay Prasad and Saujas)

Does the work fit the publishing, Studio, benchmark, or consulting operating loop?

Days 1-15: Orientation through shipped artifacts
- One source-backed research note
- One benchmark critique
- One internal operating note

Days 16-45: Own one public surface
- One article or Studio packet
- One chart-ready source table
- One review-ready benchmark/task packet

Days 46-75: Turn a client question into a reusable asset
- Discovery memo
- Sprint scope
- Reusable template, report, or demo surface

Days 76-90: Publish and hand off
- Public artifact
- Maintenance checklist
- Next-measurement backlog

Reviewer questions

What primary source would change your conclusion?

Which claim in your artifact is weakest, and what proof would fix it?

How would this become a Studio product, benchmark task, article, or consulting deliverable?

What would you remove if a buyer had only five minutes?

What should Sanjay review and what should Saujas review?

Evaluation rubric

How we review early work.

The review process favors artifacts that a technical buyer can inspect in minutes. We care about source quality, decision clarity, and whether the work could become a product, article, benchmark, or consulting deliverable.

Evidence taste

Does the work cite primary sources, inspect failure modes, and avoid vague AI claims?

Builder speed

Can the candidate turn an idea into a page, harness, model, or memo quickly?

Buyer clarity

Would a founder, engineering lead, or operator know what decision to make after reading it?

Scope discipline

Does the work stay focused without pretending to solve the entire field?
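As an illustration only, the four dimensions above could be recorded as a simple scorecard. The 1-5 scale and equal weighting here are assumptions for the sketch, not the actual review math.

```python
# Sketch of the four review dimensions as a scorecard. The 1-5 scale and
# equal weighting are assumptions for illustration, not the real rubric.
DIMENSIONS = ["evidence_taste", "builder_speed", "buyer_clarity", "scope_discipline"]

def review_score(scores: dict[str, int]) -> float:
    """Average a 1-5 score across the four dimensions."""
    assert set(scores) == set(DIMENSIONS), "score every dimension exactly once"
    assert all(1 <= s <= 5 for s in scores.values())
    return sum(scores.values()) / len(DIMENSIONS)

# Example review of one artifact.
print(review_score({
    "evidence_taste": 4,
    "builder_speed": 3,
    "buyer_clarity": 5,
    "scope_discipline": 4,
}))  # -> 4.0
```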

Careers Role Catalog

Artifact-led roles for a tiny research and product team.

The generated catalog explains what each early role owns, what proof signals matter, what first artifacts should look like, and which trial project would prove fit.

4 roles · 4 pipeline stages · 5 rubric dimensions

AI research analyst · Open to exceptional candidates (owner: Sanjay Prasad)


Turn primary sources, model releases, benchmark papers, and pricing changes into buyer-readable research.

First artifacts: Source-backed research note · Chart-ready source table · Model/provider decision memo

Proof signals: Primary-source discipline · Clear uncertainty labels · One useful visual · Buyer decision

Trial project: Publish a short research note on an AI benchmark, model release, or cost curve with one original chart and a weakest-claim note.

Benchmark engineer · Project-based first (owner: Sanjay Prasad)


Convert messy workflows into task packets, trace schemas, scoring rubrics, replay artifacts, and inspection pages.

First artifacts: Task packet · Scoring rubric · Trace schema · Replay checklist

Proof signals: Task realism · Leakage control · Reviewer-friendly scoring · Runnable verification

Trial project: Design one benchmark task and artifact bundle for an agent, browser, support, security, or Indian-workflow suite.

Sales engineer · Selective outreach (owner: Saujas)


Translate buyer problems into the smallest benchmark sprint, demo path, and evidence request that can answer the decision.

First artifacts: Discovery memo · Buyer question map · Sprint scope · Demo handoff

Proof signals: Client diagnosis · Scope discipline · Technical translation · Next-action quality

Trial project: Write a discovery memo for a hypothetical buyer and convert it into a one-week benchmark sprint with owner routing.

Studio product builder · Project-based first (owner: Sanjay Prasad)


Turn repeated research and consulting workflows into usable Studio demos, calculators, dashboards, and product packets.

First artifacts: Interactive demo · Product packet · Verification screenshot · Demo readiness notes

Proof signals: Useful interface · Responsive polish · Evidence-linked controls · Clear handoff

Trial project: Build a small Studio interface that makes one benchmark, cost curve, or service decision easier to inspect.

Application form

Submit one artifact for review.

This form turns the candidate packet into a structured application record: role track, work sample, artifact link, source trail, availability, and routing owner (a minimal record sketch follows the routing notes below).

Research and benchmark roles

Routed to Sanjay for evidence, benchmark, and systems review.

Sales engineering

Routed to Saujas for discovery, solution framing, and client-facing judgment.
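A minimal sketch of what that record and routing rule could look like, assuming a simple dataclass shape. Field names and example values are placeholders, not the live schema.

```python
# Sketch of the structured application record the form produces, with the
# routing rule described above. Field names and the dataclass shape are
# assumptions, not the live schema.
from dataclasses import dataclass

@dataclass
class ApplicationRecord:
    role_track: str        # "research", "benchmark", or "sales"
    work_sample: str       # which packet option the candidate chose
    artifact_link: str
    source_trail: list[str]
    availability: str

    @property
    def routing_owner(self) -> str:
        # Research and benchmark artifacts go to Sanjay; sales goes to Saujas.
        return "Saujas" if self.role_track == "sales" else "Sanjay Prasad"

record = ApplicationRecord(
    role_track="benchmark",
    work_sample="Workflow task",
    artifact_link="https://example.com/task-packet",  # placeholder link
    source_trail=["https://example.com/primary-source"],
    availability="4 weeks, project-based",
)
print(record.routing_owner)  # -> Sanjay Prasad
```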

Submit one inspectable artifact and enough context for a focused review.

How to stand out

Send one artifact: a benchmark note, market map, eval design, cost model, or working demo.

We care more about analytical taste and execution speed than a formal application. Good work should make a technical buyer smarter within five minutes.