Careers

Help build India’s independent AI analysis layer.

Edxperimental Labs is early. The first roles are for people who can research AI deeply, turn messy signals into structured analysis, and help clients choose systems with evidence.

AI research analyst

Turns model news, benchmark papers, and pricing changes into buyer-readable research with source trails.

Benchmark engineer

Converts a messy workflow into a task packet, scoring rubric, repeatable harness, and inspection page.

Sales engineer

Listens to a client problem, scopes the smallest benchmark sprint, and explains technical tradeoffs clearly.

Role scorecards

What early candidates should be able to ship

Early roles are scoped around visible artifacts, not job-title theatre. The best signal is a piece of work a technical buyer can inspect.

AI research analyst · Open to exceptional candidates

Publish two research notes, one market map, and one benchmark-methodology memo.

Artifacts: Model/provider comparison · Source-backed chart · Benchmark critique

Benchmark engineer · Project-based first

Ship one new benchmark suite with trace fixtures, scoring ledger, and Playwright verification (a minimal verification sketch follows these cards).

Artifacts: Eval harness · Trace schema · Scorecard UI

Sales engineer · Selective outreach

Build three diagnostic briefs, one proposal template, and a handoff workflow for Saujas-led discovery.

Artifacts: Discovery memo · Demo walkthrough · Sprint proposal
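For candidates wondering what "Playwright verification" looks like in practice, here is a minimal sketch of the kind of check we mean. The URL, selectors, and data-testid names are hypothetical placeholders, not an actual test suite.

```python
# Minimal sketch of a Playwright check for a benchmark inspection page.
# The URL and data-testid selectors are hypothetical placeholders.
from playwright.sync_api import sync_playwright

def verify_inspection_page(url: str) -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        # Every scored run should expose a trace link and a rubric total.
        assert page.locator("[data-testid='trace-link']").count() > 0
        total = page.locator("[data-testid='rubric-total']").inner_text()
        assert total.strip(), "scorecard rendered without a rubric total"
        browser.close()

verify_inspection_page("http://localhost:3000/benchmarks/example-suite")
```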

Operating rhythm

Small team, high evidence standard.

The work alternates between public research, benchmark harnesses, Studio product demos, and consulting delivery. People who enjoy ambiguous work but insist on concrete proof tend to fit.

1. Research loop

Read primary sources, extract the buyer implication, publish a chart or table, and link the claim to evidence.

2. Benchmark loop

Write the task packet before running models, capture traces, score with a rubric, and expose failure modes.

3. Client loop

Start with workflow risk, define the first sprint, route ownership, and deliver a decision artifact quickly.

4. Product loop

Turn repeated consulting work into Studio demos, public articles, and reusable benchmark surfaces.

Application packet

Send one strong artifact and a short note.

Benchmark note

A short critique of SWE-bench, LiveBench, BrowserGym, HELM, or another benchmark with one improvement proposal.

Cost model

A small spreadsheet or code snippet estimating cost per accepted output for a real AI workflow (a minimal sketch follows this list).

Workflow task

One realistic task packet with expected output, evidence requirement, scoring rubric, and failure mode (see the packet sketch after this list).

Studio demo

A tiny interface, notebook, or script that makes an AI evaluation easier to inspect.
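As a reference point for the cost model option above, here is a minimal sketch of a cost-per-accepted-output estimate. Every price, token count, and acceptance rate in it is a placeholder assumption, not a measured figure.

```python
# Sketch of a cost-per-accepted-output estimate for an AI workflow.
# All prices, token counts, and acceptance rates are placeholder
# assumptions, not measured figures.

PRICE_PER_1M_INPUT = 3.00    # USD per million input tokens (assumed)
PRICE_PER_1M_OUTPUT = 15.00  # USD per million output tokens (assumed)

def cost_per_accepted_output(
    input_tokens: int,       # avg prompt size per attempt
    output_tokens: int,      # avg completion size per attempt
    acceptance_rate: float,  # share of attempts a reviewer accepts
    review_cost: float,      # human review cost per attempt (USD)
) -> float:
    model_cost = (input_tokens * PRICE_PER_1M_INPUT
                  + output_tokens * PRICE_PER_1M_OUTPUT) / 1_000_000
    cost_per_attempt = model_cost + review_cost
    # Rejected attempts still cost money, so divide by the acceptance rate.
    return cost_per_attempt / acceptance_rate

# Example: 6k-token prompts, 1k-token outputs, 70% acceptance, $0.40 review.
print(f"${cost_per_accepted_output(6_000, 1_000, 0.70, 0.40):.3f} per accepted output")
```

The denominator is the point of the exercise: rejected attempts still burn model and review spend, so halving the acceptance rate roughly doubles the true unit cost.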
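For the workflow task option, here is a minimal packet sketch with the four named fields. The contract-review task, rubric weights, and failure mode are invented examples, not a production schema.

```python
# Sketch of a minimal task packet with the four fields the application
# packet names. The task, rubric weights, and failure mode are invented
# examples for illustration only.
task_packet = {
    "task": "Extract the payment terms from this vendor contract "
            "and flag any clause that conflicts with a net-30 policy.",
    "expected_output": "A list of payment clauses with page references "
                       "and a conflict verdict for each.",
    "evidence_requirement": "Every verdict must quote the exact clause text.",
    "scoring_rubric": [
        {"criterion": "All payment clauses found", "weight": 0.4},
        {"criterion": "Verdicts match the net-30 policy", "weight": 0.4},
        {"criterion": "Quotes are verbatim and page-referenced", "weight": 0.2},
    ],
    "failure_mode": "Model paraphrases a clause instead of quoting it, "
                    "so the verdict cannot be checked against the source.",
}
```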

Candidate packet

Download the work-sample packet.

The generated packet gives candidates and reviewers the same operating surface: role prompts, artifact options, a rubric, and a structured schema for future hiring automation.

Careers Operating Plan

A small-team hiring loop built around visible work.

The generated operating plan keeps hiring practical: artifact screen, technical conversation, client-simulation screen, and a trial artifact before any broad role commitment.

4 stages · 4 90-day windows · 5 reviewer questions

1. Artifact screen (Sanjay Prasad)

Does the artifact show evidence taste, builder speed, and buyer clarity?

2. Technical conversation (Sanjay Prasad)

Can the candidate reason from evidence instead of reciting AI discourse?

3. Client-simulation screen (Saujas)

Can the candidate translate technical details into a scoped business decision?

4. Trial artifact (Sanjay Prasad and Saujas)

Does the work fit the publishing, Studio, benchmark, or consulting operating loop?

Days 1-15: Orientation through shipped artifacts
- One source-backed research note
- One benchmark critique
- One internal operating note

Days 16-45: Own one public surface
- One article or Studio packet
- One chart-ready source table
- One review-ready benchmark/task packet

Days 46-75: Turn a client question into a reusable asset
- Discovery memo
- Sprint scope
- Reusable template, report, or demo surface

Days 76-90: Publish and hand off
- Public artifact
- Maintenance checklist
- Next-measurement backlog

Reviewer questions

What primary source would change your conclusion?

Which claim in your artifact is weakest, and what proof would fix it?

How would this become a Studio product, benchmark task, article, or consulting deliverable?

What would you remove if a buyer had only five minutes?

What should Sanjay review and what should Saujas review?

Evaluation rubric

How we review early work.

The review process favors artifacts that a technical buyer can inspect in minutes. We care about source quality, decision clarity, and whether the work could become a product, article, benchmark, or consulting deliverable.

Evidence taste

Does the work cite primary sources, inspect failure modes, and avoid vague AI claims?

Builder speed

Can the candidate turn an idea into a page, harness, model, or memo quickly?

Buyer clarity

Would a founder, engineering lead, or operator know what decision to make after reading it?

Scope discipline

Does the work stay focused without pretending to solve the entire field?
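As an illustration only, the four dimensions above could be recorded as a simple scorecard. The 1-5 scale and equal weighting here are assumptions for the sketch, not the actual review math.

```python
# Sketch of the four review dimensions as a scorecard. The 1-5 scale and
# equal weighting are assumptions for illustration, not the real rubric.
DIMENSIONS = ["evidence_taste", "builder_speed", "buyer_clarity", "scope_discipline"]

def review_score(scores: dict[str, int]) -> float:
    """Average a 1-5 score across the four dimensions."""
    assert set(scores) == set(DIMENSIONS), "score every dimension exactly once"
    assert all(1 <= s <= 5 for s in scores.values())
    return sum(scores.values()) / len(DIMENSIONS)

# Example review of one artifact.
print(review_score({
    "evidence_taste": 4,
    "builder_speed": 3,
    "buyer_clarity": 5,
    "scope_discipline": 4,
}))  # -> 4.0
```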

Careers Role Catalog

Artifact-led roles for a tiny research and product team.

The generated catalog explains what each early role owns, what proof signals matter, what first artifacts should look like, and which trial project would prove fit.

4 roles · 4 pipeline stages · 5 rubric dimensions

AI research analyst · Open to exceptional candidates (owner: Sanjay Prasad)


Turn primary sources, model releases, benchmark papers, and pricing changes into buyer-readable research.

First artifacts: Source-backed research note · Chart-ready source table · Model/provider decision memo

Proof signals: Primary-source discipline · Clear uncertainty labels · One useful visual · Buyer decision

Trial project: Publish a short research note on an AI benchmark, model release, or cost curve with one original chart and a weakest-claim note.

Benchmark engineer · Project-based first (owner: Sanjay Prasad)


Convert messy workflows into task packets, trace schemas, scoring rubrics, replay artifacts, and inspection pages.

First artifacts: Task packet · Scoring rubric · Trace schema · Replay checklist

Proof signals: Task realism · Leakage control · Reviewer-friendly scoring · Runnable verification

Trial project: Design one benchmark task and artifact bundle for an agent, browser, support, security, or Indian-workflow suite.

Sales engineer · Selective outreach (owner: Saujas)


Translate buyer problems into the smallest benchmark sprint, demo path, and evidence request that can answer the decision.

First artifacts: Discovery memo · Buyer question map · Sprint scope · Demo handoff

Proof signals: Client diagnosis · Scope discipline · Technical translation · Next-action quality

Trial project: Write a discovery memo for a hypothetical buyer and convert it into a one-week benchmark sprint with owner routing.

Studio product builder · Project-based first (owner: Sanjay Prasad)


Turn repeated research and consulting workflows into usable Studio demos, calculators, dashboards, and product packets.

First artifacts: Interactive demo · Product packet · Verification screenshot · Demo readiness notes

Proof signals: Useful interface · Responsive polish · Evidence-linked controls · Clear handoff

Trial project: Build a small Studio interface that makes one benchmark, cost curve, or service decision easier to inspect.

Application form

Submit one artifact for review.

This form turns the candidate packet into a structured application record: role track, work sample, artifact link, source trail, availability, and routing owner (a minimal record sketch follows the routing notes below).

Research and benchmark roles

Routed to Sanjay for evidence, benchmark, and systems review.

Sales engineering

Routed to Saujas for discovery, solution framing, and client-facing judgment.
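A minimal sketch of what that record and routing rule could look like, assuming a simple dataclass shape. Field names and example values are placeholders, not the live schema.

```python
# Sketch of the structured application record the form produces, with the
# routing rule described above. Field names and the dataclass shape are
# assumptions, not the live schema.
from dataclasses import dataclass

@dataclass
class ApplicationRecord:
    role_track: str        # "research", "benchmark", or "sales"
    work_sample: str       # which packet option the candidate chose
    artifact_link: str
    source_trail: list[str]
    availability: str

    @property
    def routing_owner(self) -> str:
        # Research and benchmark artifacts go to Sanjay; sales goes to Saujas.
        return "Saujas" if self.role_track == "sales" else "Sanjay Prasad"

record = ApplicationRecord(
    role_track="benchmark",
    work_sample="Workflow task",
    artifact_link="https://example.com/task-packet",  # placeholder link
    source_trail=["https://example.com/primary-source"],
    availability="4 weeks, project-based",
)
print(record.routing_owner)  # -> Sanjay Prasad
```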

Submit one inspectable artifact and enough context for a focused review.

How to stand out

Send one artifact: a benchmark note, market map, eval design, cost model, or working demo.

We care more about analytical taste and execution speed than a formal application. Good work should make a technical buyer smarter within five minutes.