Spoke
hris-mapper
Thousands of noisy client job records → a governed canonical Family × Focus × Universal-Level architecture, with confidence and evidence on every mapping.
Character
Problem
External. The data is too messy to price, govern, or analyze: titles carry level / BU / location noise, codes and grades disagree, and there's no shared taxonomy.
Internal. Doing it by hand is soul-crushing and risky — one wrong guess silently sets the wrong pay band, and nobody can reconstruct why.
Philosophical. The source of truth should never be overwritten, and a match engine that isn't sure should abstain, not fabricate.
Guide
Abstract
Background. Onboarding a tenant's workforce into a canonical job architecture is the gating step for pricing, governance, and analytics — yet it's usually done by hand with no audit trail.
Methodology. Staged uploads keep raw rows verbatim and separate; titles are normalized with every removal shown; rows cluster by job code / title+grade / title; each cluster is predicted via the canon matcher with field- and record-level confidence and abstains (no_match) below the floor; review queues triage by headcount × uncertainty (boosted by comp-risk).
Scope. Mapping + triage + export — not an HRIS connector (you bring the export), not the canon (it composes job-family-agent over HTTP), and not pricing.
Contribution. Bulk correction once per cluster, explainable abstaining prediction, and a raw-vs-mapped separation that makes every strip and decision auditable.
Evidence / Provenance. PAT-JFE-ENT-3; pure-function tests cover parse, detection, normalization, clustering, prediction + abstain, and export.
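To make the normalization and clustering steps in the Methodology concrete, here is a minimal sketch, not the spoke's actual implementation. The noise rules, the type names, and the reading of "job code / title+grade / title" as a fallback order are all assumptions made for illustration.

```typescript
// Sketch only: the rules and names below are hypothetical, not hris-mapper's real ones.
interface Removal {
  token: string;  // exact text stripped from the raw title
  reason: string; // which rule removed it
}

interface NormalizedTitle {
  raw: string;        // original title, kept verbatim
  normalized: string; // title after noise removal
  removals: Removal[]; // every strip, so the result is auditable
}

// Assumed noise rules: level markers and parenthesized locations.
const NOISE_RULES: { reason: string; pattern: RegExp }[] = [
  { reason: "level", pattern: /\b(?:senior|sr|junior|jr|i{1,3}|iv)\b\.?/gi },
  { reason: "location", pattern: /\([^)]*\)/g },
];

function normalizeTitle(raw: string): NormalizedTitle {
  const removals: Removal[] = [];
  let working = raw;
  for (const { reason, pattern } of NOISE_RULES) {
    // Record each removed token instead of silently dropping it.
    working = working.replace(pattern, (token) => {
      removals.push({ token, reason });
      return " ";
    });
  }
  return { raw, normalized: working.replace(/\s+/g, " ").trim(), removals };
}

interface JobRow {
  jobCode?: string;
  title: string;
  grade?: string;
}

// Assumed fallback order: job code, else title + grade, else title alone.
function clusterKey(row: JobRow): string {
  const code = row.jobCode?.trim();
  if (code) return `code:${code}`;
  const title = normalizeTitle(row.title).normalized.toLowerCase();
  const grade = row.grade?.trim();
  return grade ? `title+grade:${title}|${grade}` : `title:${title}`;
}

function clusterRows(rows: JobRow[]): Map<string, JobRow[]> {
  const clusters = new Map<string, JobRow[]>();
  for (const row of rows) {
    const key = clusterKey(row);
    const bucket = clusters.get(key) ?? [];
    bucket.push(row); // rows are referenced, never rewritten
    clusters.set(key, bucket);
  }
  return clusters;
}
```

Returning the removals alongside the normalized title keeps the raw value and every strip visible, which is the property the raw-vs-mapped separation depends on.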
Plan
- 01
Stage an upload
POST a CSV/XLSX to /api/spokes/hris-mapper/datasets; fields are detected and raw rows stored verbatim.
- 02
Confirm field mapping
Accept or override the source-column → canonical-field mapping.
- 03
Cluster + predict
Run a mapping run; review the predicted clusters with confidence + evidence + alternatives.
- 04
Decide + export
Accept / correct / reject / defer once per cluster, then GET /export (JSON or CSV): mapped + decisions + exceptions + confidence.
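The review in step 03 rests on two ideas: abstaining below a confidence floor, and ordering the queue by headcount × uncertainty with a comp-risk boost. The sketch below shows one way these could fit together. The floor value, the boost factor, the `mapped` status label, and every type name are hypothetical; only the `no_match` status comes from this page.

```typescript
// Sketch only: thresholds and names are assumptions, not the spoke's contract.
interface Candidate {
  canonicalId: string;
  confidence: number; // assumed to lie in [0, 1]
}

type Prediction =
  | { status: "mapped"; canonicalId: string; confidence: number; alternatives: Candidate[] }
  | { status: "no_match"; confidence: number; alternatives: Candidate[] };

const CONFIDENCE_FLOOR = 0.6; // hypothetical floor
const COMP_RISK_BOOST = 1.5;  // hypothetical multiplier for comp-risk clusters

// Pick the best candidate, or abstain when nothing clears the floor.
function predict(candidates: Candidate[], floor = CONFIDENCE_FLOOR): Prediction {
  const ranked = [...candidates].sort((a, b) => b.confidence - a.confidence);
  const best = ranked[0];
  if (!best || best.confidence < floor) {
    // Abstain: keep the candidates as evidence, but assert no mapping.
    return { status: "no_match", confidence: best?.confidence ?? 0, alternatives: ranked };
  }
  return {
    status: "mapped",
    canonicalId: best.canonicalId,
    confidence: best.confidence,
    alternatives: ranked.slice(1),
  };
}

interface ClusterForReview {
  key: string;
  headcount: number;
  compRisk: boolean;
  prediction: Prediction;
}

// Priority = headcount × uncertainty, boosted when comp risk is flagged.
function triagePriority(c: ClusterForReview): number {
  const uncertainty = 1 - c.prediction.confidence;
  return c.headcount * uncertainty * (c.compRisk ? COMP_RISK_BOOST : 1);
}

// Highest priority first; the input array is left untouched.
function reviewQueue(clusters: ClusterForReview[]): ClusterForReview[] {
  return [...clusters].sort((a, b) => triagePriority(b) - triagePriority(a));
}
```

Under these assumptions, a large cluster with a shaky prediction outranks a small one with the same confidence, so a reviewer's first decisions cover the most people.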
Call to Action
Direct. Upload a job export and watch the confident clusters auto-map.
Transitional. Read the Enterprise JobFrame build-out for the pipeline + abstain model.
Spoke I/O (visual language v1)
Every toolbox spoke shares the same abstract choreography: typed inputs on the left, distilled verbs in the center, typed outputs on the right, and (when relevant) cross-spoke HTTP composition along the bottom rail. Source package: @people-analytics-toolbox/spoke-illustrations.
Try it now
Copy this curl. Paste in any terminal. Public read — no auth needed.
hris-mapper.health
GET · Schema reachability for the HRIS Dataset Mapper.
curl -sS "https://people-analytics-toolbox.vercel.app/api/spokes/hris-mapper/health"
Vendor the contract
The Zod contract is the source of truth. Vendor a copy into your consumer app — you keep it; we don't break it underneath you. Re-vendor when the version bumps.
// Vendor canonical types:
// src/spokes/hris-mapper/contracts/types.ts
Source path: src/spokes/hris-mapper/contracts/types.ts · GitHub
Failure
The messy export gets mapped by hand or by a black box — wrong levels move pay silently, and nobody can defend a single line.
Success
A 4,000-row export becomes a clean canonical architecture in an afternoon: confident clusters auto-map, the queue surfaces only what needs a human, and every line carries its evidence.