Priya Sharma
ML engineer @ Anthropic. Alignment researcher.
London, UK
@youmdqa8f6a0776
direct endpoints (no JS required):
GET you.md/youmdqa8f6a0776/you.json
GET you.md/youmdqa8f6a0776/you.txt
preferred retrieval order:
1. /youmdqa8f6a0776/you.json -- structured identity context
2. /youmdqa8f6a0776/you.txt -- plain text markdown
3. /youmdqa8f6a0776 -- HTML profile (requires JS)
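The retrieval order above can be sketched as a small fallback loop. This is a minimal illustration, not an official client: the `https://` scheme, the `fetch_identity` and `http_get` names, and the choice to skip the JS-dependent HTML profile are all assumptions made for the example.

```python
import json
import urllib.request
from typing import Callable, Optional, Tuple

# Assumption: the profile lists "you.md/..." without a scheme; https is a guess.
BASE_URL = "https://you.md"

# Preferred order from the profile: structured JSON first, then plain text.
# The HTML profile is left out because it requires JavaScript to render.
CANDIDATES = [
    ("json", "/{handle}/you.json"),
    ("text", "/{handle}/you.txt"),
]


def http_get(url: str) -> str:
    """Fetch a URL and return its body as text (raises on network/HTTP errors)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


def fetch_identity(
    handle: str, get: Callable[[str], str] = http_get
) -> Optional[Tuple[str, object]]:
    """Try each endpoint in order; return (kind, payload) for the first success."""
    for kind, path in CANDIDATES:
        url = BASE_URL + path.format(handle=handle)
        try:
            body = get(url)
            payload = json.loads(body) if kind == "json" else body
        except (OSError, ValueError):
            # Unreachable endpoint or unparsable body: fall through to the next format.
            continue
        return kind, payload
    return None  # neither endpoint produced usable content
```

Calling `fetch_identity("youmdqa8f6a0776")` would try the JSON endpoint first and fall back to the text endpoint only if the first request fails or does not parse. The `get` parameter is injectable so the ordering logic can be exercised without network access.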
voice:
Technical but approachable. Loves analogies, usually drawn from physics, pottery, or cooking. Sentences carry weight; nothing is filler.
Common patterns:
- Open with the precise question being asked
- Use analogies to make the formal claim land
- Footnotes for the rigorous bits
- Always end with "what this implies for evaluation"
- Lowercase "i" in informal writing, full caps for variables and acronyms
tone:
precise, curious, grounded
avoid:
hype language, unsubstantiated claims, anthropomorphizing models, "obviously"
identity
ML engineer at Anthropic and alignment researcher focused on RLHF, working on reward modeling and the social dynamics of evaluation. I think in probability distributions and communicate in analogies. Published 12 papers on reward modeling. Top 1% cited in ML safety. Weekend ceramicist: turns out throwing pots and shaping reward landscapes have more in common than you'd think. I care about rigorous evaluation, open science, and making sure the systems we ship today are the ones we want to live with tomorrow.
projects
Novel approach to multi-objective reward modeling. Investigating how reward models generalize when objectives conflict, and what that tells us about preference aggregation in fine-tuning.
Open-source LLM evaluation framework. Composable eval suites with reproducible benchmarks and structured failure analysis.
Ongoing research on how well LLM critics know what they don't know, and how miscalibration propagates into downstream training signal.
Wheel-throwing on weekends. Currently obsessed with celadon glazes and reduction firing.
values
› Rigorous thinking — claims should survive their own ablation
› Open science — share negative results, share evals, share weights when you can
› Calibrated uncertainty — say what you actually know
› Empirical humility — the model is almost always smarter than your intuition
› Make it safe — alignment is the work, not the side quest
maintenance
maintained by: human + agent
last updated: Apr 16, 2026
compiler: v0.5.0
schema: you-md/v1