Complete a challenge to add a verified artifact to your public profile. Build in any language: submissions are evaluated on behaviour, not syntax.
// 1 challenge · Evals & Testing · medium
Design an evaluation suite for a sentiment classifier (positive/negative/neutral) that surfaces known failure modes: sarcasm, mixed sentiment, and domain-specific language.
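One possible starting point, sketched in Python: tag every test case with the failure mode it probes, then report accuracy per failure mode instead of a single aggregate number, so a weakness in one slice cannot hide behind strong results elsewhere. Everything named here (`Case`, `CASES`, `evaluate`, `keyword_baseline`, the `control` slice) is hypothetical and invented for illustration. The example sentences are hand-written, and treating mixed sentiment as `neutral` is an assumption that a real suite would have to settle in its labelling guidelines.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List

LABELS = ("positive", "negative", "neutral")


@dataclass(frozen=True)
class Case:
    text: str
    expected: str       # one of LABELS
    failure_mode: str   # slice this case belongs to


# Hypothetical hand-written cases; a real suite needs many more per slice.
CASES: List[Case] = [
    Case("Oh great, another delay. Just what I needed.", "negative", "sarcasm"),
    Case("Wow, thanks for losing my luggage.", "negative", "sarcasm"),
    # Assumption: mixed sentiment is labelled neutral.
    Case("The screen is gorgeous but the battery is terrible.", "neutral", "mixed"),
    Case("Loved the food, hated the service.", "neutral", "mixed"),
    Case("This track is sick.", "positive", "domain"),
    Case("The stock got hammered after earnings.", "negative", "domain"),
    # Control slice: easy cases that any reasonable classifier should pass.
    Case("I love this phone.", "positive", "control"),
    Case("The package arrived on Tuesday.", "neutral", "control"),
]


def evaluate(classify: Callable[[str], str], cases: List[Case]) -> Dict[str, float]:
    """Return accuracy per failure mode for a text -> label classifier."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for case in cases:
        predicted = classify(case.text)
        if predicted not in LABELS:
            raise ValueError(f"unexpected label {predicted!r} for {case.text!r}")
        total[case.failure_mode] += 1
        correct[case.failure_mode] += int(predicted == case.expected)
    return {mode: correct[mode] / total[mode] for mode in total}


def keyword_baseline(text: str) -> str:
    """Deliberately naive stand-in classifier, used only to exercise the suite."""
    lowered = text.lower()
    if any(word in lowered for word in ("great", "gorgeous", "thanks", "love")):
        return "positive"
    if any(word in lowered for word in ("terrible", "hammered", "awful")):
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for mode, accuracy in sorted(evaluate(keyword_baseline, CASES).items()):
        print(f"{mode:8s} {accuracy:.2f}")
```

The keyword baseline passes the control slice but fails every sarcasm and mixed-sentiment case, which is the kind of gap a per-slice report is meant to expose. Because `evaluate` only needs a function from text to label, the classifier under test can be written in any language and wrapped behind a thin adapter.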