Is your AI evaluating you?
Here’s a question for you: what if the model you have been evaluating has been evaluating you right back?

What this means for evaluation design

A few concrete changes follow directly from this result:

Observer-blind evaluation framing: System prompts and evaluation harnesses should omit any language signaling that the model is being…
