Results for "self-critique"
Architecture based on self-attention and feedforward layers; foundation of modern LLMs and many multimodal models.
Automated detection/prevention of disallowed outputs (toxicity, self-harm, illegal instructions, etc.).
Learning from data by constructing “pseudo-labels” (e.g., next-token prediction, masked modeling) without manual annotation.
Attention where queries/keys/values come from the same sequence, enabling token-to-token interactions.
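A minimal NumPy sketch of this mechanism (scaled dot-product self-attention); the function name and random weight matrices are illustrative, not from any particular library:

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    # Queries, keys, and values are all projections of the SAME sequence x.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])          # token-to-token scores
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)               # softmax over tokens
    return w @ v                                     # weighted mix of values

rng = np.random.default_rng(0)
seq_len, d = 4, 8
x = rng.normal(size=(seq_len, d))                    # one sequence of 4 tokens
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one output vector per input token
```

Because queries and keys come from the same sequence, every token can attend to every other token in a single step.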
Models evaluating and improving their own outputs.
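A toy sketch of such a generate-critique-revise loop; `generate`, `critique`, and `revise` stand in for model calls and are hypothetical:

```python
def self_critique_loop(generate, critique, revise, prompt, max_rounds=3):
    """Draft an answer, then repeatedly critique and revise it."""
    draft = generate(prompt)
    for _ in range(max_rounds):
        feedback = critique(draft)
        if feedback is None:          # critic finds no remaining issues
            break
        draft = revise(draft, feedback)
    return draft

# Toy stand-ins for model calls: strip trailing whitespace until clean.
gen = lambda p: p + "  "
crit = lambda d: "trailing whitespace" if d != d.rstrip() else None
rev = lambda d, fb: d.rstrip()
result = self_critique_loop(gen, crit, rev, "answer")
print(result)  # "answer"
```

In practice the critic and reviser are usually the same model prompted in different roles.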
Sampling multiple outputs and selecting consensus.
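A minimal sketch of the consensus step, assuming the samples have already been drawn; majority voting over final answers is the common choice:

```python
from collections import Counter

def consensus(samples):
    # Return the most frequent answer among independently sampled outputs.
    answer, _ = Counter(samples).most_common(1)[0]
    return answer

# e.g. five sampled answers to the same question
samples = ["42", "42", "41", "42", "40"]
print(consensus(samples))  # "42"
```

The vote is taken over the extracted final answers, not the full reasoning traces, so differently worded solutions that agree on the answer count together.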
Internal representation of the agent itself.