Reward Model

Difficulty: Intermediate

Model trained to predict human preferences (or utility) for candidate outputs; used in RLHF-style pipelines.

Full Definition

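A model trained to predict human preferences (or utility) for candidate outputs. Given a prompt and a candidate output, it typically returns a scalar score; in RLHF-style pipelines, that score serves as the reward signal used to optimize the policy model.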
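To make "trained to predict human preferences" concrete, here is a minimal sketch of one common training objective, a Bradley-Terry style pairwise loss. This is an assumption for illustration: the entry itself does not name a specific loss, and the function names below are hypothetical. The scores stand in for the scalar outputs a reward model would assign to a human-preferred ("chosen") output and a less-preferred ("rejected") output.

```python
import math


def preference_probability(score_chosen: float, score_rejected: float) -> float:
    """Modeled probability that the chosen output is preferred.

    Assumes a Bradley-Terry style model: sigmoid(score_chosen - score_rejected).
    """
    margin = score_chosen - score_rejected
    # Numerically stable sigmoid that avoids overflow for large |margin|.
    if margin >= 0:
        return 1.0 / (1.0 + math.exp(-margin))
    e = math.exp(margin)
    return e / (1.0 + e)


def pairwise_preference_loss(score_chosen: float, score_rejected: float) -> float:
    """Negative log-likelihood of the human preference: -log(sigmoid(margin))."""
    margin = score_chosen - score_rejected
    # Stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Hypothetical scores: the model already ranks the chosen output higher.
    print(pairwise_preference_loss(2.0, 0.5))
    # Same scores with the ranking reversed: the loss is larger.
    print(pairwise_preference_loss(0.5, 2.0))
```

When both outputs receive the same score, the modeled preference probability is 0.5 and the loss is ln 2. Minimizing this loss over many labeled pairs pushes the score of the preferred output above the score of the rejected one.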
