Results for "reward inference"

25 results

Reward Hacking Advanced

Maximizing the specified reward without achieving the intended goal.

AI Safety & Alignment
Reward Shaping Advanced

Modifying the reward signal to accelerate learning.

Reinforcement Learning
Sparse Reward Advanced

Reward given only rarely, often only upon task completion.

Reinforcement Learning
Secure Inference Intermediate

Methods that protect the model and data from operators and attackers during inference (e.g., trusted execution environments).

Foundations & Theory
Reinforcement Learning Intermediate

A learning paradigm where an agent interacts with an environment and learns to choose actions to maximize cumulative reward.

Reinforcement Learning
RLHF Intermediate

Reinforcement learning from human feedback: uses preference data to train a reward model and optimize the policy.

Optimization
Value Function Intermediate

Expected cumulative reward from a state or state-action pair.

AI Economics & Strategy
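
As a rough illustration of the Value Function entry above, the sketch below estimates a state's value by averaging returns over sampled episodes. The discount factor `gamma` and the function names are assumptions made for this example and are not part of the glossary definition; setting `gamma=1.0` gives the plain cumulative reward.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, each discounted by gamma per time step (gamma is an assumed parameter)."""
    total = 0.0
    # Work backwards so each step computes G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def monte_carlo_value(episodes, gamma=0.9):
    """Estimate a state's value as the mean return over episodes that start in that state."""
    returns = [discounted_return(ep, gamma) for ep in episodes]
    return sum(returns) / len(returns)
```

Here `episodes` is a hypothetical list of reward sequences, for example `monte_carlo_value([[1, 1, 1], [0, 0, 0]], gamma=0.5)`.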
Policy Gradient Intermediate

Optimizing policies directly via gradient ascent on expected reward.

AI Economics & Strategy
Inverse Reinforcement Learning Advanced

Inferring a reward function from observed behavior.

Reinforcement Learning
Latency Intermediate

Time from request to response; critical for real-time inference and UX.

Foundations & Theory
Compute Intermediate

Hardware resources used for training/inference; constrained by memory bandwidth, FLOPs, and parallelism.

Foundations & Theory
Quantization Intermediate

Reducing numeric precision of weights/activations to speed inference and reduce memory with acceptable accuracy loss.

Foundations & Theory
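
To make the Quantization entry above concrete, here is a minimal sketch of symmetric 8-bit quantization with one shared scale. This is only one of several schemes, and the helper names are hypothetical; real toolchains typically quantize per channel or per group.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(v / scale) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats; the gap to the originals is the accuracy loss."""
    return [q * scale for q in quantized]
```

Each reconstructed value differs from the original by at most about half of `scale`, which is the precision traded away for smaller storage.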
Causal Mask Intermediate

Attention mask that prevents each position from attending to future tokens during training and inference.

AI Economics & Strategy
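
The Causal Mask entry above can be sketched in plain Python as a lower-triangular mask applied before the softmax. The function names are illustrative assumptions, and production code would use tensor operations rather than nested lists.

```python
import math


def causal_mask(seq_len):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def masked_softmax(scores, mask):
    """Row-wise softmax that gives zero weight to masked (future) positions."""
    weights = []
    for row, allowed in zip(scores, mask):
        # A masked score acts like -infinity, so its exponential is 0.
        exps = [math.exp(s) if ok else 0.0 for s, ok in zip(row, allowed)]
        total = sum(exps)  # never zero: the diagonal is always allowed
        weights.append([e / total for e in exps])
    return weights
```

With uniform scores, the first position attends only to itself and the second splits its weight evenly across the first two positions.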
Instrumental Variable Advanced

A variable that influences the outcome only through the treatment, enabling causal inference despite unobserved confounding.

Causal AI & Interpretability
Exposure Bias Intermediate

Mismatch between training, where the model conditions on ground-truth prefixes, and inference, where it conditions on its own earlier predictions.

Model Failure Modes
Token Budgeting Intermediate

Limiting the number of tokens consumed during inference to control cost and latency.

AI Economics & Strategy
Reward Model Intermediate

Model trained to predict human preferences (or utility) for candidate outputs; used in RLHF-style pipelines.

Foundations & Theory
Causal Inference Intermediate

Framework for reasoning about cause-effect relationships beyond correlation, often using structural assumptions and experiments.

Foundations & Theory
Bayesian Inference Intermediate

Updating beliefs about parameters using observed evidence and prior distributions.

AI Economics & Strategy
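
One small worked case of the Bayesian Inference entry above is the Beta-Bernoulli update, where a Beta prior over a success probability is updated by counting observed successes and failures. The choice of this conjugate pair and the function names are assumptions made for illustration.

```python
def beta_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing Bernoulli outcomes."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)
```

Starting from a uniform prior `Beta(1, 1)` and observing 7 successes and 3 failures gives the posterior `Beta(8, 4)`, whose mean is 8/12.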
Inference Pipeline Intermediate

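The sequence of steps a request passes through to produce a prediction in production, such as preprocessing, model execution, and postprocessing.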

MLOps & Infrastructure
Batch Inference Intermediate

Running predictions on large datasets periodically.

MLOps & Infrastructure
Online Inference Intermediate

Low-latency prediction per request.

MLOps & Infrastructure
Inference Cost Intermediate

Cost to run models in production.

AI Economics & Strategy
Edge Inference Intermediate

Running models locally on the end device rather than on a remote server.

AI Economics & Strategy
Active Inference Frontier

Framework in which an agent acts and updates its beliefs to minimize surprise, formalized as variational free energy.

World Models & Cognition
