Results for "vision"
Computer Vision
Intermediate
AI focused on interpreting images/video: classification, detection, segmentation, tracking, and 3D understanding.
A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.
Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.
Models that process or generate multiple modalities, enabling vision-language tasks, speech, video understanding, etc.
Joint vision-language model aligning images and text.
Devices measuring physical quantities (vision, lidar, force, IMU, etc.).
External sensing of surroundings (vision, audio, lidar).
Transformer applied to image patches.
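To make the last entry ("Transformer applied to image patches") a little more concrete, here is a minimal sketch of the patch-splitting step only. It assumes non-overlapping square patches and a single-channel image stored as a list of rows. The function name `split_into_patches` is hypothetical and not taken from any library. In a typical vision transformer, each flattened patch would then be linearly projected and combined with a position embedding before entering the transformer layers; that part is not shown.

```python
def split_into_patches(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping square patches.

    Hypothetical helper for illustration; patches are returned in row-major order.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single sequence element.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# A toy 4x4 "image" whose pixel values are 0..15 in row-major order.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = split_into_patches(image, patch_size=2)
print(len(patches))   # number of patches (the sequence length seen by the transformer)
print(patches[0])     # the top-left 2x2 block, flattened
```

With a 4x4 image and 2x2 patches, the image becomes a sequence of four 4-element vectors, which is the sense in which the transformer treats patches like tokens.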