Vision Subsystem

Why Vision-Based Pose Estimation?

Imagine your robot driving across the field, relying solely on wheel encoders and a gyro. Over time, small errors accumulate—wheel slip, surface variations, measurement inaccuracies. Before you know it, your robot thinks it's in one place, but it's actually several feet away. This is drift, and it's the fundamental challenge of odometry-based localization.

Vision-based pose estimation solves this problem by providing absolute positioning. Instead of accumulating error over time, your robot can look at the field, see known markers (AprilTags), and determine where it is without reference to any previous measurement. It's like having GPS for your robot—periodic absolute corrections that cancel drift and keep navigation accurate.

When combined with odometry through sensor fusion, you get the best of both worlds: continuous, fast tracking from odometry, with periodic absolute corrections from vision. This creates a robust, accurate pose estimation system that works reliably in competition conditions.

Understanding AprilTags

AprilTags are fiducial markers—visual markers with unique black and white patterns that encode a specific ID. Think of them as QR codes designed for robotics. Each AprilTag on an FRC field has a known position and orientation, specified in the official field layout documentation.

When your camera detects an AprilTag, the vision system can calculate the camera's position relative to that tag. Since the tag's field position is known, and the camera's position relative to your robot is known, the system can determine your robot's pose on the field. Each measurement is absolute—no accumulated error—though it arrives with some processing latency, which is why pose estimates carry timestamps for sensor fusion.

FRC fields typically have AprilTags placed at strategic locations: scoring areas, corners, and mid-field positions. Each tag has a unique ID that corresponds to its field position, allowing your robot to identify which tag it's seeing and calculate its pose accordingly.

The Vision Processing Pipeline

Here's how vision pose estimation works, step by step:

Step 1: Image Capture - Your camera continuously captures images of the field, typically at 10-30 frames per second. The camera must be calibrated to correct for lens distortion; without calibration, the distances and poses computed in later steps will be inaccurate.

Step 2: Tag Detection - The vision processor analyzes each image, searching for AprilTag patterns. When a tag is found, the system identifies its unique ID and measures its position in the image.

Step 3: Pose Calculation - Using the tag's known physical size and its measured position in the image, the system solves for the camera's position and orientation relative to the tag. This is a perspective-n-point (PnP) problem, solved using the camera calibration data.

Step 4: Robot Pose Conversion - The camera pose is converted to robot pose by accounting for the camera's mounting position and orientation relative to the robot center. This gives you the robot's actual position on the field.
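
To make Steps 3 and 4 concrete, here is a minimal sketch of the transform math using WPILib's geometry classes. The specific poses and offsets (tagFieldPose, cameraToTag, robotToCamera) are made-up example values, not real field data:

import edu.wpi.first.math.geometry.Pose3d;
import edu.wpi.first.math.geometry.Rotation3d;
import edu.wpi.first.math.geometry.Transform3d;

public class PoseChainExample {
    public static void main(String[] args) {
        // Known from the field layout: the tag's pose on the field
        Pose3d tagFieldPose = new Pose3d(15.0, 5.5, 1.45, new Rotation3d(0, 0, Math.PI));

        // Measured by the vision system (Step 3): camera relative to the tag
        Transform3d cameraToTag = new Transform3d(2.0, 0.3, -0.5, new Rotation3d());

        // Known from your robot design: camera relative to the robot center
        Transform3d robotToCamera = new Transform3d(0.3, 0.0, 0.2, new Rotation3d());

        // Chain the transforms (Step 4): field -> tag -> camera -> robot
        Pose3d robotPose = tagFieldPose
            .transformBy(cameraToTag.inverse())     // field -> camera
            .transformBy(robotToCamera.inverse());  // field -> robot

        System.out.println(robotPose.toPose2d());
    }
}

This is essentially what PhotonPoseEstimator (introduced below) does for you, with additional handling for multiple tags and ambiguity.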

Creating the Vision Subsystem

Now let's create a vision subsystem that can detect AprilTags and provide pose estimates. Both PhotonVision and Limelight are excellent choices for FRC vision processing. PhotonVision is an open-source solution that runs on coprocessors, while Limelight is a dedicated smart camera. Both provide similar functionality for AprilTag detection and robot localization.

For PhotonVision, you'll configure your camera in the PhotonVision UI and use the PhotonCamera class in your code. For Limelight, you'll use the LimelightHelpers utility class which simplifies accessing Limelight's NetworkTables data. Both systems can provide pose estimates that work seamlessly with WPILib's PoseEstimator.

Step 1: Creating the Camera Object

The first step is to create a camera object that connects to your vision system. For PhotonVision, you create a PhotonCamera object with the camera name that matches your PhotonVision configuration. For Limelight, you don't need to create a camera object—instead, you'll use the LimelightHelpers class with your camera's name (or "limelight" for the default camera).

import org.photonvision.PhotonCamera;

public class VisionSubsystem {
    private final PhotonCamera m_camera;
    
    public VisionSubsystem() {
        // Camera name must match the name configured in PhotonVision UI
        m_camera = new PhotonCamera("MainCamera");
    }
}
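
For comparison, here is the same starting point on the Limelight side—a sketch assuming the LimelightHelpers class distributed by Limelight, which is commonly dropped into the frc.robot package. There is no camera object; every helper is a static method that takes the camera's NetworkTables name:

import frc.robot.LimelightHelpers;  // vendor-provided utility class

public class VisionSubsystem {
    // Must match the Limelight's configured NetworkTables name
    private static final String kLimelightName = "limelight";

    public boolean hasTargets() {
        // getTV() reads the "tv" entry: true when a valid target is in view
        return LimelightHelpers.getTV(kLimelightName);
    }
}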

Step 2: Configuring Camera Pose and PhotonPoseEstimator

Your camera needs to know its position and orientation relative to your robot's center. For PhotonVision, you also need to create a PhotonPoseEstimator which handles pose calculation from AprilTag detections. The estimator uses the field layout (AprilTag positions) and camera pose configuration. For Limelight, you configure the pipeline index and set the camera pose programmatically using setPipelineIndex() and setCameraPose_RobotSpace().

import edu.wpi.first.math.geometry.Rotation3d;
import edu.wpi.first.math.geometry.Transform3d;
import edu.wpi.first.apriltag.AprilTagFieldLayout;
import org.photonvision.PhotonPoseEstimator;
import org.photonvision.PhotonPoseEstimator.PoseStrategy;

public class VisionSubsystem {
    // Camera position relative to robot center (in meters)
    // Measure on your robot: X forward, Y left, Z up from the robot center
    // to the camera lens. This transform is passed to PhotonPoseEstimator below.
    private static final Transform3d kRobotToCamera = new Transform3d(
        0.3, 0.0, 0.2,  // X, Y, Z position (meters)
        new Rotation3d()  // Rotation (camera facing straight ahead)
    );
    
    // Field layout with AprilTag positions
    // Load from WPILib's AprilTagFieldLayout or configure in PhotonVision UI
    private final AprilTagFieldLayout kTagLayout;
    
    private final PhotonPoseEstimator m_photonEstimator;
    
    public VisionSubsystem(AprilTagFieldLayout tagLayout) {
        kTagLayout = tagLayout;
        
        // Create PhotonPoseEstimator
        // MULTI_TAG_PNP_ON_COPROCESSOR: Uses multiple tags for better accuracy
        // Falls back to LOWEST_AMBIGUITY if multi-tag fails
        m_photonEstimator = new PhotonPoseEstimator(
            kTagLayout,
            PoseStrategy.MULTI_TAG_PNP_ON_COPROCESSOR,
            kRobotToCamera
        );
        m_photonEstimator.setMultiTagFallbackStrategy(PoseStrategy.LOWEST_AMBIGUITY);
    }
}
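
The Limelight equivalent of this configuration step is done programmatically with the helpers mentioned above. A sketch, again assuming LimelightHelpers; per the Limelight documentation, the translation arguments are in meters and the rotation arguments in degrees:

public class VisionSubsystem {
    private static final String kLimelightName = "limelight";

    public VisionSubsystem() {
        // Select the AprilTag pipeline configured in the Limelight UI
        LimelightHelpers.setPipelineIndex(kLimelightName, 0);

        // Camera pose relative to robot center:
        // forward, side, up (meters), then roll, pitch, yaw (degrees)
        LimelightHelpers.setCameraPose_RobotSpace(
            kLimelightName,
            0.3,   // forward of robot center
            0.0,   // side offset
            0.2,   // up from robot center
            0.0, 0.0, 0.0   // roll, pitch, yaw
        );
    }
}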

Step 3: Getting AprilTag Detections

Once your camera is configured, you can retrieve AprilTag detections. Both systems provide methods to check whether targets are detected and to get information about those targets. PhotonVision uses getAllUnreadResults() to fetch every pipeline result since the last call (process each one so you don't miss detections), while Limelight's getLatestResults() returns the latest results parsed from the camera's JSON dump.

import org.photonvision.targeting.PhotonPipelineResult;
import org.photonvision.targeting.PhotonTrackedTarget;
import java.util.List;
import java.util.Optional;

public class VisionSubsystem {
    private final PhotonCamera m_camera;
    
    public Optional<Integer> getDetectedTagId() {
        // Get all unread results to ensure we don't miss any detections
        List<PhotonPipelineResult> results = m_camera.getAllUnreadResults();
        
        for (PhotonPipelineResult result : results) {
            if (result.hasTargets()) {
                PhotonTrackedTarget bestTarget = result.getBestTarget();
                int fiducialId = bestTarget.getFiducialId();
                
                // Fiducial ID >= 0 means it's an AprilTag
                if (fiducialId >= 0) {
                    return Optional.of(fiducialId);
                }
            }
        }
        
        return Optional.empty();
    }
    
    public boolean hasTargets() {
        PhotonPipelineResult result = m_camera.getLatestResult();
        return result.hasTargets();
    }
}
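
The corresponding Limelight accessors are simpler, since the camera publishes the primary tag's ID directly. A sketch assuming LimelightHelpers:

import java.util.Optional;

public class VisionSubsystem {
    private static final String kLimelightName = "limelight";

    public Optional<Integer> getDetectedTagId() {
        // getFiducialID() returns the primary tag's ID, or -1 when no tag is visible
        double id = LimelightHelpers.getFiducialID(kLimelightName);
        return id >= 0 ? Optional.of((int) id) : Optional.empty();
    }

    public boolean hasTargets() {
        return LimelightHelpers.getTV(kLimelightName);
    }
}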

Step 4: Extracting Pose Estimates

The most important part of vision-based pose estimation is extracting the robot's pose from AprilTag detections. PhotonVision uses PhotonPoseEstimator.update(), which calculates the robot pose from a pipeline result and returns an EstimatedRobotPose with the pose, timestamp, and target information. Limelight provides getBotPoseEstimate_wpiBlue(), which returns a PoseEstimate object containing the pose, timestamp, and metadata about detection quality. Note that getAllUnreadResults() drains the camera's unread queue, so only one place in your code should consume it each loop—in the complete subsystem below, that's periodic().

import edu.wpi.first.math.geometry.Pose2d;
import org.photonvision.EstimatedRobotPose;
import org.photonvision.targeting.PhotonPipelineResult;
import java.util.Optional;

public class VisionSubsystem {
    private final PhotonCamera m_camera;
    private final PhotonPoseEstimator m_photonEstimator;
    
    /**
     * Process all unread results and get pose estimates.
     * Returns the latest valid pose estimate if available.
     */
    public Optional<Pose2d> getAprilTagPose() {
        Optional<EstimatedRobotPose> latestEstimate = Optional.empty();
        
        // Process all unread results to avoid missing detections
        for (PhotonPipelineResult result : m_camera.getAllUnreadResults()) {
            if (result.hasTargets()) {
                // PhotonPoseEstimator handles pose calculation
                Optional<EstimatedRobotPose> estimate = m_photonEstimator.update(result);
                
                if (estimate.isPresent()) {
                    latestEstimate = estimate;
                }
            }
        }
        
        return latestEstimate.map(est -> est.estimatedPose.toPose2d());
    }
    
    /**
     * Get pose estimate with timestamp for sensor fusion.
     * This includes the full EstimatedRobotPose with timestamp and targets.
     */
    public Optional<EstimatedRobotPose> getLatestEstimatedPose() {
        Optional<EstimatedRobotPose> latestEstimate = Optional.empty();
        
        for (PhotonPipelineResult result : m_camera.getAllUnreadResults()) {
            if (result.hasTargets()) {
                Optional<EstimatedRobotPose> estimate = m_photonEstimator.update(result);
                if (estimate.isPresent()) {
                    latestEstimate = estimate;
                }
            }
        }
        
        return latestEstimate;
    }
}
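
The Limelight version of this step uses getBotPoseEstimate_wpiBlue(), mentioned above, which returns a PoseEstimate whose pose is already field-relative (blue-alliance origin) and whose timestamp is latency-compensated. A sketch assuming LimelightHelpers:

import edu.wpi.first.math.geometry.Pose2d;
import java.util.Optional;

public class VisionSubsystem {
    private static final String kLimelightName = "limelight";

    public Optional<Pose2d> getAprilTagPose() {
        LimelightHelpers.PoseEstimate estimate =
            LimelightHelpers.getBotPoseEstimate_wpiBlue(kLimelightName);

        // Reject missing or empty estimates (no tags in view)
        if (estimate == null || estimate.tagCount == 0) {
            return Optional.empty();
        }

        // estimate.timestampSeconds is what you would pass to addVisionMeasurement()
        return Optional.of(estimate.pose);
    }
}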

Complete Vision Subsystem

Here's the complete vision subsystem implementation using PhotonPoseEstimator. It follows the production-ready pattern from PhotonVision's examples. The EstimateConsumer interface lets you pass pose estimates, with dynamic standard deviations, directly to your SwerveDrivePoseEstimator.

package frc.robot.subsystems;

import edu.wpi.first.math.Matrix;
import edu.wpi.first.math.VecBuilder;
import edu.wpi.first.math.geometry.Pose2d;
import edu.wpi.first.math.geometry.Transform3d;
import edu.wpi.first.math.numbers.N1;
import edu.wpi.first.math.numbers.N3;
import edu.wpi.first.apriltag.AprilTagFieldLayout;
import edu.wpi.first.wpilibj2.command.SubsystemBase;
import java.util.List;
import java.util.Optional;
import org.photonvision.EstimatedRobotPose;
import org.photonvision.PhotonCamera;
import org.photonvision.PhotonPoseEstimator;
import org.photonvision.PhotonPoseEstimator.PoseStrategy;
import org.photonvision.targeting.PhotonPipelineResult;
import org.photonvision.targeting.PhotonTrackedTarget;

/**
 * Vision subsystem for AprilTag detection using PhotonVision.
 * 
 * This implementation uses PhotonPoseEstimator which provides:
 * - Automatic pose calculation from AprilTag detections
 * - Support for multi-tag pose estimation (more accurate)
 * - Fallback strategies for single-tag scenarios
 * - Proper timestamp handling for sensor fusion
 * 
 * IMPORTANT: Configure the following:
 * 1. Camera name must match the camera name passed to constructor
 * 2. Set up AprilTag pipeline in PhotonVision UI
 * 3. Load AprilTagFieldLayout (from WPILib or PhotonVision UI)
 * 4. Configure Robot to Camera Transform (in UI or via code)
 */
public class VisionSubsystem extends SubsystemBase {
    private final PhotonCamera m_camera;
    private final PhotonPoseEstimator m_photonEstimator;
    private Matrix<N3, N1> m_currentStdDevs;
    private final EstimateConsumer m_estimateConsumer;
    
    // Standard deviation constants for pose estimation
    // Lower values = more trust in the measurement
    // Heading std dev is Double.MAX_VALUE so the estimator ignores the vision
    // heading and trusts the gyro instead
    private static final Matrix<N3, N1> kSingleTagStdDevs = 
        VecBuilder.fill(0.7, 0.7, Double.MAX_VALUE);  // X, Y, heading (meters, meters, radians)
    private static final Matrix<N3, N1> kMultiTagStdDevs = 
        VecBuilder.fill(0.5, 0.5, Double.MAX_VALUE);  // Tighter X/Y with multiple tags
    
    /**
     * Functional interface for consuming pose estimates.
     * Pass this to your SwerveDrivePoseEstimator.addVisionMeasurement()
     */
    @FunctionalInterface
    public interface EstimateConsumer {
        void accept(Pose2d pose, double timestamp, Matrix<N3, N1> stdDevs);
    }
    
    /**
     * Create a new VisionSubsystem.
     * 
     * @param cameraName Name of the camera configured in PhotonVision UI
     * @param tagLayout AprilTag field layout with tag positions
     * @param robotToCamera Transform from robot center to camera
     * @param estimateConsumer Lambda to pass estimates to your pose estimator
     */
    public VisionSubsystem(
            String cameraName,
            AprilTagFieldLayout tagLayout,
            Transform3d robotToCamera,
            EstimateConsumer estimateConsumer) {
        
        m_camera = new PhotonCamera(cameraName);
        m_estimateConsumer = estimateConsumer;
        
        // Create PhotonPoseEstimator
        // MULTI_TAG_PNP_ON_COPROCESSOR: Uses multiple tags for better accuracy
        // Falls back to LOWEST_AMBIGUITY if multi-tag estimation fails
        m_photonEstimator = new PhotonPoseEstimator(
            tagLayout,
            PoseStrategy.MULTI_TAG_PNP_ON_COPROCESSOR,
            robotToCamera
        );
        m_photonEstimator.setMultiTagFallbackStrategy(PoseStrategy.LOWEST_AMBIGUITY);
        
        // Initialize with single-tag standard deviations
        m_currentStdDevs = kSingleTagStdDevs;
    }
    
    @Override
    public void periodic() {
        // Process all unread results to avoid missing detections
        for (PhotonPipelineResult result : m_camera.getAllUnreadResults()) {
            if (result.hasTargets()) {
                // Get pose estimate from PhotonPoseEstimator
                Optional<EstimatedRobotPose> visionEst = m_photonEstimator.update(result);
                
                // Calculate dynamic standard deviations based on tag count and distance
                updateEstimationStdDevs(visionEst, result.getTargets());
                
                // Pass estimate to pose estimator with calculated standard deviations
                visionEst.ifPresent(est -> {
                    m_estimateConsumer.accept(
                        est.estimatedPose.toPose2d(),
                        est.timestampSeconds,
                        m_currentStdDevs
                    );
                });
            }
        }
    }
    
    /**
     * Calculate dynamic standard deviations based on tag count and distance.
     * This heuristic adjusts trust in measurements:
     * - Multiple tags = more accurate = lower std devs
     * - Single tag far away = less accurate = higher std devs
     * 
     * @param estimatedPose The estimated pose (may be empty)
     * @param targets All targets in this camera frame
     */
    private void updateEstimationStdDevs(
            Optional<EstimatedRobotPose> estimatedPose,
            List<PhotonTrackedTarget> targets) {
        
        if (estimatedPose.isEmpty()) {
            // No pose input, default to single-tag std devs
            m_currentStdDevs = kSingleTagStdDevs;
            return;
        }
        
        // Count AprilTags and calculate average distance
        int numTags = 0;
        double avgDist = 0;
        
        for (PhotonTrackedTarget target : targets) {
            if (target.getFiducialId() < 0) continue;  // Skip non-AprilTag targets
            
            // Get tag pose from field layout
            var tagPose = m_photonEstimator.getFieldTags()
                .getTagPose(target.getFiducialId());
            
            if (tagPose.isEmpty()) continue;
            
            numTags++;
            
            // Calculate distance from robot to tag
            double distance = tagPose.get().toPose2d().getTranslation()
                .getDistance(estimatedPose.get().estimatedPose.toPose2d().getTranslation());
            
            avgDist += distance;
        }
        
        if (numTags == 0) {
            // No tags visible, default to single-tag std devs
            m_currentStdDevs = kSingleTagStdDevs;
        } else {
            // Calculate average distance
            avgDist /= numTags;
            
            // Start with base std devs
            Matrix<N3, N1> estStdDevs;
            if (numTags > 1) {
                // Multiple tags = more accurate
                estStdDevs = kMultiTagStdDevs;
            } else {
                // Single tag
                estStdDevs = kSingleTagStdDevs;
                
                // Reject single tags that are too far away
                if (avgDist > 4.0) {  // 4 meters
                    estStdDevs = VecBuilder.fill(
                        Double.MAX_VALUE,
                        Double.MAX_VALUE,
                        Double.MAX_VALUE
                    );
                }
            }
            
            // Increase std devs based on distance (farther = less accurate)
            // Formula: stdDev *= (1 + distance^2 / 30)
            if (estStdDevs.get(0, 0) != Double.MAX_VALUE) {
                estStdDevs = estStdDevs.times(1 + (avgDist * avgDist / 30.0));
            }
            
            m_currentStdDevs = estStdDevs;
        }
    }
    
    /**
     * Get the current standard deviations for pose estimation.
     * 
     * @return Current standard deviations matrix
     */
    public Matrix<N3, N1> getEstimationStdDevs() {
        return m_currentStdDevs;
    }
    
    /**
     * Get robot pose estimate from latest detection.
     * 
     * @return Optional containing pose estimate if available
     */
    public Optional<Pose2d> getAprilTagPose() {
        PhotonPipelineResult result = m_camera.getLatestResult();
        
        if (result.hasTargets()) {
            Optional<EstimatedRobotPose> estimate = m_photonEstimator.update(result);
            return estimate.map(est -> est.estimatedPose.toPose2d());
        }
        
        return Optional.empty();
    }
    
    /**
     * Get the ID of the detected AprilTag.
     * 
     * @return Optional containing tag ID if detected
     */
    public Optional<Integer> getDetectedTagId() {
        PhotonPipelineResult result = m_camera.getLatestResult();
        
        if (result.hasTargets()) {
            PhotonTrackedTarget bestTarget = result.getBestTarget();
            int fiducialId = bestTarget.getFiducialId();
            
            if (fiducialId >= 0) {
                return Optional.of(fiducialId);
            }
        }
        
        return Optional.empty();
    }
    
    /**
     * Check if any AprilTags are currently detected.
     * 
     * @return true if targets are detected
     */
    public boolean hasTargets() {
        PhotonPipelineResult result = m_camera.getLatestResult();
        return result.hasTargets();
    }
}
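
To see how this subsystem plugs into the rest of the robot, here is a sketch of the wiring in RobotContainer. The drive subsystem and its addVisionMeasurement() method are hypothetical stand-ins for your own wrapper around SwerveDrivePoseEstimator.addVisionMeasurement(), and the camera name, transform, and field-layout constant are placeholders for your configuration (AprilTagFieldLayout.loadField() is available in recent WPILib releases):

package frc.robot;

import edu.wpi.first.apriltag.AprilTagFieldLayout;
import edu.wpi.first.apriltag.AprilTagFields;
import edu.wpi.first.math.geometry.Rotation3d;
import edu.wpi.first.math.geometry.Transform3d;
import frc.robot.subsystems.DriveSubsystem;  // hypothetical drive subsystem
import frc.robot.subsystems.VisionSubsystem;

public class RobotContainer {
    private final DriveSubsystem m_drive = new DriveSubsystem();
    private final VisionSubsystem m_vision;

    public RobotContainer() {
        // Load the season's tag positions (substitute the current year's layout)
        AprilTagFieldLayout tagLayout =
            AprilTagFieldLayout.loadField(AprilTagFields.k2024Crescendo);

        // Example mounting: 0.3 m forward, 0.2 m up, facing straight ahead
        Transform3d robotToCamera = new Transform3d(0.3, 0.0, 0.2, new Rotation3d());

        // Every estimate produced in periodic() flows into the drive's pose
        // estimator, which fuses it with odometry via addVisionMeasurement()
        m_vision = new VisionSubsystem(
            "MainCamera",
            tagLayout,
            robotToCamera,
            (pose, timestamp, stdDevs) ->
                m_drive.addVisionMeasurement(pose, timestamp, stdDevs));
    }
}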
