Fix video preprocessing bug in OpenCV loader by OrangeSodahub · Pull Request #522 · facebookresearch/sam3

OrangeSodahub · 2026-04-16T10:43:54Z

Summary

This PR fixes a bug in the OpenCV video loader used for video-file inputs. Specifically, the function load_video_frames_from_video_file_using_cv2 in

sam3/sam3/model/io_utils.py

Lines 332 to 345 in 44ef224

    
    # Convert to tensor
    frames_np = np.stack(frames, axis=0).astype(np.float32)  # (T, H, W, C)
    video_tensor = torch.from_numpy(frames_np).permute(0, 3, 1, 2)  # (T, C, H, W)
    img_mean = torch.tensor(img_mean, dtype=torch.float16).view(1, 3, 1, 1)
    img_std = torch.tensor(img_std, dtype=torch.float16).view(1, 3, 1, 1)
    if not offload_video_to_cpu:
        video_tensor = video_tensor.cuda()
        img_mean = img_mean.cuda()
        img_std = img_std.cuda()
    # normalize by mean and std
    video_tensor -= img_mean
    video_tensor /= img_std
    return video_tensor, original_height, original_width

normalizes decoded video frames without first scaling them from [0, 255] to [0, 1], even though the normalization parameters assume [0, 1] inputs. As a result, the model receives incorrectly scaled inputs during video inference.
The image-folder loader, by contrast, already divides by 255.0:

sam3/sam3/model/io_utils.py

Line 56 in 44ef224

img_np = img_np / 255.0

The torchcodec loader scales its frames as well:

sam3/sam3/model/io_utils.py

Line 689 in 44ef224

frame_resized /= 255

Changes

Divide decoded OpenCV video frames by 255.0 before mean/std normalization
Convert the video tensor to torch.float16 to align with the other loading approaches
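The two changes above can be sketched as follows. This is a minimal illustration, not the actual diff: `normalize_cv2_frames` is a hypothetical helper name, and the choice to normalize in float32 and cast to float16 at the end is an assumption made so the sketch runs on CPU, since the PR description does not state where the cast happens.

```python
import numpy as np
import torch


def normalize_cv2_frames(frames, img_mean, img_std, offload_video_to_cpu=True):
    """Hypothetical helper mirroring the tail of the OpenCV loader with the fix applied.

    `frames` is a list of (H, W, C) uint8 arrays as decoded by OpenCV.
    """
    frames_np = np.stack(frames, axis=0).astype(np.float32)  # (T, H, W, C)
    video_tensor = torch.from_numpy(frames_np).permute(0, 3, 1, 2)  # (T, C, H, W)

    # The fix: scale [0, 255] -> [0, 1] before mean/std normalization.
    video_tensor = video_tensor / 255.0

    mean = torch.tensor(img_mean, dtype=torch.float32).view(1, 3, 1, 1)
    std = torch.tensor(img_std, dtype=torch.float32).view(1, 3, 1, 1)
    if not offload_video_to_cpu:
        video_tensor = video_tensor.cuda()
        mean = mean.cuda()
        std = std.cuda()

    # Normalize by mean and std.
    video_tensor = (video_tensor - mean) / std

    # Assumption: cast last, so the output dtype matches the other loaders.
    return video_tensor.to(torch.float16)
```

With a mean and std of 0.5 per channel, an all-black frame maps to -1 and an all-white frame maps to 1.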

Validation

Without the fix, the images tensor produced in init_state (shown below) lies in [-1, 509] when an mp4 video is loaded through OpenCV, which is incorrect. I have also seen unusual results as a consequence.

sam3/sam3/model/sam3_video_inference.py

Lines 63 to 71 in 44ef224

    
    images, orig_height, orig_width = load_resource_as_video_frames(
        resource_path=resource_path,
        image_size=self.image_size,
        offload_video_to_cpu=offload_video_to_cpu,
        img_mean=self.image_mean,
        img_std=self.image_std,
        async_loading_frames=async_loading_frames,
        video_loader_type=video_loader_type,
    )

After the fix, the images tensor is always in [-1, 1], and the segmentation results are good.
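The two reported ranges are consistent with simple arithmetic. The sketch below assumes a per-channel mean and std of 0.5, which is inferred from the [-1, 509] and [-1, 1] ranges above rather than read from the repository; `normalized_range` is a hypothetical helper.

```python
def normalized_range(lo, hi, mean=0.5, std=0.5):
    """Return the (min, max) of (x - mean) / std for x in [lo, hi]."""
    return ((lo - mean) / std, (hi - mean) / std)


# Before the fix: raw pixel values in [0, 255] go straight into normalization.
unscaled = normalized_range(0.0, 255.0)  # (-1.0, 509.0)

# After the fix: pixel values are first divided by 255, so they lie in [0, 1].
scaled = normalized_range(0.0, 1.0)  # (-1.0, 1.0)
```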

fix video load bug

e9c61e8

meta-cla Bot added the CLA Signed label Apr 16, 2026

OrangeSodahub wants to merge 1 commit into facebookresearch:main from OrangeSodahub:main
