TITLE:
Domain-Robust Marine Plastic Detection Using Vision Models
AUTHORS:
Saanvi Kataria
KEYWORDS:
Vision Models, CNN, Marine Plastic Detection
JOURNAL NAME:
Open Journal of Modelling and Simulation, Vol.14 No.1, October 28, 2025
ABSTRACT: Marine plastic pollution is a pressing environmental threat, making reliable automation for underwater debris detection essential. However, vision systems trained on one dataset often degrade on new imagery due to domain shift. This study benchmarks models for cross-domain robustness: convolutional neural networks (CNNs: MobileNetV2, ResNet-18, EfficientNet-B0) and vision transformers (DeiT-Tiny, ViT-B/16) are trained on a labeled underwater dataset and then evaluated on a balanced cross-domain test set built from plastic-positive images drawn from a different source and negatives from the training domain. Two zero-shot models, CLIP ViT-L/14 and Google's Gemini 2.0 Flash, were also assessed; these leverage large-scale pretraining to classify images without fine-tuning. Results show that the lightweight MobileNetV2 delivers the strongest cross-domain performance (F1 ≈ 0.97), surpassing larger models. All fine-tuned models achieved high Precision (~99%) but differed in Recall, indicating varying sensitivity to plastic instances. Zero-shot CLIP is comparatively sensitive (Recall ~80%) yet prone to false positives (Precision ~56%), whereas Gemini exhibits the inverse profile (Precision ~99%, Recall ~81%). Error analysis highlights recurring confusions with coral textures, suspended particulates, and specular glare. Overall, compact CNNs with supervised training can generalize effectively for cross-domain underwater detection, while large pretrained vision-language models provide complementary strengths. Future work should explore hybrid strategies, such as small CNN backbones with foundation-model priors and domain-aware sampling, to combine high Precision with high Recall across heterogeneous marine environments and reduce labeling burdens at scale.
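
To make the zero-shot setup described in the abstract concrete, the sketch below shows how a single underwater image could be classified as plastic vs. non-plastic with CLIP ViT-L/14 through the Hugging Face transformers API. This is a minimal illustration only: the checkpoint route, the two prompt wordings, and the argmax decision rule are assumptions for demonstration, not the paper's exact configuration.

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

# Load a CLIP ViT-L/14 checkpoint (implementation route is an assumption;
# the paper only specifies the model family, not the library used).
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")
model.eval()

# Illustrative class prompts; the study's actual prompt wording is not given here.
prompts = [
    "an underwater photo containing plastic debris",   # positive class
    "an underwater photo with no plastic debris",      # negative class
]

def classify(image_path: str) -> str:
    """Return 'plastic' or 'no_plastic' for one image via zero-shot CLIP."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=prompts, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image has shape (1, num_prompts); softmax gives class probabilities.
    probs = outputs.logits_per_image.softmax(dim=-1).squeeze(0)
    return "plastic" if probs[0] > probs[1] else "no_plastic"

# Example usage (path is hypothetical):
# print(classify("samples/underwater_001.jpg"))
```

Precision, Recall, and F1 as reported in the abstract would then follow from comparing such per-image predictions against the cross-domain test labels.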