TITLE:
Domain-Robust Marine Plastic Detection Using Vision Models
AUTHORS:
Saanvi Kataria
KEYWORDS:
Vision Models, CNN, Marine Plastic Detection
JOURNAL NAME:
Open Journal of Modelling and Simulation, Vol.14 No.1, October 28, 2025
ABSTRACT: Marine plastic pollution is a pressing environmental threat, making reliable automation for underwater debris detection essential. However, vision systems trained on one dataset often degrade on new imagery due to domain shift. This study benchmarks models for cross-domain robustness: convolutional neural networks (CNNs: MobileNetV2, ResNet-18, EfficientNet-B0) and vision transformers (DeiT-Tiny, ViT-B/16) are trained on a labeled underwater dataset and then evaluated on a balanced cross-domain test set built from plastic-positive images drawn from a different source and negatives from the training domain. Two zero-shot models, CLIP ViT-L/14 and Google's Gemini 2.0 Flash, were also assessed; these leverage large-scale pretraining to classify images without fine-tuning. Results show that the lightweight MobileNetV2 delivers the strongest cross-domain performance (F1 ≈ 0.97), surpassing larger models. All fine-tuned models achieved high Precision (~99%) but differed in Recall, indicating varying sensitivity to plastic instances. Zero-shot CLIP is comparatively sensitive (Recall ~80%) yet prone to false positives (Precision ~56%), whereas Gemini exhibits the inverse profile (Precision ~99%, Recall ~81%). Error analysis highlights recurring confusions with coral textures, suspended particulates, and specular glare. Overall, compact CNNs with supervised training can generalize effectively for cross-domain underwater detection, while large pretrained vision-language models provide complementary strengths. Future work should explore hybrid strategies, such as small CNN backbones with foundation-model priors and domain-aware sampling, to combine high Precision with high Recall across heterogeneous marine environments and reduce labeling burdens at scale.
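
To make the zero-shot setup described in the abstract concrete, the sketch below shows how a single underwater image could be classified as plastic vs. non-plastic with CLIP ViT-L/14 through the Hugging Face transformers API. This is a minimal illustration only: the checkpoint route, the two prompt wordings, and the argmax decision rule are assumptions for demonstration, not the paper's exact configuration.

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

# Load a CLIP ViT-L/14 checkpoint (implementation route is an assumption;
# the paper only specifies the model family, not the library used).
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")
model.eval()

# Illustrative class prompts; the study's actual prompt wording is not given here.
prompts = [
    "an underwater photo containing plastic debris",   # positive class
    "an underwater photo with no plastic debris",      # negative class
]

def classify(image_path: str) -> str:
    """Return 'plastic' or 'no_plastic' for one image via zero-shot CLIP."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=prompts, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image has shape (1, num_prompts); softmax gives class probabilities.
    probs = outputs.logits_per_image.softmax(dim=-1).squeeze(0)
    return "plastic" if probs[0] > probs[1] else "no_plastic"

# Example usage (path is hypothetical):
# print(classify("samples/underwater_001.jpg"))
```

Precision, Recall, and F1 as reported in the abstract would then follow from comparing such per-image predictions against the cross-domain test labels.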