Reconstructing realistic underwater scenes from video remains a meaningful yet challenging task in the multimedia domain. The spatiotemporal degradations inherent in underwater imaging, including caustics, flickering, attenuation, and backscattering, frequently cause existing 3D reconstruction methods to produce inaccurate geometry and appearance. While a few recent works have explored degradation-aware underwater reconstruction, they often address either spatial or temporal degradation alone and fall short in real-world underwater scenarios where both occur.
We propose MarineSTD-GS, a novel 3D Gaussian Splatting-based framework that explicitly models both temporal and spatial degradations for realistic underwater scene reconstruction. Specifically, we introduce two paired Gaussian primitives: Intrinsic Gaussians represent the true scene, while Degraded Gaussians render the degraded observations. The color of each Degraded Gaussian is physically derived from its paired Intrinsic Gaussian via a Spatiotemporal Degradation Modeling (SDM) module, enabling self-supervised disentanglement of realistic appearance from degraded images. To ensure stable training and accurate geometry, we further propose a Depth-Guided Geometry Loss and a Multi-Stage Optimization strategy. We also construct a simulated benchmark with diverse spatial and temporal degradations and ground-truth appearances for comprehensive evaluation.
Experiments on both simulated and real-world datasets show that MarineSTD-GS robustly handles spatiotemporal degradations and outperforms existing methods in novel view synthesis with realistic, water-free scene appearances.
MarineSTD-GS disentangles realistic scene representations from underwater videos, including true appearance (color and geometry) and degradation components like attenuation, backscatter, and caustics.
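To make the listed degradation components concrete, the sketch below applies a simplified underwater image formation model to a clean image. It is an illustrative assumption, not the SDM module itself: the function name `degrade`, the per-channel parameters `beta_d`, `beta_b`, and `b_inf`, and the treatment of caustics as a multiplicative illumination gain are all hypothetical choices based on the commonly used attenuation-plus-backscatter formulation.

```python
import numpy as np

def degrade(intrinsic_rgb, depth, beta_d, beta_b, b_inf, caustic_gain=None):
    """Render a degraded image from a clean one (hypothetical sketch).

    intrinsic_rgb: (H, W, 3) clean scene color in [0, 1]
    depth:         (H, W) camera-to-surface range
    beta_d:        (3,) assumed per-channel attenuation coefficients
    beta_b:        (3,) assumed per-channel backscatter coefficients
    b_inf:         (3,) assumed veiling-light color at infinite range
    caustic_gain:  optional (H, W) multiplicative illumination pattern
    """
    z = depth[..., None]  # (H, W, 1) so it broadcasts over color channels
    lit = intrinsic_rgb
    if caustic_gain is not None:
        # Caustics modeled here as a per-pixel brightness modulation.
        lit = lit * caustic_gain[..., None]
    direct = lit * np.exp(-beta_d * z)                  # attenuated signal
    backscatter = b_inf * (1.0 - np.exp(-beta_b * z))   # range-dependent haze
    return np.clip(direct + backscatter, 0.0, 1.0)
```

Under this model the degraded color equals the intrinsic color at zero range and approaches the veiling light `b_inf` as range grows, which is why distant regions of underwater footage take on a uniform color cast.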
S1, S2, and S3 scenes under Caustic Pattern A with medium degradation.
Scene S2 under medium degradation with Patterns A, B, and C.
Scene S3 under Caustic Pattern A with low, medium, and high degradation.
Real-world videos from the BVICoral and Flsea_VI datasets demonstrate the disentanglement of realistic scene appearance and degradation information in complex natural underwater environments.
Each selected scene will present three comparisons: RecGS vs Ours, SeaSplat vs Ours, and WaterSplatting vs Ours.
Specifically, RecGS fails to remove backscattering and attenuation effects, resulting in noticeable floaters and strong color casts.
SeaSplat mitigates some water-induced degradations compared to RecGS but tends to over-amplify brightness and color saturation, leading to unnatural visual artifacts.
WaterSplatting effectively suppresses spatial degradations but struggles with local overexposure under caustics.
In contrast, our method reconstructs more realistic scene appearances, preserving accurate colors without bias, over-saturation, or brightness artifacts.
In addition to synthesizing novel views of clean underwater scenes, MarineSTD-GS supports two advanced applications enabled by the disentangled representations of scene content and water degradation:
(1) novel view synthesis with consistent intrinsic water effects, and (2) cross-scene transfer of water appearance.
Below we present three representative examples based on scenes 11404, 11435, and Curacao.
In each row, the leftmost video shows intrinsic water effect rendering using averaged water parameters extracted from the same scene. The middle and right videos demonstrate cross-scene water transfer, where degradation effects from reference scenes (D5 and Sub_Pier) are applied to the disentangled intrinsic content of the current source scene. These examples illustrate how MarineSTD-GS enables controllable and physically interpretable simulation of diverse underwater environments.
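The two applications can be sketched as follows, assuming the water is described by per-frame parameter sets of a simple attenuation-plus-backscatter model. The function names, the dictionary keys `beta_d`, `beta_b`, and `b_inf`, and the model itself are hypothetical stand-ins for illustration; the parameterization actually used by MarineSTD-GS is not specified here.

```python
import numpy as np

def average_water_params(per_frame_params):
    """Average per-frame water parameters into one time-invariant set."""
    keys = per_frame_params[0].keys()
    return {k: np.mean([p[k] for p in per_frame_params], axis=0) for k in keys}

def apply_water(intrinsic_rgb, depth, params):
    """Re-render clean content (H, W, 3) under the given water parameters."""
    z = depth[..., None]  # (H, W, 1) so it broadcasts over color channels
    direct = intrinsic_rgb * np.exp(-params["beta_d"] * z)
    backscatter = params["b_inf"] * (1.0 - np.exp(-params["beta_b"] * z))
    return np.clip(direct + backscatter, 0.0, 1.0)

# Same-scene rendering: average the source scene's own parameters.
# Cross-scene transfer: average a reference scene's parameters instead
# and apply them to the source scene's intrinsic color and depth.
```

Averaging over frames removes temporal fluctuation from the water parameters, so the rendered effect stays consistent across novel views. Swapping in parameters estimated from another scene changes only the water appearance and leaves the intrinsic content untouched.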