Balancing Exploration and Exploitation in Robotics: Path Optimization and Uncertainty Management in Complex Environments
Abstract
Balancing exploration and exploitation is a fundamental challenge in informative path planning for environmental monitoring. Although numerous reward functions have been proposed in the literature, most have been evaluated on different datasets and under different experimental conditions, making direct comparison difficult. The novelty of this study lies in its development of four experimentally validated synthetic datasets, designed with increasing spatial complexity (1 to 4 Regions of Interest, ROIs), to enable a fair and systematic comparison of three widely used Gaussian Process-based reward functions: Entropy, Upper Confidence Bound (UCB), and Level Set. The proposed framework integrates a greedy local path optimization algorithm that maximizes expected reward and incorporates a cross-validation strategy to reduce initial model variance and mitigate overfitting. Importantly, this study not only compares the individual performances of these reward functions but also analyzes how each one contributes to the trade-off between exploration and exploitation under varying environmental conditions. Experimental results show that Level Set performs best in high-variance environments (favoring exploration), UCB excels in low-variance settings with fast convergence (favoring exploitation), and Entropy provides stable long-term uncertainty reduction (balancing both aspects). With the inclusion of cross-validation, the model achieves up to a 60% reduction in RMSE and a 50% reduction in variance across all scenarios. These findings highlight the practical value of reward-aware path planning in robotic exploration tasks, particularly when the reward function is matched to the spatial complexity of the monitoring environment.
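The abstract does not give the exact form of the three reward functions or of the greedy planner, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It uses commonly cited textbook forms: the differential entropy of a Gaussian predictive distribution, UCB as mean plus a scaled standard deviation, and a "straddle"-style score for Level Set. The RBF kernel, its hyperparameters, the names `kappa`, `beta`, `threshold`, and `step`, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    # Squared Euclidean distances between every row of a and every row of b.
    d2 = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return signal_var * np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-2):
    # Standard zero-mean GP regression via a Cholesky factorization.
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_tq.T @ alpha
    v = np.linalg.solve(chol, k_tq)
    var = np.diag(rbf_kernel(x_query, x_query)) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 1e-12))

def entropy_reward(mean, std):
    # Differential entropy of N(mean, std^2); depends on std only.
    return 0.5 * np.log(2.0 * np.pi * np.e * std**2)

def ucb_reward(mean, std, kappa=2.0):
    # Assumed form: predicted value plus an exploration bonus.
    return mean + kappa * std

def level_set_reward(mean, std, threshold=0.5, beta=1.96):
    # Assumed "straddle" form: high where the prediction is uncertain
    # and close to the level-set threshold.
    return beta * std - np.abs(mean - threshold)

def greedy_next_waypoint(position, candidates, mean, std, reward_fn, step=1.5):
    # Greedy local step: best-scoring candidate within `step` of the robot.
    # Assumes at least one candidate is reachable.
    reachable = np.linalg.norm(candidates - position, axis=1) <= step
    scores = np.where(reachable, reward_fn(mean, std), -np.inf)
    return candidates[np.argmax(scores)]

# Toy data: three measurements and a 7 x 7 grid of candidate waypoints.
x_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y_train = np.array([0.1, 0.8, 0.3])
gx, gy = np.meshgrid(np.linspace(0.0, 3.0, 7), np.linspace(0.0, 3.0, 7))
candidates = np.column_stack([gx.ravel(), gy.ravel()])
mean, std = gp_posterior(x_train, y_train, candidates)

start = np.array([0.0, 0.0])
for fn in (entropy_reward, ucb_reward, level_set_reward):
    print(fn.__name__, greedy_next_waypoint(start, candidates, mean, std, fn))
```

In this sketch the Entropy score ignores the predicted mean entirely, which is why it behaves as a pure uncertainty-reduction criterion, while UCB and the Level Set score both mix the mean with the standard deviation and so shift between exploration and exploitation as `kappa`, `beta`, and `threshold` change.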










