Multi-Objective Monte Carlo Tree Search for Autonomous Space Surveillance with Utility Vector Optimization

Phuong Linh Ngo¹, Steve Gehly¹, Marco Langbroek¹, Pieter Visser¹
¹ Delft University of Technology

Document details

Publishing year: 2025
Publisher: ESA Space Debris Office
Publishing type: Conference
Name of conference: 9th European Conference on Space Debris
Pages: n/a
Volume: 9
Issue: 1
Editors: S. Lemmens, T. Flohrer, F. Schmitz

Abstract

To support sustainable practices in space and secure operational activities, there is growing interest in autonomous methods for space surveillance. Space surveillance aims to gain comprehensive insights into space objects, including their origins, trajectories, mission goals, physical attributes, and rotational dynamics. Within the context of sensor management for space situational awareness (SSA), these requirements present a set of competing objectives.
The proposed approach considers three main objectives: searching for unregistered space objects, estimating their orbits from limited measurements for cataloguing, and refining the orbit information of catalogued objects through ongoing tracking. Performance is evaluated by the number of unique objects detected and successfully catalogued, and by how state errors evolve during the observation campaign.
The multi-objective sensor tasking problem is formulated as a Partially Observable Markov Decision Process (POMDP). To solve the POMDP, recent work has explored the use of Monte Carlo Tree Search (MCTS), which accounts for the evolution of the state space. While MCTS is commonly applied to single-objective problems, this study extends it to multi-objective sensor tasking from the outset. This is accomplished by redefining the action space, the set of possible choices in a given environment, as a tuple of macro and micro action spaces. Macro actions represent the high-level user objectives; micro actions are the tactical, immediate choices tied to each of them. This distinction allows short-term, objective-focused decisions within the micro action space to be balanced against long-term planning across multiple objectives in the macro action space.
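The macro/micro action tuple described above can be sketched as a small data structure. This is only an illustrative sketch, not the authors' implementation: the names `MacroAction`, `MicroAction`, and `enumerate_actions` are hypothetical, and representing a micro action as an azimuth/elevation pointing is an assumption made here for concreteness.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Tuple


class MacroAction(Enum):
    """High-level objectives named in the abstract."""
    SEARCH = "search"        # look for unregistered objects
    CATALOGUE = "catalogue"  # estimate orbits of new detections
    TRACK = "track"          # refine orbits of catalogued objects


@dataclass(frozen=True)
class MicroAction:
    """Hypothetical tactical choice: one sensor pointing direction."""
    azimuth_deg: float
    elevation_deg: float


# An action is the (macro, micro) tuple described in the text.
Action = Tuple[MacroAction, MicroAction]


def enumerate_actions(
    candidates: Dict[MacroAction, List[MicroAction]]
) -> List[Action]:
    """Flatten per-objective pointing candidates into one action list."""
    return [
        (macro, micro)
        for macro, micros in candidates.items()
        for micro in micros
    ]


# Made-up pointing candidates, for illustration only.
candidates = {
    MacroAction.SEARCH: [MicroAction(0.0, 30.0), MicroAction(90.0, 30.0)],
    MacroAction.CATALOGUE: [MicroAction(45.0, 60.0)],
    MacroAction.TRACK: [MicroAction(180.0, 75.0)],
}
actions = enumerate_actions(candidates)
```

Keeping the objective explicit in every action is what would let a tree search reason about which objective a pointing serves, rather than only where the sensor points.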
Previous research on sensor tasking has primarily used a single-step approach that maximises immediate rewards and produces optimal yet myopic solutions. The evolution of the problem over time is either not taken into consideration or included in the reward via ad hoc functions. To address long-term optimisation, MCTS dynamically builds a decision tree that considers the evolution of the state and action spaces over multiple time steps and thereby accounts for the future visibility of objects, the availability of sensors, and predicted state uncertainties. The method has been successfully applied to sensor tasking but, until now, not to a multi-objective space. The novelty of this research lies in its tailored focus on multi-objective sensor tasking. Unlike previous multi-objective sensor tasking studies, where the optimisation process relies on scalarised multi-objective returns, this work retains the return earned in each objective as a utility vector. The MCTS is hence driven by the utility vectors that dominate those of other pointing solutions. The decision tree is built over multiple random trials, tuned to favour more promising solution paths. Based on the number of solutions that a given pointing strategy outperforms, the MCTS continuously learns which strategy approaches the non-dominated, Pareto-optimal solution front, enabling the final selection to be near Pareto-optimal.
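The idea of comparing utility vectors by dominance, and scoring a strategy by how many others it outperforms, can be illustrated with a minimal sketch. It assumes that every objective is to be maximised and uses the standard definition of Pareto dominance. The function names and the example utility values are hypothetical, and the paper's actual tree policy may differ.

```python
from typing import Dict, List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u Pareto-dominates v (maximisation): u is no worse in
    every objective and strictly better in at least one."""
    return (all(a >= b for a, b in zip(u, v))
            and any(a > b for a, b in zip(u, v)))


def dominance_counts(utilities: Dict[str, Sequence[float]]) -> Dict[str, int]:
    """For each strategy, count how many other strategies it dominates."""
    return {
        name: sum(dominates(u, v)
                  for other, v in utilities.items() if other != name)
        for name, u in utilities.items()
    }


def pareto_front(utilities: Dict[str, Sequence[float]]) -> List[str]:
    """Names of strategies that no other strategy dominates."""
    return [
        name for name, u in utilities.items()
        if not any(dominates(v, u)
                   for other, v in utilities.items() if other != name)
    ]


# Made-up utility vectors ordered as (search, catalogue, track).
utilities = {
    "A": (3.0, 1.0, 2.0),
    "B": (2.0, 1.0, 1.0),
    "C": (1.0, 3.0, 0.0),
    "D": (1.0, 1.0, 0.0),
}
counts = dominance_counts(utilities)
front = pareto_front(utilities)
```

In this example, strategies A and C are mutually non-dominated because each is better in a different objective. A scalarised return would force a ranking between them, whereas the vector comparison keeps both on the front.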
The paper demonstrates the efficacy of the approach through the results of a simulated observation campaign using the novel multi-objective single-sensor MCTS. The findings showcase the potential of this approach for optimising space surveillance operations across various objectives.
