From Vessel Trajectories to Safety-Critical Encounter Scenarios: A Generative AI Framework for Autonomous Ship Digital Testing
Digital testing has emerged as a key paradigm for the development and verification of autonomous maritime navigation systems, yet the availability of realistic and diverse safety-critical encounter scenarios remains limited. Existing approaches eithe…
Authors: Sijin Sun, Liangbin Zhao, Ming Deng
Sijin Sun (1), Liangbin Zhao (1, *), Ming Deng (2), Xiuju Fu (1)
(*) Corresponding author
(1) Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR IHPC)
(2) Shanghai University

Abstract: Digital testing has emerged as a key paradigm for the development and verification of autonomous maritime navigation systems, yet the availability of realistic and diverse safety-critical encounter scenarios remains limited. Existing approaches either rely on handcrafted templates, which lack realism, or extract cases directly from historical data, which cannot systematically expand rare high-risk situations. This paper proposes a data-driven framework that converts large-scale Automatic Identification System (AIS) trajectories into structured safety-critical encounter scenarios. The framework combines generative trajectory modeling with automated encounter pairing and temporal parameterization to enable scalable scenario construction while preserving real traffic characteristics. To enhance trajectory realism and robustness under noisy AIS observations, a multi-scale temporal variational autoencoder is introduced to capture vessel motion dynamics across different temporal resolutions. Experiments on real-world maritime traffic flows demonstrate that the proposed method improves trajectory fidelity and smoothness, maintains statistical consistency with observed data, and enables the generation of diverse safety-critical encounter scenarios beyond those directly recorded. The resulting framework provides a practical pathway for building scenario libraries to support digital testing, benchmarking, and safety assessment of autonomous navigation and intelligent maritime traffic management systems. Code is available at https://anonymous.4open.science/r/traj-gen-anonymous-review.
Index Terms: Autonomous Ships, Maritime Traffic, Trajectory Generation, Variational Autoencoder, Safety-Critical Scenarios, Simulation-Based Digital Testing

I. INTRODUCTION

Recent advances in intelligent transportation and autonomous navigation systems have led to a growing reliance on digital testing and virtual validation as a necessary complement to field testing, particularly in domains such as maritime navigation where large-scale real-world experiments are costly and difficult to control. In specific maritime traffic environments, vessel behavior is influenced by navigational constraints, operational practices, and environmental uncertainties. Consequently, effective simulation-based validation depends on the availability of safety-critical encounter scenarios that are both realistic and sufficiently diverse. However, such scenarios occur only sparsely in real-world traffic data and are difficult to generate in a systematic manner. This gap between available trajectory data and the scenario requirements of digital testing remains a key obstacle to systematic validation of autonomous maritime systems.

To address the need for scenario-based validation, existing studies have explored several approaches to constructing maritime testing scenarios. Knowledge-driven methods design encounter cases based on navigation rules and expert-defined configurations, ensuring interpretability and regulatory consistency but often lacking diversity and realism. Data-statistical approaches extract representative encounters from historical AIS datasets, improving realism but remaining constrained by data availability and the sparsity of rare high-risk events. More recently, algorithmic and generative techniques have been proposed to expand the scenario space.
However, many of these approaches primarily rely on parameter variation or simplified encounter formulations, and only limited efforts have been made to integrate generative modeling with the statistical structure of real traffic flows.

As a result, despite substantial progress in scenario design, current approaches still struggle to simultaneously achieve realism, diversity, and systematic construction of safety-critical encounter scenarios from real-world traffic data. This limitation highlights the need for a data-driven framework that can leverage large-scale trajectory records while enabling structured and scalable generation of safety-critical scenarios suitable for simulation-based digital testing.

In this work, we propose a data-driven framework that converts large-scale vessel trajectory data from the Automatic Identification System (AIS) into structured safety-critical encounter scenarios for digital testing. The framework integrates trajectory synthesis and scenario construction into a unified pipeline. First, representative vessel motion patterns within designated traffic flows are learned and synthesized through a generative modeling approach. To enhance trajectory realism and robustness under noisy AIS data, a multi-scale temporal variational autoencoder (VAE) architecture is introduced, which captures vessel motion dynamics across different temporal scales and preserves the statistical structure of real traffic flows. The generated trajectories are then systematically paired based on spatial and temporal interaction conditions, and each identified interaction is transformed into a standardized scenario representation through temporal parameterization around the closest-encounter instant. By linking data-driven trajectory generation with structured encounter construction, the proposed framework enables scalable generation of realistic and diverse safety-critical maritime testing scenarios.
The main contributions of this work are summarized as follows:

1) A data-driven framework for safety-critical scenario construction: We propose a unified pipeline that converts large-scale vessel trajectory data from the Automatic Identification System (AIS) into structured safety-critical encounter scenarios, enabling systematic generation of simulation-ready test cases for maritime autonomous navigation systems.
2) A robust generative trajectory synthesis approach: We develop a multi-scale temporal variational autoencoder (VAE) architecture to model vessel motion patterns from AIS data and integrate trajectory smoothing to enhance realism.
3) A structured encounter construction mechanism: We design an automated trajectory pairing and temporal parameterization strategy that transforms synthesized trajectories into standardized safety-critical encounter scenarios suitable for digital testing.

II. LITERATURE REVIEW

A. Safety-Critical Vessel Encounter Scenario Generation

The construction of safety-critical encounter scenarios has attracted increasing attention in recent years, particularly for simulation-based testing of autonomous navigation and decision-support systems. In the domain of maritime autonomous vessels, generation of safety-critical scenarios has also been explored, though less extensively. Existing approaches to scenario generation can be broadly categorized into knowledge-driven, data-statistical, and generative methods, each addressing different aspects of the problem.

Knowledge-driven approaches construct encounter scenarios based on navigation rules, expert knowledge, or predefined geometric configurations. Standardized encounter templates derived from COLREGs, such as the Imazu problem [1], or classical cases including head-on, crossing, and overtaking situations, are widely used to evaluate collision avoidance systems in a reproducible and interpretable manner [2].
Some studies further organize scenario sets according to regulatory coverage or encounter taxonomies to enable systematic testing across rule-defined contexts [3]. Simulation environments built upon such templates may also incorporate uncertainty models and environmental disturbances to provide configurable evaluation suites [4]. While these approaches ensure interpretability and regulatory alignment, their reliance on handcrafted configurations limits their ability to capture the diversity and statistical structure of real maritime traffic patterns.

Data-statistical approaches construct encounter scenarios directly from historical AIS records by identifying ship interactions and analyzing the statistical distributions of key navigation parameters. These statistics are then used to reconstruct representative real-world encounters or guide the sampling of synthetic scenarios consistent with observed traffic patterns [5], [6]. Some works further incorporate measures such as encounter complexity or importance to support scenario selection for algorithm testing [7]. While grounding scenario construction in real traffic data improves realism, such methods remain constrained by the availability and distribution of historical observations. Safety-critical encounters are inherently rare, making it difficult for purely data-driven extraction approaches to systematically generate diverse high-risk scenarios beyond those already recorded.

Algorithmic and generative approaches attempt to expand the scenario space beyond historical observations by synthesizing encounter configurations through sampling, optimization, or learning-based generation. Some studies generate candidate traffic situations by systematically sampling parameter spaces and filtering hazardous cases using predefined risk metrics or clustering strategies [8].
Other works provide toolchains for producing structured encounter sets that enable controlled variation of traffic configurations for systematic testing [9]. More recently, learning-based methods such as reinforcement learning have been explored to adaptively generate high-risk scenarios through interaction with the tested system, improving the efficiency of discovering safety-critical cases [10]. While these approaches improve coverage and controllability of scenario generation, many rely on abstract encounter parameterizations or simplified motion assumptions, making it challenging to preserve the statistical realism and operational patterns observed in real vessel trajectories.

Overall, existing approaches address complementary aspects of scenario generation, including interpretability, realism, and controllability. However, a systematic mechanism that can jointly preserve real traffic characteristics while enabling scalable generation of diverse safety-critical encounter scenarios remains limited. Motivated by these limitations, the proposed framework aims to preserve the realism of historical traffic data while expanding the availability of safety-critical scenarios through generative trajectory modeling.

B. Vessel Trajectory Generation

A scenario can be interpreted as a temporal sequence of system states, which may be represented using different modalities depending on the level of abstraction required for testing and analysis [11]. For many traffic applications, trajectories provide a compact and effective representation of object motion, making them widely used in scenario modeling and simulation studies. In the maritime domain, trajectory data derived from AIS observations have been extensively used for traffic analysis, behavior modeling, and collision risk assessment.
Existing research has primarily focused on trajectory prediction, where future vessel positions are estimated from historical motion patterns or environmental conditions [12]. While prediction methods are valuable for navigation support, they do not directly address the need to generate diverse motion patterns for systematic testing purposes.

Fig. 1: Overview of the trajectory generation and scenario construction pipeline including Data Preprocessing, ConfluxVAE Structure, and Encounter Pairing and Safety-Critical Scenario Construction: synthetic trajectories are paired and filtered to form realistic vessel encounters for safety analysis.

To synthesize new trajectories, generative models have recently attracted attention. Recurrent neural networks such as LSTM have been applied to learn motion dynamics from AIS data and generate plausible vessel movements [13]. More advanced generative approaches, including generative adversarial networks and variational autoencoders, have also been explored to capture the statistical structure of trajectory datasets and enable data-driven synthesis of new motion sequences [14], [15]. These methods provide promising tools for constructing trajectory sets that remain consistent with real traffic behavior while allowing controlled expansion of the motion space.

Motivated by these developments, the present work adopts a variational autoencoder-based generative framework to synthesize vessel trajectories, enabling the subsequent construction of safety-critical encounter scenarios while preserving the statistical characteristics of real AIS data.

III. METHODOLOGY

A. Overall Framework

The proposed framework consists of two sequential modules designed to construct safety-critical vessel encounter scenarios from historical AIS data, as illustrated in Figure 1.

Historical vessel trajectories within a specified traffic flow are extracted and processed to generate representative navigation tracks.
In addition to trajectory reconstruction and quality control, a data-driven learning process is employed to capture the statistical characteristics of vessel motion within the traffic flow. This module ensures that the synthesized trajectories preserve the spatial, kinematic, and operational patterns observed in real vessel movements.

The generated trajectories are then systematically paired to identify combinations satisfying encounter conditions defined by spatial proximity and relative motion consistency. Each detected interaction is then transformed into a standardized scenario through temporal parameterization centered on the closest-encounter instant, with configurable pre- and post-encounter time windows.

B. Data-Driven Trajectory Generation from Historical AIS

1) Data Preprocessing: The data preprocessing workflow mainly comprises route-of-interest filtering, timestamp interval resampling, abnormal data filtering, and time window standardization. Erroneous records and missing values are filtered out, and interpolation is applied where needed. The ship trajectory is denoted as a series of timestamped points, i.e., $Traj = \{P_1, P_2, \ldots, P_i, \ldots, P_N\}$, $P_i = (\mathrm{lon}_i, \mathrm{lat}_i)$, $i = 1, \ldots, N$, where $P_i$ is the $i$-th data point comprising longitude and latitude and $N$ is the number of points. Among the recorded vessel routes in a designated area, two overlapping routes (Route 1 and Route 2) are identified under start and end area constraints. $Route_j = \{Traj_1, Traj_2, \ldots, Traj_k, \ldots, Traj_n\}$, $j = 1, 2$, denotes the two route datasets with $n$ trajectories each, which are respectively used as training sets for subsequent trajectory generation.

2) Conflux EMA for Sequence Modeling: Exponential moving average (EMA) is widely used for sequence smoothing and long-range dependency modeling. Multi-headed EMA maintains several EMA states at different temporal scales to fuse short-, medium-, and long-term information.
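The idea of keeping several EMA states with different memory lengths can be illustrated with a minimal sketch. It assumes fixed scalar smoothing factors and a 1-D input; the function name `multi_head_ema` and the factor values are hypothetical, and a learned multi-headed EMA layer would instead use trainable decay parameters on feature channels.

```python
def multi_head_ema(seq, alphas):
    """Run one EMA per smoothing factor over a 1-D sequence.

    seq    : list of floats (e.g. one coordinate of a trajectory)
    alphas : smoothing factors in (0, 1]; larger means shorter memory
    Returns a list with one smoothed sequence per head.
    """
    heads = []
    for alpha in alphas:
        state = seq[0]            # initialise the state with the first sample
        smoothed = [state]
        for value in seq[1:]:
            # h_t = alpha * x_t + (1 - alpha) * h_{t-1}
            state = alpha * value + (1.0 - alpha) * state
            smoothed.append(state)
        heads.append(smoothed)
    return heads


# A step signal: short-memory heads react quickly, long-memory heads lag.
signal = [0.0] * 3 + [1.0] * 5
short, medium, long_ = multi_head_ema(signal, alphas=[0.9, 0.5, 0.1])
```

On the step signal, the head with the largest factor follows the jump almost immediately, while the head with the smallest factor changes slowly, which is the short-, medium-, and long-term behaviour the text refers to.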
We propose Conflux EMA, a split-flow module with three parallel multi-headed EMA branches operating at distinct representation scales. The three branches correspond to natural temporal levels in vessel trajectories: (1) local dynamics (step-wise displacement and instantaneous velocity), (2) segment-level maneuvers (turning and course adjustment), and (3) global directional trends. Two branches would merge segment and global patterns, while more than three introduce redundant scales without additional coverage.

Given $x \in \mathbb{R}^{B \times L \times d}$, the three branches produce $y_s$, $y_m$, and $y_l$. The small branch projects $d = 64$ to $d_s = 32$ with $H_s = 4$ heads for fine-grained features; the medium branch keeps $d_m = 64$ with $H_m = 8$ heads; the large branch expands to $d_l = 128$ with $H_l = 16$ heads for global context. A learnable softmax gate $w$ fuses them as a residual:

$$w = \mathrm{softmax}(w_0, w_1, w_2), \qquad \hat{x} = x + w_0 y_s + w_1 y_m + w_2 y_l. \tag{1}$$

Conflux EMA is embedded in a convolutional block (CEConv): a 1D convolution is applied first, followed by the Conflux EMA residual along the sequence dimension. The same CEConv is used in both encoder and decoder.

Fig. 2: Conflux EMA block (CEConv): three parallel multi-headed EMA branches at different scales (small, medium, large) are combined via a learnable softmax gate and added to the input as a residual.

Compared to Transformer self-attention, multi-headed EMA offers two advantages for noisy AIS trajectories: the exponential decay imposes a temporal locality prior aligned with vessel kinematics, and EMA acts as a low-pass filter that suppresses high-frequency GPS and AIS noise rather than amplifying anomalous frames.

3) ConfluxVAE for Trajectory Generation: A VAE consists of an encoder $E$ and a decoder $D$. The encoder maps input data to a latent space; the decoder reconstructs data from the latent code.
The generative model is $p(x, z) = p_\theta(z)\, p_\theta(x \mid z)$, where $p_\theta(z)$ is the prior over the latent variable $z$ and $p_\theta(x \mid z)$ is the likelihood modeled by the decoder. The training objective is to maximize the evidence lower bound (ELBO) [15]:

$$\mathcal{L}(\theta, \phi; x_i) = -D_{\mathrm{KL}}\big(q_\phi(z \mid x_i) \,\|\, p_\theta(z)\big) + \mathbb{E}_{q_\phi(z \mid x_i)}\big[\log p_\theta(x_i \mid z)\big], \tag{2}$$

where $\phi$ and $\theta$ denote the variational and generative parameters. The encoder outputs the parameters of a Gaussian over the latent space (mean $\mu$ and log-variance $\log \sigma^2$); $z$ is obtained by the reparameterization $z = \mu + \epsilon \cdot \exp(0.5 \log \sigma^2)$ with $\epsilon \sim \mathcal{N}(0, I)$.

ConfluxVAE integrates Conflux EMA into a VAE for trajectory generation. The encoder processes input trajectories through convolutional feature extraction and refinement layers, followed by a CEConv block for multi-scale temporal modeling. The extracted features are flattened and passed through fully connected layers that output the latent parameters $\mu$ and $\log \sigma^2$. After reparameterization to obtain $z$, the decoder reconstructs trajectories through a symmetric inverse process: fully connected layers reshape the latent code, which then passes through a CEConv block and transposed convolutional layers, with a sigmoid activation producing the final output.

Compared to a standard VAE that uses a single reconstruction-KL objective, the training loss used here is a $\beta$-weighted combination of reconstruction and KL terms. Let $\hat{x}$ denote the reconstruction. The reconstruction loss is the mean squared error $L_{\mathrm{recon}} = \mathrm{MSE}(x, \hat{x})$. The KL divergence term (Gaussian prior and posterior) is $L_{\mathrm{KL}} = -\frac{1}{2} \sum_{j=1}^{J} \big(1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2\big)$, where $J$ is the latent dimension. The total loss is $L = L_{\mathrm{recon}} + \beta L_{\mathrm{KL}}$, where $\beta$ balances latent regularization and reconstruction quality.

4) Data Postprocessing: To improve the quality of the generated trajectories, a Savitzky–Golay filter is applied to denoise the model output.
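A minimal sketch of this smoothing step is given here. The window length and polynomial order are not stated in the text, so the sketch assumes a 5-point window with a quadratic fit, for which the filter weights are the well-known values (-3, 12, 17, 12, -3)/35. The function names `savgol5` and `smooth_track` are hypothetical, and leaving the first and last two samples untouched is a simplification.

```python
# Quadratic/cubic Savitzky-Golay smoothing weights for a 5-point window.
SG5 = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG5_NORM = 35.0


def savgol5(values):
    """Smooth a 1-D sequence with a 5-point Savitzky-Golay filter.

    The first and last two samples are returned unchanged (a simplification;
    library implementations usually fit a polynomial at the edges instead).
    """
    out = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        out[i] = sum(w * v for w, v in zip(SG5, window)) / SG5_NORM
    return out


def smooth_track(track):
    """Apply the filter independently to the two coordinates of a track.

    track : list of (lon, lat) tuples at a fixed sampling interval.
    """
    lons = savgol5([p[0] for p in track])
    lats = savgol5([p[1] for p in track])
    return list(zip(lons, lats))
```

Because the weights come from a local quadratic fit, a track that already follows a quadratic curve passes through unchanged, while an isolated spike is attenuated.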
The filter fits adjacent data points with a low-degree polynomial via linear least squares, smoothing the trajectory while preserving its trend.

5) Evaluation Index: We use four metrics to evaluate trajectory generation quality. MAE (Mean Absolute Error) and MSE (Mean Squared Error) measure point-wise reconstruction error: $\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |x_i - \hat{x}_i|$ and $\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{x}_i)^2$, where $x_i$ and $\hat{x}_i$ denote corresponding points in the original and generated trajectories respectively, and $N$ is the total number of points. DM (Distance Metric) measures the distributional distance between the original and generated trajectory sets using the Fréchet distance between their statistical representations. MMD (Maximum Mean Discrepancy) quantifies distributional discrepancy via a kernel $k$ (exponentiated quadratic [16]). For reference set $X$ and generated set $Y$,

$$\mathrm{MMD}^2(X, Y) = \frac{1}{m(m-1)} \sum_{i \neq j} k(x_i, x_j) + \frac{1}{n(n-1)} \sum_{i \neq j} k(y_i, y_j) - \frac{2}{mn} \sum_{i,j} k(x_i, y_j), \tag{3}$$

where $m$ and $n$ are the sizes of sets $X$ and $Y$ respectively.

C. Encounter Pairing and Safety-Critical Scenario Construction

1) Problem Formulation: Let $\mathcal{T}^{(1)} = \{\tau_i^{(1)}\}_{i=1}^{N_1}$ and $\mathcal{T}^{(2)} = \{\tau_j^{(2)}\}_{j=1}^{N_2}$ denote two trajectory pools generated for two designated traffic flows (e.g., inbound vs. outbound). Each trajectory $\tau$ is represented as a temporally ordered sequence of AIS states

$$\tau = \{s_k\}_{k=1}^{L}, \qquad s_k = (t_k, \phi_k, \lambda_k), \tag{4}$$

where $t_k$ is the timestamp, and $(\phi_k, \lambda_k)$ are latitude and longitude at sample $k$.

The objective is to automatically construct a set of interaction scenarios by pairing trajectories across flows and extracting those that satisfy encounter conditions within a predefined encounter region. Let $\mathcal{R}$ be a polygonal region of interest (ROI) defined by vertices $\{(\phi_m^{(r)}, \lambda_m^{(r)})\}_{m=1}^{M}$.
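Testing whether a position lies inside a polygonal region of this kind is commonly done with ray casting. The sketch below is one possible implementation, not necessarily the one used by the framework; it treats latitude and longitude as planar coordinates, which is a reasonable assumption only for a small region, and the function name `in_region` is hypothetical.

```python
def in_region(lat, lon, polygon):
    """Ray-casting test: is (lat, lon) inside the polygon?

    polygon : list of (lat, lon) vertices in order, first vertex not repeated.
    Coordinates are treated as planar, which is adequate for a small ROI.
    """
    inside = False
    n = len(polygon)
    for m in range(n):
        lat1, lon1 = polygon[m]
        lat2, lon2 = polygon[(m + 1) % n]
        # Does this edge straddle the parallel through the query point?
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which the edge crosses that parallel.
            cross = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross:
                inside = not inside
    return inside


# Illustrative rectangle only; the actual ROI vertices are not given here.
roi = [(1.180, 103.785), (1.180, 103.837), (1.215, 103.837), (1.215, 103.785)]
```

Each crossing of an edge to the east of the query point flips the inside flag, so a point is inside when the number of crossings is odd.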
The output is a scenario set

$$\mathcal{S} = \{\sigma_{ij} \mid \tau_i^{(1)} \in \mathcal{T}^{(1)},\ \tau_j^{(2)} \in \mathcal{T}^{(2)},\ \sigma_{ij} \text{ is safety-critical}\}, \tag{5}$$

where each scenario $\sigma_{ij}$ stores the paired trajectory indices and the corresponding encounter indices $(k^\star, \ell^\star)$ at which a safety-critical interaction is detected.

2) Region-Constrained Encounter Pairing: To avoid spurious pairs from out-of-region evaluation, we restrict evaluation to states within the encounter region $\mathcal{R}$ via the indicator $\mathbb{I}_{\mathcal{R}}(\phi, \lambda) = 1$ if $(\phi, \lambda) \in \mathcal{R}$, else $0$. For a candidate pair $(\tau_i^{(1)}, \tau_j^{(2)})$, we require indices $k, \ell$ with $\mathbb{I}_{\mathcal{R}}(\phi_{i,k}^{(1)}, \lambda_{i,k}^{(1)}) = \mathbb{I}_{\mathcal{R}}(\phi_{j,\ell}^{(2)}, \lambda_{j,\ell}^{(2)}) = 1$. This focuses detection on interaction hotspots and reduces false positives from distant co-existence.

3) Proximity and Motion Based Encounter Identification: For encounter detection, the relative motion of the vessels at states $s_{i,k}^{(1)}$ and $s_{j,\ell}^{(2)}$ is evaluated from consecutive AIS samples. With fixed $\Delta t$, Speed Over Ground (SOG) and Course Over Ground (COG) are estimated as $\mathrm{SOG} \approx d(p_{k-1}, p_k) / \Delta t$ and $\mathrm{COG} \approx \mathrm{bearing}(p_{k-1} \to p_k)$, where $p_k = (\phi_k, \lambda_k)$ and $d$ is the great-circle distance.

Based on these reconstructed motion states, relative motion indicators between vessel pairs are evaluated using standard closest point of approach (CPA) analysis. This enables the interaction assessment to consider not only instantaneous separation but also the motion tendency of the vessels, ensuring that detected events correspond to genuine navigational encounters.

An interaction is considered safety-critical only if both proximity and motion-consistency conditions are satisfied. The vessels must exhibit sufficiently close spatial interaction at some point along the trajectories. Let $D_{\min} = \min_t D_r(t)$ denote the minimum observed separation distance between the vessels over the evaluated interval, where $D_r(t)$ is the instantaneous inter-vessel distance expressed in nautical miles.
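The SOG and COG estimates from consecutive samples described in this subsection can be sketched as follows. This assumes the haversine formula for the great-circle distance, the initial bearing for the course, and a mean Earth radius of 6371 km expressed in nautical miles; the function names are hypothetical.

```python
import math

EARTH_RADIUS_NM = 3440.065  # 6371 km expressed in nautical miles


def great_circle_nm(p, q):
    """Haversine distance in nautical miles between (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def initial_bearing_deg(p, q):
    """Initial bearing from p to q, in degrees clockwise from north."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def sog_cog(prev, curr, dt_s=10.0):
    """Estimate SOG (knots) and COG (degrees) from two consecutive samples."""
    sog = great_circle_nm(prev, curr) / (dt_s / 3600.0)
    return sog, initial_bearing_deg(prev, curr)
```

For example, `sog_cog((1.200, 103.800), (1.2005, 103.800))` describes a vessel that moved 0.0005 degrees of latitude due north within one 10-second step, so the estimated course is 0 degrees.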
The proximity condition requires $D_{\min} \le d_{\min}$, where $d_{\min}$ denotes the minimum-distance threshold, ensuring that the vessels come into near-contact proximity at least once.

Additionally, the relative motion must indicate a converging encounter configuration. This requires that there exists at least one time instant $t$ at which the instantaneous distance and CPA indicators jointly satisfy

$$D_r(t) \le d_{th}, \qquad 0 < \mathrm{TCPA}(t) \le T_{th}, \qquad \mathrm{DCPA}(t) \le d_{cpa}, \tag{6}$$

where $d_{th}$ denotes the proximity threshold, $T_{th}$ denotes the admissible time-to-closest-approach window, and $d_{cpa}$ denotes the admissible closest-approach distance. These constraints ensure that the vessels are not only spatially close but also dynamically converging within a relevant time horizon.

By jointly enforcing the minimum-distance requirement and the motion-consistency constraint, the framework filters out incidental spatial proximity and isolates interaction configurations that are both spatially critical and dynamically meaningful. The resulting encounters therefore represent safety-critical maritime situations suitable for scenario-based safety analysis and testing of intelligent navigation systems.

4) Configurable Safety-Critical Scenario Representation: Given a trajectory pair $(\tau_i^{(1)}, \tau_j^{(2)})$ that satisfies the safety-critical encounter conditions, a standardized scenario representation is further constructed by anchoring the scenario at the closest-interaction instant. Let $t^\star$ denote the time corresponding to the minimum observed inter-vessel separation within the encounter interval, i.e.,

$$t^\star = \arg\min_t D_r(t), \tag{7}$$

where $D_r(t)$ is the instantaneous inter-vessel distance (in nautical miles) evaluated at time $t$.

To enable consistent scenario extraction across different encounter instances, two configurable temporal margins are introduced: $t_{\mathrm{early}}$ and $t_{\mathrm{after}}$. These margins define a scenario window around $t^\star$: $[t^\star - t_{\mathrm{early}},\ t^\star + t_{\mathrm{after}}]$.
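The convergence test of Eq. (6) can be sketched with the standard constant-velocity CPA formulas. The sketch assumes positions already projected onto a local flat plane in nautical miles and velocities in knots (so TCPA is in hours); the function names and any threshold values passed in are hypothetical, since no numerical thresholds are given here.

```python
import math


def cpa(p1, v1, p2, v2):
    """Return (DCPA, TCPA) for two vessels on straight, constant-velocity tracks.

    p1, p2 : (east, north) positions in nautical miles on a local flat plane
    v1, v2 : (east, north) velocities in knots
    TCPA is in hours and is negative when the closest approach lies in the past.
    """
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]      # relative position
    vx, vy = v2[0] - v1[0], v2[1] - v1[1]      # relative velocity
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:                        # no relative motion
        return math.hypot(rx, ry), 0.0
    tcpa = -(rx * vx + ry * vy) / speed_sq
    dcpa = math.hypot(rx + vx * tcpa, ry + vy * tcpa)
    return dcpa, tcpa


def is_converging(p1, v1, p2, v2, d_th, t_th, d_cpa):
    """Check the three conditions of Eq. (6) at a single time instant."""
    distance = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    dcpa, tcpa = cpa(p1, v1, p2, v2)
    return distance <= d_th and 0.0 < tcpa <= t_th and dcpa <= d_cpa
```

The strict inequality on TCPA is what separates converging pairs from vessels that are close but already moving apart: in the latter case TCPA is negative and the pair is rejected even if the current distance is small.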
Accordingly, each vessel trajectory is partitioned into three segments:

1) Pre-encounter segment (nominal tracking path): from the trajectory start time to $t^\star - t_{\mathrm{early}}$. This segment represents the nominal path that the vessel is expected to track prior to entering the hazardous situation.
2) Encounter segment (safety-critical test window): from $t^\star - t_{\mathrm{early}}$ to $t^\star + t_{\mathrm{after}}$. This segment constitutes the core safety-critical interval used to evaluate collision avoidance and maneuvering capability under the encounter configuration.
3) Post-encounter segment (recovery tracking path): from $t^\star + t_{\mathrm{after}}$ to the end of the trajectory. This segment represents the recovery phase in which the vessel is expected to re-plan and continue tracking a nominal route after the hazardous interaction.

The extracted scenario instance is represented as

$$\sigma_{ij} = \big(i,\ j,\ t^\star,\ t_{\mathrm{early}},\ t_{\mathrm{after}}\big), \tag{8}$$

together with the corresponding trajectory clips for both vessels within the three segments. This representation provides a unified interface for downstream simulation-based digital testing, where the pre- and post-encounter segments define nominal reference paths, and the encounter segment defines the evaluation window for collision-avoidance performance.

The specific values of the configuration parameters, including $t_{\mathrm{early}}$ and $t_{\mathrm{after}}$, can be selected according to testing objectives and operational context. In practice, these parameters may be determined based on domain expertise, vessel characteristics, or regulatory requirements. Moreover, performance evaluation criteria for autonomous vessels can be defined differently across the three trajectory segments, enabling flexible assessment of tracking stability, collision-avoidance capability, and post-encounter recovery behavior.

IV. EXPERIMENTS AND RESULTS

A. Vessel Trajectory Generation for Designated Traffic Flows

1) Data Description: The dataset is derived from one year of AIS records of a Singapore seaway, covering the area approximately bounded by longitudes 103.785°E–103.837°E and latitudes 1.180°N–1.215°N. Two opposing traffic flows along the same waterway are selected as Route 1 (northbound, from 1.180°N to 1.212°N) and Route 2 (southbound, from 1.210°N to 1.189°N). The two routes traverse the same geographic corridor in opposite directions, causing their trajectories to spatially overlap and creating potential vessel encounter situations. Figure 3 shows the spatial distribution of both datasets.

Raw trajectories are filtered by geographic start/end bounding-box constraints to isolate each directional flow. The retained trajectories are then resampled to a uniform 10-second interval and missing values are filled by linear interpolation. In each dataset, outliers that deviate significantly from the dominant flow are identified and removed using pairwise distance filtering, yielding a clean, well-structured dataset suitable for comparative evaluation. A fixed time window is then applied to standardize sequence length, yielding 1094 trajectories with 91 steps for Route 1 and 2310 trajectories with 61 steps for Route 2 as the final training inputs.

Fig. 3: Visualization of trajectory datasets for (a) Route 1 and (b) Route 2.

2) Experiment Setup: All experiments are conducted on a workstation running Ubuntu 22.04 with a single NVIDIA GeForce RTX 4080 GPU under PyTorch 2.6. Two models are trained, one for each route dataset. The Route 1 model is trained on 1094 well-structured trajectories with 91 time steps each, serving as the primary comparative evaluation benchmark. The Route 2 model is trained on 2310 more complex trajectories with 61 time steps each, designed for robustness evaluation under challenging data conditions.
Both models use a latent dimension of 100 and a batch size of 64, and are trained for 500 epochs. After training, each model generates 1000 trajectories for evaluation. All trajectories maintain a uniform 10-second interval between consecutive data points.

3) Experimental Results: Comparative evaluation is conducted on both Route 1 and Route 2 datasets against baseline models, where Route 2 additionally serves to demonstrate the robustness of ConfluxVAE under more challenging real-world data conditions.

ConfluxVAE demonstrates stable convergence on the Route 1 dataset, effectively capturing the dominant traffic flow patterns. Overall, the generated trajectories exhibit similar trends and spatial distributions to the original trajectories. Figure 4 visualizes the generated trajectories, demonstrating that the model successfully reproduces the spatial distribution and navigational patterns of real vessel movements while maintaining realistic trajectory smoothness and continuity. However, some unexpected jitters occur in the generated trajectories, particularly at trajectory bends. To improve data quality and smooth the trajectories, the Savitzky–Golay filter is applied for denoising after model generation.

Fig. 4: Vessel trajectories generated by ConfluxVAE.

To evaluate the performance of the generated trajectories, the similarity and error between the original and generated trajectories are measured. Since MAE and MSE are calculated based on pairwise trajectory comparisons, the results are obtained in a 1000 × 1094 matrix for Route 1 (one entry per pair of generated and original trajectories). The mean value of this matrix is taken as the final metric value. Smaller values indicate better performance and higher similarity between trajectories. Table I shows the comparative results using MAE, MSE, DM, and MMD metrics against baseline models.

TABLE I: Comparative evaluation on Route 1 dataset.

| Model | MAE ↓ | MSE ↓ | DM ↓ | MMD ↓ |
|---|---|---|---|---|
| CNN | 0.1658 | 0.0453 | 0.9073 | 0.5163 |
| GAN | 0.0763 | 0.0125 | 0.7560 | 0.5630 |
| LSTM | 0.0615 | 0.0079 | 0.7301 | 0.4937 |
| BiLSTM | 0.0621 | 0.0082 | 0.6807 | 0.4812 |
| VAE | 0.0569 | 0.0053 | 0.7235 | 0.4998 |
| Transformer VAE | 0.0452 | 0.0038† | 0.6287* | 0.5373 |
| Social VAE | 0.0396† | 0.0038† | 0.6706 | 0.3748† |
| ConfluxVAE | 0.0177* | 0.0008* | 0.6315† | 0.3706* |

↓: lower is better. *: best per column; †: second best per column.

To further evaluate model robustness, Table II presents the comparative evaluation on the Route 2 dataset, which contains greater trajectory irregularity and AIS signal variability. While the MMD of every model is higher than on Route 1, ConfluxVAE maintains the best performance across all four metrics, demonstrating its robustness under noisy and complex real-world data conditions. The multi-scale temporal modeling capability of Conflux EMA enables ConfluxVAE to simultaneously capture both local trajectory details and global flow patterns, allowing it to generalize effectively across varying data quality and behavioral diversity.

TABLE II: Comparative evaluation on Route 2 dataset (robustness evaluation).

| Model | MAE ↓ | MSE ↓ | DM ↓ | MMD ↓ |
|---|---|---|---|---|
| CNN | 0.1282 | 0.0291 | 0.9308 | 0.6992 |
| GAN | 0.0860 | 0.0145 | 0.7604 | 0.8202 |
| LSTM | 0.0358 | 0.0036 | 0.6724 | 0.7420 |
| BiLSTM | 0.0361 | 0.0045 | 0.6717† | 0.6767 |
| VAE | 0.0425 | 0.0023 | 0.7512 | 0.7402 |
| Transformer VAE | 0.0290† | 0.0021† | 0.6782 | 0.6816 |
| Social VAE | 0.0392 | 0.0042 | 0.7245 | 0.6452† |
| ConfluxVAE | 0.0217* | 0.0016* | 0.6393* | 0.6314* |

↓: lower is better. *: best per column; †: second best per column.

4) Ablation Study: To validate the contribution of each component in ConfluxVAE, an ablation study is conducted on Route 1 by removing one component at a time.
Four ablation variants are evaluated: without the Conflux EMA module (replaced by a standard convolutional block), without the complete Conflux module, without β-weighted KL divergence, and without the Savitzky–Golay postprocessing filter. Table III reports the results.

Removing the Conflux EMA module leads to a substantial increase in MAE and MSE, confirming that multi-scale temporal feature extraction is critical for trajectory reconstruction accuracy. Removing the entire Conflux module also degrades MAE relative to the full model (0.0351 versus 0.0177) while retaining the second-lowest MSE value (0.0026), indicating that the EMA sub-module is the primary contributor to distribution-level quality. Removing β-weighted KL causes the DM metric to collapse (the 0.0000 entry in Table III, which is not a valid score), showing that the model fails to generate valid trajectory distributions without proper KL regularization. Removing the Savitzky–Golay filter has negligible impact on MAE and MSE but slightly degrades DM and MMD, demonstrating that postprocessing smoothing contributes to distribution-level fidelity. The full ConfluxVAE achieves the best performance on MAE, MSE, and MMD, validating the necessity of each component.

TABLE III: Ablation study of ConfluxVAE on Route 1.
Variant             MAE ↓    MSE ↓    DM ↓     MMD ↓
w/o Conflux EMA     0.0568   0.0052   0.7086   0.4999
w/o Conflux         0.0351   0.0026   0.6958   0.4658
w/o β-KL            0.0609   0.0047   0.0000   0.9996
w/o SG filter       0.0179   0.0008   0.6481   0.3709
ConfluxVAE (full)   0.0177   0.0008   0.6315   0.3706
The DM value for the w/o β-KL variant reflects metric collapse rather than a valid score.

5) Qualitative Case Study: Baseline models each exhibit distinct limitations: LSTM-based models and the standard VAE suffer from mode collapse, with trajectories converging to a single route; CNN and GAN produce chaotic, unrealistic paths; Social-VAE generates fragmented trajectories by over-capturing AIS signal dropouts; and Transformer-VAE shows noticeable jitter at trajectory bends.
ConfluxVAE overcomes these limitations through multi-scale temporal modeling, simultaneously capturing local trajectory details and global traffic flow patterns, yielding trajectories that balance smoothness and diversity while faithfully reproducing maritime traffic flow structure for safety-critical scenario construction.

B. Demonstration of Constructed Safety-Critical Scenario

1) Traffic Flow and Encounter Context: To demonstrate the applicability of the proposed framework, two representative traffic flows within the study area are selected as experimental contexts. As shown in Figure 5a, the study area is located near the precautionary area with the highest density of multi-vessel encounters according to the chart. Among the intersecting traffic flows in this region, the two dominant routes indicated by the thick arrows are selected for analysis, namely the southbound outbound traffic flow and the eastbound traffic that turns northward to enter the port.

A representative real encounter occurring within these flows is illustrated in Figure 5b, where the vessel trajectories demonstrate a typical interaction pattern observed in the historical AIS data, and the southbound vessel has already initiated an avoidance maneuver.

Fig. 5: Traffic Flow and Encounter Context. (a) Study Area and Traffic Flows. (b) An Encounter Example.

2) Scenario Construction from Generated Trajectories: Based on the representative traffic flows identified in the previous section, synthetic trajectories are generated to construct a diverse set of encounter scenarios. The generation process preserves the statistical properties of the original AIS data while allowing controlled variations in vessel motion, timing, and spatial interaction patterns.
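As an illustration of how two generated trajectories might be paired into a structured encounter, the following Python sketch checks a time-aligned pair for a close approach and splits the timeline into pre-encounter, encounter, and post-encounter index ranges. It is a minimal sketch under stated assumptions rather than the framework's implementation: the function names, the local planar coordinates in metres, and the reading of the early and late margins as padding around the interval in which the separation stays within the distance threshold are all hypothetical. The default values (10 s sampling, 0.05 nm, 100 s) mirror the settings reported in this study.

```python
import math

NM_IN_METRES = 1852.0  # one nautical mile in metres


def separation_nm(track_a, track_b):
    """Separation in nautical miles at each shared time step.

    Tracks are lists of (x, y) positions in metres in a local planar frame
    (an assumption), sampled at the same uniform interval.
    """
    steps = min(len(track_a), len(track_b))
    return [
        math.hypot(track_a[i][0] - track_b[i][0], track_a[i][1] - track_b[i][1])
        / NM_IN_METRES
        for i in range(steps)
    ]


def advance_in_time(track, steps):
    """Hypothetical timing variation: the vessel starts `steps` samples ahead."""
    return track[steps:]


def segment_encounter(track_a, track_b, dt=10.0, d_min=0.05,
                      t_early=100.0, t_after=100.0):
    """Split a trajectory pair into pre-, encounter and post-encounter ranges.

    Assumed rule (not taken from the paper): the encounter segment runs from
    t_early seconds before the first sample within d_min to t_after seconds
    after the last such sample. Returns None when the pair never gets that
    close. Ranges are half-open (start, stop) index pairs.
    """
    dist = separation_nm(track_a, track_b)
    close = [i for i, d in enumerate(dist) if d <= d_min]
    if not close:
        return None
    start = max(0, close[0] - round(t_early / dt))
    stop = min(len(dist), close[-1] + round(t_after / dt) + 1)
    cpa_index = min(range(len(dist)), key=dist.__getitem__)
    return {
        "pre": (0, start),
        "encounter": (start, stop),
        "post": (stop, len(dist)),
        "cpa_index": cpa_index,   # sample of closest approach
        "cpa_nm": dist[cpa_index],
    }


# Synthetic crossing: a southbound and an eastbound vessel, both at 5 m/s,
# sampled every 10 s, meeting at the origin at step 40.
southbound = [(0.0, 2000.0 - 50.0 * i) for i in range(81)]
eastbound = [(-2000.0 + 50.0 * i, 0.0) for i in range(81)]

scenario = segment_encounter(southbound, eastbound)
print(scenario)

# Advancing the eastbound vessel by 100 s removes the close approach.
print(segment_encounter(southbound, advance_in_time(eastbound, 10)))
```

In this synthetic crossing the pair meets at step 40, so the encounter segment spans steps 29 to 51. Advancing the eastbound track by ten steps (100 s) raises the minimum separation to roughly 0.19 nm, and the pair is no longer flagged, which shows how a timing variation alone can decide whether a pairing becomes a safety-critical encounter.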
The resulting set of scenarios exhibits significant diversity in terms of relative geometry, encounter timing, and manoeuvring behaviour, as illustrated by the six representative scenarios shown in Figure 6 (additional generated scenarios are available at the attached link). In addition, to systematically represent these generated encounters, each scenario is expressed in a structured form consisting of a pre-encounter segment (solid line), an encounter segment (light dashed line), and a post-encounter segment (thick dashed line).

It is worth noting that the trajectory dataset used for training is dominated by encounters involving relatively small vessels, primarily bunker barges with lengths of around 100 m. Accordingly, the scenario construction parameters adopted in this study are set conservatively, with $d_{\min} = 0.05$ nm and both $t_{\text{early}}$ and $t_{\text{after}}$ fixed at 100 s. It can be observed that the generated trajectories preserve the key characteristics of the original historical tracks while not being exact replicas. At the same time, the interacting trajectories are capable of triggering safety-critical close-encounter situations.

Fig. 6: Constructed Safety-Critical Encounter Scenarios

V. CONCLUSION

This paper presented a data-driven framework for constructing safety-critical vessel encounter scenarios from historical AIS data. By integrating generative trajectory modeling with structured encounter pairing and temporal parameterization, the proposed method enables scalable construction of simulation-ready maritime testing scenarios while preserving the statistical characteristics of real traffic flows.

To enhance trajectory realism and robustness under noisy AIS observations, a multi-scale temporal variational autoencoder architecture was introduced.
Experimental results on real-world traffic flows demonstrated that the proposed ConfluxVAE effectively captures vessel motion patterns, improves trajectory smoothness and distributional fidelity, and maintains robustness across datasets with varying levels of complexity. The generated trajectories were further combined to construct diverse encounter scenarios that retain key characteristics of real historical interactions while enabling the discovery of safety-critical cases beyond those directly observed.

The proposed framework provides a practical pathway for building scenario libraries for simulation-based digital testing, benchmarking, and safety assessment of autonomous navigation systems and decision-support tools in maritime traffic environments. Future work will investigate the extension of the framework to multi-vessel interactions, the incorporation of environmental factors such as wind and current, and integration with large-scale simulation platforms to support systematic validation of intelligent maritime traffic management systems.