Pilot project 3 – Autonomous shipping technology supported by AI

CHALLENGE! To build a model for the automatic detection of small objects at sea and the estimation of wave height and direction of propagation, i.e., sea state estimation.

HOW? Computer Vision utilization

WHY? To provide the information needed to build a deep learning model for autonomous shipping and safe navigation.

FINAL RESULT→ CV models for small object detection and sea state description.

GOALS FOR INNO2MARE PROJECT: To assist in creating the autonomous ship DSS.

Recent Developments and Progress

Personal Watercraft Trajectory Prediction and Classification

CHALLENGE! To build a model that automatically detects potentially reckless and inexperienced personal watercraft drivers, monitors rental boats, and predicts the driving behavior of motorboat operators.

HOW? Detecting trajectory anomalies using inflection point sequences, and using Bayesian, Markov chain, and various machine learning models for forecasting.

WHY? To provide the necessary information to avoid collisions and hazards, especially in crowded or narrow areas during the peak of the tourist season.

FINAL RESULT→ Anomaly detection and forecasting models for small personal vessels.

GOALS FOR INNO2MARE PROJECT: To assist in creating a new speed-limiting algorithm for small personal watercraft operators that could replace the existing proximity-based approach.

Experiments and actions on the pilot project so far:

1. Using inflection points for similar trajectory retrieval

The key purposes of our pilot project are similarity metric design, trajectory classification and clustering, and the adaptation of machine learning approaches for forecasting. Together, these enable a novel approach to ensuring the safety of all maritime traffic participants by aiding navigation decisions for semi-autonomous and, optionally, remotely controlled vehicles.

The project focuses on two specific trajectory-related tasks:

the identification of similar drivers by examining trajectory patterns.
the long-term and short-term prediction of watercraft trajectories.

Gathering real-world training data was essential for all the described models. However, we also filtered the data to enhance the quality of trajectories, thus avoiding interpolation that could be misleading, especially in cases where large gaps in transmission were observed.
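The filtering step can be illustrated with a minimal sketch that cuts a ride wherever the transmission gap is too long, instead of interpolating across it. The record layout, the `split_on_gaps` name, and the `max_gap_s` and `min_points` thresholds are assumptions made for this example, not values from the project.

```python
from typing import List, Tuple

# One fix: (timestamp in seconds, latitude, longitude, speed).
# This layout is an assumption made for the example.
Fix = Tuple[float, float, float, float]


def split_on_gaps(fixes: List[Fix], max_gap_s: float = 10.0,
                  min_points: int = 3) -> List[List[Fix]]:
    """Split a ride into segments wherever transmission paused too long.

    Instead of interpolating across a long gap, the ride is cut there,
    and segments that end up shorter than `min_points` are dropped.
    """
    segments: List[List[Fix]] = []
    current: List[Fix] = []
    for fix in sorted(fixes, key=lambda f: f[0]):
        # Close the running segment when the gap to the previous fix is too long.
        if current and fix[0] - current[-1][0] > max_gap_s:
            if len(current) >= min_points:
                segments.append(current)
            current = []
        current.append(fix)
    if len(current) >= min_points:
        segments.append(current)
    return segments


# Hypothetical ride with a 58 s transmission gap after the third fix.
ride = [
    (0.0, 45.3270, 14.4420, 5.0),
    (1.0, 45.3271, 14.4421, 5.2),
    (2.0, 45.3272, 14.4423, 5.1),
    (60.0, 45.3300, 14.4500, 4.8),
    (61.0, 45.3301, 14.4502, 4.9),
    (62.0, 45.3302, 14.4503, 5.0),
    (63.0, 45.3303, 14.4505, 5.0),
]
segments = split_on_gaps(ride)
print([len(s) for s in segments])  # [3, 4]
```

Each returned segment can then be treated as an independent trajectory, so no positions are invented for the period in which nothing was transmitted.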

Below are the key activities implemented thus far in assembling an anomaly detection model:

Data Collection and Analysis: We merged data for 2171 rides from 14 locations and 19 vehicles in Croatia, Spain, Portugal, Greece, Canada, and the United States of America, recorded in July 2022 and 2023.

Conducting a Human Similar Trajectory Selection Experiment: In a web interface, experts manually selected 5 out of 20 trajectories based on their perceived similarity to a baseline trajectory.

Anomaly Detection: We used DBSCAN and k-Means clustering to divide trajectories into anomalous and non-anomalous groups, based on a similarity measure computed from inflection point sequences.

Longitude, latitude, speed, and timestamp data transmitted from different locations were essential for training a model that can be applied to new data after preprocessing, without additional training. Expert feedback was compared with the algorithm's selection and with the clustering results to validate the generated classification. Clustering algorithms provided a basis for comparison with user selection.
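The clustering idea can be sketched in a few lines of standard-library Python. Everything specific here is an assumption for illustration: an inflection point is taken to be a point where the turning direction flips between left and right, each track is reduced to a single feature (the share of its points that are inflection points), and the `eps` and `min_samples` values are arbitrary. The small `dbscan` function is a plain re-implementation of the standard algorithm so that the example is self-contained; a library implementation such as scikit-learn's `DBSCAN` would normally be used instead.

```python
def inflection_points(track):
    """Indices where the turning direction flips between left and right."""
    indices = []
    previous_sign = 0
    for i in range(1, len(track) - 1):
        (x0, y0), (x1, y1), (x2, y2) = track[i - 1], track[i], track[i + 1]
        # Cross product of consecutive displacement vectors:
        # positive for a left turn, negative for a right turn.
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        sign = (cross > 0) - (cross < 0)
        if sign != 0:
            if previous_sign != 0 and sign != previous_sign:
                indices.append(i)
            previous_sign = sign
    return indices


def dbscan(points, eps, min_samples, dist):
    """Return one label per point; -1 marks noise (treated here as anomalous)."""
    n = len(points)
    neighbours = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                  for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])
    return labels


# Hypothetical tracks in a local planar frame (not project data).
tracks = {
    "arc_left": [(t, t * t) for t in range(10)],
    "arc_right": [(t, -t * t) for t in range(10)],
    "s_curve": [(t, (t - 5) ** 3) for t in range(10)],
    "straight": [(t, t) for t in range(10)],
    "zigzag": [(t, 1 if t % 2 == 0 else -1) for t in range(10)],
}
features = [len(inflection_points(t)) / len(t) for t in tracks.values()]
labels = dbscan(features, eps=0.15, min_samples=3, dist=lambda a, b: abs(a - b))
anomalous = [name for name, label in zip(tracks, labels) if label == -1]
print(anomalous)  # ['zigzag']
```

In this toy setting, the four smooth tracks form one dense cluster and the zigzag track is left as noise, which is the sense in which a density-based method can separate anomalous from non-anomalous rides.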

2. Developing a Bayesian and Markov chain approach to short-term and long-term personal watercraft trajectory forecasting

We developed Bayesian and Markov chain approaches to long- and short-term trajectory forecasting without machine learning, in several variants:

A Bayesian and Markov chain approach using one or two previous states;
A Bayesian and Markov chain approach using one or two previous states and a conditional probability dependent on wave height;
A Bayesian and Markov chain approach using one or two previous states and a conditional probability dependent on wind speed;
A Bayesian and Markov chain approach using one or two previous states and a conditional probability dependent on temperature.

We tested additional meteorological and environmental variables, namely wave height, wind speed, and temperature, to examine their impact on small maritime vessels. The results confirmed that personal watercraft trajectories are not significantly affected by these external conditions, since the vessels themselves create large waves and tourists usually avoid extreme weather. Using an additional previous state reduced the forecasting errors. Performance was nevertheless not satisfactory compared to competing models, especially for longer forecasting times, which motivated the development of machine learning methods.
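The role of one versus two previous states can be illustrated with a small count-based Markov chain. This is a sketch under stated assumptions, not the published method: motion is discretized into three hypothetical states (`L` for a left turn, `S` for straight, `R` for a right turn), conditional probabilities are estimated as relative frequencies, and the training sequence is made up. Conditioning on an environmental variable such as a wave height bin could be sketched by adding that bin to the history key.

```python
from collections import Counter, defaultdict


def fit_transitions(sequences, order):
    """Count how often each state follows each tuple of `order` previous states."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[tuple(seq[i - order:i])][seq[i]] += 1
    return counts


def next_state_probabilities(counts, history):
    """Estimate P(next state | history) as relative frequencies."""
    seen = counts.get(tuple(history))
    if not seen:
        return {}
    total = sum(seen.values())
    return {state: n / total for state, n in seen.items()}


def forecast(counts, history, steps):
    """Chain most-probable predictions, feeding each one back as history."""
    history = list(history)
    order = len(history)
    predicted = []
    for _ in range(steps):
        probs = next_state_probabilities(counts, history[-order:])
        if not probs:
            break  # history never observed in training
        best = max(probs, key=probs.get)
        predicted.append(best)
        history.append(best)
    return predicted


# Hypothetical discretized ride: mostly "two straight steps, then a turn".
rides = [list("SSLSSLSSRSSL")]

first_order = fit_transitions(rides, order=1)
second_order = fit_transitions(rides, order=2)

print(next_state_probabilities(first_order, ["S"]))
# {'S': 0.5, 'L': 0.375, 'R': 0.125}
print(forecast(first_order, ["S"], steps=3))        # ['S', 'S', 'S']
print(forecast(second_order, ["S", "S"], steps=3))  # ['L', 'S', 'S']
```

With a single previous state, the chained forecast keeps predicting `S`, while two previous states recover the turn that follows two straight steps in this toy sequence. Because each prediction is fed back as history, an early mistake propagates through the rest of the chain.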

3. Developing a neural network for personal watercraft trajectory forecasting

We adapted four groups of models from other domains to personal watercraft trajectory forecasting:

Recurrent neural network (RNN) models with simple RNN, long short-term memory (LSTM), or gated recurrent unit (GRU) layers in four architectures – originally used for forecasting the trajectories of automobiles on highways;
A GRU attention model architecture with four experiment settings – originally used for sequence-to-sequence translation, adapted to process numbers;
LSTM bidirectional and convolutional models for peptide self-assembly – adapted for prediction instead of classification, processing trajectories instead of sequential properties and aggregation propensity;
The Unified Time Series Model (UniTS) – developed for use on diverse time-series data and tasks without retraining, and applied to our data in a zero-shot setting.

Different architectures with various hyperparameter settings were examined on the testing data. Similar methods are well established for other vehicles, including land vehicles such as automobiles and larger maritime vessels that follow predetermined routes. We evaluated the models based on root mean square error (RMSE), to enable comparison with relevant literature sources that inspired their development. Execution time was monitored to determine the practical utility in real-world maritime environments. We identified the UniTS model as the best-performing model. We are presently adapting the established method to produce a model that can be embedded on a personal watercraft.
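The two evaluation criteria, RMSE and execution time, can be sketched as follows. The RMSE here is taken over the Euclidean position error of each forecast point, which is one common convention and an assumption about the exact definition used. The `repeat_last_step` forecaster is only a hypothetical placeholder standing in for a trained model, and the coordinates are made up.

```python
import math
import time


def rmse(predicted, actual):
    """Root mean square error over 2D positions (Euclidean error per point)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equally long")
    squared = [(px - ax) ** 2 + (py - ay) ** 2
               for (px, py), (ax, ay) in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def repeat_last_step(history, steps):
    """Placeholder forecaster: keep repeating the last observed displacement."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    dx, dy = x1 - x0, y1 - y0
    return [(x1 + dx * k, y1 + dy * k) for k in range(1, steps + 1)]


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical positions in a local planar frame, in metres.
history = [(0.0, 0.0), (1.0, 0.0)]
actual_future = [(2.0, 0.0), (3.0, 1.0), (4.0, 2.0)]

predicted, elapsed = timed(repeat_last_step, history, len(actual_future))
print(f"RMSE: {rmse(predicted, actual_future):.3f} m, time: {elapsed:.6f} s")
```

Reporting both numbers side by side makes it possible to weigh forecasting accuracy against the cost of running a model on embedded hardware.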

June 2023

Progress 30.6.2023

Progress file

December 2023

Progress 31.12.2023

Progress file

June 2024

Progress 30.6.2024

Progress file

December 2024

Progress 31.12.2024

June 2025

Progress 30.6.2025

Lessons learnt and implementation strategy

Publications

Estimation of sea state parameters from ship motion responses using attention-based neural networks

Denis Selimović, Franko Hržić, Jasna Prpić-Oršić, Jonatan Lerga, Estimation of sea state parameters from ship motion responses using attention-based neural networks, Ocean Engineering, Volume 281, 2023.
https://arxiv.org/abs/2301.08949

Abstract: On-site estimation of sea state parameters is crucial for ship navigation. Extensive research has been conducted on model-based estimation utilizing ship motion responses. Model-free approaches based on machine learning (ML) have recently gained popularity, and estimation from time-series of ship motion responses using deep learning (DL) methods has given promising results. In this study, we apply the novel, attention-based neural network (AT-NN) for estimating wave height, zero-crossing period, and relative wave direction from raw time-series data of ship pitch, heave, and roll. Despite reduced input data, it has been demonstrated that the proposed approaches by modified state-of-the-art techniques (based on convolutional neural networks (CNN) for regression, multivariate long short-term memory CNN, and sliding puzzle neural network) improved estimation MSE, MAE, and NSE by up to 86%, 66%, and 56%, respectively, compared to the best performing original methods for all sea state parameters. Furthermore, the proposed technique based on AT-NN outperformed all tested methods (original and enhanced), improving estimation MSE by 94%, MAE by 74%, and NSE by 80% when considering all sea state parameters. Finally, we proposed a novel approach for interpreting the uncertainty estimation of neural network outputs based on the Monte-Carlo dropout method to enhance the model’s trustworthiness.

Keywords: Ship motions; Sea state estimation; Deep learning; Attention neural network; Uncertainty estimation

Application of raycast method for person geolocalization and distance determination using UAV images in Real-World land search and rescue scenarios

Goran Paulin, Sasa Sambolek, Marina Ivasic-Kos, Application of raycast method for person geolocalization and distance determination using UAV images in Real-World land search and rescue scenarios, Expert Systems with Applications, Volume 237, Part A, 1 March 2024, 121495.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4450690

Abstract: People enjoy spending time in the wilderness for numerous reasons. However, they occasionally get lost or injured, and their survival depends on being efficiently found and rescued in the shortest possible time. A search and rescue operation (SAR) is launched after the accident is reported, and all possible resources are activated. The inclusion of drones in SAR operations has enabled the use of computer vision methods to detect persons in aerial imagery automatically. When searching by drone, preference is given to oblique photographs that cover a larger area within a single image, reducing the search time. Unlike vertical photographs, oblique photographs include a significant scale change, making it challenging to locate a person in the real world and determine their distance from the drone. In order to solve this problem, encouraged by our previous successful simulations, we explored the possibility of applying the raycast method for person geolocalization and distance determination for use in real-world scenarios. In this paper, we propose a system able to precisely geolocate persons automatically detected in offline processed images recorded during the SAR mission. After a series of experiments on terrains of different configurations and complexity, using a custom-made 3D terrain generator and raycaster, along with a deep neural network-based person detector trained on our custom dataset, we defined a method for geolocating detected person based on raycast, which allows using low-cost commercial drones with a monocular camera and no Real-Time Kinematic module while enabling laser rangefinder emulation during offline image analysis. Our person geolocating method overcomes the problems faced by previous methods and, using a single flight sequence with only 4 consecutive detections, significantly outperforms the previous best results, with reliability of 42.85% (geolocating error of 0.7 m on recording from a 30 m height). Also, a short time of only 247 s enables offline processing of data recorded during a 21-minute drone flight covering approximately an area of 10 ha, proving that the proposed method can be effectively used in actual SAR missions. We also proposed a new evaluation metric (ErrDist) for person geolocalization and provided recommendations for using the proposed system for person detection and geolocation in real-world scenarios.

Keywords: Raycasting; Drone imagery; Object detection; YOLOv4; Object geolocalization; Distance determination; Search and rescue missions

Detection of motor imagery based on short-term entropy of time-frequency representations

Luka Batistić, Jonatan Lerga, Isidora Stanković, Detection of motor imagery based on short-term entropy of time-frequency representations, BioMedical Engineering OnLine, Volume 22, Article number 41, 2023.

https://doi.org/10.1186/s12938-023-01102-1

Abstract:

Motor imagery is a cognitive process of imagining a performance of a motor task without employing the actual movement of muscles. It is often used in rehabilitation and utilized in assistive technologies to control a brain–computer interface (BCI). This paper provides a comparison of different time–frequency representations (TFR) and their Rényi and Shannon entropies for sensorimotor rhythm (SMR) based motor imagery control signals in electroencephalographic (EEG) data. The motor imagery task was guided by visual guidance, visual and vibrotactile (somatosensory) guidance or visual cue only.

When using TFR-based entropy features as an input for classification of different interaction intentions, higher accuracies were achieved (up to 99.87%) in comparison to regular time-series amplitude features (for which accuracy was up to 85.91%), which is an increase when compared to existing methods. In particular, the highest accuracy was achieved for the classification of the motor imagery versus the baseline (rest state) when using Shannon entropy with Reassigned Pseudo Wigner–Ville time–frequency representation.

Our findings suggest that the quantity of useful classifiable motor imagery information (entropy output) changes during the period of motor imagery in comparison to baseline period; as a result, there is an increase in the accuracy and F1 score of classification when using entropy features in comparison to the accuracy and the F1 of classification when using amplitude features, hence, it is manifested as an improvement of the ability to detect motor imagery.

Keywords: Brain–computer interface; Electroencephalography; Information entropy; Motor imagery; Movement detection; Time–frequency representations

Evaluating YOLOV5, YOLOV6, YOLOV7, and YOLOV8 in Underwater Environment: Is There Real Improvement?

Boris Gašparović, Goran Mauša, Josip Rukavina, Jonatan Lerga, Evaluating YOLOV5, YOLOV6, YOLOV7, and YOLOV8 in Underwater Environment: Is There Real Improvement?, 2023 8th International Conference on Smart and Sustainable Technologies (SpliTech), 2023.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5156721

Abstract:

This paper compares several new implementations of the YOLO (You Only Look Once) object detection algorithms in harsh underwater environments. Using a dataset collected by a remotely operated vehicle (ROV), we evaluated the performance of YOLOv5, YOLOv6, YOLOv7, and YOLOv8 in detecting objects in challenging underwater conditions. We aimed to determine whether newer YOLO versions are superior to older ones and how much, in terms of object detection performance, for our underwater pipeline dataset. According to our findings, YOLOv5 achieved the highest mean Average Precision (mAP) score, followed by YOLOv7 and YOLOv6. When examining the precision-recall curves, YOLOv5 and YOLOv7 displayed the highest precision and recall values, respectively. Our comparison of the obtained results to those of our previous work using YOLOv4 demonstrates that each version of YOLO detectors provides significant improvement.

Author Keywords: object detection, yolov5, yolov6, yolov7, yolov8, comparison

A computer vision approach to estimate the localized sea state

Aleksandar Vorkapic, Miran Pobar, Marina Ivasic-Kos, A computer vision approach to estimate the localized sea state, Ocean Engineering, Volume 309, Part 1, 1 October 2024, 118318.

https://arxiv.org/abs/2407.03755

Abstract

This research presents a novel application of computer vision (CV) and deep learning methods for real-time sea state recognition, aiming to contribute to improving the operational safety and energy efficiency of seagoing vessels, key factors in meeting the legislative carbon reduction targets. Our work focuses on utilizing sea images in operational envelopes captured by a single stationary camera mounted on the ship bridge. The collected images are used to train a deep learning model to automatically recognize the state of the sea based on the Beaufort scale. To recognize the sea state, we used 4 state-of-the-art deep neural networks with different characteristics that proved useful in various computer vision tasks: Resnet-101, NASNet, MobileNet_v2, and Transformer ViT -b32. Furthermore, we have defined a unique large-scale dataset, collected over a broad range of sea conditions from an ocean-going vessel prepared for machine learning. We used the transfer learning approach to fine-tune the models on our dataset. The obtained results demonstrate the potential for this approach to complement traditional methods, particularly where in-situ measurements are unfeasible or interpolated weather buoy data is insufficiently accurate. This study sets the groundwork for further development of sea state classification models to address recognized gaps in maritime research and enable safer and more efficient maritime operations.

Keywords: Energy efficient shipping, Computer vision, Sea state recognition, Deep neural networks, Real-time monitoring

Interpretable Machine Learning: A Case Study on Predicting Fuel Consumption in VLGC Ship Propulsion

Aleksandar Vorkapić, Sanda Martinčić-Ipšić, Rok Piltaver, Interpretable Machine Learning: A Case Study on Predicting Fuel Consumption in VLGC Ship Propulsion, Journal of Marine Science and Engineering, 2024, 12(10), 1849.

https://doi.org/10.3390/jmse12101849

Abstract

The integration of machine learning (ML) in marine engineering has been increasingly subjected to stringent regulatory scrutiny. While environmental regulations aim to reduce harmful emissions and energy consumption, there is also a growing demand for the interpretability of ML models to ensure their reliability and adherence to safety standards. This research highlights the need to develop models that are both transparent and comprehensible to domain experts and regulatory bodies. This paper underscores the importance of transparency in machine learning through a use case involving a VLGC ship two-stroke propulsion engine. By adhering to the CRISP-DM standard, we fostered close collaboration between marine engineers and machine learning experts to circumvent the common pitfalls of automated ML. The methodology included comprehensive data exploration, cleaning, and verification, followed by feature selection and training of linear regression and decision tree models that are not only transparent but also highly interpretable. The linear model achieved an RMSE of 23.16 and an MRAE of 14.7%, while the accuracy of decision trees ranged between 96.4% and 97.69%. This study demonstrates that machine learning models for predicting propulsion engine fuel consumption can be interpretable, adhering to regulatory requirements, while still achieving adequate predictive performance.

Keywords: interpretability; machine learning; decision trees; linear regression; feature selection; two-stroke marine engines; fuel consumption

A Bayesian and Markov chain approach to short-term and long-term personal watercraft trajectory forecasting

Lucija Žužić, Ivan Dražić, Loredana Simčić, Franko Hržić, Jonatan Lerga, A Bayesian and Markov chain approach to short-term and long-term personal watercraft trajectory forecasting, Journal of the Franklin Institute, January 2025.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5156719

(https://www.sciencedirect.com/science/article/pii/S0016003225000031)

Abstract:

In this work, vessel position is estimated using a Bayesian approach based on heading, speed, time intervals, and offsets of latitude and longitude. An additional approach using a Markov chain is presented. The trajectory data comes from a cloud-based marine watercraft tracking system that enables remote control of the vessels. Wave height and meteorological reports were used to evaluate the impact of weather on personal watercraft trajectories. One proposed approach to trajectory estimation uses the longitude and latitude offsets, while another uses the speed, heading, and actual time intervals. A long-term forecasting window of up to ten seconds is achieved by dividing trajectories into segments that do not overlap. The limitation this method faces in long-term forecasting inspires more sophisticated machine-learning approaches. The most successful estimation method used one or two previous actual values and a Bayesian approach, proving that using previously predicted values in a chain accumulates errors. Considering environmental variables did not improve the model, highlighting that small watercrafts operate well even in unstable sea states. This occurs because they generate and ride waves, having a larger impact than oceanic currents.

Keywords: Personal watercraft; Trajectory forecasting; Markov chain