Underwater Acoustic Research Trends with Machine Learning: Ocean Parameter Inversion Applications
Abstract
Underwater acoustics, which is the study of the phenomena related to sound waves in water, has been applied mainly in research on the use of sound navigation and ranging (SONAR) systems for communication, target detection, investigation of marine resources and environments, and noise measurement and analysis. Underwater acoustics is mainly applied in the field of remote sensing, wherein information on a target object is acquired indirectly from acoustic data. Machine learning, which has recently been applied successfully in a variety of research fields, is now used extensively in remote sensing to obtain and extract information. In the earlier parts of this work, we examined the research trends involving the machine learning techniques and theories that are mainly used in underwater acoustics, as well as their applications in active/passive SONAR systems (Yang et al., 2020a; Yang et al., 2020b; Yang et al., 2020c). As a follow-up, this paper reviews machine learning applications for the inversion of ocean parameters such as sound speed profiles and sediment geoacoustic parameters.
1. Introduction
Underwater acoustics is the study of the phenomena related to the generation, propagation, transmission, and reception of sound waves in underwater environments. It has been applied in underwater communication, target detection, marine resource and environment investigation using sound navigation and ranging (SONAR) systems, and the measurement and analysis of underwater sound source characteristics. The main representative field that utilizes underwater acoustics is remote sensing, wherein information on a target object is acquired indirectly from acoustic data. Machine learning, which has recently achieved substantial success in information acquisition and extraction, is actively utilized in remote sensing. In the previous parts of this work, the research trends in the machine learning techniques and theories that are mainly used in underwater acoustics and their applications in active and passive SONAR systems were reviewed (Yang et al., 2020a; Yang et al., 2020b; Yang et al., 2020c). This paper, the final part of this work, describes the application of machine learning to the field of ocean parameter inversion.
2. Application of Machine Learning in the Field of Ocean Parameter Inversion
A sound wave measured in the form of a signal on a hydrophone contains information about the medium along the propagation path. The process of indirectly extracting necessary information such as ocean parameters by processing acoustic signals is called inversion. In underwater acoustics, inversion can be broadly divided into the localization of surface vehicles and underwater vehicles (Parvulescu and Clay, 1965; Clay, 1966; Clay, 1987; Clay and Li, 1988; Bucker, 1976; Baggeroer et al., 1993; Tolstoy, 1993); tomography inversion, wherein the inversion operation is performed on the physical properties of seawater, such as the temperature profile over a broad area of water (Shang, 1989; Tolstoy et al., 1991; Tolstoy, 1992); and geoacoustic inversion, which yields the composition, morphology, and geological properties of the marine sediment (Rajan et al., 1987; Lynch et al., 1991). The localization of underwater sound sources has been covered in the previous parts of this review work (Yang et al., 2020b; Yang et al., 2020c). The results and trends of tomography inversion and geoacoustic inversion are reviewed herein.
The ocean parameter inversion established in underwater acoustics is model based. Model-based inversion compares the measured signal with signals simulated by a propagation model for a set of candidate parameter values and adopts, as the inversion solution, the candidate whose simulated signal is most similar to the measurement. Matched field processing (MFP) is a representative technique. The physical quantities of main interest in underwater acoustics are the sound speed profile and the environmental information related to the geological properties of marine sediment. These are essential factors for improving the prediction accuracy of the acoustic propagation model.
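The compare-and-select idea behind MFP can be illustrated with a minimal sketch of the conventional (Bartlett) processor, which scores each candidate by |w^H d|^2, where d is the measured field and w is the unit-norm simulated (replica) field. The free-space propagation model, array geometry, frequency, and search grid below are assumptions chosen only for illustration; a practical MFP would use a normal-mode or parabolic-equation model of the actual waveguide.

```python
import numpy as np

def replica(src_range, src_depth, rcv_depths, freq, c=1500.0):
    """Unit-norm replica field on a vertical array.

    Assumption: free-space spherical spreading, used here only as a
    stand-in for a real ocean propagation model.
    """
    k = 2.0 * np.pi * freq / c                      # acoustic wavenumber
    dist = np.hypot(src_range, rcv_depths - src_depth)
    field = np.exp(1j * k * dist) / dist
    return field / np.linalg.norm(field)

def bartlett_surface(data, ranges, depths, rcv_depths, freq):
    """Bartlett power |w^H d|^2 for every candidate (range, depth)."""
    power = np.empty((len(ranges), len(depths)))
    for i, r in enumerate(ranges):
        for j, z in enumerate(depths):
            w = replica(r, z, rcv_depths, freq)
            power[i, j] = np.abs(np.vdot(w, data)) ** 2   # vdot conjugates w
    return power

# Hypothetical 17-element vertical array and a noise-free "measurement".
rcv_depths = np.linspace(10.0, 90.0, 17)
freq = 200.0
true_range, true_depth = 1000.0, 40.0
data = replica(true_range, true_depth, rcv_depths, freq)

# Grid search: the candidate with the highest Bartlett power is adopted.
ranges = np.arange(500.0, 1501.0, 50.0)
depths = np.arange(10.0, 91.0, 5.0)
power = bartlett_surface(data, ranges, depths, rcv_depths, freq)
i, j = np.unravel_index(np.argmax(power), power.shape)
est_range, est_depth = ranges[i], depths[j]
```

In this noise-free setting the peak of the surface coincides with the true source position. Neighboring candidates score only slightly lower, which gives a small-scale picture of the ambiguity that arises in realistic environments.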
In the 1990s, the modeling of wave propagation in complex marine environments was realized, and model-based parameter inversion began to be studied extensively. To address the mismatch problem arising in sound source localization using MFP, the sound source location and sound speed profile were inverted simultaneously (Collins and Kuperman, 1991). Subsequently, the inversion technique for the geoacoustic parameters of marine sediments was developed to address the mismatch problem (Collins et al., 1992; Lindsay and Chapman, 1993; Dosso et al., 1993; Gerstoft, 1994; Tolstoy et al., 1998). Since then, the inversion technique has developed steadily in conjunction with the most advanced signal processing and optimization techniques. This article aims to review the research trends in ocean parameter inversion using machine learning techniques.
As mentioned previously, ocean parameter inversion developed alongside MFP, a representative model-based inversion technique, and produced good results. However, inversion using MFP can suffer from increased ambiguity in the inversion solution. Furthermore, the method is not robust against model mismatch when the wave propagation environment becomes complex, such as in shallow waters. The problem becomes more challenging in restrictive situations where a high signal-to-noise ratio at the sensor array cannot be guaranteed. To address this, studies reporting high-resolution results have recently appeared in conjunction with the development of sparse signal representations in the spatial or temporal (frequency) domain, e.g., the compressive sensing (CS) technique. The fundamental theory underlying CS is that in a linear system, the raw signal can be recovered under appropriate conditions, such as sparsity of the signal in some basis, even when the raw signal has a dimension higher than that of the observed data (Candes and Wakin, 2008). The CS technique utilizing this principle has been applied to active and passive SONAR signal processing and ship radiation noise analysis. Further details are available in the special issue of the Journal of the Acoustical Society of America (Gerstoft et al., 2018).
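The recovery principle underlying CS can be sketched numerically. The example below recovers a 3-sparse signal of dimension 120 from only 60 linear observations using orthogonal matching pursuit (OMP). OMP is one common greedy solver, chosen here for illustration; the works cited above do not necessarily use it. The random sensing matrix, the problem sizes, and the nonzero positions are likewise assumptions. With these sizes exact recovery is expected with high probability but is not guaranteed for every random matrix.

```python
import numpy as np

def omp(A, y, n_nonzero):
    """Orthogonal matching pursuit: greedy sparse solution of y = A x."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Select the column most correlated with the current residual.
        idx = int(np.argmax(np.abs(A.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Least-squares fit on the selected columns, then update residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
m, n, k = 60, 120, 3                      # observations, unknowns, sparsity
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=0)            # unit-norm columns
x_true = np.zeros(n)
x_true[[5, 37, 80]] = [2.0, -1.5, 1.0]    # hypothetical sparse signal
y = A @ x_true                            # fewer observations than unknowns
x_hat = omp(A, y, k)
```

Although the linear system is underdetermined, the sparsity constraint makes the solution identifiable, which is the property the sparse processing methods discussed above exploit.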
As these cases show, the most advanced signal processing theory was first applied in underwater acoustic inversion to improve the localization performance for underwater sources in combination with MFP. It was then applied to the inversion of ocean parameters such as marine sediment properties and the sound speed profile (Yardim et al., 2014; Bianco and Gerstoft, 2016; Choo and Seong, 2018). An example of a subsequent related study is sound speed profile inversion using dictionary learning, which is an unsupervised machine learning technique. The existing MFP method has the disadvantage of being computationally complex and time-consuming. Moreover, it is challenging to apply with limited resources because it requires many sensors to acquire data. From this perspective, signal processing techniques using sparse signal representation, one of which is dictionary learning, have attracted attention in the field of underwater acoustic signal processing. Dictionary learning is a technique in which a dictionary is learned from the measured data, a sparse representation of the meaningful information is extracted, and training is performed with the extracted information. When this is applied to ocean parameter inversion, a dictionary of shape functions is developed that represents the sound speed profile as an optimally sparse signal, which increases the resolution of the sound speed profile (Bianco and Gerstoft, 2017). The sound speed profile can be expressed as an expansion in empirical orthogonal functions of the sound speed variation. Bianco and Gerstoft (2017) developed a dictionary through a sparse signal representation that reduces the dimension of the empirical orthogonal function vectors. The training was performed using a clustering-based algorithm to obtain the recovered sound speed profile (Fig. 1).

Fig. 1. Sound speed profile reconstruction using dictionary learning (Bianco and Gerstoft, 2017).
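The empirical orthogonal function (EOF) expansion mentioned above can be sketched as follows. This shows only the classical EOF step, not the dictionary learning of Bianco and Gerstoft (2017): EOFs are obtained from the singular value decomposition of mean-removed profiles, and a profile is then represented by a few expansion coefficients. The synthetic profiles, the two perturbation shapes, and all numerical values are hypothetical.

```python
import numpy as np

# Hypothetical training set: 200 profiles on 51 depths, generated from a
# mean profile plus random amounts of two assumed perturbation shapes.
depth = np.linspace(0.0, 100.0, 51)
mean_ssp = 1500.0 + 0.016 * depth
shapes = np.vstack([np.exp(-depth / 20.0),
                    np.sin(np.pi * depth / 100.0)])
rng = np.random.default_rng(1)
coeffs = rng.standard_normal((200, 2)) * np.array([5.0, 2.0])
profiles = mean_ssp + coeffs @ shapes                 # shape (200, 51)

# EOFs are the right singular vectors of the mean-removed profile matrix.
mean_est = profiles.mean(axis=0)
residuals = profiles - mean_est
_, s, vt = np.linalg.svd(residuals, full_matrices=False)
eofs = vt[:2]                                         # two leading EOFs
explained = (s[:2] ** 2).sum() / (s ** 2).sum()       # variance fraction

# Represent a new profile by two coefficients and reconstruct it.
new_profile = mean_est + 3.0 * shapes[0] - 1.0 * shapes[1]
a = eofs @ (new_profile - mean_est)                   # expansion coefficients
recon = mean_est + a @ eofs
```

Because the synthetic variability has only two degrees of freedom, two EOFs reconstruct the profile essentially exactly. Measured profiles generally need more terms, which is the motivation for seeking sparser learned dictionaries.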
An early machine learning-based approach to sound speed profile inversion used a multi-layer perceptron-based technique composed of a much shallower network than the state-of-the-art deep neural network models (Park and Kennedy, 1996; Jain and Ali, 2006). Park and Kennedy (1996) estimated the sound speed profile using a combination of sound speed records and environmental information, and a multi-layer perceptron structure, given that the sound speed profile can be estimated as an expansion of the empirical orthogonal function for the sound speed variation (Fig. 2).

Fig. 2. Comparison of sound speed profiles between observations and predictions produced using a multi-layer perceptron (Park and Kennedy, 1996).
Here, the environmental information includes the date, the sea surface temperature obtained by the infrared sensor of a satellite, the seafloor temperature measured by a temperature sensor embedded in the seafloor, and the time of flight between two acoustic sensors buried in the seafloor, calculated using the wave propagation model to account for the multiple paths of a sound wave in the corresponding environment. For the sound speed records, the large amount of data from the World Ocean Atlas constructed by the National Oceanic and Atmospheric Administration was used. It was demonstrated that sound speed profile prediction is possible almost in real time through a simple multi-layer perceptron structure with only two hidden layers. However, the temperature and time information used as input factors had limitations: the mean record had to be stable, and the acoustic model used to calculate the multiple paths was applicable only in a range-independent environment. The research results of Jain and Ali (2006) were not considerably different from those of the above research. However, they estimated the sound speed profile using, as input parameters, surface observations from a mooring that were measured on an hourly basis for one year, utilizing a backpropagation algorithm and a multi-layer perceptron structure with two hidden layers.
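A multi-layer perceptron with two hidden layers trained by backpropagation, as described above, can be sketched in a few lines. The example maps three standardized environmental inputs to two EOF coefficients. The data are synthetic stand-ins generated by an arbitrary smooth function, and the layer widths, learning rate, and iteration count are assumptions; none of these are taken from Park and Kennedy (1996) or Jain and Ali (2006).

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    """Small random weights and zero biases for one dense layer."""
    w = rng.standard_normal((n_in, n_out)) / np.sqrt(n_in)
    return w, np.zeros(n_out)

def forward(params, x):
    """Two tanh hidden layers followed by a linear output layer."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = np.tanh(x @ w1 + b1)
    h2 = np.tanh(h1 @ w2 + b2)
    return h1, h2, h2 @ w3 + b3

def train_step(params, x, y, lr):
    """One full-batch gradient-descent step on the mean squared error."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1, h2, out = forward(params, x)
    loss = np.mean((out - y) ** 2)
    # Backpropagation: chain rule applied from the output layer backwards.
    g_out = 2.0 * (out - y) / out.size
    g_h2 = (g_out @ w3.T) * (1.0 - h2 ** 2)     # tanh'(u) = 1 - tanh(u)^2
    g_h1 = (g_h2 @ w2.T) * (1.0 - h1 ** 2)
    grads = [(x.T @ g_h1, g_h1.sum(axis=0)),
             (h1.T @ g_h2, g_h2.sum(axis=0)),
             (h2.T @ g_out, g_out.sum(axis=0))]
    new_params = [(w - lr * gw, b - lr * gb)
                  for (w, b), (gw, gb) in zip(params, grads)]
    return new_params, loss

# Hypothetical data: 3 environmental inputs -> 2 EOF coefficients.
x = rng.standard_normal((256, 3))
y = np.column_stack([np.sin(x[:, 0]) + 0.5 * x[:, 1],
                     np.tanh(x[:, 1] - x[:, 2])])

params = [init_layer(3, 8), init_layer(8, 8), init_layer(8, 2)]
losses = []
for _ in range(2000):
    params, loss = train_step(params, x, y, lr=0.05)
    losses.append(loss)

predicted_coeffs = forward(params, x)[2]        # shape (256, 2)
```

Once trained, a prediction requires only three small matrix products, which is consistent with the near real-time prediction reported for such shallow networks.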
Early machine learning-based approaches to geoacoustic parameter inversion (such as that for sound velocity, density of marine sediments, layer thickness, and damping coefficients) were also based on the expansion of basis functions, as in the sound speed profile case described above. Caiti and Jesus (1996) investigated the estimation of the seafloor’s geoacoustic parameters from sound field measurements in the water layer. In particular, Gaussian radial basis functions were used to approximate the inverse function to speed up the computation. The inversion of sound velocity, damping coefficient, and density was demonstrated by applying the function to simulation data and to marine experimental data measured using a horizontal towed line array. Their approach was highly efficient in terms of inversion speed. However, there was a limitation: the method was applicable only when all the information on the depth, sound speed profile, and configuration of the source and receiver was provided.
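The idea of approximating the inverse function with Gaussian radial basis functions can be sketched with a deliberately simple one-parameter problem. The forward model below is the normal-incidence reflection coefficient between two fluid half-spaces, used only as a stand-in for the sound field model of Caiti and Jesus (1996). The fixed densities, the training grid, and the basis function width are assumptions. Pairs of (data, parameter) are simulated with the forward model, and the radial basis expansion is fitted to map data back to the parameter.

```python
import numpy as np

RHO_W, C_W = 1000.0, 1500.0   # water density (kg/m^3), sound speed (m/s)
RHO_S = 1800.0                # sediment density, held fixed (assumed value)

def forward(c_sed):
    """Normal-incidence reflection coefficient of a fluid half-space."""
    z_w, z_s = RHO_W * C_W, RHO_S * c_sed
    return (z_s - z_w) / (z_s + z_w)

def gaussian_kernel(a, b, sigma):
    """Matrix of Gaussian radial basis functions between 1-D point sets."""
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2.0 * sigma ** 2))

# Training pairs simulated with the forward model.
c_train = np.linspace(1500.0, 1800.0, 15)
d_train = forward(c_train)
sigma = np.mean(np.diff(d_train))          # width close to center spacing
c_mean = c_train.mean()
weights = np.linalg.solve(gaussian_kernel(d_train, d_train, sigma),
                          c_train - c_mean)

def invert(d_obs):
    """Approximate inverse map: observed data -> sediment sound speed."""
    d_obs = np.atleast_1d(np.asarray(d_obs, dtype=float))
    return c_mean + gaussian_kernel(d_obs, d_train, sigma) @ weights

c_true = 1640.0                            # not on the training grid
c_est = invert(forward(c_true))[0]
```

After the weights are fitted, each inversion is a single kernel evaluation rather than a search over forward model runs, which is the source of the speed advantage. The sketch also reflects the stated limitation: every quantity other than the inverted parameter must be known and fixed in advance.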
Another approach to geoacoustic parameter inversion is to introduce neural networks into model-based inversion, which can be considered as an optimization problem. Benson et al. (2000) trained a neural network with the calculation data of the sound field of each sensor in a vertical line array using a wave propagation model in a shallow water environment. In particular, the spectral component of transmission loss was used as the input parameter of the neural network. In-situ experimental data were used to verify the neural network model. The distance and depth of the sound source and the sound speed and thickness of the marine sediment were estimated with high accuracy.
With regard to the exact physical quantities of marine sediments, traditionally, only sparsely distributed sample information has been obtained through core collection, which introduces the problem of uncertainty. From a probabilistic perspective, this uncertainty has been quantified by reflecting random characteristics using a multivariate probability density function or a joint ensemble moment. This raises questions about the reliability of estimates at locations where data are unavailable. Research is underway to determine solutions using a machine learning-based approach. In particular, the geoacoustic inversion technique is gradually becoming more sophisticated, e.g., using machine learning techniques for the extraction of physical quantities such as porosity and hydraulic conductivity (Tartakovsky et al., 2008; Martin et al., 2015). Because existing porosity estimation methods using interpolation or regression did not sufficiently reflect seafloor topography or geological properties, Martin et al. (2015) used the random forest (RF) technique, which utilizes multiple predictor variables, to predict the seafloor porosity (Fig. 3). They developed predictor grids that added several variables related to porosity to vast geological property data obtained from deep sea/marine drilling and combined them with RF, a tool for regression tree analysis. The results were compared with the existing porosity estimation results in terms of root-mean-square error, Nash-Sutcliffe efficiency, Benchmark efficiency, etc. Their proposed method demonstrated the highest prediction performance among the methods compared.

Fig. 3. Seafloor porosity prediction produced using machine learning (Martin et al., 2015).
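Two of the evaluation metrics named above have simple closed forms and can be sketched directly. The root-mean-square error is the square root of the mean squared difference between predictions and observations. The Nash-Sutcliffe efficiency is one minus the ratio of the squared prediction error to the squared deviation of the observations from their mean, so that 1 indicates a perfect prediction and 0 indicates a prediction no better than the observed mean. The porosity values below are hypothetical and are not data from Martin et al. (2015).

```python
import numpy as np

def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / variability about the mean."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    sse = np.sum((pred - obs) ** 2)
    sst = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - sse / sst)

obs = [0.40, 0.50, 0.60, 0.70]     # hypothetical measured porosity
pred = [0.45, 0.50, 0.55, 0.70]    # hypothetical predicted porosity

error = rmse(obs, pred)
efficiency = nash_sutcliffe(obs, pred)
```

For these values the squared error sum is 0.005 and the variability about the mean is 0.05, giving an efficiency of 0.9 and a root-mean-square error of about 0.035.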
Finally, we provide a brief summary of the application of the inversion technique to sediment geology. The geological form of sediments is essential information for dredging to secure a route, laying submarine cables, and dredging of rivers to regulate flow. For underwater and marine sediments, classification techniques based on machine learning and statistical methods have been undergoing continuous development. Starting from the early classification of surficial sediments using neural network architectures and statistical classifiers (Michalopoulou et al., 1995), recent supervised learning techniques that classify the type and grain size of surficial sediment from multi-beam SONAR, backscattered acoustic data, and bathymetry have been reported to display considerably high performance (Stephens and Diesing, 2014; Diesing et al., 2014; Buscombe and Grams, 2018). In particular, Stephens and Diesing (2014) selected six types of machine learning approaches (k-nearest neighbor, support vector machine, classification tree, RF, neural network, and Naïve Bayes) as supervised learning-based classification techniques for seabed mapping. Secondary features such as roughness, curvature, Moran’s I, and Sobel filter were newly extracted from the water depth data and backscattered acoustic data, and the results of prediction of the type of surficial sediment were compared. Diesing et al. (2014) compared the classification results of surficial sediments and grain sizes that were predicted by manual interpretation, image analysis using a 2D Fourier filter, geostatistics techniques, and RF (Fig. 4). The comparison indicated that the results could be improved by the ensemble technique (combining multiple techniques to produce the desired output).
Buscombe and Grams (2018) proposed a fully connected conditional random field that can consider the relative size and proximity of backscattered acoustic data for substrate characterization and compared the results with those obtained by the Gaussian mixture model to evaluate the performance.

Fig. 4. Seabed substrate maps including bathymetry, backscatter strength, and machine learning and geostatistics results (Diesing et al., 2014).
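One of the secondary features mentioned above, the Sobel filter, can be sketched as a slope estimate on a gridded depth map. The sketch applies the standard 3 x 3 Sobel kernels to interior cells and divides by 8 so that the result is a depth change per grid cell. The synthetic planar bathymetry and the unit cell size are assumptions, and the code is not the feature extraction used by Stephens and Diesing (2014).

```python
import numpy as np

def sobel_slopes(z, cell=1.0):
    """Sobel estimates of dz/dx (along columns) and dz/dy (along rows).

    Only interior cells are returned, so the output is two rows and two
    columns smaller than the input grid.
    """
    z = np.asarray(z, dtype=float)
    gx = (z[:-2, 2:] + 2.0 * z[1:-1, 2:] + z[2:, 2:]
          - z[:-2, :-2] - 2.0 * z[1:-1, :-2] - z[2:, :-2]) / (8.0 * cell)
    gy = (z[2:, :-2] + 2.0 * z[2:, 1:-1] + z[2:, 2:]
          - z[:-2, :-2] - 2.0 * z[:-2, 1:-1] - z[:-2, 2:]) / (8.0 * cell)
    return gx, gy

# Hypothetical planar seabed: depth changes by 0.5 per column and by
# -0.25 per row on a 6 x 8 grid.
rows, cols = np.mgrid[0:6, 0:8]
depth = 50.0 + 0.5 * cols - 0.25 * rows

gx, gy = sobel_slopes(depth)
slope = np.hypot(gx, gy)          # slope magnitude per interior cell
```

On a planar surface the filter returns the exact gradient at every interior cell. On measured bathymetry the same operation highlights edges and steep transitions, which can then be supplied to a classifier as an additional predictor layer.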
3. Conclusion
In this review, we examined the research trends in applying machine learning (including deep learning) to underwater acoustics and SONAR applications. It is evident that machine learning techniques need to be improved further through follow-up studies for application in the areas of underwater acoustics, acoustical oceanography, and SONAR-related technologies. Furthermore, the implementation of machine learning is likely to provide flexibility in future research directions and increase the applicability of the approach.
However, data-based techniques, represented by deep learning, still face an inherent problem: securing data that include class information is essential. In underwater acoustics in particular, such data are challenging to obtain, and an understanding of the environment is essential. Therefore, in addition to the development of physical intuition and theory, approaches that use verified wave propagation models or statistical and mathematical models requiring a relatively small amount of data should be developed in parallel to overcome the above challenges and limitations.