How Google’s Machine Learning Model Is Helping Decode Whale Songs

The Single largest direct human impact on marine ecosystems comes from over-exploitation of resources by activities like fishing. Now, researchers have devised a technique to track the population of endangered marine life, like Humpback Whales, by listening to their peculiar hour-long song.

Audio analysing techniques developed by Google’s Artificial Intelligence Perception team have been used before for YouTube videos for non-speech captions. And, now similar techniques are being used for conservational activities.

Google, in association with the Pacific Islands Fisheries Science Center of the US National Oceanic and Atmospheric Administration (NOAA), has developed algorithms to identify humpback whale calls in 15 years of underwater recordings from a number of locations in the Pacific. This research provides new and important information about humpback whale presence, seasonality, daily calling behaviour, and population structure.

HARP

In fact, Google has used HARP (high-frequency acoustic recording packages) devices to collect audio data (9.2 terabytes) over a period of 15 years.

Manually marking humpback whale calls is extremely time-consuming. That is why, researchers at Google, deployed supervised machine learning models to optimize images for detecting the whales. An audio event detection is considered as an image classification problem.

This data of different magnitudes of sound intensities are plotted on time-frequency axes.

Spectrograms of audio events found in the dataset, with time on the x-axis and frequency on the y-axis. Left: a humpback whale call, Center: narrow-band noise from an unknown source, Right: hard disk noise from the HARP

For classifying the images, a Resnet-50, was used which has given reliable results in classifying non-speech audio.

A Resnet is a residual learning framework introduced back in 2015 to ease the training of deep networks. The inputs to the layers in the neural network are used as a reference to model these layers as a residual function for learning. So, as the depth of a network increases, the accuracy gain from this residual layers increases.

The idea behind using residual learning instead of stacking layers over layers is, challenges like delayed convergence caused to vanishing or exploding gradients. For example, in case of a vanishing gradient problem, the weights in the network receive an update with respect to the error function after each iteration, and the weights become vanishingly small and might lead to complete shutdown of a network’s training. Resnet’s have also been good with tackling the degradation of training accuracy problem caused due to increasing depth and saturated weights.

Humpback whales have a varied, but sustained frequencies. These frequencies, if don’t vary at all then a spectrogram would display a horizontal bar. The arcs mean that the signals have been modulated. The challenge with collecting humpback’s audio data is the noise that gets mixed up along with it; noise caused by the propellers of the ships and other equipment. This noise is taken as an unvaried signal and displayed as a horizontal bar on a spectrogram.

PCEN (per-channel energy normalization) is a technique used in far-field speech recognition tasks. PCEN is modelled as deep neural networks and uses dynamic compression instead of logarithmic compression to spot keywords in distant or noisy acoustic environments.This technique suppresses the stationary narrow band noise generated by machines resulting in an error reduction at a rate of 24%.

A whale song is generally a structure, sequential audio signal that can last over 20 minutes. And, there is a high possibility of a new song beginning within a few seconds. This incoming audio units with such large time windows give extra information useful for predicting with improved precision.The test-set consists of 75-second audio clips, for which the model showed accuracy scores above 90%.

Unsupervised Learning For Similar Song Units

In this approach, the labels are used to learn from the ResNet output. The classification is done by identifying the Euclidean distances between two ResNet output vectors belonging to corresponding audio units of similar time frames. This helps in distinguishing different humpback unit types from each other.

Calculating distances has been done using Unsupervised Learning Of Semantic Audio Representations. The basic idea here is to highlight the correlation between closeness in time and closeness in meaning. So a sample consisting of three vectors which are representations of sounds of humpback unit(anchor), a similar unit(positive) and noise(negative) respectively. The model tries to minimize the loss such by forcing the Euclidean distance between the anchor-negative exceeds that of anchor-positive distance. The nearest neighbours in the entire dataset are retrieved using Euclidean distance between embedding vectors.

Time density of presence on year/month axes via Google AI blog

The above plot summarises the model output with respect to time and location(Kona and Saipan). The results clearly show that seasonal variation is consistent with a known pattern in which humpback populations spend summers feeding near Alaska and then migrate to the vicinity of the Hawaiian Islands to breed and give birth.

These results further can be used to determine the effects due to anthropogenic activity and the success of this project demands for application of machine learning tool to a wider spectrum of environmental challenges.

The post How Google’s Machine Learning Model Is Helping Decode Whale Songs appeared first on Analytics India Magazine.

How Google’s Machine Learning Model Is Helping Decode Whale Songs

HARP

Unsupervised Learning For Similar Song Units

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112