
The communication space of humpback whale social sounds in wind-dominated noise
Rebecca A. Dunlop

Citation: The Journal of the Acoustical Society of America 144, 540 (2018); doi: 10.1121/1.5047744
View online: https://doi.org/10.1121/1.5047744
View Table of Contents: http://asa.scitation.org/toc/jas/144/2
Published by the Acoustical Society of America


The communication space of humpback whale social sounds in wind-dominated noise

Rebecca A. Dunlop

Cetacean Ecology and Acoustics Laboratory, School of Veterinary Science, University of Queensland, Gatton Campus, Queensland, QLD 4343, Australia

(Received 1 May 2018; revised 1 July 2018; accepted 3 July 2018; published online 1 August 2018)

In animal social networks, a large acoustic communication space tends to involve complex net-

works. Signal masking may reduce this space, leading to detrimental effects on the animal’s ability

to obtain important social information. Humpback whales use acoustic social sounds (vocal sounds

and surface-generated sounds from breaching or fin slapping) for within- and between-group com-

munication. In this study, changes in various sound parameters (e.g., signal-above-noise and fre-

quency content) of received humpback whale social sounds were statistically modeled against the

combined effect of increasing wind-dominated noise and distance from the source (whale) to pro-

duce masking models. Behavioral data on vocalizing groups were also used to inform these models.

The acoustic communication space, in this shallow water (<50 m) environment, extended to

approximately 4 km from the signaler in median wind noise. However, the majority of behavioral

interactions occurred within 2 km of the signaler. Surface-generated signals propagated better and

likely function to maintain this space in higher wind noise. This study provides a basic wind-noise

masking model for social communication signals in humpback whales which can be updated as

more information on humpback auditory capabilities, and potential masking effects of anthropogenic

noise sources, becomes available. © 2018 Acoustical Society of America.

https://doi.org/10.1121/1.5047744

[WWA] Pages: 540–551

I. INTRODUCTION

The area over which an acoustic signal transmits, and is

audible to the intended receiver, is known as the active or

communication space (Brenowitz, 1982; Janik, 2000; Clark

et al., 2009). The dimensions of this area depend on the tem-

poral and spectral structure of the signal, the characteristics

of the environment, and the receiver’s ability to detect (the

detection threshold) and discriminate the signal from back-

ground noise (Wiley and Richards, 1978; Clark et al., 2009).

Acoustic signals that propagate over relatively long distan-

ces relative to the distribution of conspecifics can create a

large communication space (McGregor and Krebs, 1984;

Morton, 1986; Janik, 2000). Within this space, for any given

signaler, there may be a number of other intended, and/or

unintended, receivers and together these form a communica-

tion network (McGregor, 1993; McGregor and Horn, 2015).

Intended receivers are those at which the signals are directed

whereas unintended receivers, such as eavesdroppers, can

obtain beneficial information from exchanges between the

signaler and intended receiver (McGregor, 1993; McGregor

et al., 2001; Otter et al., 1999; Peake et al., 2001). Within

the context of breeding interactions, a male eavesdropper

can assess male competitors and therefore make decisions as

to whether or not to compete for the female (e.g., Siamese

fighting fish, Betta splendens, Matos et al., 2003; great tits,

Parus major, Peake et al., 2001; wolf spiders, Lycosidae spp., Clark et al., 2015). For breeding females, eavesdropping

may allow assessment of potential mates without

having to directly interact with the senders (e.g., black-

capped chickadees, Poecile atricapilla, Mennill et al., 2002;

crayfish, Procambarus clarkii, Aquiloni et al., 2008;

Siamese fighting fish, Betta splendens, Doutrelant and

McGregor, 2000). Any reduction in the communication

space, due to increased background noise, may therefore

have detrimental effects on the ability of animals to obtain

breeding information within a network leading, ultimately,

to a reduced breeding success.

In the marine environment there are several natural

sources of noise, such as wind, waves, rain, and biological sources,

and these noise sources can limit the communication space

used by animals through masking (Clark et al., 2009).

Masking is the process by which the threshold of hearing for

one sound is raised by the presence of another (masking)

sound, expressed in dB (American National Standards

Institute, 2008). In the case of signalers and receivers in the

underwater environment, the sound would be from the sig-

naler, and the masking sound would be natural underwater

noise such as wind noise. Underwater noise can also be

anthropogenic in origin (e.g., noise from vessel activity, oil

and gas exploration, naval sonar activity or construction),

and, in some circumstances, this noise may result in further

masking of the animal’s acoustic signals compared to mask-

ing effects of natural noise. The effect of various sources of

noise, with regard to masking, however, is difficult to quantify,

given it is essential to know the auditory capabilities of

the animal.

Many masking studies relate a change in a particular

signal feature, most commonly the signal level in noise (i.e.,

a measure of signal-to-noise, or SNR), to increasing distance


from the receiver to determine the distance at which a signal

is just detectable in noise. A decrease in this detectable dis-

tance can then be related to an increase in signal masking

due to an increase in critical bandwidth noise. Critical band-

width is defined as the bandwidth of noise at which the

detection threshold of a tone signal at the centre of the noise

band ceases to increase with increasing width of the noise

band and is therefore the noise band that is applicable to

masking studies. However, the detection of a signal may not

necessarily equate to recognition (Franklin et al., 2006).

Some species of bird require a signal excess of 2–3 dB above

the critical ratio (the audible threshold of a pure tone in noise

of constant spectral density; American National Standards

Institute, 2008) for successful discrimination of conspecific

calls and a further 2–3 dB for call recognition (Dooling and

Blumenrath, 2016; Lohr et al., 2003). This is likely because

signal characteristics, such as frequency and temporal

parameters, also change as the signal propagates through the

environment. In seawater high frequencies are readily

absorbed (Francois and Garrison, 1982), meaning high-

frequency components of a signal are likely to attenuate

more rapidly with distance compared to lower frequency

components. In shallow water, low frequencies propagate

poorly, leading to increased loss of low-frequency compo-

nents with distance. Therefore, as the signal travels through

the environment, it is subject not only to loss in signal level,

but also to changes in signal structure, meaning that features of the

signal should be measured at the receiver, as well as at the

source.

Humpback whales use two different signal types: song,

which is audible over 10’s of kilometres (Au et al., 2006),

and social sounds, which are likely audible over a few kilo-

metres (Dunlop et al., 2013a). Song is a broadcast signal and

therefore results in a large (10’s of kilometres) and complex

communication network, likely involving multiple signalers,

intended receiver(s) and unintended receiver(s). Intended

receivers, for example, may be females (Winn and Winn,

1978; Tyack, 1981; Chu and Harcourt, 1986; Smith et al., 2008) or other male singing whales (Darling and Bérubé,

2001; Darling et al., 2006; Frankel et al., 1995) and unintended

receivers may be other non-singing competitive

males in the area (Dunlop, 2016a; Dunlop and Noad, 2016).

Social sounds can be vocal signals (in this paper referred to

as social vocalizations) or surface-generated signals from

breaching or fin slapping. As these signals are significantly

quieter compared to song (Dunlop et al., 2013a; Dunlop

et al., 2013b), the communication network is likely to be kil-

ometres rather than 10’s of kilometres. Behavioral studies

suggest they are likely to be used as within-group signals as

well as between-group signals (Dunlop et al., 2008; Dunlop

et al., 2010; Dunlop, 2017; Parks et al., 2014; Rekdahl et al., 2015; Silber, 1986; Thompson et al., 1977).

Humpback whale social vocalizations range from long,

low frequency (<80 Hz) “grumbles” to short, very high fre-

quency (�2 kHz) “squeaks,” with many of the sound types

forming a continuum rather than being discretely different

signals (Dunlop et al., 2007; Dunlop, 2017; Fournet et al., 2015; Stimpert et al., 2011). Given this variation in sound

structure, discrete sound types are likely to give information

on group membership, changes in group membership, and,

as lone animals also use social sounds, perhaps the sex, size

and location of the signaler (Dunlop et al., 2008). The con-

tinuum in structure between different sound types (gradual

changes in frequency, duration, and bandwidth) likely pro-

vides additional information to the intended receiver on

(changeable) signaler features such as its motivation

(Dunlop, 2017). Humpback whale signaling behavior also

changes in response to other (likely unintended) receivers in

the area, known as the audience effect (Dunlop, 2016a).

Taken together, the communication network of humpback

whales likely extends beyond the immediate group to

encompass other groups in the area.

Before understanding the consequences of signal mask-

ing by increased anthropogenic noise, the size of the com-

munication space must first be determined, as well as any

changes in its size due to increases in natural noise sources

such as wind-dominated noise. Humpback whales compen-

sate for increases in wind-dominated noise by both switching

from vocal sounds to surface-generated sounds (Dunlop

et al., 2010) and increasing their vocal source level (Dunlop

et al., 2014). Though these responses are likely to maintain, to

some extent, the size of the communication space in noisier

conditions, it is currently unknown how large this network is

and what effect, if any, increased natural wind-noise levels

have on this space. In light of this, this study first aims

to measure and statistically model changes in received signal

level above noise, and structural parameters of different cat-

egories of social sounds, with both increasing proximity to

the source (signaling group) combined with increasing wind-

noise. This will provide information on the communication

space of humpback whale social sounds within a shallow-

water (<50 m) environment for different signal types.

Second, the analysis will be repeated whilst accounting for

the Lombard response, to determine how effective this

Lombard response is in maintaining this space during peri-

ods of increased wind-dominated noise. Third, behavioral

data on the interactions between signaling groups within the

communication network will be used to inform these statisti-

cal models. The results of this study will provide baseline

information that can be used to compare the size of the

humpback whale communication space in natural noise, with

the size of the space during periods of increased anthropo-

genic noise such as noise from vessels, to determine by how

much this space is potentially reduced during periods of high

vessel noise.

II. METHODS

A. Visual and acoustic data collection

Data were collected during the southwards (from their

breeding ground in the Great Barrier Reef, towards their

Antarctic feeding grounds) migration of the eastern

Australian humpback whale (September/October of

2002–2004 and 2008). Acoustic recordings were made

using an array of five hydrophone-buoy systems where each

hydrophone-buoy consisted of a surface buoy, a pre-amplifier

(+20 dB) and VHF radio transmitter anchored in

20–28 m of water. At the seabed, a High Tech HTI-96-MIN


hydrophone with built-in +40 dB pre-amplifier was suspended

about a metre off the sea floor and cabled to the surface

buoy. Buoys 1–3 were 1.5 km from the beach, parallel

to the shoreline, and approximately 0.7 km apart. Buoys 4

and 5 extended seaward from buoy 2, in a line perpendicular

to the shore and were approximately 0.5 km apart. Accurate

positions of hydrophones were obtained by a shore-based

theodolite survey of the surface buoys (Noad et al., 2004).

Water depth in the study site (where most of the groups were

recorded) varied from 25 to 35 m.

Radio transmissions from the buoys were received in

real time at a base station just behind the beach using a verti-

cally orientated Yagi antenna attached to a four channel, low

noise, VHF receiver (type 8101), and a Winradio receiver.

Two computers, equipped with National Instruments E-

series data acquisition cards and using ISHMAEL software

(Mellinger, 2001), were used to record and track the acoustic

signals (at a sampling rate of 22.05 kHz). Signal sources

were tracked in real-time (or post-field if required) using the

arrival time differences of the signals. Bearings to the signal

sources calculated by ISHMAEL were accurate; however, there

were small errors in the range estimates (less than 50 m at

2 km and less than 1 km at 10 km from the array). Signal

location accuracy was significantly improved by using the

mean position of several estimates calculated over a brief

period. Array validation was carried out by comparing

acoustically calculated positions with accurate theodolite

positions of visually identified singing whales within the

study area (for further validation experiments see Noad

et al., 2004). The hydrophone with a built-in preamplifier

was calibrated at the Defence Science and Technology

Organisation calibration facility in Woronora Dam. The

remainder of the recording chain was calibrated by inserting

tones and white noise of known levels into the amplifier in

the buoy in place of the hydrophone. Full system sensitivity

varied by 1.5 dB over the frequency range 40 to 10 000 Hz.

Hydrophone sensitivity was −164 dB re 1 V µPa⁻¹, which

included +40 dB gain for the built-in preamplifier.

Land-based tracking of the groups occurred simulta-

neously using a theodolite linked to a visual tracking pro-

gramme (CYCLOPES; developed by Eric Kniest, University of

Newcastle). Each position was annotated with the social

composition (number of adults within the group and whether

or not there was a calf) and social behavior (splitting of ani-

mals from, and joining of animals to, the group). Acoustic

tracks of vocalizing whales (from ISHMAEL) were overlaid on

the visual tracking map in CYCLOPES and the combined acous-

tic/visual data were shared between the base and hilltop sta-

tions using a wireless network providing a real-time

superposition of acoustic and visual tracks. There were

rarely more than six groups migrating through the study area

at any one time, and these were usually widely dispersed

(unless the groups were joining together). Given the accu-

racy of the system, and the way in which groups could be

simultaneously visually and acoustically tracked in real-

time, social sounds could be assigned at a group level,

though not at an individual level. Using this combination of

visual and acoustic tracking data, the distance of each signal-

ing group to each hydrophone-buoy receiver, as well as the

distance of each signaling group to other groups in the area,

could be measured for each recorded social sound.

B. Signal measurements

Spectrograms of social vocalizations and surface-

generated sounds (signals) were produced using RAVEN 1.2

(Cornell Lab of Ornithology) with the DFT size set at 4096

samples, Hamming window, and 80% overlap. Each signal

was isolated from three different channels (three different

receivers at a range of distances from the signaling group)

and saved as a separate file along with a sample of back-

ground noise. If a signal could not be isolated from one of

the three channels (in that the signal was below noise and no

part was visible on the spectrogram), only two channels

were used in the analysis.

Social vocalizations were audibly and visually (by

inspecting the spectrogram) classified into a series of vocali-

zation types based on a previous study at this site (Dunlop

et al., 2007). Various parameters, including estimates of

received and source level, were then measured using a

custom-made MATLAB script. First, a noise-correction was

made on each vocalization by subtracting the spectrum of

the noise file from that of the associated vocalization file, see

below. Then, various temporal and frequency measurements

were made on the vocalization (Table I). All frequency

parameters were log-transformed to better represent the

mammalian perception of pitch (Evans, 1992).

To measure the received level, each signal was divided

into 743 ms segments with a 50% overlap, multiplied by a

Hanning window and a discrete Fourier transform (DFT) cal-

culated [using a fast Fourier transform (FFT) size of 16 384

with a 1.35 Hz resolution]. The power spectrum (calibrated to

account for the sensitivity of the hydrophone and the gain of

the system) for each segment was extracted as the squared

magnitude of the DFT. A noise-correction was made on each

segment and the resulting power spectrum used as the sound

received level for each 1.35 Hz frequency band for each

segment.
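The segmentation step described above can be sketched in Python as follows. This is only an illustrative outline, not the custom MATLAB script used in the study: the function names are hypothetical, calibration is omitted, flooring the noise-corrected spectrum at zero is an assumption, and the naive DFT is intended for short demonstration segments rather than the 16 384-sample segments used in the analysis.

```python
import cmath
import math

FS = 22050      # sampling rate in Hz, as stated for the recordings
NFFT = 16384    # FFT size stated for the received-level measurement

def frequency_resolution(fs, nfft):
    """Width of one DFT bin in Hz."""
    return fs / nfft

def segment_starts(n_samples, seg_len, overlap=0.5):
    """Start indices of full-length segments with the given fractional overlap."""
    hop = int(seg_len * (1 - overlap))
    return list(range(0, n_samples - seg_len + 1, hop))

def hann(n):
    """Hann (Hanning) window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def power_spectrum(segment):
    """Squared magnitude of the DFT of a Hann-windowed segment (uncalibrated)."""
    n = len(segment)
    windowed = [s * w for s, w in zip(segment, hann(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def noise_corrected(signal_ps, noise_ps):
    """Subtract the noise power spectrum bin by bin (assumption: floor at zero)."""
    return [max(s - n, 0.0) for s, n in zip(signal_ps, noise_ps)]
```

With the stated sampling rate and FFT size, `frequency_resolution(FS, NFFT)` is about 1.35 Hz and `NFFT / FS` is about 0.743 s, matching the segment length and band width quoted in the text.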

Transmission loss (TL) was measured at the site as

described in Dunlop et al. (2013a), using a boat as a source

(running various transects towards and away from the

array) and playback of octave band limited white noise at

three positions. This gave various regression lines of

received levels as a function of distance from the sources in

the form

$$\mathrm{TL} = a + b \log(x), \qquad (1)$$

where b is the slope of the regression line, x is distance

(meters) and a is a constant (which may be frequency depen-

dent). The horizontal distance was approximated as the slant

range since water depths of the transmission paths were less

than 40 m and thus very small compared with the distances.

For most frequencies, b varied with distance but could be

well approximated by two values, one applying to distances

less than, and the other greater than, a cross over value

where the slope changed. Values of a and b and the cross

over distances are given in Dunlop et al. (2013a). Both a and


b varied with frequency, so Eq. (1) values were calculated

for each octave band.
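A minimal Python sketch of the two-slope form of Eq. (1) is given below for illustration. The base-10 logarithm, the function names, and the requirement that the two segments meet at the crossover distance are assumptions; the actual values of a, b, and the crossover distances are those reported in Dunlop et al. (2013a) and are not reproduced here.

```python
import math

def transmission_loss(distance_m, a_near, b_near, a_far, b_far, crossover_m):
    """TL = a + b*log10(x), with one (a, b) pair on each side of the crossover distance."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    if distance_m < crossover_m:
        return a_near + b_near * math.log10(distance_m)
    return a_far + b_far * math.log10(distance_m)

def continuous_far_intercept(a_near, b_near, b_far, crossover_m):
    """Far-range intercept that makes the two segments meet at the crossover (assumption)."""
    return a_near + (b_near - b_far) * math.log10(crossover_m)
```

For example, with purely hypothetical coefficients (a = 0, b = 20 below a 1 km crossover and b = 10 beyond it), the far-range intercept is 30 dB and the modelled loss is 40 dB at 100 m and 70 dB at 10 km.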

The square root of the power spectrum was converted

back into a waveform by applying an inverse Fourier trans-

form. All processed segments within each signal were then

concatenated to reconstruct the signal. This was filtered from

35 to 5623 Hz, corresponding to the lower and upper limit of

the 1/3 octave band centred at 40 and 5000 Hz, respectively,

to eliminate low frequency turbulence noise and high fre-

quency system noise, whilst capturing the peak frequencies

of the signals. The 1/3 octave band containing the most

energy was used as the RLrms for the analysis. From the

received level of each signal (RLrms over 35 to 5623 Hz), the

source level was estimated as

$$\mathrm{SL}_{\mathrm{rms}} = \mathrm{RL}_{\mathrm{rms}} + \mathrm{TL}. \qquad (2)$$
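The filter limits and Eq. (2) can be illustrated with the short Python sketch below. The quoted limits of 35 and 5623 Hz appear consistent with base-10 1/3 octave band edges, but that convention is an inference rather than something stated in the text, and the function names and band numbering are hypothetical.

```python
def third_octave_edges(band_number):
    """Lower and upper edge (Hz) of the base-10 1/3 octave band centred at 10**(band_number/10)."""
    lower = 10 ** ((band_number - 0.5) / 10)
    upper = 10 ** ((band_number + 0.5) / 10)
    return lower, upper

def source_level(rl_rms_db, tl_db):
    """Eq. (2): SL_rms = RL_rms + TL, with all quantities in dB."""
    return rl_rms_db + tl_db
```

Under this convention band 16 (nominal 40 Hz) has a lower edge of about 35.5 Hz and band 37 (nominal 5000 Hz) has an upper edge of about 5623 Hz.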

To estimate broadband wind-dominated background

noise levels (NLbb), a 10-min file was used for each buoy

where each file began approximately when the first signal

for the group was audible on one of the hydrophone-buoys

(excluding any signals when measuring the noise). For each

file, mean square pressures were summed over the 40 Hz to

2.5 kHz 1/3 octave bands (actual band 36 Hz to 2.8 kHz) and

converted to decibels to give the broadband noise level. This

bandwidth was chosen because almost all the energy in the

signals lies within this band and this encompassed the peak

frequencies for wind-dominated noise. Groups were only

included in this analysis if there were no boats audible on

the array (and there were no sighted boats traversing the

study site) as well as no audible singing whales (so that sing-

ers would have been more than 10 km away and would not

have contributed significantly to the background noise at the

group) at the time the group was vocalizing. On a few occa-

sions, there was very faint song audible but this singer noise

contributed less than 1 dB to the broadband noise level. This,

and snapping shrimp at higher frequencies, were the only

other two sources of noise in the study area during times

measurements were made. There was no evidence that dis-

tant shipping noise contributed to the measured noise. In the

absence of singer and vessel noise, the measurements were

similar to those observed for wind-dependent noise in

Australian waters up to 800 Hz, showing the general

decrease in noise level with increasing frequency (Cato,

1997). Since the wind speed was stable over the study site,

the background noise measured at the array could be consid-

ered to be similar to that at the signaling whales. The system

electronic noise over the 36 Hz to 2.8 kHz band had an

equivalent input level of 77 dB re 1 µPa (using the type 8101

receiver). Ambient background noise levels included in this

analysis were well above this level meaning there was no

contribution of system noise. The received (at the hydro-

phone) signal-to-noise level (SNR) of each sound was then

estimated as

$$\mathrm{rSNR} = \mathrm{RL}_{\mathrm{rms}} - \mathrm{NL}_{\mathrm{bb}}, \qquad (3)$$

where the RLrms was measured in the 1/3 octave containing

the most energy and the NLbb was measured over 36 Hz to

2.8 kHz.
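As an illustration of the broadband noise calculation and Eq. (3), the following Python sketch sums mean-square pressures across 1/3 octave bands and converts the total to decibels. The function names are hypothetical and the band levels used in any example are not measured values.

```python
import math

def broadband_level_db(band_levels_db):
    """Sum mean-square pressures across bands and convert the total back to dB."""
    total = sum(10 ** (level / 10) for level in band_levels_db)
    return 10 * math.log10(total)

def received_snr(rl_rms_db, nl_bb_db):
    """Eq. (3): rSNR = RL_rms - NL_bb, in dB."""
    return rl_rms_db - nl_bb_db
```

Because levels add as mean-square pressures, a second source about 6 dB below the wind noise raises the total by just under 1 dB; for example, `broadband_level_db([90.0, 84.0])` exceeds 90 dB by roughly 0.97 dB.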

TABLE I. Measured frequency and duration parameters used in the CART analysis. Frequency parameters were log-transformed for use in the data analyses.

| Measurement | Abbreviation | Description |
|---|---|---|
| Centre time peak (s) | pTC | The time at which the peak of the sound is divided into two intervals containing equal energy. |
| First quartile time peak (s) | pTQ1 | Time containing the first 25% of the energy of the sound peak. |
| Third quartile time peak (s) | pTQ3 | Time containing 75% of the energy of the sound peak. |
| Inter-quartile time peak (s) | pTIQ | Difference between the 3rd and 1st quartile times of the sound peak. |
| 10th centile time peak (s) | pTC10 | Time containing the first 10% of the energy of the sound peak. |
| 90th centile time peak (s) | pTC90 | Time containing 90% of the energy of the sound. |
| Inter-centile time peak (s) | pTIC | Time between the 10th and 90th centile times of the sound peak. |
| Centre time (s) | TC | Time at which the sound is divided into two intervals containing equal energy. |
| First quartile time (s) | TQ1 | Time containing the first 25% of the energy of the sound. |
| Third quartile time (s) | TQ3 | Time containing 75% of the energy of the sound. |
| Inter-quartile time (s) | TIQ | Difference between the 3rd and 1st quartile times. |
| 10th centile time (s) | TC10 | Time containing the first 10% of the energy of the sound. |
| 90th centile time (s) | TC90 | Time containing 90% of the energy of the sound. |
| Inter-centile time (s) | TIC | Time between the 10th and 90th centile times. |
| Peak frequency (Hz) | FP | The frequency at maximum level. |
| Centre frequency (Hz) | FC | Frequency at which the sound is divided into two components of equal energy. |
| First quartile frequency (Hz) | FQ1 | Frequency that divides the sound into two components containing 25% and 75% of the sound’s energy. |
| Third quartile frequency (Hz) | FQ3 | Frequency that divides the sound into two components containing 75% and 25% of the sound’s energy. |
| Inter-quartile frequency (Hz) | FIQ | Difference between the 3rd and 1st quartile frequencies. |
| 10th centile frequency (Hz) | FC10 | Frequency that divides the sound into two intervals containing 10% and 90% of the sound’s energy (≈ MinF). |
| 90th centile frequency (Hz) | FC90 | Frequency that divides the sound into two intervals containing 90% and 10% of the sound’s energy (≈ MaxF). |
| Inter-centile frequency (Hz) | FIC | The difference between the 90th and 10th centile frequencies. |
| First quartile time frequency (Hz) | TFQ1 | Frequency at TQ1. |
| Third quartile time frequency (Hz) | TFQ3 | Frequency at TQ3. |
| Frequency trend | FTrend | Calculated as TFQ1 / TFQ3. |
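The energy-based frequency measures in Table I can be illustrated with the Python sketch below, which reads percentile points off the cumulative energy of a power spectrum. The discrete convention used (the first bin at which the cumulative energy reaches the requested fraction) and the function names are assumptions, not the procedure of the study's own script; the same helper applies to the time-based measures if a time axis and an energy envelope are supplied instead.

```python
def energy_percentile(axis, power, fraction):
    """Axis value (Hz or s) at which the cumulative energy first reaches the given fraction."""
    total = sum(power)
    if total <= 0:
        raise ValueError("no energy in the supplied spectrum")
    running = 0.0
    for value, p in zip(axis, power):
        running += p
        if running >= fraction * total:
            return value
    return axis[-1]

def frequency_parameters(freqs_hz, power):
    """Subset of the Table I frequency measures from a noise-corrected power spectrum."""
    fq1 = energy_percentile(freqs_hz, power, 0.25)
    fq3 = energy_percentile(freqs_hz, power, 0.75)
    fc10 = energy_percentile(freqs_hz, power, 0.10)
    fc90 = energy_percentile(freqs_hz, power, 0.90)
    return {
        "FP": freqs_hz[power.index(max(power))],   # frequency at maximum level
        "FC": energy_percentile(freqs_hz, power, 0.50),
        "FQ1": fq1, "FQ3": fq3, "FIQ": fq3 - fq1,
        "FC10": fc10, "FC90": fc90, "FIC": fc90 - fc10,
    }
```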


C. Call categorization

Classification and regression tree (CART) analysis was

used following methods of Garland et al. (2012) and

Rekdahl et al. (2013). This was carried out on vocalizations

(not surface-generated signals) using the rpart package in R

(R Core Team, 2013). As each vocalization was measured

up to three times (from three different hydrophone

receivers), the one closest to the receiver (being the sound

with the highest received signal-to-noise) was selected to

develop the initial CART. To minimize propagation effects

on sound structure, sounds beyond 1.5 km of the receiver

were excluded as well as those under 3 dB SNR (calculated

as the signal RLrms minus the NLrms of the noise sample used

for the noise correction). Measured frequency and temporal

parameters from Table I were used, where each branch

(split) uses the parameter that best differentiates the resulting

nodes, based on a measure of “goodness of fit”: the Gini

index (Breiman et al., 1996). In brief, the split with the low-

est splitting error is chosen to continue tree growth resulting

in homogeneous nodes equating to similar sound categories,

and eventually through further splitting, vocalization types.

Splitting continues until all vocalizations have been used or

a subsequent split would result in too few cases in a terminal node

(minimum set to five). The minimum number of observations required

in a node for a split to be attempted was set to 10. The initial

tree was cross-validated (V-fold cross validation with 50

subsets) and then pruned until a final decision tree, with the

smallest estimated error and lowest misclassification rate,

was reached.
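The splitting criterion described above can be sketched in Python for a single parameter as follows. This is a simplified illustration of a Gini-based split with a minimum terminal node size, not the rpart implementation used in the study; the function names and example values are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a set of class labels (0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels, min_bucket=5):
    """Threshold on one parameter that minimises the weighted Gini impurity of the child nodes.

    Returns (threshold, weighted_impurity), or None if no split leaves at least
    min_bucket cases in both child nodes.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(min_bucket, n - min_bucket + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical parameter values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or score < best[1]:
            best = (threshold, score)
    return best
```

A full tree repeats this search over every parameter in Table I at each node and keeps the parameter and threshold with the lowest weighted impurity.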

Using the final CART output as a guide, vocalizations

were categorized into broad categories, with the assumption

that each category contained relatively similar vocalization

types. A randomForest analysis (randomForest package in R;

Liaw and Wiener, 2002) was then run on these categories.

randomForest creates a forest of trees that internally calcu-

lates misclassification errors during tree construction. Within

each tree branch, a predefined number of parameters (= 10) is randomly

selected for each branching event, and a number of different

trees (= 1000) are constructed and compared.

The variable importance is then ranked (using the Gini

index) and an overall misclassification estimated for each

sound category. This “out-of-bag” (OOB) error was used to

assess how well the broad categorizations performed.

First, a randomForest analysis was performed only on

those vocalizations selected for the CART analysis (i.e.,

excluding sounds beyond 1.5 km of the receiver and those

under 3 dB SNR). Next, a second randomForest analysis was

performed on the full vocalization dataset. Here all aurally

classified vocalization types were grouped into their respec-

tive categories based on their initial subjective categoriza-

tion. randomForest then assessed suitability of this broad

classification scheme irrespective of the distance of the

vocalization from the receiver (equating to likely differences

in structure) and level above noise.

D. Data analysis

A generalised additive model (GAM) framework was

used to statistically model the response variables using R

software with the MRSEA (Scott-Hayward et al., 2014) and

GEEPACK (Yan and Fine, 2004; Højsgaard et al., 2006) pack-

ages for model fitting and selection. The response variables

were received signal-to-noise level (rSNR) or a representa-

tive measure of sound frequency. A Gaussian distribution

was appropriate for all response models. Covariates in all

response models were SLrms, a measure of frequency and/or

a measure of duration, to account for differences in signal

propagation due to source level and signal structure. These

covariates were considered as one-dimensional smooth

terms.

A complex region spatial smoother (CReSS) (Scott-

Hayward et al., 2014) was then used to fit a two-dimensional

smooth surface to the interaction between broadband wind-

dominated noise (x) and the distance from the receiver (y).

This procedure statistically models the two-dimensional (x

and y) surface using a spatially adaptive local smoothing

algorithm (SALSA) (Walker et al., 2011). In short, the 2D

surface contains a number of knots, being sources of flexibil-

ity of the surface which can raise or lower the surface

according to the relationship between x and y. A knot-

selection process (SALSA) is used to select the position of

the knots and a smoothing method (CReSS) is used to

manipulate the flexibility of the surface. The Bayesian

information criterion was used to select the number and

location of knots.

This surface model was then used as one of the covari-

ates within the analysis, where each measured signal param-

eter now had an associated integrated measure of broadband

wind-noise and distance of the signaling group from the

receiver. Model selection of covariates (SLrms, frequency or

duration measures, 2D surface) used a five-fold cross-validation

(CV) procedure, where a smaller CV score indicated a

better fitting model. The final optimal model included the

combination of covariates which best explained the variation

in response data. This final model was rerun in a generalized

estimating equation (GEE) (Hardin, 2005) with “Signal ID”

as the panel structure, as each signal was measured up to three

times (on different hydrophones at different distances

from the signaling group). Predictions were then made

and displayed as figures. These figures illustrate the inte-

grated relationship between measured signal parameters at

the receiver, the distance of the receiver to the source (sig-

naling group), and the broadband wind-noise, whilst control-

ling for variation due to differences in source level and

frequency content.
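
The covariate-selection step can be illustrated with a minimal five-fold cross-validation sketch. It assumes ordinary least squares in place of the smooth terms and the CReSS surface, and it scores each candidate covariate set by mean squared prediction error; the scoring function actually used in the study is not stated. The data, variable names, and effect sizes are hypothetical and serve only to show the mechanics (a smaller CV score indicates the better model).

```python
import numpy as np


def kfold_cv_mse(X, y, k=5, seed=0):
    """Mean squared prediction error of a least-squares fit under k-fold CV."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Add an intercept column to the training and held-out design matrices.
        X_train = np.column_stack([np.ones(len(train)), X[train]])
        X_test = np.column_stack([np.ones(len(test)), X[test]])
        beta, *_ = np.linalg.lstsq(X_train, y[train], rcond=None)
        errors.append(np.mean((y[test] - X_test @ beta) ** 2))
    return float(np.mean(errors))


# Hypothetical synthetic data (not the study's measurements).
rng = np.random.default_rng(1)
n = 300
source_level = rng.normal(155.0, 10.0, n)   # dB, illustrative only
distance_km = rng.uniform(0.2, 5.0, n)
rsnr = 0.5 * (source_level - 155.0) - 4.0 * distance_km + rng.normal(0.0, 2.0, n)

candidates = {
    "SL only": np.column_stack([source_level]),
    "SL + distance": np.column_stack([source_level, distance_km]),
}
scores = {name: kfold_cv_mse(X, rsnr) for name, X in candidates.items()}
best = min(scores, key=scores.get)  # smaller CV score = better model
print(scores, best)
```

In this toy setting the model that includes distance is selected because omitting it leaves a large share of the response variation unexplained.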

In the first set of analyses, SLrms and measures of signal

structure (frequency or duration) were included as smooth

terms to control for propagation differences due to level and

structure (aim 1). In the Lombard response analysis (aim 2),

these smooth terms were not included. Here both the SLrms

and the rSNR were statistically modelled as a function of dis-

tance and wind-noise to display the Lombard response

(SLrms) and then to account for the Lombard response on

received signal-to-noise levels (rSNR).

For comparison between the results of the different

models, a nominal rSNR of 0 was used. It is unknown, due to

the lack of information on how humpback whales detect con-

specific signals in noise, whether this equates to the


detection limit for a humpback whale’s hearing in noise.

Currently, there is also no available information on mecha-

nisms of masking release in humpback whales. However, as

a range of rSNRs are presented, the comparison measure can

be updated when more information on hearing and masking

release parameters becomes available.

For the behavioral analysis (aim 3), each signaling

group was annotated with three behavioral variables at the

beginning of every 10 min time period (beginning with the

first audible social sound) until signaling ceased. These

were:

(1) the number of other groups within a 5 km radius of the

signaling group,

(2) the distance of each of the other groups from the signal-

ing group, noting if it was the nearest neighbor, 2nd

nearest neighbor and so on and,

(3) interaction state; where a 1 was given if the group was

actively approaching or was being approached by

another group to result in them joining together or com-

ing within 500 m of each other (otherwise 0).

The probability of an interaction occurring (1 or 0) was

modelled using a GLM (binomial regression model) with vari-

ables 1 and 2 (above) and wind-dominated noise levels as pre-

dictor variables. The “behavioral” communication distance

was then determined by comparing the range of neighbor dis-

tances from the signaling group in which group interactions

were most likely to occur, with the range of neighbor distan-

ces in which interactions were not likely to occur.
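
A binomial regression of this kind can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis: the data are synthetic, only the nearest-neighbor distance is used as the distance predictor, the noise level is centered on 100 dB (the median level reported in the Results), and the fit uses a plain Newton-Raphson (IRLS) loop written in NumPy rather than a statistics package.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit a binomial GLM with a logit link by Newton-Raphson (IRLS)."""
    X1 = np.column_stack([np.ones(len(y)), X])     # prepend intercept column
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X1 @ beta)))     # fitted probabilities
        w = p * (1.0 - p)                          # binomial variance weights
        hessian = X1.T @ (X1 * w[:, None])
        beta = beta + np.linalg.solve(hessian, X1.T @ (y - p))
    return beta


def predict_probability(beta, row):
    """Predicted probability of an interaction for one covariate row."""
    return 1.0 / (1.0 + np.exp(-(beta[0] + np.dot(beta[1:], row))))


# Hypothetical synthetic data (effect sizes are invented for illustration).
rng = np.random.default_rng(42)
n = 400
nearest_km = rng.uniform(0.3, 4.8, n)              # nearest-neighbor distance
n_groups = rng.integers(1, 5, n).astype(float)     # groups within 5 km (1 to 4)
noise_db = rng.uniform(91.0, 113.0, n) - 100.0     # centered noise level
true_logit = 2.0 - 1.2 * nearest_km + 0.3 * n_groups
interaction = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

X = np.column_stack([nearest_km, n_groups, noise_db])
beta = fit_logistic(X, interaction)

# Compare predicted interaction probability at 1 km and 4 km, other terms fixed.
p_near = predict_probability(beta, np.array([1.0, 2.0, 0.0]))
p_far = predict_probability(beta, np.array([4.0, 2.0, 0.0]))
print(beta, p_near, p_far)
```

Comparing predicted probabilities across neighbor distances, as in the last lines, mirrors how a "behavioral" communication distance could be read from the fitted model.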

III. RESULTS

A. Sound categorization

Signals were initially qualitatively (aurally and visually

using the spectrogram) separated into 31 different types

(based on Dunlop et al., 2007) including two types of

surface-generated sound. The initial CART analysis (n = 449

sounds with 29 vocalization types) separated the vocaliza-

tions into 10 different types (Fig. 1). The 1st split separated

vocalizations according to duration (TQ3 less than or greater

than approximately 0.5 s) and the 2nd split produced four

broad categories:

(1) long (TIC ≥ 0.84 s), low-frequency (log FC10

≥ 2.1 Hz) vocalizations, referred to as “long-low”

(containing low-frequency unmodulated “grumbles,”

broadband “screeches,” “growls,” and “purrs” and likely

blow-hole associated sounds; see Dunlop et al., 2007 for

spectrograms),

(2) mid-length (TIC < 0.84 s), low-frequency vocalizations,

referred to as “mid-low” (containing frequency-modulated

“moans,” “groans,” “wops,” and “thwops”; Dunlop et al., 2007),

(3) short and low-frequency (log FC < 2.2 Hz) referred to as

“short-low” vocalizations (containing short low-frequency

“snorts,” “grunts,” and “barks”; Dunlop et al., 2007), and

(4) short high-frequency (log FC > 2.2 Hz) referred to as

“short-high” vocalizations (containing “yaps” and

high-frequency “squeaks” and “barks”; Dunlop et al., 2007).

Each vocalization was assigned to one of the four catego-

ries, according to which branch the majority of the sounds

within each type were grouped into (e.g., “grumbles,” despite

appearing in two categories, were assigned to category 1).

Surface-generated sounds were placed into a separate category.

A randomForest analysis was run using these four categories

(Table II), and the out-of-bag estimate of the error rate was

4.69%, indicating low misclassification rates within all categories.

All measured vocalizations (n = 2010) were then classified,

according to their original subjective classification, into

one of the four categories and the randomForest analysis was

re-run (Table II). The out-of-bag estimate of the error rate was 0.3%,

and the use of these four broad categories was therefore

considered appropriate.
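
As a worked check, the two reported error rates can be recomputed from the counts in Table II, assuming the table is the out-of-bag confusion matrix (rows are the aural classification, columns the randomForest classification). The row totals in the table sum to 448 and 2006 sounds for the initial and full samples, respectively, and those totals are what the sketch uses.

```python
import numpy as np

# Counts copied from Table II; rows and columns are ordered
# long-low, mid-low, short-low, short-high.
cart_subset = np.array([
    [114,   2,   7,  2],
    [  2, 104,   3,  0],
    [  2,   2, 150,  1],
    [  0,   0,   0, 59],
])
full_sample = np.array([
    [487,   1,   2,   0],
    [  0, 660,   0,   0],
    [  0,   0, 648,   0],
    [  0,   1,   2, 205],
])


def error_rate(confusion):
    """Proportion of sounds falling off the diagonal of a confusion matrix."""
    total = confusion.sum()
    correct = np.trace(confusion)
    return (total - correct) / total


cart_error = error_rate(cart_subset)   # 21 of 448 sounds misclassified
full_error = error_rate(full_sample)   # 6 of 2006 sounds misclassified
print(f"{100 * cart_error:.2f}%  {100 * full_error:.2f}%")
```

Both values agree with the error rates quoted in the text after rounding.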

B. The combined effect of distance and wind-noise on received signal-to-noise

Signals within the various categories were recorded out
to approximately 3.5 to 5 km from the array (depending on
the category) in wind-dominated noise levels ranging from
91 to 113 dB re 1 μPa (equating to wind speeds of 7 to
20 kn). Low wind-dominated noise levels were considered to
be less than 95 dB re 1 μPa, median levels were 100 dB re 1 μPa
(wind speed of approximately 12 kn) and high levels
were considered to be over 105 dB re 1 μPa.

FIG. 1. The output of the CART analysis showing the frequency or temporal
parameter used at each split (squares), the cut-off value used for each
parameter, and the resulting ten different sound types (circled). The second
CART split produced four broad categories (named by the most common
sound within this category; “grumbles,” “wops,” “snorts,” and “yaps”),
re-labelled “long-low,” “mid-low,” “short-low,” and “short-high.”

TABLE II. The number of vocalizations classified by the RANDOMFOREST
analysis into one of the four categories. The left column is the number of
aurally-classified vocalizations within each category. The four right hand
columns are the number of vocalizations classified by the RANDOMFOREST
analysis into each category. Numbers in normal font (rows marked "initial")
are sounds used in the initial CART analysis. Numbers in italics (rows marked
"full") are the full sample size (i.e., including all sounds regardless of
distance from the receiver, SNR, and hydrophone-buoy the recording came from).

Aural category (initial, full)   Sample    Long-low   Mid-modulated   Short-low   Short-high
Long-low (125, 490)              initial   114        2               7           2
                                 full      487        1               2           0
Mid-low (109, 660)               initial   2          104             3           0
                                 full      0          660             0           0
Short-low (155, 648)             initial   2          2               150         1
                                 full      0          0               648         0
Short-high (59, 208)             initial   0          0               0           59
                                 full      0          1               2           205

Within all four vocalization categories, there was a sig-

nificant relationship between the rSNR, wind noise, and dis-

tance from the receiver. The final statistical model outputs are

presented below, with the measured response variable on the

left, and various significant predictor variables (including the

combined effect of broadband wind noise and distance from

the receiver with degrees of freedom, d.f.) on the right. The

figures (Fig. 2) display the relationship between the distance

and wind noise variables (x and y) and the response variable

(rSNR) whilst controlling for source level and frequency. The

final model outputs are a representation of how the signals

would be received according to signaler distance combined

with wind noise:

(1) “Long-low”: rSNR ~ s(wind noise, distance, d.f. = 6) + SLrms + log FQ3,

(2) “Mid-low”: rSNR ~ s(wind noise, distance, d.f. = 6) + SLrms + log FC10,

(3) “Short-low”: rSNR ~ s(wind noise, distance, d.f. = 6) + SLrms + log FC90,

(4) “Short-high”: rSNR ~ s(wind noise, distance, d.f. = 4) + SLrms + log FC10.

The rSNR remained above 0 out to, and potentially

beyond, 4 km in “long-low” [490 observations of 170 vocal-

izations; Fig. 2(a)], “mid-low” [660 observations of 226

vocalizations; Fig. 2(b)] and “short-low” vocalizations [648

observations of 226 vocalizations; Fig. 2(c)] in low wind-

noise. In other words, the receiving whale is likely to hear

these vocalizations from signalers at least 4 km away. These

categories had similar source level ranges (“long-low”

sounds ranged in SLrms from 131 to 188 dB re 1 μPa @ 1 m,

“mid-low” from 125 to 195 dB re 1 μPa @ 1 m, and

“short-low” from 127 to 179 dB re 1 μPa @ 1 m).

“Short-high” vocalizations [208 observations of 84 vocalizations;

Fig. 2(d)] were the least common category and lower in

source level (128 to 167 dB re 1 μPa @ 1 m). The

rSNR remained above 0 until approximately 2.5 km in low

wind-noise conditions, suggesting poorer propagation of

these higher frequency vocalizations after controlling for dif-

ferences in source level.

In median wind-noise levels, the rSNR remained above 0
to approximately 3 km in “long-low,” 4 km in “mid-low” and
“short-low,” and 2.5 km in “short-high” vocalizations
[Figs. 2(a), 2(b), 2(c), and 2(d)].
However, in high wind-noise, the rSNR of “long-low,” “mid-low,”
and “short-low” vocalizations remained above 0 until
approximately 1 km, whereas that of “short-high” vocalizations
remained above 0 until approximately 500 m [Figs. 2(a), 2(b),
2(c), and 2(d)]. This suggests a significant reduction in the
signaler/receiver vocalization communication space due to
increased masking.

FIG. 2. The relationship between the distance of the signaling group from
the receiver (x axis), wind-dominated noise levels (y axis), and measured
rSNR for “long-low” (a), “mid-low” (b), “short-low” (c), “short-high” (d),
and surface-generated (e) categories. The SLrms for each category was set at
the mean (154, 159, 154, 150, and 161 dB re 1 μPa at 1 m for the five
categories, respectively) to standardize for source level. The representative
frequency measures used were log FQ3 [set at 2.30 Hz (a)], log FC10 [set at
2.05 Hz (b)], log FC90 [set at 1.97 Hz (c)], log FC10 [set at 2.60 Hz (d)],
and the log FC [set at 2.24 Hz (e)] to control for the effects of frequency
on the sound propagation.

Surface-generated signals [136 observations of 46

sounds; Fig. 2(e)] ranged in SLrms from 145 to 177 dB re 1 μPa @ 1 m.

Although these signals also significantly

reduced in rSNR with increased signaler distance and wind-noise,

they were of a higher rSNR compared to vocalizations

in all conditions (Fig. 2). Even in high wind-noise conditions,

these signals remained above 0 rSNR until at least

3 km from the signaler, suggesting these sounds were less likely

to be masked in higher noise [Fig. 2(e)].

C. The combined effect of receiver distance and wind-noise on received signal frequency

The full dataset (n = 2006 observations of 705 vocalizations

and 134 observations of 81 surface-generated signals)

was used for this analysis. “Short-low,” “mid-low,” and

“long-low” categories were combined given these were of a

similar frequency range (i.e., all low-frequency vocaliza-

tions; Fig. 1). The frequency measure chosen was a measure

of approximate minimum (FC10) and a time variable was

included as a smoother to standardize for changes in signal

duration with frequency (TQ3 ranged from 0.03 to 11.80 s

with a mean of 0.89 s). Vocalizations in the “short-high” cat-

egory were analyzed separately using a measure of approxi-

mate maximum frequency (FC90) and standardized for

duration (TQ3 ranged from 0.03 to 2.10 s with a mean of

0.25 s). The peak frequency (FP) of surface-generated signals

was used as a measure of frequency content. Source level

was also standardized in all statistical models as before.

The log FC10 of low-frequency vocalizations remained

relatively stable out to approximately 2 km from the

signaler in median wind-noise then progressively disappeared

in increased noise and with greater separation distances. The

high-frequency components of “short-high” vocalizations

were also progressively lost at distances greater than 500 m

and wind noise levels above median (Fig. 3). Note that there were

insufficient data at close distances and in low noise; the outputs

within 500 m and below 100 dB re 1 μPa are therefore model

predictions. In contrast, the FP of surface-generated sounds,

even in high wind-noise, remained stable out to approxi-

mately 3 km from the signaler (Fig. 3).

(1) “Low-frequency”: log FC10 ~ s(wind noise, distance, d.f. = 5) + SLrms + TQ3,

(2) “High-frequency”: log FC90 ~ s(wind noise, distance, d.f. = 5) + SLrms + TQ3,

(3) “Surface-generated”: log FP ~ s(wind noise, distance, d.f. = 5) + SLrms + TQ3.

D. Including the vocal Lombard effect

The Lombard hypothesis (an animal will increase its

vocal level in response to increasing noise levels; Lombard,

1911) can be measured from an acoustic array. There are two

problems with this approach. The first is that the further the

source is from the array, the more likely it is that lower source

level sounds will be missed in the recordings. The second is

that increased background noise will increase the proportion

of lower source level sounds missed, thus biasing the data to

higher source levels in higher noise. These biases are clearly

shown in Fig. 4. A previous study accounted for these issues

and found groups to maintain about a 60 dB excess above

wind noise, though not in the highest noise levels (Dunlop

et al., 2014). The current study measured signal-to-noise in a

different way, but found that, at 1 km from the receiver, where

the effects of the biases are minimal, the measured signal

excess at the signaling group remained at 50 dB above noise

up to approximately 105 dB re 1 μPa of noise [Fig. 4(a)].

This signaler Lombard response resulted in vocalization

rSNRs of 5 to 10 dB above noise from signalers within 3 km,

and just above 0 from signalers out to 4 km [Fig. 4(b)]. In

high wind-noise, however, the vocal Lombard response in sig-

naling whales did not compensate for the masking effect of

increased wind noise. Although there is a bias in measuring
the Lombard response, in that only the louder sounds will be
recorded, this bias could be considered to be a representation
of the reception of the signals by a whale receiver, in that
a whale receiver will only hear louder signals in higher noise.

(1) SLrms ~ s(wind noise, distance, d.f. = 4),

(2) rSNR ~ s(wind noise, distance, d.f. = 4).

FIG. 3. The relationship between the distance of the signaling group from the
receiver (x axis; note the different ranges), wind-dominated noise levels (y
axis), and measured approximate minimum (FC10) for “long-low” (a), approximate
maximum (FC90) for “short-high” (b) sound categories and peak frequency
(FP) for surface-generated sounds (c). The SLrms for each sound category was
set at the mean (156, 149, and 161 dB re 1 μPa at 1 m for the three categories,
respectively) to standardize for source level. The representative temporal
measure was TQ3 [set at 0.89, 0.26, and 0.20 s for (a), (b), and (c), respectively] to
control for the effects of duration on the frequency propagation.

E. Behavioral responses within the communication network

Social sounds were not recorded from signaling

groups beyond 5 km from the array (the detection limits of

the system) therefore the number of potential receivers

was counted within 5 km from the signaling group. This

ranged between 1 and 4 (with an average of 2 to 3) at the

time of signaling. The distance of the nearest neighbor

was 300 m to 4.8 km from the signaling group (depending

on the number of other groups in the area), the 2nd and

3rd neighbors were 1 km to 5 km, and the fourth neighbor

was always beyond 3 km.

Signaling groups, if joining with another group (n = 6),

always joined the nearest neighbor. The separation between

two joining groups at the beginning of the interaction ranged

from 500 m to 2.4 km in wind-dominated noise levels up to

101 dB re 1 μPa. One of the groups also joined the 2nd nearest

neighbor, which approached it from 4.1 km away. Three

additional signaling groups were approached by the nearest

neighbor from approximately 1.5 km away, but did not

join.

The probability of a signaling group interacting with

another group was significantly dependent on the distance of

the nearest neighbor [p = 0.03; Fig. 5(a)] and the total number

of other groups (other potential receivers) within the

5 km radius [p = 0.005; Fig. 5(b)] but not the wind-dominated

noise level or the distance of the other (2nd, 3rd,

or 4th) groups in the area. Specifically, signaling groups

were more likely to join with another group if the nearest

neighbor was within 2 km when signaling and unlikely to

join if they were more than 2.5 km from their nearest

neighbor [Fig. 5(a)]. Signaling groups were more likely to have a

close-by neighbor as the number of groups in the area

increased and therefore more likely to interact [Fig. 5(b)].

These observations suggest the signals are audible out

to at least 4 km, where the received level above noise (as

measured in this study) of low-frequency vocalizations was

approximately 0 in median wind conditions. However, inter-

actions between groups within this communication space

were more likely to occur within 2 km of the signaling

group. Here, received signal-above-noise levels were +5 to

+10 dB for low-frequency vocalizations, with no significant

loss of low-frequency components. For high-frequency

vocalizations, received signal-above-noise levels were just

above 0 but with some loss of high-frequency components.

IV. DISCUSSION

The received levels and frequency components of each
measured social sound category varied significantly with
increasing levels of wind-dominated noise and separation
distance between the signaling group and receiver. In relatively
low wind-noise (less than 95 dB re 1 μPa), the signal-to-noise
of low-frequency vocalizations remained above 0,
with no loss of low-frequency components, out to at least 4 km.
High-frequency vocalizations, however, had poor propagation
in that, even in low wind-noise, the signal-to-noise
remained above 0 only out to 2.5 km, with some loss of
high-frequency components. The behavioral data suggest the
signals were audible out to at least 4 km (likely the
low-frequency vocalizations and surface-generated signals) but
group interactions were more likely to occur within 2 km,
where the higher-frequency signals were more likely to
become audible. In high wind-noise (above 105 dB re 1 μPa)
conditions, however, the vocalization communication space was
likely to be reduced. Low-frequency vocalizations remained
above 0 SNR to only 1 km from the receiver, and
high-frequency vocalizations to less than 1 km, without a Lombard
response. When accounting for the Lombard response, the
measured rSNR of vocalizations was almost constant up to,
but not above, noise levels of 105 dB re 1 μPa. Frequency
components also changed, meaning the received signal was
likely progressively more distorted as the wind-noise
increased, resulting in a loss of information such as signaler
motivation. However, surface-generated signals, in higher
wind-noise, remained above 0 rSNR out to at least 3 km and
with a constant peak frequency, meaning they are likely to be
audible and less prone to distortion. A previous study found
that humpback whales switch to using primarily
surface-generated signals in higher wind-noise (Dunlop et al., 2010)
and the results of this study suggest this switch would
maintain their communication space in high wind-noise
conditions.

FIG. 4. The relationship between the distance of the signaling group from
the receiver (x axis), wind-dominated noise levels (y axis), and measured
SLrms (a) and rSNR (b) for vocal sounds.

FIG. 5. Boxplot (displaying the median, inter-quartile ranges and minimum/
maximum of the distributions) comparing the distance of the nearest neighbor
(a) and number of groups within 5 km (b) between non-interacting and
interacting signaling groups.

It should be noted that this measure of signal-to-noise

(rSNR) does not assume audibility of signal in noise, in that

an rSNR of 0 does not necessarily mean the signal becomes

inaudible to the whale receivers. As yet, there are no empiri-

cal measures of humpback hearing in noise and an rSNR of 0

(as measured in this study) was chosen as a level with which

to compare the performance of different sound categories.

These models can be updated when more information on

humpback whale hearing becomes available. The choice of

the signal and noise measurement used in this study was

based on what was considered to be a reasonable approxima-

tion of how whales might hear their own signals in noise.

The received level was measured in the 1/3 octave band, as

most of the energy of each social sound type is contained

within the 1/3 octave band and is presumably what a hump-

back whale receiver would hear. The noise bandwidth was

chosen because almost all the energy in the vocalizations (if

taking the social vocalization repertoire as a whole and not-

ing that signals can occur in bouts containing multiple signal

types; Rekdahl et al., 2015) lies within this band, meaning

this is presumably the noise band the whales are listening to.

This bandwidth also encompassed the peak frequencies for

wind-dominated noise.

Critical ratios for marine mammals are summarised by

Richardson et al. (1995), Southall et al. (2007), and Erbe

et al. (2016), where the values at 2 kHz range from approxi-

mately 19 to 26 dB. An earlier humpback whale playback

study at this study site, using this population of whales,

played 2 kHz tones to groups of migrating humpback whales

and measured their behavioral response. The masked threshold

of the 2 kHz tone stimulus was assumed to be −7.7 to

−0.7 dB (tonal signal measured in 1/3 octave noise; Dunlop

et al., 2013c), which was equivalent to a critical ratio of

19–26 dB. Critical ratios, however, are measured for tonal

signals and there do not appear to be measurements applicable

to signals like social sounds, where the energy is dis-

tributed over multiple frequencies. The peak frequencies of

social vocalizations varied from 43 Hz to 2.8 kHz with the

peak energy for the majority of signals below 150 Hz. Most

of the available data on critical ratios at low frequencies

comes from pinniped species, and is approximately 15 dB at

100 Hz and 10 to 18 dB at 200 Hz (summarised in Erbe

et al., 2016, using tone signals). This would be equivalent to

1 to −2 dB for 100 and 200 Hz tones, respectively, if measured

in 1/3 octave noise. Given the lack of information for

baleen whale hearing of their own signals in noise, and fol-

lowing the argument above, 0 dB rSNR (as measured in this

study) was assumed to be close to the audible limits of the

receiver.
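
The conversion from a critical ratio to a signal-to-noise ratio in 1/3 octave noise can be sketched as follows. The sketch assumes the usual definition of the critical ratio (tone level minus noise spectrum level at the masked threshold) and an exact base-2 1/3 octave band, whose bandwidth is the centre frequency multiplied by 2^(1/6) − 2^(−1/6), roughly 23% of the centre frequency. The function names are illustrative, and the paper does not state which band definition was used.

```python
import math


def third_octave_bandwidth_hz(centre_hz):
    """Bandwidth of a base-2 1/3 octave band centred on centre_hz."""
    return centre_hz * (2 ** (1 / 6) - 2 ** (-1 / 6))


def critical_ratio_to_band_snr_db(critical_ratio_db, centre_hz):
    """Tone level relative to the 1/3 octave noise band level, in dB.

    The band level exceeds the spectrum level by 10*log10(bandwidth),
    so that amount is subtracted from the critical ratio.
    """
    return critical_ratio_db - 10 * math.log10(third_octave_bandwidth_hz(centre_hz))


# Critical ratios of 19 and 26 dB at 2 kHz, as quoted in the text.
low_2k = critical_ratio_to_band_snr_db(19, 2000)
high_2k = critical_ratio_to_band_snr_db(26, 2000)

# Pinniped critical ratio of approximately 15 dB at 100 Hz.
snr_100 = critical_ratio_to_band_snr_db(15, 100)

print(f"{low_2k:.1f} dB, {high_2k:.1f} dB, {snr_100:.1f} dB")
```

Under these assumptions the 2 kHz values reproduce the −7.7 to −0.7 dB range quoted above, and the 100 Hz value comes out a little above 1 dB.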

Behavioral evidence also suggests 0 dB rSNR may be

a reasonable value to use to estimate the size of the communication

space for humpback whales in this environment. First,

the switch to surface-generated sounds occurred in higher

wind-noise (Dunlop et al., 2010) when vocalization

received levels above noise fell below 0 SNR. Second, the

Lombard response resulted in vocalizations remaining

above 0 rSNR until 4 km in noise levels up to 105 dB re 1 μPa,

after which surface-generated signals, which propagated

further, were more commonly used for lower frequency

sounds. This agrees somewhat with estimates by

Cholewiak et al. (2018), where masking levels were modelled

for whale receivers and an agent-based model was used to

calculate changes in communication. Here the communication

range of humpback social sounds in ambient noise was estimated

to be 3.4 ± 1.6 km. Third, in median wind-noise,

the rSNR and signal structure of high-frequency vocaliza-

tions was relatively unchanged out to approximately 2 km.

Interactions between groups were significantly more likely

to occur within this separation distance suggesting these

higher frequency sounds may function to mediate groups

joining together.

Fin and blue whales (Balaenoptera spp.) produce ste-

reotyped and redundant songs as a way to increase the

communication range of their signals in a noisy underwa-

ter environment (Clark et al., 2009). In addition, the fre-

quency range and peak frequencies of fin and blue whale

song from shallow water coastal environments are higher

than song from pelagic environments to compensate for

differences in sound propagation (Clark and Ellison,

2004). Humpback whale social sounds, however, are not

structurally stereotyped and modifying the frequency con-

tent may change the signaler message (Dunlop, 2017). The

analyses presented here suggest low-frequency vocal sig-

nals remain above noise out to approximately 4 km and up

to wind noise levels of 105 dB re 1 μPa. In higher noise,

surface-generated signals likely function to maintain this

space. Behavioral interactions were more likely to occur within

2 km where higher frequency signals remained above

noise. Humpback whales neither increase their source lev-

els, nor switch to surface-generated signals, in the presence

of traversing fishing vessels (Dunlop, 2016b). The received

vessel noise levels were, in some instances, well above

high wind-noise levels. Given that this study has provided

masking models (in terms of changes in SNR and fre-

quency content with noise and distance), the next step will

be to model changes in this parameter with increasing vessel

noise. It is likely that signal masking will occur at closer

distances than in wind noise, more so if humpback whales

do not compensate. These

models will not account for any release from masking

mechanisms, given there is currently no available data.

They can, however, be updated as research in this area

progresses.

ACKNOWLEDGMENTS

The author would like to thank everyone involved in the

Humpback Acoustic Research Collaboration (HARC)

(funded by the U.S. Office of Naval Research and the

Australian Antarctic Division), in particular the numerous

volunteers who donated their time and energy to this project.

The author also thanks David Paton for his invaluable field

expertise and Eric Kniest for his continued support in the

development of CYCLOPES. The author would particularly like

to acknowledge Associate Professor Michael Noad for

leading the HARC work (without which, this study would

not have been possible) and Dr. Douglas Cato for his

continued support and mentorship.

American National Standards Institute (2008). Bioacoustical Terminology (ANSI S3.20-1995, R 2008) (Acoustical Society of America, New York,

NY).

Aquiloni, L., Buric, M., and Gherardi, F. (2008). “Crayfish females eavesdrop

on fighting males before choosing the dominant mate,” Curr. Biol. 18,

R462–R463.

Au, W. W. L., Pack, A. A., Lammers, M. O., Herman, L. M., Deakos, M.

H., and Andrews, K. (2006). “Acoustic properties of humpback whale

songs,” J. Acoust. Soc. Am. 120, 1103–1110.

Breiman, L. (1996). “Technical note: Some properties of splitting criteria,”

Mach. Learn. 24, 41–47.

Brenowitz, E. A. (1982). “Long-range communication of species identity by

song in the red-winged blackbird,” Behav. Ecol. Sociobiol. 10, 29–38.

Cato, D. H. (1997). “Features of ambient noise in shallow water,” in

Shallow Water Acoustics, edited by R. Zhang and J. Zhou (China Ocean

Press, Beijing, China), pp. 385–390.

Cholewiak, D., Clark, C. W., Ponirakis, D., Frankel, A., Hatch, L. T., Risch,

D., Stanistreet, J. E., Thompson, M., Vu, E., and Van Parijs, S. (2018).

“Communicating amidst the noise: Modeling the aggregate influence of

ambient and vessel noise on baleen whale communication space in a

national marine sanctuary,” Endang. Spec. Res. 36, 59–75.

Chu, K., and Harcourt, P. (1986). “Behavioral correlations with aberrant pat-

terns in humpback whale songs,” Behav. Ecol. Sociobiol. 19, 309–312.

Clark, C. W., and Ellison, W. T. (2004). “Potential use of low-frequency

sounds by baleen whales for probing the environment: Evidence from

models and empirical measurements,” in Advances in the Study of Echolocation in Bats and Dolphins, edited by J. A. Thomas and R. A.

Kastelein (Plenum, New York), pp. 564–589.

Clark, C. W., Ellison, W. T., Southall, B. L., Hatch, L., Van Parijs, S. M.,

Frankel, A., and Ponirakis, D. (2009). “Acoustic masking in marine eco-

systems: Intuitions, analysis, and implication,” Marine Ecol. Prog. Ser.

395, 201–222.

Clark, D. L., Zeeff, C. K., Sabovodny, G., Hollenberg, A., Roberts, J. A.,

and Uetz, G. W. (2015). “The role of social experience in eavesdropping

by male wolf spiders (Lycosidae),” Anim. Behav. 106, 89–97.

Darling, J. D., and Bérubé, M. (2001). “Interactions of singing humpback

whales with other males,” Mar. Mammal Sci. 17, 570–584.

Darling, J. D., Jones, M. E., and Nicklin, C. P. (2006). “Humpback whale

songs: Do they organize males during the breeding season?,” Behaviour

143, 1051–1101.

Dooling, R. J., and Blumenrath, S. H. (2016). “Masking Experiments in

Humans and Birds Using Anthropogenic Noises,” Eff. Noise Aquat. Life

II 875, 239–243.

Doutrelant, C., and McGregor, P. K. (2000). “Eavesdropping and mate

choice in female fighting fish,” Behaviour 137, 1655–1669.

Dunlop, R. A. (2016a). “Changes in vocal parameters with social context in

humpback whales: Considering the effect of bystanders,” Behav. Ecol.

Sociobiol. 70, 857–870.

Dunlop, R. A. (2016b). “The effect of vessel noise on humpback whale,

Megaptera novaeangliae, communication behaviour,” Anim. Behav. 111,

13–21.

Dunlop, R. A. (2017). “Potential motivational information encoded within

humpback whale non-song vocal sounds,” J. Acoust. Soc. Am. 141,

2204–2213.

Dunlop, R. A., Cato, D. H., and Noad, M. J. (2008). “Non-song acoustic

communication in migrating humpback whales (Megaptera novaeangliae),” Mar. Mammal Sci. 24, 613–629.

Dunlop, R. A., Cato, D. H., and Noad, M. J. (2010). “Your attention please:

Increasing ambient noise levels elicits a change in communication behav-

iour in humpback whales (Megaptera novaeangliae),” Proc. R. Soc. B

277, 2521–2529.

Dunlop, R. A., Cato, D. H., and Noad, M. J. (2014). “Evidence of a Lombard

response in migrating humpback whales (Megaptera novaeangliae),”

J. Acoust. Soc. Am. 136, 430–437.

Dunlop, R. A., Cato, D. H., Noad, M. J., and Stokes, D. M. (2013a). “Source

levels of social sounds in migrating humpback whales (Megaptera novaeangliae),” J. Acoust. Soc. Am. 134, 706–714.

Dunlop, R. A., and Noad, M. J. (2016). “The ‘risky’ business of singing:

Tactical use of song during joining by male humpback whales,” Behav.

Ecol. Sociobiol. 70, 2149–2160.

Dunlop, R. A., Noad, M., and Cato, D. (2013b). “Modification of humpback

whale social sound repertoire and vocal source levels with increased

noise,” J. Acoust. Soc. Am. 133(5), 3535.

Dunlop, R. A., Noad, M. J., Cato, D. H., Kniest, E., Miller, P. J. O., Smith,

J. N., and Stokes, M. D. (2013c). “Multivariate analysis of behavioural

response experiments in humpback whales (Megaptera novaeangliae),”

J. Exp. Biol. 216, 759–770.

Dunlop, R. A., Noad, M. J., Cato, D. H., and Stokes, D. (2007). “The

social vocalization repertoire of east Australian migrating humpback

whales (Megaptera novaeangliae),” J. Acoust. Soc. Am. 122,

2893–2905.

Erbe, C., Reichmuth, C., Cunningham, K., Lucke, K., and Dooling, R.

(2016). “Communication masking in marine mammals: A review and

research strategy,” Mar. Pollut. Bull. 103, 15–38.

Evans, E. F. (1992). “Auditory processing of complex sounds—An over-

view,” Philos. Trans. R. Soc. London Ser. B 336, 295–306.

Fournet, M. E., Szabo, A., and Mellinger, D. K. (2015). “Repertoire and

classification of non-song calls in Southeast Alaskan humpback whales

(Megaptera novaeangliae),” J. Acoust. Soc. Am. 137, 1–10.

Francois, R. E., and Garrison, G. R. (1982). “Sound-absorption based on

ocean measurements. 2. Boric-acid contribution and equation for total

absorption,” J. Acoust. Soc. Am. 72, 1879–1890.

Frankel, A. S., Clark, C. W., Herman, L. M., and Gabriele, C. M. (1995).

“Spatial-distribution, habitat utilization, and social interactions of hump-

back whales, Megaptera-novaeangliae, off Hawaii, determined using

acoustic and visual techniques,” Can. J. Zool. 73, 1134–1146.

Franklin, C. A., Thelin, J. W., Nabelek, A. K., and Burchfield, S. B. (2006).

“The effect of speech presentation level on acceptance of background

noise in listeners with normal hearing,” J. Am. Acad. Audiol. 17,

141–146.

Garland, E. C., Lilley, M. S., Goldizen, A. W., Rekdahl, M. L.,

Garrigue, C., and Noad, M. J. (2012). “Improved versions of the

Levenshtein distance method for comparing sequence information in

animals’ vocalisations: Tests using humpback whale song,” Behaviour

149, 1413–1441.

Hardin, J. W. (2005). “Generalized estimating equations (GEE),” in

Encyclopedia of Statistics in Behavioral Science (Wiley, 2005).

Højsgaard, S., Halekoh, U., and Yan, J. (2006). “The R package geepack for

generalized estimating equations,” J. Stat. Softw. 15(2), 1–11.

Lombard, E. (1911). “Le signe de l’élévation de la voix,” Ann. Mal. Oreil.

Larynx 37, 101–199.

Janik, V. M. (2000). “Source levels and the estimated active space of bottlenose

dolphin (Tursiops truncatus) whistles in the Moray Firth, Scotland,”

J. Comp. Physiol. A 186, 673–680.

Liaw, A., and Wiener, M. (2002). “Classification and Regression by

randomForest,” R News 2(3), 18–22.

Lohr, B., Wright, T. F., and Dooling, R. J. (2003). “Detection and discrimination

of natural calls in masking noise by birds: Estimating the active

space of a signal,” Anim. Behav. 65, 763–777.

Matos, R. J., Peake, T. M., and McGregor, P. K. (2003). “Timing of presentation

of an audience: Aggressive priming and audience effects in male


displays of Siamese fighting fish (Betta splendens),” Behav. Process. 63,

53–61.

McGregor, P. K. (1993). “Signaling in territorial systems—A context for

individual identification, ranging and eavesdropping,” Philos. Trans. R.

Soc. London Ser. B 340, 237–244.

McGregor, P. K., and Horn, A. G. (2015). “Communication and social

networks,” in Animal Social Networks, edited by J. Krause, R. James, D.

W. Franks, and D. P. Croft (Oxford University Press, Oxford, UK, 2015).

McGregor, P. K., and Krebs, J. R. (1984). “Sound degradation as a distance

cue in great tit (Parus-major) song,” Behav. Ecol. Sociobiol. 16, 49–56.

McGregor, P. K., Peake, T. M., and Lampe, H. M. (2001). “Fighting fish

Betta splendens extract relative information from apparent interactions:

What happens when what you see is not what you get,” Anim. Behav. 62,

1059–1065.

Mellinger, D. K. (2001). “Ishmael 1.0 User’s Guide,” NOAA, Technical

Memorandum OAR PMEL-120.

Mennill, D. J., Ratcliffe, L. M., and Boag, P. T. (2002). “Female

eavesdropping on male song contests in songbirds,” Science 296,

873–873.

Morton, E. S. (1986). “Predictions from the ranging hypothesis for the evolution

of long-distance signals in birds,” Behaviour 99(1/2), 65–86.

Noad, M. J., Cato, D. H., and Stokes, M. D. (2004). “Acoustic tracking of

humpback whales: Measuring interactions with the acoustic environment,”

in Proceedings of Acoustics 2004, Annual Conference of the Australian Acoustical Society, Gold Coast, 3–5 November 2004, pp. 353–358.

Otter, K., McGregor, P. K., Terry, A. M. R., Burford, F. R. L., Peake, T. M.,

and Dabelsteen, T. (1999). “Do female great tits (Parus major) assess

males by eavesdropping? A field study using interactive song playback,”

Proc. R. Soc. London Ser. B 266, 1305–1309.

Parks, S. E., Cusano, D. A., Stimpert, A. K., Weinrich, M. T., Friedlaender,

A. S., and Wiley, D. N. (2014). “Evidence for acoustic communication

among bottom foraging humpback whales,” Sci. Rep. 4, 7508.

Peake, T. M., Terry, A. M. R., McGregor, P. K., and Dabelsteen, T. (2001).

“Male great tits eavesdrop on simulated male-to-male vocal interactions,”

Proc. R. Soc. London Ser. B 268, 1183–1187.

R Core Team (2013). “R: A language and environment for statistical

computing,” R Foundation for Statistical Computing, Vienna, Austria.

Rekdahl, M. L., Dunlop, R. A., Goldizen, A. W., Garland, E. C.,

Biassoni, N., Miller, P., and Noad, M. J. (2015). “Non-song social call

bouts of migrating humpback whales,” J. Acoust. Soc. Am. 137,

3042–3053.

Rekdahl, M. L., Dunlop, R. A., Noad, M. J., and Goldizen, A. W. (2013).

“Temporal stability and change in the social call repertoire of migrating

humpback whales,” J. Acoust. Soc. Am. 133, 1785–1795.

Richardson, W. J., Greene, C. R., Jr., Malme, C. I., and Thomson, D. H.

(1995). Marine Mammals and Noise (Academic Press, San Diego, CA).

Scott-Hayward, L. A. S., MacKenzie, M. L., Donovan, C. R., Walker, C.,

and Ashe, E. (2014). “Complex Region Spatial Smoother (CReSS),”

J. Comput. Graph. Stat. 23(2), 340–360.

Southall, B. L., Bowles, A. E., Ellison, W. T., Finneran, J. J., Gentry, R. L.,

Greene, C. R., Jr., Kastak, D., Ketten, D. R., Miller, J. H., and Nachtigall,

P. E. (2007). “Marine mammal noise exposure criteria: Initial scientific

recommendations,” Aquat. Mammals 33(4).

Silber, G. K. (1986). “The relationship of social vocalizations to surface

behavior and aggression in the Hawaiian humpback whale (Megaptera-novaeangliae),” Can. J. Zool. 64, 2075–2080.

Smith, J. N., Goldizen, A. W., Dunlop, R. A., and Noad, M. J. (2008).

“Songs of male humpback whales, Megaptera novaeangliae, are involved

in intersexual interactions,” Anim. Behav. 76, 467–477.

Stimpert, A. K., Au, W. W. L., Parks, S. E., Hurst, T., and Wiley, D. N.

(2011). “Common humpback whale (Megaptera novaeangliae) sound

types for passive acoustic monitoring,” J. Acoust. Soc. Am. 129, 476–482.

Thompson, P. O., Cummings, W. C., and Kennison, S. J. (1977). “Sound

production of humpback whales, Megaptera-novaeangliae, in Alaskan

waters,” J. Acoust. Soc. Am. 62, S89–S89.

Tyack, P. (1981). “Interactions between singing Hawaiian humpback

whales and conspecifics nearby,” Behav. Ecol. Sociobiol. 8,

105–116.

Walker, C. G., Mackenzie, M. L., Donovan, C. R., and O’Sullivan, M. J.

(2011). “SALSA—A spatially adaptive local smoothing algorithm,”

J. Stat. Comput. Simul. 81(2), 179–191.

Wiley, R. H., and Richards, D. G. (1978). “Physical constraints on acoustic

communication in atmosphere—Implications for evolution of animal

vocalizations,” Behav. Ecol. Sociobiol. 3, 69–94.

Winn, H. E., and Winn, L. K. (1978). “Song of humpback whale

Megaptera-novaeangliae in West-Indies,” Mar. Biol. 47, 97–114.

Yan, J., and Fine, J. P. (2004). “Estimating equations for association

structures,” Stat. Med. 23, 859–880.

