Glossary

Classifiers

All classifiers in BatScope 4.2 have been trained and optimized on the same training set, consisting of 61’700 recorded calls of 30 Swiss bat species. This is an improvement over the training set used in BatScope 4.1.1.

_images/fig-n-calls.png

Figure 8 Total number of sequences and calls per species in the training set. European Bats - WSL 2022 is the new dataset version in BatScope 4.2.

All classifiers are based on the same underlying set of 40 features per bat call. The features include the frequencies and slopes of the call trajectory, the call duration, the peak frequency, percentiles of the energy distribution and other differentiating features.

The most reliable classifier is the SVM, with an accuracy of 89.2%. The seven classifiers performed differently on this training set and also differ in their ability to correctly recognize new bat calls. For an explanation of “accuracy”, “precision” and “sensitivity”, see the table given in Wikipedia.

_images/fig-classification.png

Figure 9 Statistics of 10-fold cross-validated classifications on the training set

Combining the results from different classifiers can improve the classification quality.

_images/fig-multi-classification-2022.png

Figure 10 Statistics of classification when combining the outputs of multiple classifiers. Only calls where the listed classifiers agree are considered; "assignable" is the percentage of sequences that can be classified under this constraint. The top-ranking KKNN - SVM combination is chosen as the default for BatScope, as it offers a good trade-off between accuracy and the number of sequences that remain assignable. Combinations with QDA are excluded, as they consistently perform worse than their counterparts without QDA.
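The agreement rule described in the caption of Figure 10 can be sketched as follows. This is a minimal illustration and not BatScope's implementation: the function names, the species codes and the toy predictions are invented for the example, and for simplicity it works on individual predictions rather than on whole sequences.

```python
from typing import List, Optional, Sequence, Tuple


def combine_by_agreement(votes: Sequence[str]) -> Optional[str]:
    """Return the shared label if every classifier agrees, else None."""
    first = votes[0]
    return first if all(v == first for v in votes) else None


def agreement_stats(per_classifier: List[List[str]],
                    truth: List[str]) -> Tuple[float, float]:
    """Return (assignable fraction, accuracy on the assignable items).

    per_classifier holds one list of predicted labels per classifier,
    all in the same order as `truth`.
    """
    combined = [combine_by_agreement(votes) for votes in zip(*per_classifier)]
    assigned = [(c, t) for c, t in zip(combined, truth) if c is not None]
    assignable = len(assigned) / len(truth)
    accuracy = (sum(c == t for c, t in assigned) / len(assigned)
                if assigned else 0.0)
    return assignable, accuracy


# Toy data with made-up species codes (not taken from the training set).
kknn = ["Ppip", "Nnoc", "Mdau", "Ppip"]
svm = ["Ppip", "Nnoc", "Mmyo", "Ppyg"]
truth = ["Ppip", "Nnoc", "Mdau", "Ppip"]

assignable, accuracy = agreement_stats([kknn, svm], truth)
print(f"assignable: {assignable:.0%}, accuracy: {accuracy:.0%}")
```

In this toy case the two classifiers disagree on half of the items, so only half remain assignable, while accuracy on the remaining items is higher than either classifier achieves alone. This is the trade-off the figure summarizes.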

Note

For cross-validation, sequences were split randomly into test and training sets. This controls for direct overfitting of the model to the distribution of the training set. The training set was carefully constructed to represent the domain of bat calls as well as possible. However, it has to be assumed that the training may be biased towards specific recording equipment, setup, method, location, pre-processing and other hard-to-control variables. It is therefore not guaranteed that the classification performance stated above will be attained on a completely independently recorded dataset.
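A random split at sequence level, as mentioned in this note, could look like the sketch below. It assumes that all calls of one sequence are kept together in the same fold, which is one reading of the note rather than a documented detail; the function name `kfold_by_sequence` and the sequence IDs are hypothetical.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def kfold_by_sequence(sequence_ids: Sequence[str], k: int = 10,
                      seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k folds.

    sequence_ids[i] is the sequence that call i belongs to. Sequences
    are shuffled and dealt round-robin into folds, so calls from one
    sequence never appear on both sides of a split.
    """
    unique = sorted(set(sequence_ids))
    random.Random(seed).shuffle(unique)
    fold_of = {seq: i % k for i, seq in enumerate(unique)}
    for fold in range(k):
        test = [i for i, s in enumerate(sequence_ids) if fold_of[s] == fold]
        train = [i for i, s in enumerate(sequence_ids) if fold_of[s] != fold]
        yield train, test


# Toy example: 6 calls from 3 sequences, split into 3 folds.
calls = ["seq_a", "seq_a", "seq_b", "seq_b", "seq_b", "seq_c"]
folds = list(kfold_by_sequence(calls, k=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Splitting by sequence rather than by call avoids near-duplicate calls from the same recording ending up in both the training and the test set. It does not remove the equipment and location biases the note warns about.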

Confidence Interval Test

After each classification, i.e. each time a classifier or ensemble of classifiers has proposed a most likely species for a call, the three most distinctive features of the call are tested against stored reference values of the proposed species.

These features are: Call duration, peak frequency and bandwidth.

If any of these values lies outside a 95% Confidence Interval of the training set’s average of the proposed species, the field Confidence Interval Test is marked with Fail, indicating a questionable classification result.

You could consider disabling the Status of such a call to improve overall classification accuracy.
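The test logic can be sketched as a simple bounds check. The sketch assumes that the stored reference values are available as a lower and an upper bound per feature; how BatScope derives these bounds is not shown here. The feature keys, the numeric bounds and the "Pass" label are placeholders invented for the example, not real reference data.

```python
from typing import Dict, Tuple

Bounds = Dict[str, Tuple[float, float]]


def confidence_interval_test(call: Dict[str, float], reference: Bounds) -> str:
    """Return "Fail" if any tested feature lies outside its interval."""
    for feature, (low, high) in reference.items():
        if not low <= call[feature] <= high:
            return "Fail"
    return "Pass"


# Placeholder intervals for one hypothetical species (invented numbers).
reference = {
    "duration_ms": (3.0, 8.0),
    "peak_freq_khz": (44.0, 50.0),
    "bandwidth_khz": (5.0, 40.0),
}

typical_call = {"duration_ms": 5.2, "peak_freq_khz": 46.1, "bandwidth_khz": 20.0}
odd_call = {"duration_ms": 5.2, "peak_freq_khz": 38.0, "bandwidth_khz": 20.0}

print(confidence_interval_test(typical_call, reference))
print(confidence_interval_test(odd_call, reference))
```

A single feature outside its interval is enough to fail the test, which matches the "if any of these values" wording above.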

Range Base

For each species in the training base, we have defined its range of occurrence with the help of species distribution data derived from either CSCF (for CH Alphahull) or GBIF (for Europe MCP and Europe Alphahull).

For acknowledgements see BatScope’s About dialog.

Proposed classifications for sequences with known recording location will be tested against the range base set for the project to determine if the classification is realistic. The result is displayed in the classification table as either Pass or Fail together with how far outside the species range the recording was made.
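Conceptually, this is a point-in-polygon test followed by a distance computation for points outside the range. The sketch below illustrates the idea under simplifying assumptions: the species range is a single polygon without holes, and coordinates are planar (for example a projected grid in kilometres) so that Euclidean distance is meaningful. The function names and the square "range" are hypothetical and do not reflect BatScope's internals.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Ray-casting test: count edge crossings of a ray going in +x."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_segment(x: float, y: float, x1: float, y1: float,
                        x2: float, y2: float) -> float:
    """Shortest Euclidean distance from (x, y) to the segment."""
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(x - x1, y - y1)
    # Project the point onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((x - x1) * dx + (y - y1) * dy) / length_sq))
    return math.hypot(x - (x1 + t * dx), y - (y1 + t * dy))


def range_test(x: float, y: float, polygon: List[Point]) -> Tuple[str, float]:
    """Return ("Pass", 0.0) inside the range, else ("Fail", distance)."""
    if point_in_polygon(x, y, polygon):
        return "Pass", 0.0
    n = len(polygon)
    distance = min(
        distance_to_segment(x, y, *polygon[i], *polygon[(i + 1) % n])
        for i in range(n)
    )
    return "Fail", distance


# Hypothetical 10 km x 10 km square range.
species_range = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
print(range_test(5.0, 5.0, species_range))
print(range_test(13.0, 5.0, species_range))
```

For geographic latitude and longitude coordinates, a geodesic distance would be needed instead of the planar distance used here.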

Preferences

  • Confidence Display Threshold - only classifications above this threshold (0.01 = 1%) are shown in the user interface.

  • Verifier Name - the name saved with verifications you add. Different users can store their names in their own user account.

  • Channel - the recording channel during import of wav-files with multiple channels.

  • Map - the map collection ‘leaflet’ provides OpenStreetMap as well as SwissTopo aerial images and maps.

  • Editor Application Path - the filesystem path of your audio editor installation (e.g. Audacity or Raven).

  • Sequence Audio Playback Mode - set the preferred mode of playing back your audio sequences here:
    • UNPROCESSED - play back at the original sampling rate of the recording

    • SLOWED - the sampling rate is slowed down by a user settable factor - this corresponds to Time Expansion Mode

    • FREQUENCY DIVISION - frequencies are divided by a user selectable factor - the temporal pattern of a Sequence will be retained

    • HETERODYNE - the playback functions like a mixing detector - the Mixer Frequency can be set
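The effect of the three processed playback modes on what you hear can be illustrated with a little arithmetic. The sketch below only models the resulting frequencies and durations, not the audio processing itself; the function names and the example values (a 45 kHz call of 5 ms, a factor of 10, a 43 kHz mixer frequency) are chosen for illustration.

```python
def time_expansion(call_khz: float, duration_ms: float, factor: float):
    """SLOWED: frequencies drop by the factor, durations grow by it."""
    return call_khz / factor, duration_ms * factor


def frequency_division(call_khz: float, duration_ms: float, divisor: float):
    """FREQUENCY DIVISION: frequencies drop, timing is unchanged."""
    return call_khz / divisor, duration_ms


def heterodyne(call_khz: float, mixer_khz: float) -> float:
    """HETERODYNE: the audible tone is the difference frequency."""
    return abs(call_khz - mixer_khz)


call_khz, duration_ms = 45.0, 5.0

slowed = time_expansion(call_khz, duration_ms, factor=10)
divided = frequency_division(call_khz, duration_ms, divisor=10)
mixed = heterodyne(call_khz, mixer_khz=43.0)

print("slowed:", slowed)    # (frequency in kHz, duration in ms)
print("divided:", divided)  # (frequency in kHz, duration in ms)
print("heterodyne:", mixed, "kHz")
```

Both SLOWED and FREQUENCY DIVISION bring the call into the audible range, but only frequency division keeps the original rhythm of the sequence. With heterodyne playback only calls near the Mixer Frequency become audible.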

Species Set

A species set is a collection of all data and processors associated with a group of species, such as Swiss bats or Latin American bats. Each species set contains its own Taxonomy, Plugins, Custom Processes, Filters and Statistical Models to assign species to the respective Taxonomy.

For legacy reasons, we have included the Species Set from BatScope 4.1.1 as European Bats - WSL. The new Species Set in BatScope 4.2 is termed European Bats - WSL 2022.

After you switch BatScope to another species set from the Species Set menu, all Projects and Collections not assigned to the current species set are grayed out in the Projects View. You can now switch a project to the new species set, provided it has the same frequency range as the original data.

Warning

When switching a project to a new species set, the processing state of its sequences will be reset. All calls will be removed!

Call Parameters

Table 2 The columns of the metadata files

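COLUMN                          UNIT    DESCRIPTION
most_likely_species                     Most likely species for the call
ci_test                                 Result of confidence interval test
confidence_most_likely_species  %       Confidence for the most likely species for the call
agreeing_classifiers_count              Number of classifiers agreeing on the most likely species
classifiers_used_count                  Number of classifiers used for classification
snr                             dB      Signal to Noise Ratio of call
timep25p05                      ms      Duration between X% and Y% total call energy in filtered signal
timep50p25                      ms      Duration between X% and Y% total call energy in filtered signal
timep75p50                      ms      Duration between X% and Y% total call energy in filtered signal
timep95p75                      ms      Duration between X% and Y% total call energy in filtered signal
trajectorybandwidth             kHz     Bandwidth of trajectory
timep05raw                      ms      Time at which X% of total call energy is reached in raw signal
timep25raw                      ms      Time at which X% of total call energy is reached in raw signal
timep50raw                      ms      Time at which X% of total call energy is reached in raw signal
timep75raw                      ms      Time at which X% of total call energy is reached in raw signal
timep95raw                      ms      Time at which X% of total call energy is reached in raw signal
timepeakraw                     ms      Time of peak energy in raw signal
durationiqrraw                  ms      Duration containing 50% of call energy in raw signal
durationd90raw                  ms      Duration containing 90% of call energy in raw signal
freqp05raw                      kHz     Frequency at which X% of total call energy is reached in raw signal
freqp25raw                      kHz     Frequency at which X% of total call energy is reached in raw signal
freqp50raw                      kHz     Frequency at which X% of total call energy is reached in raw signal
freqp75raw                      kHz     Frequency at which X% of total call energy is reached in raw signal
freqp95raw                      kHz     Frequency at which X% of total call energy is reached in raw signal
freqpeakraw                     kHz     Frequency of peak energy in raw signal
bandwidthiqrraw                 kHz     Bandwidth containing 50% of call energy in raw signal
bandwidthd90raw                 kHz     Bandwidth containing 90% of call energy in raw signal
timep05fil                      ms      Time at which X% of total call energy is reached in filtered signal
timep25fil                      ms      Time at which X% of total call energy is reached in filtered signal
timep50fil                      ms      Time at which X% of total call energy is reached in filtered signal
timep75fil                      ms      Time at which X% of total call energy is reached in filtered signal
timep95fil                      ms      Time at which X% of total call energy is reached in filtered signal
timepeakfil                     ms      Time of peak energy in filtered signal
durationiqrfil                  ms      Duration containing 50% of call energy in filtered signal
durationd90fil                  ms      Duration containing 90% of call energy in filtered signal
freqp05fil                      kHz     Frequency at which X% of total call energy is reached in filtered signal
freqp25fil                      kHz     Frequency at which X% of total call energy is reached in filtered signal
freqp50fil                      kHz     Frequency at which X% of total call energy is reached in filtered signal
freqp75fil                      kHz     Frequency at which X% of total call energy is reached in filtered signal
freqp95fil                      kHz     Frequency at which X% of total call energy is reached in filtered signal
freqpeakfil                     kHz     Frequency of peak energy in filtered signal
bandwidthiqrfil                 kHz     Bandwidth containing 50% of call energy in filtered signal
bandwidthd90fil                 kHz     Bandwidth containing 90% of call energy in filtered signal
trajectoryduration              ms      Duration of trajectory
trajectorystartfreq             kHz     Characteristic frequencies of trajectory
trajectorycenterfreq            kHz     Characteristic frequencies of trajectory
trajectoryendfreq               kHz     Characteristic frequencies of trajectory
trajectorymaxfreq               kHz     Characteristic frequencies of trajectory
trajectoryminfreq               kHz     Characteristic frequencies of trajectory
trajectoryavgfreqbin1           kHz     Averaged frequency of trajectory in each time bin
trajectoryavgfreqbin2           kHz     Averaged frequency of trajectory in each time bin
trajectoryavgfreqbin3           kHz     Averaged frequency of trajectory in each time bin
trajectoryavgfreqbin4           kHz     Averaged frequency of trajectory in each time bin
trajectoryavgfreqbin5           kHz     Averaged frequency of trajectory in each time bin
trajectoryavgslopebin1          kHz/ms  Averaged slope of trajectory in each time bin
trajectoryavgslopebin2          kHz/ms  Averaged slope of trajectory in each time bin
trajectoryavgslopebin3          kHz/ms  Averaged slope of trajectory in each time bin
trajectoryavgslopebin4          kHz/ms  Averaged slope of trajectory in each time bin
trajectoryavgslopebin5          kHz/ms  Averaged slope of trajectory in each time bin
trajectoryavgcurvbin1                   Averaged curvature of trajectory in each time bin
trajectoryavgcurvbin2                   Averaged curvature of trajectory in each time bin
trajectoryavgcurvbin3                   Averaged curvature of trajectory in each time bin
trajectoryavgcurvbin4                   Averaged curvature of trajectory in each time bin
trajectoryavgcurvbin5                   Averaged curvature of trajectory in each time bin
trajectorystarttime             ms      Start time of trajectory
intervalpre                     ms      Interval to previous call
intervalpost                    ms      Interval to next call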

_images/fig-call-parameters.png

Figure 11 Graphical representation of the call parameters
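The percentile-based parameters appear to be related to each other, as their names suggest. The sketch below states those relations explicitly. They are assumptions inferred from the column names and descriptions (for example that the "iqr" duration spans the 25% to 75% energy points and the "d90" duration the 5% to 95% points), not documented formulas, and the numeric values are invented.

```python
from typing import Dict


def derived_durations(row: Dict[str, float], suffix: str = "fil") -> Dict[str, float]:
    """Derive interval durations (ms) from energy percentile times.

    Assumed relations (inferred from the column names only):
      durationiqr = timep75 - timep25   (central 50% of call energy)
      durationd90 = timep95 - timep05   (central 90% of call energy)
      timepXXpYY  = timepXX - timepYY   (time between two percentiles)
    """
    t = {p: row[f"timep{p}{suffix}"] for p in ("05", "25", "50", "75", "95")}
    return {
        f"durationiqr{suffix}": t["75"] - t["25"],
        f"durationd90{suffix}": t["95"] - t["05"],
        "timep25p05": t["25"] - t["05"],
        "timep50p25": t["50"] - t["25"],
        "timep75p50": t["75"] - t["50"],
        "timep95p75": t["95"] - t["75"],
    }


# Invented percentile times (ms) for one call in the filtered signal.
row = {
    "timep05fil": 0.5,
    "timep25fil": 1.5,
    "timep50fil": 2.5,
    "timep75fil": 3.5,
    "timep95fil": 5.0,
}

derived = derived_durations(row)
for name, value in derived.items():
    print(f"{name}: {value} ms")
```

If these relations hold for your exported metadata, the four `timepXXpYY` columns add up to the `durationd90fil` value, which gives a quick consistency check on an export.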