Glossary

Classifiers

All classifiers in BatScope 4.2 have been trained and optimized on the same training set, consisting of 61’700 recorded calls of 30 Swiss bat species. This is an improvement over the training set used in BatScope 4.1.1.

Figure 8 Total number of sequences and calls per species in the training set. European Bats - WSL 2022 is the new dataset version in BatScope 4.2.

All classifiers are based on an underlying feature set of 40 features per bat call. The set of features is the same for all classifiers. The features include the frequencies and slopes of the call trajectory, the call duration, the peak frequency, percentiles of the energy distribution and other differentiating features.

The most reliable classifier is the SVM, with an accuracy of 89.2%. The seven classifiers performed differently on this training set and also differ in their ability to correctly recognize new bat calls. For an explanation of “accuracy”, “precision” and “sensitivity”, see the table given in Wikipedia.

Figure 9 Statistics of 10-fold crossvalidated classifications on the training set
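The three statistics named above can be illustrated with a minimal sketch. It uses the standard definitions (accuracy = correct / total, precision = TP / (TP + FP), sensitivity = TP / (TP + FN)); the function names and the toy labels are assumptions made for this example and are not part of BatScope.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of calls whose predicted species equals the true species."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


def precision_and_sensitivity(true_labels, predicted_labels, species):
    """Per-species precision and sensitivity (one species against all others)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == species and p == species)
    fp = sum(1 for t, p in pairs if t != species and p == species)
    fn = sum(1 for t, p in pairs if t == species and p != species)
    # Undefined ratios (no predictions or no true calls) are reported as NaN.
    precision = tp / (tp + fp) if tp + fp else float("nan")
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    return precision, sensitivity


# Toy labels, invented for illustration only.
true_species = ["A", "A", "B", "B", "B"]
predicted_species = ["A", "B", "B", "B", "A"]
overall = accuracy(true_species, predicted_species)
prec_a, sens_a = precision_and_sensitivity(true_species, predicted_species, "A")
```

Precision answers "how often is a proposed species correct?", while sensitivity answers "how many calls of a species are found?".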

Combining the results from different classifiers can improve the classification quality.


Figure 10 Statistics of classification when combining the outputs of multiple classifiers. Only calls where the listed classifiers agree are considered, assignable being the percentage of sequences that can be classified under this constraint. The top-ranking KKNN - SVM combination is chosen as the default for BatScope, as it offers a good trade-off between accuracy and the number of sequences that remain assignable. Combinations with QDA are excluded, as they consistently perform worse than their counterparts without QDA.
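The agreement constraint described in the caption can be sketched as follows. This is an illustration only, not BatScope's implementation: the function name, the input layout (one list of proposed species per classifier) and the toy predictions are assumptions.

```python
def combine_by_agreement(predictions):
    """Keep a label only where every listed classifier proposes the same species.

    predictions: dict mapping a classifier name to its list of proposed species,
    one entry per item, all lists having the same length.
    Returns (combined labels with None for disagreement, assignable fraction).
    """
    names = list(predictions)
    item_count = len(predictions[names[0]])
    combined = []
    for i in range(item_count):
        votes = {predictions[name][i] for name in names}
        combined.append(votes.pop() if len(votes) == 1 else None)
    assignable = sum(label is not None for label in combined) / item_count
    return combined, assignable


# Toy predictions, invented for illustration only.
toy_predictions = {
    "KKNN": ["A", "B", "A", "C"],
    "SVM": ["A", "B", "C", "C"],
}
combined_labels, assignable_fraction = combine_by_agreement(toy_predictions)
```

Requiring agreement trades coverage for reliability: items on which the classifiers disagree are left unassigned, which lowers the assignable share.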

Note

For cross-validation, sequences were split randomly into test and training sets. This controls for direct overfitting of the model to the distribution of the training set. The training set was carefully constructed to represent the domain of bat calls as well as possible. However, it has to be assumed that the training set may be biased towards specific recording equipment, setup, method, location, pre-processing and other hard-to-control variables. It is therefore not guaranteed that the classification performance stated above will be attained on a completely independently recorded dataset.
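A random split at the sequence level can be sketched as below. The sketch assumes that all calls of one sequence stay in the same fold, which is one reading of the note above; the function names, the seed and the toy sequence IDs are hypothetical.

```python
import random


def assign_sequence_folds(sequence_ids, k=10, seed=0):
    """Randomly assign each distinct sequence to one of k folds."""
    unique_sequences = sorted(set(sequence_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_sequences)
    # Round-robin over the shuffled sequences gives folds of near-equal size.
    return {seq: i % k for i, seq in enumerate(unique_sequences)}


def split_calls(sequence_ids, fold_of, test_fold):
    """Return (training indices, test indices) of calls for one fold."""
    train = [i for i, s in enumerate(sequence_ids) if fold_of[s] != test_fold]
    test = [i for i, s in enumerate(sequence_ids) if fold_of[s] == test_fold]
    return train, test


# One sequence ID per call, invented for illustration only.
call_sequence_ids = ["s1", "s1", "s2", "s3", "s3", "s3", "s4", "s5"]
fold_of = assign_sequence_folds(call_sequence_ids, k=3)
train_idx, test_idx = split_calls(call_sequence_ids, fold_of, test_fold=0)
```

Keeping a sequence within a single fold prevents near-identical calls from the same recording from appearing in both the training and the test set.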

Confidence Interval Test

After each classification, i.e. each time a classifier or ensemble of classifiers has proposed a most likely species for a call, the three most distinctive features of the call are tested against stored reference values of the proposed species.

These features are: Call duration, peak frequency and bandwidth.

If any of these values lies outside a 95% Confidence Interval of the training set’s average of the proposed species, the field Confidence Interval Test is marked with Fail, indicating a questionable classification result.

You could consider disabling the Status of such a call to improve overall Classification accuracy.
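The test logic can be sketched as a simple bounds check. The reference intervals, the species key and the field names below are invented placeholders, and the sketch deliberately takes the interval bounds as given rather than assuming how BatScope derives them from the training set. Returning "Pass" for the non-failing case is likewise an assumption.

```python
# Hypothetical reference intervals (lower, upper) per species and feature.
REFERENCE_INTERVALS = {
    "Example species": {
        "duration_ms": (3.0, 8.0),
        "peak_frequency_khz": (42.0, 50.0),
        "bandwidth_khz": (20.0, 60.0),
    },
}


def confidence_interval_test(call, species, reference=REFERENCE_INTERVALS):
    """Return "Fail" if any of the three features lies outside its interval."""
    for feature, (lower, upper) in reference[species].items():
        if not (lower <= call[feature] <= upper):
            return "Fail"
    return "Pass"


typical_call = {"duration_ms": 5.0, "peak_frequency_khz": 46.0, "bandwidth_khz": 30.0}
unusual_call = {"duration_ms": 5.0, "peak_frequency_khz": 55.0, "bandwidth_khz": 30.0}
typical_result = confidence_interval_test(typical_call, "Example species")
unusual_result = confidence_interval_test(unusual_call, "Example species")
```

A single feature outside its interval is enough to fail the test, which matches the "any of these values" wording above.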

Range Base

For each of the species in the training base, we have defined its range of occurrence with the help of species distribution data derived from either CSCF (for CH Alphahull) or GBIF (for Europe MCP and Europe Alphahull).

For acknowledgements see BatScope’s About dialog.

Proposed classifications for sequences with known recording location will be tested against the range base set for the project to determine if the classification is realistic. The result is displayed in the classification table as either Pass or Fail together with how far outside the species range the recording was made.
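A range test of this kind can be sketched as a point-in-polygon check plus a distance to the polygon boundary. The sketch below assumes planar coordinates in kilometres and a single polygon without holes; real range data would need a map projection or geodesic distances. All names and the square test polygon are hypothetical, and BatScope's own method may differ.

```python
import math


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in order."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_segment(x, y, x1, y1, x2, y2):
    """Shortest planar distance from point (x, y) to the segment."""
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(x - x1, y - y1)
    t = max(0.0, min(1.0, ((x - x1) * dx + (y - y1) * dy) / length_sq))
    return math.hypot(x - (x1 + t * dx), y - (y1 + t * dy))


def range_test(x, y, polygon):
    """Return ("Pass", 0.0) inside the range, else ("Fail", distance outside)."""
    if point_in_polygon(x, y, polygon):
        return "Pass", 0.0
    n = len(polygon)
    distance = min(
        distance_to_segment(x, y, *polygon[i], *polygon[(i + 1) % n])
        for i in range(n)
    )
    return "Fail", distance


# Hypothetical 10 km x 10 km species range.
square_range = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
inside_result = range_test(5.0, 5.0, square_range)
outside_result = range_test(13.0, 5.0, square_range)
```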

Preferences

Confidence Display Threshold - only classifications above this threshold (0.01 = 1%) are shown in the user interface.
Verifier Name - the name saved with verifications you add. Different users can store their names in their own user account.
Channel - the recording channel used when importing WAV files with multiple channels.
Map - the map collection ‘leaflet’ provides OpenStreetMap as well as SwissTopo aerial images and maps.
Editor Application Path - the filesystem path of your audio editor installation (e.g. Audacity or Raven).
Sequence Audio Playback Mode - set the preferred mode of playing back your audio sequences here:
- UNPROCESSED - play back at the original sampling rate of the recording
- SLOWED - the sampling rate is slowed down by a user-settable factor - this corresponds to Time Expansion Mode
- FREQUENCY DIVISION - frequencies are divided by a user-selectable factor - the temporal pattern of a Sequence will be retained
- HETERODYNE - the playback functions like a mixing detector - the Mixer Frequency can be set
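The effect of the four modes on what is heard can be sketched with the textbook relations for bat detectors: time expansion divides frequencies and stretches durations by the same factor, frequency division divides frequencies only, and a heterodyne mixer outputs the difference between signal and mixer frequency. The function names, default factor and default mixer frequency are assumptions for this example, not BatScope settings.

```python
def heard_frequency_khz(signal_khz, mode, factor=10, mixer_khz=45.0):
    """Frequency at which a signal component is heard in each playback mode."""
    if mode == "UNPROCESSED":
        return signal_khz
    if mode in ("SLOWED", "FREQUENCY DIVISION"):
        return signal_khz / factor
    if mode == "HETERODYNE":
        # Difference frequency between the signal and the mixer.
        return abs(signal_khz - mixer_khz)
    raise ValueError(f"unknown playback mode: {mode}")


def heard_duration_ms(duration_ms, mode, factor=10):
    """Only the slowed (time expansion) mode stretches the temporal pattern."""
    return duration_ms * factor if mode == "SLOWED" else duration_ms


slowed_khz = heard_frequency_khz(45.0, "SLOWED", factor=10)
heterodyne_khz = heard_frequency_khz(45.0, "HETERODYNE", mixer_khz=43.0)
slowed_ms = heard_duration_ms(5.0, "SLOWED", factor=10)
divided_ms = heard_duration_ms(5.0, "FREQUENCY DIVISION", factor=10)
```

With a factor of 10, a 45 kHz call of 5 ms is heard at 4.5 kHz in both the slowed and the frequency-division mode, but only the slowed mode lengthens it to 50 ms.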

Species Set

A species set is a collection of all data and processors associated with a group of species, such as Swiss bats or Latin American bats. Each species set contains its own Taxonomy, Plugins, Custom Processes, Filters and Statistical Models to assign species to the respective Taxonomy.

For legacy reasons, we have included the Species Set from BatScope 4.1.1 as European Bats - WSL. The new Species Set in BatScope 4.2 is termed European Bats - WSL 2022.

After you have switched BatScope to another species set from the Species Set menu, all Projects and Collections not assigned to the current species set will be grayed out in the Projects View. You can now switch a project to the new species set if it has the same frequency range as the original data.

Warning

When switching a project to a new species set, the processing state of its sequences will be reset. All calls will be removed!

Call Parameters

Table 2 The columns of the metadata files
COLUMN	UNIT	DESCRIPTION
most_likely_species		Most likely species for the call
ci_test		Result of confidence interval test
confidence_most_likely_species	%	Confidence for the most likely species for the call
agreeing_classifiers_count		Number of classifiers agreeing on the most likely species
classifiers_used_count		Number of classifiers used for classification
snr	dB	Signal to Noise Ratio of call
timep25p05	ms	Duration between X% and Y% total call energy in filtered signal
timep50p25
timep75p50
timep95p75
trajectorybandwidth	kHz	Bandwidth of trajectory
timep05raw	ms	Time at which X% of total call energy is reached in raw signal
timep25raw
timep50raw
timep75raw
timep95raw
timepeakraw		Time of peak energy in raw signal
durationiqrraw		Duration containing 50% of call energy in raw signal
durationd90raw		Duration containing 90% of call energy in raw signal
freqp05raw	kHz	Frequency at which X% of total call energy is reached in raw signal
freqp25raw
freqp50raw
freqp75raw
freqp95raw
freqpeakraw		Frequency of peak energy in raw signal
bandwidthiqrraw		Bandwidth containing 50% of call energy in raw signal
bandwidthd90raw		Bandwidth containing 90% of call energy in raw signal
timep05fil	ms	Time at which X% of total call energy is reached in filtered signal
timep25fil
timep50fil
timep75fil
timep95fil
timepeakfil		Time of peak energy in filtered signal
durationiqrfil		Duration containing 50% of call energy in filtered signal
durationd90fil		Duration containing 90% of call energy in filtered signal
freqp05fil	kHz	Frequency at which X% of total call energy is reached in filtered signal
freqp25fil
freqp50fil
freqp75fil
freqp95fil
freqpeakfil		Frequency of peak energy in filtered signal
bandwidthiqrfil		Bandwidth containing 50% of call energy in filtered signal
bandwidthd90fil		Bandwidth containing 90% of call energy in filtered signal
trajectoryduration	ms	Duration of trajectory
trajectorystartfreq	kHz	Characteristic frequencies of trajectory
trajectorycenterfreq
trajectoryendfreq
trajectorymaxfreq
trajectoryminfreq
trajectoryavgfreqbin1		Averaged frequency of trajectory in each time bin
trajectoryavgfreqbin2
trajectoryavgfreqbin3
trajectoryavgfreqbin4
trajectoryavgfreqbin5
trajectoryavgslopebin1	kHz/ms	Averaged slope of trajectory in each time bin
trajectoryavgslopebin2
trajectoryavgslopebin3
trajectoryavgslopebin4
trajectoryavgslopebin5
trajectoryavgcurvbin1		Averaged curvature of trajectory in each time bin
trajectoryavgcurvbin2
trajectoryavgcurvbin3
trajectoryavgcurvbin4
trajectoryavgcurvbin5
trajectorystarttime	ms	Start time of trajectory
intervalpre		Interval to previous call
intervalpost		Interval to next call

Figure 11 Graphical representation of the call parameters
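As an illustration of how the classification columns of Table 2 could be used downstream, the sketch below filters calls by the confidence interval test, classifier agreement and confidence. It works on rows already loaded as dictionaries of strings, so it makes no assumption about the metadata file format; the function name, the default threshold and the example rows are invented for this sketch.

```python
def keep_call(row, min_confidence_percent=50.0):
    """Keep a call only if it passes all three checks.

    row: dict keyed by the column names of Table 2, values as strings.
    """
    if row["ci_test"] == "Fail":
        return False
    if int(row["agreeing_classifiers_count"]) != int(row["classifiers_used_count"]):
        return False
    return float(row["confidence_most_likely_species"]) >= min_confidence_percent


# Example rows, invented for illustration only.
example_rows = [
    {"most_likely_species": "A", "ci_test": "Pass",
     "confidence_most_likely_species": "92.5",
     "agreeing_classifiers_count": "2", "classifiers_used_count": "2"},
    {"most_likely_species": "B", "ci_test": "Fail",
     "confidence_most_likely_species": "88.0",
     "agreeing_classifiers_count": "2", "classifiers_used_count": "2"},
    {"most_likely_species": "C", "ci_test": "Pass",
     "confidence_most_likely_species": "71.0",
     "agreeing_classifiers_count": "1", "classifiers_used_count": "2"},
]
kept_species = [r["most_likely_species"] for r in example_rows if keep_call(r)]
```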