acoupipe.datasets.synthetic

Contains classes for the generation of microphone array data from synthesized signals for acoustic testing applications.

Currently, the following dataset generators are available:

DatasetSynthetic: A simple and fast method that relies on synthetic white noise signals and spatially stationary sources radiating under anechoic conditions.

Figure (msm_layout.png): Default measurement setup used in the `acoupipe.datasets.synthetic` module.

Module Contents

class acoupipe.datasets.synthetic.ConfigBase

Bases: traits.api.HasPrivateTraits

Configuration base class for generating microphone array datasets.

get_sampler()

Return dictionary containing the sampler objects of type acoupipe.sampler.BaseSampler.

This function has to be defined manually in each dataset subclass. The returned dictionary holds the sampler objects as values; each key defines the index (idx) of the sampler in the sampling order.

e.g.:

>>> sampler = {
...     0 : BaseSampler(…),
...     1 : BaseSampler(…),
...     …
... }

Returns:: dictionary containing the sampler objects
Return type:: dict
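
As a rough illustration of the key convention described above, the following sketch uses a hypothetical stand-in class (`StubSampler`, not part of acoupipe) in place of `acoupipe.sampler.BaseSampler`. That the samplers are evaluated in ascending key order is an assumption derived from the description, not a statement about the actual implementation.

```python
class StubSampler:
    """Hypothetical stand-in for acoupipe.sampler.BaseSampler."""

    def __init__(self, label):
        self.label = label

    def sample(self):
        # A real sampler would draw random values and apply them to a target object.
        return f"sampled {self.label}"


def get_sampler():
    # The insertion order is deliberately shuffled; only the keys define the order.
    return {
        1: StubSampler("source locations"),
        0: StubSampler("microphone positions"),
        2: StubSampler("source strengths"),
    }


sampler = get_sampler()
# Assumption: samplers are evaluated in ascending key order.
order = [sampler[idx].sample() for idx in sorted(sampler)]
print(order)
```

Using integer keys rather than relying on insertion order keeps the sampling sequence explicit when a subclass adds or replaces individual samplers.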

class acoupipe.datasets.synthetic.DatasetBase(config=None, tasks=1, logger=None)

Bases: traits.api.HasPrivateTraits

Base class for generating microphone array datasets with specified features and labels.

config

Configuration object for dataset generation.

Type:: ConfigBase

tasks

Number of parallel tasks for data generation. Defaults to 1 (sequential calculation).

Type:: int

get_feature_collection(features, f, num)

Get the feature collection of the dataset.

Returns:: BaseFeatureCollection object.
Return type:: BaseFeatureCollection

generate(features, split, size, f=None, num=0, start_idx=0, progress_bar=True)

Generate dataset samples iteratively.

Parameters:

features (list) – List of features included in the dataset. The features “seeds” and “idx” are always included.
split (str) – Split name for the dataset (‘training’, ‘validation’ or ‘test’).
size (int) – Size of the dataset (number of source cases).
f (float) – The center frequency or list of frequencies of the dataset. If None, all frequencies are included.
num (integer) – Controls the width of the frequency bands considered; defaults to 0 (single frequency line).

num	frequency band width
0	single frequency line
1	octave band
3	third-octave band
n	1/n-octave band

start_idx (int, optional) – Starting sample index (default is 0).
progress_bar (bool, optional) – Whether to show a progress bar (default is True).

Yields:

data (dict) – Generator that yields dataset samples as dictionaries containing the feature names as keys.

Examples

Generate features iteratively.

>>> from acoupipe.datasets.synthetic import DatasetSynthetic
>>> # define the features
>>> features = ["csm", "source_strength_analytic", "loc"]
>>> f = 1000
>>> num = 3
>>> # generate the dataset
>>> generator = DatasetSynthetic().generate(
...     f=f, num=num, split="training", size=2, features=features)
>>> # iterate over the dataset
>>> for data in generator:
...     print(data)
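
The meaning of the `num` parameter can be made concrete with a short calculation. The sketch below assumes the usual definition of fractional-octave bands, in which the band edges lie a factor of 2^(±1/(2·num)) away from the center frequency. The helper `band_limits` is hypothetical and not part of acoupipe; how the library maps these limits onto discrete frequency lines is not shown here.

```python
def band_limits(f, num):
    """Return the (lower, upper) band edges in Hz for center frequency f.

    num = 0 denotes a single frequency line, so both edges equal f.
    """
    if num == 0:
        return (f, f)
    factor = 2.0 ** (0.5 / num)
    return (f / factor, f * factor)


for num in (0, 1, 3):
    lower, upper = band_limits(1000.0, num)
    print(f"num={num}: {lower:.1f} Hz to {upper:.1f} Hz")
```

For an octave band (`num=1`) the upper edge is exactly twice the lower edge; larger values of `num` give proportionally narrower bands.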

save_h5(features, split, size, name, f=None, num=0, start_idx=0, progress_bar=True)

Save dataset to a HDF5 file.

Parameters:

features (list) – List of features included in the dataset. The features “seeds” and “idx” are always included.
split (str) – Split name for the dataset (‘training’, ‘validation’ or ‘test’).
size (int) – Size of the dataset (number of source cases).
name (str) – Name of the HDF5 file.
f (float) – The center frequency or list of frequencies of the dataset. If None, all frequencies are included.
num (integer) – Controls the width of the frequency bands considered; defaults to 0 (single frequency line).

num	frequency band width
0	single frequency line
1	octave band
3	third-octave band
n	1/n-octave band

start_idx (int, optional) – Starting sample index (default is 0).
progress_bar (bool, optional) – Whether to show a progress bar (default is True).

Return type:

None

Examples

Save features to a HDF5 file.

>>> from acoupipe.datasets.synthetic import DatasetSynthetic
>>> # define the features
>>> features = ["csm", "source_strength_analytic", "loc"]
>>> f = 1000
>>> num = 3
>>> # save the dataset
>>> DatasetSynthetic().save_h5(
...     f=f, num=num, split="training", size=10, features=features, name="/tmp/example.h5")

class acoupipe.datasets.synthetic.AnalyticNoiseStrengthFeature

Bases: SpectraFeature

Handles the calculation of features in the frequency domain.

name

Name of the feature.

Type:: str

freq_data

The frequency data to calculate the feature for.

Type:: instance of class acoular.BaseSpectra

f

the frequency (or center frequency) of interest

Type:: float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:: int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:: list of tuples

get_feature_func(): Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.AnalyticSourceStrengthFeature

Bases: SpectraFeature

Handles the calculation of features in the frequency domain.

name

Name of the feature.

Type:: str

freq_data

The frequency data to calculate the feature for.

Type:: instance of class acoular.BaseSpectra

f

the frequency (or center frequency) of interest

Type:: float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:: int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:: list of tuples

get_feature_func(): Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.BaseFeatureCollection

Bases: traits.api.HasPrivateTraits

BaseFeatureCollection base class for handling feature funcs.

feature_funcs

List of feature_funcs.

Type:: list

add_feature_func(feature_func)

Add a feature_func to the BaseFeatureCollection.

Parameters:: feature_func (str) – Feature to be added.

get_feature_funcs()

Get all feature_funcs of the BaseFeatureCollection.

Returns:: List of feature_funcs.
Return type:: list
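
To illustrate how a collection of feature functions might be used, the following sketch mimics the interface above with a hypothetical stand-in class (`StubFeatureCollection`). It is an assumption of this example that each feature function takes a sampler-like object and returns a dictionary keyed by feature name; the two functions `loc_feature` and `nsources_feature` and the `fake_sampler` dictionary are purely illustrative.

```python
class StubFeatureCollection:
    """Hypothetical stand-in mimicking the BaseFeatureCollection interface."""

    def __init__(self):
        self.feature_funcs = []

    def add_feature_func(self, feature_func):
        self.feature_funcs.append(feature_func)

    def get_feature_funcs(self):
        return self.feature_funcs


def loc_feature(sampler):
    # Illustrative only: report the locations held by the fake sampler.
    return {"loc": sampler["locations"]}


def nsources_feature(sampler):
    # Illustrative only: count the sources held by the fake sampler.
    return {"nsources": len(sampler["locations"])}


collection = StubFeatureCollection()
collection.add_feature_func(loc_feature)
collection.add_feature_func(nsources_feature)

fake_sampler = {"locations": [(0.1, -0.2, 0.5), (0.0, 0.3, 0.5)]}
sample = {}
for func in collection.get_feature_funcs():
    sample.update(func(fake_sampler))  # merge each feature into one sample dict
print(sample)
```

Merging the individual results yields one dictionary per source case, which matches the shape of the samples described for `generate`.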

class acoupipe.datasets.synthetic.BaseFeatureCollectionBuilder

Bases: traits.api.HasPrivateTraits

BaseFeatureCollectionBuilder base class for building a BaseFeatureCollection.

feature_collection

BaseFeatureCollection object.

Type:: BaseFeatureCollection

add_custom(feature_func)

Add a custom feature to the BaseFeatureCollection.

Parameters:: feature_func (str) – Feature to be added.

build()

Build a BaseFeatureCollection.

Returns:: BaseFeatureCollection object.
Return type:: BaseFeatureCollection

class acoupipe.datasets.synthetic.CSMFeature

Bases: SpectraFeature

CSMFeature class for handling cross-spectral matrix calculation.

name

Name of the feature (default=’csm’).

Type:: str

freq_data

The object which calculates the cross-spectral matrix.

Type:: instance of class acoular.PowerSpectra

f

the frequency (or center frequency) of interest

Type:: float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:: int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:: list of tuples

static calc_csm1(sampler, freq_data, name)

Calculate the cross-spectral matrix (CSM) from time data.

Parameters:: freq_data (instance of class acoular.PowerSpectra) – power spectra to calculate the csm feature
Returns:: The complex-valued cross-spectral matrix with shape (numfreq, num_mics, num_mics).
Return type:: numpy.array

static calc_csm2(sampler, freq_data, fidx, name)

Calculate the cross-spectral matrix (CSM) from time data.

Parameters:

freq_data (instance of class acoular.PowerSpectra) – power spectra to calculate the csm feature
fidx (list of tuples, optional) – list of tuples containing the start and end indices of the frequency bands to be considered, by default None

Returns:

The complex-valued cross-spectral matrix with shape (numfreq, num_mics, num_mics) with numfreq depending on the number of frequencies in fidx.

Return type:

numpy.array

get_feature_func(): Return the callable for calculating the cross-spectral matrix.

class acoupipe.datasets.synthetic.CSMtriuFeature

Bases: SpectraFeature

Handles the calculation of features in the frequency domain.

name

Name of the feature.

Type:: str

freq_data

The frequency data to calculate the feature for.

Type:: instance of class acoular.BaseSpectra

f

the frequency (or center frequency) of interest

Type:: float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:: int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:: list of tuples

static calc_csmtriu1(sampler, freq_data, name)

Calculate the cross-spectral matrix (CSM) from time data.

Parameters:: freq_data (instance of class acoular.PowerSpectra) – power spectra to calculate the csm feature
Returns:: The real-valued cross-spectral matrix with shape (numfreq, num_mics, num_mics).
Return type:: numpy.array

static calc_csmtriu2(sampler, freq_data, fidx, name)

Calculate the cross-spectral matrix (CSM) from time data.

Parameters:

freq_data (instance of class acoular.PowerSpectra) – power spectra to calculate the csm feature
fidx (list of tuples, optional) – list of tuples containing the start and end indices of the frequency bands to be considered, by default None

Returns:

The real-valued cross-spectral matrix with shape (numfreq, num_mics, num_mics) with numfreq depending on the number of frequencies in fidx.

Return type:

numpy.array

get_feature_func(): Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.EigmodeFeature

Bases: SpectraFeature

Handles the calculation of features in the frequency domain.

name

Name of the feature.

Type:: str

freq_data

The frequency data to calculate the feature for.

Type:: instance of class acoular.BaseSpectra

f

the frequency (or center frequency) of interest

Type:: float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:: int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:: list of tuples

static calc_eigmode1(sampler, freq_data, name)

Calculate the eigenvalue-scaled eigenvectors of the cross-spectral matrix (CSM) from time data.

Parameters:: freq_data (instance of class acoular.PowerSpectra) – power spectra to calculate the csm feature
Returns:: The eigenvalue scaled eigenvectors with shape (numfreq, num_mics, num_mics).
Return type:: numpy.array

static calc_eigmode2(sampler, freq_data, fidx, name)

Calculate the eigenvalue-scaled eigenvectors of the cross-spectral matrix (CSM) from time data.

Parameters:

freq_data (instance of class acoular.PowerSpectra) – power spectra to calculate the csm feature
fidx (list of tuples, optional) – list of tuples containing the start and end indices of the frequency bands to be considered, by default None

Returns:

The eigenvalue scaled eigenvectors with shape (numfreq, num_mics, num_mics) with numfreq depending on the number of frequencies in fidx.

Return type:

numpy.array

get_feature_func(): Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.EstimatedNoiseStrengthFeature

Bases: SpectraFeature

Handles the calculation of features in the frequency domain.

name

Name of the feature.

Type:: str

freq_data

The frequency data to calculate the feature for.

Type:: instance of class acoular.BaseSpectra

f

the frequency (or center frequency) of interest

Type:: float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:: int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:: list of tuples

get_feature_func(): Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.EstimatedSourceStrengthFeature

Bases: SpectraFeature

Handles the calculation of features in the frequency domain.

name

Name of the feature.

Type:: str

freq_data

The frequency data to calculate the feature for.

Type:: instance of class acoular.BaseSpectra

f

the frequency (or center frequency) of interest

Type:: float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:: int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:: list of tuples

get_feature_func(): Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.LocFeature

Bases: BaseFeatureCatalog

BaseFeatureCatalog base class for handling feature funcs.

name

Name of the feature.

Type:: str

get_feature_func(): Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.SourcemapFeature

Bases: BaseFeatureCatalog

SourcemapFeature class for handling the generation of sourcemaps obtained with microphone array methods.

name

Name of the feature (default=’sourcemap’).

Type:: str

beamformer

The beamformer to calculate the sourcemap.

Type:: instance of class acoular.BeamformerBase

f

The center frequency or list of frequencies of the dataset. If None, all frequencies are included.

Type:: float

num

Controls the width of the frequency bands considered; defaults to 0 (single frequency line).

num	frequency band width
0	single frequency line
1	octave band
3	third-octave band
n	1/n-octave band

Type:: integer

fidx

List of tuples containing the start and end indices of the frequency bands to be considered. Determined automatically from f and num.

Type:: list of tuples

set_freq_limits(): Set the frequency limits of the beamformer so that the result is only calculated for necessary frequencies.

get_feature_func(): Return the callable for calculating the sourcemap.

class acoupipe.datasets.synthetic.SpectrogramFeature

Bases: SpectraFeature

SpectrogramFeature class for handling spectrogram features.

name

Name of the feature (default=’spectrogram’).

Type:: str

freq_data

The object which calculates the spectrogram data.

Type:: instance of class acoular.FFTSpectra

f

the frequency (or center frequency) of interest

Type:: float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:: int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:: list of tuples

get_feature_func(): Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.TargetmapFeature

Bases: BaseFeatureCatalog

BaseFeatureCatalog base class for handling feature funcs.

name

Name of the feature.

Type:: str

get_feature_func(): Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.TimeDataFeature

Bases: BaseFeatureCatalog

TimeDataFeature class for handling time data.

name

Name of the feature (default=’time_data’).

Type:: str

time_data

The source delivering the time data.

Type:: instance of class acoular.SamplesGenerator

get_feature_func(): Return the callable for calculating the time data.

class acoupipe.datasets.synthetic.PowerSpectraAnalytic

Bases: acoular.PowerSpectraImport

Provides a dummy class for using pre-calculated cross-spectral matrices.

This class does not calculate the cross-spectral matrix. Instead, the user can inject one or multiple existing CSMs by setting the csm attribute. This can be useful when algorithms are to be evaluated with existing CSMs. The frequency or frequencies contained in the CSM must be set via the frequencies attribute. The numchannels attribute is determined from the shape of the CSM. In contrast to the PowerSpectra object, the attributes sample_freq, time_data, source, block_size, calib, window, overlap, cached, and num_blocks have no functionality.

fftfreq()

Return the Discrete Fourier Transform sample frequencies.

Returns:: f – Array of length block_size/2+1 containing the sample frequencies.
Return type:: ndarray
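
The length `block_size/2+1` corresponds to the one-sided spectrum of a real-valued signal block. A minimal sketch of these frequencies in plain Python follows; the function name `fft_sample_frequencies` is hypothetical, and the values 13720 Hz and 128 are the defaults quoted elsewhere in this module.

```python
def fft_sample_frequencies(sample_freq, block_size):
    """Frequencies in Hz of the one-sided spectrum of a real-valued block."""
    return [k * sample_freq / block_size for k in range(block_size // 2 + 1)]


freqs = fft_sample_frequencies(sample_freq=13720.0, block_size=128)
print(len(freqs), freqs[1], freqs[-1])
```

The first entry is 0 Hz and the last entry is the Nyquist frequency, half the sampling rate.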

acoupipe.datasets.synthetic.get_all_source_signals(source_list)

Get all signals from a list of acoular.SamplesGenerator derived objects.

Parameters:: source_list (list) – list of acoular.SamplesGenerator derived objects
Returns:: list of all acoular.SignalGenerator derived objects
Return type:: list

acoupipe.datasets.synthetic.get_uncorrelated_noise_source_recursively(source)

Recursively get all uncorrelated noise sources from an acoular.TimeInOut object.

Parameters:: source (instance of class acoular.TimeInOut) – the source object
Returns:: list of all uncorrelated noise sources
Return type:: list
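
The idea of such a recursive search can be sketched with stand-in classes. All classes below are hypothetical and only model a processing chain whose blocks reference upstream objects through a `source` attribute and, optionally, a `sources` list; these attribute names and the traversal order are assumptions of this example, not the actual implementation.

```python
class StubNoiseSource:
    """Hypothetical stand-in for an uncorrelated noise source (end of a chain)."""


class StubPointSource:
    """Hypothetical source that is not a noise source."""


class StubMixer:
    """Hypothetical block combining one upstream `source` with extra `sources`."""

    def __init__(self, source, sources=()):
        self.source = source
        self.sources = list(sources)


def collect_noise_sources(obj):
    """Walk the `source` / `sources` attributes and collect noise sources."""
    if isinstance(obj, StubNoiseSource):
        return [obj]
    found = []
    upstream = getattr(obj, "source", None)
    if upstream is not None:
        found.extend(collect_noise_sources(upstream))
    for extra in getattr(obj, "sources", []):
        found.extend(collect_noise_sources(extra))
    return found


noise_a, noise_b = StubNoiseSource(), StubNoiseSource()
chain = StubMixer(StubMixer(StubPointSource(), [noise_a]), [noise_b])
found = collect_noise_sources(chain)
print(len(found))
```

Recursion is a natural fit because the nesting depth of such chains is not known in advance.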

class acoupipe.datasets.synthetic.DatasetSynthetic(mode='welch', mic_pos_noise=True, mic_sig_noise=True, snap_to_grid=False, random_signal_length=False, signal_length=5, fs=13720.0, min_nsources=1, max_nsources=10, tasks=1, logger=None, config=None)

Bases: acoupipe.datasets.base.DatasetBase

DatasetSynthetic is a purely synthetic microphone array source case generator.

DatasetSynthetic relies on synthetic source signals from which the features are extracted and has been used in different publications, e.g. [KHS19], [KS22], [FZHX22]. The default virtual simulation setup considers a 64-channel microphone array and a planar observation area, as shown in the default measurement setup figure.

Default environmental properties

Default Environmental Characteristics
Environment	Anechoic, Resting, Homogeneous Fluid
Speed of sound	343 m/s
Microphone Array	Vogel’s spiral, \(M=64\), Aperture Size 1 m
Observation Area	x,y in [-0.5,0.5], z=0.5
Source Type	Monopole
Source Signals	Uncorrelated White Noise (\(T=5\,s\))

Default FFT parameters

The underlying default FFT parameters are:

FFT Parameters
Sampling Rate	He = 40, fs=13720 Hz
Block size	128 Samples
Block overlap	50 %
Windowing	von Hann / Hanning
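
The numbers in the table are related by simple arithmetic. The sketch below assumes that the Helmholtz number is defined as He = f·d/c, with d the array aperture of 1 m and c the speed of sound of 343 m/s from the environment table; under this assumption He = 40 corresponds to the stated 13720 Hz. The block count is the plain number of complete overlapping blocks in a 5 s signal and is not necessarily the exact value used internally.

```python
c = 343.0          # speed of sound in m/s (environment table)
aperture = 1.0     # array aperture in m (environment table)
helmholtz = 40.0   # Helmholtz number He (FFT table)

# Assumption: He = f * d / c, so the frequency belonging to He = 40 is He * c / d.
fs = helmholtz * c / aperture

signal_length = 5.0  # default signal length in s
block_size = 128     # samples per FFT block
overlap = 0.5        # 50 % block overlap

num_samples = int(fs * signal_length)
step = int(block_size * (1 - overlap))
# Number of complete, overlapping blocks that fit into the signal.
num_blocks = (num_samples - block_size) // step + 1
freq_resolution = fs / block_size

print(fs, num_samples, num_blocks, freq_resolution)
```

The short block size gives a coarse frequency resolution but many averages, which reduces the variance of the estimated cross-spectral matrix.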

Default randomized properties

Several properties of the dataset are randomized for each source case when generating the data. Their respective distributions are closely related to [HS17]. The microphone positions are spatially disturbed to account for uncertainties in the microphone placement. The number of sources, their positions, and their strengths are chosen randomly. Uncorrelated white noise is added to the microphone channels by default.

Randomized properties
Sensor Position Deviation [m]	Bivariate normal distributed (\(\sigma = 0.001\))
No. of Sources	Poisson distributed (\(\lambda=3\))
Source Positions [m]	Bivariate normal distributed (\(\sigma = 0.1688\))
Source Strength (\([{Pa}^2]\) at reference position)	Rayleigh distributed (\(\sigma_{R}=5\))
Relative Noise Variance	Uniform distributed (\(10^{-6}\), \(0.1\))
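
The distributions in the table can be sampled with NumPy as a rough sketch of what one randomized source case involves. This is not the acoupipe sampling code. Redrawing source counts outside the default range of 1 to 10, treating the x and y coordinates as independent, and ignoring positions that fall outside the observation area are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
min_nsources, max_nsources = 1, 10

# Number of sources: Poisson(lambda=3). Assumption: values outside
# [min_nsources, max_nsources] are simply redrawn.
nsources = rng.poisson(3)
while not min_nsources <= nsources <= max_nsources:
    nsources = rng.poisson(3)

# Source positions in the plane z = 0.5: normal x and y with sigma = 0.1688 m.
# Assumption: positions outside the observation area are not treated specially.
xy = rng.normal(0.0, 0.1688, size=(nsources, 2))
loc = np.column_stack([xy, np.full(nsources, 0.5)])

# Source strengths in Pa^2 at the reference position: Rayleigh, sigma_R = 5.
strength = rng.rayleigh(5.0, size=nsources)

# Relative noise variance: uniform between 1e-6 and 0.1.
noise_variance = rng.uniform(1e-6, 0.1)

# Microphone position deviation: normal in x and y, sigma = 0.001 m, 64 mics.
mic_deviation = rng.normal(0.0, 0.001, size=(64, 2))

print(nsources, loc.shape, strength.shape, mic_deviation.shape)
```

Seeding the generator makes each source case reproducible, which is the role the always-included "seeds" feature plays in the generated datasets.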

Example

from acoupipe.datasets.synthetic import DatasetSynthetic

dataset = DatasetSynthetic()
dataset_generator = dataset.generate(
    features=["sourcemap", "loc", "f", "num"], # choose the features to extract
    f=[1000,2000,3000], # choose the frequencies to extract
    split='training', # choose the split of the dataset
    size=10, # choose the size of the dataset
    )

# get the first data sample
data = next(dataset_generator)

# print the keys of the dataset
print(data.keys())

Initialization Parameters

Initialize the DatasetSynthetic object.

The input parameters are passed to the DatasetSyntheticConfig object, which creates all necessary objects for the simulation of microphone array data.

Parameters:

mode (str) – Type of calculation method. Can be either welch, analytic or wishart. Defaults to welch.
mic_pos_noise (bool) – Apply positional noise to microphone geometry. Defaults to True.
mic_sig_noise (bool) – Apply additional uncorrelated white noise to microphone signals. Defaults to True.
snap_to_grid (bool) – Snap source locations to grid. The grid is defined in the config object as config.grid. Defaults to False.
random_signal_length (bool) – Randomize signal length. Defaults to False. If True, the signal length is uniformly sampled from the interval [1s,10s].
signal_length (float) – Length of the signal in seconds. Defaults to 5 seconds.
fs (float) – Sampling frequency in Hz. Defaults to 13720 Hz.
min_nsources (int) – Minimum number of sources in the dataset. Defaults to 1.
max_nsources (int) – Maximum number of sources in the dataset. Defaults to 10.
tasks (int) – Number of parallel tasks. Defaults to 1.
logger (logging.Logger) – Logger object. Defaults to None.
config (DatasetSyntheticConfig) – Configuration object. Defaults to None. If None, a default configuration object is created.
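
The effect of `snap_to_grid` can be pictured as moving each sampled source coordinate to the nearest grid point. The sketch below does this for a regular grid; the helper name and the grid increment of 0.125 m are hypothetical choices for illustration, since the resolution of `config.grid` is not stated here.

```python
def snap_to_grid(value, lower, upper, increment):
    """Return the grid coordinate closest to value on a regular 1D grid."""
    num_steps = round((upper - lower) / increment)
    index = round((value - lower) / increment)
    index = min(max(index, 0), num_steps)  # stay inside the grid
    return lower + index * increment


# Hypothetical grid: x, y in [-0.5, 0.5] with an increment of 0.125 m.
x, y = 0.23, -0.49
snapped = (
    snap_to_grid(x, -0.5, 0.5, 0.125),
    snap_to_grid(y, -0.5, 0.5, 0.125),
)
print(snapped)
```

Snapping makes the source positions coincide with the focus points of the beamforming grid, which simplifies grid-based target maps at the cost of less realistic source placement.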

get_feature_collection(features, f, num)

Get the feature collection of the dataset.

Returns:: BaseFeatureCollection object.
Return type:: BaseFeatureCollection

acoupipe.datasets.synthetic.sample_signal_length(rng)

class acoupipe.datasets.synthetic.DatasetSyntheticConfig(**kwargs)

Bases: acoupipe.datasets.base.ConfigBase

Default Configuration class.

fs

Sampling frequency in Hz.

Type:: float

signal_length

Length of the source signals in seconds.

Type:: float

max_nsources

Maximum number of sources.

Type:: int

min_nsources

Minimum number of sources.

Type:: int

mode

Type of CSM calculation method.

Type:: str

mic_pos_noise

Apply positional noise to microphone geometry.

Type:: bool

mic_sig_noise

Apply signal noise to microphone signals.

Type:: bool

snap_to_grid

Snap source locations to grid.

Type:: bool

random_signal_length

Randomize signal length (Default: uniformly sampled signal length [1s,10s]).

Type:: bool

fft_params

FFT parameters with default items block_size=128, overlap="50%", window="Hanning" and precision="complex64".

Type:: dict

env

Instance of acoular.Environment defining the environmental conditions, i.e. the speed of sound.

Type:: ac.Environment

mics

Instance of acoular.MicGeom defining the microphone array geometry.

Type:: ac.MicGeom

noisy_mics

A second instance of acoular.MicGeom defining the noisy microphone array geometry.

Type:: ac.MicGeom

obs

Instance of acoular.MicGeom defining the observation point which is used as the reference position when calculating the source strength.

Type:: ac.MicGeom

grid

Instance of acoular.RectGrid defining the grid on which the Beamformer calculates the source map and on which the targetmap feature is calculated.

Type:: ac.RectGrid

source_grid

Instance of acoular.Grid. Only relevant if snap_to_grid is True. Then, the source locations are snapped to this grid. Default is a copy of grid.

Type:: ac.Grid

beamformer

Instance of acoular.BeamformerBase defining the beamformer used to calculate the sourcemap.

Type:: ac.BeamformerBase

steer

Instance of acoular.SteeringVector defining the steering vector used to calculate the sourcemap.

Type:: ac.SteeringVector

freq_data

Instance of acoular.PowerSpectra defining the frequency domain data. Only used if mode is welch. Otherwise, an instance of acoupipe.datasets.spectra_analytic.PowerSpectraAnalytic is used.

Type:: ac.PowerSpectra

fft_spectra

Instance of acoular.FFTSpectra used to calculate the spectrogram data. Only used if mode is welch.

Type:: ac.FFTSpectra

fft_obs_spectra

Instance of acoular.PowerSpectra used to calculate the source strength at the observation point given in obs.

Type:: ac.PowerSpectra

signals

List of signals.

Type:: list

sources

List of sources.

Type:: list

mic_noise_signal

Noise signal configuration object.

Type:: ac.SignalGenerator

mic_noise_source

Noise source configuration object.

Type:: ac.UncorrelatedNoiseSource

micgeom_sampler

Sampler that applies positional noise to the microphone geometry.

Type:: sp.MicGeomSampler

location_sampler

Source location sampler that samples the locations of the sound sources.

Type:: sp.LocationSampler

rms_sampler

Signal RMS sampler that samples the RMS values of the source signals.

Type:: sp.ContainerSampler

nsources_sampler

Number of sources sampler.

Type:: sp.NumericAttributeSampler

mic_noise_sampler

Microphone noise sampler that creates random uncorrelated noise at the microphones.

Type:: sp.ContainerSampler

signal_length_sampler

Signal length sampler that samples the length of the source signals. Only used if random_signal_length is True.

Type:: sp.ContainerSampler

get_sampler()

Return dictionary containing the sampler objects of type acoupipe.sampler.BaseSampler.

This function has to be defined manually in each dataset subclass. The returned dictionary holds the sampler objects as values; each key defines the index (idx) of the sampler in the sampling order.

e.g.:

>>> sampler = {
...     0 : BaseSampler(…),
...     1 : BaseSampler(…),
...     …
... }

Returns:: dictionary containing the sampler objects
Return type:: dict