acoupipe.datasets.synthetic

Contains classes for the generation of microphone array data from synthesized signals for acoustic testing applications.

Currently, the following dataset generators are available:

  • DatasetSynthetic: A simple and fast method that relies on synthetic white noise signals and spatially stationary sources radiating under anechoic conditions.

Figure: Default measurement setup used in the acoupipe.datasets.synthetic module.

Module Contents

class acoupipe.datasets.synthetic.ConfigBase

Bases: traits.api.HasPrivateTraits

Configuration base class for generating microphone array datasets.

get_sampler()

Return dictionary containing the sampler objects of type acoupipe.sampler.BaseSampler.

This function has to be defined manually in each dataset subclass. The returned dictionary holds the sampler objects as values; each key defines the index of the sampler in the sampling order.

e.g.:

>>> sampler = {
>>>     0 : BaseSampler(…),
>>>     1 : BaseSampler(…),
>>>     …
>>> }

Returns:

dictionary containing the sampler objects

Return type:

dict
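The role of the dictionary keys can be illustrated with a minimal sketch. UniformSampler below is a hypothetical stand-in and not an acoupipe class; the only behavior taken from the description above is that the keys define the sampling order.

```python
import random


class UniformSampler:
    """Hypothetical stand-in for a sampler object; not part of acoupipe."""

    def __init__(self, low, high):
        self.low, self.high = low, high
        self.value = None

    def sample(self, rng):
        # Draw one value and remember it on the sampler.
        self.value = rng.uniform(self.low, self.high)
        return self.value


def get_sampler():
    # Keys give the position in the sampling order, values are the samplers.
    return {
        1: UniformSampler(0.0, 1.0),
        0: UniformSampler(1.0, 10.0),
    }


rng = random.Random(0)
sampler = get_sampler()
# Visit the samplers in ascending key order, regardless of insertion order.
order = sorted(sampler)
values = [sampler[idx].sample(rng) for idx in order]
print(order, values)
```

Sorting the keys makes the order explicit, so the sampler stored under key 0 is always drawn first even if it was inserted last.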

class acoupipe.datasets.synthetic.DatasetBase(config=None, tasks=1, logger=None)

Bases: traits.api.HasPrivateTraits

Base class for generating microphone array datasets with specified features and labels.

config

Configuration object for dataset generation.

Type:

ConfigBase

tasks

Number of parallel tasks for data generation. Defaults to 1 (sequential calculation).

Type:

int

get_feature_collection(features, f, num)

Get the feature collection of the dataset.

Returns:

BaseFeatureCollection object.

Return type:

BaseFeatureCollection

generate(features, split, size, f=None, num=0, start_idx=0, progress_bar=True)

Generate dataset samples iteratively.

Parameters:
  • features (list) – List of features included in the dataset. The features “seeds” and “idx” are always included.

  • split (str) – Split name for the dataset (‘training’, ‘validation’ or ‘test’).

  • size (int) – Size of the dataset (number of source cases).

  • f (float) – The center frequency or list of frequencies of the dataset. If None, all frequencies are included.

  • num (integer) –

    Controls the width of the frequency bands considered; defaults to 0 (single frequency line).

    num    frequency band width
    0      single frequency line
    1      octave band
    3      third-octave band
    n      1/n-octave band

  • start_idx (int, optional) – Starting sample index (default is 0).

  • progress_bar (bool, optional) – Whether to show a progress bar (default is True).

Yields:

data (dict) – Generator that yields dataset samples as dictionaries containing the feature names as keys.
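To make the meaning of num concrete, the following sketch computes the band edges around a center frequency f. It assumes the usual fractional-octave definition with edges at f·2^(-1/(2n)) and f·2^(+1/(2n)); band_limits is a hypothetical helper, not an acoupipe function.

```python
def band_limits(f, num):
    """Return (lower, upper) band-edge frequencies in Hz.

    Assumes fractional-octave bands with edges at f * 2**(-1/(2*num)) and
    f * 2**(+1/(2*num)); num=0 denotes a single frequency line.
    """
    if num == 0:
        return (f, f)
    half_width = 2.0 ** (0.5 / num)
    return (f / half_width, f * half_width)


# Third-octave band around 1000 Hz, as in the example settings f=1000, num=3.
lower, upper = band_limits(1000.0, 3)
print(round(lower, 1), round(upper, 1))
```

With this definition the ratio upper/lower equals 2^(1/n), so an octave band (num=1) spans a factor of two.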

Examples

Generate features iteratively.

>>> from acoupipe.datasets.synthetic import DatasetSynthetic
>>> # define the features
>>> features = ["csm", "source_strength_analytic", "loc"]
>>> f = 1000
>>> num = 3
>>> # generate the dataset
>>> generator = DatasetSynthetic().generate(
...     f=f, num=num, split="training", size=2, features=features)
>>> # iterate over the dataset
>>> for data in generator:
...     print(data)

save_h5(features, split, size, name, f=None, num=0, start_idx=0, progress_bar=True)

Save dataset to a HDF5 file.

Parameters:
  • features (list) – List of features included in the dataset. The features “seeds” and “idx” are always included.

  • split (str) – Split name for the dataset (‘training’, ‘validation’ or ‘test’).

  • size (int) – Size of the dataset (number of source cases).

  • name (str) – Name of the HDF5 file.

  • f (float) – The center frequency or list of frequencies of the dataset. If None, all frequencies are included.

  • num (integer) –

    Controls the width of the frequency bands considered; defaults to 0 (single frequency line).

    num    frequency band width
    0      single frequency line
    1      octave band
    3      third-octave band
    n      1/n-octave band

  • start_idx (int, optional) – Starting sample index (default is 0).

  • progress_bar (bool, optional) – Whether to show a progress bar (default is True).

Return type:

None

Examples

Save features to a HDF5 file.

>>> from acoupipe.datasets.synthetic import DatasetSynthetic
>>> # define the features
>>> features = ["csm", "source_strength_analytic", "loc"]
>>> f = 1000
>>> num = 3
>>> # save the dataset
>>> # save_h5 returns None, so the result is not assigned
>>> DatasetSynthetic().save_h5(
...     f=f, num=num, split="training", size=10, features=features, name="/tmp/example.h5")

class acoupipe.datasets.synthetic.AnalyticNoiseStrengthFeature

Bases: SpectraFeature

Handles the calculation of features in the frequency domain.

name

Name of the feature.

Type:

str

freq_data

The frequency data to calculate the feature for.

Type:

instance of class acoular.BaseSpectra

f

the frequency (or center frequency) of interest

Type:

float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:

int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:

list of tuples
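As a rough illustration of what such a list of index tuples could look like, the sketch below maps band edges in Hz onto the one-sided FFT frequency lines. band_indices is a hypothetical helper, and treating the end index as exclusive is an assumption; the convention used by acoupipe is not stated here.

```python
from bisect import bisect_left


def band_indices(freqs, bands):
    """Map (lower, upper) band edges in Hz to (start, end) index pairs.

    Assumed convention: `end` is exclusive, so freqs[start:end] holds the
    lines with lower <= frequency < upper.
    """
    return [(bisect_left(freqs, lo), bisect_left(freqs, hi)) for lo, hi in bands]


fs, block_size = 13720.0, 128
# One-sided FFT frequency lines, spaced fs / block_size apart.
freqs = [k * fs / block_size for k in range(block_size // 2 + 1)]
# Third-octave band around 1000 Hz (edges rounded for the example).
fidx = band_indices(freqs, [(890.9, 1122.5)])
print(fidx, freqs[fidx[0][0]:fidx[0][1]])
```

With the default FFT settings only two frequency lines fall into this third-octave band, which shows how coarse the resolution is at a block size of 128.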

get_feature_func()

Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.AnalyticSourceStrengthFeature

Bases: SpectraFeature

Handles the calculation of features in the frequency domain.

name

Name of the feature.

Type:

str

freq_data

The frequency data to calculate the feature for.

Type:

instance of class acoular.BaseSpectra

f

the frequency (or center frequency) of interest

Type:

float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:

int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:

list of tuples

get_feature_func()

Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.BaseFeatureCollection

Bases: traits.api.HasPrivateTraits

BaseFeatureCollection base class for handling feature funcs.

feature_funcs

List of feature_funcs.

Type:

list

add_feature_func(feature_func)

Add a feature_func to the BaseFeatureCollection.

Parameters:

feature_func (str) – Feature to be added.

get_feature_funcs()

Get all feature_funcs of the BaseFeatureCollection.

Returns:

List of feature_funcs.

Return type:

list
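A minimal sketch of how a collection of feature funcs might be evaluated is given below. FeatureCollection and calc_features are hypothetical stand-ins that only mimic the documented method names; that each feature func takes the sampler dictionary and returns a dictionary keyed by feature name is an assumption.

```python
class FeatureCollection:
    """Hypothetical stand-in that mimics the documented interface."""

    def __init__(self):
        self.feature_funcs = []

    def add_feature_func(self, feature_func):
        self.feature_funcs.append(feature_func)

    def get_feature_funcs(self):
        return self.feature_funcs


def calc_features(collection, sampler):
    # Assumed contract: each func maps the sampler dict to {name: value}.
    data = {}
    for func in collection.get_feature_funcs():
        data.update(func(sampler))
    return data


collection = FeatureCollection()
collection.add_feature_func(lambda sampler: {"idx": sampler["idx"]})
collection.add_feature_func(lambda sampler: {"double": 2 * sampler["idx"]})
data = calc_features(collection, {"idx": 3})
print(data)
```

Merging the per-feature dictionaries yields one sample dictionary with the feature names as keys, matching the shape of the samples yielded by generate.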

class acoupipe.datasets.synthetic.BaseFeatureCollectionBuilder

Bases: traits.api.HasPrivateTraits

BaseFeatureCollectionBuilder base class for building a BaseFeatureCollection.

feature_collection

BaseFeatureCollection object.

Type:

BaseFeatureCollection

add_custom(feature_func)

Add a custom feature to the BaseFeatureCollection.

Parameters:

feature_func (str) – Feature to be added.

build()

Build a BaseFeatureCollection.

Returns:

BaseFeatureCollection object.

Return type:

BaseFeatureCollection

class acoupipe.datasets.synthetic.CSMFeature

Bases: SpectraFeature

CSMFeature class for handling cross-spectral matrix calculation.

name

Name of the feature (default=’csm’).

Type:

str

freq_data

The object which calculates the cross-spectral matrix.

Type:

instance of class acoular.PowerSpectra

f

the frequency (or center frequency) of interest

Type:

float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:

int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:

list of tuples

static calc_csm1(sampler, freq_data, name)

Calculate the cross-spectral matrix (CSM) from time data.

Parameters:

freq_data (instance of class acoular.PowerSpectra) – power spectra to calculate the csm feature

Returns:

The complex-valued cross-spectral matrix with shape (numfreq, num_mics, num_mics).

Return type:

numpy.array
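For orientation, the core of a Welch-type CSM estimate at a single frequency line can be sketched in plain Python: the outer products of the per-block microphone spectra are averaged over all blocks. This is a simplified illustration with made-up numbers, not the acoular implementation, and it omits windowing and scaling factors; cross_spectral_matrix is a hypothetical name.

```python
def cross_spectral_matrix(block_spectra):
    """Average the outer products x x^H over all blocks (one frequency line).

    block_spectra: list of blocks, each a list with one complex value per
    microphone.
    """
    num_mics = len(block_spectra[0])
    csm = [[0j] * num_mics for _ in range(num_mics)]
    for x in block_spectra:
        for i in range(num_mics):
            for j in range(num_mics):
                csm[i][j] += x[i] * x[j].conjugate()
    n = len(block_spectra)
    return [[value / n for value in row] for row in csm]


# Two blocks of made-up spectra for a three-microphone array.
blocks = [[1 + 1j, 2 - 1j, 0.5j], [0.5 - 2j, 1j, 1 + 0j]]
csm = cross_spectral_matrix(blocks)
print(csm[0][0], csm[0][1])
```

The result is Hermitian with real, non-negative auto-powers on the diagonal; stacking one such matrix per frequency line gives the shape (numfreq, num_mics, num_mics).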

static calc_csm2(sampler, freq_data, fidx, name)

Calculate the cross-spectral matrix (CSM) from time data.

Parameters:
  • freq_data (instance of class acoular.PowerSpectra) – power spectra to calculate the csm feature

  • fidx (list of tuples, optional) – list of tuples containing the start and end indices of the frequency bands to be considered, by default None

Returns:

The complex-valued cross-spectral matrix with shape (numfreq, num_mics, num_mics) with numfreq depending on the number of frequencies in fidx.

Return type:

numpy.array

get_feature_func()

Return the callable for calculating the cross-spectral matrix.

class acoupipe.datasets.synthetic.CSMtriuFeature

Bases: SpectraFeature

Handles the calculation of features in the frequency domain.

name

Name of the feature.

Type:

str

freq_data

The frequency data to calculate the feature for.

Type:

instance of class acoular.BaseSpectra

f

the frequency (or center frequency) of interest

Type:

float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:

int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:

list of tuples

static calc_csmtriu1(sampler, freq_data, name)

Calculate the cross-spectral matrix (CSM) from time data.

Parameters:

freq_data (instance of class acoular.PowerSpectra) – power spectra to calculate the csm feature

Returns:

The real-valued cross-spectral matrix with shape (numfreq, num_mics, num_mics).

Return type:

numpy.array
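Because the CSM is Hermitian, a real-valued matrix of the same shape can carry all of its information. The sketch below shows one possible packing, with the real parts in the upper triangle and the imaginary parts in the lower triangle. Whether acoupipe uses exactly this layout is an assumption, and both function names are hypothetical.

```python
def pack_csm_real(csm):
    """Pack a Hermitian matrix into a real matrix of the same shape."""
    n = len(csm)
    packed = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j >= i:
                packed[i][j] = csm[i][j].real  # upper triangle incl. diagonal
            else:
                packed[i][j] = csm[j][i].imag  # imaginary part of mirrored entry
    return packed


def unpack_csm_real(packed):
    """Invert pack_csm_real and restore the complex Hermitian matrix."""
    n = len(packed)
    csm = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            imag = packed[j][i] if j > i else 0.0
            csm[i][j] = complex(packed[i][j], imag)
            csm[j][i] = csm[i][j].conjugate()
    return csm


csm = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]
packed = pack_csm_real(csm)
print(packed)
```

The packing is lossless for Hermitian input: unpacking the real matrix reproduces the original complex CSM.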

static calc_csmtriu2(sampler, freq_data, fidx, name)

Calculate the cross-spectral matrix (CSM) from time data.

Parameters:
  • freq_data (instance of class acoular.PowerSpectra) – power spectra to calculate the csm feature

  • fidx (list of tuples, optional) – list of tuples containing the start and end indices of the frequency bands to be considered, by default None

Returns:

The real-valued cross-spectral matrix with shape (numfreq, num_mics, num_mics) with numfreq depending on the number of frequencies in fidx.

Return type:

numpy.array

get_feature_func()

Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.EigmodeFeature

Bases: SpectraFeature

Handles the calculation of features in the frequency domain.

name

Name of the feature.

Type:

str

freq_data

The frequency data to calculate the feature for.

Type:

instance of class acoular.BaseSpectra

f

the frequency (or center frequency) of interest

Type:

float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:

int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:

list of tuples

static calc_eigmode1(sampler, freq_data, name)

Calculate the eigenvalue-scaled eigenvectors of the cross-spectral matrix (CSM) from time data.

Parameters:

freq_data (instance of class acoular.PowerSpectra) – power spectra to calculate the csm feature

Returns:

The eigenvalue scaled eigenvectors with shape (numfreq, num_mics, num_mics).

Return type:

numpy.array
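In matrix notation, and assuming that each eigenvector is scaled by its eigenvalue itself rather than by its square root (the description above does not spell this out), the feature can be written as follows. Here \(\mathbf{C}(f)\) is the Hermitian CSM at frequency \(f\), \(\mathbf{V}\) the unitary matrix of its eigenvectors and \(\boldsymbol{\Lambda}\) the diagonal matrix of its eigenvalues; the symbol \(\mathbf{E}(f)\) for the feature is introduced only for this illustration.

```latex
\mathbf{C}(f) = \mathbf{V}\,\boldsymbol{\Lambda}\,\mathbf{V}^{H},
\qquad
\mathbf{E}(f) = \mathbf{V}\,\boldsymbol{\Lambda}
\quad\Longrightarrow\quad
\mathbf{C}(f) = \mathbf{E}(f)\,\mathbf{V}^{H}
```

Under this assumption \(\mathbf{E}(f)\) has the same shape as the CSM, i.e. (num_mics, num_mics) per frequency.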

static calc_eigmode2(sampler, freq_data, fidx, name)

Calculate the eigenvalue-scaled eigenvectors of the cross-spectral matrix (CSM) from time data.

Parameters:
  • freq_data (instance of class acoular.PowerSpectra) – power spectra to calculate the csm feature

  • fidx (list of tuples, optional) – list of tuples containing the start and end indices of the frequency bands to be considered, by default None

Returns:

The eigenvalue scaled eigenvectors with shape (numfreq, num_mics, num_mics) with numfreq depending on the number of frequencies in fidx.

Return type:

numpy.array

get_feature_func()

Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.EstimatedNoiseStrengthFeature

Bases: SpectraFeature

Handles the calculation of features in the frequency domain.

name

Name of the feature.

Type:

str

freq_data

The frequency data to calculate the feature for.

Type:

instance of class acoular.BaseSpectra

f

the frequency (or center frequency) of interest

Type:

float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:

int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:

list of tuples

get_feature_func()

Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.EstimatedSourceStrengthFeature

Bases: SpectraFeature

Handles the calculation of features in the frequency domain.

name

Name of the feature.

Type:

str

freq_data

The frequency data to calculate the feature for.

Type:

instance of class acoular.BaseSpectra

f

the frequency (or center frequency) of interest

Type:

float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:

int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:

list of tuples

get_feature_func()

Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.LocFeature

Bases: BaseFeatureCatalog

BaseFeatureCatalog base class for handling feature funcs.

name

Name of the feature.

Type:

str

get_feature_func()

Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.SourcemapFeature

Bases: BaseFeatureCatalog

SourcemapFeature class for handling the generation of sourcemaps obtained with microphone array methods.

name

Name of the feature (default=’sourcemap’).

Type:

str

beamformer

The beamformer to calculate the sourcemap.

Type:

instance of class acoular.BeamformerBase

f

The center frequency or list of frequencies of the dataset. If None, all frequencies are included.

Type:

float

num

Controls the width of the frequency bands considered; defaults to 0 (single frequency line).

num    frequency band width
0      single frequency line
1      octave band
3      third-octave band
n      1/n-octave band

Type:

integer

fidx

List of tuples containing the start and end indices of the frequency bands to be considered. Determined automatically from f and num.

Type:

list of tuples

set_freq_limits()

Set the frequency limits of the beamformer so that the result is only calculated for necessary frequencies.

get_feature_func()

Return the callable for calculating the sourcemap.

class acoupipe.datasets.synthetic.SpectrogramFeature

Bases: SpectraFeature

SpectrogramFeature class for handling spectrogram features.

name

Name of the feature (default=’spectrogram’).

Type:

str

freq_data

The object which calculates the spectrogram data.

Type:

instance of class acoular.FFTSpectra

f

the frequency (or center frequency) of interest

Type:

float

num

the frequency band (0: single frequency line, 1: octave band, 3: third octave band)

Type:

int

fidx

List of tuples containing the start and end indices of the frequency bands to be considered.

Type:

list of tuples

get_feature_func()

Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.TargetmapFeature

Bases: BaseFeatureCatalog

BaseFeatureCatalog base class for handling feature funcs.

name

Name of the feature.

Type:

str

get_feature_func()

Will return a method depending on the class parameters.

class acoupipe.datasets.synthetic.TimeDataFeature

Bases: BaseFeatureCatalog

TimeDataFeature class for handling time data.

name

Name of the feature (default=’time_data’).

Type:

str

time_data

The source delivering the time data.

Type:

instance of class acoular.SamplesGenerator

get_feature_func()

Return the callable for calculating the time data.

class acoupipe.datasets.synthetic.PowerSpectraAnalytic

Bases: acoular.PowerSpectraImport

Provides a dummy class for using pre-calculated cross-spectral matrices.

This class does not calculate the cross-spectral matrix. Instead, the user can inject one or multiple existing CSMs by setting the csm attribute. This can be useful when algorithms are to be evaluated with existing CSMs. The frequency or frequencies contained in the CSM must be set via the frequencies attribute. The numchannels attribute is determined from the CSM shape. In contrast to the PowerSpectra object, the attributes sample_freq, time_data, source, block_size, calib, window, overlap, cached, and num_blocks have no functionality.

fftfreq()

Return the Discrete Fourier Transform sample frequencies.

Returns:

f – Array of length block_size/2+1 containing the sample frequencies.

Return type:

ndarray
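The returned frequencies follow the standard one-sided DFT layout, which can be sketched without any dependencies. fft_frequencies is a hypothetical helper; the values of fs and block_size are the defaults listed for DatasetSynthetic.

```python
def fft_frequencies(fs, block_size):
    """Frequencies in Hz of the block_size/2 + 1 one-sided FFT lines."""
    return [k * fs / block_size for k in range(block_size // 2 + 1)]


freqs = fft_frequencies(13720.0, 128)
print(len(freqs), freqs[1], freqs[-1])  # prints 65 107.1875 6860.0
```

The last entry is the Nyquist frequency fs/2, and the spacing between neighboring lines is fs/block_size.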

acoupipe.datasets.synthetic.get_all_source_signals(source_list)

Get all signals from a list of acoular.SamplesGenerator derived objects.

Parameters:

source_list (list) – list of acoular.SamplesGenerator derived objects

Returns:

list of all acoular.SignalGenerator derived objects

Return type:

list

acoupipe.datasets.synthetic.get_uncorrelated_noise_source_recursively(source)

Recursively get all uncorrelated noise sources from a acoular.TimeInOut object.

Parameters:

source (instance of class acoular.TimeInOut) – the source object

Returns:

list of all uncorrelated noise sources

Return type:

list

class acoupipe.datasets.synthetic.DatasetSynthetic(mode='welch', mic_pos_noise=True, mic_sig_noise=True, snap_to_grid=False, random_signal_length=False, signal_length=5, fs=13720.0, min_nsources=1, max_nsources=10, tasks=1, logger=None, config=None)

Bases: acoupipe.datasets.base.DatasetBase

DatasetSynthetic is a purely synthetic microphone array source case generator.

DatasetSynthetic relies on synthetic source signals from which the features are extracted and has been used in different publications, e.g. [KHS19], [KS22], [FZHX22]. The default virtual simulation setup considers a 64-channel microphone array and a planar observation area, as shown in the default measurement setup figure.

Default environmental properties

Default Environmental Characteristics

  Environment         Anechoic, Resting, Homogeneous Fluid
  Speed of sound      343 m/s
  Microphone Array    Vogel’s spiral, \(M=64\), Aperture Size 1 m
  Observation Area    x,y in [-0.5,0.5], z=0.5
  Source Type         Monopole
  Source Signals      Uncorrelated White Noise (\(T=5\,s\))

Default FFT parameters

The underlying default FFT parameters are:

FFT Parameters

  Sampling Rate    He = 40, fs=13720 Hz
  Block size       128 Samples
  Block overlap    50 %
  Windowing        von Hann / Hanning
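The default values are linked by simple arithmetic, sketched below. Two assumptions are made: He is read as the Helmholtz number formed with the 1 m aperture and the speed of sound of 343 m/s, and blocks are counted as full blocks without zero padding, which may differ from the exact count used by acoular.

```python
c = 343.0          # speed of sound in m/s
aperture = 1.0     # array aperture in m
helmholtz = 40.0   # assumed definition: He = f * aperture / c

fs = helmholtz * c / aperture   # sampling rate in Hz
block_size = 128
hop = block_size // 2           # 50 % overlap
num_samples = int(5 * fs)       # default signal length T = 5 s

delta_f = fs / block_size       # spacing of the frequency lines in Hz
# Assumed counting rule: full blocks only, no zero padding.
num_blocks = (num_samples - block_size) // hop + 1
print(fs, delta_f, num_blocks)
```

Under these assumptions a 5 s signal provides on the order of a thousand averaged blocks per CSM estimate, at a frequency resolution of roughly 107 Hz.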

Default randomized properties

Several properties of the dataset are randomized for each source case when generating the data. Their respective distributions are closely related to [HS17]. As such, the microphone positions are spatially disturbed to account for uncertainties in the microphone placement. The number of sources, their positions, and their strengths are randomly chosen. Uncorrelated white noise is added to the microphone channels by default.

Randomized properties

  Sensor Position Deviation [m]                           Bivariate normal distributed (\(\sigma = 0.001\))
  No. of Sources                                          Poisson distributed (\(\lambda=3\))
  Source Positions [m]                                    Bivariate normal distributed (\(\sigma = 0.1688\))
  Source Strength (\([{Pa}^2]\) at reference position)    Rayleigh distributed (\(\sigma_{R}=5\))
  Relative Noise Variance                                 Uniform distributed (\(10^{-6}\), \(0.1\))
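The distributions listed above can be sketched with the Python standard library. This is only an illustration of the statistics and not the acoupipe sampling code: draw_source_case and poisson are hypothetical helpers, redrawing the source count until it lies between min_nsources and max_nsources is an assumed way of enforcing the limits, and any clipping of positions to the observation area is left out.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson-distributed integer with Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def draw_source_case(rng, min_nsources=1, max_nsources=10):
    # Redraw until the count lies in the allowed range (assumed handling).
    nsources = poisson(rng, 3.0)
    while not min_nsources <= nsources <= max_nsources:
        nsources = poisson(rng, 3.0)
    # Source positions: independent normal x and y with sigma = 0.1688 m.
    locations = [(rng.gauss(0.0, 0.1688), rng.gauss(0.0, 0.1688))
                 for _ in range(nsources)]
    # Rayleigh-distributed strengths via inverse transform, sigma_R = 5.
    strengths = [5.0 * math.sqrt(-2.0 * math.log(1.0 - rng.random()))
                 for _ in range(nsources)]
    # Relative noise variance, uniform between 1e-6 and 0.1.
    noise_variance = rng.uniform(1e-6, 0.1)
    return nsources, locations, strengths, noise_variance


nsources, locations, strengths, noise_variance = draw_source_case(random.Random(1))
print(nsources, noise_variance)
```

Seeding the random number generator makes each source case reproducible, which mirrors the role of the "seeds" feature that is always included in the dataset.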

Example

from acoupipe.datasets.synthetic import DatasetSynthetic

dataset = DatasetSynthetic()
dataset_generator = dataset.generate(
    features=["sourcemap", "loc", "f", "num"], # choose the features to extract
    f=[1000,2000,3000], # choose the frequencies to extract
    split='training', # choose the split of the dataset
    size=10, # choose the size of the dataset
    )

# get the first data sample
data = next(dataset_generator)

# print the keys of the dataset
print(data.keys())

Initialization Parameters

Initialize the DatasetSynthetic object.

The input parameters are passed to the DatasetSyntheticConfig object, which creates all necessary objects for the simulation of microphone array data.

Parameters:
  • mode (str) – Type of calculation method. Can be either welch, analytic or wishart. Defaults to welch.

  • mic_pos_noise (bool) – Apply positional noise to microphone geometry. Defaults to True.

  • mic_sig_noise (bool) – Apply additional uncorrelated white noise to microphone signals. Defaults to True.

  • snap_to_grid (bool) – Snap source locations to grid. The grid is defined in the config object as config.grid. Defaults to False.

  • random_signal_length (bool) – Randomize signal length. Defaults to False. If True, the signal length is uniformly sampled from the interval [1s,10s].

  • signal_length (float) – Length of the signal in seconds. Defaults to 5 seconds.

  • fs (float) – Sampling frequency in Hz. Defaults to 13720 Hz.

  • min_nsources (int) – Minimum number of sources in the dataset. Defaults to 1.

  • max_nsources (int) – Maximum number of sources in the dataset. Defaults to 10.

  • tasks (int) – Number of parallel tasks. Defaults to 1.

  • logger (logging.Logger) – Logger object. Defaults to None.

  • config (DatasetSyntheticConfig) – Configuration object. Defaults to None. If None, a default configuration object is created.

get_feature_collection(features, f, num)

Get the feature collection of the dataset.

Returns:

BaseFeatureCollection object.

Return type:

BaseFeatureCollection

acoupipe.datasets.synthetic.sample_signal_length(rng)

class acoupipe.datasets.synthetic.DatasetSyntheticConfig(**kwargs)

Bases: acoupipe.datasets.base.ConfigBase

Default Configuration class.

fs

Sampling frequency in Hz.

Type:

float

signal_length

Length of the source signals in seconds.

Type:

float

max_nsources

Maximum number of sources.

Type:

int

min_nsources

Minimum number of sources.

Type:

int

mode

Type of CSM calculation method.

Type:

str

mic_pos_noise

Apply positional noise to microphone geometry.

Type:

bool

mic_sig_noise

Apply signal noise to microphone signals.

Type:

bool

snap_to_grid

Snap source locations to grid.

Type:

bool

random_signal_length

Randomize signal length (Default: uniformly sampled signal length [1s,10s]).

Type:

bool

fft_params

FFT parameters with default items block_size=128, overlap="50%", window="Hanning" and precision="complex64".

Type:

dict

env

Instance of acoular.Environment defining the environmental conditions, i.e. the speed of sound.

Type:

ac.Environment

mics

Instance of acoular.MicGeom defining the microphone array geometry.

Type:

ac.MicGeom

noisy_mics

A second instance of acoular.MicGeom defining the noisy microphone array geometry.

Type:

ac.MicGeom

obs

Instance of acoular.MicGeom defining the observation point which is used as the reference position when calculating the source strength.

Type:

ac.MicGeom

grid

Instance of acoular.RectGrid defining the grid on which the Beamformer calculates the source map and on which the targetmap feature is calculated.

Type:

ac.RectGrid

source_grid

Instance of acoular.Grid. Only relevant if snap_to_grid is True. Then, the source locations are snapped to this grid. Default is a copy of grid.

Type:

ac.Grid

beamformer

Instance of acoular.BeamformerBase defining the beamformer used to calculate the sourcemap.

Type:

ac.BeamformerBase

steer

Instance of acoular.SteeringVector defining the steering vector used to calculate the sourcemap.

Type:

ac.SteeringVector

freq_data

Instance of acoular.PowerSpectra defining the frequency domain data. Only used if mode is welch. Otherwise, an instance of acoupipe.datasets.spectra_analytic.PowerSpectraAnalytic is used.

Type:

ac.PowerSpectra

fft_spectra

Instance of acoular.FFTSpectra used to calculate the spectrogram data. Only used if mode is welch.

Type:

ac.FFTSpectra

fft_obs_spectra

Instance of acoular.PowerSpectra used to calculate the source strength at the observation point given in obs.

Type:

ac.PowerSpectra

signals

List of signals.

Type:

list

sources

List of sources.

Type:

list

mic_noise_signal

Noise signal configuration object.

Type:

ac.SignalGenerator

mic_noise_source

Noise source configuration object.

Type:

ac.UncorrelatedNoiseSource

micgeom_sampler

Sampler that applies positional noise to the microphone geometry.

Type:

sp.MicGeomSampler

location_sampler

Source location sampler that samples the locations of the sound sources.

Type:

sp.LocationSampler

rms_sampler

Signal RMS sampler that samples the RMS values of the source signals.

Type:

sp.ContainerSampler

nsources_sampler

Number of sources sampler.

Type:

sp.NumericAttributeSampler

mic_noise_sampler

Microphone noise sampler that creates random uncorrelated noise at the microphones.

Type:

sp.ContainerSampler

signal_length_sampler

Signal length sampler that samples the length of the source signals. Only used if random_signal_length is True.

Type:

sp.ContainerSampler

get_sampler()

Return dictionary containing the sampler objects of type acoupipe.sampler.BaseSampler.

This function has to be defined manually in each dataset subclass. The returned dictionary holds the sampler objects as values; each key defines the index of the sampler in the sampling order.

e.g.:

>>> sampler = {
>>>     0 : BaseSampler(…),
>>>     1 : BaseSampler(…),
>>>     …
>>> }

Returns:

dictionary containing the sampler objects

Return type:

dict