DatasetSynthetic#

DatasetSynthetic is a purely synthetic microphone array source case generator. It relies on synthetic source signals from which the features are extracted and has been used in different publications, e.g. [KHS19], [KS22], [FZHX22]. The default virtual simulation setup considers a 64 channel microphone array and a planar observation area, as shown in the figure below.

../../_images/msm_layout.png

Default measurement setup used in the acoupipe.datasets.synthetic module.#

Default environmental properties#

Default Environmental Characteristics#

Environment

Anechoic, Resting, Homogeneous Fluid

Speed of sound

343 m/s

Microphone Array

Vogel’s spiral, \(M=64\), Aperture Size 1 m

Observation Area

x,y in [-0.5,0.5], z=0.5

Source Type

Monopole

Source Signals

Uncorrelated White Noise (\(T=5\,s\))

Default FFT parameters#

The underlying default FFT parameters are:

FFT Parameters#

Sampling Rate

He = 40, fs=13720 Hz

Block size

128 Samples

Block overlap

50 %

Windowing

von Hann / Hanning

Randomized properties#

Several properties of the dataset are randomized for each source case when generating the data. Their respective distributions are closely related to [HS17]. As such, the microphone positions are spatially disturbed to account for uncertainties in the microphone placement. The number of sources, their positions, and strength are randomly chosen. Uncorrelated white noise is added to the microphone channels by default.

Randomized properties#

Sensor Position Deviation [m]

Bivariate normal distributed (\(\sigma = 0.001)\)

No. of Sources

Poisson distributed (\(\lambda=3\))

Source Positions [m]

Bivariate normal distributed (\(\sigma = 0.1688\))

Source Strength (\([{Pa}^2]\) at reference position)

Rayleigh distributed (\(\sigma_{R}=5\))

Relative Noise Variance

Uniform distributed (\(10^{-6}\), \(0.1\))

Example#

from acoupipe.datasets.synthetic import DatasetSynthetic

dataset = DatasetSynthetic()
dataset_generator = dataset.generate(
    features=['sourcemap', 'loc', 'f', 'num'],  # choose the features to extract
    f=[1000, 2000, 3000],  # choose the frequencies to extract
    split='training',  # choose the split of the dataset
    size=10,  # choose the size of the dataset
)

# get the first data sample
data = next(dataset_generator)

# print the keys of the dataset
print(data.keys())

The generator yields one sample at a time as a dictionary. It includes the chosen features and the helper fields idx (sample index) and seeds (random seeds) for reproducibility in multi-processing scenarios.

API reference: acoupipe.datasets.synthetic.DatasetSynthetic