Simulating realistically-spatialised simultaneous speech using video-driven speaker detection and the CHiME-5 dataset

Download from Google Drive.