openhsv package


openhsv.main module

Module contents

class openhsv.OpenHSV(app, base_folder='C:/openhsv', verbose=False)

Bases: PyQt5.QtWidgets.QWidget

OpenHSV is the main class for recording high-speed videoendoscopy footage and audio. It interacts with the audio interface and the camera, performs deep neural network-based analysis, and saves the data.

  • app (QtWidgets.QApplication) – the QApplication instance; this is the only argument required to initialize OpenHSV
  • base_folder (str, optional) – location where data is stored, defaults to 'C:/openhsv'
  • verbose (bool, optional) – if additional information should be printed to the Python console, defaults to False

Opens a window to select a patient from the database.


Shows the window maximized and updates the range indicator.


Takes a screenshot of the current camera image and saves it as a PNG file.


Opens the settings dialog and saves the settings.

initSettings(exposureTime=245, videoSamplingRate=4000, audioSamplingRate=80000, audioBlockSize=4000, audioBufferSize=3, baseFolder='', saveRaw=True)

Initializes camera, audio and saving settings

  • exposureTime (int, optional) – camera exposure time in µs, defaults to 245
  • videoSamplingRate (int, optional) – frames per second, defaults to 4000
  • audioSamplingRate (int, optional) – audio sampling rate in Hz, defaults to 80000
  • audioBlockSize (int, optional) – audio block size transmitted from interface, defaults to 4000
  • audioBufferSize (int, optional) – audio buffer size (multiples of block size), defaults to 3
  • baseFolder (str, optional) – base folder for data saving, defaults to '' (empty string)
  • saveRaw (bool, optional) – if raw video data should be saved as losslessly compressed mp4, defaults to True
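
The defaults above imply a few timing relationships that are easy to check by hand. The following is a minimal sketch and not part of OpenHSV: the RecordingSettings dataclass and its derived properties are hypothetical names that only mirror the documented defaults, and the audio block size is assumed to be given in samples.

```python
from dataclasses import dataclass


@dataclass
class RecordingSettings:
    """Hypothetical container mirroring the documented initSettings defaults."""
    exposureTime: int = 245          # camera exposure time in microseconds
    videoSamplingRate: int = 4000    # frames per second
    audioSamplingRate: int = 80000   # Hz
    audioBlockSize: int = 4000       # assumed to be samples per block
    audioBufferSize: int = 3         # buffer length in multiples of a block

    @property
    def frame_period_us(self) -> float:
        """Time between two consecutive video frames in microseconds."""
        return 1_000_000 / self.videoSamplingRate

    @property
    def audio_samples_per_frame(self) -> float:
        """Audio samples acquired during one video frame."""
        return self.audioSamplingRate / self.videoSamplingRate

    @property
    def block_duration_s(self) -> float:
        """Duration of one audio block in seconds."""
        return self.audioBlockSize / self.audioSamplingRate

    @property
    def buffer_duration_s(self) -> float:
        """Duration covered by the whole audio buffer in seconds."""
        return self.audioBufferSize * self.block_duration_s

    def exposure_fits_frame(self) -> bool:
        """The exposure cannot be longer than the frame period."""
        return self.exposureTime <= self.frame_period_us


settings = RecordingSettings()
print(settings.frame_period_us, settings.audio_samples_per_frame)
```

With the defaults, one frame lasts 250 µs, so the 245 µs exposure fits within a frame, and each frame corresponds to 20 audio samples.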

Opens interface for patient information


Updates the range indicator that shows the subselection of the video for download or analysis.


Initializes the camera connection. It opens the camera, performs the basic configuration and applies advanced settings, such as exposure time and video sampling rate (fps). If the camera connection could be established, further buttons are enabled.

  • force_init (bool, optional) – forces (re-)initialization of camera, defaults to False

setImage(im, restore_view=True, restore_levels=False)

Shows an image in the camera preview window. It can also restore the previous view, i.e. zoom and location, as well as the levels (contrast, brightness) of the previous image. Restoring the levels is currently deactivated by default, because during an examination it is quite common that the image has no signal (camera away from the patient) or is oversaturated (camera very close to the mouth or tongue).

  • im (numpy.ndarray) – image to be shown in preview
  • restore_view (bool, optional) – if view should be restored from previous image, defaults to True
  • restore_levels (bool, optional) – if contrast/brightness should be restored from previous image, defaults to False

Initializes the audio recorder and empties the audio queue and data list. It selects the first audio interface found (in OpenHSV, the Focusrite Scarlett 2i2) and both of its channels (by default, channel 1 is the camera reference signal and channel 2 the actual audio signal). Every audio data block is passed to the callback function. The callback already runs on a separate thread, so there is no need to move it to a different one. The recorder is started immediately.


Stops audio recording and saves the data from the queue to internal memory.
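
The queue-and-callback pattern described above can be sketched with the standard library alone. This is a hedged illustration, not the OpenHSV implementation: fake_interface stands in for the audio driver, and all function and variable names are hypothetical.

```python
import queue
import threading

audio_queue = queue.Queue()   # filled by the callback on the driver thread
audio_data = []               # "internal memory", filled when recording stops


def audio_callback(block):
    """Called once per audio block; it only enqueues, so it returns quickly."""
    audio_queue.put(block)


def fake_interface(n_blocks, block_size, channels=2):
    """Stand-in for the audio driver: emits blocks of silent frames."""
    for _ in range(n_blocks):
        audio_callback([[0.0] * channels for _ in range(block_size)])


def stop_and_collect():
    """Drain every queued block into audio_data and return the block count."""
    n = 0
    while True:
        try:
            audio_data.append(audio_queue.get_nowait())
        except queue.Empty:
            return n
        n += 1


# The "driver" runs on its own thread, like the real callback would.
producer = threading.Thread(target=fake_interface, args=(5, 8))
producer.start()
producer.join()               # recording has finished
collected = stop_and_collect()
print(collected, "blocks moved from the queue to memory")
```

Because queue.Queue is thread-safe, the callback needs no explicit locking, which is one reason this pattern suits a driver that already calls back on its own thread.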

F0(channel_for_F0=1, intensity_threshold=5)

Calculates fundamental frequency from audio signal. It further saves the audio data to internal memory.

  • channel_for_F0 (int, optional) – audio channel used for F0 calculation, defaults to 1. In our setting, channel 0 carries the reference signal and channel 1 the audio signal.
  • intensity_threshold (int, optional) – intensity threshold for calculating F0, defaults to 5
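
The documentation does not state which algorithm OpenHSV uses to compute F0, nor how the intensity threshold is defined. As a hedged illustration only, the sketch below uses a plain autocorrelation estimator and treats the threshold as an RMS amplitude gate; both choices, the estimate_f0 name, and the f_min/f_max search range are assumptions.

```python
import math


def estimate_f0(samples, sampling_rate, intensity_threshold=0.1,
                f_min=50.0, f_max=500.0):
    """Return an F0 estimate in Hz, or None if the block is empty or too quiet.

    Assumption: "intensity" is taken to be the RMS amplitude of the block.
    """
    n = len(samples)
    if n == 0:
        return None
    rms = math.sqrt(sum(s * s for s in samples) / n)
    if rms < intensity_threshold:
        return None

    # Search only lags that correspond to plausible fundamental frequencies.
    min_lag = max(1, int(sampling_rate / f_max))
    max_lag = min(int(sampling_rate / f_min), n - 1)
    best_lag, best_value = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        value = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    if best_lag is None:
        return None
    return sampling_rate / best_lag


# A reduced sampling rate keeps this pure-Python example fast.
fs = 8000
tone = [math.sin(2 * math.pi * 200 * i / fs) for i in range(800)]
silence = [0.0] * 800
print(estimate_f0(tone, fs), estimate_f0(silence, fs))
```

For real recordings at 80000 Hz, a vectorized implementation would be preferable to this pure-Python loop.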

Starts camera (and audio) feed. If grabbing is already active, it stops grabbing from the camera and stops streaming audio data. A full screen preview is shown to provide maximum view. It starts fundamental frequency calculation and saves audio data in memory.


Analyzes the selected range of video data. The selected frames will be downloaded from the camera and subsequently processed, i.e. segmented by the neural network.


Saves the recorded and selected data. Specifically, it saves the metadata (audio, video and patient metadata), the audio data together with the camera reference signal, and the video data.

  • save_last_seconds (int) – number of seconds before the end of the recording to be saved, defaults to 4. We record one second after the stop trigger, and we usually record one second of footage; thus, at least two seconds are needed to ensure that all relevant audio data is saved. To allow for some uncertainty, we recommend saving a few more seconds.
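
Keeping only the last seconds of a recording amounts to slicing the tail of the sample list. The following is a minimal sketch under that assumption; last_seconds is a hypothetical helper, not an OpenHSV function.

```python
def last_seconds(samples, sampling_rate, save_last_seconds=4):
    """Return the trailing `save_last_seconds` seconds of `samples`."""
    n_keep = int(save_last_seconds * sampling_rate)
    if n_keep <= 0:
        return []
    # Slicing with a negative start returns the whole list if it is shorter.
    return samples[-n_keep:]


# Ten seconds of dummy data at a toy sampling rate of 100 Hz.
recording = list(range(1000))
tail = last_seconds(recording, sampling_rate=100, save_last_seconds=4)
print(len(tail), tail[0])
```

At the documented audio sampling rate of 80000 Hz, the default of 4 seconds corresponds to 320000 samples per channel.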


Saving the data in an appropriate format is not trivial: portability, cross-functionality and quality all need to be considered. Therefore, metadata is saved as a structured JSON file, a common format that can be opened and viewed with any text editor and is easily processed by a variety of data analysis software.

Further, audio data is saved as common wav files and is also packed into an HDF5 file. HDF5 is a very common container format that allows complex data to be stored in a very efficient and convenient way.

Video data is saved in the mp4 file format, as it is highly portable and can be viewed with common video viewers. The h264 codec also allows saving the video data losslessly, which is needed for accurate data analysis, while keeping the file size at a reasonable level and preserving the ability to preview the video.

If any segmentation is already available, the segmentation maps are also stored as binary maps in the HDF5 file format.
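
The JSON and wav parts of this scheme can be illustrated with the standard library. This is a sketch under stated assumptions, not the OpenHSV saving routine: save_recording, the file names metadata.json and audio.wav, the metadata keys, and the 16-bit sample format are all hypothetical. The HDF5 and mp4 outputs are omitted because they require third-party libraries.

```python
import json
import math
import struct
import tempfile
import wave
from pathlib import Path


def save_recording(folder, metadata, reference, audio, sampling_rate):
    """Write metadata.json and a 2-channel, 16-bit audio.wav into `folder`."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    # Metadata as human-readable JSON.
    meta_path = folder / "metadata.json"
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")

    # Interleave reference and audio samples, scaled from [-1, 1] to int16.
    frames = bytearray()
    for ref, mic in zip(reference, audio):
        for value in (ref, mic):
            clipped = max(-1.0, min(1.0, value))
            frames += struct.pack("<h", int(clipped * 32767))

    wav_path = folder / "audio.wav"
    with wave.open(str(wav_path), "wb") as wav:
        wav.setnchannels(2)          # channel 0: reference, channel 1: audio
        wav.setsampwidth(2)          # 2 bytes per sample = 16 bit
        wav.setframerate(sampling_rate)
        wav.writeframes(bytes(frames))
    return meta_path, wav_path


fs = 80000
n = fs // 100                        # 10 ms of dummy data
reference = [1.0 if (i // 10) % 2 == 0 else -1.0 for i in range(n)]
audio = [0.5 * math.sin(2 * math.pi * 200 * i / fs) for i in range(n)]
out_dir = tempfile.mkdtemp()
meta_path, wav_path = save_recording(
    out_dir, {"audioSamplingRate": fs, "videoSamplingRate": 4000},
    reference, audio, fs)
print(meta_path, wav_path)
```

Storing the camera reference signal and the audio signal as the two channels of one file keeps them sample-aligned, which matters when audio is later synchronized with video frames.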

close(self) → bool