Welcome to OpenHSV’s documentation!¶
OpenHSV is an open platform for laryngeal high-speed videoendoscopy. This documentation will help you set up OpenHSV and perform data analysis.

Note
OpenHSV is a research tool only and is not FDA approved. You should consult your local ethics committee before using OpenHSV in your environment/clinic.
Introduction¶
OpenHSV is an open-source system for high-speed videoendoscopy (HSV). In contrast to videostroboscopy, HSV resolves each vocal fold oscillation cycle with high temporal resolution. To not only visualize but also quantify vocal fold oscillation behavior, OpenHSV also contains a data analysis part that segments the glottal area in each frame and computes quantitative parameters from it. We further acquire audio signals simultaneously, synchronized to the video signal, and compute quantitative parameters from the audio signal as well.
In this user guide, we provide the steps to build the system yourself.
Steps:
- Hardware setup
- Software setup
- First acquisition
- Data analysis
Hardware setup¶
The OpenHSV system consists of audio, video, illumination, and acquisition modules.

Audio¶
The audio signal is acquired using a high-quality lavalier microphone, e.g. a DPA 4060. This is connected to a conventional USB-powered two-channel audio interface; we need two channels because one carries the microphone signal and the other the camera sync signal. The microphone is mounted on a custom 3D-printed clamp that attaches to the endoscope.
Video¶
Follow these steps:
- Mount objective/video coupler to camera
- Connect video coupler to endoscope
- Connect light guide to endoscope
- Mount microphone mount on endoscope
- Connect camera to breakout board
- Connect microphone to audio interface
- Connect camera sync out to audio interface (you may need a BNC-to-TRS adapter)
- Connect camera to computer (patch cable)
You can start and stop the acquisition using the OpenHSV software; however, stopping the acquisition with a foot switch is better suited to an examination setting. For that, connect a foot switch to the TRIG IN port. Ensure that the foot switch has a BNC connector.
Note
The foot switch may come with bare wire ends. This may require some additional tinkering, such as soldering a BNC connector to the cable. Alternatively, you may use a screw-type BNC connector.
Illumination¶
For high-speed videoendoscopy, a powerful illumination unit, such as the STORZ TL 300, is indispensable. Connect the power cable and the light guide accordingly. You may need 100% of the light output, depending on your video coupler and recording sampling rate.
Acquisition¶
We use a conventional personal computer (PC) running Windows 10. Depending on your PC, you may add a second Gigabit Ethernet adapter to have both network access and camera data transfer. With IDT cameras, the Ethernet adapter needs an IP address in the same subnet as the camera. Check the camera's IP address and configure the adapter accordingly. For example, if the camera has 100.0.0.100, you may use 100.0.0.99 on your PC.
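For example, on Windows you can assign such a static address to the camera-facing adapter from an elevated command prompt. The adapter name "Ethernet 2" is an assumption; use the name shown in your network settings:
netsh interface ip set address name="Ethernet 2" static 100.0.0.99 255.255.255.0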
Software setup¶
We assume that Windows 10 is running on your system and that you have basic knowledge of Python.
Requirements¶
Ensure that the following software is installed on your computer:
- Python 3.x, tested with Python 3.6 and the Anaconda distribution
- IDT SDK (to access drivers, get from distributor)
Supported cameras¶
OpenHSV currently supports only the CCM series of IDT high-speed cameras, but it can be extended to your individual setup (see the subclassing sketch in the camera module reference below).
Install and Operate OpenHSV¶
Clone our GitHub repository and install the openhsv package with pip from the repository root:
pip install .
Run OpenHSV by executing the main.py script:
python main.py
An easy way to create a shortcut to the OpenHSV software is to create a bat file with the following content:
:: Activate anaconda3 environment
call "path/to/activate.bat" "path/to/anaconda3"
:: Go to openhsv directory
pushd "path/to/openhsv"
:: Execute main.py (unbuffered output)
python.exe -u "main.py"
You may then place a shortcut to that file on the Desktop to allow easy access for examiners.
A common problem is that the camera drivers are not found. Ensure that the driver files are either available in the system PATH or directly in the OpenHSV directory. You may need all DLLs from the IDT SDK.
Testing environment¶
If no camera is available, or if other parts of OpenHSV should be tested, we supply a dummy camera that loops through the example video shipped with OpenHSV.
You only need to change the following line in the __init__.py file:
from openhsv.hardware.camera import IdtCamera as Camera
to
from openhsv.hardware.camera import DummyCamera as Camera
Your first acquisition¶
Follow these steps:
- Connect everything according to Hardware setup.
- Ensure all software is installed according to Software setup.
- Turn on computer, illumination, and camera.
- Open the OpenHSV software.
- Click on “Initialize camera”. This opens a connection to the camera using the default settings.
- Click on “Start Camera Feed”. Now, the camera should stream images as a live preview, and the audio display should show the reference signal and an acoustic trace.
- Click on “Stop Camera Feed” or use the optional foot switch to stop the recording.
- Use one of the sliders below to browse through the acquired images.
- Select a region of interest using the two range sliders. The bright blue area indicates the selected frames.
- You may analyze your data by clicking on “Start Analysis - Glottis Segmentation” after selecting the glottis using the rectangle in the preview window. See also Data Analysis.
- Click on “Save data” to save the audio, video, and metadata to the local file system. The frames are downloaded from the camera and stored in conventional formats.
Saving data¶
Data is stored in several ways:
Data source | File type
---|---
Video data | mp4 (compressed)
Video data | mp4 (lossless, recommended for further analysis)
Audio data | wav (uncompressed)
Metadata | json (human-readable, uncompressed)
Analysis | hdf5 (if available)
Parameters | csv (if available)
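To illustrate these formats, the following sketch loads a saved recording for further processing. The file names are placeholders and depend on how your recordings are named:
import json
import imageio
from scipy.io import wavfile

# File names are examples; adapt them to your recording
with open("meta.json") as f:
    meta = json.load(f)                      # human-readable metadata

rate, audio = wavfile.read("audio.wav")      # reference + microphone channels
frames = imageio.mimread("video.mp4", memtest=False)  # list of video frames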
Data Analysis¶
Data analysis is performed for audio (a) and video (b) data and follows different paths:

Audio analysis¶
Audio data is cropped to the last 4 seconds (an arbitrary choice that can be changed). The landmarks for acquisition and synchronization are detected using several peak-finding algorithms (see code). The audio data that corresponds to the video data is subsequently analyzed (raw signal in the Figure above, panel a).
Quantitative parameters are computed on the signal; a short sketch of the jitter measures follows this list. Currently, these are:
- mean-jitter
- jitter-percent
- mean-shimmer
- shimmer-percent
- fundamental frequency (F0) and standard deviation (STD)
- Harmonics-to-Noise-Ratio (HNR)
- Cepstral Peak Prominence (CPP)
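As a sketch of the first two measures, mean jitter and jitter percent follow directly from the cycle periods, mirroring the formulas documented in the parameters module below:
import numpy as np

def mean_jitter(T):
    # Mean absolute difference between consecutive periods
    T = np.asarray(T)
    return np.abs(np.diff(T)).mean()

def jitter_percent(T):
    # Mean jitter relative to the mean period, in percent
    T = np.asarray(T)
    return np.abs(np.diff(T)).mean() / T.mean() * 100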
Video analysis¶
The glottal area is segmented using a deep convolutional neural network that was trained on the BAGLS dataset (https://www.bagls.org/). Next, we use the segmented glottal area to
- compute the glottal area waveform (GAW; a minimal sketch follows this list)
- estimate the glottal midline to
  - compute the GAW for the left and right vocal fold
  - compute the phonovibrogram (PVG)
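Conceptually, the GAW is simply the segmented glottal area in each frame; a minimal sketch, assuming seg is a stack of binary segmentation masks (T x H x W):
import numpy as np

def glottal_area_waveform(seg):
    # Sum the segmented glottal pixels per frame: (T, H, W) -> (T,)
    return np.asarray(seg).sum(axis=(1, 2))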
With that, we follow the analysis pipeline shown in the above Figure, panel b, to finally compute quantitative parameters (an Open Quotient sketch follows this list). Currently, these are:
- Open Quotient (OQ)
- Closing Quotient (CQ)
- Asymmetry Quotient (AsyQ)
- Rate Quotient (RQ)
- Speed Quotient (SQ)
- Speed Index (SI)
- Fundamental frequency (F0)
- Amplitude perturbation factor (APF)
- Amplitude perturbation quotient (APQ)
- Glottis gap index (GGI)
- Amplitude Quotient (AQ)
- Stiffness
- Amplitude Symmetry Index (ASI)
- Phase Asymmetry Index (PAI)
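As an example of these cycle-based measures, the Open Quotient can be sketched as follows, mirroring the openQuotient formula documented below (t_open and t_closed are per-cycle phase durations as returned by computeOpenAndClosedIntervals):
import numpy as np

def open_quotient(t_open, t_closed):
    # Mean fraction of each cycle during which the glottis is open
    t_open, t_closed = np.asarray(t_open), np.asarray(t_closed)
    return np.mean(t_open / (t_open + t_closed))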
Analysis¶
Parameters are shown in a separate window in a tabular view. All computed parameters can be exported as a CSV file.
3D printed parts¶
We 3D-printed the following parts to fit our needs. Everything was printed in Prusa PLA on a Prusa Mini FDM 3D printer. We typically use 15% infill and a 200 um layer height. All parts were designed in Fusion 360.
Microphone mount¶
The microphone mount consists of two parts: Part1 and Part2. You further need M3 screws and nuts to assemble it. Depending on your endoscope and the examiner, this mount may cause some discomfort when handling the imaging unit.

Droplet protection shield¶
Also known as a corona shield. It is mounted with M4 screws on the IDT camera to protect it. On the side, we provide an integrated microphone mount; the microphone cable enters on the side and is then pushed inwards.

openhsv¶
openhsv package¶
Submodules¶
openhsv.main module¶
Module contents¶
class openhsv.OpenHSV(app, base_folder='C:/openhsv', verbose=False)
Bases: PyQt5.QtWidgets.QWidget

OpenHSV is the main class for recording high-speed videoendoscopy footage and audio. It interacts with the audio interface and the camera, performs deep neural network based analysis, and saves the data.

Parameters:
- app (QtWidgets.QApplication) – to init OpenHSV, you only need to pass the QApplication instance
- base_folder (str, optional) – location where data is stored, defaults to 'C:/openhsv'
- verbose (boolean, optional) – prints additional information to the Python console, defaults to False
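A minimal launch sketch, analogous to what the main.py script does (assuming a standard PyQt5 setup):
from PyQt5.QtWidgets import QApplication
from openhsv import OpenHSV

app = QApplication([])
hsv = OpenHSV(app, base_folder="C:/openhsv")
hsv.showMaximized()  # shows the window maximized and updates the range indicator
app.exec_()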
findpatient()
Opens a window to select a patient from the database.

showMaximized()
Shows the window maximized and updates the range indicator.

screenshot()
Takes a screenshot of the current camera image and saves it as a png file.

settings()
Opens the settings dialog and saves the settings.
initSettings(exposureTime=245, videoSamplingRate=4000, audioSamplingRate=80000, audioBlockSize=4000, audioBufferSize=3, baseFolder='', saveRaw=True)
Initializes camera, audio, and saving settings.

Parameters:
- exposureTime (int, optional) – camera exposure time in us, defaults to 245
- videoSamplingRate (int, optional) – frames per second, defaults to 4000
- audioSamplingRate (int, optional) – audio sampling rate in Hz, defaults to 80000
- audioBlockSize (int, optional) – audio block size transmitted from the interface, defaults to 4000
- audioBufferSize (int, optional) – audio buffer size (in multiples of the block size), defaults to 3
- baseFolder (str, optional) – base folder for data saving, defaults to ''
- saveRaw (bool, optional) – whether raw video data should be saved as lossless compressed mp4, defaults to True
patient()
Opens the interface for patient information.

updateRangeIndicator()
Updates the range indicator that shows the subselection of the video for download or analysis.

initCamera(force_init=False)
Initializes the camera connection: opens the camera, does basic configuration, and sets advanced settings, such as exposure time and video sampling rate (fps). If the camera connection could be established, further buttons are enabled.

Parameters:
- force_init (bool, optional) – forces (re-)initialization of the camera, defaults to False

playStop()
setImage(im, restore_view=True, restore_levels=False)
Shows an image in the camera preview window. It can also restore the previous view, i.e. zoom and location, as well as the levels (contrast, brightness) of the previous image. Currently, level restoring is deactivated by default, because during the examination procedure it is quite common that there is no signal (away from the patient) or that the image is oversaturated (very close to the mouth/tongue).

Parameters:
- im (numpy.ndarray) – image to be shown in the preview
- restore_view (bool, optional) – whether the view should be restored from the previous image, defaults to True
- restore_levels (bool, optional) – whether contrast/brightness should be restored from the previous image, defaults to False

nextFrame()
initAudio()
Initializes the audio recorder and empties the audio queue and data list. It selects the first audio interface found (in OpenHSV, the Focusrite Scarlett 2i2) and selects both channels (by default, channel 1 is the camera reference signal and channel 2 the actual audio signal). Every audio data block is passed to the callback function; the callback already runs on a separate thread, so there is no need to move it to a different one. It also immediately starts the recorder.

stopAudio()
Stops audio recording and saves data from the queue to internal memory.

F0(channel_for_F0=1, intensity_threshold=5)
Calculates the fundamental frequency from the audio signal. It further saves the audio data to internal memory.

Parameters:
- channel_for_F0 (int, optional) – selected audio channel for F0 calculation. In our setting, channel 0 is the reference signal and channel 1 the audio signal, defaults to 1
- intensity_threshold (int, optional) – intensity threshold for calculating F0, defaults to 5
startCamera()
Starts the camera (and audio) feed. If grabbing is already active, it stops grabbing from the camera and stops streaming audio data. A full-screen preview is shown to provide the maximum view. It starts the fundamental frequency calculation and saves audio data in memory.

analyze()
Analyzes the selected range of video data. The selected frames are downloaded from the camera and subsequently processed, i.e. segmented by the neural network.

save(save_last_seconds=4)
Saves the recorded and selected data. In particular, we save the metadata (including audio, video, and patient metadata), the audio data together with the camera reference signal, and the video data.

Parameters:
- save_last_seconds (int) – the last seconds from the recording end to be saved. We record one second after the stop trigger, and we usually record one second of footage; thus, we need at least two seconds to ensure saving all relevant audio data. To adjust for some uncertainty, we recommend saving a few more seconds. Defaults to 4.

Note
Saving the data in an appropriate format is not trivial: we need to consider portability, cross-functionality, and quality. Therefore, we save metadata in the structured JSON file format, a common format that can be opened and viewed with any text editor but is easily processed by a variety of data analysis software. Further, audio data is saved as common wav files, as well as packed into an HDF5 file. HDF5 is a very common container format that allows storing complex data in a very efficient and convenient way. Video data is saved in the mp4 file format, as this is highly portable and can be viewed with common video players. The h264 codec also allows saving the video data losslessly, which is needed for accurate data analysis, while keeping the file size at a reasonable level and still allowing video preview. If any segmentation is already available, the segmentation maps are stored as binary maps in HDF5 file format as well.

close(self) → bool
openhsv.analysis package¶
Module contents¶
Submodules¶
openhsv.analysis.nn module¶
class openhsv.analysis.nn.Analysis(app=None)
Bases: PyQt5.QtWidgets.QWidget

Analysis widget that shows the segmentation process of the neural network.

Parameters:
- app (PyQt5.QtWidgets.QApplication, optional) – QApplication, needed to process events to avoid freezing the GUI, defaults to None

segmentSequence(ims, normalize=True, reinit=True)
Segments an image sequence, such as a video, frame by frame.

Parameters:
- ims (list of numpy.ndarray, or numpy.ndarray) – collection of images
- normalize (bool, optional) – normalize 0..255 to -1..1, defaults to True
- reinit (bool, optional) – deletes any previous segmentation information, defaults to True
segment(im)
Segments an endoscopic image using a deep neural network.

Parameters:
- im (numpy.ndarray) – image (HxWx3)

setAudio(audio)

syncAudio(start_frame, end_frame, total_frames, debug=False)

computeParameters(dt_audio=1.25e-05, dt_video=0.00025, debug=False)
Computes parameters from the GAW.

Parameters:
- dt_audio (float, optional) – audio sampling time in seconds, defaults to 1/80000
- dt_video (float, optional) – video sampling time in seconds, defaults to 1/4000
- debug (bool, optional) – shows debugging information and plots, defaults to False

get()
Returns GAW and segmentation maps for the video.

Returns: GAW and segmentations
Return type: tuple(list, list(numpy.ndarray))
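A minimal usage sketch of the analysis widget, assuming frames is a list of endoscopic images already downloaded from the camera:
from PyQt5.QtWidgets import QApplication
from openhsv.analysis.nn import Analysis

app = QApplication([])
analysis = Analysis(app)
analysis.segmentSequence(frames)      # frame-by-frame neural network segmentation
gaw, segmentations = analysis.get()   # glottal area waveform and masks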
class openhsv.analysis.nn.ROIDialog(ims)
Bases: PyQt5.QtWidgets.QDialog
openhsv.analysis.parameters module¶
openhsv.analysis.parameters.movingAverage(x, n=3)
Computes the moving average of x with a window of n.

Parameters:
- x (numpy.ndarray) – input array
- n (int, optional) – window size, defaults to 3

Returns: filtered array
Return type: numpy.ndarray
openhsv.analysis.parameters.detectMaximaMinima(s, distance=5, rel_height=0.01, use_prominence=True, clean_f0=None)
Detects maxima and minima in a signal s.

Parameters:
- s (numpy.ndarray) – signal
- distance (int, optional) – minimum distance between two peaks in samples, defaults to 5
- rel_height (float, optional) – minimum relative height of a found peak, defaults to 0.01
- use_prominence (bool, optional) – uses peak prominence for peak detection, defaults to True

Returns: tuple of maxima and minima locations
Return type: tuple(numpy.ndarray, numpy.ndarray)
openhsv.analysis.parameters.detectOpeningAndClosingEvents(signal, p_max, t=0.02)
Detects glottis opening and closing events relative to the maximum opening events.

Parameters:
- signal (numpy.ndarray) – glottal area waveform
- p_max (list of indices) – maxima of the glottal area waveform
- t (float, optional) – threshold for finding the signal bottom, defaults to 0.02

Returns: opening and closing point for each maximum
Return type: tuple(list(int), list(int))
openhsv.analysis.parameters.computeOpenAndClosedIntervals(t, opening, closed)
Computes the open and closed intervals during each cycle.

Parameters:
- t (numpy.ndarray) – time
- opening (list(int)) – indices of opening points
- closed (list(int)) – indices of closing points

Returns: durations of open and closed phases
Return type: tuple(list(float), list(float))
openhsv.analysis.parameters.computeOCandCOTransitions(t, opening, closed, p_max)
Computes Open->Closed (OC) and Closed->Open (CO) transitions.

Parameters:
- t (numpy.ndarray) – time
- opening (list(int)) – indices of opening points
- closed (list(int)) – indices of closing points
- p_max (numpy.ndarray) – indices of cycle maxima

Returns: CO and OC durations
Return type: tuple(numpy.ndarray, numpy.ndarray)
openhsv.analysis.parameters.F0fromCycles(T, verbose=False, epsilon=1e-09)
Determines the fundamental frequency (F0) based on period lengths.

Parameters:
- T (numpy.ndarray) – periods
- verbose (bool, optional) – prints F0 mean and standard deviation, defaults to False

Returns: mean and standard deviation of F0
Return type: tuple(float, float)
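The idea can be sketched in a few lines (an illustration, not the actual implementation; epsilon guards against zero-length periods):
import numpy as np

def f0_from_cycles(T, epsilon=1e-9):
    # F0 of each cycle is the reciprocal of its period length
    F0 = 1.0 / (np.asarray(T) + epsilon)
    return F0.mean(), F0.std()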
openhsv.analysis.parameters.F0fromFFT(fft, freqs, freq_lower=75, freq_higher=500)
Determines the fundamental frequency from the power spectrum, assuming that the fundamental frequency equals the dominant frequency. The frequency is determined within the physiological range (75-500 Hz).

Parameters:
- fft (numpy.ndarray) – fast Fourier transform of the signal
- freqs (numpy.ndarray) – frequencies corresponding to the FFT
- freq_lower (int, optional) – lower bound for F0 prediction, defaults to 75
- freq_higher (int, optional) – upper bound for F0 prediction, defaults to 500

Returns: fundamental frequency (F0)
Return type: float
openhsv.analysis.parameters.F0fromAutocorrelation(signal, freq=40000)
Determines the fundamental frequency using autocorrelation.

Steps:
- rFFT of signal
- Corresponding frequencies
- Autocorrelation
- Find peaks in autocorrelation
- Remove first irrelevant peaks
- Use first peak as fundamental frequency

Parameters:
- signal (numpy.ndarray) – signal (e.g. audio trace or GAW)
- freq (int) – sampling frequency, defaults to 40000 Hz

Returns: fundamental frequency (F0)
Return type: float
openhsv.analysis.parameters.asymmetryQuotient(CO, OC)
Asymmetry Quotient (AsyQ)

\[\text{AsyQ} = \frac{1}{N} \sum_{i=1}^{N} \frac{\frac{CO_i}{OC_i}}{1+\frac{CO_i}{OC_i}}\]

Parameters:
- CO (numpy.ndarray) – Closed->Open transitions
- OC (numpy.ndarray) – Open->Closed transitions

Returns: asymmetry quotient (AsyQ), a.u.
Return type: float
openhsv.analysis.parameters.closingQuotient(CO, t_open)
Closing Quotient (CQ)

\[\text{CQ} = \frac{1}{N} \sum_{i=1}^{N} \frac{CO_i}{t_{open,i}}\]

Parameters:
- CO (numpy.ndarray) – Closed->Open transitions
- t_open (numpy.ndarray) – open intervals

Returns: closing quotient (CQ), a.u.
Return type: float
openhsv.analysis.parameters.openQuotient(t_open, t_closed)
Open Quotient (OQ)

\[\text{OQ} = \frac{1}{N} \sum_{i=1}^{N} \frac{t_{open,i}}{t_{open,i}+t_{closed,i}}\]

Parameters:
- t_open (numpy.ndarray) – open intervals
- t_closed (numpy.ndarray) – closed intervals

Returns: open quotient (OQ), a.u.
Return type: float
openhsv.analysis.parameters.rateQuotient(CO, OC, t_closed)
Rate Quotient (RQ)

\[\text{RQ} = \frac{1}{N} \sum_{i=1}^{N} \frac{t_{closed,i} - CO_i}{OC_i}\]

Parameters:
- CO (numpy.ndarray) – Closed->Open transitions
- OC (numpy.ndarray) – Open->Closed transitions
- t_closed (numpy.ndarray) – closed intervals

Returns: rate quotient (RQ), a.u.
Return type: float
openhsv.analysis.parameters.speedIndex(CO, OC, t_open)
Speed Index (SI)

\[\text{SI} = \frac{1}{N} \sum_{i=1}^{N} \frac{CO_i - OC_i}{t_{open,i}}\]

Parameters:
- CO (numpy.ndarray) – Closed->Open transitions
- OC (numpy.ndarray) – Open->Closed transitions
- t_open (numpy.ndarray) – open intervals

Returns: speed index (SI), a.u.
Return type: float
openhsv.analysis.parameters.speedQuotient(CO, OC)
Speed Quotient (SQ)

\[\text{SQ} = \frac{1}{N} \sum_{i=1}^{N} \frac{CO_i}{OC_i}\]

Parameters:
- CO (numpy.ndarray) – Closed->Open transitions
- OC (numpy.ndarray) – Open->Closed transitions

Returns: speed quotient (SQ), a.u.
Return type: float
openhsv.analysis.parameters.meanJitter(T)
Calculates the mean jitter in ms from the signal periods.

\[\text{mean-Jitter} = \frac{\sum_{i=1}^{N-1}|T_i - T_{i-1}|}{N-1}\]

Parameters:
- T (numpy.ndarray or list) – the signal periods

Returns: mean jitter in ms
Return type: float
openhsv.analysis.parameters.jitterPercent(T)
Calculates the jitter in percent from the signal periods.

\[\text{Jitter[%]} = \frac{\frac{1}{N-1}\sum_{i=1}^{N-1}|T_i - T_{i-1}|}{\frac{1}{N}\sum_{i=0}^{N-1}T_i} \cdot 100\]

Parameters:
- T (numpy.ndarray or list) – the signal periods

Returns: jitter in percent
Return type: float
openhsv.analysis.parameters.meanShimmer(A, epsilon=1e-09)
Calculates the mean shimmer in dB from the signal amplitude maxima.

\[\text{mean-Shimmer [dB]} = \frac{20}{N-1}\sum_{i=0}^{N-2}\left|\log_{10} \frac{A_i}{A_{i+1}} \right|\]

Parameters:
- A (numpy.ndarray or list) – the signal amplitude maxima

Returns: mean shimmer in dB
Return type: float
openhsv.analysis.parameters.shimmerPercent(A, e=1e-05)
Calculates the shimmer in percent from the signal amplitude maxima.

\[\text{Shimmer[%]} = \frac{\frac{20}{N-1}\sum_{i=0}^{N-2}\left|\log_{10} \frac{A_i}{A_{i+1}} \right|}{\frac{20}{N}\sum_{i=0}^{N-1}|\log_{10} A_i|} \cdot 100\]

Parameters:
- A (numpy.ndarray or list) – the signal amplitude maxima

Returns: shimmer in percent
Return type: float
openhsv.analysis.parameters.periodPerturbationFactor(T)
Calculates the Period Perturbation Factor (PPF) in percent from the signal periods.

\[\text{PPF} = \frac{1}{N-1} \sum_{i=1}^{N-1}\left|\frac{T_i-T_{i-1}}{T_i}\right| \cdot 100\]

Parameters:
- T (list) – periods

Returns: PPF in percent
Return type: float
openhsv.analysis.parameters.glottalGapIndex(signal, opening, epsilon=1e-09)
Glottal Gap Index (GGI), which computes the relation between minimum and maximum glottal area in each glottal cycle.

\[GGI = \frac{1}{N} \sum_i^N \frac{\min(a_i)}{\max(a_i)}\]

Parameters:
- signal (numpy.ndarray) – glottal area waveform
- opening (list) – points where the glottis opens
- epsilon (float, optional) – numerical stability, defaults to 1e-9

Returns: glottal gap index mean and std
Return type: tuple(float, float)
openhsv.analysis.parameters.amplitudePerturbationFactor(A)
Amplitude Perturbation Factor (APF)

\[APF = \frac{1}{N} \sum_{i=1}^{N-1}\left|\frac{A_i-A_{i-1}}{A_i}\right| \cdot 100\]

Parameters:
- A (list or numpy.ndarray) – amplitudes in each cycle

Returns: APF mean and std
Return type: tuple(float, float)
openhsv.analysis.parameters.amplitudePerturbationQuotient(A, k=3)
Amplitude Perturbation Quotient (APQ)

\[l = \frac{k-1}{2}\]
\[APQ_k = \frac{1}{N-k} \sum_{i=l}^{N-l-1}\left|1-\frac{k \cdot A_i}{\sum_{j=-l}^{l}A_{i+j}}\right| \cdot 100\]

Parameters:
- A (list or numpy.ndarray) – amplitudes in each cycle
- k (int, optional) – range, defaults to 3

Returns: APQ mean and std
Return type: tuple(float, float)
openhsv.analysis.parameters.amplitudeQuotient(signal, opening)
Amplitude Quotient (AQ)

\[AQ = \frac{A_i}{\left|\min_j \frac{d}{dj}f_i(j)\right|}\]

Parameters:
- signal (numpy.ndarray) – audio or GAW signal
- opening (list) – indices where the glottis starts to open

Returns: AQ mean and std
Return type: tuple(float, float)
openhsv.analysis.parameters.stiffness(signal, opening)
Stiffness

\[\text{Stiffness} = \frac{\left|\max_j \left|\frac{d}{dj}f_i(j)\right|\right|}{A_i}\]

Parameters:
- signal (numpy.ndarray) – audio or GAW signal
- opening (list) – indices where the glottis starts to open

Returns: stiffness mean and std
Return type: tuple(float, float)
openhsv.analysis.parameters.harmonicNoiseRatio(signal, freq, freq_lower=50, freq_higher=450, filter_autocorrelation=False, epsilon=1e-09)
Computes the Harmonics-to-Noise Ratio (HNR) using an autocorrelation approximation. First, it computes the fundamental frequency using the power spectrum of signal. Next, it computes the autocorrelation in Fourier space. Then, local maxima in the autocorrelation are found, the HNR is computed, and the maximum HNR and the corresponding frequency are returned.

\[R_{xx}[l] = \frac{1}{N} \sum_{k=l}^{N-1} x[k]x[k-l]\]
\[HNR = \frac{R_{xx}[T_0]}{R_{xx}[0]-R_{xx}[T_0]}\]

Parameters:
- signal (numpy.ndarray) – audio signal
- freq (int) – sampling rate/frequency, e.g. 44100
- freq_lower (int, optional) – lower frequency cut-off, defaults to 50
- freq_higher (int, optional) – higher frequency cut-off, defaults to 450

Returns: HNR [dB], F0_FFT [Hz], F0_Autocorr [Hz]
Return type: tuple(float, float, float)
openhsv.analysis.parameters.cepstralPeakProminence(signal, freq, freq_lower=70, freq_higher=350, plot=False)
Computes the cepstral peak prominence of a signal using Fourier transformations.

Steps:
- Compute FFT of the signal
- Compute fundamental frequency from the power spectrum
- Compute cepstrum from FFT, filter with moving average (window = 3)
- Find maximum peak in cepstrum
- Find corresponding quefrency
- Fit line to cepstrum
- Compute distance from peak to line -> Cepstral Peak Prominence

Parameters:
- signal (numpy.ndarray) – audio signal
- freq (int) – sampling rate/frequency, e.g. 44100
- freq_lower (int, optional) – lower frequency cut-off, defaults to 70
- freq_higher (int, optional) – higher frequency cut-off, defaults to 350
- plot (bool, optional) – plots the cepstrum, the line, and the peak prominence, defaults to False

Returns: CPP [dB], F0_FFT [Hz], F0_Cepstrum [Hz]
Return type: tuple(float, float, float)
openhsv.analysis.parameters.phaseAsymmetryIndex(left_gaw, right_gaw, opening)
Phase Asymmetry Index (PAI)

\[PAI = \frac{|\text{argmax}_j\, GA_i^L(j) - \text{argmax}_j\, GA_i^R(j)|}{N_i}\]

Note
We use the maximum here instead of the minimum, as it is more likely that there are multiple time points where the value is minimal or close to the minimum (i.e. when the glottis is closed for a prolonged time).

Parameters:
- left_gaw (numpy.ndarray) – GAW of the left vocal fold
- right_gaw (numpy.ndarray) – GAW of the right vocal fold
- opening (list) – indices where the glottis starts to open (i.e. a new cycle)

Returns: PAI mean and std
Return type: tuple(float, float)
openhsv.analysis.parameters.amplitudeSymmetryIndex(left_gaw, right_gaw, opening, epsilon=1e-05)
Amplitude Symmetry Index (ASI)

\[ASI = \frac{\min(\max_j GA_i^L(j), \max_j GA_i^R(j))}{\max(\max_j GA_i^L(j), \max_j GA_i^R(j))}\]

Parameters:
- left_gaw (numpy.ndarray) – GAW of the left vocal fold
- right_gaw (numpy.ndarray) – GAW of the right vocal fold
- opening (list) – indices where the glottis starts to open (i.e. a new cycle)
- epsilon (float, optional) – numerical stability, defaults to 1e-5

Returns: ASI mean and std
Return type: tuple(float, float)
class openhsv.analysis.parameters.Signal(raw_signal, dt=0.00025, debug=True)
Bases: object

Inits the signal class with the raw signal, e.g. audio data or a glottal area waveform.

Parameters:
- raw_signal (numpy.ndarray) – audio signal, GAW, or alike

computeFFT(use_filtered_signal=True, use_hanning=True, lowpass_filter=20)

computeCepstrum()

filterSignal(cutoff_frequency=0.1)

detectCycles(method='peaks', peak='max', use_filtered_signal=True)
Detects cycles using different methods.

Parameters:
- method (str, optional) – method to detect cycles, defaults to 'peaks':
  - peaks: uses a peak finding algorithm on the raw signal
  - autocorrelation: detects cycles and period using autocorrelation
- use_filtered_signal (bool, optional) – uses the filtered signal, if available

detectPhases(use_filtered_signal=True)
Detects the opening and closing phase in each cycle.

Parameters:
- use_filtered_signal (bool, optional) – event detection on the raw (False) or filtered (True) signal

getPowerSpectrum()
Returns the power spectrum of the signal.

Returns: frequencies and amplitudes
Return type: tuple(np.ndarray, np.ndarray)

getCepsturm()
Returns the cepstrum of the signal.

Returns: quefrencies and cepstrum
Return type: tuple(np.ndarray, np.ndarray)
class openhsv.analysis.parameters.Audio(raw_signal, dt=1.25e-05, use_filtered_signal=True, use_hanning=True, debug=False)
Bases: openhsv.analysis.parameters.Signal

The Audio class handles audio data to compute the respective parameters.

Parameters:
- raw_signal (numpy.ndarray) – the raw audio signal
- dt (float, optional) – time between samples in seconds, defaults to 1/80000
- use_filtered_signal (bool, optional) – use filtered signal for computations, defaults to True
- use_hanning (bool, optional) – use Hanning window for FFT, defaults to True
- debug (bool, optional) – enable debugging mode, defaults to False

detectCycles(method='peaks', peak='max', use_filtered_signal=True)
Detects cycles using different methods.

Parameters:
- method (str, optional) – method to detect cycles, defaults to 'peaks':
  - peaks: uses a peak finding algorithm on the raw signal
  - autocorrelation: detects cycles and period using autocorrelation
- use_filtered_signal (bool, optional) – uses the filtered signal, if available

filterSignal(freq_range=20)

computeParameters(use_filtered_signal=False)
class openhsv.analysis.parameters.GAW(raw_signal, dt=0.00025, use_filtered_signal=False, use_hanning=True, debug=False)
Bases: openhsv.analysis.parameters.Signal

The GAW class handles the glottal area waveform (GAW) to compute the respective parameters.

Parameters:
- raw_signal (numpy.ndarray) – raw GAW signal
- dt (float, optional) – time between samples in seconds, defaults to 1/4000
- use_filtered_signal (bool, optional) – use filtered signal for computations, defaults to False
- use_hanning (bool, optional) – use Hanning window for FFT, defaults to True
- debug (bool, optional) – enable debugging mode, defaults to False

setLeftRightGAW(left_gaw, right_gaw)

computeParameters()
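A minimal sketch of how the GAW class might be driven, assuming gaw_signal is a 1-D glottal area waveform recorded at 4000 fps (the exact call sequence in the OpenHSV pipeline may differ):
from openhsv.analysis.parameters import GAW

gaw = GAW(gaw_signal, dt=1/4000)
gaw.filterSignal()        # low-pass filter the raw waveform
gaw.detectCycles()        # find oscillation cycles via peak finding
gaw.detectPhases()        # opening and closing phase in each cycle
gaw.computeParameters()   # OQ, CQ, SI, ... as listed above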
class openhsv.analysis.parameters.AnalysisPlatform(raw_signal, dt)
Bases: PyQt5.QtWidgets.QWidget
openhsv.analysis.midline module¶
class openhsv.analysis.midline.Midline(seg, maxima=None)
Bases: object

Midline prediction module. The midline is predicted based on the neural network segmentation at each peak and interpolated between peaks. Computations are based on Kist et al., bioRxiv 2020.

Parameters:
- seg (numpy.ndarray) – segmentation masks (T x H x W)
- maxima (list, optional) – detected maxima in the GAW, defaults to None

predict(method='pca', time_range=5)
Predicts the midline with the given method at each GAW peak.

Parameters:
- method (str, optional) – 'pca' or 'moments', defaults to 'pca'
- time_range (int, optional) – time range around the peak to improve the prediction, defaults to 5

side()
Returns the left and right GAW based on the midline in each frame.

Returns: left and right GAW as array T x 2
Return type: numpy.ndarray

pvg(steps=64)
Computes the PVG in discrete steps for each side.

Parameters:
- steps (int, optional) – resolution along each axis, defaults to 64

Returns: phonovibrogram as T x steps*2
Return type: numpy.ndarray
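Putting the midline module together, a sketch (segmentations as produced by the Analysis class above; gaw_maxima as returned by detectMaximaMinima):
from openhsv.analysis.midline import Midline

midline = Midline(segmentations, maxima=gaw_maxima)
midline.predict(method='pca')   # midline at each GAW peak, interpolated in between
left_right = midline.side()     # per-frame left/right glottal area (T x 2)
pvg = midline.pvg(steps=64)     # phonovibrogram (T x 2*steps)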
openhsv.analysis.midline.imageMoments(im, transpose=True, angle_correction=1.5707963267948966)
Predicts the midline using image moments.

Parameters:
- im (numpy.ndarray) – input image
- transpose (bool, optional) – [description], defaults to True
- angle_correction (float, optional) – [description], defaults to +np.pi/2

Returns: [description]
Return type: [type]

openhsv.analysis.midline.principalComponents(im, use_2nd=False)
Midline prediction using principal component analysis.

Parameters:
- im (numpy.ndarray) – input image
- use_2nd (bool, optional) – use the second principal component, defaults to False

Returns: slope and intercept of the midline
Return type: tuple(float, float)
openhsv.analysis.pvg module¶
openhsv.analysis.pvg.get_labels(x_low, x_high, coef, intercept, image_shape, steps=64)
Function for getting left/right step-id'd labels based on the AP (anterior-posterior) axis.

Parameters:
- x_low (float) – lower x coordinate
- x_high (float) – higher x coordinate
- coef (float) – coefficient of the linear regression of the AP axis
- intercept (float) – intercept of the linear regression of the AP axis
- image_shape (tuple(int, int)) – image size (HxW) for the label map
- steps (int) – steps between A and P

Returns: label map
Return type: numpy.ndarray

openhsv.analysis.pvg.compute_pvg(s, labels, steps=64)
Calculates the phonovibrogram based on labels.

Parameters:
- s (numpy.ndarray) – segmented area (TxYxX)
- labels (numpy.ndarray) – labelled image
- steps (int, optional) – PVG resolution, defaults to 64

Returns: PVG, time x 2*steps
Return type: numpy.ndarray
openhsv.gui package¶
Submodules¶
openhsv.gui.misc module¶
openhsv.gui.patient module¶
openhsv.gui.settings module¶
Module contents¶
openhsv.hardware package¶
Submodules¶
openhsv.hardware.camera module¶
class openhsv.hardware.camera.Camera(verbose=True)
Bases: abc.ABC

An abstract camera class.

Parameters:
- verbose (bool, optional) – prints to console, defaults to True

openCamera()
Opens the camera.

configCam(*args, **kwargs)
Configures the camera.

setSettings(exposure, fps, *args, **kwargs)
Sets camera settings.

Parameters:
- exposure (int) – exposure time
- fps (int) – frames per second

isIdle()
Returns whether the camera is recording or not (boolean).

startGrab()
Starts acquisition on the camera.

stopGrab()
Stops acquisition on the camera.

live()
Returns a live image preview as a numpy array.

updateTriggerPosition()
Updates the trigger position for the internal memory view.

getMemoryFrame(frame_index, by_trigger)
Gets a memory frame from the camera's onboard memory.

Parameters:
- frame_index (int) – frame index in memory

closeCamera()
Closes the camera connection.
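To support a camera other than the IDT CCM series (see Supported cameras above), subclass this abstract class and implement its methods. A skeleton sketch; all bodies are placeholders to be filled with your vendor's SDK calls:
import numpy as np
from openhsv.hardware.camera import Camera

class MyCamera(Camera):
    def openCamera(self):
        self.cam = None  # open connection via your vendor SDK
        return True

    def configCam(self, *args, **kwargs):
        pass  # gain, trigger mode, ...

    def setSettings(self, exposure, fps, *args, **kwargs):
        self.exposure, self.fps = exposure, fps

    def isIdle(self):
        return True  # query recording state from the SDK

    def startGrab(self):
        pass  # start recording into camera memory

    def stopGrab(self):
        return 0  # number of recorded frames

    def live(self):
        return np.zeros((1024, 1024, 3), dtype=np.uint8)  # live preview frame

    def updateTriggerPosition(self):
        return 0

    def getMemoryFrame(self, frame_index, by_trigger=True):
        return np.zeros((1024, 1024, 3), dtype=np.uint8)  # frame from memory

    def closeCamera(self):
        pass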
class openhsv.hardware.camera.DummyCamera(is_color=True, verbose=True)
Bases: openhsv.hardware.camera.Camera

A dummy camera that loops through the example video shipped with OpenHSV (see Testing environment).

openCamera()
Opens the camera.

configCam()
Configures the camera.

setSettings(*args, **kwargs)
Sets camera settings.

Parameters:
- exposure (int) – exposure time
- fps (int) – frames per second

isIdle()
Returns whether the camera is recording or not (boolean).

startGrab()
Starts acquisition on the camera.

stopGrab()
Stops acquisition on the camera.

live()
Returns a live image preview as a numpy array.

getMemoryFrame(frame_index, by_trigger=True)
Gets a memory frame from the camera's onboard memory.

Parameters:
- frame_index (int) – frame index in memory

closeCamera()
Closes the camera connection.

updateTriggerPosition()
Updates the trigger position for the internal memory view.
class openhsv.hardware.camera.IdtCamera(verbose=True)
Bases: openhsv.hardware.camera.Camera

The IdtCamera class uses the abstract Camera class to interact with the IDT high-speed camera API. In particular, it starts and stops recording, fetches frames from the internal camera memory, and sets settings, such as exposure time and framerate.

Parameters:
- verbose (bool, optional) – additional information is printed to the command window; may be important for debugging purposes. Defaults to True.

openCamera()
Searches for attached cameras and opens the first one found.

Returns: success in opening the camera
Return type: bool

configCam(px_gain=1, camera_gain=1)
Basic camera configuration, such as gain.

Parameters:
- px_gain (int, optional) – pixel gain (selects 8 from 10 bits: lower (0), middle (1), and upper (2) 8 bits), defaults to 1
- camera_gain (int, optional) – camera gain, defaults to 1

setSettings(exposure, fps, roi=(1024, 1024), rec_mode=1, sync=True)
Sets camera settings.

Parameters:
- exposure (int) – exposure time in us (microseconds)
- fps (int) – camera sampling rate in frames per second
- roi (tuple or None, optional) – whether the camera image should be cropped to a ROI (i.e. it may run faster). If not desired, set roi=None, otherwise (height, width). Defaults to (1024, 1024)
- rec_mode (XsCamera.XS_REC_MODE, optional) – recording mode, defaults to XsCamera.XS_REC_MODE.XS_RM_CIRCULAR
- sync (bool, optional) – whether the recording should be synchronized to the trigger, defaults to True

getStatus()
Returns the camera status. The status indicates whether the camera is recording on a circular buffer, is in idle mode, or the recording is done.

Returns: busy flag and status
Return type: bool, XsCamera.XS_STATUS

isIdle()
Determines via XS_STATUS whether the camera is not recording.

startGrab()
Starts recording images on the camera using the previously set settings.

stopGrab()
Stops data acquisition.

Returns: number of recorded frames
Return type: int

live()
Returns a live image using XsMemoryPreview.

correctForTrigger(frame_index)
Corrects frame_index for the trigger index; otherwise, frames are not indexed in chronological order.

Parameters:
- frame_index (int) – absolute frame index

Returns: frame index relative to trigger
Return type: int

getMemoryFrame(frame_index, by_trigger=True)
Creates a buffer and retrieves a camera frame by index from the camera's onboard memory.

Parameters:
- frame_index (int) – the frame index
- by_trigger (bool, optional) – whether the frame index should be relative to the trigger, defaults to True

Returns: the camera frame in RGB (height, width, 3) or grayscale (height, width, 1)
Return type: numpy.ndarray

updateTriggerPosition()
Updates the trigger position internally.

Returns: trigger position
Return type: int

closeCamera()
Closes the camera handle.
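Finally, a bare-bones acquisition sketch with IdtCamera, mirroring what the OpenHSV GUI does internally (loop and settings simplified):
from openhsv.hardware.camera import IdtCamera

cam = IdtCamera(verbose=True)
cam.openCamera()                          # opens the first attached camera
cam.configCam()                           # default pixel and camera gain
cam.setSettings(exposure=245, fps=4000)   # 245 us exposure at 4000 fps

cam.startGrab()                           # record into the circular buffer
while not cam.isIdle():                   # e.g. until the trigger stops recording
    preview = cam.live()                  # live preview frame

n_frames = cam.stopGrab()
frame = cam.getMemoryFrame(0, by_trigger=True)  # first frame after the trigger
cam.closeCamera()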