Whisper OpenAI CUDA 12

Description

This workspace can be used to work with the Whisper and WhisperX toolkits for transcribing, translation and diarization of human speech. On this workspace a Python environment is created using the environment manager ‘conda’. The ‘conda’ environment contains Whisper, WhisperX and other required packages. The environment is configured to be able to use a GPU when available.

The workspace is a JupyterHub workspace, and comes with a template notebook to transcribe, translate and diarize audio files. The workspace is mainly aimed for users that have at least some very basic Python experience using Jupyter notebooks. However the workspace also provides a terminal (command line application) to run Whisper using the command line.

Creation

Create a storage volume

If desired, first create a storage volume before creating the workspace.

See the Getting started page for more info about how and why to create a storage volume.

Create a workspace

In the Research Cloud portal click the ‘Create a new workspace’ button and follow the steps in the wizzard.

See the workspace creation manual page for more guidance.

Access

This workspace can be accessed via the yellow ‘Access’ button.

Data transfer options

First create a working directory on the Storage Volume

The JupyterHub dashboard has an Upload button to directly upload data from your computer.

Usage

Navigate to your home directory and start the template Jupyter notebook whisper_template.ipynb using the Whisper kernel. If you double click to start the notebook, make sure in the top right you see whisper (ipykernel) instead of Python 3 (ipykernel). If you see Python 3 (ipykernel), just click on it to change it to the Whisper kernel.

Tips