Overview
Scribba is an AI-powered platform that transcribes audio and video in over 30 languages.
Key Features
- Flexible Importing: Import any audio or video file by drag & drop or by providing a public URL.
- Multiple Output Formats: Download transcriptions in your preferred format, including SRT, VTT, PDF, Word, XLSX, and more.
- Precise Timestamps: Word-level timestamps let you pinpoint the exact moment each word is spoken.
- Editable Workspace: View and edit your transcriptions directly within the platform's workspace.
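To illustrate how word-level timestamps can feed a subtitle export, here is a minimal sketch that groups timed words into SRT cues. The Word type, the function names, and the seven-words-per-cue default are assumptions made for this example; they are not Scribba's actual export code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One transcribed word with its start/end time in seconds (hypothetical shape)."""
    text: str
    start: float
    end: float


def format_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_words: int = 7) -> str:
    """Group words into cues of at most max_words and render them as SRT text."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        index = i // max_words + 1
        # A cue spans from the first word's start to the last word's end.
        start = format_srt_time(chunk[0].start)
        end = format_srt_time(chunk[-1].end)
        text = " ".join(w.text for w in chunk)
        cues.append(f"{index}\n{start} --> {end}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)
```

Calling words_to_srt(words) on a list of Word objects returns the SRT file content as a string. A VTT export could reuse the same grouping with a different timestamp format and header.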
Technological Challenges
Scribba relies on OpenAI's Whisper model as its transcription engine. Whisper inference is computationally intensive and, in practice, requires a GPU. Because maintaining a dedicated GPU is expensive, the project runs inference on RunPod, an on-demand GPU execution platform.
RunPod enables users to load a Docker image with a custom entry point designed for running AI model inference. This setup offers extensive customization, allowing the installation of any libraries or dependencies required by the project. In my case, I deployed a modified version of OpenAI's Whisper, tailored to deliver precise, word-for-word transcription.
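As a rough sketch of what such a custom entry point can look like, the snippet below builds a job handler in the shape RunPod's Python SDK expects (a function that receives a job dictionary with an "input" field) and registers it with runpod.serverless.start. The input field names (audio_url, language), the make_handler helper, and the fake_transcribe stand-in are assumptions for illustration; the real deployment calls the modified Whisper model instead.

```python
from typing import Any, Callable, Dict

# Signature of the transcription function: (audio_url, language) -> result dict.
Transcriber = Callable[[str, str], Dict[str, Any]]


def make_handler(transcribe: Transcriber) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Build a job handler around a transcription function (hypothetical helper)."""

    def handler(job: Dict[str, Any]) -> Dict[str, Any]:
        job_input = job.get("input", {})
        audio_url = job_input.get("audio_url")  # assumed input field name
        if not audio_url:
            # Report the problem in the job output instead of raising.
            return {"error": "missing 'audio_url' in job input"}
        language = job_input.get("language", "auto")  # assumed input field name
        result = transcribe(audio_url, language)
        return {"language": language, "words": result["words"]}

    return handler


def fake_transcribe(audio_url: str, language: str) -> Dict[str, Any]:
    """Stand-in for the real model call, so the sketch runs without a GPU."""
    return {"words": [{"text": "hello", "start": 0.0, "end": 0.4}]}


def serve() -> None:
    """Register the handler with RunPod; call this from the Docker entry point."""
    import runpod  # RunPod's Python SDK, installed inside the image

    runpod.serverless.start({"handler": make_handler(fake_transcribe)})
```

Keeping the model call behind a plain function argument makes the handler easy to exercise locally with a fake before building the image.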
Additionally, the platform employs webhooks to notify users about events occurring during a transcription job, allowing for asynchronous processing. This system informs users when their transcription is complete or if an error occurs, ensuring a smooth and responsive experience.
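The notification flow can be pictured with a small sketch: the sender serializes an event when a job finishes or fails, and the receiver turns that event into a user-facing message. The payload fields (job_id, status, error) and both function names are hypothetical; the actual event schema is not described here.

```python
import json
from typing import Optional

# Statuses this sketch knows about (assumed, mirroring "complete" and "error").
VALID_STATUSES = {"completed", "failed"}


def build_event(job_id: str, status: str, error: Optional[str] = None) -> bytes:
    """Serialize a job event as the JSON body of a webhook request."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    event = {"job_id": job_id, "status": status}
    if error is not None:
        event["error"] = error
    return json.dumps(event, sort_keys=True).encode("utf-8")


def handle_event(body: bytes) -> str:
    """Receiver side: decode the webhook body and produce a notification message."""
    event = json.loads(body)
    if event["status"] == "completed":
        return f"Transcription {event['job_id']} is ready."
    reason = event.get("error", "unknown error")
    return f"Transcription {event['job_id']} failed: {reason}"
```

Because the job reports its own outcome, the client does not need to poll while a long file is being processed.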
Conversion Testing and Enhancements
Initially, we offered a 30-minute free transcription trial. However, this mainly attracted users who wanted a free service and had no intention of long-term engagement.
To address this, we experimented with reducing the free trial to 10 minutes. Despite the shorter trial, users could still process an entire clip, even if its duration exceeded the free trial limit. Once the transcription was complete, only the minutes covered by the trial were visible. At this stage, several clear calls to action encouraged users to either unlock the full transcription for a fixed fee or subscribe to a service plan.
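The preview logic described above can be sketched as a simple cut on word timestamps: the whole clip is transcribed, but only words that end within the trial window are returned, along with a flag telling the interface whether to show the upgrade prompts. The function name, the dictionary shape of a word, and the returned fields are assumptions for illustration, not the production implementation.

```python
from typing import Any, Dict, List

FREE_TRIAL_SECONDS = 10 * 60  # the 10-minute trial described above


def split_preview(
    words: List[Dict[str, Any]],
    limit_seconds: float = FREE_TRIAL_SECONDS,
) -> Dict[str, Any]:
    """Return the words visible under the trial and whether to show upgrade prompts."""
    # Keep only words that finish inside the free window.
    visible = [w for w in words if w["end"] <= limit_seconds]
    locked_count = len(words) - len(visible)
    return {
        "visible": visible,
        "locked_words": locked_count,
        # Show the calls to action only when something is actually hidden.
        "show_upgrade_cta": locked_count > 0,
    }
```

For a clip shorter than the trial window, every word is visible and show_upgrade_cta is False, so short uploads never see the upgrade prompts.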
This strategy proved effective in increasing conversion rates, as users were able to preview the quality and output of their transcription without the need to immediately enter credit card information.