Question 1

What exactly is transcriptfy?

Accepted Answer

transcriptfy converts your audio and video files to text using artificial intelligence. You upload the recording, we process it, usually within a few minutes, and you get back the text with timestamps and speaker identification, ready to export in standard formats (TXT, SRT, VTT, JSON). It's designed for journalists, podcasters, researchers, lawyers, students and anyone who spends too much time manually writing down what someone said.

Question 2

What audio and video formats do you support?

Accepted Answer

We accept the most common formats: MP3, WAV, M4A, AAC, OGG, OPUS, WMA and FLAC for audio; MP4, MOV, MKV, WebM, AVI and WMV for video. If you upload a video, we automatically extract the audio track — no need to convert it first.
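As a rough illustration, a script could check a file name against the list above before uploading. This is a minimal sketch: the function name `classify_upload` is hypothetical, and matching by extension alone is an assumption, not a description of how the service validates files.

```python
from pathlib import Path

# Extensions taken from the answer above. Matching by extension only is an
# assumption of this sketch, not the service's documented behavior.
AUDIO_EXTENSIONS = {"mp3", "wav", "m4a", "aac", "ogg", "opus", "wma", "flac"}
VIDEO_EXTENSIONS = {"mp4", "mov", "mkv", "webm", "avi", "wmv"}


def classify_upload(filename: str) -> str:
    """Return "audio", "video", or "unsupported" based on the file extension."""
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension in AUDIO_EXTENSIONS:
        return "audio"
    if extension in VIDEO_EXTENSIONS:
        return "video"
    return "unsupported"
```

For example, `classify_upload("interview.MP3")` returns `"audio"` because the comparison is case-insensitive.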

Question 3

How long does a transcription take?

Accepted Answer

It depends on the length of the file and the options you enable, but in most cases a 30-minute recording is transcribed in 1 to 3 minutes. Options such as speaker recognition or subsequent translation add some time. Before you click «Transcribe», we show you an estimated processing time based on the file and the options you chose.

Question 4

What is the maximum file size?

Accepted Answer

It depends on whether you have an active subscription. Without one (guest or free account), the limit is 2 GB per file and 1 file per batch. With any active package, the limit is 5 GB per file and 3 simultaneous files. If your recording is larger, split it into segments or contact us.
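The limits above can be expressed as a small pre-upload check. This is a sketch under stated assumptions: the names `UploadLimits` and `check_batch` are hypothetical, 1 GB is taken to be 2**30 bytes, and "simultaneous files" is treated as a batch size.

```python
from dataclasses import dataclass

GB = 1024 ** 3  # assumption: 1 GB is treated as 2**30 bytes in this sketch


@dataclass(frozen=True)
class UploadLimits:
    max_file_bytes: int
    max_files_per_batch: int


# Values from the answer above; the constant names are hypothetical.
LIMITS_WITHOUT_SUBSCRIPTION = UploadLimits(2 * GB, 1)
LIMITS_WITH_SUBSCRIPTION = UploadLimits(5 * GB, 3)


def check_batch(file_sizes: list[int], has_subscription: bool) -> list[str]:
    """Return a list of problems; an empty list means the batch fits the limits."""
    limits = LIMITS_WITH_SUBSCRIPTION if has_subscription else LIMITS_WITHOUT_SUBSCRIPTION
    problems = []
    if len(file_sizes) > limits.max_files_per_batch:
        problems.append(
            f"too many files: {len(file_sizes)} > {limits.max_files_per_batch}"
        )
    for index, size in enumerate(file_sizes):
        if size > limits.max_file_bytes:
            problems.append(f"file {index} exceeds the per-file limit")
    return problems
```

A 3 GB file passes with an active package but is rejected for a guest or free account.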

Question 5

Do you recognize multiple speakers?

Accepted Answer

Yes. With the «Recognize speakers» option enabled, we automatically label who is speaking in each segment. It works well for up to about 10 different speakers. Afterwards you can rename each one («Speaker 1» → «Maria Torres»), and the change is applied throughout the transcription, translation and summary.
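One way a single rename can reach every view is to store a stable speaker ID in each segment and keep the display names in one shared table. The sketch below assumes that data model; the IDs, field names and `render` helper are hypothetical and are not taken from the product.

```python
# Hypothetical data model: segments store a stable speaker ID, and a single
# lookup table maps IDs to display names.
speaker_names = {"spk_1": "Speaker 1", "spk_2": "Speaker 2"}

transcript = [
    {"speaker": "spk_1", "text": "Buenos días."},
    {"speaker": "spk_2", "text": "Hola, ¿qué tal?"},
]
translation = [
    {"speaker": "spk_1", "text": "Good morning."},
    {"speaker": "spk_2", "text": "Hi, how are you?"},
]


def render(segments, names):
    """Format segments as "Name: text" lines using the shared name table."""
    return [f"{names[s['speaker']]}: {s['text']}" for s in segments]


# One rename updates every view that reads from the table.
speaker_names["spk_1"] = "Maria Torres"
```

After the rename, both `render(transcript, speaker_names)` and `render(translation, speaker_names)` show «Maria Torres» on the first line, with no per-segment edits.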

Question 6

What languages do you transcribe?

Accepted Answer

More than 99 languages, including Spanish, English, French, German, Portuguese, Italian, Mandarin Chinese, Japanese, Arabic and all major European and Asian languages. By default we detect the language automatically with over 95% accuracy, but you can select it manually if you know it; doing so improves quality for short or noisy audio.

Question 7

Can I edit and correct the text afterwards?

Accepted Answer

Yes. Each transcription includes an editing tab where you correct the text word by word while preserving segments and speakers. Changes are saved in a revision history you can return to at any time, so you can experiment without fear of losing the previous version.

Question 8

Can I translate my transcription?

Accepted Answer

Yes, into more than 20 target languages. We translate segment by segment respecting timestamps and speakers, with a two-column view (original on the left, translation on the right), synchronized scrolling and hover-mirror that highlights the equivalent segment in the other column. You can have several active translations at the same time for the same file — for example Spanish → English and Spanish → French.

Question 9

Can I export subtitles?

Accepted Answer

Yes. We export in SRT and VTT — the standard formats compatible with YouTube, Premiere, Final Cut, web players and virtually all video editors. You can also download in TXT (plain text), JSON (full structure with timestamps, speakers and metadata) or the original audio in one click.
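To show how the two subtitle formats differ, here is a minimal sketch that turns timed segments into SRT and WebVTT text. The segment fields (`start`, `end`, `text`) and the function names are assumptions made for illustration; they are not the structure of the JSON export.

```python
def format_timestamp(seconds: float, decimal_mark: str) -> str:
    """Format seconds as HH:MM:SS<mark>mmm (SRT uses ",", WebVTT uses ".")."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_mark}{millis:03d}"


def to_srt(segments: list[dict]) -> str:
    """Build SRT text: numbered cues separated by blank lines."""
    cues = []
    for number, segment in enumerate(segments, start=1):
        start = format_timestamp(segment["start"], ",")
        end = format_timestamp(segment["end"], ",")
        cues.append(f"{number}\n{start} --> {end}\n{segment['text']}")
    return "\n\n".join(cues) + "\n"


def to_vtt(segments: list[dict]) -> str:
    """Build WebVTT text: a WEBVTT header line followed by cues."""
    cues = []
    for segment in segments:
        start = format_timestamp(segment["start"], ".")
        end = format_timestamp(segment["end"], ".")
        cues.append(f"{start} --> {end}\n{segment['text']}")
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"
```

The visible differences are the `WEBVTT` header, the cue numbers that SRT requires, and the decimal mark in timestamps (comma in SRT, period in WebVTT).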

Question 10

Is it safe to upload my files?

Accepted Answer

Your files travel encrypted to Cloudflare R2, with access via temporary signed URLs. The upload from your browser goes directly to storage, without passing through intermediate servers where they could be exposed. We do not use your content to train AI models or share it with third parties beyond the processing needed to generate the transcription, translation or summary you requested.

Question 11

Can I try without registering?

Accepted Answer

Yes. In guest mode you can transcribe a 30-second sample of each file and see the result before deciding. If you like it, the sample is automatically linked to your account when you register, and you can process the full file without losing what you had already started.

Question 12

How does pricing work?

Accepted Answer

We work with minute packages: you choose the one that best fits your monthly volume and pay a per-minute price that decreases according to the package. The available packages, the per-minute price for each and the included features are explained on the pricing page.

Transcribe audio and video to text instantly

Three simple steps

Upload your file

We transcribe with AI

Download your text

Tools to transcribe without limits

Identify who speaks

Export in any format

Professional translation editor

Edit with full history

Clickable timestamps
