Paper transcribes audio files with Whisper and treats the transcript as a regular source: it can be summarised, used to make flashcards, or chatted with.
Upload audio
- Open a page.
- In the Sources panel, click + Add source → Upload a file.
- Pick the file. Common audio formats work: .mp3, .m4a, .wav, .ogg, .flac.
There are two paths depending on size:
- Short recordings (up to 25 MB - roughly an hour of mono MP3) are sent straight to Whisper.
- Longer recordings (up to 500 MB) are uploaded to storage first, then the audio is extracted and transcribed on the server. This takes a few minutes.
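The size-based routing above can be sketched roughly as follows. This is an illustration only, not Paper's actual implementation: the function name `choose_upload_path`, the labels `"direct"` and `"storage"`, and the choice of 1 MB = 1024 × 1024 bytes are all assumptions made for the example.

```python
# Hypothetical sketch of the two upload paths described above.
# Names and the exact definition of "MB" are assumptions, not Paper's real code.

MB = 1024 * 1024                      # assumed: binary megabytes
DIRECT_LIMIT_BYTES = 25 * MB          # short recordings go straight to Whisper
STORAGE_LIMIT_BYTES = 500 * MB        # longer recordings go via storage first


def choose_upload_path(size_bytes: int) -> str:
    """Return which path a recording of the given size would take."""
    if size_bytes <= 0:
        raise ValueError("file is empty")
    if size_bytes <= DIRECT_LIMIT_BYTES:
        return "direct"               # sent straight to transcription
    if size_bytes <= STORAGE_LIMIT_BYTES:
        return "storage"              # uploaded, then transcribed on the server
    raise ValueError("file exceeds the 500 MB limit")


if __name__ == "__main__":
    for size in (10 * MB, 25 * MB, 120 * MB):
        print(size // MB, "MB ->", choose_upload_path(size))
```

In practice this means a recording just over 25 MB takes the slower storage path, so re-exporting it at a lower bitrate may get it transcribed sooner.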
After transcription
- Read the transcript alongside an audio player. Click any line to jump to that point.
- Chapters are auto-generated - Paper groups the transcript into 5–15 topic sections with timestamps and a short summary of each.
- Summarise to get the key points without listening end-to-end.
- Quote in chat - citations include timestamps so you can check each quote against the audio.
Tips
- If the audio is noisy, accuracy drops. Fix obvious errors in the transcript before generating flashcards or summaries - both are built from the corrected text.
- For your own voice memos, recording closer to the mic and avoiding background noise makes a big difference.
- For a series of recordings (multi-part interviews, weekly lectures), add each as its own source in the same page so chat can reason across them all.