Taylor Brooks

Descript Pricing: Translate Old Minutes to New Costs

Convert old minutes into accurate Descript pricing: practical cost breakdowns for podcasters, solo creators, and transcribers.

Introduction

For independent podcasters, solo creators, and freelance transcribers, the recent shift from “transcription minutes” to “media minutes + AI credits” in platforms like Descript is more than a superficial pricing update—it fundamentally changes how you model costs, forecast bills, and structure your workflow. If you’ve been tracking monthly usage with a rough tally of hours transcribed, you’ll need a new framework to translate those old numbers into the new units. This is especially true for transcript-heavy workflows, where every file length, re-upload, and derivative export can push your consumption up in ways that weren’t billable before.

In this guide, we’ll walk through practical steps to map historic transcription usage into the new pricing model. We’ll break down common scenarios—like a five-episode podcast month, a batch of interviews, and a mixed-media course—showing exactly how many media minutes and AI credits each consumes. We’ll also discuss transcript-specific behaviors that can inflate bills and how implementing transcript-first policies (and using tools like clean, structured transcript generation from SkyScribe) can reduce waste and keep costs predictable.


Understanding the New Descript Pricing Model

The legacy model billed you for audio or video processing in “transcription minutes”: if you uploaded a one-hour interview, you were charged for those sixty minutes of transcription time.

The new model replaces that with two linked units:

  1. Media Minutes — the total duration of the uploaded file, regardless of how you use it after import. A 60‑minute video counts as 60 media minutes whether you pull audio from it or just extract text.
  2. AI Credits — consumed whenever you perform AI-driven tasks after the initial upload, such as re‑exporting transcripts, cleaning segments, generating summaries, or producing subtitles.

This change reflects the heavier load on platforms’ processing infrastructure, but it also means certain behaviors, such as re‑uploading files for formatting tests, can bill you twice for the same content.

For podcasters who routinely run 5–10 episodes a month, this hybrid model creates a risk of “hidden consumption”: actions that feel minor (re‑exports, resegmentations) now translate into billable credits.


Mapping Historic Minutes to the New Units

To forecast future costs, start by extracting a 3–6 month average of how many hours you processed under the old model. Then convert those hours into media minutes and overlay estimated AI credit usage.

A good baseline conversion is:

  • Media minutes: Old transcription minutes × 1.4 (this accounts for the full media track, not just the spoken audio segment)
  • AI credits: Media minutes × 0.3 (credits for speaker identification, cleanup, and derivative exports)

For example, if you processed 300 transcription minutes monthly, the new model could register ~420 media minutes plus ~126 AI credits. Freelancers will notice that this bumps the base bill even if they don’t change editing habits.
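
As a quick sanity check, here is a minimal Python sketch of that baseline conversion. The 1.4 and 0.3 factors are the rough heuristics above, not official Descript rates, so adjust them once you see a real invoice.

    # Rough conversion from legacy transcription minutes to the new units.
    # The multipliers are the baseline heuristics from this guide, not
    # official Descript rates -- tune them against your first real bill.
    MEDIA_MINUTE_FACTOR = 1.4   # full media track vs. spoken audio only
    AI_CREDIT_FACTOR = 0.3      # speaker ID, cleanup, derivative exports

    def estimate_new_units(old_transcription_minutes: float) -> tuple[float, float]:
        media_minutes = old_transcription_minutes * MEDIA_MINUTE_FACTOR
        ai_credits = media_minutes * AI_CREDIT_FACTOR
        return media_minutes, ai_credits

    # Example from the text: 300 legacy minutes -> ~420 media minutes, ~126 credits
    media, credits = estimate_new_units(300)
    print(f"{media:.0f} media minutes, {credits:.0f} AI credits")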


Scenario 1: A Five‑Episode Podcast Month

You run a weekly interview show, each episode averaging 45 minutes.

  • Media minutes: 5 episodes × 45 min = 225 media minutes
  • AI credits:
      • Speaker identification at import: 225 × 0.1 = 22.5 credits
      • Subtitle export for YouTube: 225 × 0.2 = 45 credits
      • Summary generation for show notes: 225 × 0.1 = 22.5 credits
  • Total credits: ~90 credits

Under the old model, you’d have been billed for a flat 225 minutes; under the new model, the media minutes are unchanged (225), but the layered AI credits add roughly 90 credits of extra cost.
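
To see that arithmetic in one place, here is a small sketch of the breakdown; the per-task multipliers (0.1, 0.2, 0.1) are the illustrative rates from this scenario, not published Descript figures.

    # Credit breakdown for the five-episode month, using the illustrative
    # per-task multipliers from this scenario (not published Descript rates).
    episodes, minutes_per_episode = 5, 45
    media_minutes = episodes * minutes_per_episode  # 225

    credit_rates = {
        "speaker identification": 0.1,
        "subtitle export": 0.2,
        "summary generation": 0.1,
    }

    credits = {task: media_minutes * rate for task, rate in credit_rates.items()}
    for task, cost in credits.items():
        print(f"{task}: {cost:.1f} credits")
    print(f"total: {sum(credits.values()):.0f} credits")  # ~90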

One way to control this is to do multi-purpose exports from a single transcript. Services like SkyScribe allow you to pull a precisely labeled transcript with aligned timestamps from your original media upload, letting you generate subtitles and summaries offline without incurring multiple rounds of credit use.


Scenario 2: Ten Interviews at 30 Minutes Each

Let’s say a freelance transcriber processes 10 interview recordings for different clients.

  • Media minutes: 10 × 30 = 300 media minutes
  • AI credits:
      • Clean‑read exports: 300 × 0.15 = 45 credits
      • SRT subtitle generation: 300 × 0.15 = 45 credits
      • Minor speaker correction after initial upload: 300 × 0.05 = 15 credits
  • Total credits: ~105 credits

A common trap here is re‑uploading corrected versions for each client style guide. In the new billing model, that’s another 300 media minutes per upload. Instead, use a single master transcript, clean it once, and adapt copies locally. Automated resegmentation tools (I use auto resegmentation in SkyScribe for this) let you split dialogue into subtitle-length or narrative blocks without triggering new media-minute charges.
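
To make the re-upload trap concrete, here is a rough sketch comparing a single master upload against repeated correction rounds; the two extra rounds are hypothetical, and the sketch simply assumes each round repeats the full media-minute charge as described above.

    # Each extra round of re-uploads repeats the full 300 media minutes,
    # as described above; a single master transcript avoids that entirely.
    base_media_minutes = 300   # ten 30-minute interviews, uploaded once
    reupload_rounds = 2        # hypothetical: two extra rounds of style fixes

    with_reuploads = base_media_minutes * (1 + reupload_rounds)  # 900 media minutes
    master_only = base_media_minutes                              # 300 media minutes

    print(f"with re-uploads: {with_reuploads} media minutes")
    print(f"master transcript only: {master_only} media minutes")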


Scenario 3: Producing a Mixed‑Media Course

You record a course with 2 hours of video lectures plus six 20‑minute Q&A sessions.

  • Media minutes: 120 (lectures) + 120 (Q&As) = 240 media minutes
  • AI credits:
      • Chapter summaries: 240 × 0.15 = 36 credits
      • Multi‑language subtitle exports: 240 × 0.3 = 72 credits
      • Speaker separation for Q&A: 120 × 0.15 = 18 credits
  • Total credits: ~126 credits

Mixed‑media projects compound consumption because video files bill for full runtime—including visuals—while every derivative format pulls AI credits. Translating transcripts for multilingual audiences, for example, can double credit usage in an instant. Tools that can translate transcripts while maintaining timestamps help ensure those credits go toward final, usable assets rather than iterative, wasteful exports.
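
As a rough illustration of how translation inflates credit spend, the sketch below assumes each additional target language repeats the 0.3× subtitle-export rate from this scenario; that per-language behavior is an assumption for illustration, not a documented rate.

    # Illustrative only: assume each target language repeats the 0.3x
    # subtitle-export rate used in this scenario (not a published rate).
    media_minutes = 240
    subtitle_rate = 0.3

    for languages in (1, 2, 3):
        credits = media_minutes * subtitle_rate * languages
        print(f"{languages} language(s): {credits:.0f} subtitle credits")
    # 1 -> 72, 2 -> 144, 3 -> 216: a second language alone doubles this line item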


Identifying High-Waste Behaviors

The most common cost-inflating behaviors under the new model include:

  • Multiple uploads of the same asset — Each counts full media minutes even if identical.
  • Redundant exports — Creating separate exports for each internal test.
  • Fine-grain segmentation — Producing subtitle-length fragments for the same transcript multiple times.
  • Direct-to-video workflows — Exporting subtitles or edits without previewing via transcript first.

Any action that triggers fresh AI processing will add credits to your bill. If you loop through imports for subtitling, speaker tweaks, and summaries separately, expect those credits to stack up.


Implementing Transcript-First Policies

A “transcript-first” rule means you extract, clean, and structure the transcript immediately after upload—and only then generate all secondary assets from that base. This minimizes repeated uploads and processing.

Practical steps:

  1. Upload once and verify speaker labels and timestamps in the initial transcript.
  2. Clean early — fix punctuation, casing, and filler words before derivative formats.
  3. Batch derivative exports — produce subtitles, summaries, and highlights in a single session.
  4. Local refinements — adapt transcripts offline instead of re-submitting the media file.

Platforms like SkyScribe make this easier by producing clean transcripts with speaker labels and timestamps at the moment of import, so you can build all variations without “re-billing” your minutes for cosmetic changes.


Rules of Thumb for Forecasting First Migration Bills

Until you have precise usage reports, use these heuristics:

  • Multiply old transcription minutes by 1.4 to get expected media minutes.
  • Add credits at ~0.3× media minutes for basic edits and exports.
  • Buffer 20–25% above that if you perform frequent format changes or translations.

Example: If your old monthly average was 400 transcription minutes:

  • Media minutes = 400 × 1.4 = 560
  • AI credits = 560 × 0.3 = 168
  • With buffer = 168 × 1.25 ≈ 210 credits

This gives you a ballpark figure before the first bill hits.
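
Pulling those heuristics together, here is a compact forecasting sketch with the 25% buffer applied; the factors are the rules of thumb above, not official Descript rates.

    # Forecast heuristic: old minutes -> media minutes -> credits -> buffered credits.
    # All factors are the rules of thumb above, not official Descript rates.
    def forecast_first_bill(old_minutes: float, buffer: float = 0.25) -> dict:
        media_minutes = old_minutes * 1.4
        ai_credits = media_minutes * 0.3
        return {
            "media_minutes": round(media_minutes),
            "ai_credits": round(ai_credits),
            "buffered_credits": round(ai_credits * (1 + buffer)),
        }

    # Example from the text: 400 legacy minutes
    print(forecast_first_bill(400))
    # {'media_minutes': 560, 'ai_credits': 168, 'buffered_credits': 210}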


Conclusion

Descript’s move from transcription minutes to a media minutes + AI credit model changes more than the billing line—it changes creator behavior. Podcasters, course producers, and freelance transcribers who keep iterative, multi‑export habits will see higher consumption and unpredictable costs. By mapping historical usage to the new units, recognizing high‑waste behaviors, and applying transcript-first workflows, you can keep bills in check.

And the shift isn’t all bad: with more structured planning, the new model can still fit indie budgets—especially if you rely on tools like SkyScribe to produce clean, timestamped transcripts in one go. By forecasting with the conversion formulas in this guide, you’ll gain a clearer picture of monthly consumption and avoid surprise overages while adapting smoothly to the new pricing reality.


FAQ

1. How do “media minutes” differ from “transcription minutes”? Media minutes measure the full duration of the uploaded file, including the video track, whereas transcription minutes in the old model covered only the audio that was processed.

2. What counts as an AI credit? AI credits are consumed when you run AI-driven actions such as speaker identification, generating summaries, exporting subtitles, or translating transcripts.

3. Why are mixed-media projects more expensive under the new model? Videos bill for full runtime in media minutes, and every post‑processing action that uses AI—like separating speakers—costs credits.

4. What is a transcript-first policy, and why does it save money? It’s a workflow where you create and finalize your transcript from the initial upload, then generate all derivatives from that single transcript. This avoids repeated uploads and AI processes that add costs.

5. How can SkyScribe help reduce billing waste? SkyScribe generates clean, accurately labeled transcripts at upload, supports transcript resegmentation without re-uploading, and allows translation while preserving timestamps, cutting down on repeated media-minute charges.
