This consumer says you don't get a red cent then!
It's already a plague on YouTube, where half of the docu-style vids are AI-narrated. I quit them in disgust. It's so frustrating. It has eroded my perception of YouTube in a short time.
AI voice synth is pretty solidly useful in comparison to, say, video generation from scratch. I think that there are good uses for voice synth (e.g. filling in for an aging actor/actress who can't do a voice any more, video game mods, procedurally-generated speech, etc.), but audiobooks don't really play to those strengths. I'm a little skeptical that in 2025, it's at the point where it's a good drop-in replacement for audiobooks. What I've heard still doesn't have emphasis on par with a human.
I don't know what it costs to have a human read an audiobook, but I can't imagine that it's that expensive; I doubt that there's all that much editing involved.
kagis
https://www.reddit.com/r/litrpg/comments/1426xav/whats_the_average_narrator_cost/
So I produced my own audiobooks for my Nova Roma series so I know the exact numbers for you:
$250 per finished hour for the narrator. Books ranged from about 200k words-270k words, which came out to 22 hours, 20 hours, and 25 hours.
So books 1-3 cost me $5,500, $5,000, and $6,250. I'm contracted for two more books with my narrator, so I expect to spend another 5k-6k for each of those.
So for a five book series, each one 200k+ words, the total cost out of pocket for me will be about $27,000 give or take to make the series into audiobooks.
That's actually lower than I expected. Like, if a book sells at any kind of volume, it can't be that hard to make that back.
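For anyone who wants to sanity-check the quoted numbers, here's a quick sketch of the arithmetic. The rate and hours come straight from the quoted post; the variable and function names are just mine, and the figures for books 4 and 5 are the poster's own 5k-6k estimate, not actual costs.

```python
# Figures taken from the quoted Reddit post.
RATE_PER_FINISHED_HOUR = 250   # USD per finished hour of narration
finished_hours = [22, 20, 25]  # books 1-3

def narration_cost(hours, rate=RATE_PER_FINISHED_HOUR):
    """Narrator cost for one book: finished hours times the hourly rate."""
    return hours * rate

costs = [narration_cost(h) for h in finished_hours]
spent_so_far = sum(costs)

# Books 4-5 are only estimated in the post, at 5k-6k each.
low_total = spent_so_far + 2 * 5000
high_total = spent_so_far + 2 * 6000

print(costs)                  # [5500, 5000, 6250]
print(spent_so_far)           # 16750
print(low_total, high_total)  # 26750 28750
```

So the "about $27,000" figure sits at the low end of the range implied by the poster's own estimate.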
EDIT: I can believe that it's possible to build a speech synth system that does do better, mind; I certainly don't think that there are any fundamental limitations on this. I'd guess that there's also room for human-assisted stuff, where you have some system that annotates the text with emphasis markers, and the annotated text gets fed into a speech synth engine trained to convert annotated text to voice. There, someone listens to the output and just tweaks the annotated text where the annotation system doesn't get it quite right. But I don't think that we're really there yet.
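To make the annotate-then-tweak idea a bit more concrete, here's a rough sketch. Everything in it is an assumption on my part: `auto_annotate` is a toy stand-in for a real annotation system (it just flags ALL-CAPS words), the `add`/`remove` arguments stand in for the human's corrections, and I'm assuming the synth engine accepts SSML-style `<emphasis>` tags, which not every engine does.

```python
from xml.sax.saxutils import escape

def auto_annotate(words):
    """Hypothetical automatic pass: flag multi-letter ALL-CAPS words for emphasis."""
    return {i for i, w in enumerate(words) if w.isupper() and len(w) > 1}

def to_ssml(text, add=(), remove=()):
    """Build SSML-style markup from the automatic annotations plus human tweaks.

    add / remove are word indices a human reviewer wants emphasized or
    de-emphasized after listening to the synthesized output.
    """
    words = text.split()
    emphasized = (auto_annotate(words) | set(add)) - set(remove)
    parts = []
    for i, word in enumerate(words):
        word = escape(word)  # keep characters like & and < valid in the markup
        parts.append(f"<emphasis>{word}</emphasis>" if i in emphasized else word)
    return "<speak>" + " ".join(parts) + "</speak>"

# The automatic pass catches "NEVER"; the reviewer adds emphasis on "she" (index 3).
ssml = to_ssml("I NEVER said she stole it", add=[3])
print(ssml)
```

The point is just that the human only edits a small set of overrides rather than re-recording anything; the actual annotation model and synth engine are the hard parts and aren't shown here.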
More jobless, desperate people.
Oh, goody! I hope they use that TikTok lady's voice! It's my favorite!