A partnership with OpenAI will let podcasters replicate their voices to automatically create foreign-language versions of their shows.

  • @[email protected]
    link
    fedilink
    English
    204
    9 months ago

    Honestly, as long as the person whose voice it is gives full permission it’s probably one great use for AI.

    That being said, you could just hire people who actually know the language to translate.

    • @[email protected]
      link
      fedilink
      English
      69
      edit-2
      9 months ago

      I am for hiring people who know the language and the target audience. Mainly to avoid AI taking away possible jobs and to avoid something literally translated that either doesn’t make sense or ends up being offensive by accident.

      • @[email protected]
        link
        fedilink
        English
        73
        9 months ago

        You will never ever in any case be able to stop technology from progressing. Instead of fearing the loss of jobs, how about making sure that we can properly handle and integrate AI into our society with everyone benefitting from it?

        Stop the defeatist attitude, get politically active and help kick conservatives and fascists into the ditch where they belong.

        • @[email protected]
          link
          fedilink
          English
          35
          9 months ago

          As long as money’s involved, there’s no way AI tech benefits society.

          That kinda shit will only benefit the wealthy and the owning classes.

          • @[email protected]
            link
            fedilink
            English
            17
            9 months ago

            Might as well go back to the fields with all the other Luddites then.

            We live in a capitalist society, every bit of progress benefits the rich first. It’s always been like that, it has nothing to do with the AI part.

            • @[email protected]
              link
              fedilink
              English
              6
              9 months ago

              You’d better get into the factory with the other 1984 drones then. 🤷

              We all can play that stupid game. Theft and copyright infringement aren’t progress.

          • @[email protected]
            link
            fedilink
            English
            12
            edit-2
            9 months ago

            So, like… a claim so broad as “As long as money’s involved, there’s no way AI tech benefits society” is obviously untrue, right? Even if we accept a premise like “On the whole, AI will hurt society more than it helps”, it’s basically just dogma to blanket deny any practical usefulness. Take firearms, for example: they’re often strictly controlled, but rarely if ever completely purged – almost all societies accept that some situations exist where the utility sufficiently justifies the harm.

            To be honest, I feel really weird pushing back against this because we seem rather ideologically aligned. I think we both feel that technologies which promote economic development will – by default – disproportionately empower those rich and powerful few. With that being said, from an ideological perspective, technological developments are not in fundamental opposition to Marxist philosophy (yes, even technological developments which render some skilled labor obsolete).

            On the contrary; if we are to believe that the next step of economic development lies in casting aside class division, then we must necessarily concede that the only way forward is to recruit novel technological developments toward that purpose. It is self-undermining and shortsighted to argue that simply allowing a development will inherently undermine anti-capital interests, because how then could such a system so apparently incompatible with future technologies also claim to itself be the future?

          • @[email protected]
            link
            fedilink
            English
            7
            9 months ago

            Unless, you know, it’s properly regulated and stuff. Regulation works through laws. Laws are passed by the government. The government is elected by the people.

            So get the proper people into government.

            • @[email protected]
              link
              fedilink
              English
              13
              9 months ago

              That’s naive and delusional. At least in the USA, there’s no chance of such regulations coming about, regardless of who is put in power. The RNC and DNC both are far more swayed by the money of those eliminating their work force than the plight of the worker. That isn’t changing any time soon.

              I’ll eat my hat if they pass a law that actually protects workers and bans use of AI to replace human jobs.

              • @[email protected]
                link
                fedilink
                English
                9
                edit-2
                9 months ago

                And now refer back to my first comment, let that defeatist attitude go, and work on getting those things changed. If you were right, we’d still be living under kings and owning classical slaves ;)

                I’m not saying it’s easy or quick, I’m saying that your thinking makes it reality because you just accept getting assfucked… Which is exactly “their” goal.

            • @[email protected]
              link
              fedilink
              English
              3
              9 months ago

              The government is elected by the people.

              And controlled by the wealthy. You don’t really think your local representative cares what you think, do you? Because that would be laughably naive.

              They care what their lobbyists and major donors think.

              • @[email protected]
                link
                fedilink
                English
                1
                9 months ago

                First of all that is a very simplistic and therefore incomplete view of the things. Second of all, that’s why you work on getting people there who do care and want to fix that.

        • @[email protected]
          link
          fedilink
          English
          4
          9 months ago

          Uh, no. You are not all powerful and abusive technology is not an inevitability we have to submit to. We’ll never submit to garbage that steals shit from people.

          • @[email protected]
            link
            fedilink
            English
            2
            edit-2
            9 months ago

            The AI doesn’t steal anything, the people creating it do. This is something that can and should and must be regulated.

            To add my personal opinion to that, I don’t think there is a problem with models being trained on all possible data, but it must not be used by a single company to profit some few people. It must be available to anyone and everyone, since it learned from anyone and everyone. We all learn from others and AI is no different - the problem is in the centralization and further abuse of its power.

      • vortic
        link
        fedilink
        English
        8
        9 months ago

        As the other person said, we’re not going to be able to avoid this kind of change and I don’t think we should want to. There are more podcasts to translate than can possibly be done without AI.

        A better use of translators, in my opinion, is as editors. Listen to the AI result while reading the English transcript to fix the types of problems that you mention.

      • @[email protected]
        link
        fedilink
        English
        3
        9 months ago

        If it was feasible to do that we would’ve been doing it already.

        AI makes it cost effective to translate audio for an audience of just a few people.

        In cases where it has been cost effective to pay a translator in the past I expect it will continue to be so. I’m aware that AI generated audio is pretty good, but translations are often pretty poor.

    • Carlos Solís
      link
      fedilink
      English
      26
      9 months ago

      It can be both at the same time - getting a professional voice actor to translate the script, then applying AI magic to have the voices match the original as closely as possible.

    • arefx
      link
      fedilink
      English
      17
      9 months ago

      Or instead of hiring people you could use AI and then pocket that money because you’re a greedy CEO/shareholder and fuck everyone but yourself.

      • @[email protected]
        link
        fedilink
        English
        7
        edit-2
        9 months ago

        I mean.

        Would you not like to hear the OG voice but in your language? Movies dubbed in Spanish sound straight up awful to me because the voice actors sound wonky compared to the original.

        Not everything has to be about a greedy CEO, sometimes the proposal could actually be good if done right. We seriously need to chill with this narrative in every fucking thread.

      • @[email protected]
        link
        fedilink
        English
        2
        9 months ago

        It sounds like you have a problem with tax rates more than the technology. Are we also fed up with being able to translate web pages with a browser extension?

  • Flying Squid
    cake
    link
    fedilink
    English
    124
    9 months ago

    Ah. So now people can listen to Joe Rogan in the original Russian.

  • FireWire400
    link
    fedilink
    English
    51
    9 months ago

    That’s just weird… Part of the reason I listen to podcasts is that I just enjoy people talking about things and AI voices still have this uncanny quality to me

      • @[email protected]
        link
        fedilink
        English
        2
        9 months ago

        That’s obviously way better than any TTS before it, but I still wouldn’t want to listen to it for more than a few minutes. In these two sentences I can already hear some of the “AI quirks” and the longer you listen, the more you start to notice them.
        I listen to a lot of AI celeb impersonations and they all sound like the same machine with different voice synthesizers. There’s something about the prosody that gives it away, every sentence has the same generic pattern.
        Humans are generally more creative, or more monotonous, but AI is in a weird in-between space where it’s never interested and never bored, always soulless.

        • @[email protected]
          link
          fedilink
          English
          2
          9 months ago

          Having listened to it, I could not identify any sort of “AI quirk”. It sounded perfectly fine.

    • @[email protected]
      link
      fedilink
      English
      25
      9 months ago

      A large language model took a 3 second snippet of a voice and extrapolated from that the whole spoken English lexicon from that voice in a way that was indistinguishable from the real person to banking voice verification algorithms.

      We are so far beyond what you think of when we say the word AI, because we replaced the underlying thing that it is without most people realizing it. The speed at which large language models are currently progressing is mind-boggling.

      These models, when shown fMRI data for a patient, can figure out what image the patient is looking at and then render it. A patient looks at a picture of a giraffe in a jungle, and the model renders it, having never before seen a giraffe… from brain scan data, in real time.

      Not good enough? The same fMRI data was examined in real time by a large language model while a patient was watching a short movie and was asked to think about what they saw in words. The sentences the person thought were rendered as English sentences by the model, in real time, from fMRI data.

      That’s a step from reading dreams and that too will happen inside 20 months.

      We are very much there.

      • @[email protected]
        link
        fedilink
        English
        9
        9 months ago

        I don’t think what you’re saying is possible. Voxels used in fMRI are measured in millimeters (down to one, if I recall) and don’t allow for such granular analysis. It is possible to ‘see’ what a person sees, but the image doesn’t resemble the original too closely.

        At least that’s what I learned a few years ago. I’m happy to look at new sources if you have some, though.

        • @[email protected]
          link
          fedilink
          English
          1
          9 months ago

          Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding: https://aiimpacts.org/2022-expert-survey-on-progress-in-ai/

          High-resolution image reconstruction with latent diffusion models from human brain activity: https://www.biorxiv.org/content/10.1101/2022.11.18.517004v3

          Semantic reconstruction of continuous language from non-invasive brain recordings: https://www.biorxiv.org/content/10.1101/2022.09.29.509744v1

        • @[email protected]
          link
          fedilink
          English
          1
          9 months ago

          I like how I said the problem is that progress is moving so fast you don’t even realize what you don’t know about the subject as a layman… and then this comment appears saying things are not possible.

          Lol.

          How timely.

          The speed at which things are changing and redefining what is possible in this space is faster than in any other area of research. It’s insane to the point that if you are not actively reading white papers every day, you miss major advances.

          The layman has this idea of what “AI” means, but we truly have no good way to keep the word aligned with its meaning and capabilities, given how fast we change what it means underneath.

          • @[email protected]
            link
            fedilink
            English
            3
            edit-2
            9 months ago

            I looked at your sources, or at least one of them. The problem is that, as you said, I am a layman, at least when it comes to AI. I do know how fMRI works, though.

            And I stand corrected. Some of those pictures do closely resemble the original. Impressive, although not all subjects seem to produce the same level of detail and accuracy. Unfortunately, I have no way to verify the AI side of the paper. It is mind-boggling that such images can be constructed from voxels of such size. A 1.8 mm voxel contains close to 100k neurons and even more synapses. And the fMRI signal itself is only a blood oxygen level overshoot in these areas, not a direct measurement of neural activity. It makes me wonder what constraints and tricks had to be used to generate these images. I guess combining the semantic meaning of the image with the broader image helped: inferring pixel color (e.g. mostly blue with some gray in the middle), then adding the semantic meaning (plane), and combining the two.

            Truly amazing, but I do remain somewhat sceptical.

            • @[email protected]
              link
              fedilink
              English
              1
              9 months ago

              The model inferred meaning much the same way it infers meaning from text. Short phrases can generate intricate images accurate to author intent using stable diffusion.

              The models themselves in those studies leveraged stable diffusion as the mechanism of image generation, but instead of text prompts, they use fMRI data training.

      • @[email protected]
        cake
        link
        fedilink
        English
        4
        9 months ago

        Interesting and scary to think AI understands the black box of human neurology more than we understand the black box of AI.

    • rigatti
      link
      fedilink
      English
      8
      9 months ago

      It won’t take long until that uncanny quality is worked out.

      • danielbln
        link
        fedilink
        English
        7
        9 months ago

        Imho it has already been worked out. There is probably selection bias at play as you don’t even recognize the AI voices that are already there.

      • @[email protected]
        link
        fedilink
        English
        1
        9 months ago

        Following up on the other comment.

        The issue is that widely available speech models are not yet offering the quality that is technically possible. That is probably why you think we’re not there yet. But we are.

        Oh, I’m looking forward to just translating a whole audiobook into my native language and any speaking style I like.

        Okay, perhaps we would still have difficulties with made up fantasy words or words from foreign languages with little training data.

        Mind, this is already possible. It’s just that I don’t have access to this technology. I sincerely hope that there will be no gatekeeping to the training data, such that we can train such models ourselves.

    • @[email protected]
      link
      fedilink
      English
      35
      edit-2
      9 months ago

      What’s your beef with this?

      In what world does someone who only speaks Spanish being able to listen to and enjoy a podcast that was recorded in English end up being such a terrible thing?

      “Broader accessibility of information? No, please make it stop!!”

      • @[email protected]
        link
        fedilink
        English
        9
        9 months ago

        You could argue that for major languages, where the translations would drive revenue, they should prefer to hire people to do the translations from within the target market - it would create some amount of economic opportunity rather than just being another way for the developed countries to suck up money on services from developing ones in particular.

        • @[email protected]
          link
          fedilink
          English
          7
          9 months ago

          But that would be just translating the transcript. To make it comparable to what Spotify is planning, it would also have to include hiring voice actors to essentially redo the entire podcast in a different language.

          No offense but depending on the podcast and the target audience this solution could cost per episode more than the entire production cost of the podcast per episode.

          • @[email protected]
            link
            fedilink
            English
            1
            9 months ago

            Yeah, I could imagine that, if we’re just counting the baseline minimum of what that production would cost. I think for the most popular podcasts they could easily afford it, though. It would certainly cost much less than what they’re paying Joe Rogan.

      • @[email protected]
        link
        fedilink
        English
        8
        9 months ago

        My beef with this is that Spotify is relentless in pushing podcasts. I’m not interested in podcasts. I just want them permanently gone from my Spotify for all of eternity, but alas, I can’t get rid of them. When they start pumping out AI generated translations of popular podcasts, I can’t even imagine how hard they’ll push it.

        I can choose “Music” and “Podcasts & Shows” on the Home page in the mobile app at least, but that changes the feed massively and makes it useless. Spotify is such a trash app already, and I’m just waiting for an alternative that works in my country, but alas…

  • Th4tGuyII
    link
    fedilink
    37
    edit-2
    9 months ago

    The problem with this is the same problem news websites had when they started replacing their foreign language writers with AI.

    Just because you can translate what is literally being said word by word, doesn’t mean you’re translating the intent of what was being said.

    Idioms, phrases, jokes, pleasantries, etc. won’t translate into foreign languages no matter how well you can translate the literal words being said.

    If you want good quality translation, you should get someone who knows the language and the culture to do it, as they can translate what’s between the lines.

    • @[email protected]
      link
      fedilink
      English
      9
      edit-2
      9 months ago

      I honestly think this is a non-issue with the new LLMs coming out. GPT-4 definitely understands idioms.

      The hardest part will be getting the tone down and adding proper emotion to it.

      • @[email protected]
        link
        fedilink
        English
        4
        9 months ago

        Honestly, I agree. Machine translation isn’t by necessity limited to “literal” translations anymore.

      • @[email protected]
        link
        fedilink
        English
        2
        9 months ago

        There’s probably a strong English bias to that currently, but other languages will come with time

    • @[email protected]
      link
      fedilink
      English
      7
      9 months ago

      Shows with the budget/intent to create good quality translations will have them reviewed/refined by humans before they put it back in the voice of the host, I don’t see why they couldn’t do that.

      Shows without the budget or that just don’t care will use full-auto and I’m sure it will indeed suck.

    • @[email protected]
      link
      fedilink
      English
      3
      9 months ago

      I’m with the person in this thread that pointed out that, with this, instead of translators handling an impossible amount of work, now they can edit the output to match correctly and get more done.

      Fighting the tech will fail, as history has shown. Integrating it in a healthy, useful way is what is needed.

  • 👁️👄👁️
    link
    fedilink
    English
    35
    9 months ago

    That’s going to cause so many lawsuits. Also, I wonder, since the WGA strike has finally finished and they are creating a contract, whether this will affect it.

    • @[email protected]
      link
      fedilink
      English
      45
      9 months ago

      Why do you think that? It sounds like it’s a feature that a Podcaster can choose to use if they want to. It doesn’t sound like they are just going to do it to every podcast without permission.

      Honestly, as dumb as the AI hype can be, I see this as an actual good use of the tech, but I could be wrong.

          • misery mansion
            link
            fedilink
            English
            8
            9 months ago

            Yes exactly, as long as that adds up to the same compensation percentage the original voice actor signed up for then I don’t see an issue with this.

            I’m almost 100% sure that won’t be the case without a fight

            • The Barto
              link
              fedilink
              English
              1
              9 months ago

              Just send Joe Rogan to strong arm them or talk about dmt until they give in.

  • @[email protected]
    link
    fedilink
    English
    31
    9 months ago

    After discovering my first AI covers (specifically Barbie Girl by Johnny Cash) a couple of weeks ago my first thought was “Yep, this is how Star Trek’s universal translator is about to come to pass.”

  • Андрей Быдло
    link
    fedilink
    English
    30
    9 months ago

    This pseudo-AI is a new kind of plastic: sometimes useful, but misused to infest everything. As it spreads, there will be less and less genuine content in a sea of garbage. What little remains will become a luxury.

    Technological advance is in the hands of those who own the means of production.

  • @[email protected]
    link
    fedilink
    English
    27
    9 months ago

    Is this good or bad? I can see this being used to steal your voice and use it without your permission.

    • @[email protected]
      link
      fedilink
      English
      26
      9 months ago

      Assuming that nothing nefarious happens, I can still see this being a problem if the translations aren’t top quality. Imagine that speakers of another language are offended or you’re embarrassed in front of them because something you said was incorrectly translated; then it’s rendered in your voice so it seems you said it.

      • Capt. Wolf
        link
        fedilink
        English
        20
        edit-2
        9 months ago

        Handle it just like horror podcasts usually do. Disclaimers before and after the podcast. Disclaimers in the podcast description. Notices in the ToS.

        “This podcast has been translated into *your language* with the help of OpenAI. This is an automated service. As such, it may contain transcription and translation errors which may result in dialogue not intended by the original podcaster. Please report errors to *support link here*.”

        Be more concerned about this being like what Hollywood just pulled, where Spotify includes a usage clause that gives them the rights to the podcaster’s voice in perpetuity.

      • Chariotwheel
        link
        fedilink
        7
        9 months ago

        And, it doesn’t even need to be wrong. Sometimes very innocent things have a specific meaning or connotation in certain languages. Be it innuendos or euphemisms.

        Using 3/5 in connection with Black people would mean basically nothing in Germany, but would perk up ears in the USA. On the other hand, 18 and 88 are not that well known in the USA as anything in particular, but in Germany you can’t easily have them on your car plate, especially if you’re from Hamburg (HH).

        So you could quite correctly translate things, but they still get a different connotation depending on culture and language.

      • TheSaneWriter
        link
        fedilink
        English
        1
        9 months ago

        Perhaps that could be resolved by a disclaimer. Something like, “The following lyrics were generated by an AI and thus may be mistranslated.” It wouldn’t be perfect, but it might help.

    • NaN
      link
      fedilink
      English
      7
      9 months ago

      Currently it’s an opt-in tool, and I don’t think it is likely that OpenAI or Spotify will blatantly steal voices. The fact that the tech exists enables that, though; a podcast is a perfect training tool for it. But you can’t really uncreate it.

      It’s also the sort of thing that unions have been fighting. It improves the technology and makes it an easier sell for any studios or producers to use it elsewhere, like to replace the need to pay a celebrity to come in and record radio station call outs, and long term this specifically takes away jobs from people who translate and dub audio.

      IMO it’s good it’s opt-in but ultimately anti-human.

      • FiveMacs
        link
        fedilink
        English
        4
        9 months ago

        100% chance they already stole voices and sold them, either for data harvesting or to train AI models, and never passed that money along.

        • NaN
          link
          fedilink
          English
          2
          9 months ago

          I’m sure OpenAI has downloaded a ton of podcasts for training, but more specifically when I talk about stealing I am mostly talking about using their voices for other unauthorized work, like suddenly they are announcing train stops.

    • Otter
      link
      fedilink
      English
      6
      edit-2
      9 months ago

      It would help with accessibility, and it might help protect some lesser spoken languages because those people can grow an audience as well.

      The tech will develop regardless and people will abuse it for other means, at least this one feels like a positive use as opposed to say, a company making its own podcast series with a stolen voice.

      If the creator can choose to generate other languages for their own voice, that’s probably fine?

      • @[email protected]
        link
        fedilink
        English
        6
        9 months ago

        In the short term, AI is only trained on popular languages like Spanish. It will not help less common ones.

    • Nix
      link
      fedilink
      English
      4
      9 months ago

      Anyone can copy their voices without permission currently. Seems more like a useful service, as long as the terms and conditions don’t include anything about signing your rights away by using it.

      • @[email protected]
        link
        fedilink
        English
        2
        9 months ago

        Oh sure it has that provision that it becomes property of Spotify and they can use it however they like.

  • @[email protected]
    link
    fedilink
    English
    23
    9 months ago

    If it does a good job and people get paid fairly then this seems like a great thing to me.

          • @[email protected]
            link
            fedilink
            English
            2
            edit-2
            9 months ago

            If they get paid by number of people listening… yeah bigger exposure means more money.

            Not that I’m implying that’s how it works; maybe Spotify has a lump sum of money for the year or something similar, no idea.

        • @[email protected]
          link
          fedilink
          English
          3
          9 months ago

          So now destroying smaller local communities as large scale productions can now be spread to markets they weren’t originally intended for?

        • @[email protected]
          link
          fedilink
          English
          1
          9 months ago

          Bigger audience sure, but I’d imagine there’s a fee for the translation as well, or some other catch.

          • @[email protected]
            link
            fedilink
            English
            1
            9 months ago

            I don’t like making assumptions, but knowing Spotify, almost certainly.

            They might offer it for free at first, but once a couple of people publicly make bank they’ll make it a premium charge and dangle it in front of hopefuls, no doubt in my mind.

  • @[email protected]
    link
    fedilink
    English
    23
    edit-2
    9 months ago

    I hate how many ads they push for podcasts and singles on the premium tier. Full screen. IDGAF, I just wanna listen to my music. Bracing for a wave of new duo ads, podcasts about a woman who sat on a fork or some BS like that, and artists I dislike. Now with AI translations :|

    • @[email protected]
      link
      fedilink
      English
      16
      9 months ago

      You pay for premium and they’re still serving you ads?

      Every day I feel better about never having used Spotify.

      • @[email protected]
        link
        fedilink
        English
        7
        9 months ago

        There is a recommended for you section on the main page, but you can ignore it. They aren’t inserting ads into the listening part.

  • @[email protected]
    link
    fedilink
    English
    19
    9 months ago

    Nope. I don’t support blatantly public-facing AIs that take creative jobs away from people. I don’t care if it’s opt-in. I don’t care if the podcast creator themselves activates it. Exploiting the technology will only make it normalized, meaning we’ll care less about allowing humans to be creative in the future.

    • @[email protected]
      link
      fedilink
      English
      9
      9 months ago

      It seems easy to take this position as a native English speaker, but what if you aren’t proficient in English, perhaps only in a smaller regional language that doesn’t have the same nearly infinite pool of content? This is a potential game changer for that, allowing you to listen to thousands of podcasts you never could before. No jobs were lost because there was never anyone doing the translations in the first place. When viewed this way, it’s an accessibility feature.

      • @[email protected]
        link
        fedilink
        English
        9
        edit-2
        9 months ago

        Bing bang boom.

        I think people are totally steeped in capitalist rhetoric and are completely used to living it. I 100% support creative work and I will die paying humans cold hard cash for their artistic output. But everything else should 100% be automated where it can be with the expectation that humans no longer HAVE to work to be comfortable.

        This is the same thing to me as worrying about accountants and HR when a bunch of them got displaced with computers. It disproportionately takes away jobs without equivalent replacements from people that are trained and educated with this specialization in mind, but it also moves us toward a world where we don’t have to sell our waking moments to someone else.

        It absolutely sucks ass that we aren’t already preparing for a post-capitalist or semi-post-capitalist world and people are stressed, hungry, and unsheltered. But every time I see something like this, it feels like we’re making some kind of progress toward that because not only does it remove a space for humans to be exploited for labor, but it enables previously-unfathomable levels of accessibility that have been locked behind economic barriers (e.g. hiring people to translate Ologies into Pidgin languages would be totally unprofitable and therefore would almost never happen).

        /rant

        • @[email protected]
          link
          fedilink
          English
          1
          9 months ago

          It makes sense and is good from a technological standpoint, humans have always wanted to advance. But that means we must be even more politically active to save ourselves from exploitation in the future.

      • @[email protected]
        link
        fedilink
        English
        2
        9 months ago

        Ronald would like me to tell you that Seamus told him that Dean was told by Parvati that Hagrid’s looking for you.

  • @[email protected]
    link
    fedilink
    English
    17
    9 months ago

    I have a strong feeling the terms of usage for this opt-in will include something along the lines of “we can use your voice for our future projects” and then in a few years they will just create podcasts using podcasters’ voices without their true consent and make a ton off their backs while increasing their competition.

    • @[email protected]
      link
      fedilink
      English
      5
      9 months ago

      That is of course the danger… as it is it’s pretty benign, allowing more people to consume podcasts in their own language. But the terms need to be clear.

      • @[email protected]
        link
        fedilink
        English
        6
        9 months ago

        And I am certain the terms will be clear and concise, definitely less than 50 pages and no vague and contradicting statements all over.