Does anyone know of a local audio upscaler? Preferably Android based.

  • Atemu@lemmy.ml
    arrow-up
    14
    ·
    10 months ago

    What exactly do you want to “upscale” and what effect is that supposed to cause?

          • harald_im_netz@feddit.de
            arrow-up
            11
            ·
            10 months ago

            Hello, audio engineer with a little knowledge of AI here.

            What you’re thinking of is restoring frequencies. This is possible and is commonly done by audio restoration plugins. I might be mistaken, but this does not improve the bitrate, only the perceived quality (the result is still lossy).

            I don’t think there is any real interest in upscaling actual quality (as opposed to perceived quality), especially for longer (> 1 minute) material.

            • sabreW4K3@lemmy.tfOP
              English
              arrow-up
              1
              ·
              10 months ago

              It’s funny you say that. Like most people, I bought a bunch of CDs back in the day and ripped a lot of them before I gave up my CD drive. Storage was expensive at the time, so I did what I could with MP3. As storage gets cheaper (though not cheap enough for me to go lossless), I’d like to be able to upscale my music while keeping a similar file size and have my collection mature with me until storage becomes cheap enough for me to go lossless.

              I can’t be the only person who’s thought of this.

              • comicallycluttered@beehaw.org
                arrow-up
                11
                ·
                edit-2
                10 months ago

                You’re better off buying a cheap USB optical drive, re-ripping those CDs, and transcoding the files to something like Opus, which offers comparable quality to 320kbps MP3 files at lower bitrates (which also means smaller file sizes).
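
                To put rough numbers on the file-size point, here is a minimal sketch of the arithmetic for a constant-bitrate file. The 4-minute track length and the 128 kbps figure for Opus are assumptions picked for illustration, not recommendations from the thread, and `file_size_mb` is a made-up helper name.

```python
def file_size_mb(bitrate_kbps, duration_s):
    """Approximate size in megabytes of a constant-bitrate audio file."""
    bits = bitrate_kbps * 1000 * duration_s  # kilobits/s -> total bits
    return bits / 8 / 1_000_000              # bits -> bytes -> MB (decimal)


track_seconds = 4 * 60  # assumed 4-minute track

mp3_mb = file_size_mb(320, track_seconds)   # 320 kbps MP3
opus_mb = file_size_mb(128, track_seconds)  # assumed 128 kbps Opus

print(f"MP3 320 kbps:  {mp3_mb:.2f} MB")
print(f"Opus 128 kbps: {opus_mb:.2f} MB")
```

                Container overhead and variable bitrate are ignored here, so real files will differ somewhat.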

                Or you can just “download” the FLAC versions, transcode those, and delete them after.

                Also, kind of funny how this was posted just after someone complained about the same thing in the audio engineering subreddit.

                • noodlejetski@lemm.ee
                  arrow-up
                  2
                  ·
                  edit-2
                  10 months ago

                  wow, so now reddit won’t let you see the post without logging in even if you open it through the old.reddit domain?

          • MangoPenguin@lemmy.blahaj.zone
            English
            arrow-up
            6
            ·
            10 months ago

            It wouldn’t be the original audio, the AI would just be making up new content to fill in the blanks like it does with a photo.

              • Zoom@lemmy.studio
                arrow-up
                3
                ·
                10 months ago

                If you’re OK listening to a derivative work of your input. Otherwise, it’s bad.

                • sabreW4K3@lemmy.tfOP
                  English
                  arrow-up
                  2
                  ·
                  10 months ago

                  Hmmm. I feel like this is one of those long-term studies that would be quite exciting? Am I wrong to be a little bit excited about programs learning how to guess correctly what should be where and subsequently how things should sound?

        • sabreW4K3@lemmy.tfOP
          English
          arrow-up
          2
          ·
          10 months ago

          Can AI or machine learning not do it in the same way that it does with pictures?

          • Atemu@lemmy.ml
            arrow-up
            9
            ·
            10 months ago

            It cannot bring back lost data. It can hallucinate something that is statistically likely given the context but I’m not aware of any tool which can do that to a useful degree.

            What’s the context? Why can’t you just get a better encode where the data isn’t lost?

              • Zpiritual@lemm.ee
                arrow-up
                6
                ·
                10 months ago

                Old mixtapes and the like can be noisy, with hisses, pops and so on. It is possible to filter out those artefacts, but that’s removing stuff, just as digitally compressing audio is removing stuff. You can’t create data from nowhere for digitally compressed files, and you can’t simply add the hiss, noise, and pops back to the mixtapes once you’ve removed them.

      • aleph@lemm.ee
        English
        arrow-up
        5
        ·
        edit-2
        10 months ago

        Going from 192kbps to 320kbps would be audibly negligible unless you used a really bad codec to begin with, in which case adding AI into the mix would likely just compound the problem.

        Probably not even worth it, tbh.

  • doctorzeromd
    arrow-up
    10
    ·
    10 months ago

    I work in audio and did my thesis on DSP, so I’ll try to explain this. It is an interesting idea, and in some cases it could work, but it wouldn’t be practically useful in most.

    So there are two types of audio encoding: lossy and lossless. All audio starts as lossless, and in many cases it is converted to lossy to reduce the file size. The processing for this is NOT like ordinary file compression (zip, for example), and is somewhat context-aware in that it removes frequencies you wouldn’t hear because something else is more present and keeps your ear from really hearing them (this is called masking).

    If you were to upscale something that is lossless, it would probably work. Barring any inter-sample peaks, you’d be inferring additional points in a waveform, and that’s fine. There are actually some audio plugins that do this as an intermediate step when processing a signal.
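
    As a minimal sketch of what “inferring additional points in a waveform” can look like, the snippet below doubles the sample count by linear interpolation. This is the simplest possible stand-in: real oversampling plugins typically use band-limited (filter-based) interpolation instead, and `upsample_2x_linear` is a made-up name for illustration.

```python
def upsample_2x_linear(samples):
    """Insert one linearly interpolated point between each pair of samples."""
    if not samples:
        return []
    out = []
    for current, nxt in zip(samples, samples[1:]):
        out.append(current)              # keep the original sample
        out.append((current + nxt) / 2)  # inferred midpoint
    out.append(samples[-1])              # last sample has no successor
    return out


print(upsample_2x_linear([0.0, 1.0, 0.0]))
```

    The original samples are kept untouched; only the in-between points are guessed, which is why this is harmless on lossless material.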

    If you try to upscale something that is lossy, you can’t recreate what was removed, because there isn’t a way to infer that information anymore. It would be like if you were trying to upscale a photo but you’d already removed a dog that was somewhat obscured by a man’s hand. Even if you upscale the picture you can’t add the dog without somebody telling you that it was there before removal.
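
    A toy illustration of why the removed information can’t be inferred: once two different originals encode to the same lossy result, nothing in that result says which one you started with. Coarse rounding is used here purely as a stand-in for a lossy codec (real codecs such as MP3 rely on psychoacoustic masking, not plain rounding), and `toy_lossy_encode` and the sample values are made up.

```python
def toy_lossy_encode(samples, step=0.25):
    """Round each sample to the nearest multiple of `step`, discarding detail."""
    return [round(s / step) * step for s in samples]


original_a = [0.10, 0.52, 0.91]
original_b = [0.05, 0.45, 0.95]

encoded_a = toy_lossy_encode(original_a)
encoded_b = toy_lossy_encode(original_b)

# Two different inputs collapse to the same output, so the encoding
# cannot be reversed: the decoder has no way to tell them apart.
print(encoded_a == encoded_b)
print(original_a == original_b)
```

    Any “upscaler” working from the encoded data can only pick a plausible original, not the actual one.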

    The other part of the equation is “why?”. While I’m a bit of an audiophile and I have my collection of lossless audio, the limitation of the system is typically the human ear. CD quality (16-bit at 44.1 kHz) is really all you’d ever need. Most people can’t hear above 20 kHz (if you’re over 18, you’re lucky if you even get close to that). In digital audio, you can reproduce any frequency up to half of the sample rate. With 44.1 kHz, that frequency is 22,050 Hz. If you want to go really crazy, there’s DVD quality (24-bit at 48 kHz). I consider anything above that nice to have from an archival and measurement standpoint, but there’s no point in terms of human listening.
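
    The numbers above can be checked with a short sketch. The helper names are made up for illustration, and stereo (two channels) is assumed for the bitrate figures.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at a given sample rate (half the rate)."""
    return sample_rate_hz / 2


def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels=2):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


for label, rate, depth in [("CD", 44_100, 16), ("DVD", 48_000, 24)]:
    print(f"{label}: Nyquist {nyquist_hz(rate):.0f} Hz, "
          f"{pcm_bitrate_kbps(rate, depth):.1f} kbps uncompressed")
```

    Both Nyquist limits sit above the roughly 20 kHz ceiling of human hearing, which is the point being made about CD quality.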

    • sabreW4K3@lemmy.tfOP
      English
      arrow-up
      7
      ·
      10 months ago

      I feel like you enjoyed writing this and even if you didn’t, I enjoyed reading it. Thank you for taking the time and putting in the effort.

      • doctorzeromd
        arrow-up
        4
        ·
        10 months ago

        I absolutely did! I was really hoping to teach a class about audio at a nearby university, but was told at the last minute that they can’t give it to me for bureaucratic reasons. It’s my dream to teach someday, so I’m hopeful that I will find another university that will let me teach.

        • sabreW4K3@lemmy.tfOP
          English
          arrow-up
          3
          ·
          10 months ago

          Based on the little experience I have of the way you talk about audio, I can say without a doubt that whoever you teach will count themselves lucky. You’re passionate and you speak clearly, neither overcomplicating things nor dumbing them down. I’m definitely going to keep an eye out for your posts in the audio topics, and I’ll keep my fingers crossed that you get another chance to teach!