https://gitlab.com/christosangel/sapo3

  • Sapo3 is a suite of scripts that helps the user convert a text file to an audio file.

  • It uses the edge-tts API for text-to-speech conversion.

  • Large text files can easily be converted to audiobooks, with a wide range of customization options.

When the user runs Sapo3, they will be presented with a menu of options:

  • o option: Fix name pronunciation with Fix Names

  • c option: Split text to chapters with Chapterize

  • v option: Convert file to audio

  • f option: Check the outcome of every sentence with Fix Audio

  • m option: Merge audio files

  • p option: Configure preferences
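
As a rough illustration of what a "split text to chapters" step might involve, here is a minimal Python sketch. It is not taken from Sapo3 and may differ from how Chapterize actually works: the function name `split_into_chapters` and the heading pattern ("Chapter" followed by a number) are assumptions made purely for the example.

```python
import re

# Assumed heading pattern: a line such as "Chapter 1" or "CHAPTER 12: Title".
# Real books vary widely, so this regex is illustrative only.
CHAPTER_RE = re.compile(r"^[ \t]*chapter\s+\d+\b.*$", re.IGNORECASE | re.MULTILINE)


def split_into_chapters(text):
    """Return a list of (title, body) tuples, one per detected chapter."""
    matches = list(CHAPTER_RE.finditer(text))
    if not matches:
        # No headings found: treat the whole text as a single chunk.
        return [("Full text", text.strip())]

    chapters = []
    # Keep any text that precedes the first heading as a front-matter chunk.
    preface = text[: matches[0].start()].strip()
    if preface:
        chapters.append(("Preface", preface))

    for i, match in enumerate(matches):
        # A chapter body runs from the end of its heading to the next heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        title = match.group(0).strip()
        body = text[match.end():end].strip()
        chapters.append((title, body))
    return chapters


if __name__ == "__main__":
    sample = "Intro line.\nChapter 1\nFirst body.\nChapter 2: Two\nSecond body.\n"
    for title, body in split_into_chapters(sample):
        print(f"{title!r}: {len(body)} characters")
```

Each resulting chunk could then be written to its own text file and converted to audio separately, which keeps individual audio files short and makes a failed conversion cheaper to redo.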

  • schizo
    18 hours ago

    Neat idea, but the send-your-text-to-Microsoft bit of it is uh, well, no thank you.

    Seems like a strange choice, personally, but I’m not a fan of sending tech corporations anything avoidable.

    • christos@lemmy.worldOP
      18 hours ago

      I totally understand what you are saying. Initially, the original project used local text-to-speech, but the result was less than perfect, and the conversion was slower and CPU-costly.

      You can check it out here https://gitlab.com/christosangel/sapo

      Once a FOSS solution gets better and more usable, swapping out the TTS conversion will not be a big deal.

          • lime!@feddit.nu
            5 hours ago

            i believe that’s what speech-dispatcher is; a uniform interface for tts systems.

            • christos@lemmy.worldOP
              5 hours ago

              speech-dispatcher

              If you are referring to locally generated speech synthesis, the resulting output, as far as I am concerned, generally sounds poorer and is more difficult to manage. However, you can check out the original project, https://gitlab.com/christosangel/sapo, where the audio files are generated locally.

              • lime!@feddit.nu
                4 hours ago

                well, speech-dispatcher has no synthesis component; you can plug in any tts engine that follows the interface. it’s nice to be able to choose an engine just by implementing support for it. personally i use piper, which i feel performs pretty well.

                • christos@lemmy.worldOP
                  4 hours ago

                  piper

                  Indeed, piper performs very well. Thank you for the input; I will most certainly consider adding an option to select the TTS engine in the near future. piper sounds totally worth it.

      • schizo
        14 hours ago

        I’m somewhat surprised that there aren’t a lot of good alternatives but uh, yeah, there don’t seem to be.

        I would have expected there to be at least one or two good TTS engines but I guess that assumption is quite wrong.

        As to your other post, it’s less that I care in any specific sense that Microsoft knows what I’m reading and more of an (admittedly irrational) dislike of providing anything that an ad company could maybe later use to sell me shit.