senooken JP Social
senooken JP Social is senooken's dedicated decentralized social network.

Conversation

Notices

  1. Kenzi NOIKE (mstdn.jp) (求職中) (knoike@mstdn.jp)'s status on Monday, 25-Feb-2019 05:57:37 JST

    | [1902.06797] End-to-end Lyrics Alignment for Polyphonic Music Using an Audio-to-Character Recognition Model https://arxiv.org/abs/1902.06797

    Attachments

    1. End-to-end Lyrics Alignment for Polyphonic Music Using an Audio-to-Character Recognition Model
      from arXiv.org
      Time-aligned lyrics can enrich the music listening experience by enabling karaoke, text-based song retrieval and intra-song navigation, and other applications. Compared to text-to-speech alignment, lyrics alignment remains highly challenging, despite many attempts to combine numerous sub-modules including vocal separation and detection in an effort to break down the problem. Furthermore, training required fine-grained annotations to be available in some form. Here, we present a novel system based on a modified Wave-U-Net architecture, which predicts character probabilities directly from raw audio using learnt multi-scale representations of the various signal components. There are no sub-modules whose interdependencies need to be optimized. Our training procedure is designed to work with weak, line-level annotations available in the real world. With a mean alignment error of 0.35s on a standard dataset our system outperforms the state-of-the-art by an order of magnitude.

    senooken JP Social is a social network, courtesy of senooken. It runs on GNU social, version 2.0.2-beta0, available under the GNU Affero General Public License.

    All senooken JP Social content and data are available under the Creative Commons Attribution 3.0 license.