Papers
Ryoko Arita, Joonyong Park, Wataru Nakata, Yuki Saito, and Hiroshi Saruwatari, "CodecMOS: Singing MOS prediction through the integration of self-supervised speech representations and neural audio codec features," Acoustical Science and Technology, no. e25.101, Apr. 2026.
Wataru Nakata, Yuki Saito, Kazuki Yamauchi, Emiru Tsunoo, and Hiroshi Saruwatari, "DialogueSidon: Recovering full-duplex dialogue tracks from in-the-wild dialogue audio," in Proceedings of SIGDIAL, Aug. 2026.
Shunsuke Yoshida, Chen Yu-Hua, and Satoru Fukayama, "UT-AISTimprt submission for ICME 2026 Grand Challenge on Academic Text-to-Music Generation," in Proceedings of ICME, Jul. 2026.
William Chen, Shinnosuke Takamichi, Sayaka Shiota, Satoru Fukayama, Samuele Cornell, and Shinji Watanabe, "YODAS v3: Over 1 Million Hours of High-Bandwidth, Stereophonic, Multilingual Speech," in Proceedings of INTERSPEECH, Sept. 2026.
Kentaro Onda, Satoru Fukayama, Daisuke Saito, and Nobuaki Minematsu, "Leveraging Soft Distributions of SSL-Derived Discrete Speech Tokens for Downstream Inference," in Proceedings of INTERSPEECH, Sept. 2026.
Sota Koshino, Shotaro Ueji, Shinnosuke Takamichi, and Tomohiko Nakamura, "Automatic generation of audio comic from manga images," in Proceedings of INTERSPEECH, Show&Tell Session, Sept. 2026.
Woan-Shiuan Chien, Tomohiko Nakamura, Huan-Yu Chen, Satoru Fukayama, Hitoshi Suda, Jun Ogata, and Chi-Chun Lee, "Two-sided fairness transfer for gender-neutral speech emotion recognition with partially observed attributes," in Proceedings of INTERSPEECH, Sept. 2026.
Daigo Takizawa, Tomohiko Nakamura, Samuele Cornell, William Chen, Satoru Fukayama, and Shinji Watanabe, "Dissecting sensitivity to training language in self-supervised speech learning using neural audio codec tokens," in Proceedings of INTERSPEECH, Sept. 2026.
Ren Uchida, Kohei Yatabe, and Tomohiko Nakamura, "Encoder-masking-decoder networks using orthogonal convolutional layer as invertible linear encoder," Acoustical Science and Technology, no. e26.10, May 2026.
Kentaro Onda, Satoru Fukayama, Daisuke Saito, and Nobuaki Minematsu, "Advanced Modeling of Interlanguage Speech Intelligibility Benefit with L1-L2 Multi-Task Learning Using Differentiable K-Means for Accent-Robust Discrete Token-Based ASR," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, May 2026.

