Publications | Intelligent Media Processing Research Team | Research Teams

Publications

2026.09.27

William Chen, Shinnosuke Takamichi, Sayaka Shiota, Satoru Fukayama, Samuele Cornell, and Shinji Watanabe, "YODAS v3: Over 1 Million Hours of High-Bandwidth, Stereophonic, Multilingual Speech," in Proceedings of INTERSPEECH, Sept. 2026.

2026.09.27

Kentaro Onda, Satoru Fukayama, Daisuke Saito, and Nobuaki Minematsu, "Leveraging Soft Distributions of SSL-Derived Discrete Speech Tokens for Downstream Inference," in Proceedings of INTERSPEECH, Sept. 2026.

2026.09.27

Sota Koshino, Shotaro Ueji, Shinnosuke Takamichi, and Tomohiko Nakamura, "Automatic generation of audio comic from manga images," in Proceedings of INTERSPEECH, Show&Tell Session, Sept. 2026.

2026.09.27

Woan-Shiuan Chien, Tomohiko Nakamura, Huan-Yu Chen, Satoru Fukayama, Hitoshi Suda, Jun Ogata, and Chi-Chun Lee, "Two-sided fairness transfer for gender-neutral speech emotion recognition with partially observed attributes," in Proceedings of INTERSPEECH, Sept. 2026.

2026.09.27

Daigo Takizawa, Tomohiko Nakamura, Samuele Cornell, William Chen, Satoru Fukayama, and Shinji Watanabe, "Dissecting sensitivity to training language in self-supervised speech learning using neural audio codec tokens," in Proceedings of INTERSPEECH, Sept. 2026.

2026.09.07

Nobutaka Ito, "A unified complex spherical Student's t mixture model for directional statistics in mask-based blind speech separation," in Proceedings of International Workshop on Acoustic Signal Enhancement, Sept. 2026.

2026.09.07

Yuto Ishikawa, Norihiro Takamune, Kouei Yamaoka, Tomohiko Nakamura, and Hiroshi Saruwatari, "Joint optimization of demixing filters and asymmetric window function for independent vector analysis," in Proceedings of International Workshop on Acoustic Signal Enhancement, Sept. 2026.

2026.09.07

Ege Erdem, Shoichi Koyama, Tomohiko Nakamura, Orchisama Das, and Zoran Cvetkovic, "SF-Flow: Sound field magnitude estimation via flow matching guided by sparse measurements," in Proceedings of International Workshop on Acoustic Signal Enhancement, Sept. 2026.

2026.09.07

Tomohiko Nakamura, Wataru Nakata, Kanami Imamura, and Yuki Saito, "Neural audio codec with adjustable token temporal resolution using sampling-frequency-independent convolutional layers," in Proceedings of International Workshop on Acoustic Signal Enhancement, Sept. 2026.

2026.09.01

Kengo Takemoto, Tomohiko Nakamura, and Hiroshi Saruwatari, "Diffusion-based music audio editing system using differentiable digital signal processing mixture model," in Proceedings of International Conference on Digital Audio Effects, Demo Session, Sept. 2026.
