Publications
Tomohiko Nakamura, Kwanghee Choi, Keigo Hojo, Yoshiaki Bando, Satoru Fukayama, and Shinji Watanabe, "Discrete speech unit extraction via independent component analysis," in SALMA: Speech and Audio Language Models - Architectures, Data Sources, and Training Paradigms, IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops, Apr. 2025.
Hitoshi Suda, Shunsuke Yoshida, Tomohiko Nakamura, Satoru Fukayama, and Jun Ogata, "FruitsMusic: A real-world corpus of Japanese idol-group songs," in Proceedings of the 25th International Society for Music Information Retrieval Conference (ISMIR), 2024.
Hiroaki Hyodo, Shinnosuke Takamichi, Tomohiko Nakamura, Junya Koguchi, and Hiroshi Saruwatari, "DNN-based ensemble singing voice synthesis with interactions between singers," in Proceedings of IEEE Spoken Language Technology Workshop, Dec. 2024.
Yuto Ishikawa, Osamu Take, Tomohiko Nakamura, Norihiro Takamune, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Real-time noise estimation for Lombard-effect speech synthesis in human-avatar dialogue systems," in Proceedings of Asia Pacific Signal and Information Processing Association Annual Summit and Conference, Dec. 2024.
Kanami Imamura, Tomohiko Nakamura, Kohei Yatabe, and Hiroshi Saruwatari, "Neural analog filter for sampling-frequency-independent convolutional layer," APSIPA Transactions on Signal and Information Processing, Dec. 2024.
Yuto Ishikawa, Tomohiko Nakamura, Norihiro Takamune, and Hiroshi Saruwatari, "Real-time framework for speech extraction based on independent low-rank matrix analysis with spatial regularization and rank-constrained spatial covariance matrix estimation," in Proceedings of Workshop on Spoken Dialogue Systems for Cybernetic Avatars (SDS4CA), Sep. 2024.
Hitoshi Suda, Aya Watanabe, and Shinnosuke Takamichi, "Who finds this voice attractive? A large-scale experiment using in-the-wild data," in Proceedings of INTERSPEECH, 2024.
Kwanghee Choi, Ankita Pasad, Tomohiko Nakamura, Satoru Fukayama, Karen Livescu, and Shinji Watanabe, "Self-supervised speech representations are more phonetic than semantic," in Proceedings of INTERSPEECH, 2024.
Yoshiaki Bando, Tomohiko Nakamura, and Shinji Watanabe, "Neural blind source separation and diarization for distant speech recognition," in Proceedings of INTERSPEECH, 2024.
Takaaki Saeki, Shinnosuke Takamichi, Tomohiko Nakamura, Naoko Tanji, and Hiroshi Saruwatari, "SelfRemaster: Self-supervised speech restoration for historical audio resources," IEEE Access, 2023.