Publications
Hitoshi Suda, Shinnosuke Takamichi, and Satoru Fukayama, "Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora," in Proceedings of INTERSPEECH, Aug. 2025.
Aogu Wada, Tomohiko Nakamura, and Hiroshi Saruwatari, "Hyperbolic embeddings for order-aware classification of audio effect chains," in Proceedings of International Conference on Digital Audio Effects, Sep. 2025.
Tomohiko Nakamura, Kwanghee Choi, Keigo Hojo, Yoshiaki Bando, Satoru Fukayama, and Shinji Watanabe, "Discrete speech unit extraction via independent component analysis," in SALMA: Speech and Audio Language Models - Architectures, Data Sources, and Training Paradigms, IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops, Apr. 2025.
Hitoshi Suda, Shunsuke Yoshida, Tomohiko Nakamura, Satoru Fukayama, and Jun Ogata, "FruitsMusic: A Real-World Corpus of Japanese Idol-Group Songs," in Proceedings of the 25th International Society for Music Information Retrieval (ISMIR) Conference, 2024.
Hiroaki Hyodo, Shinnosuke Takamichi, Tomohiko Nakamura, Junya Koguchi, and Hiroshi Saruwatari, "DNN-based ensemble singing voice synthesis with interactions between singers," in Proceedings of IEEE Spoken Language Technology Workshop, Dec. 2024.
Yuto Ishikawa, Osamu Take, Tomohiko Nakamura, Norihiro Takamune, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Real-time noise estimation for Lombard-effect speech synthesis in human-avatar dialogue systems," in Proceedings of Asia Pacific Signal and Information Processing Association Annual Summit and Conference, Dec. 2024.
Kanami Imamura, Tomohiko Nakamura, Kohei Yatabe, and Hiroshi Saruwatari, "Neural analog filter for sampling-frequency-independent convolutional layer," APSIPA Transactions on Signal and Information Processing, Dec. 2024.
Yuto Ishikawa, Tomohiko Nakamura, Norihiro Takamune, and Hiroshi Saruwatari, "Real-time framework for speech extraction based on independent low-rank matrix analysis with spatial regularization and rank-constrained spatial covariance matrix estimation," in Proceedings of Workshop on Spoken Dialogue Systems for Cybernetic Avatars (SDS4CA), Sep. 2024.
Hitoshi Suda, Aya Watanabe, and Shinnosuke Takamichi, "Who finds this voice attractive? A large-scale experiment using in-the-wild data," in Proceedings of INTERSPEECH, 2024.
Kwanghee Choi, Ankita Pasad, Tomohiko Nakamura, Satoru Fukayama, Karen Livescu, and Shinji Watanabe, "Self-supervised speech representations are more phonetic than semantic," in Proceedings of INTERSPEECH, 2024.