Intelligent Media Processing Research Team

Team Outline

Our research team is developing AI technologies that can recognize and understand various "media", such as audio, video, text, and other sensory data, in an integrated manner. Through research and development on a variety of real-world data, we aim to support a wide range of fields, including not only human behavior analysis but also industrial machinery and infrastructure.

Satoru Fukayama
Team Leader

Information

2024.10.26

We will present a demo "What does your voice sound like?" at the AIST Open Day 2024 at AIST Tokyo Waterfront.
Saturday, October 26th, 2024, 10am to 4pm (last admission 3:30pm)

2022.10.26

Yoshihiro Sato, Satoru Fukayama, and Jun Ogata will be presenting at the Seismological Society of Japan, Fall Meeting 2022.
Fault Plane Estimation from 3D Hypocenter Distribution by Two-step Clustering Considering Local Shapes

2022.09.14

Hiroki Karatsu and Satoru Fukayama will be presenting at the IPSJ SIGMUS 135th meeting.
Evaluation of Data Augmentation for DeepBach-based Automatic Four-part Harmonisation


List of Publications

2026.09.27

William Chen, Shinnosuke Takamichi, Sayaka Shiota, Satoru Fukayama, Samuele Cornell, and Shinji Watanabe, "YODAS v3: Over 1 Million Hours of High-Bandwidth, Stereophonic, Multilingual Speech," in Proceedings of INTERSPEECH, Sept. 2026.

2026.09.27

Kentaro Onda, Satoru Fukayama, Daisuke Saito, and Nobuaki Minematsu, "Leveraging Soft Distributions of SSL-Derived Discrete Speech Tokens for Downstream Inference," in Proceedings of INTERSPEECH, Sept. 2026.

2026.09.27

Sota Koshino, Shotaro Ueji, Shinnosuke Takamichi, and Tomohiko Nakamura, "Automatic generation of audio comic from manga images," in Proceedings of INTERSPEECH, Show&Tell Session, Sept. 2026.


Researcher Profile

Name and role	Field of Expertise	E-mail address / Website
Team Leader Satoru Fukayama	Media Informatics, Acoustic Signal Processing, Music Informatics	s.fukayama[at]aist.go.jp https://sites.google.com/view/sfukayama/
Senior Researcher Tomohiko Nakamura	Signal-processing-inspired deep learning, Audio signal processing, Music information processing	tomohiko-nakamura[at]aist.go.jp https://tomohikonakamura.github.io/Tomohiko-Nakamura/index.html
Senior Researcher Nobutaka Ito	Acoustic Signal Processing, Source Separation, Array Signal Processing	nobutaka.itou[at]aist.go.jp https://nobutaka-ito.github.io/index.html
Researcher Hitoshi Suda	Spoken language processing, Singing information processing	suda.h[at]aist.go.jp https://gavo.t.u-tokyo.ac.jp/~hitoshi/
Cross-appointment Fellow Yuki Saito	Speech synthesis, Speech quality assessment	yuuki.saitou[at]aist.go.jp
AI Engineer Daigo Takizawa	AI System	daigo.takizawa[at]aist.go.jp
Post-Doctoral Researcher Kai Hiraiwa	Acoustic Signal Processing, Music Information Processing	kai.hiraiwa[at]aist.go.jp
Post-Doctoral Researcher Yu-Hua Chen	Music information processing, Audio effect modeling, guitar music information retrieval	yh.chen[at]aist.go.jp
Research Assistant Hiroki Karatsu	Music Information Processing	karatsu-hiroki[at]aist.go.jp
Research Assistant Kanami Imamura	Audio signal processing	kanami-imamura[at]aist.go.jp
Research Assistant Kentaro Onda	Audio signal processing, Speech synthesis	k.onda[at]aist.go.jp
Research Assistant Shun Takahashi	Spoken Language Processing	takahashi.shun.tq9[at]aist.go.jp
Research Assistant Riki Takizawa	Sound synthesis	takizawa.riki[at]aist.go.jp
Research Assistant Natsuki Toda	Speech production	natsuki.toda[at]aist.go.jp
Research Assistant Ryoko Arita	Speech synthesis, Singing voice synthesis	ryoko1119-arita[at]aist.go.jp
Research Assistant Wataru Nakata	Speech synthesis	nakata-wataru855[at]aist.go.jp
Invited Researcher (temporarily transferred to METI) Jun Ogata	Spoken Language Processing, Time Series Processing	jun.ogata[at]aist.go.jp
Invited Researcher (Prof. of Waseda University) Tetsuji Ogawa	Speech and Audio Processing, Anomaly Detection
Invited Researcher (Assoc. prof. of Keio University) Shinnosuke Takamichi
Invited Researcher (Assoc. prof. of Tokyo Metropolitan University) Sayaka Shiota	Speech Signal Processing
Invited Researcher (Prof. of Chubu University) Kazuo Yamamoto	Lightning Protection, Anomaly detection of wind turbines	kyamamoto[at]chubu.ac.jp https://pfs.chubu.ac.jp/faculty/yamamoto-kazuo/
Visiting Researcher (National Institute of Informatics) Yusuke Yasuda	Speech information processing	yasuda.yusuke[at]aist.go.jp https://researchmap.jp/yusuke_yasuda