AIZAWA Kiyoharu Professor
Hongo Campus
Media Interaction
Multimedia, Image Processing, Computer Vision, Pattern Recognition
On the fundamental side, we investigate recognition and learning problems in open-world settings and the use of large-scale vision and language models. On the application side, we work on 360-degree video processing for a real-world metaverse, food computing for FoodLog, and comic computing for comic books, among other topics.
Research field 1
Image recognition / learning foundation, open world
Deep learning works accurately on closed datasets that contain a large amount of data per class. In reality, however, unknown classes and new classes with only a small amount of data appear frequently. We are investigating identification and recognition techniques for such open-world situations. Topics include methods for learning from noisy training data, out-of-distribution detection, positive-unlabeled learning, open-set learning, new category discovery, and uncertainty estimation.
Research field 2
360 Image/Video Processing, 3D, Movie Map, Real World Metaverse
We are investigating 360° image processing. Specifically, we are building a “movie map” that lets walkers explore a city. Using 360° street videos, we work on many research issues, such as 360° hyperlapse video, object detection in 360° images, accurate vSLAM, intersection detection, depth estimation from 360° images, RoI detection, route view generation, 360° super-resolution, detection of possible directions of travel, avatars in 360° video, and a real-world metaverse. We are prototyping a platform for virtual exploration.
Research field 3
Life Logging, Food Computing
We have been pioneering lifelogging technology. To pursue lifelogging for a specific purpose, we focus on the capture and analysis of daily food logs (FoodLog). The app we developed has captured more than 10 million food records. We are investigating various ways of processing FoodLog data, such as personalized food recognition, multimodal analysis of recipes and food records, building a new tool for athletes and dietitians, and prediction of health indices.
Research field 4
Manga Comic Computing
Manga, a unique part of our culture, is our research target. We have built a manga dataset at the world's largest scale, and we investigate fundamental image processing techniques for it, such as retrieval, segmentation, recognition, colorization, and creator style transfer. We are also investigating how people read manga.
Research field 5
Deep Image Compression, Image Generation, Scene Text Recognition
We investigate new image compression techniques using deep learning. Although image compression is a long-established area of signal processing, many research issues remain. We investigate universal deep compression that adapts to a variety of content outside the training domain. For image generation, we investigate a diffusion-model-based framework that allows easy modification of the results. We have also built a dataset of manga onomatopoeia, which poses one of the most challenging problems in scene text recognition.