Manga109

A dataset of 109 Japanese manga volumes,
curated at The University of Tokyo for academic research.

A Japanese manga dataset built by the Aizawa-Yamasaki-Matsui Laboratory at The University of Tokyo.

109 Volumes
21,142 Pages
1970s – 2010s Era Coverage
87 Volumes in Manga109-s

A curated dataset
for manga research.

The Manga109 dataset was compiled by the Aizawa-Yamasaki-Matsui Laboratory at the Department of Information and Communication Engineering, Graduate School of Information Science and Technology, The University of Tokyo, for academic use in manga media processing research.

Manga109 consists of 109 volumes of manga drawn by professional Japanese manga artists. They were published between the 1970s and the 2010s, covering a wide range of target audiences and genres. Most of the volumes included are available on Manga Library Z (formerly known as the Out-of-Print Manga Library).

Manga109 has been authorized for research use by the authors under the condition that it is limited to academic, non-commercial purposes. In addition, 87 of the 109 volumes have been newly approved for commercial use. We provide this subset under the name Manga109-s.

At a glance
Provider
Aizawa-Yamasaki-Matsui Lab, UTokyo
Volumes
109 (Manga109) / 87 (Manga109-s)
Period
1970s – 2010s
Source
Manga Library Z

The Manga109 series of papers has been
widely cited by the research community.

Combined citations across Manga109, Comic Onomatopoeia, Manga109Dialog, MangaUB and related publications.

2,200+ total citations across the series

Rich annotations
beyond raw images.

01

Manga109

Bounding-box annotations for
characters, faces, frames, and text.

02

Comic Onomatopoeia

A dedicated dataset for
onomatopoeia in Japanese comics.

03

Manga109Dialog

Speaker–dialogue associations
across manga scenes.

04

MangaLMM & MangaVQA

A specialized LMM and VQA benchmark
for multimodal manga understanding.

The people
behind Manga109.

Manga109 was previously maintained by members of the Aizawa-Yamasaki-Matsui Lab.

The current organizers are listed below.

Kiyoharu Aizawa / 相澤 清晴

Professor Emeritus / Project Professor

The University of Tokyo

Jeonghun Baek / 白 定勳

Assistant Professor

The University of Tokyo

Atsuyuki Miyai / 宮井 淳行

PhD Student

The University of Tokyo

Shota Onohara / 小野原 菖太

Master's Student

The University of Tokyo

Latest updates.

  1. Our new paper on "MangaVQA and MangaLMM" has been accepted by EACL Findings 2026.

  2. A revised release with fixes for some corrupted images has been published.

  3. The comic panel viewing-time dataset (measured via experiments) has been released.

  4. Manga109Dialog, a dataset associating speakers with dialogue, has been released.

  5. The Comic Onomatopoeia dataset has been released.

  6. A revised release removing 17 duplicate images and correcting the corresponding annotations (along with two zero-area bounding boxes) has been published.

  7. A revised release correcting typos and other errors in the annotations has been published.

  8. Released manga109api, a Python API for reading the annotation data.
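One of the revised releases above mentions correcting zero-area bounding boxes. As a rough illustration of what such a check might look like, the sketch below flags boxes whose width or height is zero. The `Box` class, its field names, and the sample coordinates are assumptions made for this example; they are not taken from the dataset's annotation files or from manga109api.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical layout)."""
    xmin: int
    ymin: int
    xmax: int
    ymax: int

    @property
    def area(self) -> int:
        # Clamp negative extents to zero so inverted boxes also count as empty.
        width = max(0, self.xmax - self.xmin)
        height = max(0, self.ymax - self.ymin)
        return width * height


def find_zero_area(boxes):
    """Return the boxes that enclose no pixels."""
    return [b for b in boxes if b.area == 0]


# Made-up sample boxes: one valid, one with zero width, one with zero height.
boxes = [Box(10, 20, 110, 220), Box(50, 60, 50, 90), Box(5, 5, 40, 5)]
zero = find_zero_area(boxes)
print(len(zero), "zero-area boxes found")
```

A check like this could be run over each annotation category (characters, faces, frames, text) before training, so that degenerate boxes do not reach a detector's loss computation.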

License & Terms.

Manga109

The permission for the use of the works contained in the Manga109 dataset (hereby referred to as "the dataset"), including modifications of parts of the images, is granted from the authors of each of the works, under the following conditions. The dataset can be used solely for academic purposes.

Permitted uses

  • To perform experiments using the dataset.
  • To print the works contained in the dataset as a part of an academic paper.
  • Recording an academic paper that contains the work of the dataset, to a digital library, such as a digital library for academic conferences.
  • Use within digital media, such as demo videos created for the purpose of presenting academic results.

Attribution & restrictions

Redistribution of any part of the dataset to third parties is forbidden. When using parts of the dataset within academic papers or videos, please note the author's usage permission notice as "courtesy of [Author's Name]" (or "© [Author's Name]") and note that the work was cited from the Manga109 dataset. In addition, for use in academic papers, please cite the related papers shown below as well.

Disclaimer

In no event shall the distributors of the Manga109 dataset, and the authors of the manga works contained in the Manga109 dataset, be liable for any claim, damages or other liability, for the use of the Manga109 dataset.

Proper use of Manga109

Users are requested to strictly observe the following rules when using Manga109:

  1. The dataset is to be used for academic purposes by non-commercial organizations.
  2. Data shall not be transferred to a third party.
  3. When including manga material in an academic paper or video, users should include the relevant author's name as "courtesy of [Author's Name]" (or "© [Author's Name]").

Manga109-s

Among the works contained in the Manga109 dataset, 87 books are available for commercial use, under the following conditions. This special subset of the Manga109 dataset is available as the Manga109-s dataset (hereby referred to as "the dataset").

Permitted uses

  • Using the Manga109-s dataset for experiments for machine learning or image processing.
  • Printing the manga images within the dataset on an academic paper.
  • Recording an academic paper that contains the manga images within the dataset to a digital library.
  • Using the manga images within the dataset inside academic demo videos and other digital media.
  • Using results, or portions of results, obtained from machine learning experiments or image processing experiments, for commercial use.

Conditions

The above uses are permitted under the following conditions:

  1. Redistribution of the Manga109-s dataset to third parties is forbidden.
  2. When publishing results (including pre-trained models) obtained from machine learning experiments or image processing experiments, the use of the Manga109-s dataset must be indicated clearly within the published work.
  3. Selling manga images within the dataset together with results obtained from machine learning or image processing experiments is forbidden.
  4. Direct copies or modifications of the manga images within the Manga109-s dataset must not be treated as products, regardless of the product being either free or being sold for a fee.
  5. For all uses from Number 1 to 5 when publishing whole pages (or modifications of whole pages) of the manga works contained within the dataset for the purpose of presenting the results of research and development, the total number of whole pages (including modifications of whole pages) to be published must not exceed 20% of the entire book (volume), for each of the books (volumes) in the dataset. Publishing over 20% of whole pages or modifications of whole pages of any book (volume) is forbidden.
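The 20% limit in condition 5 can be worked through with simple arithmetic. The sketch below is one possible reading, assuming the limit is counted in whole pages per volume and rounded down. The function names and the page counts are hypothetical and are not actual Manga109-s figures.

```python
def max_publishable_pages(total_pages: int, percent_limit: int = 20) -> int:
    """Largest whole-page count that does not exceed the percentage limit."""
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return total_pages * percent_limit // 100


def within_limit(published_pages: int, total_pages: int) -> bool:
    """True if the published page count stays at or under the limit."""
    return published_pages <= max_publishable_pages(total_pages)


# Hypothetical volume lengths, used only to show the arithmetic.
for pages in (194, 200):
    print(pages, "pages ->", max_publishable_pages(pages), "publishable")
```

Under this reading, a 194-page volume allows at most 38 whole pages (39 would be about 20.1%), and the count is tracked separately for each volume rather than across the dataset.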

Attribution & disclaimer

Redistribution of the dataset to third parties is forbidden. When using parts of the dataset within academic papers or videos, please note the author's usage permission notice as "courtesy of [Author's Name]" (or "© [Author's Name]") and note that the work was cited from the Manga109-s dataset. In addition, for use in academic papers, please cite the related papers shown below as well.

In no event shall the distributors of the Manga109-s dataset, and the authors of the manga works contained in the Manga109-s dataset, be liable for any claim, damages or other liability, for the use of the Manga109-s dataset.

Ready to dive into
109 volumes of manga?

Choose Manga109 or Manga109-s depending on your research use case.

We extend our thanks to Mr. Ken Akamatsu and to the authors for their cooperation in the creation of Manga109. We also acknowledge the support of the Strategic Information and Communications R&D Promotion Programme of the Japan Ministry of Internal Affairs and Communications (SCOPE) and KAKENHI in the construction of this dataset.