Towards accurate recognition of historical Arabic manuscripts: a novel dataset and a generalizable pipeline
Journal article
| Authors | Bouchal, H., Ahror, B. and Meziane, F. |
|---|---|
| Abstract | In today’s digital world, we are committed to digitizing thousands of handwritten transcriptions to preserve their content. Historical Arabic Handwritten Text Recognition (HAHTR) remains a challenge for computer vision systems, due to the many difficulties inherently associated with document image quality and the complexity of Arabic script. In this work, we address the problem of recognizing historical Arabic documents with an approach that adapts to different writing styles and degrees of legibility. We developed a system that is able to recognize a whole page of historical Arabic handwritten text in two consecutive steps: text line detection and recognition. The proposed approach performs detection using bounding boxes, followed by a neural network-based model for character-level text recognition. However, the lack of data hinders the mass digitization of Arabic historical documents. Therefore, we provide a new and freely available dataset, focusing on diverse handwriting styles |
| Keywords | Text Detection; Handwritten Text Recognition; Arabic Historical Documents; CNN-BLSTM; Arabic Dataset |
| Year | 2025 |
| Journal | ACM Transactions on Asian and Low-Resource Language Information Processing |
| Publisher | ACM |
| ISSN | 2375-4699 |
| Digital Object Identifier (DOI) | https://doi.org/10.1145/3744243 |
| Web address (URL) | https://dl.acm.org/doi/10.1145/3744243 |
| Accepted author manuscript | Access level: Open |
| Output status | Published |
| Publication dates | |
| Online | 08 Jun 2025 |
| Publication process dates | |
| Accepted | 27 May 2025 |
| Deposited | 01 Jul 2025 |
https://repository.derby.ac.uk/item/qy6v4/towards-accurate-recognition-of-historical-arabic-manuscripts-a-novel-dataset-and-a-generalizable-pipeline