Synthetic Sample Extension in Implementation of Tangut Character Databases



Abstract

The Tangut script was a logographic writing system used for the extinct Tangut language of the Western Xia Dynasty (1038 to 1227). Optical character recognition, machine learning, and computer vision can greatly help in deciphering the characters in the ancient scripts. All of these techniques, however, depend on a character database that provides training samples and test standards. While building the Tangut character databases with the ancient Tangut scripts as the data source, the authors found that imbalanced class distribution significantly compromises the performance of learning algorithms. This paper proposes a method of synthetic sample generation to improve the learning and recognition of Tangut characters. Recognition accuracy is compared between learning on the original data set and learning on the synthetically extended data set, and the proposed method shows a clear advantage. The organization of the Tangut character databases is also described.
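The abstract does not say how the synthetic samples are generated, so the following is only a minimal illustrative sketch of the general idea of extending under-represented character classes, not the authors' method. It assumes glyphs are small binary images stored as lists of rows, and it uses random pixel shifts as a stand-in for whatever generation technique the paper actually applies. The names `shift_glyph` and `balance_by_synthesis` are hypothetical.

```python
import random
from collections import Counter


def shift_glyph(glyph, dx, dy):
    """Translate a binary glyph (list of rows of 0/1) by (dx, dy), padding with 0."""
    h, w = len(glyph), len(glyph[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = glyph[sy][sx]
    return out


def balance_by_synthesis(samples, max_shift=1, seed=0):
    """Return a new sample list in which every class is as large as the largest one.

    samples: list of (label, glyph) pairs. Added samples are randomly shifted
    copies of real glyphs from the same class (assumed stand-in technique).
    A zero shift may occasionally produce an exact duplicate.
    """
    rng = random.Random(seed)
    by_label = {}
    for label, glyph in samples:
        by_label.setdefault(label, []).append(glyph)
    target = max(len(glyphs) for glyphs in by_label.values())

    balanced = list(samples)
    for label, glyphs in by_label.items():
        for _ in range(target - len(glyphs)):
            base = rng.choice(glyphs)
            dx = rng.randint(-max_shift, max_shift)
            dy = rng.randint(-max_shift, max_shift)
            balanced.append((label, shift_glyph(base, dx, dy)))
    return balanced


# Toy data: class "A" has four samples, class "B" only one.
glyph_a = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
glyph_b = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
data = [("A", glyph_a)] * 4 + [("B", glyph_b)]

balanced = balance_by_synthesis(data)
counts = Counter(label for label, _ in balanced)
print(counts)
```

After balancing, both toy classes contain the same number of samples, so a classifier trained on `balanced` no longer sees one class far more often than the other. A realistic pipeline would likely use richer deformations than pixel shifts; that choice is left open here.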

About the authors

Yifei Meng

School of Electronic and Information Engineering Beijing Jiaotong University; School of Physics and Electronic-Electrical Engineering Ningxia University

Corresponding author.
Email: river_dance@163.com
China, Beijing; Yinchuan

Xue Yuan

School of Electronic and Information Engineering Beijing Jiaotong University

Email: river_dance@163.com
China, Beijing

Xueye Wei

School of Electronic and Information Engineering Beijing Jiaotong University

Email: river_dance@163.com
China, Beijing

Wenhui Yang

School of Physics and Electronic-Electrical Engineering Ningxia University

Email: river_dance@163.com
China, Yinchuan

Yan Chen

School of Physics and Electronic-Electrical Engineering Ningxia University

Email: river_dance@163.com
China, Yinchuan


© Allerton Press, Inc., 2018