Satoru Ikehara*, Jin'ichi Murakami*, Yasuhiro Kimoto*, Teturou Araki**
* {ikehara, murakami, kimoto}@ike.tottori-u.ac.jp
Faculty of Engineering, Tottori University
Minami 4-101, Tottori-city, 680-8552 Japan
** araki@knowipc.fuee.fukui-u.ac.jp
Department of Human and Artificial Intelligent Systems
Fukui University
Bunkyou 3-9-1, Fukui, Fukui 910-8507, Japan
The attribute system is a tree structure of 2,710 semantic attributes that covers 400 thousand words. With this attribute system, vector elements can be generalized along the upper-lower relationships of the semantic attributes, so the dimension of the vector space can be reduced at very low cost. Synonyms are handled automatically through their semantic attributes, which improves the recall performance of retrieval systems.
Experiments applying the method to the BMIR-J2 database of 5,079 newspaper articles showed that the dimension can be reduced from 2,710 to 300 or 600 with only a small degradation in performance. The method also showed high recall performance compared with the conventional VSM.
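
The generalization of vector elements along the attribute tree can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy hierarchy `PARENT`, the function names `generalize` and `cosine`, and the choice to sum the weights of lower attributes into the retained upper attribute. It is not the attribute system or the weighting scheme used in the experiments.

```python
import math

# Hypothetical toy hierarchy: each attribute maps to its upper attribute.
# The root maps to None. The real system has 2,710 attributes.
PARENT = {
    "noun": None,
    "concrete": "noun",
    "abstract": "noun",
    "animal": "concrete",
    "plant": "concrete",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
    "idea": "abstract",
}


def generalize(vector, parent, kept):
    """Fold each attribute's weight into its nearest retained ancestor.

    `vector` maps attribute -> weight, `kept` is the set of attributes that
    remain as dimensions after reduction. Summing weights is an assumption.
    """
    reduced = {}
    for attr, weight in vector.items():
        node = attr
        # Walk upward until we reach an attribute that is retained.
        while node is not None and node not in kept:
            node = parent[node]
        if node is None:
            raise ValueError(f"no retained ancestor for {attr!r}")
        reduced[node] = reduced.get(node, 0.0) + weight
    return reduced


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


doc = {"dog": 2.0, "tree": 1.0}
query = {"cat": 1.0}
kept = {"animal", "plant", "abstract"}

before = cosine(doc, query)
after = cosine(generalize(doc, PARENT, kept), generalize(query, PARENT, kept))
print(before, after)
```

In this toy case the document and the query share no attribute before reduction, so their similarity is zero. After both are generalized to the retained attributes, "dog" and "cat" fall under the same upper attribute and the similarity becomes positive, which mirrors how related words can be matched through shared semantic attributes while the number of dimensions shrinks.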