Next: Concluding Remarks
Up: Experimental Results
Previous: (2) Experimental Results
To reduce the number of vector bases, generalization by granularity
and generalization by weight were applied. Fig. 5 shows the relation
between the number of bases and information retrieval performance,
together with the value of the evaluation function.
From this figure, we obtained the minimum number of vector bases at
which information retrieval performance falls no more than 10% or 20%
below its maximum value. The results are shown in Table 1.
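The selection rule described above can be sketched as follows. This is only an illustrative sketch, not the procedure used in the experiments: the function name `min_bases_within_tolerance` is hypothetical, the sample values are made up and are not the measurements behind Fig. 5, and the "10% or 20%" drop is assumed to be relative to the maximum performance.

```python
from typing import Sequence


def min_bases_within_tolerance(
    num_bases: Sequence[int],
    performance: Sequence[float],
    tolerance: float,
) -> int:
    """Return the smallest number of bases whose performance is at least
    (1 - tolerance) times the maximum observed performance.

    Assumption: the tolerated drop is relative to the maximum value.
    """
    if not num_bases or len(num_bases) != len(performance):
        raise ValueError("num_bases and performance must be non-empty and equal in length")
    threshold = (1.0 - tolerance) * max(performance)
    # Keep every setting that stays within the tolerated drop,
    # then pick the one with the fewest bases.
    candidates = [n for n, p in zip(num_bases, performance) if p >= threshold]
    return min(candidates)


# Hypothetical sample data (NOT taken from the paper).
sample_bases = [50, 100, 200, 400, 800]
sample_scores = [0.20, 0.33, 0.41, 0.46, 0.50]

for tol in (0.10, 0.20):
    n = min_bases_within_tolerance(sample_bases, sample_scores, tol)
    print(f"tolerance {tol:.0%}: minimum number of bases = {n}")
```

If the performance curve is not monotonic, this rule may select a small number of bases even though some larger setting falls below the threshold; whether that case should be excluded is a design choice the text does not specify.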
From these figures and the table, the following observations can be made:
(1) S-VSM is more robust than W-VSM to a reduction in the number of vector bases.
(2) In particular, generalization by weight is more robust than generalization by granularity.
Fig. 5: Determination of Minimum Number of Vector Bases
Table 1: Minimum Number of Vector Bases
Provided that information retrieval performance falls no more than
10% to 20% below its maximum value, conventional W-VSM requires 2,000
dimensions. By comparison, S-VSM can reduce the number of dimensions
to 300-600.
Jin'ichi Murakami
October 5, 2001