2.
A subjective evaluation experiment was conducted to analyze the factors affecting the perception of bumpiness. In this experiment, bumpiness was defined as the bumpy appearance of an object’s surface, and each observer was asked to rate it according to their own subjective impression. Section 2.1 describes the experimental method, and Section 2.2 presents the analysis of the experimental results.
2.2
Results
Each experimental session lasted 61 min on average, and observers could rest at arbitrary times. Figure 2 shows the average score of each test image obtained in the experiment of Section 2.1. Error bars in Fig. 2(a) and 2(b) indicate within-subject and between-subject deviations, respectively. In the figure, the mean scores are sorted in ascending order. The averages of the within-subject and between-subject deviations were 0.42 and 0.55, respectively, confirming that the observers’ evaluations were stable. The maximum standard deviation (SD) was 1.16, and no outliers were found in the measured data when an outlier was defined as a value whose residual exceeded 3 SD. The SD was large for texture images of rock surfaces and of walls made of bricks and stones. This may be because the bumpiness of these textures varied from place to place, so the evaluation differed depending on the region of interest. In contrast, the SD of the evaluation values was small for images with a weak perception of bumpiness, such as images of paper, metal, and smooth-surfaced boards, regardless of the region of interest.
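The 3 SD outlier criterion mentioned above can be illustrated with a minimal sketch. The paper does not state how the residuals or the SD were computed, so the version below assumes that the residual is the deviation of a rating from the mean rating of the same image and that the SD is the per-image sample standard deviation. The function name `flag_outliers` and the example ratings are hypothetical.

```python
import numpy as np

def flag_outliers(ratings, k=3.0):
    """Flag ratings whose residual from the per-image mean exceeds k SD.

    ratings: array of shape (n_ratings, n_images).
    Assumption: the SD is the sample SD of each image's ratings.
    """
    r = np.asarray(ratings, dtype=float)
    residuals = r - r.mean(axis=0)      # deviation from each image's mean
    sd = r.std(axis=0, ddof=1)          # per-image sample SD
    return np.abs(residuals) > k * sd

# Hypothetical ratings on a 1-5 scale: 20 ratings for each of 2 images.
ratings = np.ones((20, 2))
ratings[0, 0] = 5.0        # a single extreme rating for image 0
ratings[::2, 1] = 2.0      # image 1 alternates between 2 and 1
mask = flag_outliers(ratings)
```

With few ratings per image, a single value cannot exceed 3 SD at all (the maximum attainable z-score grows with the sample size), so the criterion is only informative when enough ratings are available.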
Figure 2.
Scores of bumpiness for 50 images obtained from the experiment in Section 2.1.
The results obtained in this experiment showed that the amplitude and size of the bumps had a significant impact on the perceived bumpiness, and the perceived bumpiness was not affected by the number of bumps. There was also a tendency for sharper images to be rated higher in terms of bumpiness when the amplitude, size, and randomness of the bumps were of similar magnitude, and lower when the bumps varied smoothly.
First, a linear regression model was used to determine whether the statistics used in Ref. [9] for gloss perception could explain bumpiness as well as gloss. The image statistics obtained from the luminance histogram were the mean (Mean), standard deviation (Std), skewness (Skew), kurtosis (Kurt), contrast (Cont), and top (Top). Contrast was defined as the SD divided by the mean of the luminance histogram, and Top was defined as the average luminance value within the top 10% of the luminance histogram. A linear regression model was constructed using these statistics of the 50 texture images from the subjective evaluation experiment described in Section 2.1 as explanatory variables. The model yielded a multiple correlation of R = 0.757 and an adjusted coefficient of determination of R² = 0.514, which was not sufficiently accurate. Based on these results, linear regression on these image statistics could not adequately explain the perceived amount of bumpiness.
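The six luminance statistics and the regression summary values can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: Top is taken as the mean of pixels at or above the 90th percentile, kurtosis is the non-excess fourth standardized moment, and the adjusted R² uses the usual formula 1 − (1 − R²)(n − 1)/(n − p − 1). The function names are hypothetical.

```python
import numpy as np

def luminance_statistics(lum):
    """Return [Mean, Std, Skew, Kurt, Cont, Top] of a 2-D luminance array."""
    x = np.asarray(lum, dtype=float).ravel()
    mean = x.mean()
    std = x.std()
    z = (x - mean) / std if std > 0 else np.zeros_like(x)
    skew = np.mean(z ** 3)
    kurt = np.mean(z ** 4)                    # non-excess kurtosis (assumption)
    cont = std / mean if mean != 0 else 0.0   # SD divided by the mean
    # Assumed reading of "top 10 percentile": pixels at or above the 90th percentile.
    top = x[x >= np.percentile(x, 90)].mean()
    return np.array([mean, std, skew, kurt, cont, top])

def fit_linear_model(X, y):
    """Ordinary least squares with intercept; returns (coef, R, adjusted R^2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])      # prepend the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return coef, np.sqrt(max(r2, 0.0)), adj_r2

# Hypothetical stand-ins for the 50 texture images and their mean scores.
rng = np.random.default_rng(0)
images = [rng.random((32, 32)) for _ in range(50)]
scores = rng.uniform(1.0, 5.0, size=50)
X = np.array([luminance_statistics(im) for im in images])
coef, R, adj_r2 = fit_linear_model(X, scores)
```

Because the adjusted R² penalizes the number of explanatory variables p relative to the number of images n, it can be much lower than R² when p approaches n.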
Hayashi et al. showed that the visual perception of bumpy shape is related to spatial frequency [18]. Based on the evaluation values from the experiment in Section 2.1, 22 images with values from 1 to 2.5 were classified as “low bumpiness,” 13 images with values from 2.5 to 3.5 as “middle bumpiness,” and 15 images with values from 3.5 to 5 as “high bumpiness.” Some of the classified images are displayed in Figure 3. The power spectra in the spatial frequency domain, averaged within each of the three groups, are shown in Figure 4, where the mean power differs among the groups in the low- and medium-frequency bands. Based on this result, it was concluded that the perception of bumpiness was related to spatial frequency, and a regression model was constructed using statistics obtained from multiscale images as explanatory variables.
The procedure for obtaining a multiscale image is illustrated in Figure 5. Cyclic bandpass filters with a width of 20 cycles per image-width (cpi) were applied to the power spectrum obtained by applying the Fourier transform to the original image. However, because the 0–5 cpi components mainly represent the brightness of the entire image, they were treated as a single band. The bandpass filters thus yielded eight images per original image. The statistics obtained from these images were standardized and used as explanatory variables. The resulting model had a multiple correlation of R = 0.997 and an adjusted coefficient of determination of R² = 0.876, a significant improvement in prediction accuracy over the model constructed from image statistics of the original images. However, as shown in Table I, none of the coefficients reached the 5% significance level; the t-values were low and the p-values were high.
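The band decomposition described above can be sketched as follows. This is a minimal illustration under assumptions: the filters are taken to be ideal (hard-edged) annular masks on radial frequency, the masks are applied to the complex Fourier spectrum so that each band can be transformed back into an image, and frequencies are measured in cycles per image along each axis, which coincides with cpi for square images. The band edges follow the ranges listed in Table I; the names `BAND_EDGES` and `multiscale_images` are hypothetical.

```python
import numpy as np

# Band edges in cycles per image (cpi), following the ranges in Table I.
BAND_EDGES = [0, 5, 25, 45, 65, 85, 105, 125, np.inf]

def multiscale_images(lum, edges=BAND_EDGES):
    """Split a 2-D luminance image into band-limited images (one per band)."""
    img = np.asarray(lum, dtype=float)
    h, w = img.shape
    # fftfreq returns cycles per sample; multiplying by the size gives cycles per image.
    fy = np.fft.fftfreq(h) * h
    fx = np.fft.fftfreq(w) * w
    radius = np.hypot(fy[:, None], fx[None, :])
    spectrum = np.fft.fft2(img)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (radius >= lo) & (radius < hi)      # ideal annular bandpass (assumption)
        bands.append(np.fft.ifft2(spectrum * mask).real)
    return bands

# Usage on a synthetic grating with 10 cycles across a 128-pixel-wide image.
n = 128
grating = np.tile(np.cos(2 * np.pi * 10 * np.arange(n) / n), (n, 1))
bands = multiscale_images(grating)
```

Because the masks partition the frequency plane, the eight band images sum back to the original image; the grating above falls entirely into the 5–25 cpi band. Image statistics such as Mean or Std can then be computed per band and standardized before regression.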
Figure 3.
Examples of images classified into three groups.
Figure 4.
Results of averaging the power spectrum in the spatial frequency domain for each of the three groups of images classified according to bumpiness.
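The grouping by evaluation value and the per-group averaging of power spectra can be sketched as below. Several details are assumptions: the paper does not specify which group a boundary value (2.5 or 3.5) belongs to, nor whether the spectra were radially averaged before plotting. Here boundary values go to the higher group, and each spectrum is averaged over orientation in integer-cpi bins. All function names are hypothetical.

```python
import numpy as np

def classify_bumpiness(score):
    """Map a mean evaluation value (1-5) to a group; boundary handling is assumed."""
    if score < 2.5:
        return "low"
    if score < 3.5:
        return "middle"
    return "high"

def radial_power_spectrum(lum):
    """Power spectrum averaged over orientation, indexed by integer cpi."""
    img = np.asarray(lum, dtype=float)
    h, w = img.shape
    power = np.abs(np.fft.fft2(img)) ** 2
    fy = np.fft.fftfreq(h) * h
    fx = np.fft.fftfreq(w) * w
    radius = np.rint(np.hypot(fy[:, None], fx[None, :])).astype(int)
    sums = np.bincount(radius.ravel(), weights=power.ravel())
    counts = np.bincount(radius.ravel())
    return sums / np.maximum(counts, 1)

def group_mean_spectra(images, scores):
    """Average the radial spectra within each group (images must share one size)."""
    groups = {}
    for img, score in zip(images, scores):
        groups.setdefault(classify_bumpiness(score), []).append(radial_power_spectrum(img))
    return {name: np.mean(spectra, axis=0) for name, spectra in groups.items()}

# Usage on a synthetic grating with 10 cycles across a 64-pixel-wide image.
n = 64
grating = np.tile(np.cos(2 * np.pi * 10 * np.arange(n) / n), (n, 1))
spectrum = radial_power_spectrum(grating)
```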
Table I.
Coefficients, t-values, and p-values resulting from regression modeling using image statistics obtained from multiscale images. The number after the component statistics represents the pass range of the bandpass filter.
Component | Coefficient | t-value | p-value | Component | Coefficient | t-value | p-value |
---|---|---|---|---|---|---|---|
Intercept | 1.10E-13 | 0 | 1 | | | | |
Mean 0–5 | −4.555 | −2.096 | 0.171 | Kurt 0–5 | −0.2964 | −1.318 | 0.318 |
Mean 5–25 | 3.296 | 1.056 | 0.402 | Kurt 5–25 | 0.6732 | 0.352 | 0.759 |
Mean 25–45 | −10.1 | −0.505 | 0.664 | Kurt 25–45 | −0.5872 | −0.334 | 0.77 |
Mean 45–65 | 13.02 | 0.625 | 0.596 | Kurt 45–65 | 3.504 | 0.535 | 0.646 |
Mean 65–85 | 26.37 | 1.074 | 0.395 | Kurt 65–85 | −1.899 | −0.317 | 0.781 |
Mean 85–105 | −56.59 | −2.407 | 0.138 | Kurt 85–105 | −2.08 | −0.811 | 0.503 |
Mean 105–125 | 39.93 | 1.723 | 0.227 | Kurt 105–125 | −2.177 | −1.037 | 0.409 |
Mean 125– | NA | NA | NA | Kurt 125– | −0.9277 | −0.895 | 0.465 |
Std 0–5 | −2.102 | −1.694 | 0.232 | Cont 0–5 | 0.9382 | 0.843 | 0.488 |
Std 5–25 | −28.98 | −1.658 | 0.239 | Cont 5–25 | 0.7309 | 1.115 | 0.381 |
Std 25–45 | 2.237 | 0.03 | 0.979 | Cont 25–45 | −0.2585 | −0.136 | 0.904 |
Std 45–65 | −6.51 | −0.06 | 0.957 | Cont 45–65 | −1.919 | −0.447 | 0.699 |
Std 65–85 | −74.82 | −1.763 | 0.22 | Cont 65–85 | 7.639 | 1.384 | 0.301 |
Std 85–105 | 216.4 | 2.667 | 0.117 | Cont 85–105 | −9.701 | −2.502 | 0.129 |
Std 105–125 | −465.6 | −0.925 | 0.453 | Cont 105–125 | 20.81 | 1.787 | 0.216 |
Std 125– | 345.7 | 0.836 | 0.491 | Cont 125– | −16.4 | −1.517 | 0.269 |
Skew 0–5 | −0.08513 | −0.471 | 0.684 | Top 0–5 | 4.338 | 2.177 | 0.161 |
Skew 5–25 | −1.498 | −1.172 | 0.362 | Top 5–25 | 26.25 | 1.553 | 0.261 |
Skew 25–45 | 0.3386 | 0.168 | 0.882 | Top 25–45 | 9.039 | 0.162 | 0.886 |
Skew 45–65 | −0.8833 | −0.287 | 0.801 | Top 45–65 | −9.148 | −0.094 | 0.934 |
Skew 65–85 | 0.7325 | 0.176 | 0.877 | Top 65–85 | 51.54 | 1.102 | 0.385 |
Skew 85–105 | 0.6179 | 0.4 | 0.728 | Top 85–105 | −161.1 | −2.028 | 0.18 |
Skew 105–125 | 2.067 | 1.468 | 0.28 | Top 105–125 | 388.2 | 0.743 | 0.535 |
Skew 125– | 0.569 | 0.54 | 0.643 | Top 125– | −308.5 | −0.677 | 0.568 |
Figure 5.
Procedure for obtaining a multiscale image. The multiscale image was visualized after amplifying the components by a factor of 5, excluding the 0–5 cpi range.
A regression model was then constructed separately for each image statistic to reduce the number of explanatory variables. Table II lists the multiple correlation R and adjusted coefficient of determination R² obtained for each luminance statistic. The models for Mean, Std, and Top were more accurate than those for Skew, Kurt, and Cont, indicating that Skew and Kurt, in particular, did not affect the perception of bumpiness. As an example, the coefficients, t-values, and p-values of the linear regression model applied to the Mean are shown in Table III for the different cpi ranges. The t-values and p-values in Table III show that the components within the 5–85 cpi range have a large influence; taken together, the t-values and p-values for the Mean, Std, and Top indicate that the influence of the components within the 5–65 cpi range is particularly large. This result indicates that components in the low- and medium-frequency ranges influence the perception of bumpiness, consistent with the difference in the mean power spectrum among the three groups in Fig. 4. For mean pixel value, these cues are consistent with the results of Ho et al. [16, 17]. However, the results for contrast did not agree, which might be due to the difference between Ho et al.’s method of determining contrast and the method [9] used in this study.
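The per-statistic regression and its t-values can be sketched as follows. This is an illustration under assumptions rather than the procedure actually used: the eight band-wise values of one statistic are standardized column by column, ordinary least squares is fitted with an intercept, and each t-value is the coefficient divided by its standard error. The sketch assumes the design matrix has full column rank; the NA entries in Tables I and III suggest that some columns in the real data were collinear, which this sketch does not handle. Function names and data are hypothetical.

```python
import numpy as np

def standardize(X):
    """Z-score each column (sample SD)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def ols_t_values(X, y):
    """Return (coefficients, t-values), intercept first; assumes full column rank."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    dof = n - p - 1                               # residual degrees of freedom
    sigma2 = (resid @ resid) / dof                # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    # Two-sided p-values would follow from Student's t distribution with `dof`
    # degrees of freedom (for example via SciPy, if available).
    return coef, coef / se

# Hypothetical data: 50 images, one statistic measured in 8 frequency bands.
rng = np.random.default_rng(0)
band_stats = standardize(rng.normal(size=(50, 8)))
scores = 3.0 * band_stats[:, 1] + rng.normal(scale=0.1, size=50)
coef, t_values = ols_t_values(band_stats, scores)
```

In this synthetic example only the second band drives the scores, so its t-value dominates, which mirrors how the large t-values in Table III single out the influential frequency ranges.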
Table II.
Multiple correlation R and adjusted coefficient of determination R² when a regression model was constructed for each image statistic.
Statistic | R | R² |
---|---|---|
Mean | 0.877 | 0.706 |
Std | 0.854 | 0.676 |
Skew | 0.328 | 0.067 |
Kurt | 0.35 | 0.049 |
Cont | 0.746 | 0.471 |
Top | 0.859 | 0.687 |
Table III.
Coefficients, t-values, and p-values of the linear model applied to the Mean.
Frequency (cpi) | Coefficient | t-value | p-value |
---|---|---|---|
Intercept | 1.021 | 2.391 | 0.0214 |
0–5 | 0.0029 | 1.275 | 0.309 |
5–25 | 0.278 | 4.715 | 2.67E-05 |
25–45 | −0.449 | −1.9 | 0.0644 |
45–65 | 1.465 | 3.095 | 0.0035 |
65–85 | −1.93 | −3.234 | 0.00238 |
85–105 | 0.959 | 1.36 | 0.181 |
105–125 | 0 | 65535 | NA |
125– | −0.118 | −0.254 | NA |