In this paper, we construct a model of cross-modal glossiness perception by investigating the interaction between sounds and graphics. First, we conduct evaluation experiments on cross-modal glossiness perception using three types of stimuli: visual (22 stimuli), audio (15 stimuli), and audiovisual (330 stimuli). The experiments consist of three sections: first a visual experiment, then an audiovisual experiment, and finally an auditory experiment. Glossiness is rated using the magnitude evaluation method. Second, we analyze the influence of sounds on glossiness perception from the experimental results. The results suggest that cross-modal glossiness perception can be represented as a combination of visual-only and auditory-only perception. Based on these results, we construct a model as a linear sum of computer graphics and sound parameters. Finally, we confirm the feasibility of the proposed model through a validation experiment.
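
The linear-sum form of the model can be illustrated with a minimal sketch. The function name, the number of parameters, the weights, and the bias term below are all hypothetical placeholders chosen for illustration; they are not the parameters or fitted coefficients obtained from the experiments.

```python
def predict_glossiness(cg_params, sound_params, cg_weights, sound_weights, bias=0.0):
    """Predict perceived glossiness as a linear sum of weighted parameters.

    All arguments are hypothetical: cg_params and sound_params stand in for
    whatever computer graphics and sound parameters the model uses, and the
    weights stand in for coefficients that would be fitted to rating data.
    """
    if len(cg_params) != len(cg_weights):
        raise ValueError("cg_params and cg_weights must have the same length")
    if len(sound_params) != len(sound_weights):
        raise ValueError("sound_params and sound_weights must have the same length")

    # Visual-only contribution: weighted sum of the graphics parameters.
    visual_term = sum(w * x for w, x in zip(cg_weights, cg_params))
    # Auditory-only contribution: weighted sum of the sound parameters.
    auditory_term = sum(w * x for w, x in zip(sound_weights, sound_params))

    # Cross-modal prediction: the two contributions simply add.
    return bias + visual_term + auditory_term


# Illustrative call with made-up values (two graphics parameters, one sound parameter).
example = predict_glossiness(
    cg_params=[1.0, 2.0],
    sound_params=[3.0],
    cg_weights=[0.5, 0.25],
    sound_weights=[2.0],
    bias=1.0,
)
print(example)
```

In this form the auditory term shifts the prediction independently of the visual term, which matches the description of cross-modal perception as a combination of visual-only and auditory-only perception; any interaction between the two modalities would require additional terms that this sketch does not include.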