This experiment studied how adult learners with high and low spatial abilities understood an animation. Two factors were manipulated while learners studied a three-pulley system: the controllability of the animation and the orientation of the participant’s attention by an explicit task. Off-line (comprehension questions) and on-line (eye tracking) measures were used. The comprehension test results indicated that, more than controllability, specifically orienting the participant’s attention to the relevant features of the animation had a positive effect on the elaboration of a high-quality dynamic mental model of the device. This positive effect appeared particularly when the learner’s attention was focused on the functional model and on local kinematics. The eye tracking data indicated that learners attended more to the areas of the animations where a great amount of motion was involved along the causal chain of events. We showed an effect of the controllability of the system and of the orientation of the learner’s attention on the number of eye fixations and on the number of transitions between areas that included the causal chain. The elaboration of the mental model of an animation seemed to require piecemeal, step-by-step processing of the operations of the system, and not a direct analogue and holistic mapping of the device.