The SASE-FE dataset contains videos of 50 subjects. For each subject, there are 12 videos covering 6 basic emotions (Anger, Happiness, Sadness, Disgust, Contempt, Surprise), each recorded as both a real and a fake expression. Each video was recorded with a high-resolution camera at 100 frames per second and lasts about 3-4 seconds. To elicit these emotions, the subjects were shown videos meant to induce them and were asked to act accordingly. For the real emotion set, subjects were asked to express the same emotion that the shown video provoked. For the fake emotion set, the expressed emotion and the stimulated emotion were contrasted (e.g., to record a faked surprise, we showed a video intended to induce disgust and asked the subject to act surprised).
In each video, the subject starts from a neutral expression, and the duration of this neutral phase is not predefined. The sequence of returning to the neutral state was included to capture the variation from the acted emotion to the fake emotion, for use in the training process.
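
The counts implied by this description can be checked with a minimal sketch. It assumes only the figures stated above (50 subjects, 6 emotions, real and fake variants, 100 frames per second, roughly 3-4 seconds per video). The subject numbering from 1 to 50, the `(subject, emotion, kind)` tuple layout, and the function name `video_index` are hypothetical and chosen for illustration; they do not reflect the dataset's actual file naming or annotation format.

```python
from itertools import product

# Figures taken from the dataset description.
EMOTIONS = ["Anger", "Happiness", "Sadness", "Disgust", "Contempt", "Surprise"]
KINDS = ["real", "fake"]
NUM_SUBJECTS = 50
FPS = 100
DURATION_RANGE_S = (3, 4)  # approximate video length in seconds


def video_index(num_subjects=NUM_SUBJECTS):
    """Enumerate one (subject, emotion, kind) entry per video.

    Subject IDs 1..num_subjects are a hypothetical labeling,
    not the dataset's real identifiers.
    """
    subjects = range(1, num_subjects + 1)
    return list(product(subjects, EMOTIONS, KINDS))


index = video_index()
videos_per_subject = len(EMOTIONS) * len(KINDS)
# Approximate frame count per video, since the duration is only "about 3-4 seconds".
frame_range = tuple(FPS * seconds for seconds in DURATION_RANGE_S)

print(len(index), videos_per_subject, frame_range)
```

Under these assumptions the enumeration gives 12 videos per subject, 600 videos in total, and roughly 300 to 400 frames per video. The unspecified length of the neutral phase means the number of frames showing the emotion itself varies from video to video.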