Another method to compensate for the timekeeping deficiency is to record pattern-models of external reality. That is, to record information based on many M-type models, to build a pattern-model at a specific moment in time, and to recognize the pattern later.
Such a pattern could be associated with the function of different organs of the being, or with some other information from outside the being.
The time problem is a big one for the brain. The brain will use any external reference to keep time, such as the day/night cycle, the movement of the sun and moon and, for humans only, clocks.
ETA 7: Music
Music is a long-range image model, which exists only for human beings. As a newborn baby grows, speech (a symbolic model) appears first, and only later the abilities associated with understanding music. Because the capacity to understand music appears after the brain acquires the ability to build and operate symbolic models, it is reasonable to suppose that symbolic models support the development of music. This idea is also supported by the fact that European music (the most advanced in the construction of symbolic models) surpasses any other music in complexity (polyphonic music was invented in Europe).
Given a sequence of a few sounds, the brain will try to predict the occurrence of the next sounds. Sometimes the prediction is correct, sometimes not. If the prediction is correct too often, the impression is described in words like boring, monotonous or upsetting. When the prediction is not correct (there is a large discrepancy between the prediction and IR), the sounds are uncorrelated. If the difference is acceptable (the sounds are considered correlated after slightly modifying the algorithm that generates the sequence), then we can associate this with music.
The correlation is associated with the capacity to generate a sequence based on an algorithm.
This automatic activity of continuously modifying the generation algorithm can produce a positive state of mind, which can be called pleasure. This means that the predictions are consistently correct, with high probability, and that the ones that are not correct are accepted after an acceptable change of the algorithm. This activity is currently called music.
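The predict, compare and adjust cycle described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not part of the theory: sounds are reduced to integer pitches, the "generation algorithm" is simply "next pitch = previous pitch + step", and the names classify_sequence, tolerance, boring_ratio and noise_ratio, together with their values, are hypothetical.

```python
def classify_sequence(pitches, tolerance=2, boring_ratio=0.95, noise_ratio=0.3):
    """Label a pitch sequence as 'monotonous', 'uncorrelated' or 'music-like'.

    Toy model only: the generation algorithm is 'next = previous + step',
    and all thresholds are illustrative assumptions.
    """
    if len(pitches) < 3:
        raise ValueError("need at least three sounds to make a prediction")

    # Initial generation algorithm, inferred from the first two sounds.
    step = pitches[1] - pitches[0]
    correct = adjusted = rejected = 0

    for previous, actual in zip(pitches[1:], pitches[2:]):
        predicted = previous + step
        error = abs(actual - predicted)
        if error == 0:
            # Prediction confirmed, the algorithm is kept as it is.
            correct += 1
        elif error <= tolerance:
            # Acceptable difference: slightly modify the algorithm.
            adjusted += 1
            step = actual - previous
        else:
            # Large discrepancy: the sound does not fit the algorithm.
            rejected += 1
            step = actual - previous

    total = correct + adjusted + rejected
    if rejected / total > noise_ratio:
        return "uncorrelated"
    if correct / total >= boring_ratio:
        return "monotonous"
    return "music-like"


if __name__ == "__main__":
    print(classify_sequence([60, 62, 64, 66, 68, 70]))      # every prediction is correct
    print(classify_sequence([60, 62, 64, 67, 70, 72, 74]))  # small, accepted deviations
    print(classify_sequence([60, 80, 45, 90, 50, 71]))      # large discrepancies
```

In this sketch a sequence counts as music-like only when most predictions succeed and the failures are small enough to be absorbed by a small change of the step, which mirrors the three cases distinguished in the text.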
The correlation can be supported implicitly, as happens in classical music, or explicitly (e.g. by the rhythm of drums).
If we accept the hypothesis that a facility associated with image models (a hardware facility) exists to build an algorithm for generating correlated information, then we could try to see whether this facility evolved over time or not.