COMPANY

BLOGS

  • user warning: Unknown column 'u.signature_format' in 'field list' query: SELECT c.cid as cid, c.pid, c.nid, c.subject, c.comment, c.format, c.timestamp, c.name, c.mail, c.homepage, u.uid, u.name AS registered_name, u.signature, u.signature_format, u.picture, u.data, c.thread, c.status FROM comments c INNER JOIN users u ON c.uid = u.uid WHERE c.nid = 19515 AND c.status = 0 ORDER BY c.cid LIMIT 0, 50 in /var/www/www.4d.com/docs/modules/comment/comment.module on line 991.
  • warning: file_get_contents(http://www.telize.com/geoip/54.196.201.241) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found in /var/www/www.4d.com/docs/includes/common.inc(1762) : eval()'d code on line 4.

Timestamp, Mon Amour (Part 2)

07.09.2010
by Olivier Deschanels
Time

 

This blog post is the second in a four-part series written in French in September 2009.

 

Following the example of using two fields to mark a record with date and time, let’s now look at timestamps.

 

The principle of the timestamp is to combine in a single field, in the most compact way possible, the information needed to memorize the date and time. Of course, the less space we attribute for storage, the smaller the amplitude of possible values.

 

In fact, we’re going to reduce the range of dates so as to shrink the size of the storage field. If your database is destined to store, for example, invoices ordered on the Web, what’s the point of recording invoices from January 14, 1984, or from December 25, 2099? What you’re probably interested in is storing the information for several years to come. What’s the point of planning a system that’ll work for 150 years when information systems are renewed within a maximum of ten years? Of course, we’ll give ourselves some breathing room so that we can rest easy.

 

In order to conserve information, we’ll choose to use a simple long integer that allows us to store 2^32 different values, making it just over 4 billion.

 

The base information we’ll keep is the second. In most cases, the precision of a second is amply sufficient to mark a given instant. However, it’s possible to have greater precision, but in this case we’d have to increase the storage size to cover the same range of time.

 

We can thus store a duration between 0 and 4 billion seconds, or between approximately 0 to 136 years, which is more than enough! We now have to understand that this value is the count of seconds passed since a reference point. The value of 60 will thus be one minute after this reference point, the value 3600 an hour after, the value 86400 one day after, and so on.

 

The timestamp based in seconds is thus stored in a long integer. It takes 4 bytes to memorize it in the record, plus 8 bytes in the micro-structure. 12 bytes are thus necessary instead of the 32 bytes of the two-field system, making for a savings of 62%.

 

Before looking at the code allowing us to memorize and search for the values, let’s take a look at our processing examples and observe the eventual benefits.

 

To sort records in a waiting list, the current code is basic:

ORDER BY([Visit];[Visit]TimeStamp_Arrival;>)

 

Naturally, 4D handles this sort quicker than its two-field equivalent and this is for several reasons:

  • It’s faster to sort on one field than on two, that’s obvious.
  • Data is less voluminous, so it requires less memory to process, which lends to more speed.
  • A long integer is a much simpler format than a date, for example, and so the comparison algorithms necessary for the sort are simpler and faster.

 

Moreover, if you want to go faster, in the case of a timestamp, the search on a single field will be indexed if an index exists. Each index key will cost about 30 bytes – 57% less. Finally, the index pages will store 256 keys instead of the 128 keys for a composite index, and there will thus be fewer pages to manipulate in memory, gaining even more time.

 

To query the visits for the week between 8:30am Monday and 5:00pm Friday, the code is equally reduced to a simple query split across two long integers:

QUERY([Visit];[Visit]TimeStamp_Arrival >= $stmp_monday_0830;*)
QUERY([Visit];[Visit]TimeStamp_Arrival <= $stmp_friday_1700)

 

Again, it’s obvious to say that a query on two simple fields is faster than on four fields. Furthermore if one wants to index the query, there will be only one index implemented, versus potentially two (or one composite index) in the case of the two-field solution.

 

For the shift of the workday by 5 hours, the operation runs on a simple addition of a delay expressed in seconds. For 5 hours, we just have to add 5 x 60 x 60 = 18,000 seconds.

 

Here’s the generic code to add a delay expressed in hours ($delay):

[Workday]TimeStamp_debut:=[Workday]TimeStamp_begin+($delay+0)

 

Note that the operation ($delay+0) which allows the conversion of hours into seconds is in the 4D language. It goes without saying that the delay can simply be subtracted to advance the time of departure for a workday.

 

Finally for the last code aimed at calculating the time between two shifts, it suffices to find the difference between two timestamps (a simple substraction) and to take the absolute value. So we have the number of seconds of time in between that remains to be put in a more humanly legible form.

 

Actually, how do you convert a paired date and time into a timestamp? That will be the subject of my next post in this blog!


 

RSS 0 comment(s) to this post