Sunday, March 25, 2012

Derived Column - DateTimeStamp From String (dd-mmm-yy)

I have come across something strange. I was wondering anyone else had the same problem or is it me?

I have a source file as shown below:
1,30-Feb-06,3,Test
2,20-Feb-06,5,Test1

Second column is defined as String(9) in the flat file source. Then I have a Derived Column between source and Trash destination. The Derived Column adds a new column to the datd flow. The expression is as shown below :
(DT_DBTIMESTAMP)(SUBSTRING(Col2,1,2) + "/" + SUBSTRING(Col2,4,3) + "/" + SUBSTRING(Col2,8,2))

The data is loading into the database successfully, where I was expecting it to fail. I placed a data viewer and saw the following result set.

Source (Col2) Derived Col
-
30-Feb-06 06/02/1930 00:00:00 - WRONG result set
20-Feb-06 20/02/2006 00:00:00 - RIGHT result set

Can anyone repeat this problem or am I missing something? I have got SP1 and Hotfix installed.

Thanks

Sutha

This one feels like an old friend - albeit an irritating old friend.

Briefly, this is a result of routines in oleaut. Really, the oleaut library is trying to be friendly and to cast your string to a valid date. Knowing that there are different formats out there such as yy-mm-dd and dd-mm-yy it "assumes" that you have a valid date in one of these formats and therefore - to hugely oversimplify - finds the format that enables it to perform a cast to a valid date.

Note that this behaviour has been around for a long time when using a non-US local (there are developers who believe it led to some slight misalignment of the pyramids) and many applications actually rely on the behaviour, so it is difficult to change after a long time.

I would suggest that if you are likely to have invalid dates of this particular nature, that you validate them using some code before applying the cast or conversion. Not a great solution, but effective.

Donald

No comments:

Post a Comment