Thursday, March 29, 2012

deriving a new column from another derived column

In a Derived Column data flow transformation, why can't I refer to a derived column (added in that same transformation) in the expression for another derived column? It seems I am forced to chain 2 DC data flows, just to do something as conceptually simple as:

a = x + y; b = a2

On a related note: Can I define a variable that is scoped to each row? Can I bind a variable in an expression to avoid creating a new row, e.g.

let a = x + y; a2

as the expression for new row b?

Kevin Rodgers wrote:

In a Derived Column data flow transformation, why can't I refer to a derived column (added in that same transformation) in the expression for another derived column? It seems I am forced to chain 2 DC data flows, just to do something as conceptually simple as:

a = x + y; b = a2

Kevin,

God yeah. I so wish you could do that. Seems like such simple funcitonality doesn't it?

I have requested it here: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=127010

Please please please click through and vote for it.

Kevin Rodgers wrote:

On a related note: Can I define a variable that is scoped to each row? Can I bind a variable in an expression to avoid creating a new row, e.g.

let a = x + y; a2

as the expression for new row b?

No, you can't do that!

-Jamie

|||You can't do it in derived column, but it is very easy to do in script component - the component generates the row accessors, so the amount of code you need to write is almost the same as in derived column transform.|||

I haven't used the script component before -- I haven't even written any scripts. Could you show me a simple example of generating row accessors etc. as you've suggested?

Thanks!

|||

It's a good idea Jamie.

Do keep in mind that it adds another layer of complexity. Along with some other difficulties, the user would have to specify an order of execution for the expressions, and the UI would need a way too enable and maintain that. Chaining multiplie derived columns, while not as convenient or as pretty to look at, makes it abundantly clear what order the intermediate expressions occur in and eases some usability concerns.

It's something to look at for the future, though, certainly. Keep the suggestions coming!

Thanks
Mark

|||

Kevin Rodgers wrote:

I haven't used the script component before -- I haven't even written any scripts. Could you show me a simple example of generating row accessors etc. as you've suggested?

Thanks!

Just add a script transform where you would use Derived Column transform, select Transform in the first dialog. Check the columns you want to use in your expressions. Go to Inputs and Outputs tab, add output columns to Output 0.

Now edit the script, and type your code inside Input0_ProcessInputRow function. I quickly setup a "validation" for AdwentureWorks DB (the transform adds two colums CalculatedColumn and TotalOK to each row):

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

Dim t As Decimal
t = Row.OrderQty * Row.UnitPrice * (1 - Row.UnitPriceDiscount)
Row.CalculatedColumn = t

If (t = Row.LineTotal) Then
Row.TotalOK = True
Else
Row.TotalOK = False
End If

End Sub

So the main difference in expressions is that you have to add "Row." prefix to column name :) Of course, the syntax is different, as the script transform uses full VB.NET language, which gives you ability to create temporary variables among other things.

sql

No comments:

Post a Comment