CROSS APPLY Keeps Me Sane
SELECT t.FullName, t.TotalTasks, ---additional fields FROM ( SELECT ul.FullName, SUM (CASE WHEN ut.task_completed = 1 THEN 1 ELSE 0 END) CompletedTasks, AVG(DATEDIFF (minute, ut.create_timestamp ,ut.complete_timestamp)) AverageMinutesToComplete FROM dbo.user_list ul LEFT OUTER JOIN dbo.user_tasks ut ON ut.owner_id= ul.[user_id] WHERE ut.create_timestamp >= @startDate AND ut.create_timestamp <= @endDate ) t
Since the task table stores the creation and completion dates without the overhead of an additional column to hold the calculated “time to complete”, the only way to get the information we’re after is to perform the calculation in the moment for the smallest unit of time that we want to see and then build from there whenever we want to add hours, days, or weeks.
Seeing DATEDIFF used to get the average number of minutes it takes each user to complete their assigned tasks, it seems logical that we’d use the same approach to get those other units as they’re needed and then do the math so that instead of “1 hour and 65 minutes” the report should show “1 hour and 5 minutes”, “25 hours and 5 minutes” should be “1 day, 1 hour, and 5 minutes”, and so on. Instead, the original query used a slightly different method for getting those values.
SELECT t.FullName, t.TotalTasks, CASE WHEN CASE WHEN t.AverageMinutesToComplete >= (24* 60) THEN FLOOR (t.AverageMinutesToComplete/(24 *60.0)) ELSE 0 END > 0 THEN CAST(FLOOR (t.AverageMinutesToComplete/(24 *60.0)) as varchar(3)) + ' day' + CASE WHEN FLOOR (t.AverageMinutesToComplete/(24 *60.0)) >= 2 THEN 's ' ELSE ' ' END ELSE '' END AS AverageDaysToComplete, CASE WHEN CASE WHEN (t.AverageMinutesToComplete - ((CASE WHEN t.AverageMinutesToComplete >= (24* 60) THEN FLOOR (t.AverageMinutesToComplete/(24 *60.0)) ELSE 0 END) * ( 24*60 ))) >= 60 THEN CAST (FLOOR(( t.AverageMinutesToComplete - ( (CASE WHEN t.AverageMinutesToComplete >= (24* 60) THEN FLOOR (t.AverageMinutesToComplete/(24 *60.0)) ELSE 0 END) * (24* 60)) )/60.0) as varchar(2)) ELSE 0 END > 0 THEN CAST ( CASE WHEN (t.AverageMinutesToComplete - ((CASE WHEN t.AverageMinutesToComplete >= (24* 60) THEN FLOOR (t.AverageMinutesToComplete/(24 *60.0)) ELSE 0 END) * ( 24*60 ))) >= 60 THEN CAST (FLOOR(( t.AverageMinutesToComplete - ( (CASE WHEN t.AverageMinutesToComplete >= (24* 60) THEN FLOOR (t.AverageMinutesToComplete/(24 *60.0)) ELSE 0 END) * (24* 60)) )/60.0) as varchar(5 )) ELSE 0 END AS varchar(2)) + ' hour' + CASE WHEN CASE WHEN (t.AverageMinutesToComplete - ((CASE WHEN t.AverageMinutesToComplete >= (24* 60) THEN FLOOR (t.AverageMinutesToComplete/(24 *60.0)) ELSE 0 END) * ( 24*60 ))) >= 60 THEN CAST (FLOOR(( t.AverageMinutesToComplete - ( (CASE WHEN t.AverageMinutesToComplete >= (24* 60) THEN FLOOR (t.AverageMinutesToComplete/(24 *60.0)) ELSE 0 END) * (24* 60)) )/60.0) as varchar(2)) ELSE 0 END >= 2 THEN 's ' ELSE ' ' END ELSE '' END AS AverageHoursToComplete, ---remainder of query
While those two statements handle the requirements that the report will display the time to completion as “2 days, 1 hour, and 5 minutes” by making sure that each full day’s worth of minutes is subtracted from the total before calculating the number of hours and then going on in a third statement to adjust for the full days and hours to find the number of minutes remaining, I find it much more difficult to read and hold in my head than some of the other options.
Using DATEDIFF to get the average number of days and hours before applying the math to display the values realistically would require less typing and make more sense than the massive amount of text above, but for the purposes of this example it can serve as a placeholder for transformations that are necessarily complex and don’t have alternatives built into T-SQL. Taking the CROSS APPLY approach to shift those calculations out of the query’s field list and into the FROM clause alongside the tables and derived tables.
SELECT --field list, CASE WHEN days.AverageDaysToComplete >= 1 THEN CAST(days.AverageDaysToComplete as varchar(3)) + 'day' + CASE WHEN days.AverageDays >= 2 THEN 's ' ELSE ' ' END AverageDaysToComplete, CASE WHEN hours.AverageHoursToComplete >= 1 THEN CAST(hours.AverageHoursToComplete as varchar(2)) + 'hour' + CASE WHEN hours.AverageHoursToComplete >= 2 THEN 's ' ELSE ' ' END AverageHoursToComplete , FROM --derived table CROSS APPLY (SELECT CASE WHEN t.AverageMinutesToComplete >= (24* 60) THEN FLOOR (t.AverageMinutesToComplete/(24 *60.0)) ELSE 0 END AS AverageDaysToComplete ) days CROSS APPLY (SELECT CASE WHEN (t.AverageMinutesToComplete - (days.AverageDaysToComplete * ( 24*60 ))) >= 60 THEN FLOOR(( t.AverageMinutesToComplete - (days.AverageDaysToComplete * (24* 60)) )/60.0)) ELSE 0 END AS AverageHoursToComplete) hours
In this instance , there’s no real performance gain when using the CROSS APPLY approach vs. leaving all of the heavy lifting inline with the rest of the fields in the SELECT but there are other instances where it may outperform a correlated subquery. Apart from that, declaring those values once and treating them as columns that can be used anywhere throughout the query without the need to recalculate them every time they’re needed can prevent typos, logical errors, carpal tunnel, and migraines in this and numerous other cases.