Something I’ve encountered in multiple interviews is the need to identify gaps between dates or continuous date ranges. In the moment, it can be tough to recall all the approaches, so I’m documenting these queries here as a reference for the future.
Finding Consecutive Date Ranges
We use dense_rank to identify the order when the event happened.
AD_ID EVENTDATE RN 6245581352190 2021-10-25 1 6245581352190 2021-10-26 2 6245581352190 2021-10-27 3 6245581352190 2021-10-28 4 6245581352190 2021-10-29 5 6245581352190 2021-11-03 6
We subtract the rank minus the date when the event happened, and this will give us the initial date of the range
AD_ID EVENTDATE RN GRP_DATE 6245581352190 2021-10-25 1 2021-10-24 6245581352190 2021-10-26 2 2021-10-24 6245581352190 2021-10-27 3 2021-10-24 6245581352190 2021-10-28 4 2021-10-24 6245581352190 2021-10-29 5 2021-10-24 6245581352190 2021-11-03 6 2021-10-28
We grouped by grp_date, and get the max and min date, and then subtracts max and min to get the number of consecutive days
AD_ID INTERVAL_START INTERVAL_END CONSECUTIVE_DAYS 6245581352190 2021-10-25 2021-10-29 5 6245581352190 2021-11-03 2021-11-03 1
Finding gaps between days
Using LAG, we can get the previous date
AD_ID EVENTDATE PREVIOUS_DATE 6245581352190 2021-10-25 6245581352190 2021-10-26 2021-10-25 6245581352190 2021-10-27 2021-10-26 6245581352190 2021-10-28 2021-10-27 6245581352190 2021-10-29 2021-10-28 6245581352190 2021-11-03 2021-10-29
We can use DATEDIFF to subtract the dates
AD_ID EVENTDATE PREVIOUS_DATE GAP_BETWEEN_DAYS 6245581352190 2021-10-25 6245581352190 2021-10-26 2021-10-25 1 6245581352190 2021-10-27 2021-10-26 1 6245581352190 2021-10-28 2021-10-27 1 6245581352190 2021-10-29 2021-10-28 1 6245581352190 2021-11-03 2021-10-29 5