martes, 16 de octubre de 2018

PROBLEM: Interval of Times


The problem

Your company built an in-house calendar tool called HiCal. You want to addd a feature to see the times in a day when everyone is available.
To do this, you’ll need to know when any team is having a meeting. In HiCal, a meeting is stored as an instance of a Meeting structure with integer member variables startTime and endTime. These integers represent the number of 30-minute blocks past 9:00am.
typedef struct {
unsigned startTime;
unsigned endTime;
} Meeting;
For example:
typedef struct {
unsigned startTime;
unsigned endTime;
} Meeting;


Meeting meeting1 = {2, 3}; // meeting from 10:00 – 10:30 am
Meeting meeting2 = {6, 9}; // meeting from 12:00 – 1:30 pm
Write a function mergeRanges() that takes an array of meeting time ranges and returns an array of condensed ranges.
For example, given:
[{0, 1}, {3, 5}, {4, 8}, {10, 12}, {9, 10}]
Your function would return:
[{0, 1}, {3, 8}, {9, 12}]
Do not assume the meetings are in order. The meeting times are coming from multiple teams.


Write a solution that's efficient even when we can't put a nice upper bound on the numbers representing our time ranges. Here we've simplified our times down to the number of 30-minute slots past 9:00 am. But we want the function to work even for very large numbers, like Unix timestamps. In any case, the spirit of the challenge is to merge meetings where startTime and endTime don't have an upper bound.

Gotchas
There are some special cases that we must consider:
  1. The cases where the end of a meeting is at the exact same time as the beginning of another different meeting, they should be merged.
For example:
[{x1, x2}, {y1, y2}]
Where x2 = y1
  1. The cases where a meeting is inside the range of another meeting. This means that a meeting starts after another meeting but it also ends before the other meeting ends. A meeting’s range is contained within another meeting range.
For example:
[{x1, x2}, {y1, y2}]
Where y1 > x1 and y2 < x2
  1. The cases that the first meeting takes all of the time of meetings, this means, even if they are other meetings, the first one starts earlier and ends after every other. We still have to check this first one with every other so we don’t leave any meeting without checking it.
For example:
[{x1, x2}, {y1, y2}, {z1, z2}, …, {a1, a2}]
Where x1 < y1, z1, …, a1 and
x2 > y2, z2, …, a2
They told us not to assume that the meeting hours were sorted, so we can have the case that the first meeting in the array matches the last meeting. This means we have to check every combination, even if we don’t want to, because from what the problem gives us and in the way it’s presented, checking every combination is a must.

Breakdown

First, I wrote down the examples given in the problem, I wrote down the special cases explained in the Gotchas and some different ones.
The first way was to put an axis, let’s call it x, that goes from 0 to n, n being the biggest end time.
It would look something like this: 
Then we iterate through the complete array, Meeting by Meeting and plot it on the axis.
For example, given [{0,1}, {0,2}, {1,3}, {6,8}]
We plot and have the next result:


We observe that for the first 3 meetings, they all fit through the range [0,3], then there’s some free space without meetings and then we have the range [6,8].
Therefore, we would return [{0,3}, {6,8}].
But this solution is more of a visual one, and computers can’t make the same visual process as I did (not that I know of).
This lead me to change my approach to one that computers are familiar with, numbers.

Solution

Given [{x1, x2}, {y1, y2}]
We know that:
IF x1 <= y2 is TRUE
&& y1 <= x2 is TRUE
Then, the X meeting and Y meeting can be merged.
Why?
It’s pretty simple, if the end time of the second meeting is greater or equal than the start time of the start meeting, this means that the second meeting ends at the same time that the first meeting starts or ends after the meeting.
This is not enough, that’s why we have the AND operator to check if the end time of the first meeting is greater or equal than the start time of the first meeting. These two conditions combined, cover any case in which the two meetings overlap in any time.
This formula will cover every scenario in which the two given meetings can be merged, this includes Gotchas a, b and c.
However, the formula doesn’t tell us the range we have to output by merging two Meetings.
Therefore, we have to:
Compare x1 with y1 to get the minimum (min) value for our new range.
Compare x2 with y2 to get the maximum (max) value for our new range.
With this, we can return a newRange(min, max).
A greedy approach would be to compare every meeting with every meeting to merge as many meetings as possible. We do it with the following:
Given Meetings [{x1, x2}, {y1, y2}, {z1, z2}, {a1, a2}, {b1, b2}]



And so on, and so forth.
This means we have to start from the first element, compare with every single one of the next elements.
Then compare the second element with the next elements and so on.
We merge Meeting1 with Meeting2 if they can be merged and compare that merge with the next iterations, this will save us time and memory, because if we merge Meeting1 with Meeting2, and Meeting2 can be merged with Meeting4 but Meeting1, merging them separately would be a mistake.
Take for example:
[{2,3}, {3,5}, {5,8}]
The result must be [{2,8}].
However, if we do the mistake explained above, we would have as result
[{2,5}, {5,8}]
That takes more memory and we also have to merge that result again, taking more time.
To solve this problem, if two elements can be merged, we merge them and keep comparing that merge to the rest of the elements. After we have the final merge, there’s no need to check those elements with the rest again, so we can skip them.
A pseudo-code for the algorithm explained would be:
  1. For i = 0; i < length ; i++
  2. For j = i; j < length -1; j++
  3. If x1 <= y2 && y1 <= x2 is TRUE
    1. Get Minimum from X1 OR Y1
    2. Get Maximum from X2 or Y2
    3. Merge, newRange(Minimum, Maximum)
    4. Set i = j // so you skip this element merged on the iteration, saving time and memory as explained
  4. After iterating j, add newRange to the RESULTS array // after checking every element after the element you’re comparing, this means you return every merge available.
  5. Return array containing all newRanges
Solution complexity
We have a for loop inside another for loop:
for (i = 0; i < meetingsArray; i++){
for (j = i; j < meetingsArray-1; j++){
}
}
We have n + n-1 + n-2 + … + 1 =
Plus other calculations inside these two for loops, but they don’t add more than the complexity we have for these loops, which is O().

Better solution
If we first sorted the elements of the array from start times, we would have something like this:
[{x1,x2}, {y1,y2}, {z1,z2}]
where x1 <= y1 <= z1
After sorting this way, we would only need to iterate once through the array O(n) and merging the elements that have the same start time. And on another loop, check that, given [{x1, x2}, {y1, y2}], if x2 >= y1.
This means that the start time of the element you’re comparing to, is earlier than the end time of the element you’re comparing with.
An algorithm for this solution would be:
  1. Sort the given array
  2. From the sorted array, iterate through it and merge every element whose startTime is the same. Return newArray.
  3. Iterate through newArray, if the endTime of element1 is >= than startTime of element2, merge them.
  4. If you can’t merge them, this means you won’t be able to merge with any of the rest elements. Add element to RESULT.
  5. Continue iterating.
  6. Return RESULT.
Better solution complexity
The complexity to sort the elements by increasing order is O(n log n)
The complexity of iterating for the first time the sorted array is O(n)
The complexity of iterating the second time the sorted array is O(n)
So we have O(n log n) + O(2n) = O(n log n), which is bad, but still better than O().
Bonus
If we have our ranges with an upper bound to startTime and endTime, if we got an array of Meetings where half of them have an endTime that exceeds the limit, we can halve the computations on this scenario, it would vary from scenario to scenario but we can avoid doing computations that aren’t needed.
What we learned
This problem can easily be solved with a greedy approach, but this would have taken a lot of time for a very long array list. Given the problem specifications, even if you try to get a solution from iterating once from the array, there’s no way to do it.
The problem told us not to assume that the elements were sorted, this gave me the idea to sort them first, and see if now I can only iterate once through the array, to avoid the O() complexity.
Sorting the elements and then iterating the array twice, but at different times (not a loop inside a loop), let me reduce a little bit the complexity, which was a good finding.

No hay comentarios:

Publicar un comentario