Median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle value.For example,
[2,3,4], the median is 3
[2,3], the median is (2 + 3) / 2 = 2.5
Design a data structure that supports the following two operations:
- void addNum(int num) – Add a integer number from the data stream to the data structure.
- double findMedian() – Return the median of all elements so far.
Example:
addNum(1) addNum(2) findMedian() -> 1.5 addNum(3) findMedian() -> 2
Follow up:
- If all integer numbers from the stream are between 0 and 100, how would you optimize it?
- If 99% of all integer numbers from the stream are between 0 and 100, how would you optimize it?
[code lang="java"]
class MedianFinder {
/** initialize your data structure here. */
public MedianFinder() {
}
public void addNum(int num) {
}
public double findMedian() {
}
}
/**
* Your MedianFinder object will be instantiated and called as such:
* MedianFinder obj = new MedianFinder();
* obj.addNum(num);
* double param_2 = obj.findMedian();
*/
[code]
Idea – 1
We could use a list, addNum() takes time
[code lang="java"]
class MedianFinder {
List<Integer> data;
/** initialize your data structure here. */
public MedianFinder() {
data = new ArrayList<>();
}
public void addNum(int num) {
data.add(num);
}
public double findMedian() {
int n = data.size();
Collections.sort(data);
// assume n > 0
if(n%2 == 1)
{
return data.get(n/2);
}
else
{
// assume no overflow or underflow
return (data.get(n/2-1)+data.get(n/2))*0.5;
}
}
}
[code]
Runtime: 483 ms, faster than 7.04% of Java online submissions for Find Median from Data Stream.Memory Usage: 65.4 MB, less than 66.73% of Java online submissions for Find Median from Data Stream.
Idea – 2
In findMedian() we could use the quicksort partitioning idea to find an element of rank x, which should make the average runtime
[code lang="java"]
class MedianFinder {
List<Integer> data;
static final Random rng = new Random( System.currentTimeMillis()%Integer.MAX_VALUE );
/** initialize your data structure here. */
public MedianFinder() {
data = new ArrayList<>();
}
public void addNum(int num) {
data.add(num);
}
public double findMedian() {
int n = data.size();
int rightRank = n/2;
double rightElement = select(rightRank, 0, n-1);
if(n%2==1)
{
return rightElement;
}
else
{
double leftElement = select(rightRank-1, 0, rightRank);
return (leftElement+rightElement)*0.5;
}
}
private int select(int rank, int lo, int hi)
{
if(lo==hi)
{
return data.get(lo);
}
int pi = lo + rng.nextInt(hi-lo+1);
int pivot = data.get(pi);
int i = lo-1, k = lo;
int j = hi;
while(k <= j)
{
if(data.get(k) > pivot)
{
Collections.swap(data, k, j--);
}
else if(data.get(k) < pivot)
{
Collections.swap(data, ++i, k++);
}
else
{
++k;
}
}
int pivotRank = i+1;
if(pivotRank==rank)
{
return pivot;
}
else if(pivotRank < rank)
{
return select(rank, i+2, hi);
}
else
{
return select(rank, lo, i);
}
}
}
[code]
Unfortunately that one had time limit.
Idea – 3
We could use a binary search tree to get both addNum and findMedian run in time proportional to height or for random input on average
[code lang="java"]
class BST
{
class Node
{
int key;
int size;
Node left, right;
public Node(int key)
{
this.key = key;
this.size = 1;
}
}
Node root;
public int size()
{
return size(root);
}
private int size(Node x)
{
return x==null ? 0 : x.size;
}
public int select(int rank)
{
return select(root, rank);
}
private int select(Node x, int rank)
{
int leftSize = size(x.left);
if(leftSize==rank)
{
return x.key;
}
else if(rank < leftSize)
{
return select(x.left, rank);
}
else
{
return select(x.right, rank-leftSize-1);
}
}
public void put(int key)
{
root = put(root, key);
}
private Node put(Node x, int key)
{
if(x==null)
{
return new Node(key);
}
if(x.key <= key)
{
x.left = put(x.left, key);
}
else if(x.key > key)
{
x.right = put(x.right, key);
}
x.size = size(x.left)+size(x.right)+1;
return x;
}
}
class MedianFinder {
BST data;
/** initialize your data structure here. */
public MedianFinder() {
data = new BST();
}
public void addNum(int num) {
data.put(num);
}
public double findMedian() {
int n = data.size();
double rightElement = data.select(n/2);
if(n%2==1)
{
return rightElement;
}
double leftElement = data.select(n/2-1);
return (leftElement+rightElement)*0.5;
}
}
[code]
Runtime: 112 ms, faster than 77.72% of Java online submissions for Find Median from Data Stream.Memory Usage: 57.8 MB, less than 87.98% of Java online submissions for Find Median from Data Stream.
Idea – 4
If we can keep two groups: smaller
[code lang="java"]
class MedianFinder {
PriorityQueue<Integer> leftq;
PriorityQueue<Integer> rightq;
/** initialize your data structure here. */
public MedianFinder() {
leftq = new PriorityQueue<Integer>(new Comparator<Integer>()
{
public int compare(Integer a, Integer b)
{
return b-a;
}
});
rightq = new PriorityQueue<>();
}
public void addNum(int num) {
if(leftq.size() == rightq.size())
{
leftq.offer(num);
}
else
{
rightq.offer(num);
}
if(!rightq.isEmpty())
{
leftq.offer(rightq.poll());
rightq.offer(leftq.poll());
}
}
public double findMedian() {
int n = leftq.size()+rightq.size();
if(n%2==1)
{
return leftq.peek();
}
else
{
return (leftq.peek()+rightq.peek())*0.5;
}
}
}
[code]
Runtime: 124 ms, faster than 39.99% of Java online submissions for Find Median from Data Stream.Memory Usage: 65.6 MB, less than 66.73% of Java online submissions for Find Median from Data Stream.