O(n log n)
, therefore execution time 1 , 2 depends on it. def nlogn_median(l): l = sorted(l) if len(l) % 2 == 1: return l[len(l) / 2] else: return 0.5 * (l[len(l) / 2 - 1] + l[len(l) / 2])
lesser_els
great_els
lesser_els
, recursively lesser_els
list lesser_els
for the k- lesser_els
element.lesser_els
less than k elements, recursively greater_els
list. Instead of searching for k, we look for k-len(lesser_els)
. . . l = [9,1,0,2,3,4,6,8,7,10,5] len(l) == 11, (pivot). 3. 2. pivot: [1,0,2], [9,3,4,6,8,7,10,5] . 6-len(left) = 3, : [9,3,4,6,8,7,10,5] , pivot. 2, l[2]=6 pivot: [3,4,5,6] [9,7,10] , , : [3,4,5,6] , pivot. 1, l[1]=4 pivot: [3,4] [5,6] , , . : [5,6] , . , 5. return 5
quickselect_median
function will call quickselect
with the necessary indices. import random def quickselect_median(l, pivot_fn=random.choice): if len(l) % 2 == 1: return quickselect(l, len(l) / 2, pivot_fn) else: return 0.5 * (quickselect(l, len(l) / 2 - 1, pivot_fn) + quickselect(l, len(l) / 2, pivot_fn)) def quickselect(l, k, pivot_fn): """ k- l ( ) :param l: :param k: :param pivot_fn: pivot, :return: k- l """ if len(l) == 1: assert k == 0 return l[0] pivot = pivot_fn(l) lows = [el for el in l if el < pivot] highs = [el for el in l if el > pivot] pivots = [el for el in l if el == pivot] if k < len(lows): return quickselect(lows, k, pivot_fn) elif k < len(lows) + len(pivots): # return pivots[0] else: return quickselect(highs, k - len(lows) - len(pivots), pivot_fn)
O(n)
. Let's prove it.O(n)
. “Medium” in this context means that, on average, the algorithm will be executed in O(n)
. From a technical point of view, we may be very unlucky: at each step we can choose the largest element as the pivot. At each stage we will be able to get rid of one element from the list, and as a result we will get the speed O(n^2)
, but not O(n)
.O(n)
when used together with quickselect. This algorithm was developed in 1973 by Bloom (Blum), Floyd (Floyd), Pratt (Pratt), Rivest (Rivest) and Tarjan (Tarjan). If my explanation is not enough for you, you can study their 1973 article . Instead of describing the algorithm, I will comment on my implementation in Python in detail: def pick_pivot(l): """ pivot l O(n). """ assert len(l) > 0 # < 5, if len(l) < 5: # . # , # # . return nlogn_median(l) # l 5 . O(n) chunks = chunked(l, 5) # , . O(n) full_chunks = [chunk for chunk in chunks if len(chunk) == 5] # . , # . n/5 , # O(n) sorted_groups = [sorted(chunk) for chunk in full_chunks] # 2 medians = [chunk[2] for chunk in sorted_groups] # , , , # O(n). # n/5, O(n) # pick_pivot pivot # quickselect. O(n) median_of_medians = quickselect_median(medians, pick_pivot) return median_of_medians def chunked(l, chunk_size): """ `l` `chunk_size`.""" return [l[i:i + chunk_size] for i in range(0, len(l), chunk_size)]
3/5*1/2=3/10
. Therefore, at each stage, we get rid of at least 30% of the rows.T(n)
:O(n)
. A quick solution is to rely on the basic theorem on recurrence relations . We find ourselves in the third case of the theorem, in which work at each level dominates the work of subtasks. In this case, the total work will simply be equal to the work at each level, that is, O(n)
.O(n)
algorithm for selecting a reference element (which is good enough for quickselect). Combining them, we got an algorithm for finding the median (or the nth element in the list) in linear time!C++
uses an algorithm called introselect , in which the combination heapselect and quickselect is applied; the limit of its execution is O(n log n)
. Introselect usually allows using a fast algorithm with a bad upper limit in combination with an algorithm that is slower in practice, but has a good upper limit. Implementations start with a fast algorithm, but return to a slower one if they cannot select efficient support elements.Source: https://habr.com/ru/post/346930/
All Articles