For each label in one array set the first k occurrences to False in another array
I have two (sorted) arrays, A and B, of different lengths each containing unique labels that are repeated a number of times. The count for each label in A is less than or equal to that in B. All labels in A will be in B, but some labels in B do not appear in A.
I need an object the same length as B where, for each label i in A (which occurs k_i times), the first k_i occurrences of label i in B are set to False. The remaining elements should be True.
The following code gives me what I need, but if A and B are large, this can take a long time:
import numpy as np

# The labels and their frequency
A = np.array((1,1,2,2,3,4,4,4))
B = np.array((1,1,1,1,1,2,2,3,3,4,4,4,4,4,5,5))

A_uniq, A_count = np.unique(A, return_counts = True)
new_ind = np.ones(B.shape, dtype = bool)
for i in range(len(A_uniq)):
    new_ind[np.where(B == A_uniq[i])[0][:A_count[i]]] = False

print(new_ind)
#[False False True True True False False False True False False False
# True True True True]
Is there a faster or more efficient way to do this? I feel like I may be missing some obvious broadcasting or vectorized solution.
Here's one with np.searchsorted

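# first position of each label of A within the sorted B
idx = np.searchsorted(B, A_uniq)
id_ar = np.zeros(len(B),dtype=int)
# +1 where a run to clear starts, -1 just past its last element
id_ar[idx] = 1
id_ar[A_count+idx] -= 1
# the running sum is nonzero exactly on the positions to set to False
out = id_ar.cumsum()==0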
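One way to read the searchsorted approach: put +1 where each label's run starts in B and -1 just past its first k_i occurrences, so the running sum is nonzero exactly on the positions to clear. Below is a minimal sketch of that idea wrapped in a function. The name first_k_mask, the use of np.add.at, and the extra slot at the end are assumptions added here, not part of the answer above; the extra slot is meant to cover the case where the last label of B is fully consumed, so that start + count equals len(B).

```python
import numpy as np

def first_k_mask(A, B):
    """Boolean mask over B: False on the first k_i occurrences of each label i in A.

    Assumes A and B are sorted 1-D sequences and that every label in A
    occurs in B at least as many times as it does in A.
    """
    A = np.asarray(A)
    B = np.asarray(B)
    A_uniq, A_count = np.unique(A, return_counts=True)
    start = np.searchsorted(B, A_uniq)      # first position of each label in B
    # One extra slot keeps start + A_count in bounds when it equals len(B).
    steps = np.zeros(len(B) + 1, dtype=int)
    np.add.at(steps, start, 1)              # a run to clear begins here
    np.add.at(steps, start + A_count, -1)   # and ends just before here
    return steps.cumsum()[:-1] == 0

A = np.array((1, 1, 2, 2, 3, 4, 4, 4))
B = np.array((1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5))
print(first_k_mask(A, B))
```

On the arrays from the question this should reproduce the mask printed by the original loop.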
We can optimize further by computing A_uniq and A_count from the sorted nature of A instead of using np.unique, like so:

# True at the start of each run of equal labels, plus a sentinel at the end
mask_A = np.r_[True,A[:-1]!=A[1:],True]
A_uniq, A_count = A[mask_A[:-1]], np.diff(np.flatnonzero(mask_A))
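As a quick sanity check of the run-boundary trick, the sketch below compares it against np.unique on the array from the question. The helper name sorted_unique_counts is hypothetical, and the sketch assumes a sorted, non-empty 1-D array.

```python
import numpy as np

def sorted_unique_counts(A):
    """Unique values and run lengths of a sorted, non-empty 1-D array."""
    A = np.asarray(A)
    # True at the start of every run, plus a sentinel True one past the end.
    boundaries = np.r_[True, A[:-1] != A[1:], True]
    uniq = A[boundaries[:-1]]
    # Distances between consecutive run starts are the run lengths.
    counts = np.diff(np.flatnonzero(boundaries))
    return uniq, counts

A = np.array((1, 1, 2, 2, 3, 4, 4, 4))
uniq, counts = sorted_unique_counts(A)
ref_uniq, ref_counts = np.unique(A, return_counts=True)
print(np.array_equal(uniq, ref_uniq), np.array_equal(counts, ref_counts))
# True True
```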
Example without numpy
A = [1,1,2,2,3,4,4,4]
B = [1,1,1,1,1,2,2,3,3,4,4,4,4,4,5,5]

# walk both sorted lists in step; note that B is overwritten in place
a_i = b_i = 0
while a_i < len(A):
    if A[a_i] == B[b_i]:
        a_i += 1
        B[b_i] = False
    else:
        B[b_i] = True
    b_i += 1
# fill the rest of B with True
B[b_i:] = [True] * (len(B) - b_i)
# [False, False, True, True, True, False, False, False, True, False, False, False, True, True, True, True]
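If overwriting B is not wanted, the same two-pointer walk can write into a separate list instead. This is only a sketch under the question's assumptions (both lists sorted, every label of A present in B often enough); the function name first_k_mask_py is hypothetical.

```python
def first_k_mask_py(A, B):
    """Return a new list of booleans; B itself is left untouched."""
    mask = [True] * len(B)
    a_i = 0
    for b_i, label in enumerate(B):
        if a_i == len(A):
            break               # every element of A has been matched
        if A[a_i] == label:
            mask[b_i] = False   # this occurrence in B is consumed by A
            a_i += 1
    return mask

A = [1, 1, 2, 2, 3, 4, 4, 4]
B = [1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5]
print(first_k_mask_py(A, B))
```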
This solution is inspired by the one by @Divakar, using itertools.groupby:
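import numpy as np
from itertools import groupby

A = np.array((1, 1, 2, 2, 3, 4, 4, 4))
B = np.array((1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5))

# equal labels share an insertion position; offset each by its rank in the group
indices = [key + i for key, group in groupby(np.searchsorted(B, A)) for i, _ in enumerate(group)]
# the np.bool alias was removed in recent NumPy releases; the builtin bool works everywhere
result = np.ones_like(B, dtype=bool)
result[indices] = False

print(result)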
Output
[False False True True True False False False True False False False True True True True]
The idea is to use np.searchsorted to find the insertion position of each element of A in B. Equal elements get the same insertion position, so each one has to be shifted by its rank within its group (0, 1, 2, ...), hence the groupby. Then create an array of True and set the values at those indices to False.
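The per-group shift can also be computed without a Python-level loop. The sketch below is one possible variant, not the method of the answer above: since A is sorted, np.searchsorted(A, A) gives the index of the first occurrence of each label in A, and subtracting it from the element's own index yields its rank within the group.

```python
import numpy as np

A = np.array((1, 1, 2, 2, 3, 4, 4, 4))
B = np.array((1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5))

# Insertion position in B of every element of A (equal labels share one).
pos = np.searchsorted(B, A)
# Rank of each element within its run of equal labels in A: 0, 1, 2, ...
rank = np.arange(len(A)) - np.searchsorted(A, A)

result = np.ones(len(B), dtype=bool)
result[pos + rank] = False
print(result)
```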
If you can use pandas, compute the indices like this:

import pandas as pd

values = np.searchsorted(B, A)
# cumcount numbers the members of each group 0, 1, 2, ...
indices = pd.Series(values).groupby(values).cumcount() + values