For each label in one array set the first k occurrences to False in another array
I have two (sorted) arrays, A and B, of different lengths each containing unique labels that are repeated a number of times. The count for each label in A is less than or equal to that in B. All labels in A will be in B, but some labels in B do not appear in A.
I need an object the same length as B where, for each label i in A (which occurs k_i times), the first k_i occurrences of label i in B need to be set to False. The remaining elements should be True.
The following code gives me what I need, but if A and B are large, this can take a long time:
import numpy as np

# The labels and their frequency
A = np.array((1,1,2,2,3,4,4,4))
B = np.array((1,1,1,1,1,2,2,3,3,4,4,4,4,4,5,5))

A_uniq, A_count = np.unique(A, return_counts = True)
new_ind = np.ones(B.shape, dtype = bool)
for i in range(len(A_uniq)):
    # np.where returns a tuple; take its first element before slicing to the first k_i positions
    new_ind[np.where(B == A_uniq[i])[0][:A_count[i]]] = False

print(new_ind)
#[False False  True  True  True False False False  True False False False
#  True  True  True  True]
Is there a faster or more efficient way to do this? I feel like I may be missing some obvious broadcasting or vectorized solution.
Here's one with np.searchsorted -
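# Start index of each label's run in B
idx = np.searchsorted(B, A_uniq)
# One extra slot so that A_count+idx == len(B) stays in bounds
id_ar = np.zeros(len(B)+1,dtype=int)
id_ar[idx] = 1
id_ar[A_count+idx] -= 1
# The running sum is 1 inside the first k_i occurrences and 0 elsewhere
out = id_ar.cumsum()[:-1]==0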
We can optimize further by computing A_uniq, A_count from the sorted nature of A instead of using np.unique, like so -
# True at the start of each run of equal labels, plus a sentinel at the end
mask_A = np.r_[True,A[:-1]!=A[1:],True]
A_uniq, A_count = A[mask_A[:-1]], np.diff(np.flatnonzero(mask_A))
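For reference, the two snippets above can be combined into one self-contained function. This is only a sketch: the name `first_k_mask` is made up for illustration, and it assumes that both inputs are sorted and that every label in A occurs in B at least as often as in A. It allocates one extra slot in the helper array so that a label whose occurrences run to the very end of B does not index out of bounds.

```python
import numpy as np

def first_k_mask(A, B):
    """Hypothetical helper: boolean mask over B that is False at the first
    k_i occurrences of each label i of A, and True elsewhere.
    Assumes A and B are sorted and A's label counts do not exceed B's."""
    A = np.asarray(A)
    B = np.asarray(B)
    if A.size == 0:
        # Nothing to mark, so every element stays True
        return np.ones(B.shape, dtype=bool)
    # Run starts in sorted A, plus a sentinel at the end
    mask_A = np.r_[True, A[:-1] != A[1:], True]
    A_uniq = A[mask_A[:-1]]
    A_count = np.diff(np.flatnonzero(mask_A))
    # Start index of each label's run in B
    idx = np.searchsorted(B, A_uniq)
    # +1 at each run start, -1 just past the k_i-th occurrence
    id_ar = np.zeros(len(B) + 1, dtype=int)
    id_ar[idx] = 1
    id_ar[idx + A_count] -= 1
    return id_ar.cumsum()[:-1] == 0

A = np.array((1, 1, 2, 2, 3, 4, 4, 4))
B = np.array((1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5))
print(first_k_mask(A, B))
```

The cost is dominated by `np.searchsorted` on the unique labels and a single cumulative sum over B, so there is no Python-level loop over labels.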
Example without numpy
A = [1,1,2,2,3,4,4,4]
B = [1,1,1,1,1,2,2,3,3,4,4,4,4,4,5,5]

a_i = b_i = 0
while a_i < len(A):
    if A[a_i] == B[b_i]:
        # label matched: consume one element of A and mark this position in B
        a_i += 1
        B[b_i] = False
    else:
        B[b_i] = True
    b_i += 1
# fill the rest of B with True
B[b_i:] = [True] * (len(B) - b_i)
# [False, False, True, True, True, False, False, False, True, False, False, False, True, True, True, True]
This solution is inspired by the one by @Divakar, using itertools.groupby:
import numpy as np
from itertools import groupby

A = np.array((1, 1, 2, 2, 3, 4, 4, 4))
B = np.array((1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5))

# key is the insertion position in B; i is the offset within the run of equal labels
indices = [key + i for key, group in groupby(np.searchsorted(B, A)) for i, _ in enumerate(group)]
# use the builtin bool: the np.bool alias was removed in NumPy 1.24
result = np.ones_like(B, dtype=bool)
result[indices] = False
print(result)
[False False True True True False False False True False False False True True True True]
The idea is to use np.searchsorted to find the insertion position of each element of A. Equal elements have the same insertion position, so each one has to be shifted by its offset within its group, hence the groupby. Then create an array of True and set the values at the computed indices to False.
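The same shift can also be computed without groupby, as a rough sketch: for a sorted A, the offset of an element within its run of equal labels is its own position minus the position of the first element of that run, which `np.searchsorted(A, A)` returns. The names `pos` and `rank` are illustrative only, and the snippet assumes both arrays are sorted as in the question.

```python
import numpy as np

A = np.array((1, 1, 2, 2, 3, 4, 4, 4))
B = np.array((1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5))

# Insertion position in B of each element of A (start of that label's run in B)
pos = np.searchsorted(B, A)
# Offset of each element within its own run in A: 0, 1, 2, ... per label
rank = np.arange(len(A)) - np.searchsorted(A, A)
indices = pos + rank

result = np.ones(B.shape, dtype=bool)
result[indices] = False
print(result)
```

Here `indices` holds exactly one position in B for every element of A, so the number of False entries equals `len(A)`.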
If you can use pandas, compute the indices like this:
import pandas as pd

values = np.searchsorted(B, A)
# cumcount numbers the elements within each group of equal insertion positions
indices = pd.Series(values).groupby(values).cumcount() + values