Count duplicates between 2 lists

a = [1, 2, 9, 5, 1]
b = [9, 8, 7, 6, 5]

I want to count the number of duplicates between the two lists. So using the above, I want to return a count of 2 because 9 and 5 are common to both lists.

I tried something like this but it didn't quite work.

def filter_(x, y):
    count = 0
    for num in y:
        if num in x:
            count += 1
            return count

A shorter and better way:

>>> a = [1, 2, 9, 5, 1]
>>> b = [9, 8, 7, 6, 5]
>>> len(set(a) & set(b))     # & is intersection - elements common to both
2 

Why your code doesn't work (this is the corrected version):

>>> def filter_(x, y):
...     count = 0
...     for num in y:
...         if num in x:
...             count += 1
...     return count
...
>>> filter_(a, b)
2

Your return count was inside the for loop, so the function returned on the first match instead of waiting for the loop to finish.


You can use set.intersection:

>>> set(a).intersection(set(b)) # or just: set(a).intersection(b)
set([9, 5])

Or, for the length of the intersection:

>>> len(set(a).intersection(set(b)))
2

Or, more concise:

>>> len(set(a) & set(b))
2
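As a minimal sketch, the set-based approach can be wrapped in a function. The name count_common_values is hypothetical and not from the answers. Note that each shared value is counted once, however often it repeats in either list:

```python
def count_common_values(first, second):
    """Count the distinct values that appear in both iterables."""
    # set.intersection accepts any iterable, so only one set is built explicitly.
    return len(set(first).intersection(second))


a = [1, 2, 9, 5, 1]
b = [9, 8, 7, 6, 5]
print(count_common_values(a, b))                            # 2
print(count_common_values([1, 1, 2, 3], [1, 1, 1, 1, 1]))   # 1 (repeats collapse)
```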


If you want repeated entries to count more than once, the set-based solutions will not work; you will need something like

from collections import Counter

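def numDups(a, b):
    # Iterate over the counts of the shorter list
    if len(a) > len(b):
        a, b = b, a

    a_count = Counter(a)
    b_count = Counter(b)

    # A Counter returns 0 for missing keys; items() works on Python 2 and 3
    return sum(min(b_count[ak], av) for ak, av in a_count.items())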

then

numDups([1,1,2,3], [1,1,1,1,1])

returns 2. The running time scales as O(n + m), where n and m are the lengths of the two lists.

Also, your initial solution

for num in y:
    if num in x:
        count += 1

is wrong. Applied to [1,2,3,3] and [1,1,1,1,1,3], your code will return either 3 or 6, depending on the order of the arguments, and neither is correct (the answer should be 2).
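To make the asymmetry concrete, here is a small sketch that runs the loop in both argument orders. The function name naive_count and the variable names are hypothetical, and the last line computes the per-value minimum by hand for comparison:

```python
def naive_count(x, y):
    """The loop from the question, with the return dedented out of the loop."""
    count = 0
    for num in y:
        if num in x:
            count += 1
    return count


first = [1, 2, 3, 3]
second = [1, 1, 1, 1, 1, 3]

print(naive_count(first, second))  # 6: every item of `second` occurs in `first`
print(naive_count(second, first))  # 3: the 1 and both 3s of `first` occur in `second`

# Counting each shared value min(occurrences in first, occurrences in second) times:
expected = sum(min(first.count(v), second.count(v)) for v in set(first) & set(second))
print(expected)  # 2
```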


Convert them to sets and count the intersection.

 len(set(a).intersection(set(b)))


The following solution also accounts for duplicate elements in the list:

from collections import Counter

def number_of_duplicates(list_a, list_b):
    count_a = Counter(list_a)
    count_b = Counter(list_b)

    common_keys = set(count_a.keys()).intersection(count_b.keys())
    return sum(min(count_a[key], count_b[key]) for key in common_keys)

Then number_of_duplicates([1, 2, 2, 2, 3], [1, 2, 2, 4]) results in the expected 3.
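A shorter variant, offered as a sketch and not as either answerer's method: Counter objects support the & operator, which keeps the minimum count for each key, so the per-key min can be left to the library. The function name count_shared_with_multiplicity is hypothetical.

```python
from collections import Counter


def count_shared_with_multiplicity(list_a, list_b):
    """Count common elements, honouring how often each value repeats."""
    # Counter & Counter is multiset intersection: min(count_a[k], count_b[k]) per key.
    shared = Counter(list_a) & Counter(list_b)
    return sum(shared.values())


print(count_shared_with_multiplicity([1, 2, 2, 2, 3], [1, 2, 2, 4]))  # 3
print(count_shared_with_multiplicity([1, 1, 2, 3], [1, 1, 1, 1, 1]))  # 2
```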


Note that @Hugh Bothwell also provided a similar solution, but it sometimes throws KeyError if an element is only contained in the shorter list.


Comments
  • Notice that once it works (dedent the return twice), it has O(n * m) complexity, i.e. it scales pretty horribly.
  • @delnan thanks for the tip. so using intersection scales better.
  • Yes. You can actually do even better, but that requires more than one line of code (the idea is that you only need a set of the first list, then iterate over the second and keep the items that are in the set - saves creating a second set).
  • a contains 1 twice, if b contained 1 also, should the count be incremented by 1 or 2?
  • Christ, I keep making that same mistake. Thanks!
  • I don't want 9,5. I want a count of 2.
  • It is not necessary to explicitly make a set from list b, the set intersection method supports lists as inputs.
  • this solution is better because it counts the duplicates too, thanks :)
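The single-set idea mentioned in the comments above could look roughly like this. It is a sketch under the assumption that repeated items in the second list should each be counted; the name count_matches_single_set is hypothetical.

```python
def count_matches_single_set(first, second):
    """Build one set from `first`, then scan `second` once: O(n + m) on average."""
    seen = set(first)
    # Each item of `second` found in the set adds 1, so repeats in `second` count again.
    return sum(1 for item in second if item in seen)


a = [1, 2, 9, 5, 1]
b = [9, 8, 7, 6, 5]
print(count_matches_single_set(a, b))               # 2
print(count_matches_single_set([1, 2], [1, 1, 1]))  # 3 (repeats in the second list)
```

If repeats in the second list should not be counted again, the set-intersection or Counter approaches from the answers are the safer choice.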