Finding element pairs from two sets that have a predefined relation

I have two lists

list1 = ['a', 'b', 'c', 'd']
list2 = ['e', 'f', 'g', 'h']

I know from before that some of these elements are related through another list

ref_list = [
   ['d', 'f'], ['a', 'e'], ['b', 'g'], ['c', 'f'], ['a', 'g'],
   ['a', 'f'], ['b', 'e'], ['b', 'f'], ['c', 'e'], ['c', 'g']
]

I would like to quickly identify the two groups, one from list1 and one from list2, for which every possible pair [list1 element, list2 element] appears in ref_list. In this case the solution would be

[['a', 'b', 'c'], ['e', 'f', 'g']]
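
To make the requirement concrete, here is a minimal sketch of a check for it, assuming each entry of ref_list is ordered as [list1 element, list2 element]. The function name is_complete is hypothetical and only used for illustration.

```python
from itertools import product

def is_complete(group1, group2, ref_list):
    """Return True if every (group1 element, group2 element) pair is in ref_list."""
    # Tuples are hashable, so membership tests on the set are O(1) on average.
    ref_pairs = {tuple(pair) for pair in ref_list}
    return all(pair in ref_pairs for pair in product(group1, group2))

ref_list = [
    ['d', 'f'], ['a', 'e'], ['b', 'g'], ['c', 'f'], ['a', 'g'],
    ['a', 'f'], ['b', 'e'], ['b', 'f'], ['c', 'e'], ['c', 'g'],
]

print(is_complete(['a', 'b', 'c'], ['e', 'f', 'g'], ref_list))       # True
print(is_complete(['a', 'b', 'c', 'd'], ['e', 'f', 'g'], ref_list))  # False: ['d', 'e'] is missing
```

This only verifies a candidate answer; it does not find the groups.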

I can think of some ways to do this for such small lists, but I need help when list1, list2 and ref_list each have thousands of elements.

Set membership tests seem pretty fast.

import random
import string

# Random three-letter strings as test data.
list1 = [''.join(random.choice(string.ascii_letters) for _ in range(3)) for _ in range(9999)]
# len(list1) == 9999
list2 = [''.join(random.choice(string.ascii_letters) for _ in range(3)) for _ in range(9999)]
# len(list2) == 9999
ref_list = [[''.join(random.choice(string.ascii_letters) for _ in range(3)), ''.join(random.choice(string.ascii_letters) for _ in range(3))] for _ in range(9999)]
# len(ref_list) == 9999

refs1 = set([t[0] for t in ref_list])
# CPU times: user 2.45 ms, sys: 348 µs, total: 2.8 ms
# Wall time: 2.2 ms
# len(refs1) == 9656 for this run

refs2 = set([t[1] for t in ref_list])
# CPU times: user 658 µs, sys: 3.92 ms, total: 4.58 ms
# Wall time: 4.02 ms
# len(refs2) == 9676 for this run

list1_filtered = [v for v in list1 if v in refs1]
# CPU times: user 1.19 ms, sys: 4.34 ms, total: 5.53 ms
# Wall time: 3.76 ms
# len(list1_filtered) == 702 for this run

list2_filtered = [v for v in list2 if v in refs2]
# CPU times: user 3.05 ms, sys: 4.29 ms, total: 7.33 ms
# Wall time: 4.51 ms
# len(list2_filtered) == 697 for this run

You can add the elements from each pair in ref_list to two sets, set1 and set2, then use list1 = list(set1) and list2 = list(set2). Sets contain no duplicates, and this should be fast for thousands of elements because a membership test such as e in s1 takes O(1) time on average for sets.
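
A minimal sketch of that idea, using the small ref_list from the question. It uses sorted() instead of list() only to make the output order predictable; that choice is an assumption of this sketch, not part of the suggestion above.

```python
ref_list = [
    ['d', 'f'], ['a', 'e'], ['b', 'g'], ['c', 'f'], ['a', 'g'],
    ['a', 'f'], ['b', 'e'], ['b', 'f'], ['c', 'e'], ['c', 'g'],
]

# Collect the first and second element of every pair into separate sets.
set1 = {first for first, _ in ref_list}
set2 = {second for _, second in ref_list}

list1 = sorted(set1)
list2 = sorted(set2)

print(list1)  # ['a', 'b', 'c', 'd']
print(list2)  # ['e', 'f', 'g']
```

Note that this keeps every element that appears in at least one pair, so 'd' is still present even though ['d', 'e'] and ['d', 'g'] are not in ref_list.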

You can use collections.Counter to count how often each item occurs in each position of ref_list, then keep only the items of the two lists that occur more than once:

from collections import Counter
# Count first and second elements of ref_list separately, then keep items seen more than once.
[[i for i in lst if counts[i] > 1]
 for lst, counts in zip((list1, list2), map(Counter, zip(*ref_list)))]

This returns:

[['a', 'b', 'c'], ['e', 'f', 'g']]
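
The count filter yields a single pair of groups. When several group pairs are expected, one possible direction is to group the first elements by the exact set of second elements they are paired with. The sketch below is an illustration under that assumption, not a method proposed in this thread; group_by_partners is a hypothetical name. It only finds groups whose members share an identical partner set, and it works from ref_list alone rather than filtering by list1 and list2.

```python
from collections import defaultdict

def group_by_partners(ref_list):
    """Group first elements that are paired with exactly the same second elements."""
    # Map each first element to the set of second elements it appears with.
    partners = defaultdict(set)
    for first, second in ref_list:
        partners[first].add(second)

    # First elements with identical partner sets end up in the same group.
    groups = defaultdict(list)
    for first, seconds in partners.items():
        groups[frozenset(seconds)].append(first)

    # Sort everything so the result is deterministic.
    return sorted([sorted(firsts), sorted(seconds)] for seconds, firsts in groups.items())

ref_list = [
    ['d', 'f'], ['a', 'e'], ['b', 'g'], ['c', 'f'], ['a', 'g'],
    ['a', 'f'], ['b', 'e'], ['b', 'f'], ['c', 'e'], ['c', 'g'],
]

print(group_by_partners(ref_list))
# [[['a', 'b', 'c'], ['e', 'f', 'g']], [['d'], ['f']]]
```

Every pair within a returned group is in ref_list by construction, and the work is linear in the number of pairs apart from the final sorting. It does not enumerate every possible complete group; for example, [['a', 'b', 'c', 'd'], ['f']] is also complete but is not reported.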

Comments
  • Your expected output ['a','b','c'] and ['e','f','g'] does not cover the pair ['d','f'].
  • Good catch. Let me edit the question. Thanks
  • Thanks but that would not work. Unlike the example I gave, the solution for larger list1 and list2 would have multiple group pairs