MySQL: Efficient way of computing the set cardinalities for a Venn diagram

Given 4 tables, each containing items and representing one set, how can I get the count of the items in each compartment required to draw a Venn diagram as shown below? The calculation should take place on the MySQL server, so that no items have to be transmitted to the application server.

Example tables:

s1:         s2:         s3:         s4:
+------+    +------+    +------+    +------+
| item |    | item |    | item |    | item |
+------+    +------+    +------+    +------+
| a    |    | a    |    | a    |    | a    |
+------+    +------+    +------+    +------+
| b    |    | b    |    | b    |    | c    |
+------+    +------+    +------+    +------+
| c    |    | c    |    | d    |    | d    |
+------+    +------+    +------+    +------+
| d    |    | e    |    | e    |    | e    |
+------+    +------+    +------+    +------+
| ...  |    | ...  |    | ...  |    | ...  |

I think I need to calculate some set cardinalities. Treating each table sx as a set, with I corresponding to s1, II to s2, III to s3 and IV to s4 in the diagram, some examples are:

  1. |s1 ∩ s2 ∩ s3 ∩ s4| - the white 25 in the center
  2. |(s1 ∩ s2 ∩ s4) \ s3| - the white 15 to the lower right of the center
  3. |(s1 ∩ s4) \ (s2 ∪ s3)| - the white 5 on the bottom
  4. |s1 \ (s2 ∪ s3 ∪ s4)| - the dark blue 60 on the blue ground
  5. ... till 15.

How can I calculate those cardinalities efficiently on the MySQL server? Does MySQL provide a function that helps with the calculation?

A naive approach would be running a query for 1.

SELECT count(*) FROM (
    SELECT item FROM s1
    INTERSECT
    SELECT item FROM s2
    INTERSECT
    SELECT item FROM s3
    INTERSECT
    SELECT item FROM s4
) AS t;

and another query for 2.

SELECT count(*) FROM (
    SELECT item FROM s1
    INTERSECT
    SELECT item FROM s2
    INTERSECT
    SELECT item FROM s4
    EXCEPT
    SELECT item FROM s3
) AS t;

and so on, resulting in 15 queries.

Try something like this:

with universe as (
    select * from s1 
    union
    select * from s2
    union
    select * from s3
    union
    select * from s4
),
regions as (
    select
        case when s1.item is null then '0' else '1' end
        ||
        case when s2.item is null then '0' else '1' end
        ||
        case when s3.item is null then '0' else '1' end
        ||
        case when s4.item is null then '0' else '1' end as Region
    from universe u
    left join s1 on u.item = s1.item
    left join s2 on u.item = s2.item
    left join s3 on u.item = s3.item
    left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region

Disclaimer: I only tested this in SQLite. You might need to SET sql_mode='PIPES_AS_CONCAT' for the ANSI string concatenation to work in MySQL, or use the concat function instead. The WITH syntax is only supported starting from version 8.0 of MySQL, but you can use temporary tables or nested queries appropriately instead.

If the sets are very large you might want to index the item column before querying in case the SQL optimizer won't figure it out by itself.
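
To check the query against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module (SQLite, because that is where the query was tested). The helper name region_counts and the PRIMARY KEY on item are assumptions made for this illustration. The primary key gives each table an index on item and also guarantees that no table contains duplicates.

```python
import sqlite3

# Rows copied from the example tables in the question.
SAMPLE = {
    "s1": ["a", "b", "c", "d"],
    "s2": ["a", "b", "c", "e"],
    "s3": ["a", "b", "d", "e"],
    "s4": ["a", "c", "d", "e"],
}

# Same query as above; '||' is string concatenation in SQLite.
REGION_QUERY = """
with universe as (
    select item from s1
    union
    select item from s2
    union
    select item from s3
    union
    select item from s4
),
regions as (
    select
        case when s1.item is null then '0' else '1' end
        || case when s2.item is null then '0' else '1' end
        || case when s3.item is null then '0' else '1' end
        || case when s4.item is null then '0' else '1' end as Region
    from universe u
    left join s1 on u.item = s1.item
    left join s2 on u.item = s2.item
    left join s3 on u.item = s3.item
    left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region
"""


def region_counts(sets):
    """Load the sets into an in-memory database and count items per region."""
    conn = sqlite3.connect(":memory:")
    for name, items in sets.items():
        # Hypothetical schema: one text column, indexed via the primary key.
        conn.execute(f"create table {name} (item text primary key)")
        conn.executemany(f"insert into {name} values (?)", [(i,) for i in items])
    counts = dict(conn.execute(REGION_QUERY).fetchall())
    conn.close()
    return counts


counts = region_counts(SAMPLE)
for region, n in sorted(counts.items(), reverse=True):
    print(region, n)
```

The uniqueness matters: if a table listed the same item twice, the left join would produce two rows for that item and the region count would be inflated.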

I ended up with the following procedure:

  1. Create a stored procedure that creates temporary in-memory tables containing the sets.
  2. Keep in mind that MySQL does not allow you to refer to a temporary table more than once in the same query.
  3. As noted, MySQL does not have INTERSECT or EXCEPT, but you can emulate them. If you remove duplicates from your raw data (the raw sets) first, the emulation becomes even simpler.
  4. Store each computed value in its own variable and output a table consisting of all 15 values, one per compartment.
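
The emulation mentioned in step 3 can be sketched as follows for compartment 2, |(s1 ∩ s2 ∩ s4) \ s3|. This is an illustration under assumptions, not the code from the procedure: it runs in SQLite through Python's sqlite3 module so that the join-based form can be compared with the native operators, and it assumes that no table contains duplicate items. INTERSECT becomes an inner join, and EXCEPT becomes a left join that keeps only the rows without a match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
sample = {
    "s1": ["a", "b", "c", "d"],
    "s2": ["a", "b", "c", "e"],
    "s3": ["a", "b", "d", "e"],
    "s4": ["a", "c", "d", "e"],
}
for name, items in sample.items():
    conn.execute(f"create table {name} (item text primary key)")
    conn.executemany(f"insert into {name} values (?)", [(i,) for i in items])

# Native set operators, used here only as the reference result.
native_sql = """
select count(*) from (
    select item from s1
    intersect
    select item from s2
    intersect
    select item from s4
    except
    select item from s3
) as t
"""

# Join-based emulation; correct only if items are unique within each table.
emulated_sql = """
select count(*)
from s1
join s2 on s2.item = s1.item            -- s1 intersect s2
join s4 on s4.item = s1.item            -- ... intersect s4
left join s3 on s3.item = s1.item       -- ... except s3
where s3.item is null
"""

native_count = conn.execute(native_sql).fetchone()[0]
emulated_count = conn.execute(emulated_sql).fetchone()[0]
print(native_count, emulated_count)
conn.close()
```

Each table is referenced only once in the emulated query, so the form is compatible with the temporary-table restriction from step 2.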

What I have come up with so far: https://gist.github.com/Rillke/c2da0921f8f2a047615f41fab8781c11

The question is a little complex, and so are the answers. Let me explain K.T.'s answer:

with universe as (
    select * from s1 
    union
    select * from s2
    union
    select * from s3
    union
    select * from s4
),
regions as (
    select
        case when s1.item is null then '0' else '1' end
        ||
        case when s2.item is null then '0' else '1' end
        ||
        case when s3.item is null then '0' else '1' end
        ||
        case when s4.item is null then '0' else '1' end as Region
    from universe u
    left join s1 on u.item = s1.item
    left join s2 on u.item = s2.item
    left join s3 on u.item = s3.item
    left join s4 on u.item = s4.item
)
select Region, count(*) from regions group by Region

The universe results in the UNION of all tables (duplicates eliminated), something like

+------+
| item |
+------+
| a    |
+------+
| b    |
+------+
| c    |
+------+
| d    |
+------+
| e    |
+------+
| ...  |
+------+

Then s1, s2, s3 and s4 are left-joined to the universe, so a table that does not contain the item contributes NULL

+------+---------+---------+---------+---------+
| item | s1.item | s2.item | s3.item | s4.item |
+------+---------+---------+---------+---------+
| a    | a       | a       | a       | a       |
+------+---------+---------+---------+---------+
| b    | b       | b       | b       | NULL    |
+------+---------+---------+---------+---------+
| c    | c       | c       | NULL    | c       |
+------+---------+---------+---------+---------+
| d    | d       | NULL    | d       | d       |
+------+---------+---------+---------+---------+
| e    | NULL    | e       | e       | e       |
+------+---------+---------+---------+---------+
| ...  | ...     | ...     | ...     | ...     |
+------+---------+---------+---------+---------+

and converted to a binary string (0: if cell is NULL; 1: else) called Region where the first digit corresponds to s1, the second to s2 and so on

+------+--------+
| item | Region |
+------+--------+
| a    | 1111   |
+------+--------+
| b    | 1110   |
+------+--------+
| c    | 1101   |
+------+--------+
| d    | 1011   |
+------+--------+
| e    | 0111   |
+------+--------+
| ...  | ...    |
+------+--------+

and finally aggregated and grouped by Region

+--------+-------+
| Region | count |
+--------+-------+
| 1111   | 1     |
+--------+-------+
| 1110   | 1     |
+--------+-------+
| 1101   | 1     |
+--------+-------+
| 1011   | 1     |
+--------+-------+
| 0111   | 1     |
+--------+-------+
| ...    |       |
+--------+-------+

Note that regions containing no items do not show up in the result, and 0000 never will (it would mean an item that is in none of s1, s2, s3, s4), so the result has at most 15 rows, one per region of the diagram.
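
Since empty regions are missing from the result, the application still has to fill them in before drawing all 15 compartments. A minimal sketch of that post-processing in Python follows; the function names all_regions and describe are made up for this illustration, and the input dictionary stands for the rows returned by the query.

```python
from itertools import product


def all_regions(counts, n_sets=4):
    """Return a count for every non-empty bit pattern, using 0 where the query had no row."""
    codes = ("".join(bits) for bits in product("10", repeat=n_sets))
    return {code: counts.get(code, 0) for code in codes if "1" in code}


def describe(code):
    """Translate a region code such as '1101' into set notation."""
    inside = [f"s{i}" for i, bit in enumerate(code, start=1) if bit == "1"]
    outside = [f"s{i}" for i, bit in enumerate(code, start=1) if bit == "0"]
    expr = " ∩ ".join(inside)
    if outside:
        expr = f"({expr}) \\ ({' ∪ '.join(outside)})"
    return expr


# Result rows of the query for the sample data, as region -> count.
query_result = {"1111": 1, "1110": 1, "1101": 1, "1011": 1, "0111": 1}

full = all_regions(query_result)
for code, n in full.items():
    print(code, n, describe(code))
```

Only the 15 counts travel from the database to the application here, which matches the requirement in the question that the items themselves stay on the server.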

Comments
  • If someone tells me convincingly it would be a lot easier to do it with Postgres, I would change the question accordingly. It should probably read "Open Source DBMS: ..." but that's too broad for SO.
  • MySQL has no INTERSECT or EXCEPT, so you could use another RDBMS that provides these features.
  • @MadhurBhaiya Wasn't aware of that. MariaDB introduced set operations with 10.3.
  • Current solution: gist.github.com/Rillke/c2da0921f8f2a047615f41fab8781c11