How can I normalise MySQL tables with duplicate values
I have two tables, fruits and colors. In the fruits table, the cid column references the c_id column of the colors table, but the problem is that the colors table has duplicate color names.
Is there an effective way in MySQL to remove the duplicate color rows and update the cid in the fruits table accordingly?
Assuming that there is a foreign key constraint between the tables, you first need to update fruit. For this, you can join the tables to get the color name, and then retrieve the minimum c_id of that color using a correlated subquery:
update fruit f inner join color c on f.cid = c.c_id set f.cid = (select min(c1.c_id) from color c1 where c1.c_name = c.c_name)
Then you can safely delete the duplicate colors while keeping the one with the lowest c_id:
delete c from color c inner join color c1 on c1.c_name = c.c_name and c1.c_id < c.c_id
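The two steps above can be tried out on a throwaway database before touching real data. The following is only a sketch: it uses Python's built-in sqlite3 module with an in-memory database instead of MySQL, the sample rows are invented for illustration, and because SQLite does not accept MySQL's multi-table UPDATE ... JOIN and DELETE ... JOIN forms, both statements are rewritten with plain subqueries that express the same logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE color (c_id INTEGER PRIMARY KEY, c_name TEXT);
    CREATE TABLE fruit (f_id INTEGER PRIMARY KEY, f_name TEXT, cid INTEGER);
    -- Invented sample data: 'red' and 'yellow' each appear twice.
    INSERT INTO color VALUES (1,'red'),(2,'yellow'),(3,'red'),(4,'yellow'),(5,'green');
    INSERT INTO fruit VALUES (1,'apple',3),(2,'banana',4),(3,'cherry',1),(4,'lemon',2);
""")

# Step 1: repoint every fruit at the lowest c_id that shares its color name.
conn.execute("""
    UPDATE fruit
    SET cid = (
        SELECT MIN(c1.c_id)
        FROM color AS c
        JOIN color AS c1 ON c1.c_name = c.c_name
        WHERE c.c_id = fruit.cid
    )
""")

# Step 2: delete every color row that has a same-named row with a lower c_id.
conn.execute("""
    DELETE FROM color
    WHERE EXISTS (
        SELECT 1 FROM color AS c1
        WHERE c1.c_name = color.c_name AND c1.c_id < color.c_id
    )
""")

fruits = conn.execute("SELECT f_name, cid FROM fruit ORDER BY f_id").fetchall()
colors = conn.execute("SELECT c_id, c_name FROM color ORDER BY c_id").fetchall()
print(fruits)
print(colors)
```

The order matters: the fruit rows are repointed first, so no fruit still references a color row at the moment it is deleted.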
You could get yourself a result set that has the minimum c_id of matching colors for each fruit:
SELECT fruit.f_id, fruit.f_name, min(c2.c_id) as c_id FROM fruit INNER JOIN color c1 ON fruit.cid = c1.c_id INNER JOIN color c2 ON c1.c_name = c2.c_name GROUP BY fruit.f_id, fruit.f_name
That's not the most efficient query, but it will work. You can use it to correct your fruit table so that it references only a single color when there are duplicates.
After fixing your fruit table, you can then run a query to see which colors are unused so you know what to delete:
SELECT color.* FROM color LEFT OUTER JOIN fruit on color.c_id = fruit.cid WHERE fruit.f_id IS NULL
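To see what the unused-color query returns, here is a small sketch run against an in-memory SQLite database through Python's sqlite3 module (an assumption made for testability; the join itself is standard SQL). The rows are invented, and the fruit table is assumed to have already been repointed at the lowest c_id per color.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE color (c_id INTEGER PRIMARY KEY, c_name TEXT);
    CREATE TABLE fruit (f_id INTEGER PRIMARY KEY, f_name TEXT, cid INTEGER);
    -- Invented sample data: color 3 duplicates 'red', color 4 is simply unused.
    INSERT INTO color VALUES (1,'red'),(2,'yellow'),(3,'red'),(4,'green');
    INSERT INTO fruit VALUES (1,'apple',1),(2,'banana',2),(3,'cherry',1);
""")

# Anti-join: keep only the color rows for which no fruit row matched.
unused = conn.execute("""
    SELECT color.c_id, color.c_name
    FROM color
    LEFT OUTER JOIN fruit ON color.c_id = fruit.cid
    WHERE fruit.f_id IS NULL
    ORDER BY color.c_id
""").fetchall()
print(unused)
```

Note that the result lists every unreferenced color, not only duplicates: in this sample 'green' shows up even though its name is unique, so the list should be reviewed before deleting from it.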
First, you need to update fruit to only reference one of each color name:
UPDATE fruit AS f INNER JOIN color AS c ON f.cid = c.c_id INNER JOIN (SELECT c_name, MIN(c_id) AS firstCid FROM color GROUP BY c_name) AS firsts ON c.c_name = firsts.c_name SET f.cid = firsts.firstCid ;
Note: this is similar to GMB's answer, but does not use a correlated subquery.
Then, the duplicates can be cleaned up with something like this:
DELETE FROM color WHERE c_id NOT IN ( SELECT MIN(c_id) FROM color GROUP BY c_name )
This also preserves colors that no fruit references. However, MySQL normally rejects a statement that deletes from a table while a subquery selects from that same table, so it might have to be wrapped in a derived table to "trick" MySQL:
DELETE FROM color WHERE c_id NOT IN ( SELECT * FROM ( SELECT MIN(c_id) FROM color GROUP BY c_name ) AS firstIds )
You can achieve this in steps:
1. Delete the duplicates (this keeps the row with the highest c_id for each name):
DELETE C1 FROM colors C1 INNER JOIN colors C2 ON C2.c_name = C1.c_name AND C2.c_id > C1.c_id;
2. Reset the c_id:
UPDATE colors C1 JOIN ( SELECT @rownum:=@rownum+1 rownum, c_id, c_name FROM colors CROSS JOIN (select @rownum := 0) rn ) AS C2 ON C1.c_name = C2.c_name SET C1.c_id = C2.rownum
- You could just run a query for every color you remove: UPDATE fruits SET cid='1' WHERE cid='7', then remove the 7 from the table. Repeat for each color until there are no duplicates, then don't let there be duplicates again (make c_name unique). If it were me, I would automate this using PHP (my experience) or some other language; it would be pretty trivial.
- Thank you! The Update query worked just fine, but the delete query throws an error: "c" is not valid at this position, expecting : EOF, ':'
- @Csaba: I updated the delete query, please let me know if it works better now.
- Now the code looks valid, but when I execute the code I get: "Table 'c' is specified twice, both as a target for 'DELETE' and a separate source for data"
- @Csaba: ok, I changed it to a JOINed query. I tested it and this seems to work fine.
- I think you meant to join c1 and c2 on c_name
- @Uueerdo Sure did. Fixed. Yikes.
- If fruit.cid has a foreign key constraint with ON DELETE CASCADE, that will end up wiping a lot of fruit data; if it has no constraint at all, information like a banana being yellow will need to be manually reproduced.
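The risk described in the last comment can be reproduced on a toy schema. This is a hedged sketch using Python's sqlite3 module with in-memory SQLite rather than MySQL; the helper name delete_duplicates_first and the two sample fruits are hypothetical. It deletes the duplicate colors before repointing the fruit rows, once with a cascading foreign key and once with no constraint at all.

```python
import sqlite3

def delete_duplicates_first(fk_clause):
    """Delete duplicate colors WITHOUT repointing fruit first (hypothetical helper).

    fk_clause is pasted after the cid column definition; pass "" for no constraint.
    Returns (f_name, c_name) pairs for the fruit rows that survive.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(f"""
        CREATE TABLE color (c_id INTEGER PRIMARY KEY, c_name TEXT);
        CREATE TABLE fruit (
            f_id INTEGER PRIMARY KEY,
            f_name TEXT,
            cid INTEGER {fk_clause}
        );
        INSERT INTO color VALUES (1,'yellow'),(2,'yellow');
        INSERT INTO fruit VALUES (1,'banana',2),(2,'lemon',1);
    """)
    # Keep only the lowest c_id per color name.
    conn.execute("""
        DELETE FROM color
        WHERE c_id NOT IN (SELECT MIN(c_id) FROM color GROUP BY c_name)
    """)
    return conn.execute("""
        SELECT f.f_name, c.c_name
        FROM fruit AS f
        LEFT JOIN color AS c ON c.c_id = f.cid
        ORDER BY f.f_id
    """).fetchall()

cascaded = delete_duplicates_first("REFERENCES color(c_id) ON DELETE CASCADE")
unconstrained = delete_duplicates_first("")
print(cascaded)       # the banana row is gone entirely
print(unconstrained)  # the banana row remains but no longer resolves to a color
```

In both cases the information that the banana is yellow is lost, which is why the answers that update fruit before deleting from color are the safer order.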