How can I Delete Duplicate Rows?

Original post | Category: Linux Operating System | Author: ilsyx | Date: 2011-08-18 11:22:51
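These are MySQL statements for removing duplicate rows. The first approach uses LIMIT, and it requires that you already know how many rows need to be deleted. For example, with four matching rows:

DELETE FROM table1 WHERE col1 = 'text' LIMIT 3;

This leaves exactly one of the matching rows. Without ORDER BY, which one is left is not specified.

DELETE FROM table1 WHERE col1 = 'text' ORDER BY date_col DESC LIMIT 3;

Using ORDER BY with DELETE is supported as of MySQL 4.0. With four matching rows, this removes the three newest and keeps the one with the oldest date_col. For duplicates in a table that has an auto-increment column:

DELETE table_a FROM table1 AS table_a INNER JOIN table1 AS table_b ON table_a.col1 = table_b.col1 AND table_a.id < table_b.id;

This removes duplicates with a self-join, keeping the row with the highest id for each col1 value. How to find duplicate rows is covered in "NOTE:1023312.1 - How can I Find Duplicated Rows in a Table?".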
How can I Delete Duplicate Rows? [ID 1023311.1]

Modified: 17-AUG-2011 | Type: HOWTO | Migrated ID: 5279 | Status: PUBLISHED

Discussion

When a table contains rows that are completely identical, developers sometimes have difficulty instructing MySQL to delete the extra copies while keeping one. This typically happens in a table that has no key column, so every column can match between the duplicate rows. You can delete such duplicates with a DELETE statement and a LIMIT clause.

Solution

If the table has no key column, all columns match between the duplicate rows, and you know which rows they are, you can enter SQL statements like the following:

SELECT COUNT(*)
   FROM table1
   WHERE col1='text';

+----------+
| COUNT(*) |
+----------+
|        4 |
+----------+

DELETE FROM table1
   WHERE col1 = 'text'
   LIMIT 3;

This assumes you have manually inspected the data and determined that the rows matching the condition in the WHERE clause are duplicates. The SELECT statement with COUNT(*) counts the rows that match that condition. The DELETE statement then uses a LIMIT that is one less than that count (here 4 - 1 = 3), so exactly one row is kept.
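The count-then-delete procedure can be tried end to end on a scratch table before it is run on real data. The sketch below assumes a hypothetical table named dup_demo with two columns and no key; the table name, columns, and values are illustrative only and do not come from the original note.

```sql
-- Hypothetical scratch table with no key column, so rows can be fully identical.
CREATE TABLE dup_demo (
   col1 VARCHAR(20),
   col2 INT
);

-- Four identical rows, matching the count in the example above.
INSERT INTO dup_demo (col1, col2) VALUES
   ('text', 1), ('text', 1), ('text', 1), ('text', 1);

-- Step 1: count the duplicates.
SELECT COUNT(*) FROM dup_demo WHERE col1 = 'text';

-- Step 2: delete one fewer than the count (4 - 1 = 3).
DELETE FROM dup_demo
   WHERE col1 = 'text'
   LIMIT 3;

-- Step 3: confirm that a single row is left.
SELECT COUNT(*) FROM dup_demo WHERE col1 = 'text';
```

Because the rows are identical in every column, it does not matter which three of them MySQL removes.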

If you consider rows to be duplicates based on some columns but not all, you may not want to delete an arbitrary subset of them as the previous statement does; without ORDER BY, MySQL does not guarantee which of the matching rows a LIMIT removes. If one of the columns differs, you can use that column in the WHERE clause of the DELETE statement to choose the rows to delete. Alternatively, you can add an ORDER BY clause to the DELETE statement shown above. For example, suppose that the rows contain a datetime column. You could order the rows by that column in descending order like so:

DELETE FROM table1
   WHERE col1 = 'text'
   ORDER BY date_col DESC
   LIMIT 3;

With four matching rows, this deletes the three with the most recent dates and keeps the row with the oldest date. The ORDER BY clause can be used with DELETE beginning with MySQL 4.0.0.
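Before running an ordered DELETE, it may help to preview the affected rows with a SELECT that uses the same WHERE, ORDER BY, and LIMIT clauses. The following is a sketch using the table and column names from the statement above; the ascending variant is an alternative to the descending DELETE, not an additional step.

```sql
-- Preview: the rows the descending DELETE would remove (the three newest).
SELECT *
   FROM table1
   WHERE col1 = 'text'
   ORDER BY date_col DESC
   LIMIT 3;

-- Alternative: sort in ascending order to delete the three oldest rows
-- and keep the newest one instead. Run this INSTEAD of the DESC version.
DELETE FROM table1
   WHERE col1 = 'text'
   ORDER BY date_col ASC
   LIMIT 3;
```

If several rows share the same date_col value, their relative order is not defined, so the preview and the DELETE are only guaranteed to agree when the sort column distinguishes the rows.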

If you have an AUTO_INCREMENT column, or if you can add one to your table, you may also be able to delete duplicates via a self-join:

DELETE table_a
   FROM table1 AS table_a
   INNER JOIN table1 AS table_b
   ON table_a.col1 = table_b.col1
      AND table_a.id < table_b.id;
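The self-join above deletes every row for which another row exists with the same col1 and a higher id, so the row with the highest id in each group is the one that survives. As a sketch, the statements below show one way to add the key column and to preview the rows before deleting them; they assume table1 has no existing column named id, no AUTO_INCREMENT column, and no primary key.

```sql
-- Add a surrogate key; MySQL numbers the existing rows automatically.
-- Assumption: table1 has no column named id and no primary key yet.
ALTER TABLE table1
   ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;

-- Preview the rows the self-join DELETE would remove: each row that has
-- another row with the same col1 and a higher id.
SELECT DISTINCT table_a.*
   FROM table1 AS table_a
   INNER JOIN table1 AS table_b
   ON table_a.col1 = table_b.col1
      AND table_a.id < table_b.id;
```

To keep the row with the lowest id instead, change the comparison to table_a.id > table_b.id. Rows in which col1 is NULL are never matched by the = comparison, so neither the preview nor the DELETE touches them.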

To find duplicate rows in a table based on a particular column, see Knowledge Document 1023312.1.
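The referenced document is not reproduced here. As a general sketch, and not necessarily the method that document describes, a common way to list duplicated values in one column is to group by that column and keep only the groups with more than one row. The alias copies is a name chosen for this example.

```sql
-- List each col1 value that occurs more than once, with its row count.
SELECT col1, COUNT(*) AS copies
   FROM table1
   GROUP BY col1
   HAVING COUNT(*) > 1;
```

For each value returned, copies minus one is the LIMIT to use in the DELETE statements shown earlier.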

References

NOTE:1023312.1 - How can I Find Duplicated Rows in a Table?

Products
  • Oracle Database Products > MySQL > MySQL Server > MySQL Server
Keywords
DUPLICATE ENTRY; MYSQL

Source: ITPUB blog, http://blog.itpub.net/11780477/viewspace-705237/. If you repost this article, please credit the source; otherwise legal action may be taken.
