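Finding duplicate values in MySQL
I have a table with a varchar column, and I would like to find all the records that have duplicate values in this column. What is the best query I can use to find the duplicates?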
Do a SELECT with a GROUP BY clause. Let's say name is the column you want to find duplicates in:
SELECT name, COUNT(*) c FROM table GROUP BY name HAVING c > 1;
This will return a result with the name value in the first column, and a count of how many times that value appears in the second.
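As a minimal sketch of what that result looks like, the snippet below runs the same GROUP BY / HAVING pattern through Python's built-in sqlite3 module. SQLite is only standing in for MySQL here, and the people table and its rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("alice",), ("bob",), ("alice",), ("carol",), ("bob",), ("alice",)],
)

# One row per name that occurs more than once, together with its count.
duplicates = conn.execute(
    "SELECT name, COUNT(*) AS c FROM people "
    "GROUP BY name HAVING COUNT(*) > 1 ORDER BY name"
).fetchall()

print(duplicates)  # [('alice', 3), ('bob', 2)]
```

Names that appear only once (carol here) are filtered out by the HAVING clause.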
SELECT varchar_col
FROM table
GROUP BY varchar_col
HAVING COUNT(*) > 1;
SELECT *
FROM mytable mto
WHERE EXISTS
(
SELECT 1
FROM mytable mti
WHERE mti.varchar_column = mto.varchar_column
LIMIT 1, 1
)
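This query returns complete records, not just the distinct values of varchar_column.
This query doesn't use COUNT(*). If there are lots of duplicates, COUNT(*) is expensive, and you don't need the whole COUNT(*); you only need to know whether there are two rows with the same value.
Having an index on varchar_column will, of course, speed this query up greatly.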
Building on levik's answer, to get the IDs of the duplicate rows you can use GROUP_CONCAT if your server supports it (this returns a comma-separated list of IDs).
SELECT GROUP_CONCAT(id), name, COUNT(*) c FROM documents GROUP BY name HAVING c > 1;
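A rough sketch of how the comma-separated list might be consumed by application code follows. It uses Python's sqlite3 module as a stand-in for MySQL (SQLite also has a GROUP_CONCAT aggregate); the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO documents (id, name) VALUES (?, ?)",
    [(1, "report"), (2, "invoice"), (3, "report"), (4, "memo"), (5, "report")],
)

rows = conn.execute(
    "SELECT GROUP_CONCAT(id), name, COUNT(*) AS c "
    "FROM documents GROUP BY name HAVING COUNT(*) > 1"
).fetchall()

# Split the comma-separated list back into integers. The list is sorted here
# because the concatenation order is not guaranteed without explicit ordering.
ids_by_name = {name: sorted(int(i) for i in ids.split(",")) for ids, name, c in rows}

print(ids_by_name)  # {'report': [1, 3, 5]}
```

In MySQL the concatenated string is truncated at group_concat_max_len (1024 bytes by default), so a very long ID list may need that setting raised.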
SELECT *
FROM `dps`
WHERE pid IN (SELECT pid FROM `dps` GROUP BY pid HAVING COUNT(pid)>1)
Assuming your table is named TableABC, the column you are interested in is Col, and the primary key of TableABC is Key:
SELECT a.Key, b.Key, a.Col
FROM TableABC a, TableABC b
WHERE a.Col = b.Col
AND a.Key <> b.Key
The advantage of this approach over the answers above is that it gives you the keys.
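One thing worth seeing in practice is that a self-join with <> reports every duplicate pair twice, once in each direction. The sketch below (Python's sqlite3 as a stand-in for MySQL, with made-up rows) compares it with a variant that uses < instead; that variant is an illustration, not part of the answer above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
# "Key" is quoted because KEY is an SQL keyword.
conn.execute('CREATE TABLE TableABC ("Key" INTEGER PRIMARY KEY, Col TEXT)')
conn.executemany(
    'INSERT INTO TableABC ("Key", Col) VALUES (?, ?)',
    [(1, "x"), (2, "y"), (3, "x"), (4, "z")],
)

# Self-join with <>: each duplicate pair shows up in both orders.
both_orders = conn.execute(
    'SELECT a."Key", b."Key", a.Col FROM TableABC a, TableABC b '
    'WHERE a.Col = b.Col AND a."Key" <> b."Key" ORDER BY a."Key"'
).fetchall()

# Hypothetical variant with <: each pair is reported only once.
one_order = conn.execute(
    'SELECT a."Key", b."Key", a.Col FROM TableABC a, TableABC b '
    'WHERE a.Col = b.Col AND a."Key" < b."Key"'
).fetchall()

print(both_orders)  # [(1, 3, 'x'), (3, 1, 'x')]
print(one_order)    # [(1, 3, 'x')]
```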
To find which values are duplicated in the name column of Employee, the query below is helpful.
Select name from employee group by name having count(*)>1;
I am not seeing any JOIN approaches here, which have many uses when dealing with duplicates.
This approach gives you the actual duplicated rows.
SELECT t1.* FROM my_table as t1
LEFT JOIN my_table as t2
ON t1.name=t2.name and t1.id!=t2.id
WHERE t2.id IS NOT NULL
ORDER BY t1.name
My final query incorporated a few of the answers here that helped, combining GROUP BY, COUNT and GROUP_CONCAT.
SELECT GROUP_CONCAT(id), `magento_simple`, COUNT(*) c
FROM product_variant
GROUP BY `magento_simple` HAVING c > 1;
This provides the id of both examples (comma separated), the barcode I needed, and how many duplicates.
Change table and columns accordingly.
SELECT t.*,(select count(*) from city as tt
where tt.name=t.name) as count
FROM `city` as t
where (
select count(*) from city as tt
where tt.name=t.name
) > 1 order by count desc
Replace city with your Table. Replace name with your field name
The queries above work fine if you need to check a single column for duplicate values, for example email.
But if you need to check a combination of several columns, this query will work:
SELECT COUNT(*) AS tot,
name,
email
FROM users
GROUP BY name, email -- group on the columns themselves rather than CONCAT(name,email)
HAVING tot > 1;
This query shows the users whose name and email combination appears more than once, along with the count.
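When checking a combination of columns, grouping on the columns themselves and grouping on a concatenated string can give different answers, because two distinct pairs may concatenate to the same text. The sketch below illustrates this with Python's sqlite3 module standing in for MySQL; the rows are invented, and SQLite's || operator plays the role of CONCAT.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [
        ("ann", "a@example.com"),
        ("ann", "a@example.com"),  # a genuine duplicate pair
        ("bo", "b@example.com"),
        ("b", "ob@example.com"),   # a different pair with the same concatenation
    ],
)

# Grouping on the two columns only reports the genuine duplicate.
by_columns = conn.execute(
    "SELECT name, email, COUNT(*) FROM users "
    "GROUP BY name, email HAVING COUNT(*) > 1"
).fetchall()

# Grouping on the concatenated string also reports a false positive.
by_concat = conn.execute(
    "SELECT name || email, COUNT(*) FROM users "
    "GROUP BY name || email HAVING COUNT(*) > 1 ORDER BY 1"
).fetchall()

print(by_columns)  # [('ann', 'a@example.com', 2)]
print(by_concat)   # [('anna@example.com', 2), ('bob@example.com', 2)]
```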
Taking @maxyfc's answer further, I needed to find all of the rows that were returned with the duplicate values, so I could edit them in MySQL Workbench:
SELECT * FROM table
WHERE field IN (
SELECT field FROM table GROUP BY field HAVING count(*) > 1
) ORDER BY field
SELECT
t.*,
(SELECT COUNT(*) FROM city AS tt WHERE tt.name=t.name) AS count
FROM `city` AS t
WHERE
(SELECT count(*) FROM city AS tt WHERE tt.name=t.name) > 1 ORDER BY count DESC
The following will find all product_id that are used more than once. You only get a single record for each product_id.
SELECT product_id FROM oc_product_reward GROUP BY product_id HAVING count( product_id ) >1
Code taken from : http://chandreshrana.blogspot.in/2014/12/find-duplicate-records-based-on-any.html
CREATE TABLE tbl_master
(`id` int, `email` varchar(15));
INSERT INTO tbl_master
(`id`, `email`) VALUES
(1, 'test1@gmail.com'),
(2, 'test2@gmail.com'),
(3, 'test1@gmail.com'),
(4, 'test2@gmail.com'),
(5, 'test5@gmail.com');
QUERY : SELECT id, email FROM tbl_master
WHERE email IN (SELECT email FROM tbl_master GROUP BY email HAVING COUNT(id) > 1)
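To see what this returns for the sample data above, here is a small sketch that loads the same five rows and runs the query through Python's sqlite3 module (a stand-in for MySQL). The ORDER BY id is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_master (id INTEGER, email VARCHAR(15))")
conn.executemany(
    "INSERT INTO tbl_master (id, email) VALUES (?, ?)",
    [
        (1, "test1@gmail.com"),
        (2, "test2@gmail.com"),
        (3, "test1@gmail.com"),
        (4, "test2@gmail.com"),
        (5, "test5@gmail.com"),
    ],
)

# Every row whose email occurs more than once in the table.
rows = conn.execute(
    "SELECT id, email FROM tbl_master "
    "WHERE email IN (SELECT email FROM tbl_master GROUP BY email HAVING COUNT(id) > 1) "
    "ORDER BY id"
).fetchall()

for row in rows:
    print(row)
```

Rows 1 to 4 come back, while row 5 (test5@gmail.com) is left out because its email occurs only once.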
SELECT DISTINCT a.email FROM `users` a LEFT JOIN `users` b ON a.email = b.email WHERE a.id != b.id;
I prefer to use window functions (MySQL 8.0+) to find duplicates because I can see the entire row:
WITH cte AS (
SELECT *
,COUNT(*) OVER(PARTITION BY col_name) AS num_of_duplicates_group
,ROW_NUMBER() OVER(PARTITION BY col_name ORDER BY col_name2) AS pos_in_group
FROM table
)
SELECT *
FROM cte
WHERE num_of_duplicates_group > 1;
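A small sketch of the two window columns in action is shown below, using Python's sqlite3 module as a stand-in for MySQL 8.0 (this needs SQLite 3.25 or newer for window functions). The items table and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8.0 (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, col_name TEXT, col_name2 INTEGER)"
)
conn.executemany(
    "INSERT INTO items (id, col_name, col_name2) VALUES (?, ?, ?)",
    [(1, "a", 30), (2, "b", 10), (3, "a", 20), (4, "c", 5)],
)

rows = conn.execute(
    """
    WITH cte AS (
        SELECT id, col_name, col_name2,
               COUNT(*) OVER (PARTITION BY col_name) AS num_of_duplicates_group,
               ROW_NUMBER() OVER (PARTITION BY col_name ORDER BY col_name2) AS pos_in_group
        FROM items
    )
    SELECT id, col_name, num_of_duplicates_group, pos_in_group
    FROM cte
    WHERE num_of_duplicates_group > 1
    ORDER BY pos_in_group
    """
).fetchall()

print(rows)  # [(3, 'a', 2, 1), (1, 'a', 2, 2)]
```

Filtering on pos_in_group > 1 instead would keep only the extra copies in each group, which can be handy when deciding which rows to delete.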
To remove duplicate rows across multiple fields, first concatenate the fields into a new key that identifies each distinct row, then use GROUP BY on that key to drop the rows that share it:
Create TEMPORARY table tmp select concat(f1,f2) as cfs,t1.* from mytable as t1;
Create index x_tmp_cfs on tmp(cfs);
Create table unduptable select f1,f2,... from tmp group by cfs;
One very late contribution... in case it helps anyone waaaaaay down the line... I had a task to find matching pairs of transactions (actually both sides of account-to-account transfers) in a banking app, to identify which ones were the 'from' and 'to' for each inter-account-transfer transaction, so we ended up with this:
SELECT
LEAST(primaryid, secondaryid) AS transactionid1,
GREATEST(primaryid, secondaryid) AS transactionid2
FROM (
SELECT table1.transactionid AS primaryid,
table2.transactionid AS secondaryid
FROM financial_transactions table1
INNER JOIN financial_transactions table2
ON table1.accountid = table2.accountid
AND table1.transactionid <> table2.transactionid
AND table1.transactiondate = table2.transactiondate
AND table1.sourceref = table2.destinationref
AND table1.amount = (0 - table2.amount)
) AS DuplicateResultsTable
GROUP BY transactionid1
ORDER BY transactionid1;
The result is that the DuplicateResultsTable provides rows containing matching (i.e. duplicate) transactions, but it also provides the same transaction IDs in reverse the second time it matches the same pair, so the outer SELECT is there to group by the first transaction ID, which is done by using LEAST and GREATEST to make sure the two transaction IDs are always in the same order in the results, which makes it safe to GROUP by the first one, thus eliminating all the duplicate matches. Ran through nearly a million records and identified 12,000+ matches in just under 2 seconds. Of course the transactionid is the primary index, which really helped.
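The LEAST/GREATEST trick can be sketched outside SQL as well. In the Python snippet below, min and max play the same role; the transaction IDs are hypothetical and only illustrate how mirrored pairs collapse into one.

```python
# Hypothetical matches as the inner join would emit them: each pair appears
# twice, once in each direction.
matches = [(101, 205), (205, 101), (310, 150), (150, 310)]

# min/max play the role of LEAST/GREATEST: the smaller id always comes first.
normalized = [(min(a, b), max(a, b)) for a, b in matches]

# Mirrored rows are now identical, so de-duplicating keeps one row per pair.
unique_pairs = sorted(set(normalized))

print(unique_pairs)  # [(101, 205), (150, 310)]
```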
Select column_name, column_name1,column_name2, count(1) as temp from table_name group by column_name having temp > 1
SELECT ColumnA, COUNT( * )
FROM Table
GROUP BY ColumnA
HAVING COUNT( * ) > 1
If you want to remove duplicates, use DISTINCT.
Otherwise use this query:
SELECT users.*,COUNT(user_ID) as user FROM users GROUP BY user_name HAVING user > 1;
To get all the rows that contain duplicated data, I used this:
SELECT * FROM TableName INNER JOIN(
SELECT DupliactedData FROM TableName GROUP BY DupliactedData HAVING COUNT(DupliactedData) > 1 order by DupliactedData)
temp ON TableName.DupliactedData = temp.DupliactedData;
TableName = the table you are working with.
DupliactedData = the duplicated data you are looking for.
Try using this query:
SELECT name, COUNT(*) value_count FROM company_master GROUP BY name HAVING value_count > 1;
Reference URL: https://stackoverflow.com/questions/688549/finding-duplicate-values-in-mysql