Deleting Dups [message #19460] |
Wed, 20 March 2002 07:01 |
CJ
Messages: 8 Registered: September 2000
|
Junior Member |
|
|
I have a table that has 5 columns. Columns A,B,C,and D need to be unique - duplicates occur when column E has different data. Is there a way to delete the duplicates - using a column F? Column F contains dates - We would need to delete all duplicate records where A,B,C,and D are equal and column F has the earliest date range. For Example:
005 OR 12345 123 description 1/1/92
005 OR 12345 123 Diff Desc. 12/1/01
I would want the first record deleted, but keep the second. It is not a static thing - the dates and the data will differ greatly. There are roughly 300,000 records to manually go through each month, so any help is GREATLY appreciated!
|
|
|
Re: Deleting Dups [message #19466 is a reply to message #19460] |
Wed, 20 March 2002 10:22 |
Jon
Messages: 483 Registered: May 2001
|
Senior Member |
|
|
I didn't create 5 columns but you'll get the idea...
15:18:25 ==> drop table table1;
Table dropped.
15:20:00 ==> create table table1 (col1 number, col2 number, col3 date);
Table created.
15:20:00 ==> insert into table1 values (1,1,'02/01/1999');
1 row created.
15:20:01 ==> insert into table1 values (1,1,'02/01/2000');
1 row created.
15:20:01 ==> insert into table1 values (1,1,'02/01/2001');
1 row created.
15:20:01 ==> insert into table1 values (1,2,'03/01/1999');
1 row created.
15:20:01 ==> insert into table1 values (1,2,'03/01/2002');
1 row created.
15:20:01 ==> insert into table1 values (1,3,'02/01/1999');
1 row created.
15:20:01 ==> delete from table1 where (col1, col2, col3) not in
15:20:01 2 (select col1, col2, max(col3) from table1 t1
15:20:01 3 group by col1, col2);
3 rows deleted.
15:20:01 ==> select * from table1;
COL1 COL2 COL3
--------- --------- ----------
1 1 02/01/2001
1 2 03/01/2002
1 3 02/01/1999
15:20:03 ==>
|
|
|
Re: Deleting Dups [message #19507 is a reply to message #19460] |
Fri, 22 March 2002 05:19 |
Alex Mazur
Messages: 17 Registered: March 2002
|
Junior Member |
|
|
Try this
delete from xtab t1
where exists
( select 1 from xtab t2
where
t1.a = t2.a and
t1.b = t2.b and
t1.c = t2.c and
t1.d = t2.d and
t1.f <
( select max(t3.f)
from xtab t3
where
t3.a = t2.a and
t3.b = t2.b and
t3.c = t2.c and
t3.d = t2.d
)
)
Hope it will help.
|
|
|