Home » SQL & PL/SQL » SQL & PL/SQL » Identify data based on data in other tables
Identify data based on data in other tables [message #674431] Sun, 27 January 2019 09:33 Go to next message
rrcr
Messages: 18
Registered: January 2019
Junior Member
We have tables say A, B, C
I need to identify data from table A based on Data in C.
We have been provided relation among the tables
C. Col1=B. Col2
B. Col3=A. Col4
We have written query like
Select a. * from
A, B, C
Where A. Col4=B. Col3
And B. Col2=C. Col1
And c. Col >somedate.
It is giving dups as the join keys are not pk.
We tried to write using exists
Select * from A where exists
(select 1 from B where A. Col4=B. Col3 and exists(
Select 1 from C where B. Col2=C. Col1)
But this is running long time as there 200 Million records.
Please suggest a better approach.
Thanks in advance
Re: Identify data based on data in other tables [message #674434 is a reply to message #674431] Sun, 27 January 2019 10:16 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Welcome to the forum.
Please read OraFAQ Forum Guide and How to use [code] tags and make your code easier to read.

For any performances question, please read http://www.orafaq.com/forum/mv/msg/84315/433888/102589/#msg_433888 and post the required information.
Also always post your Oracle version, with 4 decimals, as solution depends on it.

[Updated on: Sun, 27 January 2019 10:16]

Report message to a moderator

Re: Identify data based on data in other tables [message #674439 is a reply to message #674434] Sun, 27 January 2019 12:16 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
Please read and follow the forum guidelines, to enable us to help you:
OraFAQ Forum Guide
How to use {code} tags and make your code easier to read

untested since you declined to provide test tables and test data.

SELECT * 
FROM   a 
WHERE  col4 =  (SELECT col3 
                FROM   b 
                WHERE  col2 =  (SELECT col1 
                                FROM   c));
Re: Identify data based on data in other tables [message #674441 is a reply to message #674431] Sun, 27 January 2019 12:35 Go to previous messageGo to next message
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
Surely all you need do is

select DISTINCT a.* ......

and, by the way, I wih you would not say "record" when you mean "row"
Re: Identify data based on data in other tables [message #674446 is a reply to message #674439] Mon, 28 January 2019 01:47 Go to previous messageGo to next message
rrcr
Messages: 18
Registered: January 2019
Junior Member
Thanks For your Suggestion..
As there will be multiple rows returned in subquery.. we might use IN condition. I believe EXISTS give better performance than IN when we have large data. Please clarify
Re: Identify data based on data in other tables [message #674447 is a reply to message #674441] Mon, 28 January 2019 01:48 Go to previous messageGo to next message
rrcr
Messages: 18
Registered: January 2019
Junior Member
John Watson wrote on Sun, 27 January 2019 12:35
Surely all you need do is

select DISTINCT a.* ......

and, by the way, I wih you would not say "record" when you mean "row"
Thanks For your suggestion.. Distinct generally takes more time isnt it.
Re: Identify data based on data in other tables [message #674448 is a reply to message #674447] Mon, 28 January 2019 01:51 Go to previous messageGo to next message
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
rrcr wrote on Mon, 28 January 2019 07:48
John Watson wrote on Sun, 27 January 2019 12:35
Surely all you need do is

select DISTINCT a.* ......

and, by the way, I wish you would not say "record" when you mean "row"
Thanks For your suggestion.. Distinct generally takes more time isnt it.
How do you know if you haven't tried it? The optimizer should rewrite it to a semijoin, which I think is what you were trying to do with EXISTS.
Re: Identify data based on data in other tables [message #674450 is a reply to message #674448] Mon, 28 January 2019 02:02 Go to previous messageGo to next message
rrcr
Messages: 18
Registered: January 2019
Junior Member
John Watson wrote on Mon, 28 January 2019 01:51
rrcr wrote on Mon, 28 January 2019 07:48
John Watson wrote on Sun, 27 January 2019 12:35
Surely all you need do is

select DISTINCT a.* ......

and, by the way, I wish you would not say "record" when you mean "row"
Thanks For your suggestion.. Distinct generally takes more time isnt it.
How do you know if you haven't tried it? The optimizer should rewrite it to a semijoin, which I think is what you were trying to do with EXISTS.
I have tested with sample data .. distinct is running for 0.5 secs where as exist running for 0.27 sec
Re: Identify data based on data in other tables [message #674452 is a reply to message #674450] Mon, 28 January 2019 02:09 Go to previous messageGo to next message
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
So you haven't tested properly, have you. I would guess that you haven't checked out the execution plans either. You do need to do a bit of investigation, you know. For example,
orclx>
orclx> set autot trace exp
orclx> select distinct d.* from dept d join emp e on (e.deptno=d.deptno);

Execution Plan
----------------------------------------------------------
Plan hash value: 3764232723

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |     3 |    69 |     7  (29)| 00:00:01 |
|   1 |  HASH UNIQUE                  |         |     3 |    69 |     7  (29)| 00:00:01 |
|   2 |   MERGE JOIN SEMI             |         |     3 |    69 |     6  (17)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     4 |    80 |     2   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN           | PK_DEPT |     4 |       |     1   (0)| 00:00:01 |
|*  5 |    SORT UNIQUE                |         |    14 |    42 |     4  (25)| 00:00:01 |
|   6 |     TABLE ACCESS FULL         | EMP     |    14 |    42 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("E"."DEPTNO"="D"."DEPTNO")
       filter("E"."DEPTNO"="D"."DEPTNO")

orclx>
Re: Identify data based on data in other tables [message #674455 is a reply to message #674452] Mon, 28 January 2019 02:23 Go to previous messageGo to next message
rrcr
Messages: 18
Registered: January 2019
Junior Member
John Watson wrote on Mon, 28 January 2019 02:09
So you haven't tested properly, have you. I would guess that you haven't checked out the execution plans either. You do need to do a bit of investigation, you know. For example,
orclx>
orclx> set autot trace exp
orclx> select distinct d.* from dept d join emp e on (e.deptno=d.deptno);

Execution Plan
----------------------------------------------------------
Plan hash value: 3764232723

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |     3 |    69 |     7  (29)| 00:00:01 |
|   1 |  HASH UNIQUE                  |         |     3 |    69 |     7  (29)| 00:00:01 |
|   2 |   MERGE JOIN SEMI             |         |     3 |    69 |     6  (17)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     4 |    80 |     2   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN           | PK_DEPT |     4 |       |     1   (0)| 00:00:01 |
|*  5 |    SORT UNIQUE                |         |    14 |    42 |     4  (25)| 00:00:01 |
|   6 |     TABLE ACCESS FULL         | EMP     |    14 |    42 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("E"."DEPTNO"="D"."DEPTNO")
       filter("E"."DEPTNO"="D"."DEPTNO")

orclx>
Yeah i havent done this .. i just checked run time for those queries. I am new to this analysis. thanks for info
Re: Identify data based on data in other tables [message #674462 is a reply to message #674455] Mon, 28 January 2019 11:39 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
does table A have a primary key?
Re: Identify data based on data in other tables [message #674471 is a reply to message #674450] Mon, 28 January 2019 14:50 Go to previous messageGo to next message
EdStevens
Messages: 1376
Registered: September 2013
Senior Member
rrcr wrote on Mon, 28 January 2019 02:02
John Watson wrote on Mon, 28 January 2019 01:51
rrcr wrote on Mon, 28 January 2019 07:48
John Watson wrote on Sun, 27 January 2019 12:35
Surely all you need do is

select DISTINCT a.* ......

and, by the way, I wish you would not say "record" when you mean "row"
Thanks For your suggestion.. Distinct generally takes more time isnt it.
How do you know if you haven't tried it? The optimizer should rewrite it to a semijoin, which I think is what you were trying to do with EXISTS.
I have tested with sample data .. distinct is running for 0.5 secs where as exist running for 0.27 sec
How much of your time is justified optimizing the query to save 0.23 seconds?
Re: Identify data based on data in other tables [message #674475 is a reply to message #674471] Tue, 29 January 2019 00:39 Go to previous messageGo to next message
rrcr
Messages: 18
Registered: January 2019
Junior Member

EdStevens wrote on Mon, 28 January 2019 14:50
rrcr wrote on Mon, 28 January 2019 02:02
John Watson wrote on Mon, 28 January 2019 01:51
rrcr wrote on Mon, 28 January 2019 07:48
John Watson wrote on Sun, 27 January 2019 12:35
Surely all you need do is

select DISTINCT a.* ......

and, by the way, I wish you would not say "record" when you mean "row"
Thanks For your suggestion.. Distinct generally takes more time isnt it.
How do you know if you haven't tried it? The optimizer should rewrite it to a semijoin, which I think is what you were trying to do with EXISTS.
I have tested with sample data .. distinct is running for 0.5 secs where as exist running for 0.27 sec
How much of your time is justified optimizing the query to save 0.23 seconds?
For test data it is 0.2 sec where as we have data around 180 millions.. it will have great impact right

[Updated on: Tue, 29 January 2019 00:39]

Report message to a moderator

Re: Identify data based on data in other tables [message #674476 is a reply to message #674462] Tue, 29 January 2019 00:40 Go to previous messageGo to next message
rrcr
Messages: 18
Registered: January 2019
Junior Member
BlackSwan wrote on Mon, 28 January 2019 11:39
does table A have a primary key?

Yes we have .. but the joining keys are not primary keys hence getting dups
Re: Identify data based on data in other tables [message #674484 is a reply to message #674476] Tue, 29 January 2019 07:56 Go to previous message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
rrcr wrote on Mon, 28 January 2019 22:40
BlackSwan wrote on Mon, 28 January 2019 11:39
does table A have a primary key?
Yes we have .. but the joining keys are not primary keys hence getting dups

show us how you have duplicates when PK ensure UNIQUE values.

When "duplicate" does exist, what determines which row is eliminated to remove the "duplicate" value?

Please read and follow the forum guidelines, to enable us to help you:
OraFAQ Forum Guide
How to use {code} tags and make your code easier to read

Previous Topic: DataMapper - JSON -> XML - Add a colon in XML tag
Next Topic: MULTISET EXCEPT - COLLECTION
Goto Forum:
  


Current Time: Fri Mar 29 09:23:11 CDT 2024