Home » RDBMS Server » Performance Tuning » join condition in index (ora 11.2.0.2)
join condition in index [message #547455] Wed, 14 March 2012 06:13 Go to next message
orarule
Messages: 11
Registered: March 2012
Junior Member
Hi All,

First of many posts here!

For a hash join statement, is it beneficial to have the join condition objects in the index as well as the objects in the where clause?

Thanks!
Re: join condition in index [message #547465 is a reply to message #547455] Wed, 14 March 2012 06:39 Go to previous messageGo to next message
John Watson
Messages: 8929
Registered: January 2010
Location: Global Village
Senior Member
Welcome to the forum. Please read our OraFAQ Forum Guide and How to use [code] tags and make your code easier to read

What index are you talking about?

Re: join condition in index [message #547471 is a reply to message #547465] Wed, 14 March 2012 06:54 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
If you have a SQL performing a Hash Join, then it is not likely to benefit by creating an index.

Usually an index will have no effect, because it is not used. If the index is used, then it is quite possible that it will make the query slower. Hash Joins are used for high data volumes and indexes are often worse than full table scans in these situations.

In the unlikely event that an index DOES improve the performance of such a join, a Nested Loops join with the join-key indexed would probably be faster still.

Ross Leishman
Re: join condition in index [message #547473 is a reply to message #547471] Wed, 14 March 2012 07:01 Go to previous messageGo to next message
John Watson
Messages: 8929
Registered: January 2010
Location: Global Village
Senior Member
I was thinking of this sort of thing:
orcl> select /*+ use_hash(emp dept) */ ename,dname from emp natural join dept where empno=1000;

Execution Plan
----------------------------------------------------------
Plan hash value: 373232351

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |     1 |    26 |     5  (20)| 00:00:01 |
|*  1 |  HASH JOIN                   |        |     1 |    26 |     5  (20)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| EMP    |     1 |    13 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | PK_EMP |     1 |       |     0   (0)| 00:00:01 |
|   4 |   TABLE ACCESS STORAGE FULL  | DEPT   |     4 |    52 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")
   3 - access("EMP"."EMPNO"=1000)
Re: join condition in index [message #547476 is a reply to message #547471] Wed, 14 March 2012 07:12 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
rleishman wrote on Wed, 14 March 2012 22:54
In the unlikely event that an index DOES improve the performance of such a join, a Nested Loops join with the join-key indexed would probably be faster still.


And that query would be better still (over volumes a bit greater than EMP/DEPT) with DEPT.DEPTNO indexed and an Indexed Nested Loops join.

I'm not saying it's impossible for an optimised query to use an index and a hash join (index joins being a good example), but without better information from the OP we are forced to work with probabilities, and the probability is that an index won't improve hash join.

Ross Leishman
Re: join condition in index [message #547478 is a reply to message #547465] Wed, 14 March 2012 07:16 Go to previous messageGo to next message
orarule
Messages: 11
Registered: March 2012
Junior Member
Sorry I dont have the statement on this computer.

Ill get on my work computer shortly.

The Hash already uses an index for both tables. All of the objects (except 1object that only has 2 distinct values) are in the indexes that are used.
Re: join condition in index [message #547481 is a reply to message #547478] Wed, 14 March 2012 07:19 Go to previous messageGo to next message
John Watson
Messages: 8929
Registered: January 2010
Location: Global Village
Senior Member
I'm sure Ross is right.
But as you still have not bortered to describe the indexes or the tables or the query, I don't think it is possible to take this further.
Re: join condition in index [message #547484 is a reply to message #547478] Wed, 14 March 2012 07:31 Go to previous messageGo to next message
orarule
Messages: 11
Registered: March 2012
Junior Member
SQL Statement
----------------------------------------------------------------------------------------------------------------------
SELECT
  DISTINCT T_00 . "HEADER"
FROM
  "CRMD_ORDER_INDEX" T_00 INNER JOIN "CRM_JEST" T_01 ON T_01 . "MANDT" = :A0 AND T_01 . "OBJNR" = T_
  00 . "HEADER"
WHERE
  T_00 . "CLIENT" = :A1 AND T_00 . "PROCESS_TYPE_IX" = :A2 AND T_00 . "OBJECT_TYPE" = :A3 AND T_00
  . "STAT_OPEN" = :A4 AND T_01 . "INACT" = :A5 AND T_01 . "STAT" = :A6
 
 
Execution Plan
 
----------------------------------------------------------------------------------------------------------------------
System: PC1
Plan hash value: 3861442254
 
------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name               | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                    |  5790 |   384K|       | 30582   (1)| 00:03:13 |
|   1 |  HASH UNIQUE                  |                    |  5790 |   384K|       | 30581   (1)| 00:03:13 |
|*  2 |   HASH JOIN                   |                    | 58199 |  3864K|  2832K| 30575   (1)| 00:03:13 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| CRMD_ORDER_INDEX   | 56850 |  2165K|       | 14580   (1)| 00:01:32 |
|*  4 |     INDEX RANGE SCAN          | CRMD_ORDER_INDEXZ1 |   113K|       |       |   169   (1)| 00:00:02 |
|   5 |    TABLE ACCESS BY INDEX ROWID| CRM_JEST           |  1167K|    32M|       | 14987   (1)| 00:01:35 |
|*  6 |     INDEX RANGE SCAN          | CRM_JEST~I         |  1167K|       |       |  1518   (1)| 00:00:10 |
------------------------------------------------------------------------------------------------------------
 
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
 
   1 - SEL$58A6D7F6
   3 - SEL$58A6D7F6 / T_00@SEL$1
   4 - SEL$58A6D7F6 / T_00@SEL$1
   5 - SEL$58A6D7F6 / T_01@SEL$1
   6 - SEL$58A6D7F6 / T_01@SEL$1
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("T_01"."OBJNR"="T_00"."HEADER")
   3 - filter("T_00"."STAT_OPEN"=:A4)
   4 - access("T_00"."CLIENT"=:A1 AND "T_00"."PROCESS_TYPE_IX"=:A2 AND "T_00"."OBJECT_TYPE"=:A3)
   6 - access("T_01"."MANDT"=:A0 AND "T_01"."STAT"=:A6 AND "T_01"."INACT"=:A5)
 
Column Projection Information (identified by operation id):
-----------------------------------------------------------
 
   1 - (#keys=1) "T_00"."HEADER"[RAW,16]
   2 - (#keys=1) "T_00"."HEADER"[RAW,16]
   3 - "T_00"."HEADER"[RAW,16]
   4 - "T_00".ROWID[ROWID,10]
   5 - "T_01"."OBJNR"[RAW,16]
   6 - "T_01".ROWID[ROWID,10]


NONUNIQUE Index CRMD_ORDER_INDEXZ1
Column Name #Distinct
CLIENT 1
PROCESS_TYPE_IX 32
OBJECT_TYPE 8
PARTNER_NO 231.677

Last statistics date 11.03.2012
Analyze Method Sample 288.844 Rows
Levels of B-Tree 3
Number of leaf blocks 214.669
Number of distinct keys 5.970.683


NONUNIQUE Index CRM_JEST~I
Column Name #Distinct
MANDT 1
STAT 51
INACT 2

Last statistics date 11.03.2012
Analyze Method Sample 611.294 Rows
Levels of B-Tree 3
Number of leaf blocks 384.831
Number of distinct keys 51
Average leaf blocks per key 8.102
Average data blocks per key 71.450
Re: join condition in index [message #547572 is a reply to message #547484] Wed, 14 March 2012 15:39 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Questions\: Answer all of them.
- How many rows in CRMD_ORDER_INDEX?
- How many of those rows match access("T_00"."CLIENT"=:A1 AND "T_00"."PROCESS_TYPE_IX"=:A2 AND "T_00"."OBJECT_TYPE"=:A3)
- How many rows in CRM_JEST
- How many of them match access("T_01"."MANDT"=:A0 AND "T_01"."STAT"=:A6 AND "T_01"."INACT"=:A5)
- How many rows does the join produce if you remove the DISTINCT?
- How many rows does the SQL produce with the DISTINCT?
- What is your current initialization parameter PGA_AGGREGATE_TARGET (use command SHOW PARAMETER PGA_AGGREGATE_TARGET) in SQL*Plus)

Ross Leishman
Re: join condition in index [message #547581 is a reply to message #547572] Wed, 14 March 2012 21:16 Go to previous messageGo to next message
orarule
Messages: 11
Registered: March 2012
Junior Member
Summary:

CRMD_ORDER_INDEX rows     29.107.300
Rows match access         113.00 
CRM_JEST rows             59.522.500
Rows match access         1.167.00
Remove Distinct           58.199
With Distinct             5.790
PGA_AGGREGATE_TARGET      3221225472


Hmm so first thing here CRMD_ORDER_INDEX should be the driver because it has less rows, right?

Thanks!

- How many rows in CRMD_ORDER_INDEX?
As per statistics run on 3/11: 29.107.300

- How many of those rows match access("T_00"."CLIENT"=:A1 AND "T_00"."PROCESS_TYPE_IX"=:A2 AND "T_00"."OBJECT_TYPE"=:A3)
This is shown in the explain plan below correct? The index range scan uses access predicates CLIENT, PROCESS_TYPE_IX, AND OBJECT_TYPE. So it fetches 113.000 rows.

- How many rows in CRM_JEST
As per statistics run on 3/11: 59.522.500

- How many of them match access("T_01"."MANDT"=:A0 AND "T_01"."STAT"=:A6 AND "T_01"."INACT"=:A5)
Same thing as above 1.167.000

- How many rows does the join produce if you remove the DISTINCT?
This is indicated by step 2: HASH JOIN correct? 58199

- How many rows does the SQL produce with the DISTINCT?
This is shown as step 1: HASH UNIQUE correct? 5790

- What is your current initialization parameter PGA_AGGREGATE_TARGET (use command SHOW PARAMETER PGA_AGGREGATE_TARGET) in SQL*Plus)
3221225472

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Re: join condition in index [message #547582 is a reply to message #547581] Wed, 14 March 2012 21:32 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Those number in the Explain Plan are estimates. I want actuals.

Ross Leishman
Re: join condition in index [message #547584 is a reply to message #547582] Wed, 14 March 2012 22:37 Go to previous messageGo to next message
orarule
Messages: 11
Registered: March 2012
Junior Member
Here's Actuals

CRMD_ORDER_INDEX rows     29.718.780
Rows match access         650.618 
CRM_JEST rows             60.245.663
Rows match access         4.256.788
Remove Distinct           58.199 (still estimate)
With Distinct             11.346
PGA_AGGREGATE_TARGET      3221225472


I keep getting ORA-00907: missing right parenthesis when trying to execute with distinct removed:

SELECT  CRMD_ORDER_INDEX.HEADER

FROM CRMD_ORDER_INDEX, CRM_JEST

WHERE CRM_JEST.OBJNR=CRMD_ORDER_INDEX.HEADER,
    CRMD_ORDER_INDEX.PROCESS_TYPE_IX="ZCWR",
    CRMD_ORDER_INDEX.OBJECT_TYPE="BUS2000223",
    CRMD_ORDER_INDEX.STAT_OPEN="X",
    CRM_JEST.STAT="I1002";
Re: join condition in index [message #547587 is a reply to message #547584] Wed, 14 March 2012 22:56 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
>I keep getting ORA-00907: missing right parenthesis when trying to execute with distinct removed:
replace double quote marks (") with single quote marks (')
Re: join condition in index [message #547588 is a reply to message #547587] Wed, 14 March 2012 22:59 Go to previous messageGo to next message
orarule
Messages: 11
Registered: March 2012
Junior Member
Used code below, still get the error.


SELECT CRMD_ORDER_INDEX.HEADER

FROM CRMD_ORDER_INDEX, CRM_JEST

WHERE CRM_JEST.OBJNR=CRMD_ORDER_INDEX.HEADER,
    CRMD_ORDER_INDEX.PROCESS_TYPE_IX='ZCWR',
    CRMD_ORDER_INDEX.OBJECT_TYPE='BUS2000223',
    CRMD_ORDER_INDEX.STAT_OPEN='X',
    CRM_JEST.STAT='I1002';
Re: join condition in index [message #547589 is a reply to message #547588] Wed, 14 March 2012 23:10 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
>Used code below, still get the error.
sorry to hear that.
It would not throw error, if you used valid syntax like below
SELECT crmd_order_index.header 
FROM   crmd_order_index, 
       crm_jest 
WHERE  crm_jest.objnr = crmd_order_index.header 
       AND crmd_order_index.process_type_ix = 'ZCWR' 
       AND crmd_order_index.object_type = 'BUS2000223' 
       AND crmd_order_index.stat_open = 'X' 
       AND crm_jest.stat = 'I1002'; 

[Updated on: Wed, 14 March 2012 23:22]

Report message to a moderator

Re: join condition in index [message #547594 is a reply to message #547589] Wed, 14 March 2012 23:25 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Your filter criteria in each of those tables selects 2% in one case and 6% in the other. These are in the "break-even" zone. It might be quicker to perform a full-table scan, but probably not by much. Or it might be quicker to perform the index scan, but probably not by much.

On balance, the index scan - even if it is a bit faster - is damaging the performance of other queries by hogging your Buffer Cache, so I would go with Full Table Scans.

I am also concerned that the join is non-unique. The join is access CRM_JEST.OBJNR = CRMD_ORDER_INDEX.HEADER.
- Is OBJNR unique in CRM_JEST?
- Is HEADER unique in CRMD_ORDER?

Unless ONE of these is true, you could be blowing out the join into millions of rows prior to the DISTINCT. If this is the case, the DISTINCT will end up using a lot of memory and eventually page to TEMP. This creates IO. This takes time.

You need to find out ho many rows are in the join before the DISTINCT.

Ross Leishman
Re: join condition in index [message #547598 is a reply to message #547594] Wed, 14 March 2012 23:40 Go to previous messageGo to next message
orarule
Messages: 11
Registered: March 2012
Junior Member
Thank you all very much. The SQL is not working with the distinct removed. I also added "select count(crmd_order_index.header)".

Looks like without DISTINCT only retrieves 111.063 rows.

I'm curently running a query to check distinct values but the estimated values are below. I will update this once the query is returned.


Object                        Distinct Values
crm_jest.objnr                17.577.694 (estimated)
crmd_order_index.header       11.364.285 (estimated)


CRMD_ORDER_INDEX rows     29.718.780
Rows match access         650.618 
CRM_JEST rows             60.245.663
Rows match access         4.256.788
Remove Distinct           111.063
With Distinct             11.346
PGA_AGGREGATE_TARGET      3221225472

Re: join condition in index [message #547599 is a reply to message #547598] Wed, 14 March 2012 23:50 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
So the join key is non-unique. Yet you are joining 4M rows to 650K rows and getting 111K back. That means there must be heaps of non-matching rows in both result sets.

Try this

SELECT DISTINCT HEADER
FROM CRMD_ORDER_INDEX
WHERE (HEADER) IN (
    SELECT 
    FROM CRM_JEST
    WHERE MANDT = :A0 
    AND INACT = :A5 
    AND STAT = :A6
)
AND CLIENT = :A1 
AND PROCESS_TYPE_IX = :A2 
AND OBJECT_TYPE = :A3 
AND STAT_OPEN = :A4 


Ross Leishman
Re: join condition in index [message #547623 is a reply to message #547599] Thu, 15 March 2012 02:45 Go to previous messageGo to next message
orarule
Messages: 11
Registered: March 2012
Junior Member
Thanks you very much!

Excuse my newbness but I'm trying to get this down. Since the key is non-unique a range scan will have to be performed. This is expensive because it will more than likely scan way to much data (anyway to determine how many extra).

Instead a basic select (from the larger dataset) inside another basic select on the smaller data set will be more efficient?

Also one of my original questions hasnt been answered (well its changed a little now). Is it ever beneficial to have the join condition fields in the index?

Re: join condition in index [message #547625 is a reply to message #547623] Thu, 15 March 2012 03:38 Go to previous message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
A range-scan just means that an index scan could return more than one row. If that happens to be 2 rows, or 5 rows, it's going to be pretty quick. But mot if it is 1M rows.

The uniqueness applied to the JOIN KEY, not to the columns you were filtering. If the join key was unique in one of those tables, then the join will return no more rows than the number selected from that table. However if the join key is non-unique in BOTH tables, then the number of rows returned could be as many as a CROSS PRODUCT of the number selected from each table. It's probably nowhere near that bad, but it could be bad enough to use up a lot of TEMP space.

In your case, although the join key was not unique, the number of rows in the join-result was bad but not catastrophic.

Did the alternate syntax work? If so, the reason is probably because the IN performs an implicit DISTINCT on the sub-query before joining to the outer-query. This makes the join key unique and prevents the cross-product behavior.

With non-join predicates in an INNER JOIN, it doesn't make any difference whether they are in the JOIN clause or the WHERE clause.

Ross Leishman
Previous Topic: Need to know Sql ID and text
Next Topic: Wait/Sleep command in SQL
Goto Forum:
  


Current Time: Wed Apr 17 19:31:34 CDT 2024