Home » RDBMS Server » Performance Tuning » Analytics and Predicate Pushing (9i/10g/11g)
Analytics and Predicate Pushing [message #566903] Thu, 20 September 2012 14:47 Go to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
Hi, I have a problem I will demonstrate. I read some work by Ross in an old thread but it did not help me (too dense I am).

The following sequence of simple statements shows me how Analytics and Predicate Pushing appear to work. But when I go from constant tests to a join with the same data as a row instead of a constant value, the Predicate Pushing stops.

I have a table with 9 million rows in it. It takes about 90 seconds to scan.
13:25:11 SQL> 
13:25:11 SQL> select count(*) thecount from lv_pln_usge_fact ;

  THECOUNT
----------
   8681388

1 row selected.

Elapsed: 00:00:32.28



If I run an analytic that counts all rows in the table I can see that easy enough. This exampe as I understand it, scans the table (I know this because if for no other reason, it take 90 seconds to get an answer), then after scanning all rows, counts them and adds the count to each row. There are in fact 9500 or so rows with the values of lv_rqst shown. Instead of seeing all 9500 rows, I distinct it to get only one. See how the count shows all rows in the table.

From this we see that the predicate is not pushed into the inner query to filter rows. If it had, the analytic would have produced a number like 9500 not 9 million. I have no issue with this as this is how Analytics are documented work.

13:25:43 SQL> select distinct *
13:25:43   2  from (
13:25:43   3  	     select LV_RQST
13:25:43   4  		   ,count(*) over () thecount
13:25:43   5  	     from lv_pln_usge_fact
13:25:43   6  	   )
13:25:43   7  where lv_rqst = '746780192'
13:25:43   8  /

LV_RQST      THECOUNT
---------- ----------
746780192     8681388

1 row selected.
Elapsed: 00:01:19.59


We can see how filter does happen with an analytic. We push the predicate into the inner query and all of a sudden we get a count of only those rows for the specific request. Thus we see the basics of how analytics work, particularly with respect to predicate pushing. There is not real rocket science here. The filter removed rows before the analytic counted them. This too is how Analytics are documented to work.

13:27:02 SQL> select distinct *
13:27:02   2  from (
13:27:02   3  	     select lv_rqst,count(*) over () thecount
13:27:02   4  	     from lv_pln_usge_fact
13:27:02   5  	     where lv_rqst = '746780192'
13:27:02   6  	   )
13:27:02   7  /

LV_RQST      THECOUNT
---------- ----------
746780192        9539

1 row selected.

Elapsed: 00:00:00.03


So far we have been doing "all rows" analytics. Now we use a PARTITION clause to group the data. Notice the rowcount. It is the count for just the lv_rqst. Do not be fooled. This is because the PARTITION column says to synchronize the analytic count to the data for its associated row. Thus the counts will be grouped by LV_RQST. Whether we had the predicate on the outside of the query or not, for this specific lv_rqst shown we would still get this count. This is not proof that we did any filtering with the predicate lv_rqst = '746780192'.

However, notice the speed of the query. It gets done so fast, that there is no way it is scanning the table. I know there is an index that starts with LV_RQST so I conclude that the index was used which I believe means the predicate was in fact PUSHED into the inner query. Thus I think we did in fact filter the rows to just this single LV_RQST value and we also accessed the table using the predicate as well which became an index range scan instead of full table scan. My biggest point is we were able to use the index to get the data we want, not scan the table and this was made so because we added the LV_RQST column to the OVER clause as part of the PARTITION BY expression.

13:27:03 SQL> select distinct *
13:27:03   2  from (
13:27:03   3  	     select LV_RQST
13:27:03   4  		   ,count(*) over (partition by lv_rqst) thecount
13:27:03   5  	     from lv_pln_usge_fact
13:27:03   6  	   )
13:27:03   7  where lv_rqst = '746780192'
13:27:03   8  /

LV_RQST      THECOUNT
---------- ----------
746780192        9539

1 row selected.

Elapsed: 00:00:00.04


Now I add an additional column to the inner query, and an additional predicate against that column to the outer query. We know that 9539 is the count of all rows where LV_RQST='746780192' as we have seen that before above. I also know the data and know about half say Y and half say N for this indicator column. This query gives the right answer. I just put it here are additional demonstration of the way analytics work. It demonstrates that the new predicate is not pushed into the innser query to filter rows. Again this is how Analytics are documented to work.

Again also please note the speed of the query. It used the index to range scan only the rows where LV_RQST='746780192'. So only those predicates that have their columns in the anlytics OVER clause are allowed to be pushed into the query for filtering and accessing purposes.

13:27:03 SQL> col LV_PLN_USGE_DEL_IND format a20 trunc
13:27:03 SQL> select distinct *
13:27:03   2  from (
13:27:03   3  	     select LV_RQST
13:27:03   4  		   ,LV_PLN_USGE_DEL_IND
13:27:03   5  		   ,count(*) over (partition by lv_rqst) lv_pln_dys_rmn
13:27:03   6  	     from lv_pln_usge_fact
13:27:03   7  	   )
13:27:03   8  where lv_rqst = '746780192'
13:27:03   9  and lv_pln_usge_del_ind = 'N'
13:27:03  10  /

LV_RQST    LV_PLN_USGE_DEL_IND  LV_PLN_DYS_RMN
---------- -------------------- --------------
746780192  N                              9539

1 row selected.

Elapsed: 00:00:00.04


This next query shows a little more clarity. If we add the indicator column to our OVER clause then the rowcount changes to be the number of rows where LV_RQST='746780192' AND lv_pln_usge_del_ind = 'N'. So by putting the column into the OVER expression, Oracle decides to push the predicate down into the inner query and filter the data before the analytic count is done. Again this just demonstrates for clarity how it works. I think I am describing it right anyway.

And once again the speed clearly indicates that an index range scan was done using LV_RQST='746780192'. Recall I said there is an index that starts with this column.

13:27:03 SQL> select distinct *
13:27:03   2  from (
13:27:03   3  	     select LV_RQST
13:27:03   4  		   ,LV_PLN_USGE_DEL_IND
13:27:03   5  		   ,count(*) over (partition by lv_rqst order by lv_pln_usge_del_ind) lv_pln_dys_rmn
13:27:03   6  	     from lv_pln_usge_fact
13:27:03   7  	   )
13:27:03   8  where lv_rqst = '746780192'
13:27:03   9  and lv_pln_usge_del_ind = 'N'
13:27:03  10  /

LV_RQST    LV_PLN_USGE_DEL_IND  LV_PLN_DYS_RMN
---------- -------------------- --------------
746780192  N                              4810

1 row selected.

Elapsed: 00:00:00.06


NOW WE COME TO MY PROBLEM.
Instead of using the constant value '746780192' we are going to create a one column one row table that has this value in it. We are then going to join to the the analytic subquery instead of doing a contant test against it.

13:27:03 SQL> create table kevt1
13:27:03   2  (
13:27:03   3  	 lv_rqst varchar2(10) not null
13:27:03   4  )
13:27:03   5  /

Table created.

Elapsed: 00:00:00.06
13:27:03 SQL> 
13:27:03 SQL> insert into kevt1 values ('746780192')
13:27:03   2  /

1 row created.

Elapsed: 00:00:00.00
13:27:03 SQL> 
13:27:03 SQL> commit
13:27:03   2  /

Commit complete.

Elapsed: 00:00:00.00



This query is in my mind the same query we did before but we loose the use of the index and go back to doing a FULL TABLE SCAN.
13:27:03 SQL> select distinct x.*
13:27:03   2  from (
13:27:03   3  	     select LV_RQST
13:27:03   4  		   ,count(*) over (partition by lv_rqst) lv_pln_dys_rmn
13:27:03   5  	     from lv_pln_usge_fact
13:27:03   6  	   ) x
13:27:03   7  	  ,kevt1
13:27:03   8  where kevt1.lv_rqst = x.lv_rqst
13:27:03   9  /

LV_RQST    LV_PLN_DYS_RMN
---------- --------------
746780192            9539

1 row selected.

Elapsed: 00:01:17.73


For a little bit more clarity, two more queries. Pay attention to how long it takes, and to how the additional joins affect things. Notice, particularly with the last statement, that the join criteria is being pushed into the inner query with the analytics. Otherwise how did it get that count?

14:55:21 SQL> select distinct x.*
14:56:16   2  from (
14:56:16   3        select LV_RQST
14:56:16   4       ,LV_PLN_USGE_DEL_IND
14:56:16   5       ,count(*) over (partition by lv_rqst order by lv_pln_usge_del_ind) lv_pln_dys_rmn
14:56:16   6        from lv_pln_usge_fact
14:56:16   7      ) x
14:56:16   8      ,kevt1
14:56:16   9  where kevt1.lv_rqst = x.lv_rqst
14:56:16  10  and x.lv_pln_usge_del_ind = 'N'
14:56:16  11  /

LV_RQST    LV_PLN_USGE_DEL_IND  LV_PLN_DYS_RMN
---------- -------------------- --------------
746780192  N                              4810

1 row selected.

Elapsed: 00:02:04.64

14:53:40 SQL> alter table kevt1 add LV_PLN_USGE_DEL_IND varchar2(1);

Table altered.

Elapsed: 00:00:00.01
14:54:22 SQL> update kevt1 set LV_PLN_USGE_DEL_IND='N';

1 row updated.

14:58:21 SQL> commit;

Commit complete.

Elapsed: 00:00:00.01

15:00:00 SQL> select distinct x.*
15:00:51   2  from (
15:00:51   3        select LV_RQST
15:00:51   4       ,LV_PLN_USGE_DEL_IND
15:00:51   5       ,count(*) over (partition by lv_rqst order by lv_pln_usge_del_ind) lv_pln_dys_rmn
15:00:51   6        from lv_pln_usge_fact
15:00:51   7      ) x
15:00:51   8      ,kevt1
15:00:51   9  where kevt1.lv_rqst = x.lv_rqst
15:00:51  10  and kevt1.lv_pln_usge_del_ind = x.lv_pln_usge_del_ind
15:00:51  11  /

LV_RQST    LV_PLN_USGE_DEL_IND  LV_PLN_DYS_RMN
---------- -------------------- --------------
746780192  N                              4810

1 row selected.

Elapsed: 00:01:38.92


Just so there is no confusion:

15:56:51 SQL> select * from kevt1

LV_RQST    LV_PLN_USGE_DEL_IND
---------- --------------------
746780192  N

1 row selected.

Elapsed: 00:00:00.01
15:56:52 SQL> 


So after looking at all this, here is my question: How do I get Oracle the use the index and nested loop join to the table lv_pln_usge_fact. We know Oracle pushes the predicates down when the columns are referenced in the OVER expression because we see that in several places. We also know the CBO can do a nested loop join with index access on LV_RQST because it does it when we use a constant test. But it won't use the index and nested loop when we do a join to a table with the same data, no matter how much rewriting or hinting I do.

I tested this in 9i/10g/11g and got same behavior in all three places.

Can someone make it do what I want?, or is this just the way Analytics behaves with a join?

Kevin

[Updated on: Thu, 20 September 2012 14:57]

Report message to a moderator

Re: Analytics and Predicate Pushing [message #566985 is a reply to message #566903] Fri, 21 September 2012 21:19 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Wow! Marathon post. I would have liked some AUTOTRACE, but what you are describing with FTS vs Index scan is fairly clear.

Unrelated to your problem but: when you add in the ORDER BY clause to the analytic function, are you expecting a (rolling) count of all values <= N, or all values = N? Because you are getting <= N. If you change the query to look at 'Y' you will get a count of N+Y.

If you have the column in the PARTITION BY clause, then it should be able to push the predicate inside the inline view, but clearly it chooses not to. Note that it will NOT push the predicate on lv_pln_usge_del_ind - that would change the result (if there were values < N).

Try a hint to see if you can *prove* that it can be done. Reverse the order of the tables in the FROM so you can use ORDERED.

15:00:00 SQL> select /*+ ORDERED USE_NL(x) NO_MERGE(x) PUSH_PRED(x)  */ distinct x.*
15:00:51   8  from kevt1
15:00:51   2  ,  (
15:00:51   3        select LV_RQST
15:00:51   4       ,LV_PLN_USGE_DEL_IND
15:00:51   5       ,count(*) over (partition by lv_rqst order by lv_pln_usge_del_ind) lv_pln_dys_rmn
15:00:51   6        from lv_pln_usge_fact
15:00:51   7      ) x
15:00:51   9  where kevt1.lv_rqst = x.lv_rqst
15:00:51  10  and kevt1.lv_pln_usge_del_ind = x.lv_pln_usge_del_ind
15:00:51  11  /


Ross Leishman
Re: Analytics and Predicate Pushing [message #567004 is a reply to message #566985] Sat, 22 September 2012 04:53 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
thanks for the advice Ross. I am trying your suggestion now and all manner of variations. So far no joy. I'll keep trying.

Also, thanks for the clarification. I was not aware the ORDER BY worked that way. May I ask, how do you know this? Do you have some special work background that introduced you to this? Did you run a test? Or do you have some fountain of knowledge somewhere that you have not told the rest of us about?

Quote:
Unrelated to your problem but: when you add in the ORDER BY clause to the analytic function, are you expecting a (rolling) count of all values <= N, or all values = N? Because you are getting <= N. If you change the query to look at 'Y' you will get a count of N+Y.

If you have the column in the PARTITION BY clause, then it should be able to push the predicate inside the inline view, but clearly it chooses not to. Note that it will NOT push the predicate on lv_pln_usge_del_ind - that would change the result (if there were values < N).

[Updated on: Sat, 22 September 2012 05:11]

Report message to a moderator

Re: Analytics and Predicate Pushing [message #567006 is a reply to message #567004] Sat, 22 September 2012 05:42 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
C'mon Kevin, you already know the answer to this one....

I've already made the same mistake before with analytic functions. My secret is I have a good memory for my own screw ups. I suspect its a survival trait of married men.

Keep at it, because it does work. Turn autotrace on. It may be pushing the predicate and then performing a FTS anyway. You may need an INDEX hint in the sub-query. You will be able to tell from the Predicate listing in the autotrace. Also make sure you have stats on both tables and indexes.



SQL> run a
  1  select distinct x.*
  2  from job
  3  ,(
  4      select job, deptno, count(*) over (partition by job order by deptno)
  5      from emp
  6  ) x
  7  where job.job = x.job
  8  and x.job = 'CLERK'
  9* and x.deptno = 10

JOB           DEPTNO COUNT(*)OVER(PARTITIONBYJOBORDERBYDEPTNO)                                                          
--------- ---------- -----------------------------------------                                                          
CLERK             10                                         1                                                          


Execution Plan
----------------------------------------------------------                                                              
Plan hash value: 3861188995                                                                                             
                                                                                                                        
----------------------------------------------------------------------------------------------                          
| Id  | Operation                       | Name       | Rows  | Bytes | Cost (%CPU)| Time     |                          
----------------------------------------------------------------------------------------------                          
|   0 | SELECT STATEMENT                |            |     1 |    30 |     8  (38)| 00:00:01 |                          
|   1 |  HASH UNIQUE                    |            |     1 |    30 |     8  (38)| 00:00:01 |                          
|*  2 |   HASH JOIN                     |            |     4 |   120 |     7  (29)| 00:00:01 |                          
|*  3 |    TABLE ACCESS FULL            | JOB        |     1 |     6 |     3   (0)| 00:00:01 |                          
|*  4 |    VIEW                         |            |     4 |    96 |     3  (34)| 00:00:01 |                          
|   5 |     WINDOW SORT                 |            |     4 |    44 |     3  (34)| 00:00:01 |                          
|   6 |      TABLE ACCESS BY INDEX ROWID| EMP        |     4 |    44 |     2   (0)| 00:00:01 |                          
|*  7 |       INDEX RANGE SCAN          | EMP_JOB_IX |     4 |       |     1   (0)| 00:00:01 |                          
----------------------------------------------------------------------------------------------                          
                                                                                                                        
Predicate Information (identified by operation id):                                                                     
---------------------------------------------------                                                                     
                                                                                                                        
   2 - access("JOB"."JOB"="X"."JOB")                                                                                    
   3 - filter("JOB"."JOB"='CLERK')                                                                                      
   4 - filter("X"."DEPTNO"=10)                                                                                          
   7 - access("JOB"='CLERK')                                                                                            

SQL> spool off


Ross Leishman
Re: Analytics and Predicate Pushing [message #567007 is a reply to message #567006] Sat, 22 September 2012 05:45 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
You give me hope. Will post when I have success. Thanks man.
Re: Analytics and Predicate Pushing [message #567031 is a reply to message #567007] Sun, 23 September 2012 04:22 Go to previous messageGo to next message
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
Kevin, I can't reproduce your problem! I created the test case like this:
create table lv_pln_usge_fact (lv_rqst integer,LV_PLN_USGE_DEL_IND varchar2(20));
create index lvi1 on lv_pln_usge_fact(lv_rqst);
exec dbms_stats.gather_table_stats(user,'lv_pln_usge_fact',method_opt=>'for all columns size skewonly')
exec dbms_stats.set_table_stats(user,'lv_pln_usge_fact',numrows=>8681388)
create table kevt1 (lv_rqst varchar2(10) not null)
/
insert into kevt1 values ('746780192');
exec dbms_stats.gather_table_stats(user,'kevt1',method_opt=>'for all columns size skewonly')

First query, with the literal in the outer query, I get the FTS:
orcl> select distinct *
  2  from (
  3          select LV_RQST
  4                ,count(*) over () thecount
  5          from lv_pln_usge_fact
  6        )
  7  where lv_rqst = '746780192'
  8  /

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 1725200923

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |     1 |    26 |   989 (100)| 00:00:16 |
|   1 |  HASH UNIQUE                 |                  |     1 |    26 |   989 (100)| 00:00:16 |
|*  2 |   VIEW                       |                  |  8681K|   215M|   126  (99)| 00:00:03 |
|   3 |    WINDOW BUFFER             |                  |  8681K|   107M|   126  (99)| 00:00:03 |
|   4 |     TABLE ACCESS STORAGE FULL| LV_PLN_USGE_FACT |  8681K|   107M|   126  (99)| 00:00:03 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("LV_RQST"=746780192)

orcl>

Then the next query, I move the literal to the inner query and get the index range scan:
orcl> select distinct *
  2  from (
  3          select lv_rqst,count(*) over () thecount
  4          from lv_pln_usge_fact
  5          where lv_rqst = '746780192'
  6        )
  7  /

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 1661427414

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    26 |     2  (50)| 00:00:01 |
|   1 |  HASH UNIQUE        |      |     1 |    26 |     2  (50)| 00:00:01 |
|   2 |   VIEW              |      |     1 |    26 |     1   (0)| 00:00:01 |
|   3 |    WINDOW BUFFER    |      |     1 |    13 |     1   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN| LVI1 |     1 |    13 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("LV_RQST"=746780192)

orcl>

so far, exactly the same as you. But now I run te query with a join to the one-row table:
orcl> select distinct x.*
  2  from (
  3  select LV_RQST,count(*) over (partition by lv_rqst) lv_pln_dys_rmn
  4  from lv_pln_usge_fact
  5  ) x
  6  ,kevt1
  7  where kevt1.lv_rqst = x.lv_rqst
  8  /

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 965590451

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    25 |     5  (20)| 00:00:01 |
|   1 |  HASH UNIQUE                |       |     1 |    25 |     5  (20)| 00:00:01 |
|   2 |   NESTED LOOPS              |       |     1 |    25 |     4   (0)| 00:00:01 |
|   3 |    TABLE ACCESS STORAGE FULL| KEVT1 |     1 |    10 |     3   (0)| 00:00:01 |
|   4 |    VIEW PUSHED PREDICATE    |       |     1 |    15 |     1   (0)| 00:00:01 |
|   5 |     WINDOW BUFFER           |       | 86814 |  1102K|     1   (0)| 00:00:01 |
|*  6 |      INDEX RANGE SCAN       | LVI1  | 86814 |  1102K|     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("LV_RQST"=TO_NUMBER("KEVT1"."LV_RQST"))

orcl>
and there it is, nested loop, a pushed predicate and an index range scan.

Am I missing something? Or is my Oracle 11.2.0.3 cleverer than yours?

(If I'm being stupid, please forgive me, I'm still zonked after many air miles and time zones.)
Re: Analytics and Predicate Pushing [message #567032 is a reply to message #566903] Sun, 23 September 2012 08:05 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
Well....

1) that fact that you can display the plan gives me hope
2) your 11.2.0.3 may indeed be smarter than my 11.2.0.2 I do not get the same plan.

Maybe it is time for a call to Oracle support. Here is me doing your stuff.


First I create the data objects you provided (thanks for that).
08:34:17 SQL> create table lv_pln_usge_fact (lv_rqst integer,LV_PLN_USGE_DEL_IND varchar2(20));

Table created.

Elapsed: 00:00:00.15
08:34:26 SQL> create index lvi1 on lv_pln_usge_fact(lv_rqst);

Index created.

Elapsed: 00:00:00.06
08:34:31 SQL> exec dbms_stats.gather_table_stats(user,'lv_pln_usge_fact',method_opt=>'for all columns size skewonly')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.15
08:34:36 SQL> exec dbms_stats.set_table_stats(user,'lv_pln_usge_fact',numrows=>8681388)

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.06
08:34:41 SQL> create table kevt1 (lv_rqst varchar2(10) not null)
08:34:46   2  /

Table created.

Elapsed: 00:00:00.06
08:34:46 SQL> insert into kevt1 values ('746780192');

1 row created.

Elapsed: 00:00:00.06
08:34:51 SQL> exec dbms_stats.gather_table_stats(user,'kevt1',method_opt=>'for all columns size skewonly')

PL/SQL procedure successfully completed.


Now we test the problem query via explain plan.
08:36:31 SQL> explain plan for select distinct x.*
08:36:34   2  from (
08:36:34   3         select LV_RQST,count(*) over (partition by lv_rqst) lv_pln_dys_rmn
08:36:34   4         from lv_pln_usge_fact
08:36:34   5       ) x
08:36:34   6       ,kevt1
08:36:34   7  where kevt1.lv_rqst = x.lv_rqst
08:36:34   8  /

Explained.

Elapsed: 00:00:00.06
08:36:34 SQL> @showplan9i

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------
Plan hash value: 4066973695

----------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name             | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                  |     1 |    36 |       | 41960   (3)| 00:00:02 |
|   1 |  HASH UNIQUE                  |                  |     1 |    36 |       | 41960   (3)| 00:00:02 |
|   2 |   MERGE JOIN                  |                  |  8681K|   298M|       | 41520   (2)| 00:00:02 |
|   3 |    VIEW                       |                  |  8681K|   215M|       | 41503   (2)| 00:00:02 |
|   4 |     WINDOW SORT               |                  |  8681K|   107M|   166M| 41503   (2)| 00:00:02 |
|   5 |      TABLE ACCESS STORAGE FULL| LV_PLN_USGE_FACT |  8681K|   107M|       |    65  (97)| 00:00:01 |
|*  6 |    SORT JOIN                  |                  |     1 |    10 |       |    17   (6)| 00:00:01 |
|   7 |     TABLE ACCESS STORAGE FULL | KEVT1            |     1 |    10 |       |    16   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   3 - SEL$2 / X@SEL$1
   4 - SEL$2
   5 - SEL$2 / LV_PLN_USGE_FACT@SEL$2
   7 - SEL$1 / KEVT1@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("X"."LV_RQST"=TO_NUMBER("KEVT1"."LV_RQST"))
       filter("X"."LV_RQST"=TO_NUMBER("KEVT1"."LV_RQST"))

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=2) "X"."LV_RQST"[NUMBER,22], "X"."LV_PLN_DYS_RMN"[NUMBER,22]
   2 - (#keys=0) "X"."LV_RQST"[NUMBER,22], "X"."LV_PLN_DYS_RMN"[NUMBER,22]
   3 - "X"."LV_RQST"[NUMBER,22], "X"."LV_PLN_DYS_RMN"[NUMBER,22]
   4 - (#keys=1) "LV_RQST"[NUMBER,22], COUNT(*) OVER ( PARTITION BY "LV_RQST")[22]
   5 - "LV_RQST"[NUMBER,22]
   6 - (#keys=1) TO_NUMBER("KEVT1"."LV_RQST")[22]
   7 - "KEVT1"."LV_RQST"[VARCHAR2,40]

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

44 rows selected.

Elapsed: 00:00:00.40


Lets try with AUTOTRACE.
08:36:38 SQL> set autotrace on
08:36:46 SQL> select distinct x.*
08:36:47   2  from (
08:36:47   3         select LV_RQST,count(*) over (partition by lv_rqst) lv_pln_dys_rmn
08:36:47   4         from lv_pln_usge_fact
08:36:47   5       ) x
08:36:47   6       ,kevt1
08:36:47   7  where kevt1.lv_rqst = x.lv_rqst
08:36:47   8  /

no rows selected

Elapsed: 00:00:00.28

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=41960 Card=1 Bytes=36)
   1    0   HASH (UNIQUE) (Cost=41960 Card=1 Bytes=36)
   2    1     MERGE JOIN (Cost=41520 Card=8681388 Bytes=312529968)
   3    2       VIEW (Cost=41503 Card=8681388 Bytes=225716088)
   4    3         WINDOW (SORT) (Cost=41503 Card=8681388 Bytes=112858044)
   5    4           TABLE ACCESS (STORAGE FULL) OF 'LV_PLN_USGE_FACT' (TABLE) (Cost=65 Card=8681388 Bytes=112858044)
   6    2       SORT (JOIN) (Cost=17 Card=1 Bytes=10)
   7    6         TABLE ACCESS (STORAGE FULL) OF 'KEVT1' (TABLE) (Cost=16 Card=1 Bytes=10)




Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        290  bytes sent via SQL*Net to client
        333  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          0  rows processed

08:36:48 SQL> set autotrace off


Lets try with DISPLAY_CURSOR().
08:59:16 SQL> select distinct x.*
08:59:36   2  from (
08:59:36   3         select LV_RQST,count(*) over (partition by lv_rqst) lv_pln_dys_rmn
08:59:36   4         from lv_pln_usge_fact
08:59:36   5       ) x
08:59:36   6       ,kevt1
08:59:36   7  where kevt1.lv_rqst = x.lv_rqst
08:59:36   8  /

no rows selected

Elapsed: 00:00:00.04
08:59:36 SQL> SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------
SQL_ID  c36xhjdzwtpj0, child number 0
-------------------------------------
select distinct x.* from (        select LV_RQST,count(*) over
(partition by lv_rqst) lv_pln_dys_rmn        from lv_pln_usge_fact
) x      ,kevt1 where kevt1.lv_rqst = x.lv_rqst

Plan hash value: 4066973695

----------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name             | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                  |       |       |       | 41960 (100)|          |
|   1 |  HASH UNIQUE                  |                  |     1 |    36 |       | 41960   (3)| 00:00:02 |
|   2 |   MERGE JOIN                  |                  |  8681K|   298M|       | 41520   (2)| 00:00:02 |
|   3 |    VIEW                       |                  |  8681K|   215M|       | 41503   (2)| 00:00:02 |
|   4 |     WINDOW SORT               |                  |  8681K|   107M|   166M| 41503   (2)| 00:00:02 |
|   5 |      TABLE ACCESS STORAGE FULL| LV_PLN_USGE_FACT |  8681K|   107M|       |    65  (97)| 00:00:01 |
|*  6 |    SORT JOIN                  |                  |     1 |    10 |       |    17   (6)| 00:00:01 |
|   7 |     TABLE ACCESS STORAGE FULL | KEVT1            |     1 |    10 |       |    16   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("X"."LV_RQST"=TO_NUMBER("KEVT1"."LV_RQST"))
       filter("X"."LV_RQST"=TO_NUMBER("KEVT1"."LV_RQST"))

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold


31 rows selected.

Elapsed: 00:00:00.29


My 11.2.0.2 is giving no joy.

08:37:33 SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

5 rows selected.


I will keep trying. Maybe it is some parameter setting. I'll check these. I will use your test case.

Thanks again for your time and effort. Kevin

[Updated on: Sun, 23 September 2012 08:06]

Report message to a moderator

Re: Analytics and Predicate Pushing [message #567034 is a reply to message #567032] Sun, 23 September 2012 08:19 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
Well Well... see what I found...

  1* select name,value from v$parameter order by name
09:07:35 SQL> /

NAME                                                                             VALUE
-------------------------------------------------------------------------------- --------------------------------------
O7_DICTIONARY_ACCESSIBILITY                                                      FALSE
_column_tracking_level                                                           3
_file_size_increase_increment                                                    2143289344

...

_push_join_predicate                                                             FALSE


09:07:37 SQL> alter session set "_push_join_predicate"=true;

Session altered.

Elapsed: 00:00:00.06
09:08:20 SQL> select distinct x.*
09:08:33   2  from (
09:08:33   3         select LV_RQST,count(*) over (partition by lv_rqst) lv_pln_dys_rmn
09:08:33   4         from lv_pln_usge_fact
09:08:33   5       ) x
09:08:33   6       ,kevt1
09:08:33   7  where kevt1.lv_rqst = x.lv_rqst
09:08:34   8  /

no rows selected

Elapsed: 00:00:00.06
09:08:34 SQL> 
09:08:34 SQL> 
09:08:34 SQL> SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------
SQL_ID  c36xhjdzwtpj0, child number 1
-------------------------------------
select distinct x.* from (        select LV_RQST,count(*) over
(partition by lv_rqst) lv_pln_dys_rmn        from lv_pln_usge_fact
) x      ,kevt1 where kevt1.lv_rqst = x.lv_rqst

Plan hash value: 965590451

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |       |       |    18 (100)|          |
|   1 |  HASH UNIQUE                |       |     1 |    25 |    18   (6)| 00:00:01 |
|   2 |   NESTED LOOPS              |       |     1 |    25 |    17   (0)| 00:00:01 |
|   3 |    TABLE ACCESS STORAGE FULL| KEVT1 |     1 |    10 |    16   (0)| 00:00:01 |
|   4 |    VIEW PUSHED PREDICATE    |       |     1 |    15 |     1   (0)| 00:00:01 |
|   5 |     WINDOW BUFFER           |       | 86814 |  1102K|     1   (0)| 00:00:01 |
|*  6 |      INDEX RANGE SCAN       | LVI1  | 86814 |  1102K|     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("LV_RQST"=TO_NUMBER("KEVT1"."LV_RQST"))

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold


29 rows selected.


My problem now is that I cannot get this to back up to 9i. The parameter will set in 9i too, but I get the TABLE SCAN anyway.

I will keep working on it, but at least I know there is a fix coming.

Kevin
Re: Analytics and Predicate Pushing [message #567035 is a reply to message #567034] Sun, 23 September 2012 08:47 Go to previous messageGo to next message
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
Yet another hidden parameter I have never looked at. It is true by default for my 11.2.0.3. Setting _push_join_predicate true or false switches between your first plan and mine.

Sussed.
Re: Analytics and Predicate Pushing [message #567036 is a reply to message #567034] Sun, 23 September 2012 08:49 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
suss (ss)
tr.v. sussed, suss·ing, suss·es Slang
1. To infer or discover; figure out: "I think I'm good at sussing out what's going on" (Ry Cooder).
2. To size up; study: "Suss out the designers in whom you are interested" (Lucia van der Post).

--------------------------------------------------------------------------------

[Probably short for suspect.]
Re: Analytics and Predicate Pushing [message #567037 is a reply to message #567036] Sun, 23 September 2012 08:59 Go to previous messageGo to next message
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
The Who:

Hey you gettin drunk, so sorry, I got you sussed
Hey you smokin mother nature, this is a bust
Hey hung up old Mr. Normal don't try to gain my trust
'Cause you ain't gonna follow me any of those ways all though you think you must

Re: Analytics and Predicate Pushing [message #567038 is a reply to message #567037] Sun, 23 September 2012 09:02 Go to previous messageGo to next message
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
On a more serious note, can you create a stored outline on 10 or 11, and use that in 9?
Re: Analytics and Predicate Pushing [message #567039 is a reply to message #567038] Sun, 23 September 2012 09:03 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
yes, that is a great thought. Let me see if I can backport such a thing. thanks man.
Re: Analytics and Predicate Pushing [message #567041 is a reply to message #567039] Sun, 23 September 2012 10:05 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
OK, I am going to wait for Oracle to get back to me on the TAR. Seems 10.2.0.3 just won't do the index access. Here are the 10g and 11g outline info, and the query that I believe has the necessary hints to give me the plan I seek. But 10g just won't do it. Notice the very last line in each outline list. 10g says FULL, 11g says INDEX. Maybe it is an optimizer_upgrade in 11g.

Outline Data  --  10g
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.3')
      FIRST_ROWS(1)
      OUTLINE_LEAF(@"SEL$2")
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "KEVT1"@"SEL$1")
      NO_ACCESS(@"SEL$1" "X"@"SEL$1")
      LEADING(@"SEL$1" "KEVT1"@"SEL$1" "X"@"SEL$1")
      USE_NL(@"SEL$1" "X"@"SEL$1")
      FULL(@"SEL$2" "LV_PLN_USGE_FACT"@"SEL$2")
      END_OUTLINE_DATA
  */


Outline Data  --  11g
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.2.0.2')
      DB_VERSION('11.2.0.2')
      OPT_PARAM('optimizer_dynamic_sampling' 4)
      OPT_PARAM('_fix_control' '8198783:0 9728438:0')
      ALL_ROWS
      NO_PARALLEL
      OUTLINE_LEAF(@"SEL$639F1A6F")
      PUSH_PRED(@"SEL$1" "X"@"SEL$1" 1)
      OUTLINE_LEAF(@"SEL$1")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$1")
      FULL(@"SEL$1" "KEVT1"@"SEL$1")
      NO_ACCESS(@"SEL$1" "X"@"SEL$1")
      LEADING(@"SEL$1" "KEVT1"@"SEL$1" "X"@"SEL$1")
      USE_NL(@"SEL$1" "X"@"SEL$1")
      USE_HASH_AGGREGATION(@"SEL$1")
      INDEX(@"SEL$639F1A6F" "LV_PLN_USGE_FACT"@"SEL$2" ("LV_PLN_USGE_FACT"."LV_RQST"))
      END_OUTLINE_DATA
  */


select
  /*+
      INDEX(@"MYCRAP" "LV_PLN_USGE_FACT"@"MYCRAP" ("LV_PLN_USGE_FACT"."LV_RQST"))
      USE_NL(X)
  */
distinct *
from KEVT1
    ,(
       select /*+ qb_name(MYCRAP) */ LV_RQST,count(*) over (partition by lv_rqst) lv_pln_dys_rmn
       from lv_pln_usge_fact
     ) x
where kevt1.lv_rqst = x.lv_rqst
/

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(null,null,'basic +outline'));



I think I am done for now. Thanks again to both of you. I know this took you each some time to do and I appreciate it. I feel I have made significant progress to either a solution or a formal "can't do it on 10g" from Oracle.

Kevin
Re: Analytics and Predicate Pushing [message #567043 is a reply to message #567041] Sun, 23 September 2012 10:30 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
OK one more post (maybe). I found this document that desribes changes in the optimizer between 10g and 11g. Wouldn't you know it, but there is a section about EXTENDED JOIN PREDICATE PUSH DOWN. If I read it right, it is telling me that the feature I need is not in 10g.

Quote:
Extended Join Predicate Push Down

In previous releases when you had a SQL statement where a view V and a table T were joined by
a join predicate T.x = V.y, the Optimizer had only two possible join methods, a hash join or a
sort merge join to join T and V.

In Oracle Database 10g, we introduced the join predicate push down transformation, which
enabled the optimizer to push the join predicate into the view. So the join T.x = V.y becomes
T.x = T2.y (where T2 is the table inside view V, which has the column y in it) thereby opening
up the possibility of using a nested-loops join if an index is present on T2.y.

In Oracle Database 11g the join predicate push down capabilities have been extended to include
group by, distinct, anti-join, and semi-joins. For the following query;

SELECT p.prod_id, v1.row_count
FROM products p,
(SELECT s.prod_id, count(*) row_count
FROM sales s
WHERE s.quantity_sold BETWEEN 1 AND 47
GROUP BY s.prod_id) v1
WHERE p.supplier_id = 12
AND p.prod_id = v1.prod_id(+);

Although join predicate push down exists in Oracle Database 10g we can not use it due to the
group by. In the 10g we see the view v1 being evaluated followed by a hash join to the
PRODUCTS tables.

However, in 11g join predicate pushdown has become possible and we are now taking advantage
of the index on the sales table to do a nested-loops join instead of a hash join. The cost of the
new plan has come down from 79 to 28 because of join predicate pushdown.


If I am correct then my problem is a true limitation of the 10g optimizer and I am guessing that Oracle will tell same and that my only solution is to upgrade if I really want the feature.

Once again, thanks for all your help you two. I believe I have what I need to discuss this with my team and our customer.

Kevin

[Updated on: Sun, 23 September 2012 10:34]

Report message to a moderator

Re: Analytics and Predicate Pushing [message #567054 is a reply to message #567043] Sun, 23 September 2012 16:47 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
So the test for this would be to remove the DISTINCT on 10g and see if you can get a Nested Loops join

Ross Leishman
Re: Analytics and Predicate Pushing [message #567055 is a reply to message #567054] Sun, 23 September 2012 17:18 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
I don't think so. I think the DISTINCT is meaningless with respect to the problem. I put it there so that when I was running tests against real data, I only got one row returned instead of 9,600 rows in my sqlplus output.

The issue is the Analytic COUNT. It is an aggregate and as such I guess that means the 10g optimizer cannot handle it. At least that is how I see it. You see it a different way?

Re: Analytics and Predicate Pushing [message #567056 is a reply to message #567055] Sun, 23 September 2012 17:59 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
My mistake. I just re-read your old SQL. I thought you had the DISTINCT inside the inline view.

Still, analytic functions are not explicitly mentioned. Have you tried an INDEX(lv_pln_usge_fact) hint inside the inline view? Maybe worth a try just to give it a kick in the guts.

Ross Leishman
Re: Analytics and Predicate Pushing [message #567132 is a reply to message #567056] Mon, 24 September 2012 08:14 Go to previous message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
Just tried but no joy. You are correct, the manuals don't explicitly mention everything. I have never found a complete list anywhere but I am pretty sure analytics would count as an aggregation (eg. group by), so they keep things from working.

Thanks Ross, Kevin
Previous Topic: dbms_stats.gather_index_stats vs dbms_stats.gather_table_stats
Next Topic: appropriate memory
Goto Forum:
  


Current Time: Thu Mar 28 09:12:24 CDT 2024