Home » RDBMS Server » Performance Tuning » Need advice on index to apply (Oracle 11.2, Windows 2008)
Need advice on index to apply [message #601336] Tue, 19 November 2013 14:12
jxh461
Messages: 185
Registered: March 2005
Senior Member
I have a table that has column (datekey) of type number that represents a date in the format yyyymmdd.

I am running the query
select ....
from  Mytable i
where i.datekey > TO_CHAR ( (SYSDATE - 356), 'YYYYMMDD') 


The query runs for a very long time.

I wanted to ask: what is the best type of index to put on the column datekey?
I am thinking in terms of bitmap, reverse, etc.

The column is not unique. The data is distributed as follows.

select substr(i.datekey,1,4),count(*)
from mytable i
group by  substr(i.datekey,1,4) ;

2022	37
2013	7814566
2007	38
2012	1863015
Re: Need advice on index to apply [message #601338 is a reply to message #601336] Tue, 19 November 2013 14:53
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
>I have a table that has column (datekey) of type number that represents a date in the format yyyymmdd.
flawed & unprofessional "design"!

It should be DATE datatype.

>where i.datekey > TO_CHAR ( (SYSDATE - 356), 'YYYYMMDD')
implicit datatype conversion precludes use of any index
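Whichever way the design argument goes, the comparison can at least be kept numeric on both sides, so that datekey itself is never converted. A sketch against the poster's table (365 is assumed here to mean "one year back"; the original post used 356):

```sql
-- Convert SYSDATE explicitly, once, on the constant side of the
-- predicate; datekey stays a plain NUMBER and remains indexable.
SELECT COUNT(*)
  FROM mytable i
 WHERE i.datekey > TO_NUMBER(TO_CHAR(SYSDATE - 365, 'YYYYMMDD'));
```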
Re: Need advice on index to apply [message #601344 is a reply to message #601338] Tue, 19 November 2013 16:41
jxh461
Messages: 185
Registered: March 2005
Senior Member
What if there was no implicit datatype conversion and the filter was

>where i.datekey > 20121119

?
Re: Need advice on index to apply [message #601346 is a reply to message #601344] Tue, 19 November 2013 18:37
cookiemonster
Messages: 13917
Registered: September 2008
Location: Rainy Manchester
Senior Member
The query retrieves > 80% of the data.
You do not want it to use an index, full table scan is far more efficient when retrieving large percentages of data.
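A quick way to confirm what fraction of the table the predicate actually keeps (a sketch using the poster's table and literal):

```sql
-- Percentage of rows the filter would return; for this data it should
-- come out near the ~80% cookiemonster inferred from the year counts.
SELECT ROUND(100 * SUM(CASE WHEN i.datekey > 20121119 THEN 1 ELSE 0 END)
             / COUNT(*), 1) AS pct_of_rows
  FROM mytable i;
```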
Re: Need advice on index to apply [message #601348 is a reply to message #601346] Tue, 19 November 2013 19:32
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
cookiemonster is on the money.

The primary purpose of an index is to find a very small number of rows from a much larger number of rows. How small is small? I use the 2% rule: if I want less than 2% of the rows in a table, I accept Oracle choosing an index; if I want more than 2%, I usually expect to see a full table scan. So what % of the table do you want: more than 2%, or less than 2%?

Second, I have never figured out why people take a date, convert it to a number, then try to force the number to behave like a date.

Third, a BITMAP index is only for star schemas. Other uses, though potentially valid, are riding the edge of pain.

Fourth, a BITMAP index is not for low-cardinality columns. That is a misinterpretation of Oracle's original commentary (badly presented as it was) about what these indexes are for. You have to understand the problem that STAR SCHEMAS are trying to solve, realize from this understanding why a set of BTREE indexes cannot provide the necessary indexing to solve this problem, then understand the one significant feature of a BITMAP index that makes it the type of index to use for a star schema (think Cranium Rats from Planescape Torment).
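For completeness, the star-schema setup Kevin is pointing at typically looks like this (hypothetical fact and dimension names; the significant feature is that Oracle can AND/OR the bitmaps from several such indexes together during a star transformation, which a set of separate B-tree indexes cannot do efficiently):

```sql
-- One bitmap index per fact-table foreign key column.
CREATE BITMAP INDEX sales_fact_date_bix ON sales_fact (date_key);
CREATE BITMAP INDEX sales_fact_prod_bix ON sales_fact (product_key);
CREATE BITMAP INDEX sales_fact_cust_bix ON sales_fact (customer_key);
```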

Good luck. Kevin
Re: Need advice on index to apply [message #601382 is a reply to message #601348] Wed, 20 November 2013 05:44
Lalit Kumar B
Messages: 3174
Registered: May 2013
Location: World Wide on the Web
Senior Member
What % of table data can be retrieved before an index stops being worthwhile is a tricky thing.

Documentation says <15%.

Per documentation -

"Create an index if you frequently want to retrieve less than 15% of the rows in a large table. The percentage varies greatly according to the relative speed of a table scan and how the distribution of the row data in relation to the index key. The faster the table scan, the lower the percentage; the more clustered the row data, the higher the percentage."


I would say clustering_factor can reveal a lot. But again, the percentage is quite tricky.
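The clustering_factor Lalit mentions can be checked directly in the data dictionary. A sketch (table name as in the original post; compare clustering_factor to the table's block count, not its row count):

```sql
-- A clustering_factor near the table's BLOCKS means the rows are
-- well ordered relative to the index key; a value near NUM_ROWS means
-- nearly every index entry points at a different block.
SELECT i.index_name, i.clustering_factor, i.num_rows, t.blocks
  FROM user_indexes i
  JOIN user_tables  t ON t.table_name = i.table_name
 WHERE i.table_name = 'MYTABLE';
```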
Re: Need advice on index to apply [message #601391 is a reply to message #601382] Wed, 20 November 2013 08:34
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
Nice to see someone reading the manuals for a change. Here is the truth.

1. The manuals are wrong. It is not 15%.

2. I use 2%, but 2% is just a number. Your Mileage May Vary. Yes, clustering factor makes a difference, as do disk drive speed, available memory, number of columns and row width, and maybe several other factors. But 15% is Oracle's attempt to straddle the fence: it lets them introduce the idea that full table scans are good whilst staying less controversial. Anyone can create an example to show any percentage they want. What we need to know is how things work on "MY" database. And it ain't 15%. For most tables and most databases it will be 2% or less, likely less. My 2% number I computed some three years ago across about 40 different databases where I work, so it is getting old and has not kept up with technology. The number is lower than this now for most people, and then there is EXADATA, for which it can easily be 1/10th of a percent. I don't particularly care about being controversial. I am only interested in what works for me and gives me a tuning edge over others where I work, and it is this experience that I give to others.

3. Clustering factor is too complicated to work with on a daily basis. Indeed, one can argue that any metric you use will have flaws. Given that, I choose a metric that is easy to work with yet still accurately reflects reality 99% of the time: %-of-rows.

4. You can compute your own 2% rule if you like. Devise some tests using different indexes on some large tables (a table that takes, say, 2 minutes or more to scan, even 20 if you are willing to wait). Then create some selects that extract different percentages of the table and write the rows out to another table. Then compare this to the same queries doing full table scans (use a hint). It is a crude method, maybe, and it can be affected by different types of caching, but with a little effort and care you will zero in on an average for the table. Do this for several large tables and you will zero in on an average for your database. Aside from some extreme cases, they should all be within a point or two of each other.

5. The 2% rule, as I call it, is just a guide. But it is a useful guide. It gives you an idea of what the query should be doing with the four primary characteristics that make up a query (access method, join method, driving table, join order). That is what I use it for. I have a MAGIC query that gives me an idea of what a query plan should look like. It is based on the 2% rule concept and is almost never wrong. It helps me spot problems in a query plan. It works very similarly to dynamic sampling.
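The crude test outlined in point 4 can be driven with hints, something along these lines (a sketch; the index name is a placeholder, and the cutoff literal would be varied to hit different selectivities):

```sql
-- Time the same extraction both ways, at several selectivities,
-- and note where the index access stops winning.
CREATE TABLE t_via_index AS
SELECT /*+ INDEX(i mytable_datekey_ix) */ *
  FROM mytable i
 WHERE i.datekey > 20131101;    -- small slice of the table

CREATE TABLE t_via_full AS
SELECT /*+ FULL(i) */ *
  FROM mytable i
 WHERE i.datekey > 20131101;
```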

Here is an example of what I call FILTERED ROWS PERCENTAGE SPREADSHEET (FRP):

ID    TABLE_OWNER    TABLE_NAME    TA   NUM_ROWS   ROWCOUNT CARDINALITY FILTERED_CARDINALITY ACTUAL_FRP PLAN_FRP
----- -------------- ------------- -- ---------- ---------- ----------- -------------------- ---------- --------
    5 CLDW_THPA_DLV1 EMP_LOC_DIM   EL     328402     328402       14125                18217        5.5      4.3
    9 CLDW_THPA_DLV1 EMP_DIM       EE    6140701    6140701      416846               215204        3.5      6.8
   10 CLDW_THPA_DLV1 EMP_DIM       EE    6140701    6140701      416846               215204        3.5      6.8
   11 CLDW_THPA_DLV1 EMP_LOC_DIM   EL     328402     328402      219850               212863       64.8     66.9
   15 CLDW_THPA_DLV1 EMPLR_LOC_DIM E        8869       8869        8862                 8867      100.0     99.9
   17 CLDW_THPA_DLV1 EMPLR_LOC_DIM E        8869       8869        8862                 8867      100.0     99.9
   19 CLDW_THPA_DLV1 EMP_DIM       EE    6140701    6140701      463508               236162        3.8      7.5
   21 CLDW_THPA_DLV1 EMP_LOC_DIM   EL     328402     328402      219850               212863       64.8     66.9
   21 CLDW_THPA_DLV1 EMP_DIM       EE    6140701    6140701      463508               236162        3.8      7.5
   23 CLDW_THPA_DLV2 EMP_LOC_DIM   EL     328850     329145       22445                18217        5.5      6.8
   23 CLDW_THPA_DLV1 EMP_LOC_DIM   EL     328402     328402      219850               212863       64.8     66.9
   27 CLDW_THPA_DLV2 EMP_DIM       EE    6141490    6141490      325660               215202        3.5      5.3
   29 CLDW_THPA_DLV2 EMP_DIM       EE    6141490    6141490      325660               215202        3.5      5.3
   30 CLDW_THPA_DLV2 EMP_LOC_DIM   EL     328850     329145      244823               212863       64.7     74.4
   33 CLDW_THPA_DLV2 EMPLR_LOC_DIM E        8869       8869        8869                 8867      100.0    100.0
   36 CLDW_THPA_DLV2 EMPLR_LOC_DIM E        8869       8869        8869                 8867      100.0    100.0
   37 CLDW_THPA_DLV2 EMP_DIM       EE    6141490    6141490      292765               236186        3.8      4.8
   39 CLDW_THPA_DLV2 EMP_LOC_DIM   EL     328850     329145      244823               212863       64.7     74.4
   40 CLDW_THPA_DLV2 EMP_DIM       EE    6141490    6141490      292765               236186        3.8      4.8
   42 CLDW_THPA_DLV2 EMP_LOC_DIM   EL     328850     329145      244823               212863       64.7     74.4

20 rows selected.

From the above I learn the following:
1. stats appear to be representative (num_rows matches actual rowcount)
2. plan cardinality estimates are on the money (cardinality match actual filtered cardinality)
3. %-of-rows says FULL TABLE SCAN and HASH JOIN (all fetches are taking more than 2% of the rows)

With this information in hand I can review the query plan (not shown here; sorry, I can't find it anywhere) to see what it is doing. I see that it is doing all full table scans and hash joins, as the 2% rule suggested it should. Thus there are no immediately obvious flaws in what the optimizer did with this plan. It is very likely the right plan (no early indication of cardinality errors, right access methods and join methods). Not many people can easily make this statement, but now I can. In the end this means that this query, which takes 30 seconds on average to run, is supposed to take 30 seconds to run. If we want it to go faster it will take something more than basic tuning. In this case we used materialized views to take the runtime down to 1/2 second.

Found it...
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------
Plan hash value: 3600664198

------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                |     1 |    42 |       |   101K  (2)| 00:00:04 |
|   1 |  SORT AGGREGATE                   |                |     1 |    42 |       |            |          |
|*  2 |   VIEW                            | VW_EMP_LOC_DIM |   478M|    18G|       |   101K  (2)| 00:00:04 |
|   3 |    UNION-ALL                      |                |       |       |       |            |          |
|*  4 |     HASH JOIN                     |                | 24569 |  1055K|       | 25796   (1)| 00:00:01 |
|*  5 |      TABLE ACCESS STORAGE FULL    | EMP_LOC_DIM    | 14125 |   358K|       |   437   (1)| 00:00:01 |
|*  6 |      VIEW                         |                |   416K|  7327K|       | 25357   (1)| 00:00:01 |
|*  7 |       WINDOW SORT PUSHED RANK     |                |   416K|    23M|    31M| 25357   (1)| 00:00:01 |
|*  8 |        FILTER                     |                |       |       |       |            |          |
|*  9 |         TABLE ACCESS STORAGE FULL | EMP_DIM        |   416K|    23M|       | 19846   (1)| 00:00:01 |
|  10 |     COUNT                         |                |       |       |       |            |          |
|* 11 |      HASH JOIN                    |                |   293M|    13G|       | 26468   (3)| 00:00:01 |
|  12 |       VIEW                        |                |  8862 | 62034 |       |    24   (5)| 00:00:01 |
|  13 |        HASH UNIQUE                |                |  8862 |   709K|       |    24   (5)| 00:00:01 |
|* 14 |         FILTER                    |                |       |       |       |            |          |
|* 15 |          TABLE ACCESS STORAGE FULL| EMPLR_LOC_DIM  |  8862 |   709K|       |    23   (0)| 00:00:01 |
|  16 |       VIEW                        |                |   463K|    18M|       | 25848   (1)| 00:00:01 |
|  17 |        MINUS                      |                |       |       |       |            |          |
|  18 |         SORT UNIQUE               |                |   463K|    15M|    23M|            |          |
|* 19 |          TABLE ACCESS STORAGE FULL| EMP_DIM        |   463K|    15M|       | 19732   (1)| 00:00:01 |
|  20 |         SORT UNIQUE               |                |   219K|  5582K|  8632K|            |          |
|* 21 |          TABLE ACCESS STORAGE FULL| EMP_LOC_DIM    |   219K|  5582K|       |   437   (1)| 00:00:01 |
|* 22 |     HASH JOIN                     |                | 30499 |  1310K|       | 24667   (1)| 00:00:01 |
|* 23 |      TABLE ACCESS STORAGE FULL    | EMP_LOC_DIM    | 22445 |   569K|       |   433   (1)| 00:00:01 |
|* 24 |      VIEW                         |                |   325K|  5724K|       | 24232   (1)| 00:00:01 |
|* 25 |       WINDOW SORT PUSHED RANK     |                |   325K|    18M|    24M| 24232   (1)| 00:00:01 |
|* 26 |        FILTER                     |                |       |       |       |            |          |
|* 27 |         TABLE ACCESS STORAGE FULL | EMP_DIM        |   325K|    18M|       | 19927   (1)| 00:00:01 |
|  28 |     COUNT                         |                |       |       |       |            |          |
|* 29 |      HASH JOIN                    |                |   185M|  8666M|       | 24966   (2)| 00:00:01 |
|  30 |       VIEW                        |                |  8869 | 62083 |       |    21   (5)| 00:00:01 |
|  31 |        HASH UNIQUE                |                |  8869 |   710K|       |    21   (5)| 00:00:01 |
|* 32 |         FILTER                    |                |       |       |       |            |          |
|* 33 |          TABLE ACCESS STORAGE FULL| EMPLR_LOC_DIM  |  8869 |   710K|       |    20   (0)| 00:00:01 |
|  34 |       VIEW                        |                |   292K|    11M|       | 24568   (1)| 00:00:01 |
|  35 |        MINUS                      |                |       |       |       |            |          |
|  36 |         SORT UNIQUE               |                |   292K|    10M|    14M|            |          |
|* 37 |          TABLE ACCESS STORAGE FULL| EMP_DIM        |   292K|    10M|       | 19809   (1)| 00:00:01 |
|  38 |         SORT UNIQUE               |                |   244K|  6216K|  9616K|            |          |
|* 39 |          TABLE ACCESS STORAGE FULL| EMP_LOC_DIM    |   244K|  6216K|       |   433   (1)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("E"."GRP_BEN_CASE_ID"='499489')
   4 - access("EMP_GID"="EL"."EMP_LOC_GID")
   5 - storage("EL"."GRP_BEN_CASE_ID"='499489' AND "EL"."POPULATION_STATUS_CD"<>'D')
       filter("EL"."GRP_BEN_CASE_ID"='499489' AND "EL"."POPULATION_STATUS_CD"<>'D')
   6 - filter("PRTY_REF_ID_RNK"=1)
   7 - filter(RANK() OVER ( PARTITION BY "GRP_BEN_CASE_ID","PRTY_REF_ID" ORDER BY CASE
              "EE"."EMPLMT_STAT_CD" WHEN 'T' THEN 2 WHEN 'R' THEN 2 WHEN 'D' THEN 2 WHEN 'I' THEN 2 ELSE 1 END
              ,INTERNAL_FUNCTION("EE"."SRCE_EFF_START_TMSP") DESC ,INTERNAL_FUNCTION("EE"."EMP_PK_ID") DESC )<=1)
   8 - filter(SYS_CONTEXT('APP1','CURRENT_SCHEMA')='CLDW_THPA_DLV1')
   9 - storage("EE"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00' AND
              "EE"."SRCE_APP_SYS_CD"='ELIG')
       filter("EE"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00' AND
              "EE"."SRCE_APP_SYS_CD"='ELIG')
  11 - access("A"."GRP_BEN_CASE_ID"="B"."GRP_BEN_CASE_ID")
  14 - filter(SYS_CONTEXT('APP1','CURRENT_SCHEMA')='CLDW_THPA_DLV1')
  15 - storage("E"."EMPLR_LOC_PK_ID"<>(-1) AND "E"."EMPLR_LOC_PK_ID"<>(-2) AND
              "E"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00')
       filter("E"."EMPLR_LOC_PK_ID"<>(-1) AND "E"."EMPLR_LOC_PK_ID"<>(-2) AND
              "E"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00')
  19 - storage("EE"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00' AND
              "EE"."POPULATION_STATUS_CD"<>'D' AND "EE"."EMP_PK_ID"<>(-1) AND "EE"."EMP_PK_ID"<>(-2))
       filter("EE"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00' AND
              "EE"."POPULATION_STATUS_CD"<>'D' AND "EE"."EMP_PK_ID"<>(-1) AND "EE"."EMP_PK_ID"<>(-2))
  21 - storage("EL"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00' AND
              "EL"."POPULATION_STATUS_CD"<>'D')
       filter("EL"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00' AND
              "EL"."POPULATION_STATUS_CD"<>'D')
  22 - access("EMP_GID"="EL"."EMP_LOC_GID")
  23 - storage("EL"."GRP_BEN_CASE_ID"='499489' AND "EL"."POPULATION_STATUS_CD"<>'D')
       filter("EL"."GRP_BEN_CASE_ID"='499489' AND "EL"."POPULATION_STATUS_CD"<>'D')
  24 - filter("PRTY_REF_ID_RNK"=1)
  25 - filter(RANK() OVER ( PARTITION BY "GRP_BEN_CASE_ID","PRTY_REF_ID" ORDER BY CASE
              "EE"."EMPLMT_STAT_CD" WHEN 'T' THEN 2 WHEN 'R' THEN 2 WHEN 'D' THEN 2 WHEN 'I' THEN 2 ELSE 1 END
              ,INTERNAL_FUNCTION("EE"."SRCE_EFF_START_TMSP") DESC ,INTERNAL_FUNCTION("EE"."EMP_PK_ID") DESC )<=1)
  26 - filter(SYS_CONTEXT('APP1','CURRENT_SCHEMA')='CLDW_THPA_DLV2')
  27 - storage("EE"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00' AND
              "EE"."SRCE_APP_SYS_CD"='ELIG')
       filter("EE"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00' AND
              "EE"."SRCE_APP_SYS_CD"='ELIG')
  29 - access("A"."GRP_BEN_CASE_ID"="B"."GRP_BEN_CASE_ID")
  32 - filter(SYS_CONTEXT('APP1','CURRENT_SCHEMA')='CLDW_THPA_DLV2')
  33 - storage("E"."EMPLR_LOC_PK_ID"<>(-1) AND "E"."EMPLR_LOC_PK_ID"<>(-2) AND
              "E"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00')
       filter("E"."EMPLR_LOC_PK_ID"<>(-1) AND "E"."EMPLR_LOC_PK_ID"<>(-2) AND
              "E"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00')
  37 - storage("EE"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00' AND
              "EE"."POPULATION_STATUS_CD"<>'D' AND "EE"."EMP_PK_ID"<>(-1) AND "EE"."EMP_PK_ID"<>(-2))
       filter("EE"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00' AND
              "EE"."POPULATION_STATUS_CD"<>'D' AND "EE"."EMP_PK_ID"<>(-1) AND "EE"."EMP_PK_ID"<>(-2))
  39 - storage("EL"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00' AND
              "EL"."POPULATION_STATUS_CD"<>'D')
       filter("EL"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00' AND
              "EL"."POPULATION_STATUS_CD"<>'D')

Note
-----
   - dynamic sampling used for this statement (level=4)
   - automatic DOP: Computed Degree of Parallelism is 1

108 rows selected.


This is one of the things I do anyway. If you don't like 2% then figure out your own number, but it ain't 15%.
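One way to gather the actual-versus-estimated cardinalities that feed a spreadsheet like the FRP above is row-source execution statistics (a sketch, assuming 10g or later; not necessarily how Kevin's own script collects them):

```sql
-- Run the statement once with row-source statistics enabled...
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
  FROM mytable i
 WHERE i.datekey > 20121119;

-- ...then compare E-Rows (estimate) to A-Rows (actual) per plan line.
SELECT *
  FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```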

Again I commend you for taking the time to read the manuals. Kevin

[Updated on: Wed, 20 November 2013 08:49]


Re: Need advice on index to apply [message #601399 is a reply to message #601391] Wed, 20 November 2013 09:14
Lalit Kumar B
Messages: 3174
Registered: May 2013
Location: World Wide on the Web
Senior Member
Kevin Meade wrote on Wed, 20 November 2013 20:04
If you don't like 2% then figure out your own number, but it ain't 15%.


This is really a good demonstration, Kevin. Of course, the manual says about 15%, but without any solid proof, and a qualifying statement immediately follows it: the percentage depends highly on several factors. So we are back to the golden words from Oracle: "It depends". I have witnessed index usage with almost 12% of the table rows being retrieved. I can't advocate a single number, since it really depends on so many factors, from "My DB" to "Your DB" and...

PS: I am going to steal your idea of the FILTERED ROWS PERCENTAGE SPREADSHEET (FRP) :)
Re: Need advice on index to apply [message #601403 is a reply to message #601399] Wed, 20 November 2013 09:40
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
Steal away, friend. That is what sharing is about.

FRP is a topic in my book (if I can ever get it done), and one of the scripts the book will offer.

Good luck. Kevin

ps. Don't be too quick to give me credit for these ideas. They are nothing new; the same stuff was true back in 1985. There just wasn't any hash join back then. I remember Dan Tow and his book: cool stuff and very on the money. Similar ideas at work there. The book is still relevant even though it does not address hash joins (they were not around when he wrote the book).

[Updated on: Wed, 20 November 2013 09:42]


Re: Need advice on index to apply [message #601424 is a reply to message #601391] Wed, 20 November 2013 15:51
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
I always learn something from your posts, Kevin. May I try to learn some more?

One "red flag" I look for in an exec plan is materializing a view (or subquery). I always go back to the code and ask whether the SQL construct that forced Oracle to materialize the views is actually necessary. Many programmers throw in a DISTINCT or a MAX when it isn't needed: perhaps they are afraid of getting a duplicate, so they use such tricks to remove it. But if they really understand their data, they should know whether a duplicate is possible, and if so, they should handle it properly.

Now, your plan materializes several views. I was looking at these examples:
|  12 |       VIEW                        |                |  8862 | 62034 |       |    24   (5)| 00:00:01 |
|  13 |        HASH UNIQUE                |                |  8862 |   709K|       |    24   (5)| 00:00:01 |
|* 14 |         FILTER                    |                |       |       |       |            |          |
|* 15 |          TABLE ACCESS STORAGE FULL| EMPLR_LOC_DIM  |  8862 |   709K|       |    23   (0)| 00:00:01 |

  14 - filter(SYS_CONTEXT('APP1','CURRENT_SCHEMA')='CLDW_THPA_DLV1')
  15 - storage("E"."EMPLR_LOC_PK_ID"<>(-1) AND "E"."EMPLR_LOC_PK_ID"<>(-2) AND
              "E"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00')
       filter("E"."EMPLR_LOC_PK_ID"<>(-1) AND "E"."EMPLR_LOC_PK_ID"<>(-2) AND
              "E"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00')

|  30 |       VIEW                        |                |  8869 | 62083 |       |    21   (5)| 00:00:01 |
|  31 |        HASH UNIQUE                |                |  8869 |   710K|       |    21   (5)| 00:00:01 |
|* 32 |         FILTER                    |                |       |       |       |            |          |
|* 33 |          TABLE ACCESS STORAGE FULL| EMPLR_LOC_DIM  |  8869 |   710K|       |    20   (0)| 00:00:01 |

  32 - filter(SYS_CONTEXT('APP1','CURRENT_SCHEMA')='CLDW_THPA_DLV2')
  33 - storage("E"."EMPLR_LOC_PK_ID"<>(-1) AND "E"."EMPLR_LOC_PK_ID"<>(-2) AND
              "E"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00')
       filter("E"."EMPLR_LOC_PK_ID"<>(-1) AND "E"."EMPLR_LOC_PK_ID"<>(-2) AND
              "E"."SRCE_EFF_END_TMSP"=TIMESTAMP' 9999-12-31 00:00:00')

I would ask "why did those views need to be materialized (probably a DISTINCT), and as they are so similar, could they be combined into one?"

So my question is, do you think that my red flag for materializing a view is valid, or a waste of time? Do you think my approach of trying to combine subqueries and re-write them to make them mergeable is useful? I mean in general, not necessarily for this particular query.
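The kind of rewrite John is describing, on hypothetical tables, trades an unnecessary DISTINCT inside a view for a mergeable semi-join:

```sql
-- Original shape: the DISTINCT forces a non-merged view (HASH UNIQUE).
SELECT d.dept_name
  FROM dept d
  JOIN (SELECT DISTINCT dept_id FROM emp) e
    ON e.dept_id = d.dept_id;

-- Rewrite: if duplicates in emp.dept_id were the only worry, a
-- semi-join says the same thing and leaves the optimizer free.
SELECT d.dept_name
  FROM dept d
 WHERE EXISTS (SELECT 1 FROM emp e WHERE e.dept_id = d.dept_id);
```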


Re: Need advice on index to apply [message #601426 is a reply to message #601424] Wed, 20 November 2013 16:23
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
I do not spend a lot of time working on the mergeable/non-mergeable question of SQL text. So I would suggest that this is something you likely have far more experience with than I do.

As evidence of same, I point out that even though you have never seen it, you are 100% correct about the contents of the view. It has several features making it a bit complex.

1. it is some 300 lines of UNION ALL of four selects
2. each of these selects itself is a series of nested table expressions
3. each sequence of table expressions is doing its own group operations with at least two of the following in each:
   a. ANALYTIC
   b. DISTINCT
   c. MINUS
   d. ROWNUM

So there are lots of stopping points in the query for potentially doing a materialize and little hope of changing it.

I would never tell you to stop looking at code for inefficiencies. It seems to me your point is more about improper use of unneeded SQL features, and thus a basic flaw in someone's logic. It is always worthwhile to remove an unnecessary step and correct misleading logic if a query is performing poorly, so keep doing that.

As a general rule, I think materializing a rowset is indeed a red flag. But I always caution myself when looking at them:

1. I don't understand the rules that well, so I make myself go slow when thinking about them.

2. I need to be careful not to change the semantics of the query, since many times the change needed to undo the materialization is one that changes the meaning of the query.

3. I generally use ROWNUM >= 1 to prevent merging and force a materialization, more often than the other way round.  THIS IS BAD PRACTICE ON MY PART, but demonstrative of the limitations of my own thinking.  I do it very infrequently, but when I want to stop some specific join order etc. from occurring, this has always been an easy way to put in a "stop here and do this stuff first" solution.  But I don't want other people to start doing this, as it is just a hack, so please forget that I mentioned it.  It prevents various features from being considered, and this is bad.  If I were smarter I would devise a better way out of the performance issue in these cases.
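For reference, the ROWNUM trick described in point 3, and the documented hint that achieves the same effect less hackishly, look like this on a hypothetical query:

```sql
-- Hack: ROWNUM >= 1 makes the inline view non-mergeable,
-- forcing Oracle to evaluate it first.
SELECT *
  FROM (SELECT t.*
          FROM big_table t
         WHERE t.flag = 'Y'
           AND ROWNUM >= 1);

-- Supported alternative: tell the optimizer not to merge the view.
SELECT /*+ NO_MERGE(v) */ *
  FROM (SELECT t.*
          FROM big_table t
         WHERE t.flag = 'Y') v;
```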


Not sure any of this helps, but there you go. Kevin