Home » RDBMS Server » Performance Tuning » A question about how to code a procedure to use dbms_parallel_execute package for a huge insert (11.2.0.4 on Linux)
A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672876] Tue, 30 October 2018 09:45 Go to next message
orausern
Messages: 826
Registered: December 2005
Senior Member
Hi Experts,

I am on Oracle 11.2.0.4 on Linux. I have a query to tune and a scenario in which the query executes fine for smaller data sets and takes more time for huge data set - when it inserts more than one million rows. Our application timeout is 60 seconds and when the data inserts is more than 1 million we are getting timeout. I want to try out dbms_parallel_execute package for this - to change that query to use this package. This question is specifically about this requirement. I am asking this question to specifically know about how to use DBMS_PARALLEL_EXECUTE package to make an insert go in parallel. (Please do not spend your time on other aspects of tuning as I will raise a new question for it if I need help on that.)

For this I have the following setup: Basically a table with 4 column primary key (to mimic my actual issue) and a simpler insert into it from another table holding 1 million rows. But I am stumped at how to use dbms_parallel_execute for this. So many websites I referred are all giving references on DBMS_PARALLEL_EXECUTE for update statements but not for inserts. If someone can share how the insert can be paralleled - and for a table that does not have a single column primary key but a composite 4 column primary key, that will be very useful to me.


Test data setup ...the sql to be made parallel by dbms_parallel_execute is the last sql at the end of the data setup:


 create table test  
   (owner varchar2(30),  
     object_name varchar2(128),  
    object_id number,  
    object_id_2 number,    
 create_date date default sysdate);  
  
  
create table test2  
   (owner varchar2(30),  
     object_name varchar2(128),  
    object_id number,  
    object_id_2 number,    
 create_date date default sysdate);  
  
  
alter table test add constraint test_pk primary key (owner, object_name, object_id,object_id_2);  
  
alter table test2 add constraint test2_pk primary key (owner, object_name, object_id,object_id_2);  
  
insert /*+ append */ into test select owner, object_name,object_id,1 ,sysdate from dba_objects where object_id is not null;  
  
insert /*+ append */ into test select owner, object_name,object_id,2 ,sysdate from dba_objects where object_id is not null;  
  
  
commit;  
  
  
insert /*+ append */ into test select owner, object_name,object_id,3 ,sysdate from dba_objects where object_id is not null;  
  
commit;  
  
insert /*+ append */ into test select owner, object_name,object_id,4 ,sysdate from dba_objects where object_id is not null;  
  
commit;  
  
insert /*+ append */ into test select owner, object_name,object_id,5 ,sysdate from dba_objects where object_id is not null;  
  
commit;  
  
  
17:56:57 testuser@db>select count(*) from test;  
  
  
  COUNT(*)  
----------  
   1141645  
  
  
--actual sql to be made parallel:..this is to be converted to use dbms_parallel_execute packae  

insert  into test2 select owner, object_name,object_id,object_id_2 ,sysdate from test where object_id is not null;  

Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672877 is a reply to message #672876] Tue, 30 October 2018 10:26 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
why are you certain that DBMS_PARALLEL_EXECUTE is the optimal solution; especially since you don't know how to actually use it?

https://blogs.oracle.com/warehousebuilder/parallel-processing-with-dbmsparallelexecute

>Basically a table with 4 column primary key
What is PK in above example?
Is PK a SEQUENCE?
Why not parallel the SELECT that feeds the INSERT?
Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672878 is a reply to message #672877] Tue, 30 October 2018 10:33 Go to previous messageGo to next message
orausern
Messages: 826
Registered: December 2005
Senior Member
Thanks Blackswan. I want to try this out - dbms_parallel_execute. But it is only one step in final decision. I will not stick to it and will be open for other solutions - for instance I checked for append (and gave up due to its other locking issues and so on). But at least I want to try this out and I am stuck due to syntax and other issues. So this is a specific syntax related question.

Regarding PK, pls. read the question, I have posted it there , it is not based on a sequence.
Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672879 is a reply to message #672876] Tue, 30 October 2018 10:38 Go to previous messageGo to next message
cookiemonster
Messages: 13917
Registered: September 2008
Location: Rainy Manchester
Senior Member
So for dbms_paralell_execute to work you need to divide the data you're working on into chunks.
In most of the examples it's chunking a table to be updated.
To make it work with an insert/select you need to chunk (one of) the table(s) you're selecting from.

So using the example from the [url=docs]https://docs.oracle.com/database/121/ARPLS/d_parallel_ex.htm#ARPLS67339[/url]
You would call CREATE_CHUNKS_BY_ROWID passing 'test' for the table name.
You'd create a sql statement like this:
l_sql_stmt := 'insert  into test2 select owner, object_name,object_id,object_id_2 ,sysdate 
               from test where object_id is not null and rowid BETWEEN :start_id AND :end_id';

Then you'd call RUN_TASK.

All the other code should be as per the example.
Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672880 is a reply to message #672879] Tue, 30 October 2018 10:52 Go to previous messageGo to next message
orausern
Messages: 826
Registered: December 2005
Senior Member
Thanks cookiemonster. This gives me a lot of guidance. My actual issue at work is a massive three table join so here how to paas row id? - but I have a way out- What I will try out now is using this example to insert the result set first into a GTT and then use rowid from that GTT. I hope to have it working by tomorrow and will share my finding.
Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672881 is a reply to message #672880] Tue, 30 October 2018 11:11 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
>Our application timeout is 60 seconds
why not just disable this silly & self-inflicted timeout for the session loading the data?
Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672882 is a reply to message #672880] Tue, 30 October 2018 11:30 Go to previous messageGo to next message
cookiemonster
Messages: 13917
Registered: September 2008
Location: Rainy Manchester
Senior Member
orausern wrote on Tue, 30 October 2018 15:52
Thanks cookiemonster. This gives me a lot of guidance. My actual issue at work is a massive three table join so here how to paas row id?
You chunk one of the three, probably whichever is likeliest to be the biggest at any given point in time.
then your insert would be:
insert into tablea
select whatever
from table1
join table2 on ..
join table3 on ..
where table1.rowid between ... and ...
and whatever


orausern wrote on Tue, 30 October 2018 15:52

but I have a way out- What I will try out now is using this example to insert the result set first into a GTT and then use rowid from that GTT. I hope to have it working by tomorrow and will share my finding.
If you want to speed things up enough to avoid your timeout a GTT is unlikely to improve matters.

Can't you bypass the timeout for this operation? You could always submit a job so it runs in the background.
Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672883 is a reply to message #672881] Tue, 30 October 2018 11:36 Go to previous messageGo to next message
orausern
Messages: 826
Registered: December 2005
Senior Member
Hi BlackSwan,

I feel bad about your comment. 60 second is not silly. In this days of competition , companies do need good performance. Dont understand why you thought 60 second timeout is silly.

[Updated on: Tue, 30 October 2018 11:37]

Report message to a moderator

Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672884 is a reply to message #672882] Tue, 30 October 2018 11:40 Go to previous messageGo to next message
orausern
Messages: 826
Registered: December 2005
Senior Member
Thanks a lot! I will try out this way!! The changes have to be done within timeout so 60 second is all that we have and in that 1 Million rows to be inserted. But I hope it will be manageable! Will update by tomorrow what i learn by trying this out and thanks again!!

[Updated on: Tue, 30 October 2018 11:40]

Report message to a moderator

Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672885 is a reply to message #672883] Tue, 30 October 2018 11:44 Go to previous messageGo to next message
cookiemonster
Messages: 13917
Registered: September 2008
Location: Rainy Manchester
Senior Member
orausern wrote on Tue, 30 October 2018 16:36
Hi BlackSwan,

I feel bad about your comment. 60 second is not silly. In this days of competition , companies do need good performance. Dont understand why you thought 60 second timeout is silly.

60 seconds for day to day operations (making an order on a website) is fine (possibly on the high side).
60 seconds for processes that move millions of records around in one go though is completely unrealistic.
Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672886 is a reply to message #672885] Tue, 30 October 2018 11:53 Go to previous messageGo to next message
orausern
Messages: 826
Registered: December 2005
Senior Member
Oh ok...I do see the point here! Sorry I was wrong then. I am sorry BlackSwan.
Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672887 is a reply to message #672885] Tue, 30 October 2018 12:03 Go to previous messageGo to next message
JPBoileau
Messages: 88
Registered: September 2017
Member
cookiemonster wrote on Tue, 30 October 2018 11:44

60 seconds for day to day operations (making an order on a website) is fine (possibly on the high side).
If your end-user has to wait 60 seconds to get an "order confirmed" message, you're going to have a LOT of re-clicks and duplicate orders!

JP

[Updated on: Tue, 30 October 2018 12:03]

Report message to a moderator

Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672889 is a reply to message #672887] Wed, 31 October 2018 04:35 Go to previous messageGo to next message
cookiemonster
Messages: 13917
Registered: September 2008
Location: Rainy Manchester
Senior Member
True - I was probably being overly generous with the "possibly on the high side"

That said - networks can be slow, not just DBs, and most websites (if they're sensible) will disable the order button until it gets a response back.
Usually when I order something on amazon it completes in a few seconds but I have seen it take close to a minute a few times.

If the DB end is taking much more than a second or two you've got a serious problem. If the end user occasionally has to wait longer than that, not so much.

So - a timeout on the application of 60 seconds, assuming that timeout is applied on a webserver and not a client device, is probably too high for day to day operations (while being too low for batch processes).
Re: A question about how to code a procedure to use dbms_parallel_execute package for a huge insert [message #672895 is a reply to message #672887] Wed, 31 October 2018 11:15 Go to previous message
orausern
Messages: 826
Registered: December 2005
Senior Member
To update:

I spent the day today working out the approaches and found the following so far:

1. DBMS_PARALLEL_EXECUTE using row id approach , for some reason was excessively bad. It was taking more than 15 minutes and then I killed the session. The actual query I have is some what complex with EXIST and NOT EXIST clauses etc. and something didn't go right. I couldn't troubleshoot further.

2. Other idea was to insert into a GTT and thereafter do a insert in the main table with select in parallel from GTT - that worked!! the timing came to 44 seconds. This was what BlackSwan suggested and that is working out for me. Thanks a lot to you both. select in parallel from the 3 tables did not work and for some reason the plan was not a parallel plan but select from GTT was a parallel plan. So this approach -of going via GTT looks good to me.

I wish I troubleshoot further and make this work with dbms_parallel_execute but I don't know. I already have a working solution now but I will explore further. Thanks a lot to you both cookiemonster and BlackSwan for your suggestions.

[Updated on: Wed, 31 October 2018 11:19]

Report message to a moderator

Previous Topic: scripts of Oracle SQL Performance Tuning and Optimization
Next Topic: Optimal Chunk size for update of 50M record table
Goto Forum:
  


Current Time: Thu Mar 28 06:07:55 CDT 2024