The Phoenix Rises: a How-To for SQL FROM SQL

Kevin Meade's picture
articles: 

If I had to recommend THE one most exploitable skill to have for an Oracle Developer or DBA, it would have to be HOW TO PLAY GOLF. I have observed few other talents give regular Mary and Joe, greater access to people from all levels of the corporate world. You will meet everyone from the janitor, to management elite without the fear or intimidation that we sometimes feel in these circles. For some reason, on the fairway all are equal; the only things that matter are your swing and your handicap. Alas, I never learned to play golf, so I must settle for imparting an actual job skill. And for that, my best first choice would be SQL FROM SQL.

Newbie in the business start out ignorant of any practical skill. But a year in the trenches has a way of fixing that. You go along happy, and then someone says, “We need a description of all the tables in our database… get it for us”. It is after all a menial task so they give to the Newbie. You start in at the keyboard…

DESC ACCOUNT
DESC ACCOUNT_LIMIT
…
DESC DEPT

Twenty minutes later you stop to count how far along you are… Hmm… only 252 more tables to go… boy my hands hurt. There has to be a better way you think; you wonder if carpal tunnel is still covered under your recently “restructured” medical plan. Then someone enters the room to see where the information is. “What are you doing man? That will take you all day; use SQL FROM SQL”. And you get your first introduction.

select ‘desc ‘||table_name from user_tables order by 1;

At this moment the light bulb may or may not have gone off in your head. If it did you likely have kept this little trick in your Felix bag since that day. If not, well… here is your second chance.

The thing about SQL FROM SQL though, is that we all practice it intuitively. We either don’t think too much about what we are doing when we do it, or we think way too hard. In the end this means we are really only speaking baby talk with this trick. We limit ourselves to easy stuff. You know what I mean if you have ever tried to tackle a really difficult problem with SQL FROM SQL. I have been in this dilemma more than once and there is a tendency at some point to give up because of the huge SQL FROM SQL scripts that can be produced in these situations (often several hundred lines). Off by one comma or quote or parenthesis, and you spend hours trying to hunt it down.

Repeating words from an earlier time, there must be a better way; some methodical process that will lead you to the mother of all SQL FROM SQL scripts. Well… there is; a series of reproducible, monkey capable steps that will produce arbitrarily complex SQL FROM SQL; code generators that take the form of a series of select statements which we UNION ALL together. Let us walk through the steps and then do some examples.

Prepare for Script Generation

1) Get a working piece of code of what it is you want to reproduce. We are going to turn this working example into a generic generator script. It is thus very important that you make sure the sample is free of all errors, no syntax problems, no logic flaws, and contains as varied an example as is reasonable to acquire without lots of effort.

2) Map your example to the Oracle Data Dictionary. Find names of things in the code sample and then figure out where you will get these names from out of the dictionary. Usually you are looking for things like: TABLE NAMES and COLUMN NAMES as well as data about these object and attributes. Also, you are looking for name construction rules too. You will many times need to generate trigger names or view names etc. and these will have root components taken from the dictionary. Normally this is just a translation off the table name, but you will be required to think ahead about how you are going to truncate components used to generate names, yet still ensure you don’t create name collisions. The common response to this is to use a number generator and append it to the name you are constructing. Then any truncation is of no matter as the number is unique and the rest of the name you generate is really just informational.

3) Note in your code sample, where the use of the dictionary changes. Your sample will likely start out with lines mapped to tables, then shift to lines mapped to columns then back to tables and so on. Each change represents what will become another independent select in your final generator solution. Remember, we are creating a big select statement composed of lots of little select statements with UNION ALL between then. If you are maintaining line fidelity to your code sample, then most lines will also translate into an independent select.

4) Put your mind in the right place. Don’t think, just do. If you think about the process you will mess it up. The idea is after all, to use a process that you don’t have to think about. This makes if fast and error proof, so don’t think, just do.

Start the Script Generation Process (lots of simple editing of the code sample)

1) Change every single quote in the code sample to two single quotes.

2) Put a single quote in front of each line and at the end of each line.

3) Create substitution components for real names using the following sytanx: ‘||table_name||’, and ‘||column_name||’ and so on. Noting where we mapped our code sample to the database, we take the attribute off the data dictionary that maps to a real name in our code sample, and construct a substitution component by putting a quote,doublebar in front, and a doublebar,quote at the end (don’t think about it, just do it).

4) Substitute your substitution components into your code sample.

5) Put select in the front of each line turning it into the beginnings of a select statement.

6) Add an identifying number column to each individual select with a unique number and name it SELECT_ID. Select number one can be 1; two can be 2, etc.

7) Add any additional data needed to correctly order your results. This will often involve adding COLUMN_ID to parts that generate based on column names.

8) Make sure all selects are UNION COMPATIBLE. Usually this means putting a 0 in for COLUMN_ID on those components that are at the table granularity.

9) Create the FROM clause and WHERE clause for the now select looking lines based on where the line maps to your data dictionary.

10) Insert a UNION ALL between selects and an ORDER BY on the end of it all

11) Test each individual select, run the thing, and compare corresponding results to your original code sample.

Let’s look at some examples:

A trigger generator

A common thing among DBA is to add control data to the end of each table. This may contain lots of stuff but almost always contains insert and update dates. Let’s generate code for all tables to do this. This one is easy, it only requires table names (we know the list of control columns and it is the same for every table). I have added ACTIVE_FLAG as a way of giving us something that used quotes in the code sample so I can show most of the steps in the process.

Create or replace Trigger biu_dept
before insert or update on dept
For each row
Begin
   If inserting then :new.insert_date := sysdate; end if;
   :new.update_date := sysdate;
   :new.active_flag := 'Y';
End;
/
Show errors

With the sample above and our mind in the right place to not think about it, we begin with step#1 from “Start the Script Generation Process”.

Create or replace Trigger biu_dept
before insert or update on dept
For each row
Begin
   If inserting then :new.insert_date := sysdate; end if;
   :new.update_date := sysdate;
   :new.active_flag := ''Y'';
End;
/
Show errors

'Create or replace Trigger biu_dept'
'before insert or update on dept'
'For each row'
'Begin'
'   If inserting then :new.insert_date := sysdate; end if;'
'   :new.update_date := sysdate;'
'   :new.active_flag := ''Y'';'
'End;'
'/'
'Show errors'

'||a.table_name||'

'Create or replace Trigger biu_'||a.table_name||''
'before insert or update on '||a.table_name||''
'For each row'
'Begin'
'   If inserting then :new.insert_date := sysdate; end if;'
'   :new.update_date := sysdate;'
'   :new.active_flag := ''Y'';'
'End;'
'/'
'Show errors'

select 'Create or replace Trigger biu_'||a.table_name||''
select 'before insert or update on '||a.table_name||''
select 'For each row'
select 'Begin'
select select '   If inserting then :new.insert_date := sysdate; end if;'
select '   :new.update_date := sysdate;'
select '   :new.active_flag := ''Y'';'
select 'End;'
select '/'
select 'Show errors'

select 101 select_id,'Create or replace Trigger biu_'||a.table_name||''
select 102 select_id,'before insert or update on '||a.table_name||''
select 103 select_id,'For each row'
select 104 select_id,'Begin'
select 105 select_id,'   If inserting then :new.insert_date := sysdate; end if;'
select 106 select_id,'   :new.update_date := sysdate;'
select 107 select_id,'   :new.active_flag := ''Y'';'
select 108 select_id,'End;'
select 109 select_id,'/'
select 110 select_id,'Show errors'

select 101 select_id,table_name,'Create or replace Trigger biu_'||a.table_name||''
select 102 select_id,table_name,'before insert or update on '||a.table_name||''
select 103 select_id,table_name,'For each row'
select 104 select_id,table_name,'Begin'
select 105 select_id,table_name,'   If inserting then :new.insert_date := sysdate; end if;'
select 106 select_id,table_name,'   :new.update_date := sysdate;'
select 107 select_id,table_name,'   :new.active_flag := ''Y'';'
select 108 select_id,table_name,'End;'
select 109 select_id,table_name,'/'
select 110 select_id,table_name,'Show errors'

All selects are already UNION COMPATIBLE

select 101 select_id,table_name,'Create or replace Trigger biu_'||a.table_name||''
from user_tables a
select 102 select_id,table_name,'before insert or update on '||a.table_name||''
from user_tables a
select 103 select_id,table_name,'For each row'
from user_tables a
select 104 select_id,table_name,'Begin'
from user_tables a
select 105 select_id,table_name,'   If inserting then :new.insert_date := sysdate; end if;'
from user_tables a
select 106 select_id,table_name,'   :new.update_date := sysdate;'
from user_tables a
select 107 select_id,table_name,'   :new.active_flag := ''Y'';'
from user_tables a
select 108 select_id,table_name,'End;'
from user_tables a
select 109 select_id,table_name,'/'
from user_tables a
select 110 select_id,table_name,'Show errors'
from user_tables a

select 101 select_id,table_name,'Create or replace Trigger biu_'||a.table_name||''
from user_tables a
union all
select 102 select_id,table_name,'before insert or update on '||a.table_name||''
from user_tables a
union all
select 103 select_id,table_name,'For each row'
from user_tables a
union all
select 104 select_id,table_name,'Begin'
from user_tables a
union all
select 105 select_id,table_name,'   If inserting then :new.insert_date := sysdate; end if;'
from user_tables a
union all
select 106 select_id,table_name,'   :new.update_date := sysdate;'
from user_tables a
union all
select 107 select_id,table_name,'   :new.active_flag := ''Y'';'
from user_tables a
union all
select 108 select_id,table_name,'End;'
from user_tables a
union all
select 109 select_id,table_name,'/'
from user_tables a
union all
select 110 select_id,table_name,'Show errors'
from user_tables a
order by 2,1
/

Create or replace Trigger biu_DEPT
before insert or update on DEPT
For each row
Begin
   If inserting then :new.insert_date := sysdate; end if;
   :new.update_date := sysdate;
   :new.active_flag := 'Y';
End;
/
Show errors

Got 400 tables, make 400 triggers, fast as you can.

Another example, COUNT of Rows in Tables

select count(*) rowcnt,'DEPT' table_name from dept;

becomes

select 'select count(*) rowcnt,'''||table_name||''' table_name from '||table_name||';'
from user_tables
order by table_name
/

Not sure how we got here, then do the steps yourself. This is an easy one but interesting too.

It generates this

select count(*) rowcnt,'DEPT' table_name from DEPT;

Which when run produces this

SQL> select count(*) rowcnt,'DEPT' table_name from DEPT;

    ROWCNT TABL
---------- ----
         0 DEPT

Another Example, (a COBOL record layout)

01 RDEPT.
   05 DNO          PIC 9(18) COMP-3.
   05 DNAME        PIC X(10).
   05 INSERT_DATE  PIC X(19).
   05 UPDATE_DATE  PIC X(19).
   05 ACTIVE_FLAG  PIC X(1).

'||table_name||'
'||column_name||'

set pagesize 999
set linesize 999
set trimspool on
set trimout on
set feedback off

col text format a60
col select_id noprint
col table_name noprint
col column_id noprint

select 101 select_id,table_name,0 column_id,'01 R'||table_name||'.' text
from user_tables
union all
select 101 select_id,table_name,column_id,'   05 '
           ||column_name||'          PIC '||decode(
                                                   data_type
                                                  ,'NUMBER','9(18) COMP-3.'
                                                  ,'DATE','X(19).'
                                                  ,'X('||data_length||').'
                                                  )
from user_tab_columns
where table_name in (select table_name from user_tables)
order by 2,1,3
/

01 RDEPT.
   05 DNO          PIC 9(18) COMP-3.
   05 DNAME          PIC X(10).
   05 INSERT_DATE          PIC X(19).
   05 UPDATE_DATE          PIC X(19).
   05 ACTIVE_FLAG          PIC X(1).

OK, so there are a few points to note here.

First, the biggest thing: there were 6 lines in our code sample, but there are only two selects in the final generator. This is because several of the lines belong to a common group (in this case the column definitions). This gets back to one of the early steps wherein we said to pay attention to where the transition between rowsources happens. The first line in this sample maps to table, the rest map to column. We can thus produce the last five rows from the same rowsource. So we only need one select for them. We just have to make sure we sort our results correctly and hence, the use of COLUMN_ID.

Second, we gave a column alias to the text string we were producing, and used several SQL*Plus commands to format results better. In particular we made sure not to print any columns other than the one we want to see (our text column).

Third, we were sloppy with determining the datatype specification. It is not lined up well. What is the point of generating lots of code if you aren’t going to make it look pretty? The generator will faithfully reproduce, so put more effort into it so that it will pull out those final touches that make things look professional.

Lastly, there are many options for translating oracle datatypes to COBOL datatypes so in real use this code is poor. I can get away with this because I am trying to teach, not do actual business world work. In situations like this, you might be better off creating a helper PL/SQL function that will produce a string containing whatever you need for a type specification. Then you can call this function in your SQL code. Indeed, smart use of PL/SQL helper functions can simplify your generator immensely.

Speaking of PL/SQL code, you may find that after a few rounds with these steps, you will be tacking large generator opportunities. It really is just a matter of thinking big. However, at some point your SQL generator scripts may get very large. Many hundreds of lines are not uncommon for big tasks. You might want to explore the creation of an actual PL/SQL procedure to do the generation for you instead. Most of the steps are the same but because you wrap it in procedural code, it can be much more efficient when it runs mostly due to your use of a single loop for the table which accounts most often for most lines in your sample. Here is an example of the COBOL record layout generator as a PL/SQL procedure.

create or replace
procedure cobol_layout (table_name_p in varchar2)
is
begin
   for r1 in (
              select table_name
              from user_tables
              where table_name = table_name_p
             ) loop
      dbms_output.put_line('01 R'||r1.table_name||'.');
      for r2 in (
                 select column_name,column_id,data_type,data_length
                 from user_tab_columns
                 where table_name = r1.table_name
                 order by column_id
                ) loop
         dbms_output.put_line('.   05 '||r2.column_name
              ||'         PIC '||case r2.data_type
                                    when 'NUMBER' then '9(18) COMP-3.'
                                    when 'DATE' then 'X(19).'
                                    else 'X('||r2.data_length||').'
                                 end
                             );
      end loop;
   end loop;
end;
/
show errors

set serveroutput on
exec cobol_layout('DEPT')

01 RDEPT.
.   05 DNO         PIC 9(18) COMP-3.
.   05 DNAME         PIC X(10).
.   05 INSERT_DATE         PIC X(19).
.   05 UPDATE_DATE         PIC X(19).
.   05 ACTIVE_FLAG         PIC X(1).

At this point, you have the basics in hand. My last parting words… THINK BIG. It is SQL FROM SQL. The idea is to generate lots of code to do things that are repetitive. Your goal can often be to apply sweeping changes across the database with this seeming slight of hand. The complexity of the example does not really matter. Just follow the basic process and you will get there.

I once wrote a Ten thousand line COBOL program some fourteen years ago. Nine thousand lines of it was generated code via SQL FROM SQL. Yes, they called me crazy, yes there were many who gaffed at the MONOLITHIC construct. But in the end even the most hard core admitted to the wisdom behind it; it took only 8 weeks to develop, test, and deploy; it freed an eight person maintenance crew from a data feed and reporting nightmare, and it was serviceable in spite of its size by just one lady in her spare time because it was “so consistent in its construction and easy to walk through” (her words not mine). It was fast, easily extended, and never failed in production once. They retired it last year when the system it supported was replaced by something not as good.

So go after the hard, complex, big stuff. You can generate thousands of lines of code easily. Do so.
And make sure that if you should need to alter code, do it in the code generator and rerun it. Avoid altering the result if possible. It makes it difficult to control what happens with the generated output otherwise.

Best of all, remember this: if you start with a working example free of bugs, and your generator can reproduce the original code sample correctly, then it is highly likely that all the artifacts you generate will be 100% free of defects. You may have go try a few times before you get the final version the way your want it but it is well worth it. Though eventually you will get good at it, even if it takes all day to get the generator exactly like you want it, it will pay off handsomely in the end. You can’t imagine how fast testing can go when there aren’t any bugs in the code.

Kevin

Comments

gojko's picture

great article! the urge to automate dull and error-prone tasks is one of the key traits of great programmers :)

On a related topic - I recently did a trigger code generator, that occasionally failed with the following message:

ERROR: ORA-24344: success with compilation error 
ORA-06512: at "SYS.DBMS_SYS_SQL", line 826 
ORA-06512: at "SYS.DBMS_SQL", line 39 ORA-06512: at "SYS.DBMS_UTILITY", line 549 

it turns out that CRLF (chr(13)||chr(10)) was confusing the compiler - replacing CRLF with just LF (chr(10)) solved the problem. keep that in mind if you are generating multi-line code.

Gojko Adzic
http://www.gojko.com

Kevin Meade's picture

cool, this error seems like one of those that takes hours to figure out. I am very glad you clued me to it.

Nice article. I do similar things and park the generators into stored procedures. One of them, when executed, generates the stored procedures required to update tables. Given a database, it takes literally 5 seconds to generate all the standard stored procedures for the entire database. The generated sps are parameterised based on the FKeys referencing the tables. All the metadata comes out of system tables. If there are structural table changes made - just re-run the generator to rebuild all the sps.

It took about 3 days to get the generator running properly - about the same amount of time it takes to manually do the job it does, only without any typos.
My colleages thought I was crazy too until they realised the whole process was generic and could be applied to any database. Sps? - 5 seconds later - voila!

You have a nice readable writing style. Thanks for the article.