Home » SQL & PL/SQL » SQL & PL/SQL » Select columns based on first record (Header from flat file) (Oracle 11g, Windows 2012 R2)
Select columns based on first record (Header from flat file) [message #655609] |
Tue, 06 September 2016 00:08 |
|
rahul1982
Messages: 53 Registered: November 2011 Location: Pune
|
Member |
|
|
I have an table, where the column sequence is not fix. I want to select only specific column by referring the first record which is a header from flat file.
Below are the sample scripts, I want to select only two columns where first record is name & city.
create table cust
(col1 varchar(10),col2 varchar(10), col3 varchar(10),col4 varchar(10), col5 varchar(10));
insert into cust
values('custno' ,'name' , 'address' ,'city' , 'zip' );
insert into cust
values('1' ,'abc' , '123 street' ,'Houston' , '77004' );
insert into cust
values('2' ,'qwe' , 'test ave' ,'Pune' , '41125' );
insert into cust
values('3' ,'xyz' , '10 floor' ,'New York' , '44547' );
insert into cust
values('4' ,'pqr' , '15 floor' ,'New York' , '44557' );
insert into cust
values('5' ,'hhh' , 'Test' ,'Mumbai' , '77455' );
Thanks!
|
|
|
|
Re: Select columns based on first record (Header from flat file) [message #655612 is a reply to message #655609] |
Tue, 06 September 2016 02:36 |
John Watson
Messages: 8932 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
Further to thisQuote:I hope this is a homework, you need to more precise if you want to get good marks. This,
Quote: I want to select only specific column by referring the first record is not the way to describe the problem. A relational engineer would say Quote:I want to PROJECT only specific column by referring the first ROW
|
|
|
Re: Select columns based on first record (Header from flat file) [message #655614 is a reply to message #655609] |
Tue, 06 September 2016 03:24 |
flyboy
Messages: 1903 Registered: November 2006
|
Senior Member |
|
|
As far as rahul talks about flat file, I wonder why he wants to use Oracle database for its processing and how the posted table is related to its actual content.
This task looks like a part of some (home-made?) ETL tool, so I would say that any text processing utility (sed/awk/perl or some Windows ones if he uses that system) would serve this purpose - extract text from the position based on the first file row content - better.
|
|
|
|
|
|
|
|
|
|
|
Re: Select columns based on first record (Header from flat file) [message #655657 is a reply to message #655655] |
Wed, 07 September 2016 02:04 |
|
Barbara Boehmer
Messages: 9095 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
Michel Cadot wrote on Tue, 06 September 2016 23:57
I don't understand your last sentences.
A table has a fixed number of columns and if the column names don't match the value positions you can't do nothing.
The OP said that the "column sequence is not fix", which I understand to mean that the order of the columns may change with each data set that they receive. So, the columns containing the name and city data and the position of the words name and city in the header row could be in one position in one data set and in another position in another data set. The next time that they receive another data set, the columns could be in different order and position. I have had to deal with something similar in the past, where the source that we obtained the data from kept changing their format from month to month, causing the code to have to be rewritten to load it, unless you write something generic like this. In that situation, sometimes they also added columns or changed column names, as well as shuffling the order, so I allowed for that as well. Is that clearer?
|
|
|
Re: Select columns based on first record (Header from flat file) [message #655658 is a reply to message #655655] |
Wed, 07 September 2016 02:10 |
|
Michel Cadot
Messages: 68653 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
SQL> with
2 names_row as (select * from cust where not regexp_like(col1,'^\d+$')),
3 names as (
4 select position,
5 decode(position, 1,col1, 2,col2, 3,col3, 4,col4) column_name
6 from names_row,
7 (select level position from dual connect by level <= 4)
8 )
9 select decode(c1.position, 1,col1, 2,col2, 3,col3, 4,col4) name,
10 decode(c2.position, 1,col1, 2,col2, 3,col3, 4,col4) city
11 from cust,
12 (select position from names where column_name='name') c1,
13 (select position from names where column_name='city') c2
14 where regexp_like(col1,'^\d+$')
15 /
NAME CITY
---------- ----------
abc Houston
qwe Pune
xyz New York
pqr New York
hhh Mumbai
5 rows selected.
|
|
|
|
Re: Select columns based on first record (Header from flat file) [message #655660 is a reply to message #655657] |
Wed, 07 September 2016 02:16 |
|
Michel Cadot
Messages: 68653 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
Barbara Boehmer wrote on Wed, 07 September 2016 09:04The OP said that the "column sequence is not fix", which I understand to mean that the order of the columns may change with each data set that they receive. So, the columns containing the name and city data and the position of the words name and city in the header row could be in one position in one data set and in another position in another data set. The next time that they receive another data set, the columns could be in different order and position. I have had to deal with something similar in the past, where the source that we obtained the data from kept changing their format from month to month, causing the code to have to be rewritten to load it, unless you write something generic like this. In that situation, sometimes they also added columns or changed column names, as well as shuffling the order, so I allowed for that as well. Is that clearer?
Not sure but the query I gave does not care about the order of the columns but, of course, the number of cust table columns is fixed.
If it can change just change the query to match with it as you'd do it for any application code when you change the definition of a table.
I think the problem is like loading a table with SQL*Loader, if you change the table definition you have to change the control file.
|
|
|
Re: Select columns based on first record (Header from flat file) [message #655661 is a reply to message #655659] |
Wed, 07 September 2016 02:19 |
|
Michel Cadot
Messages: 68653 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
Quote:I will try to further clarify with an example. Suppose that the next data set that is loaded from a flat file has the columns in the following order, so that the name data is now in col1 and the city data is now in col3.
My query handles this case (as long as the number of columns does not change or the selected columns are still in the initial number of columns).
[Updated on: Wed, 07 September 2016 02:21] Report message to a moderator
|
|
|
Re: Select columns based on first record (Header from flat file) [message #655662 is a reply to message #655660] |
Wed, 07 September 2016 02:22 |
|
Barbara Boehmer
Messages: 9095 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
Michel Cadot wrote on Wed, 07 September 2016 00:16
... the query I gave does not care about the order of the columns...
Apparently, your query does care about the order of the columns, because when the order of the columns is changed, as shown below, your query returns no rows.
SCOTT@orcl_12.1.0.2.0> select * from cust
2 /
COL1 COL2 COL3 COL4 COL5
---------- ---------- ---------- ---------- ----------
name address city zip custno
abc 123 street Houston 77004 1
qwe test ave Pune 41125 2
xyz 10 floor New York 44547 3
pqr 15 floor New York 44557 4
hhh Test Mumbai 77455 5
6 rows selected.
SCOTT@orcl_12.1.0.2.0> with
2 names_row as (select * from cust where not regexp_like(col1,'^\d+$')),
3 names as (
4 select position,
5 decode(position, 1,col1, 2,col2, 3,col3, 4,col4) column_name
6 from names_row,
7 (select level position from dual connect by level <= 4)
8 )
9 select decode(c1.position, 1,col1, 2,col2, 3,col3, 4,col4) name,
10 decode(c2.position, 1,col1, 2,col2, 3,col3, 4,col4) city
11 from cust,
12 (select position from names where column_name='name') c1,
13 (select position from names where column_name='city') c2
14 where regexp_like(col1,'^\d+$')
15 /
no rows selected
|
|
|
Re: Select columns based on first record (Header from flat file) [message #655664 is a reply to message #655662] |
Wed, 07 September 2016 02:28 |
|
Michel Cadot
Messages: 68653 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
It does not work in this case because of the prerequisite I mentioned in my first post:
Quote:"First" is meaningless in a relational table as rows are like balls in a basket.
In this very specific case, you can determine the "column names" row as this is the lone one which does not contain a numeric value in "col1".
You have to define a condition that is met by the row containing the column names and can't be fit by any other row.
Your query does not work either if someone has 'name' for city for instance.
[Updated on: Wed, 07 September 2016 02:28] Report message to a moderator
|
|
|
Re: Select columns based on first record (Header from flat file) [message #655668 is a reply to message #655664] |
Wed, 07 September 2016 02:42 |
|
Barbara Boehmer
Messages: 9095 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
Michel Cadot wrote on Wed, 07 September 2016 00:28It does not work in this case because of the prerequisite I mentioned in my first post:
Quote:"First" is meaningless in a relational table as rows are like balls in a basket.
In this very specific case, you can determine the "column names" row as this is the lone one which does not contain a numeric value in "col1".
You have to define a condition that is met by the row containing the column names and can't be fit by any other row.
Your query does not work either if someone has 'name' for city for instance.
So, your query does not meet the OP's requirements.
My query will fail if there is a value of "name" or "city" in some row other than the header row. Since this appears to be name and address data, I think that is unlikely. Although there might be something like "City of New York" or "New York City", I doubt there would be just "city" or just "name".
If, when loading the flat file, using either SQL*Loader or an external table, a sequence could be added, then that could be used to determine the header row, which would make things easier.
[Updated on: Wed, 07 September 2016 02:43] Report message to a moderator
|
|
|
Re: Select columns based on first record (Header from flat file) [message #655670 is a reply to message #655668] |
Wed, 07 September 2016 03:25 |
|
Michel Cadot
Messages: 68653 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
Quote:So, your query does not meet the OP's requirements.
If everything is free there is no solution.
Note: if it is sure there is a numeric ID in a field, you can check all columns and not just the first one as in my query.
Quote:My query will fail if there is a value of "name" or "city" in some row other than the header row.
So, your query does not meet the OP's requirements.
Quote:If, when loading the flat file, using either SQL*Loader or an external table, a sequence could be added, then that could be used to determine the header row, which would make things easier.
Sure, there is an order in a file there is not in a table.
The prime condition for the existence of a solution to OP's problem is:
Quote:You have to define a condition that is met by the row containing the column names and can't be fit by any other row.
If this requirement is not satisfied, there are no solutions.
|
|
|
Goto Forum:
Current Time: Sun May 19 10:31:41 CDT 2024
|