Want to Search any kind of PDF and other types of documents in one single column [message #188177] Thu, 17 August 2006 06:31
Bharath Kumar ,V
Messages: 18
Registered: February 2004
Junior Member
Hi,

I want to search all kinds of PDF files and other types of documents in one single column.

The text search was working fine for all types of documents, except that it could not search certain PDF files. The document names and paths are stored in VARCHAR2 fields in the base table.

To find a solution, I followed this link:
http://www.oracle.com/technology/products/text/htdocs/altfilters.htm
I chose the XPDF approach using user_filter. Now it works only for PDF files (all kinds of PDF) and no longer searches any other types of documents (such as txt, html, xml, etc.). I want to know how to make it search all kinds of PDF as well as the other document types. From what I understand, I have to create two filters, one (user_filter) for PDF and another for the rest of the document types, and combine the two when creating the index. But I am not sure how to do this. Or is there another way?

The steps I followed for user_filter are as follows:

-- Datastore: the indexed column holds file paths, not the document text
begin
  ctx_ddl.create_preference('DFILENAME', 'FILE_DATASTORE');
end;
/

-- User filter: every document is passed through xpdf's pdftotext
begin
  ctx_ddl.create_preference('my_xpdf_filter', 'USER_FILTER');
  ctx_ddl.set_attribute('my_xpdf_filter', 'command', 'pdftotext.exe');
end;
/

create index docs_bak_idx on DOCS_BAK (DOC_PATH_FILENAME)
indextype is ctxsys.context
parameters ('datastore DFILENAME filter my_xpdf_filter');

Finally, I set up a job that syncs the index every 2 minutes.
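
The post does not show the sync job itself. As a rough sketch only, assuming it was created with DBMS_JOB and calls CTX_DDL.SYNC_INDEX on the index above, it might look something like this (the variable name l_job is made up for illustration):

```sql
-- Sketch: schedule a sync of docs_bak_idx every 2 minutes.
-- Assumes the job owner has EXECUTE on CTX_DDL.
declare
  l_job binary_integer;  -- receives the job number assigned by Oracle
begin
  dbms_job.submit(
    job       => l_job,
    what      => 'ctx_ddl.sync_index(''docs_bak_idx'');',
    next_date => sysdate,              -- first run: now
    interval  => 'sysdate + 2/1440');  -- 2/1440 of a day = 2 minutes
  commit;                              -- the job is only queued after commit
end;
/
```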

If possible, please treat this post as urgent.
Thanks

Re: Want to Search any kind of PDF and other types of documents in one single column [message #188238 is a reply to message #188177] Thu, 17 August 2006 09:03
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
I don't think there is any way that you can combine two filters in one index like that. I think the most that you can do with one index is to have a format column where you can mark the documents that are just text and don't require any filter. Alternatively, you can have two separate tables, with two separate indexes and two separate filters, and combine the search results.

I recall that you posted the same question on the OTN forums a week ago. You might try posting on http://asktom.oracle.com to ask Oracle expert Tom Kyte. If he can't suggest something, at least maybe he can confirm that it can't be done. He frequently has a large backlog of questions, but still accepts a few new ones briefly a few times per day, so you may have to check frequently to catch a time when you can post a new question. Or, if you can find a closely related thread about PDFs and filters, you might try adding your question as part of a review, which can be posted at any time.
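
To make the two options concrete, here is a minimal, untested sketch. The column DOC_FORMAT and the tables DOCS_PDF and DOCS_OTHER are hypothetical names invented for illustration, and 'oracle' is just a sample search term. TEXT, BINARY and IGNORE are the values Oracle Text accepts in a format column; whether a user_filter skips rows marked TEXT the same way the INSO filter does is an assumption here and should be checked against the Oracle Text Reference for your version.

```sql
-- Option 1: one index plus a format column (DOC_FORMAT is hypothetical).
alter table docs_bak add (doc_format varchar2(10));

update docs_bak
set    doc_format = case
                      when lower(doc_path_filename) like '%.pdf' then 'BINARY'
                      else 'TEXT'  -- txt, html, xml: marked as needing no filter
                    end;
commit;

-- The existing index has to be replaced to pick up the format column.
drop index docs_bak_idx;

create index docs_bak_idx on docs_bak (doc_path_filename)
indextype is ctxsys.context
parameters ('datastore DFILENAME filter my_xpdf_filter format column doc_format');

-- Option 2: two tables, each with its own index and filter,
-- with the search results combined in the query.
select doc_path_filename from docs_pdf
where  contains(doc_path_filename, 'oracle') > 0
union all
select doc_path_filename from docs_other
where  contains(doc_path_filename, 'oracle') > 0;
```

Option 1 only helps for documents that really are plain text or markup; other binary formats would still be sent to pdftotext.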
Re: Want to Search any kind of PDF and other types of documents in one single column [message #188321 is a reply to message #188238] Fri, 18 August 2006 00:01
Bharath Kumar ,V
Messages: 18
Registered: February 2004
Junior Member
Hi,

Thanks for the reply. Instead of another table, we have already done it with another column, but we are looking for a solution that works on the same column. I checked Tom's site, and it says there is a large backlog right now. A solution for the same column would be appreciated.

Cheers
B
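
For anyone reading along, the two-column workaround mentioned above might look roughly like the sketch below. The post does not show the actual setup, so the column names PDF_PATH_FILENAME and OTHER_PATH_FILENAME, the index names, and the use of the predefined CTXSYS.INSO_FILTER preference for the non-PDF column are all assumptions made for illustration.

```sql
-- One index per column, each with its own filter (names are hypothetical).
create index docs_bak_pdf_idx on docs_bak (pdf_path_filename)
indextype is ctxsys.context
parameters ('datastore DFILENAME filter my_xpdf_filter');

create index docs_bak_other_idx on docs_bak (other_path_filename)
indextype is ctxsys.context
parameters ('datastore DFILENAME filter ctxsys.inso_filter');

-- A single query can then search both columns; the labels 1 and 2
-- tie each CONTAINS call to its SCORE.
select d.pdf_path_filename,
       d.other_path_filename,
       greatest(score(1), score(2)) as relevance
from   docs_bak d
where  contains(d.pdf_path_filename,   'oracle', 1) > 0
   or  contains(d.other_path_filename, 'oracle', 2) > 0
order  by relevance desc;
```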