TheBrain Technologies Message Board

Thread: search pdf attachment incorrect result (PersonalBrain Issues)
 
 
Reality
Registered: Feb 03, 2010
Posts: 7

Feb 03, 2010 at 05:06 PM (#1)

I am searching, for example, for "ebcdi", and PB comes up with 2 different PDF files. But I am sure that "ebcdi" is mentioned in at least 3 more documents.

Rebuilding the search index made no improvement.

In output.log I found:
BrokenFiles:
  152 total:
.....

Windows Explorer finds about 10 .txt files containing "ebcdi" in the mybrain\txt\ directory, but it is not able to search in the PDF files either.

OS: Win 7
PB: 5.5.2.2
Java: 1.6.0_17

Any ideas?

Moe
Moderator
Registered: Nov 18, 2008
Posts: 1,277

Feb 04, 2010 at 09:45 AM (#2)

Reality, thank you for posting. The BrokenFiles report contains the names of attachments that PB was not able to index. If the attachments you are sure about appear in that list, that explains why they are not being found. In such a case, rebuilding the search index will be of no use.

PB should be able to index PDF attachments without a problem. Try setting PB to run as administrator, then try attaching PDFs and searching. To set PB to run as administrator, right-click PersonalBrain in your Start menu, then select Properties > Compatibility (tab) > Run this program as an administrator.

Thanks,
Moe

Reality
Registered: Feb 03, 2010
Posts: 7

Feb 04, 2010 at 11:24 AM (#3)

Thank you for your answer.

I found out that some of the PDF files have protections set:
write protection or
comment protection

Does that have something to do with the search index?

PB is already set to administrator mode...


Reality
Registered: Feb 03, 2010
Posts: 7

Feb 04, 2010 at 11:57 AM (#4)

Me again,

I thought it would be a good idea to send you the PDF files as an example.

The first one is the original file. It has all possible protections.
The second one is the same file, but I changed the settings with a tool. (I hope they will not read this here. Please delete the attachments later.)

The second file is searchable in Windows Explorer with Adobe's IFilter. I think the first one is not.

I am testing with "ebcdi", which is mentioned in both files.

PB is not able to find either of these files.

 
Attached Files:
pdf tu-berlin-Skript-Teil2-codierung.pdf (165.58 KB, 6 views)
pdf tu-berlin-Skript-Teil2-codierung_np.pdf (215.96 KB, 4 views)

Reality
Registered: Feb 03, 2010
Posts: 7

Feb 04, 2010 at 12:48 PM (#5)

Hey,
I have got a partial solution.

My search key was wrong. "ebcdi" was not mentioned in the document; "ebcdic" was.
Windows Explorer finds "ebcdic" when I enter "ebcdi". It would be nice if PB did too.

With the right keyword, PB also finds the word in the protected document. :-)
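
The difference described above can be illustrated with a minimal sketch. It assumes (nothing in this thread confirms it) that PB matches only complete indexed words, while Windows Explorer also matches word prefixes. The function names and the sample sentence are hypothetical and are not part of either program.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())

def whole_word_match(query, text):
    """True only if the query equals a complete word in the text."""
    return query.lower() in tokenize(text)

def prefix_match(query, text):
    """True if any word in the text starts with the query."""
    q = query.lower()
    return any(token.startswith(q) for token in tokenize(text))

# Hypothetical sample sentence, not taken from the attached PDFs.
sample = "The EBCDIC code is an 8-bit character encoding."

print(whole_word_match("ebcdi", sample))   # False: "ebcdi" is not a full word
print(prefix_match("ebcdi", sample))       # True: "ebcdic" starts with "ebcdi"
print(whole_word_match("ebcdic", sample))  # True: exact word is present
```

If the assumption holds, searching for the full word "ebcdic" would be the safer way to query PB.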

I still have the problem with the broken documents. I found out that some of the documents are really broken, but most of them are searchable with the IFilter.

I was not able to find "induktion" with PB in the attached document, "klausur2.pdf".

Now I re-attached the file and... hey, I was able to find "induktion" in this document. How is that possible? Is that because PB was not set to administrator mode at the beginning?

Do I have to do that with all of the 150 broken files?


 
Attached Files:
pdf klausur2.pdf (47.22 KB, 7 views)

Reality
Registered: Feb 03, 2010
Posts: 7

Feb 04, 2010 at 01:03 PM (#6)

:-) And again,

I am sorry, the document was not searchable with PB. I merged it with another document, and then it was searchable.

I attached the merged document, test.pdf.

Try to find "induktion" again.

This document was made with iText 2.1.7.
The other one was made with Ghostscript.

Sorry for all that chaos...

 
Attached Files:
pdf test.pdf (101.06 KB, 5 views)

Reality
Registered: Feb 03, 2010
Posts: 7

Feb 19, 2010 at 07:26 AM (#7)

No answer? :-(

Moe
Moderator
Registered: Nov 18, 2008
Posts: 1,277

Feb 19, 2010 at 03:14 PM (#8)

Reality,

Thanks for posting.

Quote:

I still have the problem with the broken documents. I found out that some of the documents are really broken, but most of them are searchable with the IFilter.


IFilter may handle indexing documents differently than PB, so comparing IFilter with PB is like comparing apples and oranges.

Quote:

I was not able to find "induktion" with PB in the attached document, "klausur2.pdf".

Now I re-attached the file and... hey, I was able to find "induktion" in this document.


If this is the case, try the following:
  1. Close PB if open.
  2. Make a backup copy of your ".brain" file and your "_brain" folder then store them somewhere safe.
  3. In your "_brain" folder locate the folder "txt" then delete it.
  4. In your "Program Files\PersonalBrain" folder locate the "Brokenfiles.dat" file then delete it.
  5. Restart PB then do a re-index via File > Utilities > Rebuild Search Index.
Please let me know if this works for you.
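
For anyone who prefers to script steps 2 to 4 above, here is a minimal sketch. The function name `reset_search_index` and all of its parameters are hypothetical; the only names taken from the steps are the "txt" folder and "Brokenfiles.dat". It assumes PB is already closed (step 1), and the index still has to be rebuilt from inside PB afterwards (step 5).

```python
import shutil
from pathlib import Path

def reset_search_index(brain_file, brain_folder, install_dir, backup_dir):
    """Back up the brain, then remove the 'txt' folder and Brokenfiles.dat.

    Hypothetical helper: run it only while PB is closed. The backup
    folder must not already contain a copy of the "_brain" folder.
    """
    brain_file = Path(brain_file)
    brain_folder = Path(brain_folder)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # Step 2: back up the ".brain" file and the "_brain" folder.
    shutil.copy2(brain_file, backup_dir / brain_file.name)
    shutil.copytree(brain_folder, backup_dir / brain_folder.name)

    # Step 3: delete the "txt" folder inside the "_brain" folder.
    txt_dir = brain_folder / "txt"
    if txt_dir.is_dir():
        shutil.rmtree(txt_dir)

    # Step 4: delete Brokenfiles.dat from the PB program folder.
    broken = Path(install_dir) / "Brokenfiles.dat"
    if broken.is_file():
        broken.unlink()
```

It would be called with your own locations, for example `reset_search_index(r"C:\Brains\MyBrain.brain", r"C:\Brains\MyBrain_brain", r"C:\Program Files\PersonalBrain", r"C:\Backup\MyBrain")`; these paths are placeholders only. Deleting a file under "Program Files" may require running the script with administrator rights.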

Thanks,
Moe


Reality
Registered: Feb 03, 2010
Posts: 7

March 30, 2010 at 01:30 AM (#9)

Thanks for your answer,

It looks like it worked.

I renamed Brokenfiles.dat -> Brokenfiles.altdat
and output.log -> output.altlog

Then I rebuilt the search index...

and now output.log shows:
Broken Files
[empty]

I also did some search tests and could find all the words/documents I was looking for.

Thanks very much for your help...
Rene