hpmartin
Hello,
Is there a limit on the file size of an attachment? I attached an Excel file (Excel 2003) to a thought and found that only roughly the first 2,000 rows are indexed; cell contents below that cannot be found through advanced search (F9). I converted the file to Excel 2000 and back, but this didn't change anything.
Is this a bug?
One other thing is striking: if I search for a word in the upper part of the file, the word is found and I see its context in the advanced search box. For words in the lower part of the file I do not see this context (see attached document).

thanks for any reply.
alex


Quote
Harlan
Hi Alex,

For performance reasons, contextual previews of search results are not done beyond a certain point within a file. PB converts attachments into a text format so they can be indexed. If the text version does not include your search terms, they will not be indexed. If you are curious, you can look at the text versions of the files within the _brain/text folder.
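To check whether a term made it into the text versions at all, something like the following could be run against that folder. This is only a minimal sketch: the helper name `files_containing` is hypothetical, the folder is assumed to contain plain-text files, and their encoding is unknown, so undecodable bytes are simply ignored.

```python
from pathlib import Path


def files_containing(text_dir, term):
    """Return the names of files under text_dir whose text contains term.

    The match is a case-insensitive substring test, not a real index lookup.
    """
    needle = term.lower()
    hits = []
    for path in sorted(Path(text_dir).rglob("*")):
        if path.is_file():
            # Encoding of the converted files is an assumption; skip bad bytes.
            content = path.read_text(encoding="utf-8", errors="ignore")
            if needle in content.lower():
                hits.append(path.name)
    return hits
```

If a term shows up here but not in advanced search, the conversion step worked and the problem lies in indexing or in how results are filtered.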

Regards,
-Harlan
Quote
hpmartin
Hi Harlan,
Thanks for your reply.
I found the ASCII file with the contents of the Excel file in the folder you mentioned, and in this file I do find the word that cannot be found through advanced search. Even the Windows Explorer file search finds this word, but unfortunately PB does not, not even after rebuilding the search index manually.
When I update the Excel file in the first row, there is no problem: after saving the file, PB immediately finds the newly inserted string. But when I update the last row, no chance: only the Windows file search is able to find the text string.

Do you have any solution for this?

thanks in advance
alex

Quote
zenrain
You may try changing the minimum score setting Harlan posted in this thread.
http://forums.thebrain.com/tool/post/thebrain/vpost?id=2019667

Windows 7
J-1.6.0_22
--
OSX 10.6.3
Java SE 6
Quote
hpmartin
Yes, I've seen this thread and have already changed the minimumScore values in the XML file to the ones proposed there.
Thanks anyway for your help.
Is there anything else I could do?
alex

Quote
Harlan
Alex,

If you can send us the file, we can take a look and see what the problem is. If possible, please zip it and send it to support@thebrain.com and ask them to forward it to me. I will send it on to one of our engineers who works on the search and indexing code.

Regards,
-Harlan
Quote
hpmartin
Hello Harlan,

I sent you the file.
Only words up to row 2028 can be found through advanced search.

thanks for your help.

regards
alex

Quote
mcaton
Thanks Alex.  We received the file.  Harlan and the engineers are taking a look.

Matt
Quote
hpmartin
hi Matt,
Have you found anything so far that could explain why indexing this file doesn't work as it should?

thanks.
alex

Quote
Harlan
The upcoming release, 4.1.1.1, contains a fix that will address this problem. By default, PB indexes only the first 10,000 terms in a document. Your document had 208,000 terms. The default has been increased to 50,000, and you will be able to increase it to 250,000 to handle your case by editing the SearchEngine.xml file. This will be noted in the release notes. (We did not increase the default to 250,000 because most documents are not this large and doing so would increase memory usage.)
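The effect of a term limit can be illustrated with a small sketch. It is not PB's actual indexing code: the whitespace tokenizer and the function names are assumptions made for illustration, and `max_field_length` merely mirrors the idea of a cap on indexed terms.

```python
def indexed_terms(text, max_field_length=10_000):
    """Return the terms kept when indexing stops after max_field_length terms."""
    terms = text.lower().split()  # assumed tokenizer: split on whitespace
    return terms[:max_field_length]


def is_searchable(text, word, max_field_length=10_000):
    """True if word falls within the indexed part of the text."""
    return word.lower() in indexed_terms(text, max_field_length)


# A document of 208,000 terms with the search word at the very end.
document = " ".join(["filler"] * 207_999 + ["needle"])

print(is_searchable(document, "needle", 10_000))   # False
print(is_searchable(document, "needle", 50_000))   # False
print(is_searchable(document, "needle", 250_000))  # True
```

Anything past the cap is simply never seen by the index, which matches the symptom of words near the top of a file being found while words further down are not.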

Thank you for sending the file - sorry you did not receive an update earlier.

Regards,
-Harlan
Quote
hpmartin
Hi Harlan,

The search in the Excel file still does not work, although I raised the 'maxFieldLength' parameter in SearchEngine.xml to 250000.
What is even more problematic, however, is that searches of the contents of an internal thought's note are not successful (see attached file).
When I search for 'xuser' (which appears twice in a thought), PB does not find the corresponding thought. When I search for the word 'superdba', which also occurs twice in the same thought, the thought is found.

Do you have any solution for this?

Thanks in advance
Alex
Quote
Harlan
Alex,

Once you have modified the maxFieldLength parameter, you need to make sure the file is reindexed. To do so, move it to your desktop and delete it from your Brain, then reattach it.

For the problem you are having with the note, can you post the note's HTML source code? (see the view menu) Thanks.

Regards,
-Harlan
Quote
hpmartin
Harlan,

I deleted the Excel file and re-imported it, but the problem still exists.

The contents of the mentioned note are as follows:
<p>dbmcli -d d10 -u superdba,xxy user_put superdba PASSWORD=admin<br />
xuser -U w -u superdba,xxxxx -d D10 -n cherwsvr3d1 -S INTERNAL -t 0 -I 0<br />
dbmcli -d D10 -U w &#62; ok</p>

<p>dbmcli -d d10 -u superdba, xxz user_put control PASSWORD=machgk00<br />
<b>xuser</b> &#160;-U w -u control,xxxxxx -d D10 -n cherwsvr3d1 -S INTERNAL -t 0 -I 0<br />
dbmcli -d D10 -U c &#62; ok</p>
<p>&#160;</p>
<p>&#160;</p>

Although the word 'xuser' shows up twice, the advanced search cannot find it.

Thanks.

Alex

Quote
Harlan
We are working on a possibly related bug with search results now. I'll make sure they look into this issue at the same time. Thanks for posting the information.
Regards,
-Harlan
Quote
Harlan
The results are not appearing because of the configuration of the search engine, which is set to return only results above a certain relevancy score. This threshold can be adjusted downward by editing the SearchEngine.xml file located in the res folder.

Open this file in a text editor and change the minimumScore values (which appear 3 times in the file) to 0.00. Restart PB after making these changes and the results will appear as expected. The next release (4.1.1.2) will be set so that 0.00 is the default value.
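For anyone who prefers to script the change, a rough sketch follows. The layout of SearchEngine.xml is not shown in this thread, so the two patterns below (attribute form and element form) are assumptions, and `zero_minimum_scores` is a hypothetical helper name. Back up the file before overwriting it.

```python
import re

# Assumed layouts (not confirmed by the thread):
#   minimumScore="0.25"                      (attribute form)
#   <minimumScore>0.25</minimumScore>        (element form)
_ATTR = re.compile(r'(minimumScore\s*=\s*")[^"]*(")')
_ELEM = re.compile(r'(<minimumScore>)[^<]*(</minimumScore>)')


def zero_minimum_scores(xml_text):
    """Return (new_text, count) with every minimumScore value set to 0.00."""
    new_text, n_attr = _ATTR.subn(r'\g<1>0.00\g<2>', xml_text)
    new_text, n_elem = _ELEM.subn(r'\g<1>0.00\g<2>', new_text)
    return new_text, n_attr + n_elem


# Made-up sample; the element names are placeholders.
sample = (
    '<thoughts minimumScore="0.25"/>\n'
    '<notes minimumScore="0.3"/>\n'
    '<attachments minimumScore="0.3"/>\n'
)
updated, count = zero_minimum_scores(sample)
print(count)  # 3
```

If the reported count on the real file is not 3, the file probably uses a different layout and is better edited by hand as described above.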

Regards,
-Harlan
Quote
