mosaic
I have attached an XML file that imports successfully into 4.1.3.6 but cannot be imported into 4.5.0.3. I am wondering why.

This issue has been keeping me from upgrading to 4.5.x.





Brain Matters!
-------------------------
Profile 1 - PB:4.5.0.8
OS: Vista Ultimate SP1
Java: 1.6.0_07
Profile 2 - PB:4.5.0.6
OS: WinXP running inside VMware
Java: 1.6.0_06
Quote
Harlan
Hello, thanks for posting. I took a quick look but I'm not sure what's going wrong. I've forwarded the XML to the engineer who is responsible for this part of the code. Hopefully we'll have a resolution shortly.
Regards,
-Harlan
Quote
mosaic
Thanks for the quick fix since 4.5.0.4.

However, a slightly larger XML file could not be pasted into 4.5.0.5, although it could be imported into 4.1.3.6. Is there a limit on the size? As a check, I pasted the whole document into Notepad++ without trouble, which suggests this is not caused by a memory limit. PB 4.5.0.5, however, complained that there was no enclosing </BrainData>, even though the XML document obviously has one.

I am attaching the zipped XML file as a test case (2.9MB when decompressed). It contains Genesis (Book 1 of the Bible), complete with the content of each verse and, as an attachment, a web link to biblegateway.com.
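
For anyone who hits the same "no enclosing </BrainData>" complaint, a quick independent check of the file can help separate a malformed export from an import problem. This is only a minimal sketch using Python's standard library; the function name `check_brain_xml` is made up for illustration, and it tests well-formedness and the root element name only. It does not validate against the BrainData DTD and says nothing about what PB's own parser does.

```python
import xml.etree.ElementTree as ET


def check_brain_xml(path):
    """Return (ok, message) for a hypothetical pre-import sanity check.

    The file is streamed, so a multi-megabyte export is not held in
    memory as one big tree.
    """
    root_tag = None
    try:
        for event, elem in ET.iterparse(path, events=("start", "end")):
            if event == "start" and root_tag is None:
                root_tag = elem.tag  # the first start event is the root element
            elif event == "end":
                elem.clear()  # drop children of elements that are fully parsed
    except ET.ParseError as exc:
        # Raised for unclosed tags, truncated files, bad characters, etc.
        return False, f"not well-formed: {exc}"
    if root_tag != "BrainData":
        return False, f"unexpected root element: {root_tag!r}"
    return True, "well-formed, root is BrainData"
```

If this reports the file as well-formed while PB still complains about the closing tag, that points towards the paste/import path rather than the generator.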

Could we also add an entry for XML files under the Import menu?

Thanks.

Quote
Harlan
Great. Thanks for uploading the XML. We'll test this one out and see what's going on. Yes, we will add a File > Import > XML command... BTW, did you write the code to generate that XML?
Regards,
-Harlan
Quote
mosaic
Thanks for looking into this. Yes, I did write the code to generate the XML file.


Quote
mosaic
I am running some test cases on copy/paste.

Test Case 1 (GUI Problem)

Create a new brain called Testing, copy the thought into Notepad, change the thought name to New (not the source), and paste it into the brain. Do this a second time and you will get a dialog saying the thought already exists. This is correct behavior; however, the dialog is hidden behind the Pasting Thoughts window. I tried twice with the same result. See the attached screenshot to confirm.

Now bring the dialog to the foreground and click Cancel (you don't want to paste into the thought again). You would expect the operation to abort gracefully; instead, I was presented with an error (see screenshot).

When I clicked OK on this, I was presented with another message in which, on the second line, Pasting was spelled as Parsing (see screenshot).

[Attached screenshots: brain_4.5.0.5_testing_copy_paste_ui.JPG, brain_4.5.0.5_testing_copy_paste_ui_2.JPG, brain_4.5.0.5_testing_copy_paste_ui_3.JPG]
Quote
Harlan
Thanks for the screenshots. "Parsing" is actually not a typo, but it is probably not the best choice of words either. We'll get these sorted out shortly.
Regards,
-Harlan
Quote
Harlan
Fixed in 4.5.0.6, which is up now. Thanks.
Regards,
-Harlan
Quote
mosaic
Harlan,

Thanks for the new version PB4.5.0.8. I have been using the following to import XML files now:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE BrainData SYSTEM "http://www.thebrain.com/dtd/BrainData1.dtd">
<BrainData>
    <BrainSourceLocation>xml_file_name_here</BrainSourceLocation>
</BrainData>
On one occasion, I noticed that the copying process was indexing the imported file as well. I got excited and thought this was a new feature because, as you can imagine, waiting until all the files (83 of them) are imported and then indexing is a nightmare: first, it takes forever; second, the indexing process does not work. However, I could not make the indexing happen again on later tries, though I am quite sure it happened once. Could you please let me know whether there is a trigger that makes the copying process index the thoughts imported from XML? If not, could we add one? It is probably better to index incrementally than to index everything at the end.

Secondly, could we expand the above structure to import more than one file at the same time, such as the following:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE BrainData SYSTEM "http://www.thebrain.com/dtd/BrainData1.dtd">
<BrainData>
    <BrainSourceLocation>xml_file1_name_here</BrainSourceLocation>
    <BrainSourceLocation>xml_file2_name_here</BrainSourceLocation>
    <BrainSourceLocation>xml_file3_name_here</BrainSourceLocation>
...
</BrainData>

Currently, if I do the above, it imports only the first file and ignores the rest.
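
Until several BrainSourceLocation entries are honoured, one workaround is to generate a separate wrapper document per source file. The following is a rough sketch of that idea, not anything PB provides: the function names and the `wrapper_NNN.xml` naming scheme are invented for illustration, and the exact form PB expects inside BrainSourceLocation (bare file name or full path) is an assumption, so the value is written through unchanged apart from XML escaping.

```python
from pathlib import Path
from xml.sax.saxutils import escape

# Same structure as the single-file wrapper shown above.
WRAPPER_TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE BrainData SYSTEM "http://www.thebrain.com/dtd/BrainData1.dtd">\n'
    "<BrainData>\n"
    "    <BrainSourceLocation>{location}</BrainSourceLocation>\n"
    "</BrainData>\n"
)


def build_wrapper(source_location):
    """Return wrapper XML text that points at exactly one source file."""
    # escape() protects characters such as '&' that may occur in file names.
    return WRAPPER_TEMPLATE.format(location=escape(str(source_location)))


def write_wrappers(source_files, out_dir):
    """Write one wrapper per source file and return the wrapper paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, source in enumerate(source_files, start=1):
        target = out_dir / f"wrapper_{index:03d}.xml"  # hypothetical naming scheme
        target.write_text(build_wrapper(source), encoding="utf-8")
        written.append(target)
    return written
```

Each generated wrapper can then be copied and pasted on its own, which is effectively what importing the files one at a time already does.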


Quote
Harlan
Indexing should happen automatically whenever you paste. If it is not happening, this is a bug. It should probably provide user feedback to show the indexing process; we'll add that in a future release.
Regards,
-Harlan
Quote
mosaic
Harlan,

I think the mechanism is there, so I guess it is definitely a bug. I wrote a GUI script to automate the import process (by watching the status screen appear and disappear), since I have 83 files to import.

There are multiple things I have experienced:

1. Copying is not doing the indexing. On one occasion the status screen showed that it was indexing, but on all later tries it did not. I know the indexing was not done because the newly copied thoughts are not picked up by search.

2. Re-indexing does not fix the issue. Following your suggestion to give PB more memory, I assigned 384m to the PB virtual machine. On my quad-core machine it was able to index the brain (~42,000 thoughts in total); however, when searching, many of the thoughts did not show up. Upon further inspection, only activated thoughts were picked up by search. So I guess re-indexing also failed, even with more memory for the PB VM and a faster CPU. This is another bug.

3. Huge memory consumption and a hang. I left PB on overnight and created another GUI script to walk the brain randomly (not through Wander, as that does not activate thoughts). In the morning, when I checked, the machine had crashed. All the script does is use the arrow keys (up, down, left, right) and Enter to activate thoughts. I was hoping that after running overnight it would have picked up most of the thoughts; instead, it crashed my machine. When I re-ran the script this morning, I noticed after one hour that PB was using ~500MB of RAM, so I stopped the script. After these two runs, 7,130 (out of 42,000) thoughts are now activated.

4. Since the work I am doing involves copyrighted materials, I cannot upload them here for public viewing. However, I am willing to share them with your development team privately if that would help. I can send you the 83 XML files packed in one rar file (around 5MB), which uncompresses to around 200MB. After importing into the brain, the brainzipped version is around 33MB, so the XML version is definitely the better one to send. It will also help you stress-test the robustness of XML importing and iron out any bugs you might have.

If you are interested, I can upload the rar archive here, protected with a password, and PM you the password. Let me know whether this works for you.
Quote
mosaic
Upon further trials, I was able to find more clues to the problem:

1. The visual status for updating the index appears only once in a blue moon; however, on one occasion I was able to take a screenshot of it, which is attached as further proof. As I said, I have no idea what triggers it. It would be great if it showed this status consistently.

2. If I import only one XML file and wait an indefinite amount of time (even with no status showing that indexing is under way), the index does eventually catch up. But at that rate, how long would I need to wait to import all 83 files?

3. When I import all 83 files consecutively, moving on as soon as the status screen disappears (with no visual cue that the index is updating, and therefore without waiting for indexing to finish), the indexing never completes. Even re-indexing at a later stage does not succeed.

So the best solution is to consistently provide the visual cue for index updating, so that my script can wait the right amount of time between XML imports.

For now, I am going to change my script to wait a long time after importing each file, to see whether indexing makes any progress.

A large brain without indexing is a nightmare to use.
Such a brain might as well not exist.

[Attached screenshot: brain_pasting_thoughts.JPG]
Quote
mbaas
mosaic, I found it encouraging to read that you are getting some imports done, even with errors; at least something worked for you. (I tried a large XML file for the entire Bible in several translations, and that import did nothing at all.) So now I tried to generate a smaller file (4bks) and that has been imported fine, but the indexing was horror - consumed CPU for >2hrs, until I had to kill the .exe (needed my CPU for other work). I want to validate my file first and will then post it, as that seems to be far too long.

One thing I observed during the import was that it seems the plex was continuously redrawn - I have pasted the thought below the home thought, which has 2 jumps, and these jumps were always redrawn. I imagine this must also consume a bit of CPU, so maybe we could gain a bit of performance on the import by disabling that?

So, the indexing definitely happened, and the progress bar also seemed to be continuously updated (but far too slowly overall). Maybe that's because in my structure I have one thought per Bible verse, so the chunks for indexing are smaller...


Quote
mosaic
mbaas,

Thanks for your message.

You wrote, "So now I tried to generate a smaller file (4bks) and that has been imported fine, but the indexing was horror - consumed CPU for >2hrs, until I had to kill the .exe (needed my CPU for other work)."

When you say the indexing was horror, could you please clarify whether you were indexing while importing individual files, or re-indexing after all the small files had been imported?

Also, if you have not already tried, please give your PB virtual machine more memory by using the following tip (originally given to Br. K by Harlan):
-------------------------------------------------------------------------------

To increase the available memory (when operations such as importing a large PB3 file are failing)

1. Close PB4

2. Create a text file and put the following content into it:

-Xmx196m

Make sure there is a linefeed (Enter) at the end of the text. Save this file and name it PersonalBrain.vmoptions in the PB program folder (the same location where the output.log file appears).

3. Restart PB4.

---------------------------------------------------------------------------------
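
For those who prefer to script step 2 of the tip, here is a small sketch. The file name and the `-Xmx` line come from the tip above; the function name `write_vmoptions` is made up, and the PB program folder is not spelled out anywhere in this thread, so it has to be passed in by the caller.

```python
from pathlib import Path


def write_vmoptions(program_dir, max_heap_mb=196):
    """Write PersonalBrain.vmoptions containing a single -Xmx line.

    program_dir is the PB program folder (where output.log appears); its
    actual location is an assumption the caller has to supply.
    """
    if max_heap_mb <= 0:
        raise ValueError("max_heap_mb must be positive")
    target = Path(program_dir) / "PersonalBrain.vmoptions"
    # The tip asks for a linefeed at the end of the text, hence the "\n".
    target.write_text(f"-Xmx{max_heap_mb}m\n", encoding="ascii")
    return target
```

Calling it with `max_heap_mb=384` would produce the `-Xmx384m` setting mentioned below. PB still needs to be closed before the file is written and restarted afterwards, as in steps 1 and 3.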

Let me know whether that makes a difference for you or not. (In my case, I tried -Xmx384m, since I have lots of memory to spare)

In my case, while importing individual files, the indexing was not done consistently. Because there is no consistent visual cue for when indexing has finished, my GUI script does not know how long to wait. This is one part of the problem. The second part is that, after all files have been imported, I can run the re-indexing, but as I mentioned, it did nothing at all: none of the missing index entries were picked up by the re-indexing process.

BTW, our NRSV version has three layers of content duplication:

We have chapter-layer thoughts, which contain the content of each chapter; pericope-layer thoughts, which contain the content of each pericope; and verse-layer thoughts, which contain the verse content.

You also wrote, "One thing I observed during the import was that it seems the plex was continuously redrawn - I have pasted the thought below the home-thought which has 2 jumps, and these jumps were always redrawn."

I haven't noticed this behaviour. One thing I did notice: I wrote a GUI script to randomly walk the Bible (different from Wander, as Wander does not activate the thoughts it walks through). This script is very simple; all it does is use the arrow keys (up, down, left, right) and Enter to walk through the thoughts. On one occasion it actually crashed my computer, and on a second occasion memory consumption rose to around 500MB (after one hour of random walking). This is a big problem, but probably difficult to diagnose.

If you need any help with testing your import, I can help, as my GUI script is pretty good at this. It runs through all my XML files one at a time, copies and pastes each into PB, waits for the status screen (copying thoughts, copying links, updating thoughts...) to disappear, and then continues with the next. The missing link here is that the status screen does not consistently show updating indexes as a last step. So the indexing part will have to wait until they have fixed the problems.
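
To make the sequencing concrete, the loop described above could be sketched roughly as follows. This is not the actual GUI script: `paste_file` and `status_visible` are hypothetical callables that whatever GUI-automation tool is in use would have to provide, and the timeout values are placeholders. The fixed `settle_seconds` pause stands in for the missing "updating indexes" cue.

```python
import time


def wait_until(predicate, timeout, poll_interval=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate until it returns True or timeout seconds elapse."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval)


def import_all(files, paste_file, status_visible, settle_seconds=0.0,
               appear_timeout=30.0, finish_timeout=3600.0, sleep=time.sleep):
    """Paste each file, then wait for the status window to come and go.

    paste_file and status_visible are hypothetical hooks into the GUI
    automation layer; this sketch only sequences them. Returns the files
    whose status window did not close within finish_timeout.
    """
    timed_out = []
    for path in files:
        paste_file(path)
        # The status window may take a moment to show up. Never seeing it is
        # tolerated, because a small import could finish before the first poll.
        wait_until(status_visible, appear_timeout, sleep=sleep)
        finished = wait_until(lambda: not status_visible(), finish_timeout,
                              sleep=sleep)
        if not finished:
            timed_out.append(path)
            continue
        # Extra fixed pause, since the indexing step is not reliably shown.
        sleep(settle_seconds)
    return timed_out
```

A fixed pause is a blunt instrument: too short and indexing is cut off, too long and 83 files take all day, which is why a dependable on-screen indexing status would be the better signal.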

Quote
mbaas
mosaic, first of all, one clarification about my approach: I have lots of Bible texts stored in data files, and I use these to build the XML as I need it. No matter how many books I export to XML, I always create just a single file. So there was no question of indexing while importing individual files or afterwards...

Thanks for the reminder about the VM - I have set this to 300MB now. Hopefully I can do another run this evening; I am very busy with other things right now...

Strange that you didn't notice the screen redrawing; maybe I was able to see it because my CPU was at 100% and I had more running than just PB at that time, so the whole thing was very slow. (Or maybe this is OS-related...)

Thanks for offering help with the import, but as I use only a single file, that is not a problem for me - it's an 'all or nothing' approach, and at the moment I'm getting nothing, I'm afraid. Of course I could modify my XML generator, but by the time I have done that, Harlan will probably also have fixed the import problems. So I'd rather wait for him.
