How many ParseUnits

How many ParseUnits are going to be active at the same time?

As a potential speed enhancement, we could create a number of nodes up front in the ProParse singleton, store their references in a static temp-table, and hand them out to each parseunit as required.

This would mean ProParse itself would take several seconds to load, but that time would no longer be spent in the parseunit itself.
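
To make the idea concrete, here is a rough sketch of pre-creating nodes once and handing them out on request. It is written in Python purely for illustration (the real code would be ABL), and `Node`, `NodeStore`, `acquire` and `release` are made-up names, not part of ProParse.

```python
class Node:
    """Hypothetical stand-in for a node object."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.owner = None  # name of the parse unit currently holding this node


class NodeStore:
    """Hypothetical store that pays the node creation cost once, at load time."""

    def __init__(self, size):
        # All nodes are created up front; this is the slow part.
        self._free = [Node(i) for i in range(size)]

    def available(self):
        """Number of nodes not currently handed out."""
        return len(self._free)

    def acquire(self, owner, count):
        """Hand `count` pre-built nodes to the parse unit named `owner`."""
        if count > len(self._free):
            raise RuntimeError("not enough pre-created nodes")
        taken = [self._free.pop() for _ in range(count)]
        for node in taken:
            node.owner = owner
        return taken

    def release(self, nodes):
        """Return nodes to the store so another parse unit can use them."""
        for node in nodes:
            node.owner = None
        self._free.extend(nodes)


store = NodeStore(10)
unit_a_nodes = store.acquire("unit_a", 4)
remaining_after_acquire = store.available()
store.release(unit_a_nodes)
remaining_after_release = store.available()
```

If several parse units can be active at once, the store has to track which nodes are free, which is why the answer to the question above matters.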

I'll try to get some numbers.



Keeping the Nodes

Parseunit now uses "pooling" and "node initialization". What does this mean?

When a parseunit is initialized, it allocates 15000 nodes and temp-table records. When LoadFiles() is called, all nodes up to and including numnodes are "reset". When GetNode(n) is called for a node for the first time, that node's properties are set to the appropriate values.

This means that the cost of creating the temp-table records and nodes is only paid once, so if the parseunit is reused for another set of files, it is a lot quicker.
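
The pattern described above can be sketched roughly as follows. This is Python for illustration only, not the actual ABL code: `PooledParseUnit`, `PooledNode`, `load_files` and `get_node` are hypothetical names loosely modelled on LoadFiles() and GetNode(n), the raw records are a made-up stand-in for the parser output, and indexes are zero-based here.

```python
class PooledNode:
    """Hypothetical pooled node; its properties are filled in lazily."""

    def __init__(self):
        self.initialized = False
        self.node_type = None
        self.text = None


class PooledParseUnit:
    """Sketch of a parse unit that allocates its nodes once and reuses them."""

    def __init__(self, pool_size=15000):
        # Allocation happens only here, when the parse unit is created.
        self._nodes = [PooledNode() for _ in range(pool_size)]
        self._raw = []
        self.num_nodes = 0

    def load_files(self, raw_records):
        """Load new data; `raw_records` is a list of (node_type, text) tuples."""
        if len(raw_records) > len(self._nodes):
            raise ValueError("more nodes needed than the pool holds")
        self._raw = list(raw_records)
        self.num_nodes = len(self._raw)
        # "Reset" every node that this load can reach; nothing is destroyed.
        for node in self._nodes[:self.num_nodes]:
            node.initialized = False

    def get_node(self, n):
        """Return node n, setting its properties on first access only."""
        if not 0 <= n < self.num_nodes:
            raise IndexError(n)
        node = self._nodes[n]
        if not node.initialized:
            node.node_type, node.text = self._raw[n]
            node.initialized = True
        return node


unit = PooledParseUnit(pool_size=8)
unit.load_files([("A", "x"), ("B", "y")])
first = unit.get_node(0)
first_text = first.text
unit.load_files([("C", "z")])  # second set of files reuses the same objects
second = unit.get_node(0)
```

The second load creates no new objects, which is where the saving on reuse would come from.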

Some timings:

Pre-pool:
   initial load: 2100ms
   initial read: 1065ms
Pooling:
   initial load: 118ms
   initial read: 800ms
   subsequent read: 260ms


Re: How many ParseUnits

Typically only one parse unit is processed at a time, but using an object pool sounds awkward to me. I think that when we need it, we'll be able to find simpler ways to make things faster.


I've done some initial work.

I've done some initial work. Using a pool speeds things up by about 50% (1021ms instead of 2030ms). I've also added dynamic resizing: if the parseunit needs more nodes than are available, the extra ones get created.
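
A minimal sketch of the resizing idea, again in Python for illustration and with hypothetical names (`GrowableNodePool`, `ensure_capacity`); it is not the actual implementation.

```python
class GrowableNodePool:
    """Sketch of a pool that creates extra nodes only when it runs short."""

    def __init__(self, initial_size):
        # Plain objects stand in for real nodes here.
        self._nodes = [object() for _ in range(initial_size)]
        self.created = initial_size  # total nodes created so far

    def ensure_capacity(self, needed):
        """Grow the pool to at least `needed` nodes; never shrink it."""
        missing = needed - len(self._nodes)
        if missing > 0:
            self._nodes.extend(object() for _ in range(missing))
            self.created += missing
        return len(self._nodes)


pool = GrowableNodePool(4)
size_after_small_request = pool.ensure_capacity(3)   # fits, nothing created
size_after_large_request = pool.ensure_capacity(10)  # six extra nodes created
```

Only a request that exceeds the current pool size pays any creation cost.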

It's not particularly pleasant code.

However, if only one parseunit is used at a time, we could apply some of the pooling techniques I've used here: rather than destroying the TTnode records and nodes, perhaps we could reuse the data and objects for the next cu. That would mean only the initial cu would be slow.