Friday, January 02, 2004
These are the notes from the optimisation work I was doing before Christmas:
Profiling dtable retrieval
Mark object counts after a search (india export) and retrieval of entry.jsp.
Count objects created by delivery of entry_dt.jsp:
287,000 instances of LinkedList$Entry
99.8% from HighlightEngine.highlightTerms
247,520 instances of LinkedList$ListItr
97% from highlightTerms.
180,018 instances of String
92% from highlightTerms.
So, highlighting could be a big hit.
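For when highlighting is reintroduced, here is a minimal sketch of a highlighter that appends into one buffer rather than building linked lists and iterators. It assumes highlighting means wrapping each literal occurrence of a single term in markers. The class name `SimpleHighlighter`, its signature and the markers are hypothetical; the internals of `HighlightEngine.highlightTerms` are not shown in these notes.

```java
// Hypothetical sketch, not the real HighlightEngine.
final class SimpleHighlighter {

    // Wraps every occurrence of `term` in `text` with `open` and `close`,
    // copying straight into a single StringBuilder (no per-match list nodes).
    static String highlight(String text, String term, String open, String close) {
        if (term.isEmpty()) {
            return text; // nothing to highlight; also avoids an endless loop
        }
        StringBuilder out = new StringBuilder(text.length() + 16);
        int from = 0;
        int hit;
        while ((hit = text.indexOf(term, from)) >= 0) {
            out.append(text, from, hit)   // text before the match
               .append(open).append(term).append(close);
            from = hit + term.length();   // continue after the match
        }
        out.append(text, from, text.length()); // remaining tail
        return out.toString();
    }
}
```

Matching here is case-sensitive and handles one term per call; whether that fits the real search terms is an open question.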
1. Time 10 retrievals with existing code.
2. Remove highlight call and time again.
10 refreshes took 1 minute 20 seconds
Now with highlighting disabled:
50 seconds (saved 37.5%).
So, disable it for now and reintroduce later on. 5 seconds per request is still too slow.
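The timings above were taken by hand. A rough sketch of how the same comparison could be scripted, with the saving worked out from the two figures, follows. `RetrievalTimer` and its methods are hypothetical names, and the task passed in stands for whatever actually fetches the page.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper; none of these names come from the real code base.
final class RetrievalTimer {

    // Runs `task` the given number of times and returns total elapsed milliseconds.
    static long timeRuns(int runs, Runnable task) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Percentage of the baseline time saved by the faster variant.
    static double percentSaved(double baselineSeconds, double improvedSeconds) {
        return (baselineSeconds - improvedSeconds) / baselineSeconds * 100.0;
    }

    public static void main(String[] args) {
        // Figures from the notes: 80 s with highlighting, 50 s without.
        System.out.println(percentSaved(80, 50)); // prints 37.5
        System.out.println(50.0 / 10);            // prints 5.0 (seconds per request)
    }
}
```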
Test in profiler again for new object counts.
Much better! Highest is 16k for char[].
48% of these allocated in TextChunker.read - is there any way we can remove this?
Not really, unless we don't store the table content in the entry.
OK, if we cut the dtable element from the stored RML then:
1. it might be much faster
2. we can regenerate it if required from the representation in the dtable SQL tables.
Let's try by hand-editing the stored entry text and see how much faster it is.
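If the hand edit pays off, the cut could be automated. A minimal sketch, assuming the stored RML is XML-like text and the table sits in a single, non-nested, non-self-closing `<dtable>` element; `DtableStripper` is a hypothetical name and the real RML structure may differ.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: removes <dtable>...</dtable> from stored entry text.
final class DtableStripper {

    // One <dtable ...>...</dtable> element, matched non-greedily across line breaks.
    private static final Pattern DTABLE =
            Pattern.compile("<dtable\\b[^>]*>.*?</dtable\\s*>", Pattern.DOTALL);

    // Returns the entry text with every dtable element cut out.
    static String strip(String rml) {
        return DTABLE.matcher(rml).replaceAll("");
    }
}
```

A regex is only safe under those assumptions; nested or self-closing dtable elements would need a real XML parser.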
Other big hits before we rewrite:
15k byte[]
12k char[]
10k String
7k HeapCharBuffer
7k HeapByteBuffer
7k GetPropertyAction
4k Object[]
Now to edit the db and try again.
Frankly, no better. Why's that?
Changes weren't saved!
Will try again.
Amended transaction attribute in ejb-jar.xml and redeployed.
OK, it was actually a bug in EntryBean, which must have been there since the start. If you shorten an entry below the chunk threshold, the chunk field is never cleared. Now fixed.
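The shape of the fix, as a minimal sketch. `ChunkedEntry`, its fields and the tiny threshold are all hypothetical, and the real EntryBean is an EJB whose storage layout is not shown here. The point is only that the short-text branch has to reset the chunk field as well.

```java
// Hypothetical stand-in for the entry's text storage, not the real EntryBean.
final class ChunkedEntry {
    static final int CHUNK_THRESHOLD = 8; // deliberately tiny, for illustration

    private String text;   // the part that fits under the threshold
    private String chunk;  // overflow beyond the threshold, null when unused

    void setText(String value) {
        if (value.length() > CHUNK_THRESHOLD) {
            text = value.substring(0, CHUNK_THRESHOLD);
            chunk = value.substring(CHUNK_THRESHOLD);
        } else {
            text = value;
            chunk = null; // without this line a stale chunk survives a shortened entry
        }
    }

    String getText() {
        return chunk == null ? text : text + chunk;
    }
}
```

With the reset in place, setting a long value and then a short one returns only the short one.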
Back to the optimisation.
New figures are:
8k char[]
7k byte[]
5k HeapCharBuffer
2k StringBuffer
Better, but still high!
10 retrievals now take 24 seconds (2.4 seconds per request). Much better.
Putting the highlight code back in for pre-Christmas-break check-in.