xtim
Thursday, May 20, 2004
 
Added class="atlastile" attribute to the atlas tile inputs on the atlas page. This is so Carl can use the stylesheet to give users a crosshair cursor. Pretty!

T
 
Added navigateFromTile to the IQueryOperations API. This replaces the existing sequence of

1 navigating from the original tile to the nearest featured tile then
2 finding the entry for the feature nearest its centre (all with the AtlasDAO)

with a single call to the QueryTool. It's more efficient but less predictable - you'll get one of the features from the destination tile but not necessarily the one nearest its centre. We still achieve our goal of moving n tiles east until we hit something, and we get there quicker - have to get user feedback on whether people feel disorientated.
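
For the record, a rough sketch of the "keep moving until we hit something" idea. This is not the real IQueryOperations code: the TileNavigator class, the column/row grid and the east-west wrap-around are all assumptions made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model: a grid of tiles, some of which contain features. */
final class TileNavigator {
    private final int columns;
    private final int rows;
    private final Set<Long> featuredTiles = new HashSet<Long>();

    TileNavigator(int columns, int rows) {
        this.columns = columns;
        this.rows = rows;
    }

    // Pack a (column, row) pair into one long so it can live in a Set.
    private static long key(int col, int row) {
        return ((long) col << 32) | (row & 0xffffffffL);
    }

    void markFeatured(int col, int row) {
        featuredTiles.add(key(col, row));
    }

    /**
     * Step from (col, row) by (dCol, dRow) until a featured tile turns up.
     * Columns wrap around (assumed east-west wrap); rows do not.
     * Returns {col, row} of the first featured tile, or null if none is found.
     */
    int[] navigate(int col, int row, int dCol, int dRow) {
        int maxSteps = Math.max(columns, rows); // one full lap is enough to give up
        for (int step = 1; step <= maxSteps; step++) {
            int c = Math.floorMod(col + step * dCol, columns);
            int r = row + step * dRow;
            if (r < 0 || r >= rows) {
                return null; // walked off the top or bottom of the map
            }
            if (featuredTiles.contains(key(c, r))) {
                return new int[] {c, r};
            }
        }
        return null;
    }
}
```

Moving east is navigate(col, row, 1, 0). Which feature on the destination tile gets delivered is a separate decision, and that's exactly the bit that became less predictable.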

T
 
Added entryIdFromTileLocation to the IQueryOperations API, which tidies up and centralizes the resolution of a map click to an entry delivery. Testing in entry_at.jsp and it looks good.
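
A minimal sketch of what resolving a click to an entry might look like, assuming the rule is "pick the feature nearest the click". Only the method name entryIdFromTileLocation comes from the API; the TileClickResolver class, the Feature type and the pixel coordinates are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: resolve a click on one tile to the nearest feature's entry. */
final class TileClickResolver {
    private static final class Feature {
        final int entryId;
        final int x;
        final int y;

        Feature(int entryId, int x, int y) {
            this.entryId = entryId;
            this.x = x;
            this.y = y;
        }
    }

    private final List<Feature> features = new ArrayList<Feature>();

    void addFeature(int entryId, int x, int y) {
        features.add(new Feature(entryId, x, y));
    }

    /** Returns the entry id of the feature nearest the click, or -1 if the tile has no features. */
    int entryIdFromTileLocation(int clickX, int clickY) {
        int bestId = -1;
        long bestDist = Long.MAX_VALUE;
        for (Feature f : features) {
            long dx = f.x - clickX;
            long dy = f.y - clickY;
            long dist = dx * dx + dy * dy; // squared distance is enough for comparing
            if (dist < bestDist) {
                bestDist = dist;
                bestId = f.entryId;
            }
        }
        return bestId;
    }
}
```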

Prepared a list of titles and ISBNs for TimC.

T
 
Integrated atlas tile rendering with our standard IQueryOperations interface. This provides a neater path into the tile rendering and re-uses our existing db connections for improved efficiency. This is for the June release. I'll do the same for our other AtlasDAO functions this afternoon.

Cluster update is on its way...

T
Wednesday, May 19, 2004
 
Tidied up new entry page, adding an include file to handle atlas requests.

T
 
Index merge ran out of disk space. Deleted temp files and original index, trying again.

T
 
Updated live jar and added index. Indexes are now merging. Today's going to be over too soon to run a cluster update, will run a getimage to speed things up tomorrow.


T
 
Fixed compass-point navigation in a similar way.

T
 
sql import part 1 is complete, with 4.5 gb free. Now importing the smaller part 2.

monument_3 and indexes on their way up next.

T
 
Now then. That was a corker.

There's a bit of a gap in the html spec as to what should happen when the user clicks an image input on a form. In particular, it's not defined whether the browser should send the coordinates of the click, the name-value pair associated with the input, or both.

Mozilla sends both.

IE, as we've just discovered, only sends the former.

This meant that our atlas handling code was receiving requests for a map centred on pixel 57, 103 of that tile. You know, that one. No index number given - at which point it understandably gave up.

Now fixed. We include a hidden field which lists all tile ids on the page, and each one appends its tile id to its associated control name. The receiving page can then iterate through all possible control names, looking for the one which sent us the click. Once found, we retrieve the click coordinates and proceed as before. A little less efficient than the original algorithm, but more browser-proof. If you're using lynx, I'm afraid you're still screwed.
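
The receiving side of that fix can be sketched roughly as below. The field name "tileids", the "tile_" control prefix and the AtlasClickParser class are invented for the example; the only thing taken as given is that browsers send the click as name.x and name.y for an image input.

```java
import java.util.Map;

/** Hypothetical sketch: work out which tile image was clicked, and where. */
final class AtlasClickParser {
    /** The tile that received the click plus the click's pixel coordinates. */
    static final class Click {
        final String tileId;
        final int x;
        final int y;

        Click(String tileId, int x, int y) {
            this.tileId = tileId;
            this.x = x;
            this.y = y;
        }
    }

    /**
     * params holds the request parameters. Assumed layout: a hidden field
     * "tileids" lists every tile id on the page (comma-separated), and each
     * image input is named "tile_" + id, so a click arrives as
     * "tile_<id>.x" and "tile_<id>.y" whatever else the browser sends.
     */
    static Click resolve(Map<String, String> params) {
        String listed = params.get("tileids");
        if (listed == null) {
            return null;
        }
        for (String raw : listed.split(",")) {
            String tileId = raw.trim();
            if (tileId.isEmpty()) {
                continue;
            }
            String control = "tile_" + tileId;
            String x = params.get(control + ".x");
            String y = params.get(control + ".y");
            if (x != null && y != null) {
                try {
                    return new Click(tileId, Integer.parseInt(x), Integer.parseInt(y));
                } catch (NumberFormatException e) {
                    return null; // malformed coordinates: treat as no click
                }
            }
        }
        return null; // no listed control sent coordinates
    }
}
```

The loop costs one lookup per tile on the page, which is the "little less efficient" part, but it no longer depends on the browser sending the input's own name-value pair.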

Re-tagging as monument_3 before release.

T
 
Ah - the weekly vacuum kicked in overnight on the live site, which could explain the delay.

T
 
sql update STILL RUNNING on w1. This isn't even the complete update - there's a (shorter) part two which updates the entry headings. Free disk running at 3.6 gb.

Remaining to do for release:

complete sql update
perform sql update part 2
rsync index files
merge indexes
vacuum
update jar to monument_2
final test
cluster update

T
Tuesday, May 18, 2004
 
Indexes merged and search seems to work fine. Added globe-on-a-stick icon alongside results from the atlas.

T
 
New combining char gifs are live and have filled in the gap in my test entry. Bonza!

Index merge still not complete.

sql import still running, disk space is now down to 3.2 gb.

Log processing for the "Marketing your library" questionnaire seems OK.

T
 
Index is complete, now merging.

monument_2 passed jtest and jsitetest and is now running on didcot.

sql update still in progress.

The gym beckons.

T
 
Vacuum is complete - we've got 4.5 gb space on the server and I hope it's going to be enough. Started the import.

Fixed the dateline bug and speeded up the sql a bit. Will check in, re-tag as monument_2 and update rsw1.

Heading update completed without any errors. Entities are coming through fine (at least, we're getting the alt text for the images and the images are due up later this week).

Next: index the book, merge indexes, put monument_2 live on didcot, test...

T
 
Currently:

1. Vacuuming rsw1 before installing part 1 of sql update.
2. Updating headings on goring - no entity warnings yet. Woohoo!
3. Fixing map rendering for targets on tiles which cross dateline.
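
The log doesn't record what the dateline bug actually was. A common cause with this kind of rendering is subtracting longitudes directly, which goes wrong when a tile runs from, say, 170 east to 170 west. A sketch of the usual cure, with made-up names (DatelineMath, pixelX) and an assumed simple linear longitude-to-pixel mapping:

```java
/** Hypothetical sketch: place a target on a tile that may straddle the dateline. */
final class DatelineMath {
    /** Degrees travelled eastward from westLon to reach lon, always in [0, 360). */
    static double eastwardOffset(double westLon, double lon) {
        double d = (lon - westLon) % 360.0; // Java's % keeps the sign of the left operand
        return d < 0 ? d + 360.0 : d;
    }

    /**
     * Pixel column of lon on a tile whose west edge is westLon and which spans
     * spanDegrees eastward. Returns -1 when the target is not on the tile.
     */
    static int pixelX(double westLon, double spanDegrees, int tileWidthPx, double lon) {
        double offset = eastwardOffset(westLon, lon);
        if (offset > spanDegrees) {
            return -1;
        }
        return (int) Math.round(offset / spanDegrees * (tileWidthPx - 1));
    }
}
```

With a west edge at 170 and a target at -175, the naive difference is -345 degrees; measured eastward it is 15, which is what the renderer needs.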

T
Monday, May 17, 2004
 
Things currently underway:

1. publish atlas.

Waiting for re-mark to complete before proceeding. We need to regenerate headings and re-index to take account of the newly-added entity combinations, then put the sql live, merge indexes and perform a cluster update. That will also put the May jar release live across the cluster.

2. bring tomcat into commission.

Waiting for the test servers from Matt - he's been busy today sorting out Outlook problems elsewhere.

Need to do some coding before I forget how to type. Will add the ability to search for a location by lon / lat.

T
 
Updated all entity mapping files and installed them on rsw1 ready for this week's cluster update.

Link re-mark is taking a while.

T
 
Re-ran the sql link step of the heading term linker, and the atlas now has links in that linker's cache. Schweet. Re-running mark links for the atlas.

Have now added the new entity combinations to most of the mapping files; Carl's generating the gifs and then I'll add the image references to the html mapping (need to select im_base or im_bottom as appropriate).

T
 
Hmm - the sql link for the heading linker timed out, which is why the new atlas is light on links. Will re-run that step and re-mark. Perhaps goring was very busy at the time?

T
 
Performed a manual update for the Brisbane account to bring their shelf up-to-date. Need to write a more efficient sync routine so we don't get huge SQL logs when someone hits sync - shouldn't be too hard to do.

T
 
Discovered that we're missing quite a few Unicode combinations (letter + combining chars for accents) which are used in the atlas. That's OK, we always thought we'd have to extend our set as new content comes in. We'll prepare the appropriate information today and I'll re-run the heading generator and indexer. Our new recruits are:

A_304
A_328
C_30C
C_327
D_30C
D_327
E_304
E_30C
G_30C
H_327
H_331
I_304
I_30C
I_328
K_328
L_31B
L_328
N_30C
O_30C
R_30C
R_331
S_30C
S_327
T_327
U_304
U_30C
U_328
U_A_304
U_C_30C
U_C_327
U_D_30C
U_D_327
U_H_327
U_H_331
U_O_304
U_O_30C
U_R_30C
U_R_327
U_S_30C
U_S_327
U_T_327
U_T_328
U_U_304
U_U_30C
U_Z_30C
U_Z_327
U_Z_331
Y_304
Z_30C
Z_327
Z_331
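
For anyone decoding those names: each is a letter plus the hex code point of a combining character (304 is the macron, 30C the caron, 327 the cedilla and so on). The sketch below turns a name into the character it stands for. It assumes a leading "U_" marks the uppercase letter and that names without it are lowercase, which is a guess from the list rather than anything documented, and it uses java.text.Normalizer, which needs Java 6 or later. CombiningNames is a made-up class name.

```java
import java.text.Normalizer;

/** Hypothetical sketch: decode names such as "C_30C" or "U_C_30C" into text. */
final class CombiningNames {
    static String compose(String name) {
        String[] parts = name.split("_");
        boolean upper;
        String letter;
        String hex;
        if (parts.length == 2) {
            // e.g. "U_304": the letter u plus U+0304
            upper = false;
            letter = parts[0];
            hex = parts[1];
        } else if (parts.length == 3 && parts[0].equals("U")) {
            // e.g. "U_C_30C": assumed to mean uppercase C plus U+030C
            upper = true;
            letter = parts[1];
            hex = parts[2];
        } else {
            throw new IllegalArgumentException("Unexpected name: " + name);
        }
        if (letter.length() != 1) {
            throw new IllegalArgumentException("Unexpected letter in: " + name);
        }
        char base = upper
                ? Character.toUpperCase(letter.charAt(0))
                : Character.toLowerCase(letter.charAt(0));
        int mark = Integer.parseInt(hex, 16);
        String decomposed = new StringBuilder().append(base).appendCodePoint(mark).toString();
        // NFC folds letter + mark into a single precomposed character where one exists.
        return Normalizer.normalize(decomposed, Normalizer.Form.NFC);
    }
}
```

Where Unicode has no precomposed form for a pair, the result stays as two code points, which is presumably why these need their own gifs.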

The new atlas import seems short on links - tracking down the reason now.

T
Friday, May 14, 2004
 
Another set of downloads, another failed build for the JNI jk and jk2 connectors. Bah. We're leaving this here. Matt is going to prepare me a pair of boxes for testing: I'll set one up with apache + jserv, the other with apache2 + mod_jk + tomcat 5.0.24 over ajp13. Then we can run some performance tests. Matt may look into JNI integration but I'm leaving further research in that direction to the experts! I'll get the sites running and we should be able to translate that into whatever environment we end up with.

It looks as if this will coincide with the site upgrade from rh62 to rh enterprise 3 (via whitebox) - we'll be going for a big-bang roll out! What's life without some excitement...

T
Thursday, May 13, 2004
 
After a bit of hacking, got the java side of the tomcat connectors to build under my new source tree.

Native side won't build. The generated makefile ends up calling apache's libtool with a classpath (where you might be expecting the name of an apache module). Who decided to release this? Couldn't they have tested it?

Trying an earlier release.

T
 
Started reimport of atlas. Tomorrow: clear links to atlas tiles from previous import then re-run atlas location linker.

T
 
Tomcat now built from sources and serving pages on my local machine.

T
 
Updated the xml implementations which ship with the 1.4 jdk to the latest version, as suggested by the xalan faq. These went into the jre endorsed directory and we're now at version 2.6.

Tomcat build continues...

T


 
Tracked the firefox/java problem - ff requires java 1.4.2 or later, the version of the plugin in that release requires gcc support libraries from gcc 3.2.2 or better and I'm on RedHat 7.3, which now may as well be cuneiform. Another thing to add to the list of things to do in idle moments - upgrade to the latest and greatest.

In the meantime, I'm sticking with Firefox for day-to-day (News!!) browsing and invoking Mozilla when I need to demonstrate the mapper.

Atlas reimport is on hold for the moment while some data issues are ironed out.

T
 
Trying firefox 0.8. Seems fast so far, but the java plugin I'm using with mozilla doesn't work. Downloading up-to-date jdk.

T
 
Added configuration parameters for the atlas rendering. We can now set the base URL for tile access and the server's location for the tile images themselves at deployment time.
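
Roughly the shape of it, as a sketch only: the property names, the defaults, the ".gif" suffix and the AtlasConfig class are all invented here and are not the real deployment settings.

```java
import java.util.Properties;

/** Hypothetical sketch: atlas rendering settings read at deployment time. */
final class AtlasConfig {
    private final String tileBaseUrl;
    private final String tileImageDir;

    AtlasConfig(Properties props) {
        // Both keys and both defaults are placeholders for illustration.
        this.tileBaseUrl = withTrailingSlash(props.getProperty("atlas.tile.baseUrl", "/atlas/"));
        this.tileImageDir = withTrailingSlash(props.getProperty("atlas.tile.imageDir", "/var/atlas/tiles/"));
    }

    private static String withTrailingSlash(String s) {
        return s.endsWith("/") ? s : s + "/";
    }

    /** URL the browser uses to fetch a tile image. */
    String tileUrl(String tileId) {
        return tileBaseUrl + tileId + ".gif";
    }

    /** Where the same image lives on the server. */
    String tileImagePath(String tileId) {
        return tileImageDir + tileId + ".gif";
    }
}
```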

Re-tagged release as monument_1 and updated www1. Saving cluster update until we've reimported the atlas, which means we're also putting off the integration of atlas_at.jsp modifications into entry.jsp.


T
Wednesday, May 12, 2004
 
AtlasRenderer is now putting in the bells and whistles - purple one-pixel border, padding around direction buttons etc. Also filling in height and width attributes on image table cells so the page loads more smoothly.

Codebase is tagged "monument" and is up and running on xplus. Everything passes jtest.

T
 
The sql import of the atlas data on the live servers is complete - but it looks as if we're going to reimport anyway to rearrange the browse entries.

Tomcat build got further yesterday than before, it's now downloading and installing its own dependencies. After a certain point though, we run into what seems to be a known bug in the XML parser and everything bombs out. Need to integrate a later version of the parser somehow.

Integrating final changes to atlas renderer before making the May release.

Found missing stats for Claire - an account had been redesignated "trial" instead of "customer", thereby confusing the new stats-by-customer-type system. Copied stats over, but I don't really see why customer type should affect where we file their stats - we've got a primary key anyway with their client id so that should be enough. Might change that in a spare hour sometime.

T
Monday, May 10, 2004
 
So now I'm trying to build Tomcat itself from sources and there's no documentation and NOTHING BUILDS. This is really stupidly frustrating. If ant is such a great tool I would expect to be able to unpack the distribution, tell a property file where I keep my JDK (maybe) and then run ant. Done.

Not a chance.

I'm yearning for configure; make; make install

T

 
Tried building mod_jk2 versions 2.0.2 and 2.0.4 without success.

Both of them seem to rely on lots of stuff which isn't included in their distribution and without instructions on how to create it. Blugh. I can't believe many people are using this (at least, building and using it for JNI). Trying a build of Tomcat itself from source to see if that can fill in any blanks.

T
 
Atlas imported and linked over the weekend. Updated jar and now running atlas location linker, which I think will take the rest of the day. It's been running for the morning and is up to 23k entries out of 70k.

Cheeky testing of the book on xplus with the xdev account - v. exciting! The linkers have done a decent job of finding interesting information about many of the places on the map.

Will send atlas media up tonight with rsync.

T
Friday, May 07, 2004
 
Book import is underway. Before running the atlas link on Monday, remember to update the import jar!

T
 
The production tile descriptions are generated and sitting in db_server, ready to go to the site. Next thing to do is import the book and perform the atlas location link.

T
 
Right then, this is Friday afternoon talking, and it's telling me to draw a line under this enterprise and reassess it on Monday morning. The story so far:

1. Tomcat 5 is working fine, delivering our pages and handling virtual hosting. It has the functional potential to serve our entire site on its own - performance not yet tested.
2. Apache 2 is also just dandy.
3. Using mod_jk 1.2.5 to link the two works, provided they communicate over the ajp13 protocol. This requires you to start both separately.
4. The more efficient and more automatic integration, whereby the two communicate over JNI and apache starts up Tomcat on its own, doesn't work. It falls down because we haven't got an implementation of org.apache.tomcat.modules.server.JNIEndpoint
5. I suspect that no-one has got this combination working. There are q and a's on the web relating to the ajp13 communication, but nothing about jni with these versions. We're beyond the bleeding edge and it's cold out here.
6. An alternative is to upgrade to mod_jk2, but the current release won't even compile.

Subject to thinking about it over the weekend, I'm inclined to follow this rough plan:

a) Try to get mod_jk2 compiling.
b) Based on (a), get JNI integration working.
c) If it's really not happening, benchmark Tomcat solo vs. Tomcat + Apache over ajp13 with mod_jk.
d) Adopt the winning config, put it live and start improving the web app (mmm - custom taglibs...).

We're due for a jar update next week, so I'm looking for the winner to go up the following Monday. Reckon that's the 17th then.

A change of pace for the rest of the afternoon: Time to put the atlas live.

T
 
org/apache/tomcat/modules/server/JNIEndpoint isn't shipped with Tomcat 4 either.

T
 
When trying to load the bridge the native side of the connector is unable to load a particular class:

org/apache/tomcat/modules/server/JNIEndpoint

Now this class isn't anywhere in the Tomcat 5.0.19 distribution or the Tomcat JK 1.2.5 connector distribution. Aha. Now hunting for it, but I'm getting a slight sinking feeling about it - what are the chances it will integrate neatly with what we've got even if I do find it? I think we're being herded towards mod_jk2.
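
A quick way to check from the java side whether a class like this is visible on a given classpath, without grepping through jars. ClassProbe is a throwaway helper made up for this note. The native side asks for the class with slashes (the JNI convention), whereas Class.forName wants dots.

```java
/** Throwaway helper: is a class visible on the current classpath? */
final class ClassProbe {
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate and load the class, don't run its static initializers
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            return false; // class file present but unusable, e.g. a missing dependency
        }
    }

    /** Convert a JNI-style name such as "a/b/C" to the dotted form "a.b.C". */
    static String fromJniName(String jniName) {
        return jniName.replace('/', '.');
    }
}
```

Run with the Tomcat and connector jars on the classpath, ClassProbe.isLoadable(ClassProbe.fromJniName("org/apache/tomcat/modules/server/JNIEndpoint")) would settle the question for that particular set of jars.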

T
 
We're failing after the module has created a jni worker, when it calls "validate". In validate, the jni worker performs a series of checks to ensure we can proceed. The step which fails is when it's getting the bridge. This bridge was listed in one of the faqs saying "use bridge=tomcat33" but I don't really understand what that means yet - it's just weird property voodoo. Investigating.

T
Thursday, May 06, 2004
 
Testing with jni versus ajp13 and new debug outs reveals a divergence: In both cases, we retrieve a list of 1 element but in the former the map building fails. Find out why tomorrow.

Also, the atlas is ready for import. Time to run the tile generator on the live system and get that sucker out.

T
 
Yes, we can! printf at the same point gets dumped straight to the console! Wahey! Now we can get some proper tracing done...

T
 
The list (in fact, worker_map) is created and filled in only one place, and that's preceded by an indigenous call to the logger.

We never see the result of that log call (irrespective of the worker type setting).

That must mean that the log call is failing. Perhaps the logging machinery is not set up at the point the map is created? Can we wave a flag by some other means?

T
 
Got the wc_get_worker_for_name function to dump out its current worker map when it's called.

Works fine whenever all the listed workers are of type ajp13. As soon as one of them is marked jni then the entire list is empty.

So, what's filling the list when it does work?

T
 
First useful log output comes from jk_handler in mod_jk.c. Checking it out...

T
 
Looks as if jk_init is not getting called, and that's what sets up the worker map. Will change worker back to ajp13 (which gets found) and see how that one gets identified.

T
 
The story so far:

Entry point into the mod_jk stuff is init_ws_service in apache-2.0/mod_jk.c and this tallies with some of the messages I'm seeing in the logs, for example

[Thu May 06 12:14:48 2004] [mod_jk.c (497)]: agsp=80 agsn=xreferlocal hostn=xreferlocal shostn=xreferlocal cbsport=0 sport=0

is generated from that call.

There are lots of structures flying about which contain data and (void*) pointers to functions - c++ without all the language support. You sometimes forget how much java (or OO in general) shields you from. Bjarne, if you'd spent less time in Abba and more time on c++ we'd all have got here a lot earlier.

T
 
Cleaned and rebuilt the mod_jk module.

It's now taking notice of the jni directions, because it wouldn't start apache until I'd fixed the library path by setting LD_LIBRARY_PATH before starting apache. It was complaining:

[root@orac conf]# /usr/local/apache2/bin/apachectl start
Error occurred during initialization of VM
Unable to load native library: libjvm.so: cannot open shared object file: No such file or directory

so I ran

[root@orac conf]# export LD_LIBRARY_PATH=/shared/apps/java/j2sdk1.4.1/jre/lib/i386/
[root@orac conf]# export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/shared/apps/java/j2sdk1.4.1/jre/lib/i386/native_threads
[root@orac conf]# export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/shared/apps/java/j2sdk1.4.1/jre/lib/i386/server
[root@orac conf]# /usr/local/apache2/bin/apachectl start

which fixed that. Next it complained that the initial JVM heap size was too small. That was down to directives I'd copied into workers.properties from a HOWTO:

worker.worker1.ms=64
worker.worker1.mx=128

and have now modified to read:

worker.worker1.ms=64M
worker.worker1.mx=128M

and it seems much happier.

The implication of all this is that apache is now reading workers.properties, noticing that we'd like it to use JNI and testing the jvm startup. All of which is good. The JNI worker which sparks all of this, however, is still not getting found by mod_jk when we actually request the page:



(Browser)

Internal Server Error

The server encountered an internal error or misconfiguration and was unable to complete your request.

Please contact the server administrator, you@example.com and inform them of the time the error occurred, and anything you might have done that may have caused the error.

More information about this error may be available in the server error log.




(mod_jk log)

[Thu May 06 12:26:04 2004] [jk_uri_worker_map.c (486)]: Into jk_uri_worker_map_t::map_uri_to_worker
[Thu May 06 12:26:04 2004] [jk_uri_worker_map.c (500)]: Attempting to map URI '/'
[Thu May 06 12:26:04 2004] [jk_uri_worker_map.c (618)]: jk_uri_worker_map_t::map_uri_to_worker, done without a match
[Thu May 06 12:26:04 2004] [jk_uri_worker_map.c (486)]: Into jk_uri_worker_map_t::map_uri_to_worker
[Thu May 06 12:26:04 2004] [jk_uri_worker_map.c (500)]: Attempting to map URI '/'
[Thu May 06 12:26:04 2004] [jk_uri_worker_map.c (618)]: jk_uri_worker_map_t::map_uri_to_worker, done without a match
[Thu May 06 12:26:04 2004] [jk_uri_worker_map.c (486)]: Into jk_uri_worker_map_t::map_uri_to_worker
[Thu May 06 12:26:04 2004] [jk_uri_worker_map.c (500)]: Attempting to map URI '/home.jsp'
[Thu May 06 12:26:04 2004] [jk_uri_worker_map.c (580)]: jk_uri_worker_map_t::map_uri_to_worker, Found a suffix match worker1 -> *.jsp
[Thu May 06 12:26:04 2004] [mod_jk.c (1709)]: Into handler r->proxyreq=0 r->handler=jakarta-servlet r->notes=137914904 worker=worker1
[Thu May 06 12:26:04 2004] [jk_worker.c (132)]: Into wc_get_worker_for_name worker1
[Thu May 06 12:26:04 2004] [jk_worker.c (136)]: wc_get_worker_for_name, done did not found a worker



T
 
Googling for the error messages reveals nothing, so I'm going to have a look at the c code and try to figure out where it's going wrong.

The O'Reilly book on Tomcat (less than useful) declares that mod_jk2 is required to get this to work, though even they can't get it working for the book. They opine that it "might be fixed by the time you read this". Hmm. mod_jk2 won't even compile at the current release, which is why I'm using mod_jk. The documentation for mod_jk suggests it is capable of jni-based interaction, so I'm pursuing that for the moment.

The whole connector thing seems one of the bad examples of open source - it's fractured (mod_jk, mod_jk2, mod_webapp), badly documented if at all and very fragile. Shame really, as apache httpd and tomcat themselves are such good examples of what open source can produce. I really am a pompous old duffer.

T
Wednesday, May 05, 2004
 
The in-process server's not working yet. By the debugging messages, it looks as if the jni worker isn't getting added to the map, so it's not found by mod_jk. Hmm. Will see if I can find out why tomorrow.

T
 
Apache 2 now configured with virtual hosts, each of which is routing through to tomcat and serving the appropriate content.

Added DirectoryIndex directives so that each site's correct homepage is displayed by default.

Now to go for the tighter level of integration where apache runs tomcat in-process, saving us extra admin work.

T
 
Modified the dtable export links so that excel users can get it direct. If MS supported standard MIME-types we'd be OK with the single link, but there you go!

Tomcat now serving xrefer, xreferplus and xml sites from my machine. Unfortunately, tomcat doesn't seem happy with a symbolic link for each context's lib directory, so they've each got their own copies of the jars. Still, this is progress. Next to hook up apache with the vhost stuff.

Modified IQueryOperations to re-introduce the older retrieve() prototype so we don't have to modify all the callers. There's only so much we can change at once!

T
Tuesday, May 04, 2004
 
Got the "host" configuration element working in Tomcat. Wahey! Tomcat now serving xplus from the root context of the xlocal virtual host. Will try to extend this to a few other domains tomorrow, hook them up to apache and then try the in-process connector.

T
 
Downloaded and installed apache2.

There are two choices for the apache-to-tomcat connector, mod_jk and mod_jk2. Downloaded mod_jk2 which is nicely ant-driven, but alas it doesn't build at all. Nyet.

Downloaded and built mod_jk. It's now installed in my local apache2 configuration, and I've got apache2 to forward .jsp requests through it to tomcat, which processes them correctly. This is good.

Still got a problem with the context mappings - everything's in /xplus/stuff rather than the root. We need some way to indicate to apache that it should prefix every tomcat request with the xplus context name.

T
 
Monday was a bank holiday so not a lot happened.

The story so far: Tomcat is now serving xreferplus pages on my local machine, login, search and retrieve are all working. Things which aren't working are generally broken because the application resides in //host/xplus/stuff.jsp rather than //host/stuff.jsp - somehow we need to get tomcat to serve things from the root context. A better alternative would be to get apache to route through to tomcat and let apache handle all the context/hostname mappings.

That's next on the list.

T
