Time for an Update
It has been a long time since my last post, and a lot of things have happened in the meantime. The most important event, however, happened today: I corrected the bad manners of team.sPace. The load times are back in a reasonable range - or in other words, team.sPace has become usable again. The actual fix was extremely simple: less than 50 lines of code across the front-end and the back-end. The hard part was convincing myself to get over the old-school thinking of monolithic CGI code. In this article I explain why team.sPace was so slow and what it took to fix it.
multi-user feed aggregation on different platforms
team.sPace is a feed aggregator that allows many users to register their personal news feeds (at this time, del.icio.us accounts and RSS2- or ATOM-capable blogs). After the first experiment I figured out that team.sPace creates a lot of unnecessary traffic on the related sites. Another issue was reported by new users who registered their del.icio.us accounts or blog feeds: team.sPace showed their entries only after some time. This of course irritated the users.
So I thought of a fix that solves both problems. The core idea was to check for changes only when a user is actually visiting the team.sPace portal, and to re-cache all outdated information at that moment. This reduces the network traffic dramatically, as requests peak in the morning and afternoon hours - with much less activity during the weekends. In the original version of team.sPace I had a little caching script running that checked every 30 minutes during office hours and every hour for the rest of the time. It is pretty obvious that a lot of "caching" is done when actually no user will notice it. Therefore, I decided to check for updates only when someone would notice it.
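To make the idea a bit more concrete, here is a minimal sketch of the check-on-visit logic in modern Javascript. The real code lives in the Perl back-end of team.sPace, so everything below - the cache structure, the helper names, the 30-minute timeout - is just an assumption for illustration.

    // Illustrative sketch only, not the actual team.sPace back-end code.
    // cache maps a feed URL to { items, fetchedAt, timeout }.
    const cache = new Map();

    async function refreshFeed(url) {
      const response = await fetch(url);      // re-fetch the feed
      const items = await response.text();    // feed parsing omitted for brevity
      cache.set(url, { items, fetchedAt: Date.now(), timeout: 30 * 60 * 1000 });
    }

    // Called only when a visitor actually requests the portal page.
    async function handlePortalRequest(feedUrls) {
      const now = Date.now();
      for (const url of feedUrls) {
        const entry = cache.get(url);
        const stale = !entry || now - entry.fetchedAt > entry.timeout;
        if (stale) await refreshFeed(url);     // refresh outdated feeds only
      }
      return feedUrls.map((url) => cache.get(url)); // render from the cache
    }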
towards a solution of the problem
As one might expect, this approach had its own pitfalls. For example, the more users register, the more requests have to be made to the external services. Most services do not like high-frequency hits from a single client IP address and throttle their responses to that IP address. Del.icio.us is such a service. With blogging services this is not so much of a problem, because most of the blogs are hosted on different systems.
The effect is that the caching process can be quite lengthy if a lot of information has to be reloaded. To reduce these timing problems I added dynamic cache timeouts. These timeouts are randomly generated and assigned to the different feeds that are cached by team.sPace. With frequent requests it is then less likely that much information from the same service is outdated at the same time.
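The randomized timeouts could look like the following sketch, which builds on the cache structure from the sketch above; the 30-to-90-minute range is an assumption, not the actual values used by team.sPace.

    // Give each feed its own randomized cache timeout so that the cached
    // feeds do not all expire at the same moment (the range is made up).
    function randomTimeout(minMinutes = 30, maxMinutes = 90) {
      const spread = maxMinutes - minMinutes;
      return (minMinutes + Math.random() * spread) * 60 * 1000; // milliseconds
    }

    // Whenever a feed is (re-)cached it gets a fresh random timeout:
    // cache.set(url, { items, fetchedAt: Date.now(), timeout: randomTimeout() });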
Thinking in a straightforward way, I embedded the little Net::Tube from the feed caching script into the application class of the portal page. This solved the responsiveness problem for newly registered feeds. However, the first visitor of team.sPace in the morning and after lunch experienced such a long load time that it caused timeouts instead of presenting the team.sPace portal. These timeouts were a direct effect of the reduced responsiveness of del.icio.us.
I played around with different approaches to ease the caching problem, but the main issue was that the first request is unpredictable, so I ended up adding periodic caching again. The periodic caching calls eased the timing problem only a little - at least not in a way that was satisfying for the users. As some feeds had to be re-cached with almost every request, the responsiveness of team.sPace was still too poor.
The final solution
Some time ago I read some books on Javascript coding style and learned to be more aware of asynchronous services. I realized that I had violated one of the fundamental AJAX paradigms:
do quick things immediately
do slow things asynchronously
I didn't do that; I tried to do everything at once. But how to translate this idea into team.sPace?
First, I removed the caching tube from the core portal service. After that step team.sPace was fast again. The only drawback was that it never updated.
In a second step I wrapped the caching Net::Tube into a micro service that does nothing other than update the team.sPace caches and tell the caller how many entries have been updated or added. This little service is called the "updater". If nothing has changed, the service reports 0 updates; if something has changed, it reports more than 0 updates.
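The real updater is a Perl Net::Tube, but its contract can be sketched in a few lines of Javascript, reusing the cache and refreshFeed helpers from the sketch above. For simplicity this sketch counts changed feeds rather than individual entries.

    // Sketch of the updater contract: refresh stale feeds and report
    // how many of them actually changed (0 means nothing to do).
    async function updater(feedUrls) {
      let updates = 0;
      const now = Date.now();
      for (const url of feedUrls) {
        const entry = cache.get(url);
        if (!entry || now - entry.fetchedAt > entry.timeout) {
          const before = entry ? entry.items : null;
          await refreshFeed(url);
          if (cache.get(url).items !== before) updates += 1; // this feed changed
        }
      }
      return { updates };
    }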
Finally, I added a bit of Javascript to the team.sPace portal. This code is executed just after the portal is initially loaded; it calls the updater service and checks whether something new has been added to team.sPace. The request is asynchronous and does not interfere with the user's interactions. If the updater reports changes, the script updates the information on the page according to the user's current selection (just in case a user has set a tag cloud filter in the meantime).
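In spirit, the client-side check works like the following sketch; the /updater and /entries URLs, the tag query parameter, and the entries element id are made-up names, and the real page most likely used the XMLHttpRequest style of its time instead of fetch.

    // Runs after the portal has loaded; it never blocks the initial rendering.
    window.addEventListener('load', async () => {
      try {
        const response = await fetch('/updater');   // asynchronous update check
        const { updates } = await response.json();
        if (updates > 0) {
          // Something changed: re-fetch the entry list for the user's current
          // tag cloud filter and swap it into the page.
          const tag = new URLSearchParams(window.location.search).get('tag') || '';
          const html = await (await fetch('/entries?tag=' + encodeURIComponent(tag))).text();
          document.getElementById('entries').innerHTML = html;
        }
      } catch (err) {
        console.error('updater check failed', err); // never break the portal itself
      }
    });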
This change makes the team.sPace portal available as fast as possible and finally solves the problems introduced by periodic caching. The nice thing is that if nothing has changed, the user will not experience any lag in the display. If things have changed, the display is updated. The only limitation at the moment is that the web page is simply updated, which might be a bit of a surprise for the users. So there is some work left ;)