While working on an application built on Zend Framework, I experienced a really odd slow-down when running on the web cluster at work, compared with my machine at home. I couldn’t see what the issue was myself, and it seemed to baffle people on #zftalk a bit, as well as my work colleagues. The speed difference was quite dramatic: near instant on my home computer versus around 30 seconds for a page display on the cluster.
Naturally, this required a fair amount of investigation…
ZF itself was quickly ruled out as the culprit. After all, it is being used by companies such as IBM, Zend, SourceForge, Fox, and more. If the framework were not suitable and produced slow results then they would obviously not use it, nor would any of you!
Next to be ruled out was custom code built on top of ZF. With the exact same code-base producing faster results on one machine and not on another it was highly unlikely to be the code.
Profiling the code proved a little helpful. I profiled each database query and ruled out any slowness there, as they were taking fractions of a second. Code profiling was a bit trickier, as everything seemed proportionally slower rather than any one thing in particular. However, the Zend_Loader component seemed to be taking quite some time to perform its tasks.
With a little command-line magic (using ktrace, kdump, grep, awk, etc., not by me but by a talented colleague) it was determined that the OS itself, Mac OS X ‘Tiger’, was mainly to blame. The cause of the problem was resolving relative paths, and the slow speed at which Tiger was doing this. As I understand it, to determine the current directory, ‘.’, the OS has to work backwards towards the root: at each level it reads the list of entries in the parent directory, works out which entry has the same inode as the directory it has just come from, records that name, and then repeats the process one level up until it reaches the root. Once it’s done that you have your current path. If it sounds intensive, that’s because it is.
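To make the description above a little more concrete, here is a rough sketch of that walk-up-and-match-inodes approach written in PHP. It is only an illustration of the general technique on a POSIX system, not what Tiger actually does internally, and the function name resolveByInode() is made up for this example.

```php
<?php
// Illustrative sketch only: rebuild the absolute path of a directory by
// walking up through '..' and matching device/inode numbers at each level.
// resolveByInode() is a hypothetical name, and this assumes a POSIX
// filesystem where stat() reports meaningful inode numbers.
function resolveByInode($dir = '.')
{
    $parts = array();
    $here  = stat($dir);
    if ($here === false) {
        return false;
    }

    while (true) {
        $parentPath = $dir . '/..';
        $parent     = stat($parentPath);
        if ($parent === false) {
            return false;
        }

        // At the root, '..' refers to the root itself, so we are done.
        if ($parent['ino'] === $here['ino'] && $parent['dev'] === $here['dev']) {
            break;
        }

        $names = scandir($parentPath);
        if ($names === false) {
            return false;
        }

        // Read every entry in the parent and find the one that is "us".
        $found = false;
        foreach ($names as $name) {
            if ($name === '.' || $name === '..') {
                continue;
            }
            $candidate = @lstat($parentPath . '/' . $name);
            if ($candidate !== false
                && $candidate['ino'] === $here['ino']
                && $candidate['dev'] === $here['dev']) {
                array_unshift($parts, $name);
                $found = true;
                break;
            }
        }
        if (!$found) {
            return false;
        }

        // Move one level up and repeat.
        $dir  = $parentPath;
        $here = $parent;
    }

    return '/' . implode('/', $parts);
}
```

Each level costs one directory listing plus a stat call per entry, which is why a slow directory read multiplies so badly when the lookup is repeated for every file.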
When comparing Tiger to Leopard we were seeing a 1000x improvement on Leopard (4 microseconds as opposed to 4 milliseconds) for various getdirentries() calls.
If you used the include path for a handful of files you’d never notice a significant drop in speed, but the application I’m working on, together with ZF, will typically include 140+ files, and the cost is paid again for each of them.
So how was the issue resolved?
For the short term there was a very simple fix: alter the include path so that the current path is checked last and the more significant paths (such as where the application or Zend library is located) are checked first. This simple tweak took a 30+ second load time down to around two seconds, a vast improvement! Still, two seconds is not ideal, so we will be having a Leopard-based machine installed on the web cluster to see if that also helps to increase performance.
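For anyone wanting to try the same tweak, the reordering could look something like the sketch below. The helper name prioritiseIncludePath() and the two /path/to/... directories are placeholders invented for this example, so substitute wherever your application and the Zend library actually live.

```php
<?php
// Hypothetical helper: put the preferred (absolute) directories first and
// move '.' to the very end of the include path, if it was present at all.
function prioritiseIncludePath(array $preferred, $current)
{
    $others = array();
    $hasDot = false;

    foreach (explode(PATH_SEPARATOR, $current) as $entry) {
        if ($entry === '') {
            continue; // skip empty segments
        }
        if ($entry === '.') {
            $hasDot = true; // remember it, but re-add it last
            continue;
        }
        // Keep remaining entries in their original order, without duplicates.
        if (!in_array($entry, $preferred, true) && !in_array($entry, $others, true)) {
            $others[] = $entry;
        }
    }

    $ordered = array_merge($preferred, $others);
    if ($hasDot) {
        $ordered[] = '.';
    }

    return implode(PATH_SEPARATOR, $ordered);
}

// Placeholder paths: replace with your real application and library locations.
set_include_path(prioritiseIncludePath(
    array('/path/to/application', '/path/to/library'),
    get_include_path()
));
```

This would normally run once in the bootstrap file, before anything is loaded through Zend_Loader, so that most files are found in the first couple of entries and the lookup rarely gets as far as ‘.’.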
I’m curious; has anyone else had a similar problem?