Easy page scraping with Zend\Dom (from Zend Framework 2)

The other day I wanted to get some information from the sussex.academia.edu site: specifically, a list of tags for each of the faculty members. This sounds relatively easy until you consider that the initial page contains a list of links to the various schools/departments people have listed. Under each of those pages there are different fieldsets with different types of people in them (and I was only interested in the faculty fieldset). Each person may or may not have tags, and even then those tags may be hidden behind some JavaScript, so that you have to click to view all of the tags… When you consider all of that, you would be forgiven for thinking that it’s actually quite a daunting task!

Let me assure you, though, that by using Zend\Dom from the Zend Framework 2 library it’s actually a really simple task. In fact, I did it in around 20 lines of code.

So let’s start by looking at the code and then break it down a little more.
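
The full listing is in the complete post. To give a flavour of the approach in the meantime, here is a minimal sketch that is not that script: it uses PHP’s built-in DOM extension with XPath as a stand-in, where Zend\Dom\Query would let you pass a CSS selector to its execute() method instead. The markup, ids and class names are all made up for illustration and will not match the real site.

```php
<?php
// Hypothetical markup standing in for one department page (not the real site's HTML).
$html = <<<HTML
<html><body>
<fieldset id="faculty">
  <div class="person"><a class="name">Ada</a><a class="tag">PHP</a><a class="tag">DOM</a></div>
  <div class="person"><a class="name">Bob</a></div>
</fieldset>
<fieldset id="students">
  <div class="person"><a class="name">Cy</a><a class="tag">Ignored</a></div>
</fieldset>
</body></html>
HTML;

libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep warnings quiet
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Only look at people inside the faculty fieldset, then collect each person's tags.
$tagsByPerson = array();
foreach ($xpath->query('//fieldset[@id="faculty"]//div[@class="person"]') as $person) {
    $name = $xpath->query('.//a[@class="name"]', $person)->item(0)->textContent;
    $tagsByPerson[$name] = array();
    foreach ($xpath->query('.//a[@class="tag"]', $person) as $tag) {
        $tagsByPerson[$name][] = $tag->textContent;
    }
}

print_r($tagsByPerson);
```

People without tags simply end up with an empty list, which is the "may or may not have tags" case described above.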

Continue reading “Easy page scraping with Zend\Dom (from Zend Framework 2)”


Zend OPcache – some info and a GUI

I’ve just started to use Zend OPcache in place of APC in a few places and so far it’s great!

Because documentation seems a little scarce right now, I’ve decided to jot down the methods available (mainly so I don’t forget).

opcache_reset()

Should be pretty obvious given the name: it resets the opcache.

opcache_get_configuration()

This will return an array with three indices: one for the directives that have been set, one for version information and one for any blacklisted paths. For example:

    [directives] => Array
            [opcache.enable] => 1
            [opcache.enable_cli] => 1
            [opcache.use_cwd] => 1
            [opcache.validate_timestamps] => 1
            [opcache.inherited_hack] => 1
            [opcache.dups_fix] => 
            [opcache.revalidate_path] => 
            [opcache.log_verbosity_level] => 1
            [opcache.memory_consumption] => 134217728
            [opcache.max_accelerated_files] => 4000
            [opcache.max_wasted_percentage] => 0.05
            [opcache.consistency_checks] => 0
            [opcache.force_restart_timeout] => 180
            [opcache.revalidate_freq] => 2
            [opcache.preferred_memory_model] => 
            [opcache.blacklist_filename] => 
            [opcache.max_file_size] => 0
            [opcache.error_log] => 
            [opcache.protect_memory] => 
            [opcache.save_comments] => 1
            [opcache.load_comments] => 1
            [opcache.fast_shutdown] => 1
            [opcache.enable_file_override] => 
            [opcache.optimization_level] => 2147483647

    [version] => Array
            [version] => 7.0.2-dev
            [opcache_product_name] => Zend OPcache

    [blacklist] => Array


opcache_get_status()

This will return an array that holds information about whether the opcache is turned on or not, whether it’s in a restart phase, the memory usage and hit statistics, and any files that have been cached along with their hit and memory details. For example:

    [opcache_enabled] => 1
    [cache_full] => 
    [restart_pending] => 
    [restart_in_progress] => 
    [memory_usage] => Array
            [used_memory] => 9395880
            [free_memory] => 121987104
            [wasted_memory] => 2834744
            [current_wasted_percentage] => 2.1120488643646

    [opcache_statistics] => Array
            [num_cached_scripts] => 353
            [num_cached_keys] => 1546
            [max_cached_keys] => 7963
            [hits] => 27055
            [start_time] => 1365412384
            [last_restart_time] => 0
            [oom_restarts] => 0
            [hash_restarts] => 0
            [manual_restarts] => 0
            [misses] => 509
            [blacklist_misses] => 0
            [blacklist_miss_ratio] => 0
            [opcache_hit_rate] => 98.153388477725

    [scripts] => Array
            [/http/includes/libs/ZF-1.12.2/Zend/Loader/Autoloader.php] => Array
                    [full_path] => /http/libs/ZF-1.12.2/Zend/Loader/Autoloader.php
                    [hits] => 175
                    [memory_consumption] => 63320
                    [last_used] => Mon Apr  8 16:21:14 2013
                    [last_used_timestamp] => 1365434474
                    [timestamp] => 1343660895

            [/http/www-includes/libs/ZF-1.12.2/Zend/Db/Adapter/Oracle.php] => Array
                    [full_path] => /http/libs/ZF-1.12.2/Zend/Db/Adapter/Oracle.php
                    [hits] => 17
                    [memory_consumption] => 60600
                    [last_used] => Mon Apr  8 16:21:14 2013
                    [last_used_timestamp] => 1365434474
                    [timestamp] => 1325795702

            [/http/libs/ZF-1.12.2/Zend/View/Helper/Placeholder/Container.php] => Array
                    [full_path] => /http/libs/ZF-1.12.2/Zend/View/Helper/Placeholder/Container.php
                    [hits] => 175
                    [memory_consumption] => 2744
                    [last_used] => Mon Apr  8 16:21:14 2013
                    [last_used_timestamp] => 1365434474
                    [timestamp] => 1325795702
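
That array is a lot to read through, so a small helper can boil it down to a few headline numbers. The following is only a sketch: summariseOpcacheStatus() is a made-up name, and it assumes the array has the memory_usage and opcache_statistics keys shown in the example output. If OPcache isn’t loaded or enabled, it falls back to the sample figures from that output.

```php
<?php
/**
 * Boil the array from opcache_get_status() down to a few headline numbers.
 * (Hypothetical helper, not part of OPcache itself.)
 */
function summariseOpcacheStatus(array $status)
{
    $memory = $status['memory_usage'];
    $stats  = $status['opcache_statistics'];
    $total  = $memory['used_memory'] + $memory['free_memory'] + $memory['wasted_memory'];

    return array(
        'enabled'        => (bool) $status['opcache_enabled'],
        'used_mb'        => round($memory['used_memory'] / 1048576, 1),
        'total_mb'       => round($total / 1048576, 1),
        'hit_rate'       => round($stats['opcache_hit_rate'], 2),
        'cached_scripts' => $stats['num_cached_scripts'],
    );
}

// Sample figures copied from the example output, used when no live data is available.
$sample = array(
    'opcache_enabled' => true,
    'memory_usage' => array(
        'used_memory'   => 9395880,
        'free_memory'   => 121987104,
        'wasted_memory' => 2834744,
    ),
    'opcache_statistics' => array(
        'num_cached_scripts' => 353,
        'opcache_hit_rate'   => 98.153388477725,
    ),
);

// Prefer live data, but only if it has the shape the helper expects.
$live   = function_exists('opcache_get_status') ? opcache_get_status() : false;
$status = (is_array($live) && isset($live['memory_usage'], $live['opcache_statistics']))
    ? $live
    : $sample;

print_r(summariseOpcacheStatus($status));
```

With the sample figures, the used, free and wasted memory add up to exactly the 134217728 bytes (128 MB) set in opcache.memory_consumption.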



APC did have a GUI, which was quite handy for seeing what had been cached, the settings and the memory usage, but Zend OPcache doesn’t have anything similar (currently, as far as I can see).

You can get all of the key information by looking at the output of phpinfo().

However, I’ve just pushed the start of a GUI to GitHub. If you want to make changes and improvements then please feel free! It’s had very little work done on it so far, so it’s pretty raw.


Handy little function

Quite often I find myself wanting to run the same script from either the CLI or a browser. I don’t want to fill my echo statements with <br /> tags if I’m on the CLI, because that’d just look ugly. At the same time, I don’t want to use only \n when outputting in the browser, because everything would end up on the same line.

This handy little function helps to do simple output that will be readable in the browser as well as the command line:

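$_ = function($str) {
    if (PHP_SAPI == 'cli') {
        // On the command line, \n and \t are already readable as they are
        echo $str;
    } else {
        // In the browser, turn newlines into <br /> and tabs into non-breaking spaces
        echo nl2br(str_replace("\t", str_repeat('&nbsp;', 6), $str))."\n";
    }
};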

Then when I want to echo something I just do:

$_("This is a test\n");
$_("\tTime:" . time() . "\n\n");

Simple but handy.


Sorting an array of objects by one or more object property

Quite often I find myself having an array of objects and needing to sort that array of objects by property (either one property or multiple)…

Imagine, for example, getting a large result set from your database and ordering in the query just takes too long. Or perhaps you’re getting results from a web service and that service doesn’t return the results in the order you’d like to use. Have you ever found yourself in that situation, too? On looking at the usort documentation one day I came across a comment by someone called Will Shaver that did almost what I wanted. With a little adaptation for my own use (being able to change the sort order, for example), it has become one of my favourite functions to use for sorting.


Now a few cool things about the function:

  1. It uses anonymous/lambda functions (or closures, whatever you prefer to call them), and that’s just plain fun
  2. You can sort on more than one property and because the sorting is recursive, it’ll sort the second property within the confines of the first, the third within the confines of the second, and so on. Think sorting in SQL
  3. You can sort in ascending or descending order for any of the properties
  4. It retains key associations so you could use this on an associative array of objects
  5. If the property you want to sort on is an array itself then you can use any value in that array (by specifying its key) as the sorting value
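
The points above can be illustrated with a minimal sketch. This is not the function from the gist: sortByProperties() and its $order parameter are names made up here, it loops over the properties rather than recursing, and it leaves out point 5 (sorting on a value inside an array property) to stay short.

```php
<?php
/**
 * Sort an array of objects by one or more properties, keeping key associations.
 * $order maps a property name to 'asc' or 'desc', most significant property first.
 * (Hypothetical helper written for illustration.)
 */
function sortByProperties(array $items, array $order)
{
    // uasort() keeps the original keys, so associative arrays stay intact
    uasort($items, function ($a, $b) use ($order) {
        foreach ($order as $property => $direction) {
            if ($a->$property == $b->$property) {
                continue; // a tie on this property: fall through to the next one
            }
            $cmp = ($a->$property < $b->$property) ? -1 : 1;
            return strtolower($direction) === 'desc' ? -$cmp : $cmp;
        }
        return 0; // equal on every property
    });

    return $items;
}

$people = array(
    'a' => (object) array('surname' => 'Smith', 'age' => 30),
    'b' => (object) array('surname' => 'Jones', 'age' => 25),
    'c' => (object) array('surname' => 'Smith', 'age' => 41),
);

// Surname ascending, then age descending within each surname (think ORDER BY in SQL)
$sorted = sortByProperties($people, array('surname' => 'asc', 'age' => 'desc'));

echo implode(', ', array_keys($sorted)), "\n"; // prints "b, c, a"
```

Jones comes first, and the two Smiths are then ordered oldest first, with the original keys preserved.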

Extend Zend_View_Stream to easily escape view variables

Zend_View_Stream is used pretty much whenever you use Zend_View, and I’ve blogged about how handy it is before. But as it’s a class like any other, you can extend it to add functionality. One such use is to add automatic escaping to your view variables when you want it. So instead of doing either of these:

<?php echo $this->escape($this->var); ?>
<?= $this->escape($this->var); ?>

You could simply do:

<?=~ $this->var; ?>

That’s a lot simpler, isn’t it?
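
The mechanism behind this is a text rewrite of the view script before PHP runs it, in the same spirit as the way Zend_View_Stream turns short tags into full ones. As a rough sketch of just that rewrite step, outside of any stream wrapper: rewriteEscapeTags() and its regular expression are assumptions made for illustration, not the code from the full post.

```php
<?php
/**
 * Rewrite the "<?=~ expr; ?>" shorthand into an explicit escape() call.
 * (Hypothetical helper; a real subclass would apply this to the template
 * source it has read in before handing it to PHP.)
 */
function rewriteEscapeTags($template)
{
    return preg_replace(
        '/<\?=~\s*(.+?);?\s*\?>/s',
        '<?php echo $this->escape($1); ?>',
        $template
    );
}

$template = '<p><?=~ $this->var; ?></p>';

echo rewriteEscapeTags($template), "\n";
// prints: <p><?php echo $this->escape($this->var); ?></p>
```

A regular expression like this is fine for simple expressions, but an expression that itself contains "?>" inside a string would trip it up.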
Continue reading “Extend Zend_View_Stream to easily escape view variables”




Usually I’m a total wallflower at conferences, gravitating to only the people I know. This time round I’m trying to change that and speak to people, ask speakers questions, and all that.

Right now, though, I’m enjoying dinner. 🙂


PHPNW Conference 2010

I’m going to be travelling to the PHPNW Conference today (it’s tomorrow, but I don’t fancy catching the stupidly early train to get there on time), but going over the schedule is a pain… There are just too many good talks! How can I possibly go and see them all?! The 11:15 time slot is easy: that’ll be Rob Allen’s talk on ZF 2. We use ZF so much at work now that it’d be crazy not to find out what’s coming and how it may affect what we’re currently doing. Same with the Michelangelo van Dam talk on unit testing with ZF. But the 12:15, 15:00 and 16:15 talks? I have no idea what to choose! Juozas Kaziukena’s Optimizing Zend Framework might be worthwhile, but then again, is it all about ZF1, and how much will be relevant for ZF2? The HipHop talk by Scott McVicar would be interesting; I can’t see it being deployed at my work, but it should still be a good talk… I’m liking the sound of Database version control without pain by Harrie Verveer as well! And that still leaves me with two other time slots to decide on… Sheesh!

The agony of choice, eh? 😉


QrCode view helper

You see QrCodes popping up every now and again on sites, in publications and the like. I think they can be a very handy way for people with cameras on their phones to get a url or other content on to their phone very easily. (I’m thinking more about those people without iPhones or full keyboards, of course!)

If you’ve never seen a QrCode before, it looks something like this:

QR Code image

Now how cool would it be to generate that automatically for each page on your site, so that people can bookmark the page on their phone? Well, I think it’d be pretty cool! So I came up with a very simple ZF view helper to do it for me.

Continue reading “QrCode view helper”
