Aaron Saray

Load Facebook Fanbox Faster by Caching it

I wasn’t in favor of the Facebook fanbox on the site I was working on… but that’s what the client wanted – and that is what they get. I added it and moved on. Later, I started noticing a number of errors in my JavaScript error log. I looked back at the newest addition: the fanbox. Depending on where I was connecting from, that box would take another 3 to 20 seconds to load. During that time, the page appeared to keep loading. My fear was that visitors would think the page was not done loading and have a bad experience on the web site.

I took a look at all the requests being generated with the Firebug Net panel and was completely blown away. It was loading tons of JavaScript and CSS – all things we had no use for. To top it off, it lowered my YSlow grade :)

I came up with the idea to cache the results. One of the volunteers working with the campaign I’m working on was assigned to work with me on this project: Jack Polifka. Some of the code and understanding I’m going to share here can be partially attributed to him.

What Is The Plan

The plan was to make the same request as the fanbox, but to cache the response. I thought this could be done with CURL, stored, and reloaded every hour. It wasn’t imperative that the fan pictures and count update on every page load. Once an hour would suffice.

Issues We Ran Into

Because the fanbox was being cached locally, it was never going to work exactly like the original. The good news is that we were able to style it to fit our layout perfectly (thank our good friend Mark Skowron for his intuitive eye). There were some issues though:

JavaScript

The first issue was the use of JavaScript in the fanbox. Because Facebook was loading JavaScript from its own domain, it could do many things that we couldn’t. The biggest one was identifying whether you were already a fan of the page. Since Facebook was loading content from their domain, they were able to access your Facebook ID and determine if you had already fan’d the page. Then, the widget would update to say you’re already a fan. We couldn’t do this.

Fan… um… relativity

If you are logged into Facebook when you view the widget, it appears to search the page’s fans for friends of yours. If it finds any, it gives them priority in the picture ordering. Since we couldn’t provide that real-time cookie access, ours is just a generic list of fans.

Facebook doesn’t like CURL

Honest FB, I wasn’t trying to mess with you or take advantage of you! But, when you saw me coming with a CURL user agent, you stopped me in my tracks. In order to continue the request, we had to change the CURL User Agent to something else. Then it loaded perfectly.

But, We did it Anyway

In the end, it was a success. A cron job is run every hour to get the content of the Facebook widget. Then, it is written out to a re-formatted output file, which is served for the next hour.
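Since the page depends on cron having run, a small freshness check can act as a safety net, for example to log a warning when the cached file is older than expected. The following is only a sketch and not part of the original code: the function name `fanboxCacheIsStale` and the one-hour default are assumptions for illustration.

```php
<?php
/**
 * Sketch (hypothetical helper, not from the original code):
 * returns true when the cached fanbox file is missing or older than
 * $maxAge seconds. $now can be injected so the check is testable
 * without waiting an hour.
 */
function fanboxCacheIsStale($file, $maxAge = 3600, $now = null)
{
    if (!is_file($file)) {
        return true; // nothing has been cached yet
    }

    $now = ($now === null) ? time() : $now;

    // filemtime() is the time the cron job last rewrote the file
    return ($now - filemtime($file)) > $maxAge;
}
```

A page could call this before including the cached partial and fall back to hiding the widget if the file has gone stale.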

The cron script: build_facebook_fanbox.php

$builder = new facebook_fanbox();
$builder->getHTML();
$builder->write();

This is pretty self explanatory. The class is instantiated, a request is made to get the content, and then it’s written out. Those steps are separate because that makes testing easier: it’s not always necessary to write to a file during testing.

Next, I’m going to cover parts of the class and supporting files individually.

The class: fanbox.php

class facebook_fanbox
{
    const USER_AGENT = 'Firefox XXXXXXXXXXXX';
    const FANBOX_URL = 'http://www.connect.facebook.com/connect/connect.php?api_key=xxx&channel_url=xxx&id=xxx&name=&width=280&connections=8&stream=0&logobar=1';

    protected $_html, $_fanCount = 0, $_fans = array(), $_pageInfo, $_output;

    public function getHTML()
    {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_USERAGENT, self::USER_AGENT);
        curl_setopt($ch, CURLOPT_URL, self::FANBOX_URL);
        curl_setopt($ch, CURLOPT_FAILONERROR, true); // treat HTTP 400+ as a failure so an error page is never cached
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $this->_html = curl_exec($ch);

        if ($this->_html === false) {
            // read the error details before closing the handle
            $message = curl_errno($ch) . ': ' . curl_error($ch);
            curl_close($ch);
            throw new Exception($message);
        }

        curl_close($ch);

        /**
         * write out to temp cache to review
         */

        $file = '/tmp/fanbox.html';
        file_put_contents($file, $this->_html);
    }
}

First, there are two constants. The first, USER_AGENT, is the full user agent string that we send when requesting the content of the fanbox. Remember, Facebook was rejecting CURL’s user agent. The other constant, FANBOX_URL, is the entire URL that is loaded. I retrieved this by reviewing the requests in the Net panel of Firebug. Yours will contain your API key and channel information.

The getHTML() function simply opens up a connection and retrieves the HTML. If there are any errors, it fails. Finally, it writes it out to a cache file in the tmp directory. I do this just in case I want to compare later on to make sure my final output matched what I retrieved.

Moving on, I added the following method:

    public function write()
    {
        $this->_parseHTML();
        $this->_buildOutput();
        $this->_writeOutput();
    }

This is pretty self explanatory. It just calls three internal methods, which I’ll cover next.

    protected function _parseHTML()
    {
        $fancountExp = '/<span class="total">(.*?)<\/span>/';
        $fanExp ='/<div class="grid_item">(<(a|span).*?<\/(a|span)>)<\/div>/';
        $pageExp ='/<div class="connect_top clearfix">(<a.*?<\/a>)/';

        preg_match_all($fancountExp , $this->_html, $fanCount);
        $this->_fanCount = $fanCount[1][0];

        preg_match_all($fanExp, $this->_html, $fans);
        $this->_fans = $fans[1];

        preg_match_all($pageExp, $this->_html, $pageInfo);
        $this->_pageInfo = $pageInfo[1][0];
    }

Here, there are just three regular expressions used to parse out various bits of information from the retrieved HTML. First, the fan count. Then, all the fan boxes (which are a tags, span tags and img tags). Finally, it gathers the page information itself. This allows the owner of the page to change the page name – and our code not to break! As you can see, it assigns all of the values to the class internally. Let’s look at the next method.
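One thing to keep in mind is that these expressions depend entirely on Facebook's markup. If it changes, the indexes read above will not exist. The standalone function below is a minimal sketch of a more defensive variant using the same three expressions. The function name `parseFanboxHtml` and the choice to throw a `RuntimeException` are assumptions for illustration, not part of the original class.

```php
<?php
/**
 * Sketch: pull the fan count, fan tiles and page link out of fanbox HTML.
 * Throws instead of silently producing an empty widget when the expected
 * markup is not found.
 */
function parseFanboxHtml($html)
{
    $fancountExp = '/<span class="total">(.*?)<\/span>/';
    $fanExp      = '/<div class="grid_item">(<(a|span).*?<\/(a|span)>)<\/div>/';
    $pageExp     = '/<div class="connect_top clearfix">(<a.*?<\/a>)/';

    // The count and page link appear once, so preg_match() is enough.
    if (!preg_match($fancountExp, $html, $count)) {
        throw new RuntimeException('Fan count not found; has the markup changed?');
    }
    if (!preg_match($pageExp, $html, $page)) {
        throw new RuntimeException('Page info not found; has the markup changed?');
    }

    // Fans repeat, so collect every match; an empty list is allowed.
    preg_match_all($fanExp, $html, $fans);

    return array(
        'count' => $count[1],
        'fans'  => $fans[1],
        'page'  => $page[1],
    );
}
```

With a check like this, the cron script could catch the exception and leave the previous cached file in place rather than overwrite it with a broken one.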

    protected function _buildOutput()
    {
        $params = array('count'=>$this->_fanCount, 'fans'=>$this->_fans, 'page'=>$this->_pageInfo);
        $this->_output = view::get('facebook/fanboxtemplate', $params);
    }

This is pretty simple. It builds a parameter array of the values we’ve identified before. Then, this is passed into the helper function I have to generate the view. (The specifics of the helper function won’t be covered here. However, all it does is include the file specified in the first parameter, and assign all the values in the next parameter to an internal $vars array.) This output is then assigned to an internal variable.
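Since the helper itself isn't shown, here is a minimal sketch of what such a helper might look like, based only on the description above. The `$basePath` property, the `.phtml` extension and the error handling are assumptions for illustration; the real helper may well differ.

```php
<?php
/**
 * Hypothetical stand-in for the view helper (the real one is not shown).
 * It includes a template file and returns whatever the template prints.
 */
class view
{
    // Assumed base directory for templates; point this at your own.
    public static $basePath = __DIR__;

    public static function get($template, array $vars = array())
    {
        $file = self::$basePath . '/' . $template . '.phtml';
        if (!is_readable($file)) {
            throw new RuntimeException("Template not found: $file");
        }

        ob_start();            // capture everything the template prints
        include $file;         // the template reads its values from $vars
        return ob_get_clean(); // hand the rendered markup back as a string
    }
}
```

Because the template is included inside the method, the `$vars` array is in scope for it, which is why the template can simply echo `$vars['count']` and friends.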

In order to understand how we’re re-parsing the content, let’s take a quick look at the stripped down HTML template file. (This is a smaller version and is only meant as a demonstration.)

<div class="fan_box">
<?php echo $vars['page']; ?><br />
<div class="connect_button">
<a href="#" id="FBbecomeFan" class="FBbutton FBbutton_Gray FBActionButton">
<span class="FBbutton_Text"><span class="FBbutton_Icon FBbutton_IconNoSpriteMap">
</span>Become a Fan</span>
</a>
</div>

<div class="connections">
<?php echo $vars['count']; ?> Fans

<div class="connections_grid">
<?php
foreach ($vars['fans'] as $fan) {
    $fan = str_ireplace('<img', '<img alt="facebook profile icon"', $fan);
    print $fan;
}
?>
</div>
</div>
</div>

The most notable thing about this example is that we replace the image tag from Facebook’s fanbox and add in alt text. That’s about it. It’s all pretty self explanatory. (Once again, our real production version has a lot more options in it – this is just for demonstration.)

Finally, the last function is pretty simple:

    protected function _writeOutput()
    {
        $file = APPLICATION_PATH . '/views/partials/facebookfanbox.phtml';
        file_put_contents($file, $this->_output);
    }

The final processed output is written to a page that is later included.
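One caveat with writing straight to the included file is that a visitor could, in principle, load the page while the file is half written. A common way around that is to write to a temporary file and rename it into place. The sketch below is not from the original code: the function name `writeFileAtomically` and the 0644 permission are assumptions, and the rename is only atomic when both paths are on the same filesystem (typically the case on POSIX systems when the temp file lives in the same directory).

```php
<?php
/**
 * Sketch: replace $file with $contents in one step, so a page that
 * includes the partial never sees a half-written file.
 */
function writeFileAtomically($file, $contents)
{
    // Keep the temp file in the same directory so rename() stays on one filesystem.
    $tmp = tempnam(dirname($file), 'fanbox');
    if ($tmp === false || file_put_contents($tmp, $contents) === false) {
        throw new RuntimeException("Could not write temporary file for $file");
    }

    // tempnam() creates the file as 0600; the web server must be able to read it.
    chmod($tmp, 0644);

    if (!rename($tmp, $file)) {
        unlink($tmp);
        throw new RuntimeException("Could not move $tmp to $file");
    }
}
```

In `_writeOutput()`, the `file_put_contents()` call could then be swapped for a call to this function with the same two arguments.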

Final Words

While I’d love to use Facebook’s built-in fanbox widget, it was causing issues with our page. I couldn’t afford to have the site slowing down because of their excessive resource loading. I think this method bridges the gap nicely.


13 Responses to Load Facebook Fanbox Faster by Caching it

  1. Mark Skowron says:

    Well done, Aaron. I think the solution you came up with works great!

  2. Dimitar Angelov says:

    Hi, I’ve been bashing my head against the wall for weeks now trying to figure out how to accomplish this. Can you attach out_of_the_box.php ready to use files for the not so PHP savvy users?

  3. aazz says:

    Great script. Exactly what I needed.

    But: am I right in saying the “like” button doesn’t work anymore? I know from somewhere that there is actually a lot going on behind this button.

    • Aaron says:

      @aazz: Correct. All the button does now is take them to the page. In the javascript version, it will show ‘they like it’ if they like it, or it will allow them to click it – increase the fan count – and add their picture to the page. If you have a hybrid solution for this, please blog! :)

  4. Don Gilbert says:

    @Aaron, thanks for this! I was searching for hours for this, and when I read it, I facepalmed myself. :) Why didn’t I think of cURL and regex. oh well.

    I got inspired by your code and modified it a bit into a functional WordPress plugin. Check it out here ( http://www.electriceasel.com/plugins/wordpress/plugin-facebook-fan-box-cache ) and let me know what you think. thanks! (I even gave you cred/linked back to this page in the plugin description.)

  5. Sean R Reid says:

    I’m trying to implement this method on an ExpressionEngine install for one of our larger clients. So far I’ve been trying to build just a basic page based on the info you’ve provided and what Don put together for his WordPress plugin. However, I’m still getting all sorts of errors and missing classes. Can you list the total of the files that you used in your version to make it all work?

    Thanks in advance!
    Sean R Reid
    sean@w3bg.com

    • Aaron says:

      Hi Sean,

      I’ve included all the files you’ll need in this post. Could you tell me what error you’re getting? Could it be an issue with CURL not being enabled? Thanks!

      • Sean R Reid says:

        Hey Aaron,

        I feel like it’s probably something rather silly on my end. One of the primary errors is: “Class 'view' not found in /var/www/fbbox/fanbox.php on line 48”. It’s referencing this line: view::get('facebook/fanboxtemplate', $params);

        What I’m assuming is that I have the files setup incorrectly and it’s unable to get the proper output template?

        I’m able to pull data from Facebook just fine. I’m assuming I just didn’t understand the file/dir structure correctly.

        Thanks!
        Sean

        • Aaron says:

          Oh – you’re right Sean – there is a bit of code there I have left over from somewhere else.

          The View class – all it was doing was loading a piece of PHP and substituting the parameters. You could use a heredoc or an include statement there instead.

          • Sean R Reid says:

            Excellent! I’ll give that a go then! Thanks for getting back to me! I appreciate it. This is a great script (and awesome fix for a total nuisance problem).

            -Sean

  6. Danny Michel says:

    Does this still work?

    • Aaron says:

      Hi Danny – I’m not sure. It worked when I wrote it – but I haven’t had to work on a project that required this feature again since then.
