4

I have a PHP script that acts as a JSON API to my backend database.

Meaning, you send it an HTTP request like http://example.com/json/?a=1&b=2&c=3... and it returns a JSON object with the result set from my database.

PHP works great for this because it's literally about 10 lines of code.

But I also know that PHP is slow, and this is an API that's being called about 40 times per second at times; PHP is struggling to keep up.

Is there a way that I can compile my PHP script to a faster-executing format? I'm already using APC, which is an opcode cache for PHP, as well as FastCGI.

Or, can anyone recommend a language I could rewrite the script in so that Apache can still serve the example.com/json/ requests?

Thanks

UPDATE: I just ran some benchmarks:

  • The PHP script takes 0.6 seconds to complete.
  • If I take the SQL generated by that PHP script and run the query from the same web server, but directly from the MySQL command line (so network latency is still in play), the result set comes back in only 0.09 seconds.

As you can see, PHP is roughly an order of magnitude slower at generating the results. The network does not appear to be the major bottleneck in this case, though I agree it typically is the root cause.

2 Comments
  • Can you provide the lines of code you're running (if it's only 10 lines)? Commented Oct 30, 2009 at 3:58
  • Since it is only 10 lines of PHP, I too would be interested in seeing it. I'm not having an easy time believing it takes 0.6 seconds to run 10 lines of PHP. I'm guessing web-server configuration might be the issue, but it is hard to say without knowing more about your infrastructure/source. Commented Aug 28, 2014 at 21:42

9 Answers

8

Before you go optimizing something, first figure out whether it's actually a problem. Considering it's only 10 lines of code (according to you), I very much suspect you don't have one. Time how long the script takes to execute. Bear in mind that network latency will typically dwarf trivial script execution times.

In other words: don't solve a problem until you have a problem.

You're already using an opcode cache (APC). It doesn't get much faster than that. More to the point, it rarely needs to get any faster than that.

If anything, you'll have problems with your database: too many connections (unlikely at 40 per second), too slow to connect, or the big one, a query that's too slow. If you find yourself in this situation, 9 times out of 10 effective indexing and database tuning is sufficient.

In the cases where it isn't, that's when you reach for some kind of caching: memcached, beanstalkd and the like.

But honestly, 40 requests per second means that these solutions are almost certainly over-engineering for something that isn't a problem.


2 Comments

Agreed. If you're returning results from a DB query, perhaps it's the query itself that is slow? Try profiling the query on the SQL server.
Good comment. In short, for a faster executing format: APC, APC, APC, APC.
3

I've had a lot of luck using PHP, memcached and nginx's memcache module together for very fast results. The easiest way is to use the full request URI as the cache key.

I'll assume this URL:

/widgets.json?a=1&b=2&c=3

Example PHP code:

<?php
$widgets_cache_key = $_SERVER['REQUEST_URI'];

// connect to memcache (requires the memcache pecl module)
$m = new Memcache;
$m->connect('127.0.0.1', 11211);

// try to get the encoded response from cache
$json = $m->get($widgets_cache_key);
if($json === false){
    // cache miss: grab the data (assumes an open mysql connection)
    $data = array();
    $r = mysql_query("SELECT * FROM widgets WHERE ...;");
    while($row = mysql_fetch_assoc($r)){
        $data[] = $row;
    }
    // store the encoded JSON string (not the PHP array) so that
    // nginx's memcached module below can serve it back verbatim
    $json = json_encode($data);
    $m->set($widgets_cache_key, $json);
}

header('Content-Type: application/json');
echo $json;
?>

That in itself provides a huge performance boost. If you were to then use nginx as a front-end for Apache (put Apache on 8080 and nginx on 80), you could do this in your nginx config:

worker_processes  2;

events {
    worker_connections  1024;
}

http {
    include  mime.types;
    default_type  application/octet-stream;

    access_log  off;
    sendfile  on;
    keepalive_timeout  5;
    tcp_nodelay  on;
    gzip  on;

    upstream apache {
        server  127.0.0.1:8080;
    }

    server {
        listen  80;
        server_name  _;

        location / {
            if ($request_method = POST) {
                proxy_pass  http://apache;
                break;
            }
            set  $memcached_key $request_uri;
            memcached_pass  127.0.0.1:11211;
            default_type  text/html;
            proxy_intercept_errors  on;
            error_page  404 502 = /fallback;
        }

        location /fallback {
            internal;
            proxy_pass  http://apache;
            break;
        }
    }
}

Notice the set $memcached_key $request_uri; line. This sets the memcached key to the same REQUEST_URI value the PHP script uses (nginx's plain $uri variable strips the query string, so it wouldn't match). If nginx finds a cache entry under that key, it serves it directly from memory and you never touch PHP or Apache. Very fast.

There is an unofficial Apache memcache module as well. I haven't tried it, but if you don't want to mess with nginx it may help you too.

1 Comment

I would just recommend using md5 or sha1 on the memcache key.
1

The first rule of optimization is to make sure you actually have a performance problem. The second rule is to figure out where the performance problem is by measuring your code. Don't guess. Get hard measurements.

PHP is not going to be your bottleneck. I can pretty much guarantee that. Network bandwidth and latency will dwarf the small overhead of using PHP vs. a compiled C program. And if not network speed, then it will be disk I/O, or database access, or a really bad algorithm, or a host of other more likely culprits than the language itself.
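
If you want to test that claim against your own setup rather than guess, here is a minimal timing sketch; the host, credentials, database name and the widgets query are hypothetical placeholders for your own:

<?php
// Checkpoint each stage with microtime() to see whether connecting,
// querying, or fetching/encoding dominates the total time.
$t0 = microtime(true);

$link = mysql_connect('127.0.0.1', 'user', 'pass'); // placeholder credentials
mysql_select_db('mydb', $link);                     // placeholder database
$t1 = microtime(true);

$r = mysql_query("SELECT * FROM widgets WHERE ...;"); // your real query here
$t2 = microtime(true);

$data = array();
while ($row = mysql_fetch_assoc($r)) {
    $data[] = $row;
}
$json = json_encode($data);
$t3 = microtime(true);

error_log(sprintf("connect: %.4fs  query: %.4fs  fetch+encode: %.4fs",
    $t1 - $t0, $t2 - $t1, $t3 - $t2));
echo $json;
?>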

1 Comment

I ran a benchmark; see my updated post above. Network and I/O do not appear to be the bottleneck in this case.
1

If your database is very read-heavy (I'm guessing it is) then a basic caching implementation would help, and memcached would make it very fast.

Let me change your URL structure for this example:

/widgets.json?a=1&b=2&c=3

For each call to your web service, you'd parse the GET arguments and use them to create a key for your cache. Let's assume you're querying for widgets. Example code:

<?php
// a function to provide a consistent cache key for your resource
function cache_key($type, $params = array()){
    if(empty($type)){
        return false;
    }
    // order your parameters alphabetically by key
    ksort($params);
    return sha1($type . serialize($params));
}

// demo: you get the same cache key no matter the order of parameters
var_dump(cache_key('widgets', array('a' => 3, 'b' => 7, 'c' => 5)));
var_dump(cache_key('widgets', array('b' => 7, 'a' => 3, 'c' => 5)));


// now let's use some GET parameters.
// sanitize() is a placeholder for whatever sanitization routine you prefer.
$_GET = sanitize($_GET);

// assuming a URL of /widgets.json?a=1&b=2&c=3 results in this call:
$widgets_cache_key = cache_key('widgets', $_GET);

// connect to memcache (requires the memcache pecl module)
$m = new Memcache;
$m->connect('127.0.0.1', 11211);

// try to get data from cache
$data = $m->get($widgets_cache_key);
if($data === false){
    // data is not in cache. grab it (assumes an open mysql connection).
    $data = array();
    $r = mysql_query("SELECT * FROM widgets WHERE ...;");
    while($row = mysql_fetch_assoc($r)){
        $data[] = $row;
    }
    // now store data for next time.
    $m->set($widgets_cache_key, $data);
}

echo json_encode($data);
?>

Comments

0

You're already using APC opcode caching, which is good. If you find you're still not getting the performance you need, here are some other things you could try:

1) Put a Squid caching proxy in front of your web server. If your requests are highly cacheable, this might make good sense; your script will also need to send cache headers, as in the sketch below.

2) Use memcached to cache expensive database lookups.
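
For the Squid option, a minimal sketch of the headers that let an upstream proxy answer repeated identical requests itself; the 60-second lifetime and the dummy payload are illustrative only:

<?php
// Mark the response as publicly cacheable for 60 seconds so a proxy
// such as Squid can serve it without hitting Apache/PHP again.
header('Content-Type: application/json');
header('Cache-Control: public, max-age=60');
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 60) . ' GMT');

echo json_encode(array('example' => true)); // your real result set here
?>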

1 Comment

I ran a benchmark; see my updated post above. Network and I/O do not appear to be the bottleneck in this case.
0

If you're handling database updates, then it's your MySQL performance that, IMO, needs attention. I would expand the test harness like so:

  • run mytop on the dbserver
  • run ab (apache bench) from a client, like your desktop
  • run top or vmstat on the webserver

And watch for these things:

  • updates to the table forcing reads to wait (MyISAM engine)
  • high load on the webserver (could indicate low memory conditions on webserver)
  • high disk activity on webserver, possibly from logging or other web requests causing random seeking of uncached files
  • memory growth of your apache processes. If your result sets are getting transformed into large associative arrays, or getting serialized/deserialized, these can become expensive memory allocation operations. Your code might need to avoid accumulating the whole result set in one array and instead stream one row at a time, as in the sketch after this list.
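
A minimal sketch of that streaming approach; mysql_unbuffered_query() avoids buffering the whole result set in PHP's memory, and the widgets query is a stand-in for your real one:

<?php
// Stream the result set as JSON one row at a time so peak memory
// stays at roughly one row, regardless of result-set size.
header('Content-Type: application/json');

$r = mysql_unbuffered_query("SELECT * FROM widgets WHERE ...;");

echo '[';
$first = true;
while($row = mysql_fetch_assoc($r)){
    if(!$first){
        echo ',';
    }
    echo json_encode($row);
    $first = false;
}
echo ']';
?>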

I often wrap my db queries with a little profiler adapter that I can toggle to log unusually slow query times, like so:

function query( $sql, $dbcon, $thresh ) {
    $delta['begin'] = microtime( true );
    $result = $dbcon->query( $sql );
    $delta['finish'] = microtime( true );
    $delta['t'] = $delta['finish'] - $delta['begin'];
    if( $delta['t'] > $thresh )
        error_log( "query took {$delta['t']} seconds; query: $sql" );
    return $result;
}

Personally, I prefer using xcache to APC, because I like the diagnostics page it comes with.

Chart your performance over time. Track the number of concurrent connections and see if that correlates with performance issues. You can count HTTP connections with netstat from a cronjob and log the number for later analysis.

Consider enabling the MySQL query cache, too.

Comments

0

Please see this question. You have several options. Yes, PHP can be compiled to native ELF (and possibly even FatELF) format. The problem is that you lose all of the Zend creature comforts.

1 Comment

This link is dead.
0

Since you already have APC installed, it can be used (similar to the memcached recommendations) to store objects. If you can cache your database results, do it! See apc_store() and apc_fetch(): https://www.php.net/manual/en/function.apc-store.php https://www.php.net/manual/en/function.apc-fetch.php
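
A minimal sketch of result-set caching with APC, assuming the apc extension is loaded and a mysql connection is open; the key scheme and 60-second TTL are illustrative:

<?php
// Derive a cache key from the request URI and check APC first.
$key = 'widgets:' . sha1($_SERVER['REQUEST_URI']);

$data = apc_fetch($key, $success);
if(!$success){
    // cache miss: run the real query and store the result
    $data = array();
    $r = mysql_query("SELECT * FROM widgets WHERE ...;");
    while($row = mysql_fetch_assoc($r)){
        $data[] = $row;
    }
    apc_store($key, $data, 60); // cache for 60 seconds
}

header('Content-Type: application/json');
echo json_encode($data);
?>

Note that unlike memcached, the APC user cache is local to each web server, which is fine as long as you're on a single box.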

Comments

0

From your benchmark it looks like the PHP code is indeed the problem. Can you post the code?

What happens when you remove the MySQL code and just put in a hard-coded string representing what you'll get back from the db?

Since it takes 0.60 seconds from PHP and only 0.09 seconds from the MySQL CLI, my guess is that connection creation is taking too much time. By default PHP creates a new connection per request, and that can be slow.

Think about it: depending on your environment and your code, each request will:

  1. Resolve the hostname of the MySQL server to an IP
  2. Open a connection to the server
  3. Authenticate to the server
  4. Finally run your query

Have you considered using persistent MySQL connections or connection pooling? They effectively let you jump straight to the query step above.
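
A minimal sketch of both approaches; the host, credentials and database name are hypothetical placeholders:

<?php
// With the old mysql extension, mysql_pconnect() reuses an existing
// connection in the same Apache/FastCGI worker instead of opening a
// new one on every request.
$link = mysql_pconnect('127.0.0.1', 'user', 'pass');

// With PDO, the PDO::ATTR_PERSISTENT option does the same thing:
$pdo = new PDO('mysql:host=127.0.0.1;dbname=mydb', 'user', 'pass',
    array(PDO::ATTR_PERSISTENT => true));
?>

Keep in mind that persistent connections are held per worker process, so watch MySQL's max_connections if you run many workers.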

Caching is great for performance as well. I think others have covered this pretty well already.

Comments
