
I have a large static array-of-arrays which represents a tree structure with about ~100k nodes. This array is a read-only reference used for value lookups.

Now, which of these methods will perform better?

First, a plain array definition in a pure PHP file, included on every request.

Second, serialize the array, gzip the serialized output, and on every request load the gzipped file and unserialize it.

Or convert the array to SQLite or something similar, but the storage must allow fast lookup of a long "ID path", i.e. 1->5556->123->45->455->Node_name (the plain PHP table actually does this very well).

The memory limit on the server is not a problem.
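Roughly, the first two variants would look something like this (file names here are just placeholders, not part of the actual setup):

    // Variant 1: plain PHP include; tree.php contains "<?php return array( ... );"
    $tree = include __DIR__ . '/tree.php';

    // Variant 2: gzipped serialized blob, generated once offline with
    //   file_put_contents('tree.ser.gz', gzcompress(serialize($tree)));
    // and loaded on every request:
    $tree = unserialize(gzuncompress(file_get_contents(__DIR__ . '/tree.ser.gz')));

    // Lookup along an "ID path" in the nested array, e.g. 1->5556->123->45->455:
    $node = $tree[1][5556][123][45][455];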

  • Just benchmark it and tell us. And consider memcached as an additional option. Commented Sep 15, 2010 at 22:50
  • Why would you gzip it if memory is not a concern? That's just going to take more processing time. Commented Sep 15, 2010 at 22:53
  • Memcached, apc, & read it into memory from a plain file if it isn't there anymore (server restarts & the like). Commented Sep 15, 2010 at 22:57
  • because gzipping produces a much smaller file (serialized text compresses very well), which cuts down on disk read operations Commented Sep 15, 2010 at 22:57
  • if the file is accessed a lot, it will under normal circumstances already be cached in memory, preventing the need for disk IO. Commented Sep 15, 2010 at 22:59

3 Answers


You'll need to turn the array into a PHP value at some point anyway, so gzip is out.

So if you are deciding between keeping it on disk with something like SQLite, or just letting PHP load it in every time (preferably with APC enabled), the real question is what's more important to you: memory or CPU. If you don't know yet, you're probably suffering from a case of premature optimization.

When it does become relevant to cut down on memory, CPU, or I/O, the answer will be more obvious, so make sure you can refactor easily.

If you want to predict what's better for you, do a benchmark.
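A quick way to do that would be something along these lines (a rough sketch; a loop inside one process only approximates separate requests, especially once APC/opcache gets involved):

    $iterations = 20;

    $t = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $tree = include __DIR__ . '/tree.php';       // plain PHP array file
    }
    printf("include:          %.1f ms\n", (microtime(true) - $t) / $iterations * 1000);

    $t = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $tree = unserialize(gzuncompress(file_get_contents(__DIR__ . '/tree.ser.gz')));
    }
    printf("gz + unserialize: %.1f ms\n", (microtime(true) - $t) / $iterations * 1000);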

Update: I just saw that memory is apparently not a concern. Go for the PHP array and include the file. Easy. Keep in mind, though, that if the total data size is 10 MB, this will be 10 MB per Apache process. At 100 Apache processes that's already 1 GB.




Those load times actually look pretty good considering you are loading the whole file on every request. gzip may help (by reducing the amount of data that has to be read from disk).

If you want it faster, or if that file is likely to get much bigger, consider looking into ways to extract what you want without loading the whole file: pointers to tree nodes, fseek, hashtables, etc.

If you can work out a way to fseek to the data you actually want and load just that, rather than the whole file, it would speed things up a lot.
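As a rough sketch of that idea (the record and index layout here is made up purely for illustration): store each node as its own serialized record plus a small offset index, then read only the record you need:

    // Build step (run once): write every node as its own serialized record
    // and remember its offset and length in a small index array.
    $index = array();                                 // "1/5556/123" => array(offset, length)
    $fp = fopen('nodes.dat', 'wb');
    foreach ($nodes as $path => $data) {              // $nodes: flat "id/id/..." => value map
        $record = serialize($data);
        $index[$path] = array(ftell($fp), strlen($record));
        fwrite($fp, $record);
    }
    fclose($fp);
    file_put_contents('nodes.idx', serialize($index));

    // Per request: load only the small index, then fseek to the one record needed.
    $index = unserialize(file_get_contents('nodes.idx'));
    list($offset, $length) = $index['1/5556/123/45/455'];
    $fp = fopen('nodes.dat', 'rb');
    fseek($fp, $offset);
    $node = unserialize(fread($fp, $length));
    fclose($fp);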



My benchmarks tell the whole story:

Pure PHP file size: ~5 MB

Pure PHP file import: avg load time ~210 ms
Pure PHP file import with APC: avg ~60-80 ms
unserialize(gzuncompress(file_get_contents())): almost constant, ~40 ms per request

I haven't tested SQLite, due to a lack of time to port this tree structure into an SQL DB.

The host is a 4-core Intel Xeon with HT and hardware RAID 5.
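For completeness, if someone does want to try the SQLite variant, one possible (untested, hypothetical) layout is an adjacency list keyed by (parent_id, id), resolving the ID path one level at a time:

    // Hypothetical schema, not taken from the question:
    //   CREATE TABLE nodes (id INTEGER, parent_id INTEGER, name TEXT,
    //                       PRIMARY KEY (parent_id, id));
    function lookupPath(PDO $db, array $ids) {
        $stmt = $db->prepare('SELECT name FROM nodes WHERE parent_id = ? AND id = ?');
        $parent = 0;                         // assume 0 marks the virtual root
        $name = null;
        foreach ($ids as $id) {
            $stmt->execute(array($parent, $id));
            $name = $stmt->fetchColumn();
            if ($name === false) {
                return null;                 // path does not exist
            }
            $parent = $id;
        }
        return $name;                        // name of the last node on the path
    }

    $db = new PDO('sqlite:' . __DIR__ . '/tree.sqlite');
    echo lookupPath($db, array(1, 5556, 123, 45, 455));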

