Server on Steroids: FreeBSD, nginx, MySQL, PostgreSQL, PHP, and More
A comprehensive guide to server optimization covering OS tuning on FreeBSD, nginx frontend optimization, PHP backend acceleration with APC, and deep database tuning for both MySQL and PostgreSQL, including migration strategies, indexing, and offloading with SphinxSearch.
Quite a bit of time has passed since the previous optimization article was written. That long-suffering Pentium 4 with 512MB of RAM, simultaneously serving up to a thousand people on the forum and up to 150,000 peers on the tracker, has long been resting in a landfill, and the club has gone through more than one server since then.
0. Why Optimize at All?
There are three ways to grow: Scale up (upgrading hardware), Scale out (adding frontends), or optimization. The first option is used when you have money, the second when you have good architecture, and the third when you need to squeeze the maximum out of existing equipment.
1. OS Optimization (FreeBSD)
1.1 Upgrading to 7.x
Upgrading to a new version of FreeBSD brings significant advantages. The new ULE 3.0 Scheduler and jemalloc are very useful on multi-core systems (>=4 cores). MSI (Message Signaled Interrupts) provide fast interrupt handling.
1.2 Upgrading to 7.2
Superpages, increased KVA, and optimized default sysctl parameters come absolutely free when upgrading to the latest OS release.
1.3 Switching to amd64
Switching to amd64 provides massive KVA sizes and the ability to work with shared memory >2GB. Four gigabytes of RAM were being put even in laptops by 2009, so for servers this became standard. INT64 is frequently used in applications like databases and OpenSSL, providing performance gains.
1.4 Offloading the Network Subsystem
Optimization is divided into two parts: tuning ifconfig parameters and configuring sysctl.conf/loader.conf.
For quality NICs like em (Intel Gigabit) or bge (Broadcom Gigabit), the recommended options are:
tso(tso4, tso6) — TCP Segment Offloadinglro— Large Receive Offloadtxcsum,rxcsum— RX/TX Checksum Offloadlink0,link1,link2— driver-dependent
For third-rate NICs (re/rl/sk/nfe), these options may not work correctly, so it's better to stick with polling.
1.5 FreeBSD and Large Numbers of Files
FreeBSD has a technology for caching file names in directories called dirhash. The maximum memory is limited by vfs.ufs.dirhash_maxmem and defaults to 2MB. It's recommended to increase the memory until vfs.ufs.dirhash_mem hits the ceiling.
1.6 Softupdates, gjournal, and Mount Options
When power goes out, fsck on terabyte drives can take hours. The solution is using softupdates or journaling. It's better to place the journal partition on a separate disk.
It's recommended to add the noatime option to /etc/fstab. The async option is acceptable when you have RAID with a battery-backed unit.
2. Frontend Optimization (nginx)
On web projects, nginx typically consumes only 1%-5% of CPU, with PHP consuming the rest. However, config optimization can affect overall response time.
Recommended standard optimizations:
reset_timedout_connection on;
sendfile on;
tcp_nopush on;
tcp_nodelay on;2.1 Accept Filters
Accept filter technology allows passing packets from the kernel to the process only when data arrives or a valid HTTP request is received. To enable it, load the kernel module:
#kldload accf_httpAnd enable the filter in the nginx config:
listen 80 default accept_filter=httpready;2.2 Caching
nginx has a flexible response caching system for both fastcgi and proxy backends. RSS delivery via PHP can be cached for 3-5 minutes. The version of the site for guests can be cached for 5 minutes (if it's not a news site).
For all static content, it's recommended to set expires to one month:
location ~* \.(jpg|jpeg|gif|png)$ {
root /var/nnm-club;
expires 30d;
}2.3 AIO
AIO is useful for specific workloads and helps maintain response time when reducing the number of workers. To use it, load the kernel module:
#kldload aioAnd enable in nginx.conf:
sendfile on;
aio sendfile;3. Backend Optimization
For PHP, 50% of optimization comes from opcode caching, with the remaining gains from caching DB responses.
3.1 APC
Facebook developers have shared their insights on optimizing APC.
3.1.1 APC Locking
The old File Locking is often recommended to be replaced with spinlock or pthread mutex. pthread mutex became the default starting from version 3.0.16.
3.1.2 APC Hints
When you have many PHP files, you need to increase the values of apc.num_files_hint and apc.user_entries_hint in php.ini. These values control the sizes of APC hash tables and affect efficiency when the load factor reaches >= 0.75.
3.1.3 APC Fragmentation
Fragmentation in APC occurs because there is no automatic deletion of records by TTL or LRU. Records leave the cache only when accessed after TTL expiration or when memory is completely full, triggering a full cache reset.
3.2 PHP 5.3
Upgrading PHP from 5.2 to 5.3 provides performance gains, however the transition will be long, especially in a production environment, considering the list of deprecated functions.
4. Database Optimization
The best DB optimizations are:
- Not using an RDBMS at all (sphinxsearch)
- Not using the database (caching)
- Batch queries (WHERE IN (...), batch INSERT/UPDATE)
- Asynchronous DB operations (memcacheQ, ApacheMQ, AMQP, crontab)
4.1 MySQL
During the lifetime of a project, its database hits memory, disk, and CPU limits. You need to understand not just indexes and sorting, but how MySQL uses them internally.
50% of MySQL tuning can be done with utilities: tuningprimer, mysqltuner, mysqlsla.
4.1.1 Upgrading to 5.1
The upgrade brings numerous bonuses: optimizer improvements (especially GROUP BY), InnoDB plugin, Partitioning, Row-based replication.
4.1.2 Switching to InnoDB
MyISAM lacks transactions, uses TABLE LOCK (table locks on writes), and REPAIR after power loss takes hours.
InnoDB is fully ACID-compliant, uses row-level locking (simultaneous reads and writes), and allows operations like mysqldump in a single transaction. XtraDB from Percona helps with I/O-bound workloads.
4.1.3 Query Cache
Query Cache is the most misunderstood part of MySQL. Bigger does not mean better. On systems with >=8 processors, it's better to disable it since it's a completely non-parallelized subsystem. Contents are fully flushed on any table change, so it only helps on properly normalized tables.
4.1.4 Indexes
Missing indexes hurt SELECT, extra indexes hurt INSERT/UPDATE. Old indexes waste memory and slow down data modifications.
4.2 PostgreSQL
PostgreSQL is an enterprise-class database, but its default settings are such that it could run even on a mobile phone. Out of nearly 200 parameters, about 45 are responsible for tuning.
4.2.1 Indexes
PostgreSQL has far more capabilities: B-tree, hash, GiST, GIN, multicolumn, partial indexes, and indexes on expressions.
4.2.2 pgBouncer
pgBouncer (or an alternative) is the first thing that should be installed on a server with a database. For each database connection, a separate process is launched, consuming at least work_mem of RAM. When exceeding 200-500 processes, the server starts struggling. pgBouncer saves you from this situation.
4.2.3 pgFouine
pgFouine is an advanced PHP-based analog of mysqlsla. Together with Playr (a production log replayer), it allows query optimization on staging servers under real-world conditions.
4.3 Offloading the Database
The best way to optimize database operations is to access it as rarely as possible.
4.3.1 SphinxQL
SphinxSearch has learned to masquerade as a MySQL server. To use it, you just need to create a sphinx.conf, add indexer entries to cron, and switch your search to a different database. When switching to Sphinx, you can get rid of MyISAM and its FTS, improving both search speed and quality.
Usage examples:
- RSS subscription for search queries:
rss.php?q="starcraft 2" - Searching for short words (the movie "9" instead of rejection due to ft_min_word_len)
4.3.2 Non-RDBMS Storage
There are many places where a relational database isn't needed — a simple key-value store is sufficient. Key-value databases are used for caching selections from relational databases due to their speed.
In large PHP projects, it's useful to store opcode cache custom data — frequently used global variables. Retrieval speed is higher than from memcached on a neighboring machine.
4.4 Encodings
UTF-8 is great in every way except one — text in many languages takes up twice as much space. With a monolingual audience, this is worth thinking about.
4.5 Asynchrony
Synchronous data processing isn't always required. Asynchronous processing helps improve response time and reduce server load, since batch queries execute faster than individual ones.
Large projects use message queues (ApacheMQ, RabbitMQ, ZeroMQ, AMQP), while smaller ones can get by with cron.
Appendix: Small Things
1. SSHGuard or Alternative
Standard practice is installing anti-brute-force protection for SSH. It additionally protects against sharp spikes in Load Average during attacks by bots brute-forcing login-password pairs by the tens of thousands.
2. xtrabackup
LVM snapshots are a bottleneck. mysqldump locks tables and creates text backups that take weeks to restore. xtrabackup from Percona enables non-blocking binary backup of InnoDB/XtraDB tables.
The best tool is ZFS clones — they're created instantly, database restoration means just changing the path in the config. If restoration fails, you can roll back. From clones, you can restore the entire system after a failed kernel upgrade.
3. Moving Email to Another Host
With a large amount of spam, this helps reduce traffic and preserve IOPs.
4. Integration with Third-Party Software
For exchanging email messages with attachments, use smtp/imap with an adapter instead of a PHP script. Chat can be organized based on a Jabber server with a JavaScript client. For maps, use a mashup with Google Maps. Such systems scale much better than solutions built on PHP and MySQL.
5. Monitoring
You can't optimize what you can't measure. Performance metrics should be monitored, logged, and displayed on graphs. Tools: Nagios, Zabbix, Cacti, Munin.
6. Downsides of Optimization
Bleeding-edge technologies aren't called that for nothing. When moving to a new server, we found bugs in practically everything (APC, MySQL, nginx, xbtt). Thanks to Open Source, you can fix simple problems yourself.
Instead of an Afterword
Left behind the scenes were ZFS, distributed file systems, replication, and sharding — topics for separate posts. We welcome criticism, as a found mistake in a post often means a mistake in one of the author's projects.