Created: 2008-11-19
Last modified: 2008-12-15




MySQL Scalability @2008






by Dimitri


SSC Team, 2008
Sun Microsystems Inc.





This information is provided for indicative purposes only and does not commit Sun Microsystems.




Benchmark Information
Customer Name(s): Sun / MySQL

NDA: no

Contact Information:
     - Dimitri@Sun.COM

Dates: Nov - Dec/2008

Keywords: MySQL, InnoDB, PBXT, Falcon, Concurrency, Scalability, db_STRESS, M9000, M-series, SPARC64


Hardware Configuration
Server(s):
     - domain #0 M9000, 192GB RAM, 12CPU SPARC64-VII 2500MHz (or 48 cores, or 96 threads) - DB Server
     - domain #1 M9000, 256GB RAM, 16CPU SPARC64-VII 2500MHz (or 64 cores, or 128 threads) - Client Injector

Storage:
     - SE3511 RAID1 900GB, single FC-port, Controller with a write cache


Software Configuration
System:
     - Solaris 10 update 5
     - UFS

Application(s):
     - MySQL
     - db_STRESS
     - dim_STAT



Abstract
Overview: It's been more than a year since I last tested any recent MySQL version, and as everybody from the MySQL team tells me there have been a lot of performance improvements since my last test, I started looking for a good occasion to replay my db_STRESS tests on an even more powerful and widely scalable platform. I was also curious to apply to a real workload all the tuning findings seen from other teams - MySQL is part of Sun now, and everybody is paying more attention to this database engine than ever...

So, once a customer canceled their M9000 testing, I did not miss the opportunity to run my tests during the machine time that would otherwise have been lost :-)

Goal(s): Understand MySQL scalability bottlenecks or potential limits, and find the optimal configuration for a single MySQL instance. My main interest is the currently most popular transactional engine - InnoDB.

Result(s): please continue reading :-)


Benchmark
To understand MySQL's internal bottlenecks, I have to be sure no H/W limitation is blocking the database engine. That's why during most of the tests my "redo flushing" will stay OFF (meaning the redo log is still written to the filesystem, but not flushed to the disks: innodb_flush_log_at_trx_commit=0). I also limited my tests to InnoDB, as the most attractive, currently working and stable transactional engine.

From several "best practice" findings I also adopted the other "my.cnf" options used below.

So, my initial MySQL config file looks like this:

[mysqld]
 max_connections=2000
 key_buffer_size=200M
 low_priority_updates=1
 table_cache = 8000
 sort_buffer_size = 2097152

 innodb_file_per_table
 innodb_log_file_size=500M
 innodb_buffer_pool_size=16000M
 innodb_additional_mem_pool_size=20M
 innodb_log_buffer_size=8M
 innodb_checksums=0
 innodb_doublewrite=0
 innodb_support_xa=0
 innodb_thread_concurrency=0
 innodb_flush_log_at_trx_commit=0
 innodb_max_dirty_pages_pct=15



First Probe tests...
I started my first tests with MySQL 5.0.37, and then progressed to other versions once they were ready (compiled, with 10M Objects loaded (~64GB database size))...

First STRESS Test results
The first Stress Test is running with the following parameters:

[mysqld]
 max_connections=2000
 key_buffer_size=200M
 low_priority_updates=1
 table_cache = 8000
 sort_buffer_size = 2097152

 innodb_file_per_table
 innodb_log_file_size=500M
 innodb_buffer_pool_size=16000M
 innodb_additional_mem_pool_size=20M
 innodb_log_buffer_size=8M
 innodb_checksums=0
 innodb_doublewrite=0
 innodb_support_xa=0
 innodb_thread_concurrency=0
 innodb_flush_log_at_trx_commit=0
 innodb_max_dirty_pages_pct=15

Tested MySQL versions: 5.0.37, 5.0.67, 5.0.67-percona, 5.1.29, 6.0.7

NOTE:
     - all versions were compiled and run with the mtmalloc library (see the example after this list)
     - my compiled binaries ran slightly faster than the "standard" ones from mysql.com, so I used mine
     - there was no difference in the tests when the mysqld process priority was changed to the FX class
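
For reference, a minimal sketch of how mysqld can be run with mtmalloc on Solaris (the library path and startup command here are illustrative; linking with -lmtmalloc at build time works as well):

 # preload the mtmalloc allocator for a 64-bit mysqld (illustrative paths)
 LD_PRELOAD_64=/usr/lib/sparcv9/libmtmalloc.so.1
 export LD_PRELOAD_64
 ./bin/mysqld_safe --defaults-file=/etc/my.cnf &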


First 1600 Users Test results
The first 1600 users test simply killed me - not a single test finished correctly!!!
During every test, after 200 concurrent sessions (and each session sleeps 1 second between queries), the TPS level just dropped and stayed near zero... Bummer!...

Restart ALL Tests with innodb_thread_concurrency = 8
It's quite a cool feature in InnoDB - we may change innodb_thread_concurrency live on a running database! But anyway, I wanted to better understand the real impact on both the Stress and 1600usr workloads. So I've restarted all tests again, but with innodb_thread_concurrency=8 ...
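
For example, it can be changed without restarting the server, right from the mysql client:

 # innodb_thread_concurrency is a dynamic variable, so it can be set live
 mysql -e "SET GLOBAL innodb_thread_concurrency = 8"
 mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_thread_concurrency'"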

MySQL Buffer Cache allocation...
I was quite surprised to discover that MySQL/InnoDB allocates its buffer cache as a simple pageable HEAP zone, and not as ISM Shared Memory (for ex.) like Oracle does (for ex.)...

I've jumped into the source code and discovered that at least for Linux it forces SHM usage and pays special attention to using a big page size for the allocation. Well, Solaris uses a 4MB page size automatically when an application asks to allocate a big memory region, but the MySQL code does a simple malloc, which cannot be optimal while your database is running on an already well-loaded server - having buffer cache pages paged around in memory is never a good idea :-))

So, I've changed the InnoDB code a little bit...
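
The actual code change is not reproduced here, but the effect is easy to check from the outside on Solaris - pmap shows whether the buffer pool lives in pageable heap/anon memory or in an ISM segment, and which page size it got:

 # inspect mysqld memory mappings and page sizes
 pmap -xs `pgrep -x mysqld` | more
 # a malloc'ed buffer pool shows up under [ heap ] / anon pages,
 # while an ISM segment appears as [ ism shmid=... ]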


Resource LOCKs
MySQL + InnoDB generates a lot of mutex locking related to access to its various internal data/resources. And it seems to me it's still the main source of MySQL scalability limitations...

At the same time, the locking profile changes completely when innodb_thread_concurrency is not equal to zero - other functions (like srv_conc_enter_innodb()) move to the top of the lock statistics...
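
For the curious, here is a minimal sketch of how the hottest user-level mutexes inside mysqld can be sampled on Solaris with plockstat (the pid lookup and duration are illustrative):

 # sample mutex contention in the running mysqld for 30 seconds
 # -A reports both contention and hold events
 plockstat -A -e 30 -p `pgrep -x mysqld`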


Solaris kernel locks?...
At the same time, such an increase in CPU System Time is quite negative...
So, I've tried to take a closer look at it. What is hot in the kernel?..
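
On Solaris this question can be answered with lockstat - one run for kernel lock contention, one for a kernel profile of where the system time actually goes (a sketch; the durations are arbitrary):

 # kernel lock contention over a 30 second window
 lockstat sleep 30
 # kernel profiling: top 20 functions by sampled system time
 lockstat -kIW -D 20 sleep 30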

Seeking for the Most Optimal Configuration
As you may see, with fewer CPU cores I got a better performance level. But I'm still thinking it may be even better if I find the optimal pairing between the innodb_thread_concurrency value and the number of CPUs/cores...

What I'm trying to do: while my max TPS level is already better compared to my initial 3500 value, I'll now try to find the optimal parameters to keep my 256 concurrent sessions at the highest possible TPS - and in my opinion that will then be my currently best possible configuration!

So, I've created a processor set on my server and started MySQL inside it (see the sketch below). Then I added or removed CPUs/cores/threads to/from this processor set on the fly, while also changing the innodb_thread_concurrency value in MySQL, trying to get the best ratio between CPU User and System time - and, as the final confirmation, the TPS level at 256 concurrent sessions.
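
A minimal sketch of that setup with Solaris processor sets (the CPU ids and the set id are illustrative):

 # create a processor set from a few HW threads and note the returned set id
 psrset -c 0 1 2 3 4 5 6 7
 # bind the running mysqld to the new set (set id 1 assumed here)
 psrset -b 1 `pgrep -x mysqld`
 # add or remove CPUs on the fly while the test is running
 psrset -a 1 8 9
 psrset -r 8 9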

Here are my findings:


Restart ALL Tests with 8 cores
Optimal configuration:

NOTE: I've also tried innodb_thread_concurrency = 0 within this configuration and all results were much worse; the 1600 Users Test did not work either... So, having innodb_thread_concurrency != 0 seems much safer to me :-))


Restart ALL Tests with FS cache disabled (Direct I/O)
Well, just to see at the end how well MySQL/InnoDB works without any File System caching help (probably closer to real life? it depends :-))
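
There are two obvious ways to take the UFS cache out of the picture, and both are sketched below (the device path is illustrative): remounting the data filesystem with forcedirectio, or letting InnoDB ask for direct I/O itself (innodb_flush_method=O_DIRECT uses directio() on Solaris):

 # option 1: remount the UFS data filesystem with direct I/O
 mount -F ufs -o remount,forcedirectio /dev/dsk/c2t1d0s6 /DATA
 # option 2: my.cnf - ask InnoDB itself for direct I/O on its data files
 innodb_flush_method = O_DIRECT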

Tested configuration:



And other MySQL Engines?..
Well, I did not test MyISAM; probably I'll look at it next time...

However, I was very interested in:
     - PBXT
     - Falcon

Both engines claim to scale very well! But as always: the idea is good, but the implementation matters most! :-))


SUMMARY
A few final words: