« September 2014 | Main | July 2014 »

Friday, 15 August, 2014

MySQL Performance: Analyzing LinkBench Workload on MySQL 5.7 and MariaDB 10.1

Reading the article published on MariaDB site about performance evaluation of MySQL 5.7 -vs- MariaDB 10.1 I've got a double feeling: from one side I have no any reason to have a doubt in obtained results, and from another side - my feeling says me there is something going odd.. But well, I don't have any attention to blame anybody (and sorry if you think so) -- my willing here is only to understand what is going on, why, and what are the challenges on the observed test workload(s). I'll cover here only my observations on the LinkBench workload as it was not yet enough tested and involving many questions around..

LinkBench Benchmark :

  • developed by Facebook Team, so represents a true real life workload
  • written in Java
  • from the beginning since it creation was reported as very sensible to the index lock contention in InnoDB
  • for this reason was modified to have partitions in the most hot table (to split a single index contention)
  • this helped to improve performance, but index lock contention remained limiting..
  • the issue with InnoDB index lock was fixed only in MySQL 5.7 giving a significant performance boost in LinkBench workload


What is looking odd for me in MariaDB's test results :
  • indeed, having compression feature within any database engine is very valuable
  • however, compression by itself cannot improve an overall performance (except if you have some limitation on other levels, like slow storage, etc.)..
  • and on extremely fast Fusion-IO flash storage there is no doubt that performance result will be better when compression is not used, and it's exactly what we see in the presented results..
  • but then it's hard to believe for me that without using any compression MySQL 5.7 which is having the fix for InnoDB index lock contention is running slower than MariaDB 10.1 which is based on InnoDB from MySQL 5.6 and not having such a fix..

Well, I may be wrong as well, but the following is what I'm observing on LinkBench workload on my system.

Configuration :
  • Linux box with 40cores-HT 2.3Ghz running Oracle Linux 6.5
  • Fusion-IO flash storage using NVMFS filesystem
  • LinkBench database of 150G

Tuning :
  • let's start with a Buffer Pool (BP) = 50% of database size to follow MariaDB test conditions (so, 75G in my case)
  • I'm usually using 32 BP instances
  • as BP will not be able to keep the whole data set, we may expect constant IO reads
  • so a high activity on LRU to expect + LRU flushing
  • means LRU depth should be tuned well, I'll set it to 4000 (not too big, not too small, as it's per BP instance, this will give me 32 x 4000 free pages amount to expect which should cover a potential page IO read speed from Fusion-IO)
  • REDO log size is also important, as a bigger REDO is allowing to delay page flushing and keep flushing/checkpoint activity more smooth -- 12GB REDO here seems to be enough
  • adaptive hash index (AHI) -- in most RW workloads it's better to have disabled (and even in some RO), but curiously here has no impact at all..
  • using AIO + O_DIRECT is the must
  • all other setting is just following general "best practices" for most of workloads..
  • Performance Schema (PFS) is enabled + mutex instrumentation too



TEST #1 : LinkBench-150G 64users with 75GB Buffer Pool

There are 5 test cases executed in the following order:
  • #1 - MySQL 5.7 no doublewrite (innodb_flush_log_at_trx_commit=1, checksums crc32, innodb_doublewrite=0)
  • #2 - MySQL 5.7 secure ((innodb_flush_log_at_trx_commit=1, with checksums crc32, innodb_doublewrite=1)
  • #3 - MariaDB 10.1 no doublewrite (same options as MySQL 5.7 in #1)
  • #4 - MariaDB 10.1 secure
  • #5 - MariaDB 10.1 with atomic writes (without doublewrite, but atomic writes instead)

Each test case was running during 30min (if was several hours initially, but then I've decided to reduce the test duration time --because once the load became stable on the beginning, then not too much is happening later, and I don't have unlimited time ahead ;-))


Observations :
  • without doublewrite MySQL 5.7 is near x2 times faster than MariaDB 10.1 (20K TPS -vs- 12K TPS)
  • within "secure" configuration MySQL 5.7 is loosing 25% in performance (20K -> 15K TPS)
  • interesting that this workload is pretty sensible to REDO flushing (sync)
  • but the main overhead is coming from a doublewrite anyway..
  • however, as you can see, it's still better than MariaDB
  • interesting to see that atomic writes in MariaDB are not lowering too much TPS
  • so, once atomic writes will be available in MySQL 5.7 I may expect the same 20K TPS in secure config here
  • NOTE: to avoid doublewrite overhead you may also use O_DSYNC flush setting combined with EXT4 data journaling as it was recently demonstrated by Percona

So far, why MariaDB is getting a lower TPS than MySQL 5.7 ?..

Let's get a look on what Performance Schema is reporting:

Observations :
  • MariaDB 10.1 is hitting a severe contention on the index lock
  • while MySQL 5.7 is not having it anymore
  • and that's why MariaDB is running slower on this workload..


Let's check there is nothing abnormal going with page flushing :

Observations :
  • Checkpoint Age did not reach any critical level (max 7GB from 12GB in REDO was used)
  • no LRU single page flush either, so the amount of free pages was always enough..


What is the page IO read rate in this workload?


Observations :
  • MySQL 5.7 is requiring 20K / 18K free pages/sec
  • while MariaDB 10.1 is only 10K..
  • the excessive page scanning seen in MariaDB is fixed in MySQL 5.7 (still remains in MySQL 5.6, but in fact not impacting a final performance)

Well, all observed is confirming my initial worry.

Now, let's see how both engines will perform within "expected" test conditions (Facebook Team is mentioning in LinkBench notes that the data set should be bigger at least x10 times than a Buffer Pool size to reproduce their environment).. -- and for sure, having only 16GB Buffer Pool within the same workload will involve much more page IO reads, where storage IO level will play a huge role.


TEST #2 : LinkBench-150G 64users with 16GB Buffer Pool


Observations :
  • without a doublewrite MySQL 5.7 is out passing MariaDB 10.1 by x2.5 times (10K TPS -vs- 4K TPS)..
  • in a "secure" mode MySQL 5.7 is still x1.5 times better (over 6K TPS -vs- 4K TPS)
  • NOTE: I've also replayed the MariaDB test with using its multithreaded flushing feature (innodb_mtflush_threads=16, innodb_use_mtflush=1) -- this is helping to reach 4500-5000 TPS, but still not enough to attend the MySQL 5.7 level..

What about internal contentions?

Observations :
  • MySQL 5.7 is blocked mainly by fil_system mutex and log_sys mutex contentions
  • fils_system mutex contention is killing on all IO-bound workloads (and that's why testing Read-Only workloads is important too to better understand the problems on Read+Write ;-))
  • bunch of other lock contentions is blocking MariaDB here..

Any flushing related issues?


Observations :
  • Checkpoint Age did not reach even 2GB
  • no LRU single page flush either
  • all ok..

What about page IO read rate?

Observations :
  • free page rate is much higher, but not enormous (I've already observed 70K-80K pages/sec on this server with Fusion-IO on IO-bound workloads)
  • page rate on MariaDB is lower than on MySQL 5.7


Well, yet another reason why MySQL 5.7 just rocks and will be the next the best ever MySQL release :-)

I'm not pretending either the config setting I'm using and workload conditions are the most optimal here. I'm just sharing what I'm observing and curious for any input/idea about how this workload performance could be improved..

Few other notes:
  • Compression : indeed, compression on IO level is much more simple an optimal for any application (not only databases), and more and more will come over a time for sure with a goal to reach as max as possible a performance level of "uncompressed" workload..
  • LinkBench : seems to me needs some more love and more testing.. - would be happy to discuss with developers and any other observations related to this workload. It give me an impression to be "self limited" as with 64 concurrent users CPU is not really used, looks like there are some internal waits are happening during processing. I've also suspected garbage collection time on JVM level, but monitoring JVM stats showed that it's not so.. Open to any discussion ;-)

MySQL configuration setting I've used :
[mysqld]
  max_connections=4000

# myisam
  key_buffer_size=4000M
  ft_max_word_len = 16
  low_priority_updates=1

# general
  table_open_cache = 8000
  table_open_cache_instances=16
  back_log=1500
  query_cache_type=0

# files
  innodb_file_per_table
  innodb_log_file_size=1024M
  innodb_log_files_in_group=12
  innodb_open_files=4000

# buffers
  innodb_buffer_pool_size= 75000M / 16000M
  innodb_buffer_pool_instances=32
  innodb_log_buffer_size=64M

# tune
  innodb_checksums=1
  innodb_checksum_algorithm=crc32
  innodb_doublewrite= 0 / 1 
  innodb_support_xa=0
  innodb_thread_concurrency=0
  innodb_flush_log_at_trx_commit=1
  innodb_flush_method=O_DIRECT
  innodb_max_dirty_pages_pct=90
  innodb_max_dirty_pages_pct_lwm=10
  innodb_lru_scan_depth=4000
  innodb_page_cleaners=4

  join_buffer_size=32K
  sort_buffer_size=32K
  innodb_use_native_aio=1
  innodb_stats_persistent = 1
  innodb_spin_wait_delay=6

# perf special
  innodb_adaptive_flushing = 1
  innodb_flush_neighbors = 0
  innodb_read_io_threads = 16
  innodb_write_io_threads = 16
  innodb_io_capacity=15000
  innodb_purge_threads=4
  innodb_max_purge_lag_delay=30000000
  innodb_max_purge_lag=1000000
  innodb_adaptive_hash_index=0

# Monitoring
  innodb_monitor_enable = '%'
  performance_schema=ON
  performance_schema_instrument='%sync%=on'

# MariaDB specific:
  innodb_compression_algorithm=0
#  innodb_mtflush_threads=16
#  innodb_use_mtflush=1
# with atomic writes :
  innodb_use_fallocate=1 
  innodb_use_atomic_writes=1



Any comments are welcome! ;-)

Rgds,
-Dimitri
Posted by Dimitri at 17:44 - Comments...
Categories: MySQL