Friday, 15 August, 2014
Reading the article published on the MariaDB site about the performance
evaluation of MySQL 5.7 -vs- MariaDB 10.1, I've got a mixed feeling:
on the one hand I have no reason to doubt the obtained results,
and on the other hand my feeling tells me there is something odd
going on.. But well, I don't have any intention to blame anybody (and sorry
if you think so) -- my wish here is only to understand what is going
on, why, and what the challenges are on the observed test workload(s).
I'll cover here only my observations on the LinkBench workload, as it has
not yet been tested enough and raises many questions..
LinkBench Benchmark :
- developed by the Facebook Team, so it represents a true real-life workload
- written in Java
- from its very beginning it was reported as very sensitive to the index lock contention in InnoDB
- for this reason it was modified to have partitions in the hottest table (to split the contention on a single index)
- this helped to improve performance, but index lock contention remained the limiting factor..
- the issue with the InnoDB index lock was fixed only in MySQL 5.7, giving a significant performance boost on the LinkBench workload
What looks odd to me in MariaDB's test results :
- indeed, having a compression feature within any database engine is very valuable
- however, compression by itself cannot improve overall performance (except if you have some limitation on other levels, like slow storage, etc.)..
- and on extremely fast Fusion-IO flash storage there is no doubt that the performance result will be better when compression is not used, and that's exactly what we see in the presented results..
- but then it's hard for me to believe that, without any compression, MySQL 5.7 (which has the fix for the InnoDB index lock contention) is running slower than MariaDB 10.1 (which is based on InnoDB from MySQL 5.6 and does not have this fix)..
Well, I may be wrong as well, but the following is what I'm observing on the LinkBench workload on my system.
- Linux box with 40cores-HT 2.3GHz running Oracle Linux 6.5
- Fusion-IO flash storage using the NVMFS filesystem
- LinkBench database of 150G
- let's start with a Buffer Pool (BP) = 50% of the database size to follow the MariaDB test conditions (so, 75G in my case)
- I'm usually using 32 BP instances
- as the BP will not be able to keep the whole data set, we may expect constant IO reads
- so we should expect a high activity on LRU + LRU flushing
- which means the LRU depth should be tuned well; I'll set it to 4000 (not too big, not too small; as it's per BP instance, this gives 32 x 4000 free pages to expect, which should cover the potential page IO read rate from Fusion-IO)
- REDO log size is also important, as a bigger REDO allows page flushing to be delayed and keeps flushing/checkpoint activity smoother -- 12GB REDO seems to be enough here
- adaptive hash index (AHI) -- in most RW workloads it's better to have it disabled (and even in some RO ones), but curiously it has no impact at all here..
- using AIO + O_DIRECT is a must
- all other settings just follow general "best practices" for most workloads..
- Performance Schema (PFS) is enabled + mutex instrumentation too
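The "32 x 4000" reasoning above can be checked with a few lines of arithmetic. This is only a rough sketch: the 16KB page size is my assumption (the InnoDB default), it is not stated in the list above, and the variable names are just illustrative.

```python
# Back-of-the-envelope check of the free-page budget described in the list above.
BP_INSTANCES = 32             # number of Buffer Pool instances (from the post)
LRU_DEPTH = 4000              # LRU depth, applied per BP instance (from the post)
PAGE_SIZE_BYTES = 16 * 1024   # ASSUMPTION: default 16KB InnoDB page size


def free_page_budget(instances: int, depth: int) -> int:
    """Total number of free pages to expect across all BP instances."""
    return instances * depth


pages = free_page_budget(BP_INSTANCES, LRU_DEPTH)
budget_mib = pages * PAGE_SIZE_BYTES / (1024 * 1024)

print(f"free pages to expect: {pages} (~{budget_mib:.0f} MiB with 16KB pages)")
```

So under this assumption the setting corresponds to 128,000 pages, which is the number to compare with the page IO read rate the storage is able to deliver.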
TEST #1 : LinkBench-150G 64users with 75GB Buffer Pool
There are 5 test cases executed in the following order:
- #1 - MySQL 5.7 no doublewrite (innodb_flush_log_at_trx_commit=1, checksums crc32, innodb_doublewrite=0)
- #2 - MySQL 5.7 secure (innodb_flush_log_at_trx_commit=1, checksums crc32, innodb_doublewrite=1)
- #3 - MariaDB 10.1 no doublewrite (same options as MySQL 5.7 in #1)
- #4 - MariaDB 10.1 secure
- #5 - MariaDB 10.1 with atomic writes (without doublewrite, but atomic writes instead)
Each test case was running for 30min (it was several hours initially, but then I decided to reduce the test duration -- because once the load becomes stable at the beginning, not much happens later, and I don't have unlimited time ahead ;-))
- without doublewrite MySQL 5.7 is about x1.7 faster than MariaDB 10.1 (20K TPS -vs- 12K TPS)
- within the "secure" configuration MySQL 5.7 is losing 25% in performance (20K -> 15K TPS)
- it's interesting that this workload is pretty sensitive to REDO flushing (sync)
- but the main overhead is coming from the doublewrite anyway..
- however, as you can see, it's still better than MariaDB
- interesting to see that atomic writes in MariaDB are not lowering TPS too much
- so, once atomic writes are available in MySQL 5.7, I may expect the same 20K TPS in the secure config here
- NOTE: to avoid the doublewrite overhead you may also use the O_DSYNC flush setting combined with EXT4 data journaling, as was recently demonstrated by Percona
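To make the ratios behind these numbers explicit, here is a small sketch using only the rounded TPS figures quoted in the list above (the dictionary keys and helper names are mine, chosen for illustration):

```python
# Rounded TPS figures as quoted in the list above.
TPS = {
    "mysql57_no_doublewrite": 20_000,
    "mysql57_secure": 15_000,
    "mariadb101_no_doublewrite": 12_000,
}


def speedup(a: float, b: float) -> float:
    """How many times higher throughput a is compared to b."""
    return a / b


def loss_pct(before: float, after: float) -> float:
    """Relative throughput loss in percent when going from before to after."""
    return (before - after) / before * 100.0


no_dw_ratio = speedup(TPS["mysql57_no_doublewrite"], TPS["mariadb101_no_doublewrite"])
secure_loss = loss_pct(TPS["mysql57_no_doublewrite"], TPS["mysql57_secure"])

print(f"MySQL 5.7 -vs- MariaDB 10.1 without doublewrite: x{no_dw_ratio:.2f}")
print(f"MySQL 5.7 loss with doublewrite enabled: {secure_loss:.0f}%")
```

As the input values are rounded, the resulting ratios should be read as approximations too.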
So, why is MariaDB getting a lower TPS than MySQL 5.7 ?..
Let's take a look at what Performance Schema is reporting:
- MariaDB 10.1 is hitting a severe contention on the index lock
- while MySQL 5.7 does not have it anymore
- and that's why MariaDB is running slower on this workload..
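For those who want to look at the same kind of data themselves, below is a minimal sketch of how the top synchronization waits can be ranked from Performance Schema. It is not the tooling used for the graphs in this post. The query reads the standard events_waits_summary_global_by_event_name table; the sample rows are made-up values just to show the output shape, and the exact instrument names may differ between versions.

```python
# Sketch: rank synchronization waits reported by Performance Schema.
# The query can be sent with any MySQL client; here it is only kept as a string.
TOP_WAITS_SQL = """
SELECT EVENT_NAME, COUNT_STAR, SUM_TIMER_WAIT
  FROM performance_schema.events_waits_summary_global_by_event_name
 WHERE EVENT_NAME LIKE 'wait/synch/%'
   AND COUNT_STAR > 0
 ORDER BY SUM_TIMER_WAIT DESC
 LIMIT 10
"""

PICOSECONDS_PER_SECOND = 10**12  # PFS timer columns are expressed in picoseconds


def format_waits(rows):
    """Turn (event_name, count, sum_timer_wait_ps) rows into readable lines."""
    lines = []
    for name, count, wait_ps in rows:
        seconds = wait_ps / PICOSECONDS_PER_SECOND
        lines.append(f"{name:<50} {count:>12} {seconds:>10.2f}s")
    return lines


# MADE-UP rows for illustration only (not measured values from this test).
sample_rows = [
    ("wait/synch/rwlock/innodb/index_tree_rw_lock", 1_000_000, 3_500_000_000_000),
    ("wait/synch/mutex/innodb/log_sys_mutex", 400_000, 500_000_000_000),
]

for line in format_waits(sample_rows):
    print(line)
```

Keep in mind that the mutex/rwlock instruments have to be enabled (as in the config used here) to get any rows back.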
Let's check that there is nothing abnormal going on with page flushing :
- Checkpoint Age did not reach any critical level (max 7GB out of the 12GB REDO was used)
- no LRU single page flush either, so the amount of free pages was always sufficient..
What is the page IO read rate in this workload?
- MySQL 5.7 is requiring 20K / 18K free pages/sec
- while MariaDB 10.1 needs only 10K..
- the excessive page scanning seen in MariaDB is fixed in MySQL 5.7 (it still remains in MySQL 5.6, but in fact does not impact the final performance)
Well, everything observed confirms my initial worry.
Now, let's see how both engines perform within the "expected" test conditions (the Facebook Team mentions in the LinkBench notes that the data set should be at least x10 bigger than the Buffer Pool size to reproduce their environment).. -- and for sure, having only a 16GB Buffer Pool within the same workload will involve many more page IO reads, where the storage IO level will play a huge role.
TEST #2 : LinkBench-150G 64users with 16GB Buffer Pool
- without doublewrite MySQL 5.7 is outperforming MariaDB 10.1 by x2.5 (10K TPS -vs- 4K TPS)..
- in "secure" mode MySQL 5.7 is still x1.5 better (over 6K TPS -vs- 4K TPS)
- NOTE: I've also replayed the MariaDB test using its multithreaded flushing feature (innodb_mtflush_threads=16, innodb_use_mtflush=1) -- this helps to reach 4500-5000 TPS, but it's still not enough to reach the MySQL 5.7 level..
What about internal contentions?
- MySQL 5.7 is blocked mainly by fil_system mutex and log_sys mutex contentions
- fil_system mutex contention is a killer on all IO-bound workloads (and that's why testing Read-Only workloads is important too, to better understand the problems on Read+Write ;-))
- a bunch of other lock contentions are blocking MariaDB here..
Any flushing related issues?
- Checkpoint Age did not reach even 2GB
- no LRU single page flush either
- all ok..
What about page IO read rate?
- the free page rate is much higher, but not enormous (I've already observed 70K-80K pages/sec on this server with Fusion-IO on IO-bound workloads)
- the page rate on MariaDB is lower than on MySQL 5.7
Well, yet another reason why MySQL 5.7 just rocks and will be the next best ever MySQL release :-)
I'm not pretending that the config settings and workload conditions I'm using are the most optimal here. I'm just sharing what I'm observing, and I'm curious about any input/idea on how this workload performance could be improved..
A few other notes:
- Compression : indeed, compression on the IO level is much simpler and more optimal for any application (not only databases), and more and more of it will come over time for sure, with the goal of getting as close as possible to the performance level of an "uncompressed" workload..
- LinkBench : it seems to me it needs some more love and more testing.. - I would be happy to discuss it with its developers, as well as any other observations related to this workload. It gives me the impression of being "self-limited": with 64 concurrent users the CPU is not really used, and it looks like some internal waits are happening during processing. I also suspected garbage collection time on the JVM level, but monitoring JVM stats showed that this is not the case.. Open to any discussion ;-)
MySQL configuration settings I've used :
[mysqld]
max_connections=4000
# myisam
key_buffer_size=4000M
ft_max_word_len = 16
low_priority_updates=1
# general
table_open_cache = 8000
table_open_cache_instances=16
back_log=1500
query_cache_type=0
# files
innodb_file_per_table
innodb_log_file_size=1024M
innodb_log_files_in_group=12
innodb_open_files=4000
# buffers
innodb_buffer_pool_size= 75000M / 16000M
innodb_buffer_pool_instances=32
innodb_log_buffer_size=64M
# tune
innodb_checksums=1
innodb_checksum_algorithm=crc32
innodb_doublewrite= 0 / 1
innodb_support_xa=0
innodb_thread_concurrency=0
innodb_flush_log_at_trx_commit=1
innodb_flush_method=O_DIRECT
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10
innodb_lru_scan_depth=4000
innodb_page_cleaners=4
join_buffer_size=32K
sort_buffer_size=32K
innodb_use_native_aio=1
innodb_stats_persistent = 1
innodb_spin_wait_delay=6
# perf special
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_read_io_threads = 16
innodb_write_io_threads = 16
innodb_io_capacity=15000
innodb_purge_threads=4
innodb_max_purge_lag_delay=30000000
innodb_max_purge_lag=1000000
innodb_adaptive_hash_index=0
# Monitoring
innodb_monitor_enable = '%'
performance_schema=ON
performance_schema_instrument='%sync%=on'
# MariaDB specific:
innodb_compression_algorithm=0
# innodb_mtflush_threads=16
# innodb_use_mtflush=1
# with atomic writes :
innodb_use_fallocate=1
innodb_use_atomic_writes=1
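As a quick sanity check of the "12GB REDO" mentioned earlier against these settings, here is a small sketch that parses the two relevant lines and multiplies them. The parser is deliberately simplistic and only meant for this illustration; the helper names are mine.

```python
import re

# Only the two REDO-related settings, copied from the configuration above.
CONFIG_FRAGMENT = """
# files
innodb_log_file_size=1024M
innodb_log_files_in_group=12
"""

UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(value: str) -> int:
    """Convert a MySQL-style size such as '1024M' into bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", value.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    number, unit = match.groups()
    return int(number) * UNITS[unit]


def parse_settings(text: str) -> dict:
    """Collect key=value pairs, ignoring '#' comments and blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


settings = parse_settings(CONFIG_FRAGMENT)
redo_bytes = parse_size(settings["innodb_log_file_size"]) * int(
    settings["innodb_log_files_in_group"]
)
redo_gib = redo_bytes / 1024**3

print(f"total REDO space: {redo_gib:.0f} GiB")
```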
Any comments are welcome! ;-)
Thursday, 15 May, 2014
The next MySQL TechDay is taking place in Paris on 22 May (next week!!!) - if you're a MySQL lover and will be in the Paris area that day - hurry up to register on the event page and attend it - trust me, you won't regret it ;-))
We're continuing to follow our TechDay tradition:
- the event is completely free (but places are limited, so you have to be registered to attend)
- the content is purely technical and comes directly from Oracle engineering, no marketing ;-)
- this is a true full-day event, and we're reserving enough time to go in depth on each presented topic..
- the event is taking place in the Oracle office, in a pretty wide and comfortable amphitheater covered by WiFi, so you may tweet live about #mysqltechday and remain "connected" if this is a part of your constraints ;-)
- we're starting at 10:30 to let you arrive "stressless" regardless of traffic issues and distance (we know from previous experience that many people are arriving from different places in France, far away from Paris, and some will even come from other countries! - Brussels, London, Birmingham, Dublin are already on our map for now ;-))
- for those who arrive earlier, hot coffee with some sweets will already be waiting from 10:00 as a bonus ;-)
- note: if you're arriving via public transportation, keep in mind there is a direct tram going to the Oracle office from La Defense station (15min and you're there)..
and to finish with the organization points:
- around 13:00 we'll have lunch in the Oracle enterprise restaurant,
- around 15:30 a coffee break
- and around 17:30 we're expecting to finish (and let you get home in the same "stressless" conditions ;-))
And now about the content..
very briefly we'll provide you an overview of the latest tech news
from the MySQL Team
then, as promised at the last TechDay, I'll tell you the whole story
about heavy OLTP workloads:
- In-Memory and IO-bound, Read-Only and Read+Write..
- their problems, solutions, workarounds, and improvements already made in MySQL 5.7
- a long and hard work was done since then; the results are surprising and amazing at the same time - and there are still many questions remaining without an answer.. ;-)
- and, as promised, this time with a full deep dive into InnoDB internals -- we'll dig in detail into the whole story of InnoDB flushing and purge: what was wrong before MySQL 5.5, what remained wrong in 5.5, what was improved in 5.6, and what was redesigned and probably fixed in 5.7 -- how read-on-write issues were resolved, why parallel + improved flushing was implemented, what can go wrong and how to tune LRU flushing -- I'll tell you ALL ;-))
I'll have 2 hours to tell you the whole story, so be sure, it'll
be worth your time ;-)
then, again, as promised, we'll have Mark LEITH as our special guest
during this event!
- last time Mark was unable to come due to unexpected "management issues"..
- while this time we fixed all issues ahead, and are just crossing fingers now for the flying conditions, as Mark will fly from the UK to Paris the same day ;-)
- if you did not attend any of Mark's talks before, I'd present him as a "Practical MySQL Performance Schema Magician" !! ;-)
- Performance Schema (PFS) is a gold mine of various valuable information about your MySQL instance
- while entering any huge gold mine you may feel a little bit lost.. ;-)
- but Mark will show you how easy it is to find your way there in practice, and how powerful the solutions built around PFS can be..
- Mark will also present his "ps_helper" - a collection of scripts he made to simplify practical PFS usage - this is really great stuff, I'd compare it to what DTrace Toolkit did for DTrace -- you may use many scripts straight away as they are, then learn by example and create many new ones adapted explicitly to what you need, etc..
- and trust me, some examples will really surprise you about how deep you may go with PFS ;-)
- the best will be if you come with your laptop with the latest MySQL 5.7 (or 5.6) installed on it and play with the presented stuff by yourself..
- BTW, ps_helper is now fully integrated within 5.7 and is part of the "sys" schema
- as well, if you are not already doing so, think about using MySQL Workbench 6.1+ : while this GUI tool is simply great for many general DBA tasks, since v.6.1 it also offers a very helpful interface to discover, query and configure PFS via GUI.. - the tool is free and can be downloaded from here: http://dev.mysql.com/downloads/tools/workbench/ (Linux, MacOSX and Windoze versions)
- and of course Mark will speak about the latest MySQL Enterprise Monitor (MEM) version - it's now fully using PFS in its metrics and the result is really amazing.. - Mark will tell you all about it and show you a live demo, and if you want to try it hands-on - you may start from here: http://www.mysql.com/products/enterprise/monitor.html (the tool is not free, but has a long enough trial period to try)..
Well, I hope you'll have a lot of fun and a lot of food for your brain this day! ;-))
See you there!
And also think ahead about other tech topics you'd be happy to see covered at the next MySQL TechDay..
Useful event links:
Wednesday, 02 April, 2014
The next MySQL 5.7 milestone release is available and it just rocks! ;-)
a few benchmark results to see where we are today, comparing:
- MySQL 5.7 / 5.6 / 5.5
- Percona Server 5.6 / 5.5
- MariaDB 10 / 5.5
for all engines the latest available versions were used; the data set fits in memory, so the main focus here is on the internal contentions: already fixed for some engines, or still remaining for others ;-)
Sysbench OLTP_RO 8-tables :
Sysbench OLTP_RO Point-Selects 8-tables :
Sysbench OLTP_RW 8-tables :
All details about these benchmark results and others (IO-bound OLTP_RW, Uniform & Pareto, DBT2, LinkBench) - I'll present during my talk tomorrow at PerconaLive 2014. I will also cover: all internals of the InnoDB flushing design and how we're improving it yet more in 5.7, the fixed and pending issues we have for today, the impact of InnoDB purge, filesystem and flash storage choices on IO-heavy workloads, and several still pending unexplained mysteries around.. - prepare your brain for some storming ;-)
If you're attending Percona Live, don't miss the following talks from MySQL Team :
- A keynote by Tomas Ulin this morning
- All New InnoDB stuff in MySQL 5.7 by Sunny Bains
- All MySQL 5.7 Replication news by Luis Soares
- All MySQL 5.7 Sharding with Fabric News by Mats Kindahl
- All about MySQL 5.7 Core Server Improvements by Morgan Tocker
- And awesome stuff around MySQL 5.7 Performance Schema by Mark Leith
as well, we'll all be present here this evening during our "Meeting MySQL Team" BOF session to answer any questions and discuss any issues you have - your valuable feedback helps us to make MySQL yet better! - and nothing is better here than live and fair face-to-face discussions.. - so, don't miss it ;-)
Also, to get an overview of all the new features and improvements coming within this 5.7 milestone release - you may find a lot of interesting information by reading Geir's article - http://mysqlserverteam.com/the-mysql-5-7-4-milestone-release-is-available/
Friday, 31 January, 2014
February seemed so far away.. - and finally it's just tomorrow, starting
with the MySQL & Friends Dev Room at FOSDEM 2014 in Brussels. I have a
talk about "Starting
with MySQL PERFORMANCE SCHEMA" - in fact I would rather call it
"Using PFS with zero configuration" ;-) -- many people think
PFS is complicated, while in reality it's very simple, and just needs
a little bit of love ;-) Since MySQL 5.6 PFS is enabled by default,
and as a result there are several very useful instrumentation
stats available out-of-the-box, and my talk will be about them..
Of course I'll speak about MySQL Performance as well, so feel free to ask any questions about it.
Also, don't miss talks from our MySQL Team :
See you all there!
(and hurry up to not miss the MySQL & Friends Community Dinner - only a few places left)
UPD: my slides are here - MySQL_PFS_2014-dim.pdf
Friday, 22 November, 2013
Last week, during Tomas' keynote
at the MySQL Percona Live
Conference in London, we announced as one of the "previews" of the
upcoming MySQL 5.7 release(s) a result of over 1,000,000 Query/sec
obtained with the InnoDB
Memcached plugin on a Read-Only workload. This article is just
to confirm the announced results without going into too many details..
In fact we have no idea yet today what exactly the scalability and performance limits of this solution are.. The huge gain in performance was possible here due to the initial overall speed-up made recently in MySQL 5.7, which let us reach 500K QPS on a "normal" SQL Read-Only workload. Then yet more improvements in the InnoDB Memcached Plugin code became possible and came just naturally. Especially since the Facebook Team challenged us here pretty well by describing all the performance limitations they are hitting in their workloads. As well, Facebook provided us a test case workload which we successfully used to improve our code even more. And finally the same test case was used to obtain the following benchmark results ;-)
The test was executed in "standalone" mode (both server and client are running on the same server). So, we used the biggest HW box we have in the LAB - a 48cores machine. This server was able to point us very quickly to any existing or potential performance issues and bottlenecks (and what is interesting is that most of them were now in the memcached code itself). However, the Query/sec rate (QPS) depends a lot here on memory latency and CPU frequency, while this server has only 2GHz CPU cores, so on faster HW you may expect even better results ;-)
Now, comparing best-to-best QPS results obtained on this server, we have the following :
and for people who prefer 2D charts :
I've placed "MySQL 5.6" in the legend, while the true label should rather be "the best result we observed until now" ;-)) -- because some part of the Memcached code improvements will be back-ported to MySQL 5.6 as well, so we may expect to see the next 5.6 releases running better here too. However - only with the MySQL 5.7 code base will you be able to go really high..
During my talk at Percona Live in London I've also presented the following graphs - the Memcached QPS corresponds here to the InnoDB "dml_reads/sec" stats :
There are 4 tests on these graphs representing the "previous" MySQL code running on the Memcached workload :
- #1 - running on 48cores as is.. - we're hitting severe contentions related to the MVCC code (which were fixed in the latest MySQL 5.7)..
- #2 - limiting MySQL server to run on 16cores only to lower this contention.. - and then hitting transaction related contentions (which were also fixed in the latest MySQL 5.7 code)..
- #3 - tuning the memcached plugin to keep several reads within a single internal transaction -- helps, but we're hitting other contentions..
- #4 - limiting MySQL server to run on 8cores to see if contentions may be lowered -- indeed, the max peak QPS becomes higher (on 32 users), but overall performance is worse..
While on the latest MySQL 5.7 code things are looking completely different :
There are 2 tests on these graphs:
- #1 - is running on 48cores as is (no comments ;-))
- #2 - is using the "tuning" option to keep several reads within a single internal transaction - just slightly better on the peak max QPS, otherwise no significant difference anymore..
And to really feel the gap in obtained QPS, let's bring them all together on the same graph :
As you can see, the difference is more than impressive ;-))
- all the curves on the left part of the graph represent QPS levels obtained on the "previous" MySQL 5.6 / 5.7 code..
- then, the last curves on the right part - with the latest MySQL 5.7 code..
So, work is still in progress, and I'll let Sunny and Jimmy provide you with all the deep details about this huge step forward we made in the latest MySQL 5.7 release!
I don't know what the performance limit will be here.. Probably only the HW level.. And I don't know if we'll have big enough HW to see it ;-) -- currently via a single 1Gbit network link we already observed over 700K QPS, and while the limitation here comes from a single network link, the main troubles are coming from client processing rather than the server.. - so, it seems like Memcached @InnoDB is now scaling way better compared to the "original" Memcached itself ;-) -- then, what kind of performance may be expected when several network links are used (or simply faster network cards) -- there is still a lot to discover! and RW workload performance will be yet another challenge as well ;-)
Kudos to Sunny and Jimmy! And my special thanks to Yoshinori (Facebook)! - I think this is an excellent example where common work on a given problem provides a fantastic final result for all MySQL users!..
If you need some details about the Memcached Plugin design - you may start your reading from here: https://blogs.oracle.com/MySQL/entry/nosql_memcached_api_for_mysql - and then, keeping in mind all the results presented here, I'll let you imagine what kind of performance you may expect if data is accessed directly via the "native" InnoDB API, by-passing the Memcached level.. ;-))
Friday, 08 November, 2013
I'm speaking next week at Percona
Live in London. My
talk will be mostly about MySQL Performance (of course),
covering major MySQL 5.6 improvements and the latest findings made in
MySQL 5.7 as of today. Percona Server 5.6 and MariaDB 10 will not be
missed as well.. - we're living in very interesting times, and
the performance topic is more amazing today than ever ;-)
As usual, I'll tell you "one more thing" about the latest MySQL Performance news that you cannot read or find anywhere else.. - so, stay tuned ;-)
Thursday, 10 October, 2013
As promised, my slides from MySQL Connect talks are now uploaded to the
#OOW13 site and also available from here:
Thanks to all who attended MySQL Connect this year, for all the great talks, discussions, suggestions and your support!
Friday, 04 October, 2013
This article provides the details of the 500K
QPS with MySQL 5.7 benchmark results published earlier and explained
during my talk at MySQL Connect..
Looking back at the history of MySQL / InnoDB improvements, you may easily realize that there was never such a big speed-up between releases in read-only (RO) performance as the one that came with MySQL 5.6. And it's easy to understand as well that without a good level of scalability on read-only, it's hard to expect to reach a good level on read+write (especially when reads are dominating ;-))
However, we were so happy with RO performance in MySQL 5.6 that with 5.7 we turned our main focus to read+write (RW), because on big data volumes things are not yet going as we wish.. But the RW dependency on RO showed up once again, and then the InnoDB team came with a few more improvements to push the max QPS level in 5.7 yet further.
But let's take things in order..
In fact, in MySQL 5.6 the dominating internal contentions on RO workloads are:
- Single table used only: MDL, trx_sys and lock_sys (InnoDB)
- Many tables used: trx_sys and lock_sys (so, InnoDB mainly)
For MySQL 5.6 our best obtained result was around 250K QPS, which at that time was the best ever seen on MySQL / InnoDB with SQL queries ;-)
However, it was only possible to achieve if READ-ONLY TRANSACTIONS were used (a new feature in MySQL 5.6), and again, only if AUTOCOMMIT=1 was used, otherwise CPU cycles are simply wasted on processing START TRANSACTION / COMMIT statements, and you're losing in overall performance for sure..
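To illustrate the overhead mentioned here, a tiny sketch comparing what the server has to process per lookup in each mode. The SELECT text and table name are my own illustrative placeholders (not the exact statement from the tests); START TRANSACTION READ ONLY is the MySQL 5.6 syntax for an explicit read-only transaction.

```python
# Illustrative point-select; the table and column names are placeholders.
POINT_SELECT = "SELECT c FROM sbtest1 WHERE id = 42"


def statements_for_lookup(mode: str) -> list:
    """Statements the server has to process for one point-select lookup."""
    if mode == "autocommit":
        # AUTOCOMMIT=1: the single SELECT is the whole transaction.
        return [POINT_SELECT]
    if mode == "explicit_ro_trx":
        # Explicit read-only transaction: two extra statements per lookup.
        return ["START TRANSACTION READ ONLY", POINT_SELECT, "COMMIT"]
    raise ValueError(f"unknown mode: {mode!r}")


for mode in ("autocommit", "explicit_ro_trx"):
    stmts = statements_for_lookup(mode)
    print(f"{mode}: {len(stmts)} statement(s) per lookup")
```

With one SELECT per transaction, the explicit variant makes the server process three statements to deliver the same single row, which is where the wasted CPU cycles come from.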
So, the very first improvement introduced within MySQL 5.7 was an auto-discovery of RO transactions (in fact any InnoDB transaction is considered read-only until a DML is involved in it).. - this simplified a lot the RO transactions feature itself, and life for users and developers too - no need to care anymore whether the RO transaction statement was used or not. But again, you cannot reach the max potential QPS on MySQL with this feature, as CPU time is still wasted on processing transaction begin / end statements..
At the same time Percona came with a different solution to "transaction list" (TRX-list) management, which as a result lowers the trx_sys mutex contention within InnoDB. Percona's solution sustained a high load better on the Point-Selects with transactions workload, but MySQL 5.7 was yet better here (though I was not able to publish the 5.7 results as its code was not yet public).. So, at least I'm able to do it now :
- the same RO Point-Select-TRX test (using transactions) on 8-tables is executed on MySQL 5.6, Percona 5.5, and then MySQL 5.7 (results are from May.2013)
- and as you can see, we're far here from the peak 250K QPS obtained on the same 16cores-HT configuration..
- MySQL 5.6 is hitting a contention on the trx_sys mutex and QPS is decreasing starting from 64usr..
- Percona 5.5 is sustaining the load longer, and the QPS decrease starts only from 512usr..
- while MySQL 5.7 at that time already kept its QPS without any decrease at all (even with a higher number of concurrent users, which you don't see on these graphs)..
However, it's clear that if one wants to get the MAX potential RO QPS rate with MySQL - transactions should be avoided..
So, let's see where our MAX QPS was in May-2013.
Same Point-Select 8-tables test, but without transactions @MySQL 5.6 :
- the test is executed keeping MySQL 5.6 running on 16cores only, then on 16cores-HT, then 32cores, and 32cores-HT
- as you can see, the Max QPS is even bigger than expected - it's 275K QPS on MySQL 5.6
- and the Max result is reached on 16cores-HT
- while the result on 32cores is not better than on 16cores-HT (due to internal contentions, which are better managed here by having 2 CPU threads on the same core - so the true concurrency remains on the level of 16 threads, and not 32 as it would be on 32cores)..
While the same test on MySQL 5.7 looked pretty different, as by that time contention on the lock_sys mutex was already lowered in 5.7, and the trx_sys mutex related code had also got its first changes:
- first of all you may see that on the same 16cores-HT configuration 5.7 was already performing better than 5.6
- then, performance improved yet more on the 32cores configuration!
- and then reached its Max of 350K QPS on the 32cores-HT config!!
- for the first time in history on this particular (and most aggressive) RO workload we got a better result on 32cores -vs- 16cores-HT, and it improved yet more with hyper-threading enabled (on 32cores-HT).. - awesome! ;-)
Compared to all we saw until then, that was really very good!! ;-)
On the other hand, it was clear as well that there was still room for improvement. Contention on trx_sys still remained. We did not use the full CPU power to do useful work (a lot of CPU cycles were still spent on lock spinning).. But the result was already better than ever, and way better than 5.6, so there was no valid reason to continue digging to improve performance here, while our main focus was on RW workloads where we had even more room for improvement..
By the end of May, during our Perf-Meeting, Sunny added a few more changes around the trx_sys mutex contention, and our Max QPS moved to 375K QPS(!) since then - should be more than enough as improvement for 5.7, right? ;-)
At the same time we continued the exchange with the Percona team, who proposed to manage the TRX-list in a different way.. - the solution looked interesting, but on the 5.5 code base it was unable to show a higher QPS, and on the 5.6 code base (once Percona Server 5.6 was available) the Max QPS was not bigger than in MySQL 5.6. However, the discussion raised a very interesting point: what will be the impact on RO performance if some RW workloads are running in parallel?.. And even if the MySQL 5.7 code was still running better within the same test conditions, the impact was very visible (you may find my analysis about it here, while, again, I was not able to present 5.7 results at that time as the code was not public yet - I may provide them in one of the next articles now)..
And as any pure RW workload was impacted here as well, there was enough motivation to finally rework the whole TRX-list related code in the way Sunny had wanted to see it for a long time.. And this experience was simply amazing! ;-)) Day after day we enjoyed seeing our QPS graphs going higher and higher.. - till we reached 440K QPS(!) on the same 32cores-HT server.
Point-Selects 8-tables @5.7-DMR2 :
No comments.. ;-))
However, there was a small mystery around.. - Sunny and I intentionally analyzed all bottlenecks and the impact of code changes via different tools, and on some of the tests, to my big surprise, Sunny observed higher QPS levels than me.. This "mystery" was related to the following factors:
- under a high load the 5.7 code is now running near the limits of the HW capacity (mainly CPU), so every instruction matters!
- and the difference between using a UNIX socket or an IP port becomes very visible!
- and Sysbench itself is using 30% of CPU time, but if the older Sysbench version is used (with a shorter code path), for the same test load it'll consume only 20% CPU, leaving an additional 10% to the MySQL server ;-)
- so, in the same test workload, by using a UNIX socket instead of an IP port + replacing Sysbench-0.4.13 with Sysbench-0.4.8 we're surpassing 500K QPS(!) - easy, right? ;-))
Let's compare "before" and "after" :
- Lowered CPU% usage by Sysbench
- resulting in a higher CPU time availability for MySQL server
- and we're hitting 500K QPS..
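To put the whole progression in one place, here is a small sketch listing the Max QPS milestones mentioned in this article for the Point-Select 8-tables test without transactions, each one expressed relative to the MySQL 5.6 result. The labels are my own short summaries, and the last value is a lower bound (the text says "over 500K").

```python
# Max QPS milestones as quoted in this article (Point-Select 8-tables, no transactions).
MILESTONES = [
    ("MySQL 5.6, 16cores-HT", 275_000),
    ("MySQL 5.7 (May 2013), 32cores-HT", 350_000),
    ("5.7 + more trx_sys changes", 375_000),
    ("5.7 with reworked TRX-list code", 440_000),
    ("5.7 + UNIX socket + Sysbench-0.4.8", 500_000),  # lower bound
]


def relative_gains(points):
    """Return (label, qps, ratio_to_first) for each milestone."""
    first = points[0][1]
    return [(label, qps, qps / first) for label, qps in points]


for label, qps, ratio in relative_gains(MILESTONES):
    print(f"{label:<38} {qps // 1000:>4}K QPS   x{ratio:.2f}")
```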
What else?.. ;-)
I may only add: kudos to Sunny & the whole MySQL Dev Team! ;-)
Let's take a look now at the Max QPS obtained on the same Point-Select 8-tables workload on other engines as well. I've used the latest ones available at the beginning of Sep.2013 :
- MySQL-5.7.2 (DMR2)
- Percona Server 5.6.13-rc60.5
- Percona Server 5.5.33-rel31.1
Each engine is tested with the following configuration variations:
- CPU taskset: 8cores-HT, 16cores, 16cores-HT, 32cores, 32cores-HT
- Concurrent Users Sessions: 8, 16, 32 ... 1024
- InnoDB Spin Wait Delay: 6, 96
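To give an idea of the size of this test matrix, here is a quick sketch enumerating all the combinations for one engine. I'm assuming that "8, 16, 32 ... 1024" means doubling the number of sessions at each step; the variable names are only illustrative.

```python
from itertools import product

CPU_TASKSETS = ["8cores-HT", "16cores", "16cores-HT", "32cores", "32cores-HT"]
# ASSUMPTION: the "..." in the list above means doubling at each step.
USER_SESSIONS = [8 * 2**i for i in range(8)]   # 8, 16, 32 ... 1024
SPIN_WAIT_DELAYS = [6, 96]

# Every (taskset, sessions, spin delay) combination tested for one engine.
matrix = list(product(CPU_TASKSETS, USER_SESSIONS, SPIN_WAIT_DELAYS))

print(f"user sessions: {USER_SESSIONS}")
print(f"test points per engine: {len(matrix)}")
```

Under this assumption each engine goes through 80 test points, and only the best one per engine is kept for the comparison.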
Then the best results from any given combination for every engine are compared -vs- the others. And I'm obtaining the graph which I've already presented in the previous article:
A few comments :
- No need to comment on the gap in the MySQL 5.7 results, it's just evident..
- then, what is interesting is that none of the engines based on the MySQL 5.5 code base got anywhere close to the MySQL 5.6 results..
- which just confirms that only once the MySQL 5.6 code base was used did Percona Server reach the 5.6 level, while MariaDB-10 is still on the road..
- so, the MySQL 5.6 code base rocks, no doubt!
- and MySQL 5.7 simply rocks twice! ;-))
And what about scalability?
The answer is simple: MySQL 5.7 is the only one scaling here.
Then, if an IP port and the "heavier" Sysbench-0.4.13 are used, the results are the following:
QPS is just slightly lower, but the tendency is exactly the same..
And scalability is pretty similar:
More results will come, so stay tuned ;-)
NOTE : for single-table-bound workloads the story is not yet good:
- decreased contention at the InnoDB level made other contentions much more visible
- and MDL contention became even more dominant when the load is bound to a single table..
- this is as expected, and should be fixed within the next DMRs..
We still have many challenges ahead of us ;-)
For reference, my HW config for the presented tests was the following:
- Server : 32cores-HT (bi-thread) Intel 2300MHz, 128GB RAM
- OS : Oracle Linux 6.2
- FS : EXT4 mounted with "noatime,nodiratime,nobarrier"
max_connections=4000
key_buffer_size=200M
low_priority_updates=1
table_open_cache = 8000
back_log=1500
query_cache_type=0
table_open_cache_instances=16
# files
innodb_file_per_table
innodb_log_file_size=1024M
innodb_log_files_in_group = 3
innodb_open_files=4000
# buffers
innodb_buffer_pool_size=32000M
innodb_buffer_pool_instances=32
innodb_additional_mem_pool_size=20M
innodb_log_buffer_size=64M
join_buffer_size=32K
sort_buffer_size=32K
# innodb
innodb_checksums=0
innodb_doublewrite=0
innodb_support_xa=0
innodb_thread_concurrency=0
innodb_flush_log_at_trx_commit=2
innodb_max_dirty_pages_pct=50
innodb_use_native_aio=1
innodb_stats_persistent = 1
innodb_spin_wait_delay= 6 / 96
# perf special
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_read_io_threads = 4
innodb_write_io_threads = 4
innodb_io_capacity = 4000
innodb_purge_threads=1
innodb_adaptive_hash_index=0
# monitoring
innodb_monitor_enable = '%'
performance_schema=OFF
In case you need them, Linux Sysbench binaries are available from here:
The Sysbench command used to run the Point-Selects test via a UNIX socket (starting 8 processes in parallel):
LD_PRELOAD=/usr/lib64/libjemalloc.so /BMK/sysbench-0.4.8 --num-threads=$1 --test=oltp --oltp-table-size=10000000 \
  --oltp-dist-type=uniform --oltp-table-name=sbtest_10M_$n \
  --max-requests=0 --max-time=$2 --mysql-socket=/SSD_raid0/mysql.sock \
  --mysql-user=dim --mysql-password=dim --mysql-db=sysbench \
  --mysql-table-engine=INNODB --db-driver=mysql \
  --oltp-point-selects=1 --oltp-simple-ranges=0 --oltp-sum-ranges=0 \
  --oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-skip-trx=on \
  --oltp-read-only=on run > /tmp/test_$n.log &
The Sysbench command used to run the Point-Selects test via an IP port (starting 8 processes in parallel):
LD_PRELOAD=/usr/lib64/libjemalloc.so /BMK/sysbench-0.4.13 --num-threads=$1 --test=oltp --oltp-table-size=10000000 \
  --oltp-dist-type=uniform --oltp-table-name=sbtest_10M_$n \
  --max-requests=0 --max-time=$2 --mysql-host=127.0.0.1 --mysql-port=5700 \
  --mysql-user=dim --mysql-password=dim --mysql-db=sysbench \
  --mysql-table-engine=INNODB --db-driver=mysql \
  --oltp-point-selects=1 --oltp-simple-ranges=0 --oltp-sum-ranges=0 \
  --oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-skip-trx=on \
  --oltp-read-only=on run > /tmp/test_$n.log &
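Both commands rely on $n (the table number) being set by a surrounding loop and end with "&", so the 8 processes run in the background. A minimal sketch of such a wrapper could look like the following; the function name `run_parallel` and its arguments are hypothetical (this is not the exact script used for the tests), and a harmless `echo` stands in for Sysbench so it can be tried without a database.

```shell
#!/bin/sh
# Hypothetical wrapper: start <procs> copies of a command in parallel,
# each with its own index in $n and its own log file, then wait for all.
#
# usage: run_parallel <procs> <logdir> <command> [args...]
run_parallel() {
    procs=$1
    logdir=$2
    shift 2
    n=1
    while [ "$n" -le "$procs" ]; do
        # pass the table index to the child via the environment variable n
        n=$n "$@" > "$logdir/test_$n.log" 2>&1 &
        n=$((n + 1))
    done
    wait   # block until every background copy has finished
}

# Stand-in for the Sysbench call: each process reports its table name
run_parallel 8 /tmp sh -c 'echo "would query sbtest_10M_$n"'
```

Each process writes to its own /tmp/test_$n.log, so the per-table results can be collected once `wait` returns.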
Saturday, 21 September, 2013
Yes, we've done it! ;-)
Tomas just announced that we've reached the 500K QPS performance level in
the OLTP_RO Point-Selects 8-tables benchmark, and I can only confirm it and
tell you a little bit more:
This is the best-to-best comparison between all the listed engines, obtained on the same 32cores-HT server that I've used in my previously published benchmark results. Same workload, same conditions, updated players. I'll provide all details about this and other test results during my talk tomorrow at the MySQL Connect conference (11:30AM), and then publish them later in another blog post..
Well, what else to say.. - MySQL 5.7 is preparing to become the next "the best ever MySQL Server release in history" ;-)
The full story is published here now : The Road to 500K QPS with MySQL 5.7
Thursday, 19 September, 2013
Heading to MySQL Connect right now.. Seems there will be a lot of interesting stuff, my prediction ;-)
This year I have 2 talks about MySQL Performance:
Performance: Benchmarks, Tuning, and “Best” Practices
Sunday, 11.30 am, Imperial Ballroom B
5.6 Performance Tuning and Best Practices
Monday, 4.00 pm, Moscone South - 236
Looking forward to exchanging ideas and learning from others.. And, of course there will be some surprises ;-))
So, stay tuned ;-)