Monday, 17 November, 2014
It was very fun for me to read last week the announces about RDS Aurora - specially the parts related to its performance gain over MySQL: Aurora is claimed to show x5 times better performance than MySQL! However, without publishing any details about ;-) -- and the only details I was able to find until now and group together are the following:
- Aurora is a proprietary closed source database engine, "compatible" with MySQL (so, not an improved MySQL fork, as many expected..)
- Sysbench workloads were used to evaluate Aurora performance
currently published Sysbench results are the following:
- 500K SELECT/sec
- 100K UPDATE/sec
- again, no details about any of the tests..
- in some articles instead of UPDATE/sec performance was mentioned INSERT/sec, but keeping in mind that there is simply no pure INSERT test available by default within Sysbench workloads, I may only suppose the mentioned 100K writes/sec are corresponding to the UPDATE/sec performance..
Fine then. But, so what?..
Let's see now where we are with MySQL Performance for today?.. ;-))
Yet more than a year ago we already announced 500K SELECT/sec with MySQL 5.7 on Sysbench (and you may read a full story about, if you got it missed). And this year during MySQL Central @OpenWorld we already presented 645K SELECT/sec with the latest MySQL 5.7! - here are the graphs taken from a live load just this morning :
- the first graphs is showing the SELECT/sec level
- the second graph - the corresponding amount of concurrent users (sessions) starting with 8, then 16, 32, 64, 128, 256 users
Sysbench SELECT Performance @MySQL 5.7 :
This is a single MySQL instance running on a single HW server. And as you see, we're reaching 645K SELECT/sec since 128 concurrent users. Is the 645K QPS level is the max limit for this server?.. - of course not, it should be possible to do yet more better here by improving our SQL layer, because on the same HW we're able to reach 1M QPS via InnoDB Memcached plugin (where all the data are going the same way directly from InnoDB, but bypassing all the SQL layers).. Work in progress ;-))
Then, what about UPDATE/sec performance?..
Again, a single HW server, single MySQL 5.7 instance, few simple SSD disks (even not a super fast flash storage as we have from Fusion-io and LSI in our LAB) :
- the first graphs is showing the UPDATE/sec level
- the second graph - the corresponding amount of concurrent users (sessions) starting with 8, then 16, 32, 64, 128, 256 users
Sysbench UPDATE Performance @MySQL 5.7 :
As you can see, we're doing even slightly more than 100K UPDATE/sec here ;-)
And you may ask then: Why MySQL Team @Oracle did not publish such a great result on UPDATE performance?.. - Well, just because there is nothing to be proud of for the moment.. ;-)) Of course, if you'll run the same workload on HDD storage you'll get much more worse results (for ex. on this server with HDD I have only 2K UPDATE/sec).. - so, for sure, SSD is much more efficient for random I/O writes. However, we're not doing better when using a much more faster flash storage than SSD, and this is our main headache today for all IO-bound workloads ;-) -- the limit today is still in the MySQL/InnoDB code itself.. - and we know where, and we're working hard to get it fixed.. And once this stuff will be improved, then we'll have something to be proud of, and then you'll hear about us for sure ;-))
Now, looking back to RDS Aurora results on Sysbench.. - so, what?.. MySQL 5.7 is already doing better! ;-)
Well, still looking for claimed x5 times performance difference..
(NOTE: if you have any troubles to reach over 500K SELECT/sec or 100K UPDATE/sec on MySQL 5.7, I have a webinar right this Wednesday to tell you all about MySQL/InnoDB internals and related tuning, don't hesitate to join: http://www.mysql.com/news-and-events/web-seminars/tuning-and-best-practices-for-developers-and-dbas/)
Wednesday, 29 October, 2014
Percona Live London 2014 is starting next week and I'm happy to speak there about MySQL 5.7 Performance, Scalability & Benchmarks latest news. Indeed, MySQL 5.7 has a lot of positive changes and impressive improvements. And, looking on MySQL 5.7 Performance, there are many cases where the gap comparing to MySQL 5.6 today is simply amazing! ;-) -- but, of course, there are still some pending issues which are requiring yet more work to reach the expected speed-up.. So, I'll tell you all about the most hot stories around, while I'm also always curious about your issues you're hitting on your production systems. Please, don't hesitate to share them - this is helping us to make MySQL yet more better! ;-)
UPDATE - my slides are here : http://dimitrik.free.fr/Presentations/MySQL_Perf-Benchmarks-PLUK_2014-dim-key.pdf
Wednesday, 08 October, 2014
This is the next part of the stories about MySQL 5.7 Performance..
So far, the previous story was about reaching 645K QPS with SQL queries, while in reality it's only a half of the full story ;-) -- because when last year we've reached 500K QPS due a huge improvement on the TRX-list code, the same improvement made a negative impact on the all single-table test workloads..
What happened finally :
- the new code changes dramatically lowered contention on TRX-list (trx_sys mutex)
- which is made MDL related locking much more hot..
- and if one table becomes hot on a workload, MDL lock contention then is hitting its highest level..
So far, it was clear that MDL is needed a fix. Specially seeing that on 8-tables workload we're reaching 645K QPS. However there was a dilemma: should we push the TRX-list change to the 5.7 code, or wait first for MDL improvement?.. -- and we finally decided to push the changes, even if for some period of time we'll need to accept a regression on all single-table workload..
So, what we got the last year :
- 500K QPS on 8-tables point-select
- less than 200K QPS on 1-table.. - which was even worse than in MySQL 5.6 ;-)
It was clear, we have to fix MDL code asap, but the MDL story was not that simple either :
- Dmitry Lenev made a quick dirty patch just removing completely all MDL related code from the source tree..
- so, of course, we all expected to see a huge QPS jump after that on this experimental code, right? ;-)
- however, for our big surprise, QPS just remained on exactly the same level...
- what is going odd?..
- in fact instead of the MDL contention we moved to the THR_lock mutex contention!
- what is interesting that until the MDL code is present, we don't see any sign of THR_lock ;-)
- and it's only since MDL is no more here -- THR_lock is firing!..
- finally Dmitry made another dirty patch removing all THR_lock related code too..
- and bingo! - without MDL & THR_lock code we doubled QPS at once, and it became clear that both contentions should be fixed to bring a speed-up on single-table workloads...
All my kudos here to Dmitry Lenev, who worked very hard to find the most optimal solution to fix both problems. There were several prototypes, less or more successful, until the final solution out-passed all our expectations -- just because it provided us the same QPS level as the binary which has no MDL nor THR_lock code at all!!! - and this is awesome! ;-))
I was happy to follow all this work very closely and play with each new update. Let me show you now just a short "making off" which is representing pretty well a summary of the one step in the progress in this work:
there are 4 tests running the same OLTP_RO point-select workload on
8/16/32..1024 users :
- #1 is using the original 5.7 code
- #2 is using #1 + MDL fix
- #3 is using #2 + removed THR_lock code
#4 is using #1 + removed MDL code + removed THR_lock code
- as you can see, the original code (#1) is suffering of MDL rwlock contention and not out-passing 200K QPS..
- the MDL fix in #2 is helping little bit to reach more than 200K QPS, but then the THR_lock mutex contention is killing an overall performance..
- by removing completely the THR_lock code (#3) we can see the potential level of performance we should have if THR_lock contention was also fixed
- then in #4 we can see the QPS level on 5.7 if there was no MDL & THR_lock code at all..
- and what remarkable here that #3 and #4 are showing exactly the same performance!
- so, the MDL fix worked extremely well to run as efficiently as if MDL code was removed ;-)
- the next step was to do the same with THR_lock to get the solution simply perfect! ;-)
- and it's exactly what was done finally in 5.7 and can be seen now in DMR5..
May only say kudos MySQL Runtime Team! kudos Dmitry Lenev!!! ;-))
Let me show you now the final impact of all these changes (and keep in mind that just a year ago the MySQL 5.7 results here were worse on all the following tests comparing to MySQL 5.6)..
Sysbench OLTP_RO 1-table 10M :
Sysbench OLTP_RO Point-Selects 1-table 10M :
Sysbench OLTP_RO Simple-Ranges 1-table 10M :
Sysbench OLTP_RW 1-table 10M trx2 :
Sysbench OLTP_RW 1-table 10M trx1 :
As you can see, the same performance improvement we can see on OLTP_RW workloads as well, where :
- trx2 : means innodb_flush_log_at_trx_commit=2
- trx1 : means innodb_flush_log_at_trx_commit=1
to be continued..
Tuesday, 07 October, 2014
And I'm not saying this because I'm MySQL/Oracle employee ;-)) - just
expressing my personal feeling while flying back to Paris..
I really enjoyed this conference now much more than all previous years :
first of all having 5 days instead of 2 allows to have much more
networking than before (and usually it's the main goal for every
conference to create a lot of informal discussions and exchange on
then, the content - all excellent, no words.. there was so much new
stuff to discover and learn from other experiences, that I'm pretty
sure that each person at one moment during the conference should feel
a kind of "brain overflow" ;-) and what is good that there were talks
for any levels of skills -- so, it was very easy to progress on
understanding of each topic quickly and smoothly..
organization was just perfect : all in one place, no more running over
different floors/buildings, etc. ;-) having 30min breaks between talks
is really good as it allows to have an exchange related to presented
topics without impacting timing of the next talk / session (while we
could probably finish little bit later as well to have yet more
MySQL Customer Advisory Forum (CAF) - yet bigger and very well
organized as usually (while I'm not sure if all customers / users are
aware about this great possibility to discuss directly with MySQL
engineering, express all your wishes & issues, get the info from the
source about what is new, what is coming, what is planned, etc.) --
well, I know that many regretted on the last minute to have missed
subscription, so keep in mind this for the next year ;-))
- MySQL Reception - really good! specially that doors were open to anyone, including persons who are not attending the conference -- a true MySQL Community event!
Few wishes anyway for the next year :
would be great to have MySQL posters around the place where conference
is going ;-)
- would be great to have a kind of "closing conference" keynote(s), etc. to make a summary, to thank all attendees & organizers, to have some fun instead just see doors closing and people leaving..
But well, all-in-all - simply great MySQL Conference, which may become only better over a time! so, book your agenda ahead for the next year -- pretty sure it'll be yet more fun ;-)
As promised, here are my presentation slides from my talks on MySQL Central @OpenWord 2014 :
Feel free to ping me if you have any questions, while I'll also try to
cover some of explained topics within the next blog posts, so stay tuned
Monday, 29 September, 2014
Indeed, MySQL 5.7 rocks ;-)
This is the part1 of the following blog posts about various benchmark results on MySQL 5.7 - and this particular one is dedicated to the Sysbench OLTP_RO Point-Select 8-tables workload.
We've already published the over 500K QPS on SQL Point-Selects before @32cores-HT server, so may reconfirm it again with MySQL 5.7 DMR5 -vs- other engines :
and now on a similar server, but with 40cores-HT we're able to confirm 645 QPS on the same workload!
If you missed the long story about how we arrived on such a performance level and how to reproduce the test - you may find all here. This workload is the most killing from all Read-Only Sysbench OLTP tests.. And it's really for the first time we started to scale here with MySQL! - look here (all results are from the 40cores-HT server) :
MySQL 5.5 :
MySQL 5.6 :
MySQL 5.7 :
Percona 5.6 :
MariaDB 10.1 :
As you can see, only MySQL 5.7 rocks here ;-)
However, if you remember the full story, this workload is extremely depending on every extra bits and even insignificant overhead.. - personally it's very rare that I saw a RDBMS workload to be so close to HPC problematic ;-)) -- to remind you that to reach 500K QPS (and 645K QPS here) was necessary to use UNIX socket instead of IP socket and an older Sysbench version binary (v0.4.8) which has just little bit less instructions than the v0.4.13 I'm currently use and will be able to get only 575K QPS on this server..
will tell you yet more during my talk on Wednesday, 1/Oct.
to be continued..
Sunday, 28 September, 2014
Oracle OpenWorld 2014 is starting today, and MySQL Central Conference since this year is taking a full part of it. I'm very happy to be there and have 2 talks :
- MySQL 5.7 : Performance and Scalability Benchmarks | Wed, 1/Oct. | 2:00 PM - 2:45 PM | Moscone South - 252 | CON5066
- MySQL Performance : Demystified Tuning and Best Practices | Thu, 2/Oct. | 10:45 AM - 11:30 AM | Moscone South - 252 | CON5097
so I'll be able to say more this time and mostly focus on the Benchmarks results & analyzes on the first talk, and then mainly focus on the tuning and the latest MySQL 5.7 design improvements to allow you to reach yet more higher & stable performance levels.
I have several "fun" stories to tell you around the latest findings around MySQL 5.7 Performance we made, and problems we've solved (or newly found ;-)) Don't know if I'll have a time to blog about all of this one day.. - while telling you these stories live during the conference is much more easier, so hope to keep you awake and have fun ;-) Indeed, there is no doubt, MySQL 5.7 rocks and it will be the next the best MySQL release ever.. - and I'll tell you why ;-)
Then, once again, I'm regretting I cannot clone myself to attend all the conference talks in parallel.. - there are so many valuable presentations in the program that it's really hard to decide where to go.. I may only advice you to attend all of them (you and your clones ;-)) -- this excellent Agenda Summary is telling all..
And don't miss the MySQL Community Reception @OOW 30/Sep. 7PM-9PM (you need just to register yourself via this link to help organizers with logistics ;-))
see you there!
Friday, 19 September, 2014
MySQL Performance: More in depth with LinkBench Workload on MySQL 5.7, MariaDB 10.1 and also Percona Server 5.6
This is the next chapter of the story started in my previous
article and related to the updated
results on LinkBench workload published by MariaDB..
Keeping in mind that the obtained results are completely opposite from both sides, I've started to investigate then the same LinkBench-150GB 64 concurrent users workload from the "most favorable" possible test conditions on the same 40cores-HT server as in my previous article:
- InnoDB Buffer Pool (BP) = 150G (so, all the data may remain in memory)
- innodb_flush_log_at_trx_commit=2 (so, REDO write are not flushed once per second and not on every transaction)
- no checksum, no double-write, no compression, no atomic IO, etc..
- just a base-to-base comparison..
And the probe result was the following, with a still better TPS on MySQL 5.7 (left) than on MariaDB 10.1 (right) :
NOTE: I'm obtaining my TPS numbers from reported by MySQL server its Commit/sec rate. I did not use for that the results from the LinkBench log file.. From what I can read in the log file, my TPS x 3 is less or more equal to the level of operations/sec reported by LinkBench, but I'm not looking for the moment for the best result in absolute, but rather a difference in performance between MySQL and MariaDB engines..
But well, my surprises started then come one ofter one ;-)
- first, discussing details with Fusion-io Team, I've discovered that my flash card is not supporting atomic IO ;-)
- so, all the previous tests with atomic IO on MariaDB you may ignore (and indeed, this explains also why the results are exactly the same with and without atomic IO ;-))
- then I discovered that 32cores reported in MariaDB config testing are in fact 16cores-HT (16cores with hyper-threading, 2 threads per core).. - and this could play a significant role as having less CPU cores may lower an overall concurrency during processing, and as the result - a lower lock contention.. - but is it so? - have to check..
- but the most killing one was the last one: from Jan's comments I understood that MariaDB 10.1 is using XtraDB by default, and even this version of XtraDB code was modified to have some features from MariaDB's own changes on InnoDB, and InnoDB engine should be used than as plugin (while I was pretty sure I'm using their InnoDB code as all their new options were accepted)..
So far, I've then replayed the same test again, but within the following conditions:
- MySQL server is bound to 20cores-HT only
- MariaDB is using its own InnoDB plugin
- EXT4 is used as filesystem to avoid any other potential surprises ;-)
- 1h or 1.5h test duration to avoid doubts about dramatically dropping TPS levels..
The result was very surprising, and I've finally added also Percona Server 5.6 into the game to clarify the things ;-)
And here is what I've observed, from left to right :
- MySQL 5.7
- MariaDB 10.1 with XtraDB (default)
- MariaDB 10.1 with InnoDB (plugin)
- Percona Server 5.6
- indeed, MariaDB with its InnoDB plugin is performing better then with its XtraDB which is used by default..
- on the same time Percona Server 5.6 is preforming way better than MariaDB with its XtraDB..
- so, I may just suspect that there are some problems in MariaDB with XtraDB integrations..
- while Percona Server is doing better than MariaDB with any engine
- and MySQL 5.7 is doing better than anyone ;-)
What about lock contentions?
The picture is completely different here too:
- for my big surprise, the index lock contention is going down on both MariaDB and Percona Server
- log_sys mutex contention is dominating for all..
So far, let's see if now running on full 40cores we have a better performance?
From left to right, the same test is executed on 20cores-HT and then on 40cores-HT on:
- MySQL 5.7
- MariaDB 10.1 with InnoDB (plugin)
- Percona Server 5.6
- surprisingly a very small gain on MySQL 5.7 on move from 20 to 40cores..
- near no gain at all on MariaDB..
- and yet more surprising - a regression on Percona Server..
- indeed, there is something yet more to investigate as there is no reason to not win at least 50% in TPS on 40cores with 64 concurrent users..
- NOTE: I'm not blaming LinkBench here, as it's very possible that the show-stopper is inside of the database code rather in the test code itself..
And hope an analyze of the lock contentions will say us little bit more:
- as you can see, the log_sys contention is growing on the move from 20 to 40cores on all engines
- while Percona Server is also hitting a growing lock_sys time..
But if this is only about the locking, then probably we can improve the thing little bit by tuning the InnoDB spin wait delay?
The same test on MySQL 5.7, but with progressively increased spin wait delay :
- innodb_spin_wait_delay=6 (default)
- innodb_spin_wait_delay= 12, 24, 48, 96, 192, ...
- the best TPS level is reached with spin delay=48 : 35K TPS instead of the initial 30K ;-)
- on 96 TPS is less or more the same as with 48, but on 192 performance is going down..
And it's well seen on the graph with contentions :
Now, let's replay the same test on the same 40cores-HT config, but with applied innodb_spin_wait_delay=48 on all engines and adding one more for REDO log flushing:
- the first test for each engine with innodb_flush_log_at_trx_commit=2 (trx2 : REDO fsync once per second)
- and the second one with innodb_flush_log_at_trx_commit=1 (trx1 : REDO fsync on every transaction)
- there are several surprises here..
- MySQL 5.7 : 35K TPS on trx2, but only 24.5K TPS on trx1..
- MariaDB 10.1 : 28K TPS on trx2, and 24.5K TPS on trx1..
- Percona 5.6 : only 27K TPS on trx2, but 28K TPS on trx1! ;-))
- well, if on MariaDB the gap between trx2 and trx1 results is as expected (redo writes fsync() on every transaction with trx1 cannot be free)..
- while on MySQL 5.7 it looks rather as regression (should be fixed)
- and on Percona Server there is rather a regression on trx2 ;-)
Then, to understand why all this happens you have just to look on the corresponding mutex contentions graph reported by PFS:
However, the most surprising for me here the drop on MySQL 5.7 from 35K TPS to 24.5K TPS just because we switched from innodb_flush_log_at_trx_commit=2 to innodb_flush_log_at_trx_commit=1.. - I'd not expect such a big regression just because of REDO fsync, specially when we're using an extremely fast flash storage.. And indeed, the story will likely change for 5.7 very quickly -- the following graph representing the result on exactly the same workload, but on the next-step-dev 5.7 version (after DMR5), get a look on this:
While the result with innodb_flush_log_at_trx_commit=2 is slightly lower than 35K TPS (need some more tuning/testing to find an optimal internals setting), but with innodb_flush_log_at_trx_commit=1 the result is even little bit bigger than 35K TPS!!!
All this means that in MySQL 5.7 we may be potentially not too far from having near the same level of performance for innodb_flush_log_at_trx_commit=1 setting as for innodb_flush_log_at_trx_commit=2 whenever it's possible.. - stay tuned! ;-)
Ok, now what will happen if the Buffer Pool size will be reduced to 75GB ? (instead of 150GB)...
The same test but with BP size=75G :
- there is not too much difference anymore between trx2 and trx1 on all engines..
- a huge wait time part now is attributed to the IO reads..
- all engines are starting from the less or more the same TPS level on the beginning (except MySQL 5.7 which is starting from a higher TPS)
- but then over a time starting to decrease...
- MySQL 5.7 is loosing less than any others, and remains the best from all engines (and if one day I'll have a patience to run such a test for 24h I'll do to see the end of the graph ;-))
Any difference to expect if only 20cores will be user for database server?
The same test, but on 20cores-HT :
On 20cores-HT the overall result just little bit slower, but very similar to 40cores-HT..
And contentions levels are looking very similar as well, including growing over a time the index lock waits on MariaDB and Percona Server:
Instead of SUMMARY :
- indeed, the plugin InnoDB in MariaDB 10.1 seems to run better than their default XtraDB..
- however, it looks like there is just something wrong with XtraDB integration in MariaDB, because Percona Server itself is showing the same or a better result than MariaDB running with its own InnoDB plugin..
- as well I did not see any single point where MariaDB could be better on base-to-base conditions comparing to MySQL 5.7
- further investigation will come later once I'll have in my server the Fusion-io flash card supporting atomic IO and all the latest NVMFS stuff..
- more code changes in MySQL 5.7 are in pipe to better analyze these strange scalability issues and more higher and bigger database workloads.. - note: we're yet far here from reaching even a half of potential performance of this flash storage, so there is definitively a lot of things to improve ;-)
So, stay tuned ;-)
Additional NOTES : (as you're asking)
>> HOWTO monitor Commit/sec in MySQL ?
- you're getting on a regular interval (say every 10sec.) the value of the "Com_commit", for ex. :
- mysql> show global status like 'Com_commit';
- then do a subtraction from the current value the previous one, and divide it by your timeout interval (say 10sec)
- that's all ;-)
>> HOWTO monitor mutex time waits in MySQL ?
- you're starting MySQL server with PFS enabled and wait events instrumentation ON:
- then on a regular time interval executing the following SQL query, and getting a diff from the current values vs the previous ones and divide them by your time interval
- SQL query:
mysql> select EVENT_NAME, max(SUM_TIMER_WAIT)/1000000000000 as WaitTM from events_waits_summary_by_instance group by 1 order by 2 desc limit 25; +-------------------------------------------+----------+ | EVENT_NAME | WaitTM | +-------------------------------------------+----------+ | wait/io/file/innodb/innodb_data_file | 791.3204 | | wait/synch/mutex/innodb/fil_system_mutex | 25.8183 | | wait/synch/rwlock/innodb/btr_search_latch | 5.2865 | | wait/io/file/innodb/innodb_log_file | 4.6977 | | wait/synch/rwlock/sql/LOCK_grant | 4.4940 | ...... +-------------------------------------------+----------+ 25 rows in set (0.06 sec)
As well you can use MySQL Enterprise Monitor (MEM) or use my tools from my site (dim_STAT @ http://dimitrik.free.fr) to get all this and other stuff ready for action..
Friday, 15 August, 2014
Reading the article published on MariaDB site about performance
evaluation of MySQL 5.7 -vs- MariaDB 10.1 I've got a double feeling:
from one side I have no any reason to have a doubt in obtained results,
and from another side - my feeling says me there is something going
odd.. But well, I don't have any attention to blame anybody (and sorry
if you think so) -- my willing here is only to understand what is going
on, why, and what are the challenges on the observed test workload(s).
I'll cover here only my observations on the LinkBench workload as it was
not yet enough tested and involving many questions around..
LinkBench Benchmark :
- developed by Facebook Team, so represents a true real life workload
- written in Java
- from the beginning since it creation was reported as very sensible to the index lock contention in InnoDB
- for this reason was modified to have partitions in the most hot table (to split a single index contention)
- this helped to improve performance, but index lock contention remained limiting..
- the issue with InnoDB index lock was fixed only in MySQL 5.7 giving a significant performance boost in LinkBench workload
What is looking odd for me in MariaDB's test results :
- indeed, having compression feature within any database engine is very valuable
- however, compression by itself cannot improve an overall performance (except if you have some limitation on other levels, like slow storage, etc.)..
- and on extremely fast Fusion-IO flash storage there is no doubt that performance result will be better when compression is not used, and it's exactly what we see in the presented results..
- but then it's hard to believe for me that without using any compression MySQL 5.7 which is having the fix for InnoDB index lock contention is running slower than MariaDB 10.1 which is based on InnoDB from MySQL 5.6 and not having such a fix..
Well, I may be wrong as well, but the following is what I'm observing on LinkBench workload on my system.
- Linux box with 40cores-HT 2.3Ghz running Oracle Linux 6.5
- Fusion-IO flash storage using NVMFS filesystem
- LinkBench database of 150G
- let's start with a Buffer Pool (BP) = 50% of database size to follow MariaDB test conditions (so, 75G in my case)
- I'm usually using 32 BP instances
- as BP will not be able to keep the whole data set, we may expect constant IO reads
- so a high activity on LRU to expect + LRU flushing
- means LRU depth should be tuned well, I'll set it to 4000 (not too big, not too small, as it's per BP instance, this will give me 32 x 4000 free pages amount to expect which should cover a potential page IO read speed from Fusion-IO)
- REDO log size is also important, as a bigger REDO is allowing to delay page flushing and keep flushing/checkpoint activity more smooth -- 12GB REDO here seems to be enough
- adaptive hash index (AHI) -- in most RW workloads it's better to have disabled (and even in some RO), but curiously here has no impact at all..
- using AIO + O_DIRECT is the must
- all other setting is just following general "best practices" for most of workloads..
- Performance Schema (PFS) is enabled + mutex instrumentation too
TEST #1 : LinkBench-150G 64users with 75GB Buffer Pool
There are 5 test cases executed in the following order:
- #1 - MySQL 5.7 no doublewrite (innodb_flush_log_at_trx_commit=1, checksums crc32, innodb_doublewrite=0)
- #2 - MySQL 5.7 secure ((innodb_flush_log_at_trx_commit=1, with checksums crc32, innodb_doublewrite=1)
- #3 - MariaDB 10.1 no doublewrite (same options as MySQL 5.7 in #1)
- #4 - MariaDB 10.1 secure
- #5 - MariaDB 10.1 with atomic writes (without doublewrite, but atomic writes instead)
Each test case was running during 30min (if was several hours initially, but then I've decided to reduce the test duration time --because once the load became stable on the beginning, then not too much is happening later, and I don't have unlimited time ahead ;-))
- without doublewrite MySQL 5.7 is near x2 times faster than MariaDB 10.1 (20K TPS -vs- 12K TPS)
- within "secure" configuration MySQL 5.7 is loosing 25% in performance (20K -> 15K TPS)
- interesting that this workload is pretty sensible to REDO flushing (sync)
- but the main overhead is coming from a doublewrite anyway..
- however, as you can see, it's still better than MariaDB
- interesting to see that atomic writes in MariaDB are not lowering too much TPS
- so, once atomic writes will be available in MySQL 5.7 I may expect the same 20K TPS in secure config here
- NOTE: to avoid doublewrite overhead you may also use O_DSYNC flush setting combined with EXT4 data journaling as it was recently demonstrated by Percona
So far, why MariaDB is getting a lower TPS than MySQL 5.7 ?..
Let's get a look on what Performance Schema is reporting:
- MariaDB 10.1 is hitting a severe contention on the index lock
- while MySQL 5.7 is not having it anymore
- and that's why MariaDB is running slower on this workload..
Let's check there is nothing abnormal going with page flushing :
- Checkpoint Age did not reach any critical level (max 7GB from 12GB in REDO was used)
- no LRU single page flush either, so the amount of free pages was always enough..
What is the page IO read rate in this workload?
- MySQL 5.7 is requiring 20K / 18K free pages/sec
- while MariaDB 10.1 is only 10K..
- the excessive page scanning seen in MariaDB is fixed in MySQL 5.7 (still remains in MySQL 5.6, but in fact not impacting a final performance)
Well, all observed is confirming my initial worry.
Now, let's see how both engines will perform within "expected" test conditions (Facebook Team is mentioning in LinkBench notes that the data set should be bigger at least x10 times than a Buffer Pool size to reproduce their environment).. -- and for sure, having only 16GB Buffer Pool within the same workload will involve much more page IO reads, where storage IO level will play a huge role.
TEST #2 : LinkBench-150G 64users with 16GB Buffer Pool
- without a doublewrite MySQL 5.7 is out passing MariaDB 10.1 by x2.5 times (10K TPS -vs- 4K TPS)..
- in a "secure" mode MySQL 5.7 is still x1.5 times better (over 6K TPS -vs- 4K TPS)
- NOTE: I've also replayed the MariaDB test with using its multithreaded flushing feature (innodb_mtflush_threads=16, innodb_use_mtflush=1) -- this is helping to reach 4500-5000 TPS, but still not enough to attend the MySQL 5.7 level..
What about internal contentions?
- MySQL 5.7 is blocked mainly by fil_system mutex and log_sys mutex contentions
- fils_system mutex contention is killing on all IO-bound workloads (and that's why testing Read-Only workloads is important too to better understand the problems on Read+Write ;-))
- bunch of other lock contentions is blocking MariaDB here..
Any flushing related issues?
- Checkpoint Age did not reach even 2GB
- no LRU single page flush either
- all ok..
What about page IO read rate?
- free page rate is much higher, but not enormous (I've already observed 70K-80K pages/sec on this server with Fusion-IO on IO-bound workloads)
- page rate on MariaDB is lower than on MySQL 5.7
Well, yet another reason why MySQL 5.7 just rocks and will be the next the best ever MySQL release :-)
I'm not pretending either the config setting I'm using and workload conditions are the most optimal here. I'm just sharing what I'm observing and curious for any input/idea about how this workload performance could be improved..
Few other notes:
- Compression : indeed, compression on IO level is much more simple an optimal for any application (not only databases), and more and more will come over a time for sure with a goal to reach as max as possible a performance level of "uncompressed" workload..
- LinkBench : seems to me needs some more love and more testing.. - would be happy to discuss with developers and any other observations related to this workload. It give me an impression to be "self limited" as with 64 concurrent users CPU is not really used, looks like there are some internal waits are happening during processing. I've also suspected garbage collection time on JVM level, but monitoring JVM stats showed that it's not so.. Open to any discussion ;-)
MySQL configuration setting I've used :
[mysqld] max_connections=4000 # myisam key_buffer_size=4000M ft_max_word_len = 16 low_priority_updates=1 # general table_open_cache = 8000 table_open_cache_instances=16 back_log=1500 query_cache_type=0 # files innodb_file_per_table innodb_log_file_size=1024M innodb_log_files_in_group=12 innodb_open_files=4000 # buffers innodb_buffer_pool_size= 75000M / 16000M innodb_buffer_pool_instances=32 innodb_log_buffer_size=64M # tune innodb_checksums=1 innodb_checksum_algorithm=crc32 innodb_doublewrite= 0 / 1 innodb_support_xa=0 innodb_thread_concurrency=0 innodb_flush_log_at_trx_commit=1 innodb_flush_method=O_DIRECT innodb_max_dirty_pages_pct=90 innodb_max_dirty_pages_pct_lwm=10 innodb_lru_scan_depth=4000 innodb_page_cleaners=4 join_buffer_size=32K sort_buffer_size=32K innodb_use_native_aio=1 innodb_stats_persistent = 1 innodb_spin_wait_delay=6 # perf special innodb_adaptive_flushing = 1 innodb_flush_neighbors = 0 innodb_read_io_threads = 16 innodb_write_io_threads = 16 innodb_io_capacity=15000 innodb_purge_threads=4 innodb_max_purge_lag_delay=30000000 innodb_max_purge_lag=1000000 innodb_adaptive_hash_index=0 # Monitoring innodb_monitor_enable = '%' performance_schema=ON performance_schema_instrument='%sync%=on' # MariaDB specific: innodb_compression_algorithm=0 # innodb_mtflush_threads=16 # innodb_use_mtflush=1 # with atomic writes : innodb_use_fallocate=1 innodb_use_atomic_writes=1
Any comments are welcome! ;-)
Thursday, 15 May, 2014
The next MySQL TechDay is taking place in Paris, 22/May (the next week!!!) - if you're MySQL lover and will be in Paris area this day - hurry up to register on the event page and attend it - trust me, you'll not regret ;-))
We're continuing to follow our TechDay tradition:
- the event is completely free (but places are limited, so you have to be registered to attend)
- the content is pure technical and directly from Oracle engineering, no marketing ;-)
- this is a true full day event, and we're reserving enough time to go in depth for each presented stuff..
- the event is taking place in Oracle office in a pretty wide and comfortable amphitheater, covered by WiFi, so you may twit live about #mysqltechday and remain "connected" if this is a part of your constrains ;-)
- we're starting at 10:30 to let you arrive "stressless" regardless traffic issues and distance (we know from previous experience that many arriving from different places in France, far away from Paris, and also some will come even from different countries! - Brussels, London, Birmingham, Dublin are already in our map for now ;-))
- for those who will arrive earlier, a hot coffee with some sweats will be already waiting since 10:00 as a bonus ;-)
- note: if you're arriving via public transportation keep in mind there is a direct tram going to the Oracle office from La Defense station (15min and you're arrived)..
and to finish with organization points:
- around 13:00 we'll have a lunch in Oracle enterprise restaurant,
- around 15:30 a coffee break
- and around 17:30 we're expecting to finish (and let you in the same "stressless" conditions arrive at home ;-))
And now about the content..
very briefly we'll provide you an overview about the latest tech news
from the MySQL Team
then, as promised from the last TechDay, I'll tell you the whole story
about heavy OLTP workloads:
- In-Memory and IO-bound, Read-Only and Read+Write..
- their problems, solutions, workarounds, and improvements already made in MySQL 5.7
- there was a long and hard work made since then, the result are surprising and amazing on the same time - and there are still many questions remaining without an answer.. ;-)
- and, as promised, this time with a full deep dive into InnoDB internals -- we'll dig in details all the story with InnoDB flushing and purge, what was wrong before MySQL 5.5, what remained wrong in 5.5, improved in 5.6, redesigned and probably fixed in 5.7 -- how read-on-write issues were resolved, why parallel + improved flushing was implemented, what can be wrong and how to tune LRU flushing -- I'll tell you ALL ;-))
I'll have 2 hours to tell you the whole story, so be sure, you'll
have for your time ;-)
then, again, as promised, we'll have Mark LEITH as our special guest
during this event!
- last time Mark was unable to come due unexpected "management issues"..
- while this time we fixed all issues ahead, and just crossing fingers now for the flying conditions, as Mark will fly from UK to Paris the same day ;-)
- if you did not attend any Mark's talks before, I'd present him as a "Practical MySQL Performance Schema Magician" !! ;-)
- Performance Schema (PFS) is a gold mine of various valuable information about your MySQL instance
- while entering in any huge gold mine you may feel yourself little bit lost.. ;-)
- but Mark will show you how easy to find there your way in practice, and how powerful solutions built around PFS could be..
- Mark will also present you his "ps_helper" - a collection of scripts he made to simplify practical PFS usage - this is a really great stuff, I'd compare it to what DTrace Toolkit made for DTrace -- you may use many scripts as they are just straightforward, then learn by example and create many new ones adapted explicitly to what you need, etc..
- and trust me, some examples will really surprise you about how deep you may go with PFS ;-)
- the best will be if you'll come with your laptop with installed latest MySQL 5.7 (or 5.6) on it and play with presented stuff by yourself..
- BTW, ps_helper is fully integrated now within 5.7 and taking part of "sys" schema
- as well, if you are not already doing, think to use MySQL Workbench 6.1+ : while this GUI tool is simply great for many general DBA tasks, it also introduced since v.6.1 a very helpful interface to discover, request and configure PFS via GUI.. - the tool is free and can be downloaded from here: http://dev.mysql.com/downloads/tools/workbench/ (Linux, MacOSX and Windoze versions)
- and of course Mark will speak about the latest MySQL Enterprise Monitor (MEM) version - it's fully using now PFS in its metrics and the result is really amazing.. - Mark will tell you all about and show you a live demo, and if you want to try your hands on - you may start from here: http://www.mysql.com/products/enterprise/monitor.html (the tool is not free, but has a long enough trial period to try)..
Well, I hope you'll have a lot of fun and a lot of food for your brain this day! ;-))
See you there!
And also think ahead about other tech topics you'll happy so see covered the next MySQL TechDay..
Useful event links: