Tuesday, 17 February, 2015
There were so many valuable articles already written by others over past
years explaining all details about InnoDB transaction isolation modes
and how to deal with this. So, I'll avoid to repeat what was already
said ;-) -- my attention attracted the performance study made by PeterZ
and published in the following article: http://www.percona.com/blog/2015/01/14/mysql-performance-implications-of-innodb-isolation-modes/
-- the article is very good and providing a good analyze of the observed
problem which is solved by using READ-COMMITTED transaction isolation
instead of REPEATABLE-READ (which is default in InnoDB).. The natural
question is coming then: why don't we have then the READ-COMMITTED mode
by default?.. Is there any danger?..
Let's then investigate together..
First of all, you should keep in your mind not only the theory, but also a way in which all this stuff is implemented within InnoDB :
- transaction isolation / MVCC in InnoDB is implemented via ReadViews
- every time a ReadView is created, a mutex (trx_sys) should be acquired
- in REPEATABLE-READ mode a ReadView is created on transaction start
- in READ-COMMITTED mode a ReadView is created on every statement(!)
- means, if your statements are short -- you may hit a storm on trx_sys mutex contention in your workload..
Of course, all depends on a workload, and if you're lucky, you will probably see only a benefit here (all is possible, right? - at least in theory)..
Let me show you now some cases where you'll feel yourself much less lucky ;-)
For my test cases I'll use :
- 40cores-HT server
- running OEL 6.5 2.6.32-504 kernel
- extremely fast flash storage (Fusion-io ioMemory)
- each workload is using 64 concurrent users
3 test cases executed on each workload :
- REPEATABLE-READ mode is configured (RR)
- READ-COMMITTED mode is configured (RC)
- READ-UNCOMMITTED mode is configured (RU)
DBT-2 500W Workload (TPC-C) :
- you may already observe here a slightly lower TPS on both 2nd (RC) and 3rd (RU) test cases comparing to the first one (RR)
- the "regression" is not very big, but notable
- let's get a look now on internal lock contentions within InnoDB..
- a jumping contention on trx_sys mutex is very well seen for RC and RU tests
- however it's not yet too big to make a serious damage..
Now, let's move to a heavy Sysbench OLTP_RW -- I've slightly changed the "classic" OLTP_RW here by adding a Read/Write ratio in my tests :
- initially the load is starting with 128:1 ratio (mostly Reads, less Writes)
- then 16:1
- then 4:1
- then 2:1
- and finally 1:1
I'm using this test case also to evaluate the impact of writes in transactions, etc..
Sysbench OLTP_RW 32x10M-tables, rw128/16/4/2/1 :
- here the impact is more then just notable ;-)
- and this is only due trx_sys mutex contention? or something else?..
- oh, indeed, trx_sys is jumping too high now!..
- and could it be even more worse??
let's see ;-)
The next workload has a code name "OLTP_RW-p_sel10" - all reads in this test are replaced by 10 point-selects, that's all, making the load much more aggressive on writes and short and fast on reads :
Sysbench OLTP_RW-p_sel10 32x10M-tables, rw128/16/4/2/1 :
- indeed, seeing a x2 time worse performance is really killing..
- and still due trx_sys mutex??
no comments ;-))
Well, you may still say that it's just because this server is too big and that's why I'm observing all these contentions, and you'll be not far from the reality -- on smaller machines all these contentions are, of course, lower - but! still "notable" ;-))
The same OLTP_RW-p_sel10, but on 20cores-HT :
(while many x2 CPU Intel machines today are having more than 32cores-HT in total, so a "small" HW becomes a big one ;-))
INSTEAD OF SUMMARY :
- so, what should we finally conclude from all this presented stuff???..
- PeterZ told us a so nice story, and now you're coming with your b*** and showing that PeterZ was wrong...
- Stop, guys, PeterZ was not wrong!!! ;-)
- ?? -- so, you're lying ??????
- and I'm not lying either ;-))
- well, what you should keep in mind is that there is no a "silver bullet" and the most universal answer in most of the cases will be "it depends" ;-))
- and with InnoDB Transaction Isolation is the same story here!
THE GENERAL RULE could be the following :
- if your queries and transactions are short : use rather the default REPEATABLE-READ mode!
- if your queries are long and reading a lot of data which are likely to be modified by other transactions in parallel : then use the READ-COMMITTED mode - this will allow you to read already committed changes while your query is progressing and avoid to be lost in scanning of the old page images instead (as PeterZ well showed you in his story ;-))
- Repeatable Read Isolation Level in InnoDB - How Consistent Read View Works
- Differences between READ-COMMITTED and REPEATABLE-READ transaction isolation levels
- The basics of the InnoDB undo logging and history system
- If Eventual Consistency Seems Hard, Wait Till You Try MVCC
- SET TRANSACTION Syntax
As usual, any comments are welcome!