by Dimitri |
OSC Team, 2010
Oracle / Sun Microsystems Inc.
|1. Benchmark Information|
Customer Name(s): MySQL Performance Team
Contact Information: dimitri (dot) kravtchuk (at) oracle.com
Keywords: InnoDB, Buffer Pool, Page Management, Purge
|2. Hardware Configuration|
Server(s): 32-core Intel server
Storage: Internal SSD disks
|3. Software Configuration|
- Fedora 12 Linux 64bit
- MySQL 5.5
The issues analyzed in the following report were first discovered back in May 2009 during various MySQL 5.4 tests (ref: InnoDB dirty pages and log size impact), but at that time it was not the first priority :-)) But now, when so many things have been fixed within InnoDB, it's probably time to take a closer look at the Buffer Pool page management issue..
First of all, here is what I observed back then:
- Test workload: Read+Write, where write transactions are UPDATE statements only (generated via dbSTRESS)
- Each UPDATE statement is modifying only non-indexed fields
- The new value written by each UPDATE is never longer than the previous one for the corresponding field
- So, each UPDATE is applied in place by InnoDB and no new space is needed for the updated record (a hypothetical sketch of such a statement follows this list)..
- The number of database records is fixed for this test, and all records easily fit into the Buffer Pool
- So we may expect that once all records are sitting in the Buffer Pool there will be no more disk reads, and the Buffer Pool occupancy will remain stable...
- But that's not exactly what is going on: it's true there are no reads :-)) but the Buffer Pool occupancy is far from stable and keeps growing for the whole duration of the test!
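To make the in-place condition concrete, here is a hypothetical sketch of such an UPDATE (table and column names are invented for illustration, the real dbSTRESS schema differs):

    -- hypothetical sketch only: "note" is a non-indexed CHAR(8) column,
    -- so every new value has the same stored length and InnoDB can
    -- update the record in place, exactly the condition described above
    UPDATE stress_object
       SET note = 'aaaaaaaa'
     WHERE ref_id = 1234;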
My understanding at that time was that the Buffer Pool occupancy was growing due to UNDO pages, and the solution I found then was a combination of several settings (a my.cnf sketch follows the list):
- limiting InnoDB thread concurrency to 16
- setting the dirty pages percentage to 15
- as long as purge activity is able to keep up with the workload, the free pages level in the Buffer Pool remains stable
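For reference, a minimal my.cnf sketch of that workaround (both variables are also dynamic, so SET GLOBAL works too; everything else is left at defaults):

    [mysqld]
    # the 2009 workaround: throttle concurrency and keep dirty pages low
    innodb_thread_concurrency  = 16
    innodb_max_dirty_pages_pct = 15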
But do we still have this issue in MySQL 5.5 ???
|4.1. Probe Test|
As a probe test I'll try to reproduce the old conditions, but within a MySQL 5.5 configuration (sketched in my.cnf form after the list):
- buffer pool instances = 1
- purge threads = 0
- max purge lag = 0
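In my.cnf form, the same probe configuration (variable names as in MySQL 5.5, everything else left at defaults):

    [mysqld]
    # reproduce the old conditions on MySQL 5.5
    innodb_buffer_pool_instances = 1
    innodb_purge_threads         = 0   # no dedicated purge thread
    innodb_max_purge_lag         = 0   # no DML throttling on purge lag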
So?.. What about the Buffer Pool pages?..
Detailed STATs: Probe Test @
Well, the problem is still here...
From the graph above you may see two phases:
- the test starts with a Read-Only warm-up, and very quickly all pages are sitting in the Buffer Pool, while the free buffers list level remains stable..
- then the Read+Write activity starts (just before 20:20)
- the dirty pages level (modified pages) grows at first, but then remains stable..
- while the database pages level continues to increase!! (and of course the free pages level is decreasing)..
So, we have in-place UPDATEs, but we keep filling the Buffer Pool with.. UNDO pages? something else?..
From the following detailed STAT graphs you may see that History Length is growing too during this test..
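For anyone wanting to track the same counters, here is a sketch of the standard MySQL commands I'm relying on (nothing dbSTRESS-specific):

    -- free / database / modified (dirty) pages of the Buffer Pool:
    SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%';

    -- "History list length" is printed in the TRANSACTIONS section:
    SHOW ENGINE INNODB STATUS\G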
|4.2. Test with Buffer Pool Instances = 4|
Same test, but now with 4 buffer pool instances.
Curiously, it does not change the issue with Buffer Pool pages, BUT it dramatically reduces the History Length!! Similar results are observed with 8 and 16 buffer pool instances..
Detailed STATs: Test with BP instances=4 @
Detailed STATs: Test with BP instances=8 @
|4.3. Activating Purge Thread|
I did not really expect any improvement from activating a Purge Thread, as during all the last tests the History Length remained stable and never passed even 200K..
But I was happy to be positively surprised: once the Purge Thread was enabled, the Buffer Pool usage started to look like what I expected to see for the current workload :-))
Detailed STATs: Test with BP instances=8 + purge_thread @
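For completeness, the my.cnf sketch of this configuration (in MySQL 5.5 innodb_purge_threads is a startup-only option and accepts only 0 or 1):

    [mysqld]
    innodb_buffer_pool_instances = 8
    innodb_purge_threads         = 1   # dedicated purge thread enabled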
|4.4. Higher Workload|
But now I was curious whether I'd still see the same stable Buffer Pool usage on a higher workload.. So, I simply restarted the same test with twice as many concurrent users (64 users now) and observed exactly what I was afraid of: the same issue appeared again..
Detailed STATs: Test 64 users, with BP instances=8 + purge_thread @
|4.5. Activating Max Purge Lag Limit|
As the History Length started to grow too during the last test, the last thing I decided to try was to see if fixing the Max Purge Lag setting may help here.. And finally, it helps well :-)) As you can see, the Buffer Pool usage becomes "normal" again, even with the doubled load :-))
The max purge lag is set to 400K.
Detailed STATs: Test 64 users, with BP instances=8 + purge_thread + fixed_purge_lag 400K
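Note that innodb_max_purge_lag is dynamic, so the 400K limit can be applied on a live server; once the purge lag exceeds it, InnoDB starts delaying INSERT/UPDATE/DELETE operations to let purge catch up. A sketch:

    -- apply the 400K purge lag limit without a restart
    SET GLOBAL innodb_max_purge_lag = 400000;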
- How do you explain all these observations?..
- What kind of data occupies the Buffer Pool in the worst cases?..
- Aren't we wasting memory here? ;-)
- As long as purge is able to keep up with the workload activity, everything works well..
- So keep an eye on your purge activity, and configure your server to purge as fast as possible ;-)
To be continued.. :-))