I believe the issue is not related to long-running queries.
We found this severely annoying and now use MySQL's event scheduler to update a timer table on each master every second, so we can see the actual delay from the global master (in a non-ring topology) or from any peer in a ring.

[...] and work your way up as read performance increases, without reaching "Waiting for query cache lock" on writes.

"Be cautious about sizing the query cache excessively large, which increases the overhead required to maintain the cache, possibly beyond the benefit of enabling it. Sizes in the hundreds of megabytes might not be." is not a useful answer.

In basic terms, the processing steps involve compressing time series data from per-second resolution into per-minute, per-hour and per-day resolutions. It is Query #3 that eventually produces the error "", presumably because it is the first one to time out.

When MySQL locks up, I execute the SHOW PROCESSLIST command and see the following queries:

N  User         Time  Status  SQL query
1  system user  XX    update  INSERT INTO `dbA`.`tableA` (...) VALUES (...)
2  ????

It looks like some sort of deadlock, but I cannot understand why.
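
To make the compression step mentioned above concrete, here is a minimal sketch of rolling per-second samples up into per-minute buckets. It is only an illustration: the thread does not say which aggregate is used, so averaging is an assumption, and the function name `downsample` and the sample data are hypothetical.

```python
from collections import defaultdict


def downsample(samples, bucket_seconds):
    """Average (unix_timestamp, value) samples into fixed-width buckets.

    Assumption: the roll-up is a plain mean per bucket; the original
    post does not state which aggregate it computes.
    """
    # bucket start timestamp -> [running sum, sample count]
    sums = defaultdict(lambda: [0.0, 0])
    for ts, value in samples:
        bucket_start = ts - (ts % bucket_seconds)
        acc = sums[bucket_start]
        acc[0] += value
        acc[1] += 1
    # Return buckets in chronological order as (bucket_start, mean).
    return [(start, total / count)
            for start, (total, count) in sorted(sums.items())]


# Hypothetical input: two minutes of per-second readings.
per_second = [(t, float(t % 60)) for t in range(0, 120)]

per_minute = downsample(per_second, 60)
per_hour = downsample(per_minute, 3600)
```

Feeding the per-minute output back in to get per-hour values only gives a true hourly mean when every minute holds the same number of samples; with gaps in the data, it would be safer to aggregate from the per-second rows directly or to carry the counts along.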