Opened 12 months ago
Last modified 12 months ago
#51372 new defect (bug)
Race condition causes an autoload option to leak outside of alloptions
Reported by: | kovshenin | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | 2.2 |
Component: | Cache API | Keywords: | dev-feedback |
Focuses: | Cc: |
Description
There is a problem with add_option()
which causes an autoloaded item to leak outside of the alloptions
cache key and into its own item under the options
group.
This becomes pretty broken with persistent object caching, because it leads to a state, where the option is stuck under its own cache key in the options
group, and is unable to be deleted or updated, while the underlying database value is completely gone.
I've been able to reproduce this in multiple ways, with persistent object caching turned on or off. The backend I used in my testing is wp-memcached, but it should be reproducable with any other backend.
The first and easiest way is to just add an error_log()
to wp-includes/options.php in get_option()
right before wp_cache_add()
that writes to the options
group:
function get_option() { ... if ( is_object( $row ) ) { ... if ( $option == 'foo' ) { error_log( 'leaked' ); } wp_cache_add( $option, $value, 'options' );
Now in a simple plugin we can start writing the foo
option, like this:
add_action( 'init', function() { delete_option( 'foo' ); add_option( 'foo', 'bar' ); die(); } );
The autoload
flag defaults to true
, so in this scenario the foo
item should always end up in the alloptions
cache key. However, if you run this in multiple parallel threads (with ab
for example), you'll see the leaked
message in your error log, which means foo
has been written to the foo
key under the options
group.
ab -c 100 -n 1000 http://localhost/
Another way to reproduce this is to turn on a persistent object caching plugin, visit the site once to trigger a single add_option()
, then check some of site options and cache keys:
wp option get foo # says bar wp cache get foo options # error, because foo is autoloaded and stored in alloptions wp cache get alloptions options # big array with foo => bar at the end
You can also confirm the database value is there:
In MySQL:
SELECT * FROM wp_options WHERE option_name = 'foo'; +-----------+-------------+--------------+----------+ | option_id | option_name | option_value | autoload | +-----------+-------------+--------------+----------+ | 2403 | foo | bar | yes | +-----------+-------------+--------------+----------+
Looks good so far.
Now run the same ab
test with concurrent requests, and check again.
wp option get foo # says bar wp cache get foo options # says bar, because the value leaked from alloptions into its own item wp cache get alloptions options # big array, but NO foo => bar
And finally, in MySQL shell:
SELECT * FROM wp_options WHERE option_name = 'foo'; Empty set (0.00 sec)
At this point the value is gone, and only remains as a stale item in Memcached under the wrong key. Deleting, adding, or updating the item will not work:
$ wp option delete foo Warning: Could not delete 'foo' option. Does it exist? $ wp option add foo bar Error: Could not add option 'foo'. Does it already exist? $ wp option update foo baz Error: Could not update option 'foo'.
Flushing Memcached (or having the key evicted) will "fix" the stalemate, but will cause the data to be lost forever.
Here's a brief overview of what could happen inside add_option()
in two concurrent threads:
Thread 1: add_option( 'foo', 'bar' ); Thread 2: add_option( 'foo', 'bar' ); 1: .. get_option( 'foo' ) // false 1: .. INSERT INTO // true 2: .. get_option( 'foo' ) 2: .. .. isset( $alloptions[ 'foo' ] ) // false 2: .. .. ->get_row() 2: .. .. wp_cache_add( 'foo', 'bar', 'options' ); // LEAKED 1: $alloptions[] = ... 1: wp_cache_set( 'alloptions', $alloptions, 'options' );
This is then followed by the next delete_option()
call, which successfully deletes the data from the database and the alloptions
key, so then future calls to add_option()
will fail, because get_option()
will always return the data from cache.
I'm not sure about the best way to fix it. Here are a few thoughts:
- Maybe
add_option()
should not rely so heavily onget_option()
, and do its own checks with cache functions, depending on theautoload
function argument delete_option()
could clear both thealloptions
array item as well as the$option
key in theoptions
group- In
get_option()
if we were unable to retrieve the data from$alloptions
and option cache, maybe query theautoload
column together withoption_value
, before just assuming it's a no:
$row = $wpdb->get_row( $wpdb->prepare( "SELECT option_value, autoload ...
Then handle the cache addition differently, based on that autoload flag. This sounds the most reasonable to me, because this is exactly the place where the item is being put into the wrong key, which causes the other problems.
I haven't actually tested this with 2.2, but it seems like that's where the alloptions
vs options
cache keys appeared inside get_option()
right around r4855.
Related: #51352