XML-RPC memory improvements
I’ve been looking into #21098 since yesterday, trying to figure out how to minimize out-of-memory errors on media uploads through XML-RPC. The results were a bit unexpected, but interesting nonetheless. Any ideas on how to reduce memory usage are more than welcome.
I’m still not sure where all the memory goes: a 1MB file upload takes 50MB of memory, while a 10MB file takes 153MB. Base64 encoding adds only about 33% overhead, so that’s not it. It also seems that even if you disable $HTTP_RAW_POST_DATA, PHP keeps the whole POST body in memory, but that still leaves a lot of memory unaccounted for.
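As a quick sanity check on the base64 figure, here is a minimal Python sketch. It is not part of the original tests, and the helper name `base64_overhead` is made up for illustration. It encodes a zero-filled payload of a given size and reports how much larger the encoded form is.

```python
import base64


def base64_overhead(raw_size: int) -> float:
    """Return the fractional size increase from base64-encoding raw_size bytes."""
    raw = bytes(raw_size)  # zero-filled payload; the content does not affect the size
    encoded = base64.b64encode(raw)
    return (len(encoded) - raw_size) / raw_size


if __name__ == "__main__":
    one_mb = 1024 * 1024
    print(f"base64 overhead for 1MB: {base64_overhead(one_mb):.2%}")
```

Base64 maps every 3 input bytes to 4 output characters, so the overhead stays close to one third regardless of file size, far below the 50x ratio seen for the 1MB upload.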
Fix #1: read from php://input directly
After a lot of playing with xdebug, I found that IXR_Message->parse() didn’t look too efficient, so I tried https://gist.github.com/koke/5720860.
The current implementation relies heavily on substr, which works fine for smaller requests, but not so well for media uploads. In early testing, my solution seemed to require a bit more memory on small requests, but it performed much better as the request size increased.
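To illustrate the general idea of feeding a parser from a stream instead of slicing one large string, here is a minimal Python sketch. It is an analogue, not the PHP code from the gist: it uses Python's expat bindings, and the function name `parse_in_chunks` and the chunk size are assumptions for illustration. PHP's xml_parse() similarly accepts data in pieces with a final-chunk flag.

```python
import io
import xml.parsers.expat


def parse_in_chunks(stream, chunk_size=64 * 1024):
    """Feed an XML byte stream to expat chunk by chunk; return element names seen."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Only the current chunk is handed to the parser, not the whole body.
        parser.Parse(chunk, False)
    parser.Parse(b"", True)  # signal end of document
    return names


if __name__ == "__main__":
    body = b"<methodCall><methodName>wp.testUpload</methodName><params/></methodCall>"
    print(parse_in_chunks(io.BytesIO(body), chunk_size=16))
```

The point of the design is that no copy of the remaining request body has to be made on each step, which is where a substr-based loop spends time and memory as the body grows.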
How I tested
I added this plugin (xmlrpc-test-upload.php), which adds a new wp.testUpload method: it replicates metaWeblog.newMediaObject, but it also returns peak memory usage. I had some problems with integer overflows, so I ended up using the values written to the error log.
Then I wrote a quick client in test.php and created a bunch of blank videos with file sizes from 1 to 200MB. For instance, to get a ~1MB video:
ffmpeg -t 32 -s 1280x720 -f rawvideo -pix_fmt rgb24 -r 25 -i /dev/zero -y test-1M.mpg
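The test.php client is not reproduced here. As a rough idea of what such a client has to send, this is a minimal Python sketch that builds the XML-RPC request body. It assumes wp.testUpload takes the same parameters as metaWeblog.newMediaObject (blog ID, username, password, and a struct with name, type, bits and overwrite); the function name `build_upload_request` and the credentials are hypothetical.

```python
import xmlrpc.client


def build_upload_request(name, mime_type, data, blog_id=1,
                         username="admin", password="secret"):
    """Build the XML body for a wp.testUpload call (credentials are placeholders)."""
    media = {
        "name": name,
        "type": mime_type,
        "bits": xmlrpc.client.Binary(data),  # serialized as a base64 element
        "overwrite": True,
    }
    return xmlrpc.client.dumps(
        (blog_id, username, password, media), methodname="wp.testUpload"
    )


if __name__ == "__main__":
    body = build_upload_request("test-1M.mpg", "video/mpeg", b"\x00" * 1024)
    print(len(body), "characters in the request body")
```

The resulting string is what would be POSTed to xmlrpc.php with a text/xml content type; the whole file travels inline as base64 inside the XML.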
Memory results
The memory reductions weren’t as good as I expected, but still worth a try.
Peak memory usage
File size (MB) | Baseline | php://input | Reduction |
1 | 50M | 49M | 2.09% |
2 | 61M | 58M | 4.31% |
5 | 94M | 88M | 6.96% |
10 | 153M | 139M | 9.08% |
50 | 649M | 580M | 10.63% |
100 | 1267M | 1129M | 10.87% |
200 | 2503M | 2228M | 10.99% |
Time results
While running the tests by trial and error, I noticed the new version felt faster, so I measured that as well. The results were more pleasing this time:
Response time (s)
File size (MB) | Baseline | php://input | Reduction |
1 | 0.189 | 0.197 | -4.30% |
2 | 0.228 | 0.202 | 11.41% |
5 | 0.330 | 0.282 | 14.34% |
10 | 0.620 | 0.414 | 33.34% |
50 | 5.48 | 1.26 | 76.92% |
100 | 18.93 | 2.34 | 87.66% |
200 | 93.23 | 5.39 | 94.22% |
What’s next
Even after this, I’m still not sure where most of the memory goes. Running an xdebug trace on the new code with a 200MB file upload shows:
- When xmlrpc.php is called, xdebug already reports a memory usage of 786MB.
- Once WordPress is initialized and IXR_Server->serve() is called, memory usage is 1083MB.
- Just before IXR finishes parsing, it reaches peak memory usage at 1607MB.
- At the end of execution memory usage is 1345MB.
I’ve uploaded the xdebug trace (trace-200.txt) in case someone else can spot other improvements.
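To put the trace readings in proportion to the upload size, here is a small Python sketch that computes the change between stages and the ratio to the 200MB file. The numbers are the ones quoted from the trace; the names `TRACE_STAGES` and `summarize` are made up for illustration.

```python
# Memory readings (MB) quoted from the xdebug trace of the 200MB upload.
TRACE_STAGES = [
    ("xmlrpc.php called", 786),
    ("IXR_Server->serve() called", 1083),
    ("just before IXR finishes parsing (peak)", 1607),
    ("end of execution", 1345),
]
UPLOAD_MB = 200


def summarize(stages, upload_mb):
    """Return (label, memory_mb, delta_mb, memory / upload size) for each stage."""
    rows = []
    previous = 0
    for label, memory_mb in stages:
        rows.append((label, memory_mb, memory_mb - previous, memory_mb / upload_mb))
        previous = memory_mb
    return rows


if __name__ == "__main__":
    for label, memory_mb, delta_mb, ratio in summarize(TRACE_STAGES, UPLOAD_MB):
        print(f"{label}: {memory_mb}MB ({delta_mb:+d}MB, {ratio:.2f}x the file size)")
```

On these numbers, memory is already about 3.9 times the file size before any WordPress code runs, and parsing adds roughly another 524MB on top.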
daniloercoli 12:52 pm on June 6, 2013
PHP always populates the variable $HTTP_RAW_POST_DATA on POST requests with a “text/xml” content type, even if you set the always_populate_raw_post_data directive to Off. The only exceptions are requests with a content type of “application/x-www-form-urlencoded” or “multipart/form-data”. I investigated further, and it seems there isn’t a simple way to get rid of $HTTP_RAW_POST_DATA without modifying the PHP source code and recompiling it.
I’ve also written a plugin that does the following:
https://github.com/daniloercoli/WordPress-streaming-xml-rpc
Jorge Bernal 1:49 pm on June 6, 2013
I tried setting application/x-www-form-urlencoded as well, but even if $HTTP_RAW_POST_DATA was empty, memory usage was similar.
I have the feeling that PHP stores the POST body in memory unless it’s a “multipart/form-data” upload, so I’m not sure there’s really anything we can do there without switching to a different API.
I’m trying to test your plugin, but I’m running into a problem opening the temp files.
Jorge Bernal 2:11 pm on June 6, 2013
I tried your plugin, and it peaked at 1085MB (vs 1607MB) for a 200MB file. So it was a bit more memory-efficient, although a bit slower (9s vs 5s).
But the interesting part is looking at the trace: when it starts, memory is already at 786MB, so it seems PHP is keeping everything in memory anyway.
Jorge Bernal 2:19 pm on June 6, 2013
More funny bits, if I set “Content-Type: multipart/form-data”:
Cheating? I have to read more about the spec, but the file was uploaded successfully.
daniloercoli 2:29 pm on June 6, 2013
Dan 2:50 pm on June 6, 2013
That’s a funny hack. Think that’d work on all hosts?
Jorge Bernal 2:57 pm on June 6, 2013
I’m getting
But it seems to work.
Eric 1:13 pm on June 6, 2013
That is a huge improvement in response time! For large files anyway.