From:  php_lists@realplain.com ("Matt Wilmas")
Date:  08 Dec 2015 00:54:00 Hong Kong Time
Newsgroup:  news.php.net/php.internals.win
Subject:  Re: [PHP-DEV] Windows (Visual Studio) compiler stuff


Hi Anatol, all,

CFG's effect on WordPress is at the end... :-/

----- Original Message -----
From: "Anatol Belski"
Sent: Wednesday, November 25, 2015

> Hi Matt,
>
> I wonder really how much research you do :)

Not much on this...  Hope there aren't major inaccuracies.

I just came across stuff while doing other things, otherwise maybe I would 
have discovered this sooner!

>> -----Original Message-----
>> From: Matt Wilmas [mailto:php_lists@realplain.com]
>> Sent: Monday, November 23, 2015 2:28 AM
>> To: internals@lists.php.net; internals-win@lists.php.net
>> Cc: Dmitry Stogov; Anatol Belski; Pierre Joye; Matt Tait; Nikita Popov
>> Subject: Re: [PHP-DEV] Windows (Visual Studio) compiler stuff
>>
>> Hi Anatol, Dmitry, all,
>>
>> Will reply about the original subject issues soon, but this is about new
>> stuff I noticed the other day...  Adding Matt Tait and Nikita because of PR
>> #1418 and comments.
>>
>> Anyway, the new Control Flow Guard (/guard:cf) is causing a big slowdown on
>> bench.php. :-(  14% on a Yorkfield (Q9400) and 19% on a Sandy Bridge
>> (Celeron G530).  Ouch.  Did anyone else check the performance impact?  Is
>> this acceptable?  On any other platform...?
>>
>> I'll definitely remove that from my builds (Elephpant Sanctuary, coming
>> soon) since it's useless on all but the latest Windows versions anyway.
>>
>> But if that "feature" must remain enabled otherwise, I think we can
>> eliminate most of the performance hit.  As Nikita wondered about, I first
>> wanted to look at the indirect calls to the opcode handlers.  I tried
>> separating out zend_execute.c in the Makefile and added /guard:cf-  Bingo!
>> That restored about 98% of the speed on bench.php.  It reduced the
>> --disable-all NTS DLL by 13.5 KB (of the 67 KB added by full CFG).
>>
>> Or could maybe change back to the old SWITCH executor?  I didn't try that.
>>
>>
>> It seems like it would be a good "rule" to not use any MS stuff that isn't
>> done on other compilers/platforms. :-)
>>
>> /GS [1] is another that is/was starting to get annoying (function
>> prolog/epilog); luckily I was able to suppress it in most cases with
>> changes I'm making.  It's enabled by default, of course, although I see
>> it's commented out in a line (old?) of confutils.js.  /GS-   ;-)   I really
>> hope there aren't places where we are not doing range checks, etc.
>> ourselves (that the compiler can't see).  So, either /GS is a waste, or
>> it's only a matter of time with other compilers?!
>>
>> [1]
>> http://stackoverflow.com/questions/6607410/understanding-buffer-security-check-gs-compiler-option-in-msvc
>>
> We're unlikely to remove the security options in favor of performance. But
> that's for one.

I didn't expect that it would be totally removed (though I will since I 
consider it a useless MS "feature" *when applied to PHP*; again, to not use 
anything that doesn't exist for other platform builds).  It'd be a different 
story if it didn't kill performance.

That's why I pointed out that removing it from zend_execute.c ONLY (and 
nowhere else) eliminates most of the penalty.  Doesn't look like there are 
many indirect calls to worry about besides the executor.

Are we saying that an exploit is going to modify the opcodes, etc.?  Has 
that ever happened (serious question)?


And since I mentioned using the old SWITCH executor, I checked and it's no 
good.  The specialized version is ~2.3x slower than CALL! :-O  The 
--without-specializer version takes *forever* to compile (10x longer) and is 
much faster than the specialized one, but still ~2% slower than CALL with 
CFG...
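
For anyone not familiar with the two executor styles, here is a minimal 
sketch in C of what "CALL" and "SWITCH" dispatch mean.  It is a toy VM under 
my own assumptions: every name in it (toy_op, run_call, run_switch, ...) is 
hypothetical and none of it is taken from zend_execute.c.  The point is only 
that the CALL style makes one indirect call per opcode, which is the kind of 
call /guard:cf instruments, while the SWITCH style makes none.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy VM; none of these names come from the Zend Engine. */
typedef struct toy_vm toy_vm;
typedef struct toy_op toy_op;
typedef int (*toy_handler)(toy_vm *vm, const toy_op *op);

enum { TOY_ADD, TOY_MUL, TOY_HALT };

struct toy_op {
    toy_handler handler; /* used by the CALL-style loop */
    int opcode;          /* used by the SWITCH-style loop */
    long operand;
};

struct toy_vm {
    long acc; /* single accumulator register */
};

/* Handlers return 1 to stop execution and 0 to continue. */
static int toy_add(toy_vm *vm, const toy_op *op) { vm->acc += op->operand; return 0; }
static int toy_mul(toy_vm *vm, const toy_op *op) { vm->acc *= op->operand; return 0; }
static int toy_halt(toy_vm *vm, const toy_op *op) { (void)vm; (void)op; return 1; }

/* CALL style: one indirect call through a function pointer per opcode. */
static long run_call(const toy_op *ops, long start) {
    assert(ops != NULL);
    toy_vm vm = { start };
    for (const toy_op *op = ops; !op->handler(&vm, op); op++) {
        /* all the work happens inside the handler */
    }
    return vm.acc;
}

/* SWITCH style: the opcode number selects a case, no indirect call. */
static long run_switch(const toy_op *ops, long start) {
    assert(ops != NULL);
    toy_vm vm = { start };
    for (const toy_op *op = ops; ; op++) {
        switch (op->opcode) {
        case TOY_ADD: vm.acc += op->operand; break;
        case TOY_MUL: vm.acc *= op->operand; break;
        default:      return vm.acc; /* TOY_HALT or unknown opcode */
        }
    }
}
```

Both loops run the same program, an array of toy_op ending in a TOY_HALT 
entry, and return the same accumulator; only the dispatch mechanism differs.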

> /guard:cf is documented to have possible performance impact on systems that
> don't support it.

Why only on systems that don't support it?  That doesn't make any sense.  If 
anything, I'd expect the opposite.  I didn't try to check the code, but 
wonder why it's not more nop-ish on unsupported systems...  And I don't know 
if it's just extra instructions slowing things down, or if a lookup table or 
whatever for valid targets is destroying the cache.
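
To make the "extra instructions vs. lookup table" question concrete, here is 
a conceptual sketch in C of a checked indirect call.  It is only a model 
under my own assumptions: the allow list and the names (valid_targets, 
guarded_call, ...) are hypothetical, and this is not how MSVC or Windows 
actually implement the check.  It just shows that every guarded call pays for 
a lookup before the real call happens.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*unary_fn)(long);

static long twice(long x) { return 2 * x; }
static long negate(long x) { return -x; }
static long not_registered(long x) { return x + 1; }

/* Hypothetical allow list standing in for a table of valid call targets. */
static const unary_fn valid_targets[] = { twice, negate };

/* Linear scan for clarity; this is the extra work done before each call. */
static int is_valid_target(unary_fn fn) {
    for (size_t i = 0; i < sizeof valid_targets / sizeof valid_targets[0]; i++) {
        if (valid_targets[i] == fn) {
            return 1;
        }
    }
    return 0;
}

/* Returns 0 and stores the result on success, or -1 if the target is
 * rejected (in which case *result is left untouched). */
static int guarded_call(unary_fn fn, long arg, long *result) {
    assert(result != NULL);
    if (!is_valid_target(fn)) {
        return -1;
    }
    *result = fn(arg);
    return 0;
}
```

In this model, guarded_call(twice, 21, &r) succeeds and stores 42, while 
guarded_call(not_registered, 1, &r) is refused.  An interpreter that 
dispatches every opcode through a function pointer would run such a check 
once per opcode.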

> However no such side effects was noticed even on win7.

So what did you see...?  I said 19% slowdown on bench.php, Sandy Bridge, Win 
7 x64, 32-bit.  Have since checked 64-bit build and that's 20%.

> There was also no bug reports in this regard.

Bug reports against what...?  Probably nobody was looking, and first there 
was the switch from VS 2012 to 2015, and then CFG was enabled.  I guess I 
could file one now that I've tested WordPress?

> We definitely can't test any
> possible HW, but it's more about OS, not HW.

I'd wager the opposite...  It's affecting CPU stuff, after all.

> Is win7 your case? Then just upgrade :)

Yeah, on the one system.  The Yorkfield is Windows XP *gasp*.  I definitely 
won't just "upgrade" any system unless there are overall reasons to want it, 
and not as long as I'm not satisfied with desktop Linux, Hackintosh, ....

Anyway, I *need* XP, and there's really no legitimate reason to not support 
it (only FUD/lies, sorry ;-))...  I was just waiting on MS bugs [1][2] to 
allow VC14 to be used, which 2015 Update 1 fixes! :-O  Even full-of-XP-lies 
Microsoft still supports the VC runtime, at least.

[1] 
https://social.msdn.microsoft.com/Forums/vstudio/en-US/52b0c797-6fa7-4933-8bb2-fe90e8764e27/visual-c-2015-express-stat-not-working-on-windows-xp?forum=vcgeneral
[2] 
https://connect.microsoft.com/VisualStudio/feedback/details/1557168/wstat64-returns-1-on-xp-always


To give another idea of how much CFG hurts: it takes bench.php performance 
back to the VS 2008 level (VS 2008 still works with just a few more 
changes).  7 years of compiler improvement just eliminated!

> With /GS is basically same. It's not supposed to fix the programmer
> mistakes, but to add protection against exploits. Stability and
> compatibility matters more than a performance trade off.

I didn't check anything (performance) with /GS-, I'm just complaining about 
the extra code /GS adds to functions. :-)  (Again, useless in almost all 
cases since we already have checks...)

> Another thing is that just one synthetic test is unlikely to reveal the big
> picture. You should probably also test on some real apps, that will bring
> more realistic results.

WordPress 4.3.1 results (MySQL 5.5, no query cache):

~5% slower on Win 7, 32-bit, Sandy Bridge
6-7% slower on Win XP, Yorkfield

(Had to fix WordPress to allow persistent connections ("p:localhost"), since 
Windows connections (TCP and pipes) have been EXTREMELY slow for some years 
(5-10), for some reason. :-(  Dunno if it started with mysqlnd or what, never 
really investigated.)

That's a pretty big deal, considering the work that has been done for 1-2% 
speedups ("big enough" or "worthwhile" improvements).

Obviously anything that uses opcodes is affected (read: everything).

> Thanks for your work.
>
> Regards
>
> Anatol

- Matt