From:  Eric Rahm <erahm@mozilla.com>
Date:  14 Nov 2015 09:09:30 Hong Kong Time
Newsgroup:  news.mozilla.org/mozilla.dev.memory
Subject:  Re: e10s Memory Usage w/ Multiple Content Processes

NNTP-Posting-Host:  63.245.214.181

Latest numbers on OS X. This was with a custom m-c opt build that added a resident-unique reporter. Values are in bytes; columns are the number of content processes (0 is non-e10s):

                            0           1           2           4           8
Start               334639104   367407104   358420480   352612352   352194560
StartSettled        326307840   412536832   401289216   402436096   400433152
TabsOpen            932483072  1088946176  1303126016  1464836096  1776570368
TabsOpenSettled     918941696  1024069632  1158631424  1312661504  1710915584
TabsOpenForceGC     833212416  1012817920  1149247488  1294901248  1614741504
TabsClosed          832528384  1043906560  1024188416   932352000   926244864
TabsClosedSettled   773730304   969981952   918355968   863203328   872923136
TabsClosedForceGC   651497472   838488064   837312512   791310336   783257600

% increase from non-e10s:

                         1        2        4        8
Start                9.79%    7.11%    5.37%    5.25%
StartSettled        26.43%   22.98%   23.33%   22.72%
TabsOpen            16.78%   39.75%   57.09%   90.52%
TabsOpenSettled     11.44%   26.08%   42.84%   86.18%
TabsOpenForceGC     21.56%   37.93%   55.41%   93.80%
TabsClosed          25.39%   23.02%   11.99%   11.26%
TabsClosedSettled   25.36%   18.69%   11.56%   12.82%
TabsClosedForceGC   28.70%   28.52%   21.46%   20.22%


The takeaway:
  • OS X is using roughly 2X the memory of Linux (even in the non-e10s case)
  • % increases are somewhat in line w/ Linux; there's less overhead for the 'Start' case, but after 30s (StartSettled) that win is lost
Up next I'll take a look at Windows.
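
For anyone who wants to check the OS X vs. Linux comparison, here is a quick Python sketch that divides the non-e10s (0 content processes) column of the OS X table by the same column of the Linux table quoted below. The variable and function names are only for illustration; the byte counts are copied from the two tables.

```python
# Non-e10s (0 content processes) byte counts, copied from the two tables.
CHECKPOINTS = [
    "Start", "StartSettled", "TabsOpen", "TabsOpenSettled",
    "TabsOpenForceGC", "TabsClosed", "TabsClosedSettled", "TabsClosedForceGC",
]
OSX_NON_E10S = [334639104, 326307840, 932483072, 918941696,
                833212416, 832528384, 773730304, 651497472]
LINUX_NON_E10S = [199389184, 181604352, 478851072, 470085632,
                  434724864, 404824064, 277299200, 253362176]


def osx_to_linux_ratios():
    """Return {checkpoint: OS X bytes / Linux bytes} for the non-e10s runs."""
    return {
        name: osx / linux
        for name, osx, linux in zip(CHECKPOINTS, OSX_NON_E10S, LINUX_NON_E10S)
    }


for name, ratio in osx_to_linux_ratios().items():
    print(f"{name:<18} {ratio:.2f}x")
```

The ratio is not constant across checkpoints, so "2X" is best read as a ballpark figure.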

On Fri, Nov 6, 2015 at 4:07 PM, Eric Rahm <erahm@mozilla.com> wrote:
As part of a Q4 goal I'm digging into the memory usage of Firefox w/ and w/o e10s enabled. Below I will give preliminary results for Linux.

To do this I updated AWSY to handle multiple content processes and did an initial test run against a linux64 nightly from 11/3. For the memory comparison I settled on using a combination of the RSS (resident set size) value for the main process and the USS (unique set size) values for the content processes.

For example:
memory_footprint_2_content_processes = rss_main + uss_content_1 + uss_content_2
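
The same sum for any number of content processes, as a minimal Python sketch. The function name and the sample byte counts are made up for illustration and are not AWSY's actual code.

```python
def memory_footprint(rss_main, uss_contents):
    """Total footprint: RSS of the main process plus USS of each content process."""
    return rss_main + sum(uss_contents)


# Hypothetical byte counts for a main process and two content processes.
total = memory_footprint(200_000_000, [15_000_000, 18_000_000])
print(total)
```

With an empty list this reduces to the main-process RSS, which corresponds to the 0 content process (non-e10s) case.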

The following table gives the total memory consumption, in bytes, at each AWSY checkpoint for 0, 1, 2, 4, and 8 content processes:

                            0           1           2           4           8
Start               199389184   243347456   233750528   234110976   240480256
StartSettled        181604352   230133760   226234368   229756928   223780864
TabsOpen            478851072   570781696   614490112   748322816   913575936
TabsOpenSettled     470085632   568053760   610627584   730226688   913879040
TabsOpenForceGC     434724864   534331392   587403264   702947328   860110848
TabsClosed          404824064   531988480   420941824   399577088   399527936
TabsClosedSettled   277299200   375967744   340418560   322555904   317845504
TabsClosedForceGC   253362176   337911808   318586880   298467328   294383616


As with all AWSY measurements there's a fair amount of variance b/w runs (for example a 25 MiB swing in TabsOpen is not unheard of, but that works out to about 5%).
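
A quick check of that figure in Python, assuming the swing is measured against the non-e10s TabsOpen value from the table (that choice of baseline is an assumption here, and the names are illustrative):

```python
MIB = 1024 * 1024


def swing_pct(swing_mib, baseline_bytes):
    """Express a swing given in MiB as a percentage of a baseline given in bytes."""
    return swing_mib * MIB / baseline_bytes * 100.0


# 25 MiB against the non-e10s TabsOpen measurement (478851072 bytes).
print(f"{swing_pct(25, 478851072):.1f}%")
```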

Another way to look at this is % increase from non-e10s:

                         1        2        4        8
Start               22.05%   17.23%   17.41%   20.61%
StartSettled        26.72%   24.58%   26.52%   23.22%
TabsOpen            19.20%   28.33%   56.27%   90.78%
TabsOpenSettled     20.84%   29.90%   55.34%   94.41%
TabsOpenForceGC     22.91%   35.12%   61.70%   97.85%
TabsClosed          31.41%    3.98%   -1.30%   -1.31%
TabsClosedSettled   35.58%   22.76%   16.32%   14.62%
TabsClosedForceGC   33.37%   25.74%   17.80%   16.19%
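
The percentages are plain relative increases over the 0 content process column. A short Python sketch, using three rows copied from the byte table above (the function and variable names are illustrative):

```python
def pct_increase(baseline, value):
    """Relative change of value over baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Byte counts for 0, 1, 2, 4, 8 content processes (Linux table).
LINUX = {
    "Start":      [199389184, 243347456, 233750528, 234110976, 240480256],
    "TabsOpen":   [478851072, 570781696, 614490112, 748322816, 913575936],
    "TabsClosed": [404824064, 531988480, 420941824, 399577088, 399527936],
}

for checkpoint, (baseline, *with_e10s) in LINUX.items():
    cells = "  ".join(f"{pct_increase(baseline, v):6.2f}%" for v in with_e10s)
    print(f"{checkpoint:<12} {cells}")
```

A negative result, as in the TabsClosed row for 4 and 8 processes, means the e10s run ended up below the non-e10s baseline.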


A few loose observations:
  • There's a ~20-25% overhead right off the bat just by having 1 content process running (that's the case for Start measurements)
  • The growth is mostly sublinear in the number of content processes
  • It actually seems like 2 content processes might have a good tradeoff. Naively we'd expect the % increase for the TabsOpen cases to double when going from 1 to 2 processes, but that's clearly not the case.
  • The more content processes that were used, the better we did on the TabsClosed metrics

I will be continuing my measurements on OS X and Windows next. Let me know of any other stats that might be useful (heap-unclassified comes to mind).

-e