Edgewall Software

Ticket #4399 (new defect)

Opened 2 years ago

Last modified 2 months ago

trac.fcgi process memory usage increases with HTTP hits

Reported by: Pistos Owned by: jonas
Priority: high Milestone: not applicable
Component: general Version: 0.11.1
Severity: critical Keywords: memory
Cc: docwhat@…

Description

# ps aux | grep trac | grep -v grep
apache   28347  1.0  1.8  51164 13816 ?        S    22:39   0:01 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.0  1.8  51164 13816 ?        S    22:39   0:01 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.2  1.8  51752 14420 ?        S    22:39   0:01 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.4  1.9  51992 14600 ?        S    22:39   0:02 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.6  1.9  52104 14760 ?        S    22:39   0:02 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.7  1.9  52320 14848 ?        S    22:39   0:02 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.8  1.9  52320 14856 ?        S    22:39   0:02 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.8  1.9  52320 14944 ?        S    22:39   0:03 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi

The output above shows the memory usage of the trac.fcgi process rising after every page refresh of a subtree in my svn repository. I first noticed this behaviour two or three weeks ago. I also have several lines in /var/log/messages showing the Linux OOM killer killing trac.fcgi processes that I had not noticed in time to SIGKILL myself; these kernel kills happen roughly once every one or two days.

At first I thought this had to do with the svn problems that were fixed from 0.10.2 -> 0.10.3, but I just upgraded, and the behaviour remains. The memory also increases when refreshing a ticket list.

I periodically upgrade the packages on my system, so an upgrade may have triggered this, e.g. a Python upgrade, a library upgrade, or some such.

This is on a Gentoo server (kernel version 2.6.17), running FastCGI (2.4.0) under Apache (2.0.58). Python version 2.4.3.
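
To put numbers on the growth shown in the ps listing, the rows can be parsed and the RSS column diffed between samples. This is only a minimal sketch: the function names `parse_ps_line` and `rss_growth` are made up for illustration, it assumes the usual `ps aux` column order (USER, PID, %CPU, %MEM, VSZ, RSS, ...), and it is written for a current Python rather than the 2.4 mentioned above.

```python
def parse_ps_line(line):
    """Return (pid, vsz_kb, rss_kb) from one `ps aux` row."""
    # Split at most 10 times so a COMMAND containing spaces stays in one field.
    fields = line.split(None, 10)
    return int(fields[1]), int(fields[4]), int(fields[5])


def rss_growth(lines):
    """Return the RSS change in KB between consecutive samples."""
    samples = [parse_ps_line(line) for line in lines if line.strip()]
    return [later[2] - earlier[2] for earlier, later in zip(samples, samples[1:])]


# Four of the rows from the listing above, copied verbatim.
SAMPLES = [
    "apache   28347  1.0  1.8  51164 13816 ?        S    22:39   0:01 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi",
    "apache   28347  1.2  1.8  51752 14420 ?        S    22:39   0:01 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi",
    "apache   28347  1.4  1.9  51992 14600 ?        S    22:39   0:02 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi",
    "apache   28347  1.6  1.9  52104 14760 ?        S    22:39   0:02 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi",
]
```

Feeding `SAMPLES` to `rss_growth` gives the per-refresh RSS increase in KB, which is easier to compare across setups than eyeballing the raw listing.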

Attachments

Change History

  Changed 2 years ago by Pistos

  • summary changed from trac.fcgi process memory usage increases with repo browser hits to trac.fcgi process memory usage increases with HTTP hits

  Changed 2 years ago by anonymous

Any thoughts on this issue? It remains a problem up to today, and I am having to SIGKILL trac processes at least once a day.

  Changed 2 years ago by Pistos

Looks like ticket:4081 is related.

  Changed 2 years ago by cboos

  • keywords memory added
  • severity changed from normal to major
  • milestone set to none

Is the PySqlite db backend involved in this case? If yes, what versions of the bindings and the sqlite library itself are in use?

  Changed 2 years ago by Pistos

Yes, it's an SQLite backend. sqlite-3.3.5, pysqlite-2.3.1.

  Changed 2 years ago by Pistos

It may also be worth mentioning that this behaviour was not always present. I've run Trac in the past without this issue. I think that was with the 0.9 series.

  Changed 2 years ago by Pistos

Further data: when I only SIGTERM the process instead of SIGKILLing it, it grows in memory at an alarming rate, in the area of 1-2 MB per second.

  Changed 22 months ago by Pistos

I've used an SQLite to PostgreSQL Trac converter script to change from SQLite to PostgreSQL as my backend.

It doesn't help the problem. I am still having to SIGKILL two to five times a day (or else trac.fcgi processes consume 15-40% of my RAM). This is very annoying and inconvenient...

  Changed 22 months ago by Pistos

Switching to .cgi and/or re-emerging trac with the postgres USE flag (in Gentoo) seems to have made the problem go away. So, it looks like there may be a problem with FastCGI, my FastCGI settings, and/or Trac's usage of FastCGI.

  Changed 21 months ago by mgood

Memory will never accumulate when using CGI since it starts a new process for each request, so the processes are very short-lived.
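
A common stop-gap for long-lived workers is to recycle the process after a fixed number of requests, which approximates the short-lived behaviour of CGI without paying the start-up cost on every hit. The following is a minimal sketch assuming a standard WSGI application; `RequestLimit` and `on_limit` are hypothetical names, not part of Trac or of any FastCGI library, and how the hook makes a particular FastCGI server shut down gracefully is deliberately left out.

```python
import threading


class RequestLimit(object):
    """WSGI middleware that fires a hook once `max_requests` have been handled."""

    def __init__(self, app, max_requests, on_limit):
        self.app = app
        self.max_requests = max_requests
        self.on_limit = on_limit  # hypothetical hook, e.g. flag the server to exit
        self.count = 0
        self._lock = threading.Lock()

    def __call__(self, environ, start_response):
        # Count under a lock so threaded servers do not miss the threshold.
        with self._lock:
            self.count += 1
            limit_hit = self.count == self.max_requests
        try:
            return self.app(environ, start_response)
        finally:
            # Fires once, after the wrapped app has returned its response
            # iterable; the body may not have been sent to the client yet.
            if limit_hit:
                self.on_limit()
```

Because the hook runs before the response body has necessarily been written, it should only request a shutdown (set a flag, notify a supervisor) rather than exit the process on the spot. This hides a leak instead of fixing it.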

  Changed 9 months ago by esm-trac@…

Seeing the same behavior here as well, with trac 0.10.4 running under lighttpd as a fastcgi process: a single refresh of any trac page grows the RSS of the process by anywhere from 80 to 256 KB.

This is new behavior since switching to fastcgi; on previous hosting, I was using mod_python without any noticeable issues.

  Changed 8 months ago by docwhat@…

  • cc docwhat@… added

  Changed 8 months ago by Joschi

  • priority changed from normal to high
  • severity changed from major to critical

Yep, same problem here... sometimes the fcgi process goes crazy... will try to switch back to cgi...

  Changed 6 months ago by kiniry@…

We are seeing similar problems on our five Tracs running on Apache 2 and OS X Leopard Server.

Additionally, approximately half a dozen FCGIs per day stop responding and consume as much CPU as they can. E.g.,

kind:BONc# peek fcgi
_www     89173  52.0  0.6   110604  27140   ??  R     4:28PM 360:08.64 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/csi-trac.fcgi
_www     11346  48.7  0.4    98716  16664   ??  R    10:41PM 353:11.91 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/csi-trac.fcgi
_www     11624  48.1  0.6   103268  23136   ??  R    11:00PM 329:11.37 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/mobius-trac.fcgi
_www     89431  45.0  0.6   101188  23080   ??  R     4:41PM 605:35.12 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/mobius-trac.fcgi
_www     18227  43.7  2.1   164836  88048   ??  R     6:42AM  79:45.42 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/mobius-trac.fcgi
_www     89167  41.8  0.8   113012  33124   ??  R     4:28PM 394:37.74 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/mobius-trac.fcgi
_www     11345  40.9  0.4    98912  16876   ??  R    10:41PM 356:56.74 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/csi-trac.fcgi

Tracing them indicates that they are all blocked on semaphores inside of Python.

I suspect that we are witnessing a concurrency issue (perhaps a livelock) related to other mod_python concurrency bugs found here. As we are running one or more FCGI processes per Trac, we were hoping that switching from mod_python to FCGI would avoid the concurrency problems that have bitten us in the past, but perhaps we were wrong.

Are requests shuttled through Apache to an FCGI application serialized in some fashion? How does Apache know when to spawn new FCGI processes for a given ScriptAlias? I am new to the whole FCGI API and am digging into it now, but some of these simple questions have not yet been answered in my initial reading.

Joe Kiniry

  Changed 5 months ago by jon

I'm having this problem as well. Is there anything I can do to debug this and get it fixed? I'm not sure how other sites can handle using the fastcgi interface, with how the memory gets used...

follow-up: ↓ 17   Changed 2 months ago by emilien@…

  • version changed from 0.10.3 to 0.11.1

Hi, I used to have the same problem with FastCGI, WSGI and the HTTP server included in tracd. I'm using sqlite and lighttpd. For now I must restart Trac in a cron to avoid this memory leak.

in reply to: ↑ 16   Changed 2 months ago by cboos

Replying to emilien@…:

... same problem with FastCGI, WSGI and the HTTP server included in tracd. I'm using sqlite and lighttpd. For now I must restart Trac in a cron to avoid this memory leak.

tracd?

If you'd like to help the project, please create a new ticket detailing your tracd configuration, the kind of repository used (svn/hg, cached or not, scoped or not), the other modules (like Genshi) and plugins you use, and then the usage patterns that make the memory grow. See #6614 for an example of the kind of information we need.

Sorry, I can't help with the FCGI issue (discussed here), but I'm very concerned about the presence of leaks with tracd in Trac > 0.11.1. So by all means, please help us reproduce the problem so that we can eventually fix it.
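
For anyone trying to narrow down which usage pattern makes memory grow, one generic starting point is to compare counts of live objects by type before and after a batch of requests. This is a rough sketch using only the standard `gc` and `collections` modules on a current Python; the helper names `type_counts` and `growth` are made up here, and nothing in it is specific to Trac.

```python
import gc
from collections import Counter


def type_counts():
    """Count the objects currently tracked by the garbage collector, by type name."""
    gc.collect()  # drop collectable garbage first so only live objects are counted
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, top=10):
    """Return the `top` type names whose live-object count grew, largest first."""
    diff = after - before  # Counter subtraction keeps only positive differences
    return diff.most_common(top)
```

Take one snapshot, replay the suspect requests, take another, and report the output of `growth` in the new ticket. Only GC-tracked container objects are counted, so leaks inside C extensions or in plain strings will not show up this way.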
