Edgewall Software

Ticket #7739 (new enhancement)

Opened 3 months ago

Last modified 2 months ago

trac & memcached

Reported by: antonbatenev@…
Owned by:
Priority: normal
Milestone: 2.0
Component: general
Version:
Severity: normal
Keywords: memcached
Cc:

Description

I think it will be useful to add memcached support to Trac to improve performance on high-load projects.

Attachments

Change History

  Changed 3 months ago by rblank

memcached does look interesting. Do you have any experience in adding memcached support to a web application? I wonder how concurrency and locking issues are managed.

Also, we need some profiling beforehand to make sure Trac is actually slowed down by DB accesses in typical situations, as that's what memcached is built to optimize.

  Changed 3 months ago by ebray

I've looked into hacking memcached support into Trac before. At the time our primary bottleneck *was* in fact database lookups performed by our custom permission system. However, that permission system was originally written by someone else and was horribly inefficient. Rather than just throwing more technology at the problem, I was able to optimize the permission system so that instead of dozens of DB queries per page view for permissions alone there are now only one or two.

So now DB access is not really a bottleneck. I also doubt that most Trac sites are high volume enough that this would be worthwhile, though I don't have any real data to support that. It just doesn't seem like something that needs to be in Trac's core, and would be better off as a plugin. Genshi optimization is where the effort needs to go.

follow-up: ↓ 5   Changed 3 months ago by cboos

  • keywords memcached added
  • milestone set to 2.0

Well, that can't possibly be done as a plugin. Memcached support involves adding a check in the cache before every query, and writing results in the cache after every effective query following a cache miss.

One not so intrusive way to achieve this would be to do something at the cursor wrapper level for queries, but we'd also need to handle cache invalidation after every relevant change written to the database.

I'm not sure this is feasible across the board, but perhaps it is in some dedicated areas?

Anyway, keeping the ticket open as a scratchpad, if someone wants to contribute further ideas or even patches.
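
To make the cursor-wrapper idea above concrete, here is a minimal sketch, not Trac code. It assumes a plain DB-API cursor (sqlite3 here) and uses a dict as a stand-in for a memcached client; the class name CachingCursor is hypothetical. Invalidation is deliberately coarse: any statement that is not a SELECT flushes the whole cache.

```python
import sqlite3


class CachingCursor:
    """Hypothetical wrapper around a DB-API cursor (not part of Trac).

    SELECT results are looked up in the cache before hitting the database
    and stored after a miss; any other statement flushes the cache.
    """

    def __init__(self, cursor, cache):
        self._cursor = cursor
        self._cache = cache  # a dict standing in for a memcached client
        self._rows = []

    def execute(self, sql, args=()):
        key = (sql, tuple(args))
        if sql.lstrip().upper().startswith("SELECT"):
            if key not in self._cache:  # cache miss: run the real query
                self._cursor.execute(sql, args)
                self._cache[key] = self._cursor.fetchall()
            self._rows = self._cache[key]
        else:
            self._cursor.execute(sql, args)
            self._cache.clear()  # coarse invalidation after every write
            self._rows = []
        return self

    def fetchall(self):
        return list(self._rows)


conn = sqlite3.connect(":memory:")
cur = CachingCursor(conn.cursor(), {})
cur.execute("CREATE TABLE wiki (name TEXT)")
cur.execute("INSERT INTO wiki VALUES (?)", ("WikiStart",))
print(cur.execute("SELECT name FROM wiki").fetchall())
```

Flushing everything on each write is safe but throws away most of the benefit on a busy site, which is why finer-grained invalidation keys come up later in this ticket.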

  Changed 3 months ago by rblank

I just found out that Noah had already started something in th:CacheSystemPlugin. He seems to have stopped at 0.10, though.

in reply to: ↑ 3 ; follow-up: ↓ 6   Changed 3 months ago by ebray

Replying to cboos:

Well, that can't possibly be done as a plugin. Memcached support involves adding a check in the cache before every query, and writing results in the cache after every effective query following a cache miss.

When I suggested it could be done as a plugin, what I meant was that it could be implemented as a DB backend that wraps one of the other backends. It would mostly involve using a cursor wrapper, as you suggested. Perhaps cursor.execute() could even have a keyword argument added to it for whether or not the cache should be used on that particular query, allowing it to be used in just some areas, like you suggested.
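
A rough sketch of what such a wrapping backend with a per-query opt-in might look like. This is an illustration under assumptions, not an existing Trac API: the names CachedConnection, CachedCursor and the use_cache keyword are hypothetical, sqlite3 plays the wrapped backend, and a dict stands in for memcached. There is no invalidation here, so opted-in queries can return stale rows.

```python
import sqlite3


class CachedCursor:
    """Hypothetical cursor whose execute() takes an opt-in use_cache flag."""

    def __init__(self, cursor, cache):
        self._cursor = cursor
        self._cache = cache
        self._rows = None  # set only when the last query went through the cache

    def execute(self, sql, args=(), use_cache=False):
        self._rows = None
        if not use_cache:
            self._cursor.execute(sql, args)  # default: straight to the DB
            return self
        key = (sql, tuple(args))
        if key not in self._cache:
            self._cursor.execute(sql, args)
            self._cache[key] = self._cursor.fetchall()
        self._rows = self._cache[key]
        return self

    def fetchall(self):
        if self._rows is not None:
            return list(self._rows)
        return self._cursor.fetchall()


class CachedConnection:
    """Hypothetical backend that wraps another DB-API connection."""

    def __init__(self, connection):
        self._connection = connection
        self.cache = {}  # a dict standing in for a memcached client

    def cursor(self):
        return CachedCursor(self._connection.cursor(), self.cache)

    def __getattr__(self, name):
        # Delegate commit(), rollback(), close(), ... to the wrapped backend.
        return getattr(self._connection, name)


db = CachedConnection(sqlite3.connect(":memory:"))
cur = db.cursor()
cur.execute("CREATE TABLE ticket (summary TEXT)")
cur.execute("INSERT INTO ticket VALUES (?)", ("trac & memcached",))
print(cur.execute("SELECT summary FROM ticket", use_cache=True).fetchall())
```

Making the cache opt-in keeps existing callers unchanged and limits caching to queries whose authors have decided that slightly stale results are acceptable.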

in reply to: ↑ 5   Changed 2 months ago by cboos

Replying to ebray:

Replying to cboos:

Well, that can't possibly be done as a plugin. Memcached support involves adding a check in the cache before every query, and writing results in the cache after every effective query following a cache miss.

When I suggested it could be done as a plugin, what I meant was that it could be implemented as a DB backend that wraps one of the other backends.

Mostly agreed. I just don't think it can be done at the DB backend only, i.e. without taking into account cache keys at a higher level. As hinted in comment:25:ticket:6436, a more general cache infrastructure would be a good thing. In that infrastructure, memcached support could be added as a plugin. But there could be other "backends":

  • Simple Cache
    This is mostly what we have now.
    • Each component can maintain its own "builtin" caches, such as the WikiSystem's page index cache, the TicketSystem's ticket fields cache, etc.
    • The cache validity checks will always fail for most of the usual SQL queries, meaning those queries are not cached at all. For information held in a builtin cache, however, we only fetch it from the DB once.
    • If any other save action invalidates one of those caches, config.touch() is called. This will trigger a reload of the environment at the next request and therefore the builtin caches will be cleared and rebuilt as needed.
      Btw, using config.touch() after creation or deletion of pages, instead of the periodic reload of the page index cache would already be an improvement.
  • DB Cache
    Very much like the above, but with a finer granularity. Instead of doing a config.touch() to invalidate all the caches at once, we increment a counter for the specific key that has been invalidated (e.g. update cache set value = value + 1 where key = 'wiki_title_index'). At the beginning of each request, we fetch the counters so that the environment can tell which keys are no longer up-to-date, and the built-in caches can then be rebuilt when needed.
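
The "DB Cache" variant could be sketched roughly as follows. This is only an illustration under assumptions: the table layout (a cache table with key and value columns) follows the SQL fragment above, while the class GenerationCache and its method names are hypothetical and sqlite3 stands in for whichever backend is in use.

```python
import sqlite3


class GenerationCache:
    """Hypothetical per-process cache validated by counters kept in the DB."""

    def __init__(self, conn):
        self._conn = conn
        self._current = {}  # key -> counter as read at the start of the request
        self._built = {}    # key -> (counter when built, cached value)

    def begin_request(self):
        # One query per request fetches the counters for all keys.
        self._current = dict(self._conn.execute("SELECT key, value FROM cache"))

    def invalidate(self, key):
        cur = self._conn.execute(
            "UPDATE cache SET value = value + 1 WHERE key = ?", (key,))
        if cur.rowcount == 0:  # first invalidation of this key
            self._conn.execute(
                "INSERT INTO cache (key, value) VALUES (?, 1)", (key,))

    def get(self, key, rebuild):
        counter = self._current.get(key, 0)
        entry = self._built.get(key)
        if entry is None or entry[0] != counter:
            entry = (counter, rebuild())  # stale or missing: rebuild it
            self._built[key] = entry
        return entry[1]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, value INTEGER)")
cache = GenerationCache(conn)
cache.begin_request()
pages = cache.get("wiki_title_index", lambda: ["WikiStart"])
```

Only the key whose counter changed is rebuilt, and each process notices the change at its next request, without reloading the whole environment.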

Well, that's just a rough sketch, feel free to expand on it or beat it down ;-)
