
Brave Tech World


January 24, 2008



How do you optimize your service?

By Marcelo Calbucci

[Warning: This post contains code, which might cause seizures in certain readers]

 

    Every single piece of software ever created had to go through a decision process (even if implicit) about the most common scenario in which it will be used. The code is then optimized accordingly. For example, if Outlook expects that 99.9% of its users will have 3,000 or fewer messages in their Inbox, developers don't have to overthink how to make it perform well for somebody who has 30,000 emails.

 

    When I started Sampa I put a lot of thought into performance, too much even. I put too much emphasis on reducing disk I/O, network round-trips, network bandwidth and CPU usage.

 

    From one perspective it did pay off. We have absolutely no issues with the items that I optimized for. But if you look closely at that list, you'll see that the one thing I didn't optimize for is the one suffering the most right now: memory usage, aka RAM.

 

    We are running roughly 25,000 sites per server, but over the last month or so it became clear that memory consumption was out of control. The process needed to be restarted every 24-48 hours. End users didn't feel a blip, except for a few seconds during the restart. That is no good. What if the process needs to restart every 12 hours, then every 6 hours ...

 

    Well, investigating where memory is being used is very hard. I don't have the time or a test team to help me. So, on a Saturday night, talking over red wine with a friend (from here on called D.C.) who works on C#-related stuff at Microsoft, I learned that a huge assumption I had made was incorrect.

 

    I thought that all strings were interned by default, meaning that only one copy of the string "SITE" would be created in memory and all references would point to it (wasn't this the reason strings are immutable?). Anyway, they are not: only string literals in the code get that treatment, and for strings created at run time you must call String.Intern() yourself.
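
    To make the difference concrete, here is a minimal C# sketch. It is not code from Sampa; the class name InternDemo and the helper BuildAtRuntime are made up for illustration. The helper builds a string at run time, the way a parser or a database reader would, so the two "site" values start out as separate objects until String.Intern() is called.

```csharp
using System;

public static class InternDemo
{
    // Hypothetical helper: copies the characters into a brand-new string object,
    // standing in for any string produced at run time (parsing, I/O, etc.).
    public static string BuildAtRuntime(string text)
    {
        return new string(text.ToCharArray());
    }

    public static void Main()
    {
        string a = BuildAtRuntime("site");
        string b = BuildAtRuntime("site");

        // Same characters, so value equality holds...
        Console.WriteLine(a == b);
        // ...but these are two separate objects on the heap.
        Console.WriteLine(object.ReferenceEquals(a, b));

        // String.Intern returns the single pooled instance for that content.
        string ia = string.Intern(a);
        string ib = string.Intern(b);
        Console.WriteLine(object.ReferenceEquals(ia, ib));
    }
}
```

    The second line printed shows that the two strings do not share storage; after interning, both variables point at the same pooled instance, and the original copies can be collected once nothing references them.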

 

    The next step was to understand which strings had the most copies in memory and would benefit the most from being interned (if you intern everything, the GC will never collect them). D.C. sent me some simple CDB commands, and with them I found out some pretty amazing stuff...

 

    In a process with about 500 MB of memory being used, here is a sample of the strings in memory and their memory usage:

 

  • "BasePlan" = 690,552 instances = 19MB of memory
  • "type" = 615,776 instances = 22MB of memory
  • "site" = 537,124 instances = 19MB of memory
  • and another 25-30 strings like those.
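
    Given a list like this, one way to intern only the hot strings and leave everything else to the GC can be sketched as follows. This is an assumption about how such a fix could look, not the actual Sampa code: the HotStrings class, its Canonicalize method, and the three-name list are hypothetical, and the lookup table is a plain dictionary sitting in front of String.Intern().

```csharp
using System;
using System.Collections.Generic;

public static class HotStrings
{
    // Hypothetical hot list, taken from the sample above; a real list
    // would come from profiling the running process.
    private static readonly Dictionary<string, string> Shared = BuildShared();

    private static Dictionary<string, string> BuildShared()
    {
        var map = new Dictionary<string, string>(StringComparer.Ordinal);
        foreach (string name in new[] { "BasePlan", "type", "site" })
        {
            // Intern each hot name once so every caller gets the same instance.
            map[name] = string.Intern(name);
        }
        return map;
    }

    // Returns the shared instance for a hot name; any other string is
    // returned unchanged and is never added to the intern pool.
    public static string Canonicalize(string value)
    {
        if (value == null) return null;
        string shared;
        return Shared.TryGetValue(value, out shared) ? shared : value;
    }
}
```

    A caller would wrap each string at the point where it is created, for example name = HotStrings.Canonicalize(name), so the duplicate copy becomes garbage right away. The dictionary is only read after start-up, which keeps concurrent lookups safe.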

 

    After this fix, the process has not even reached 500 MB after running for 24 hours, whereas before it would have been at 750-900 MB.

 

    Two points to make on this:

 

  • Don't over-optimize for any kind of scenario until you have real usage data.
  • Don't make wrong assumptions when writing your code.

 


