Gavin Pugh - A Videogame Programming Blog

XNA/C# – Avoiding garbage when working with StringBuilder

1 April, 2010 at 7:23am | XNA / C#

In my previous coding post, I spoke about some issues with converting a mutable StringBuilder string back to a regular ‘string’ object without generating garbage, or more specifically, without requiring an unnecessary heap allocation. One thing I hinted at was that StringBuilder has a number of other methods which generate garbage too. In fact, many of its most fundamental methods do, and they are difficult to live without.

As I’ve mentioned before, worrying about this sort of thing may not be necessary for the game you’re working on. It’s much more of a concern on Xbox 360 than PC, due to the poorly performing garbage collector on 360. If your game isn’t something that’s going to remotely push the hardware, or be impacted by dropped frames, then you really don’t need to worry. This article is just for those who may see this as an issue, and want to explore ways to eliminate this particular method of generating garbage.

Problem StringBuilder methods

So, here’s a table of all the methods StringBuilder has. My test case used a StringBuilder constructed with an initial capacity of 1024 characters. This is plenty for anything I threw at it, so any garbage I found with CLRProfiler was not something associated with reallocating the StringBuilder internal string. In the cases where there’s garbage generated I’ve commented with the most interesting or appropriate function in the callstack. In many cases it’s just exactly the same method, but with the function signature reported by CLRProfiler.

Method | Garbage | Pertinent allocation / notes
------ | ------- | ----------------------------
StringBuilder Append(bool value); | No |
StringBuilder Append(byte value); | Yes | Append(unsigned int8)
StringBuilder Append(char value); | No |
StringBuilder Append(char[] value); | No |
StringBuilder Append(decimal value); | Yes | Append(System.Decimal)
StringBuilder Append(double value); | Yes | Append(float64)
StringBuilder Append(float value); | Yes | Append(float32)
StringBuilder Append(int value); | Yes | Append(int32)
StringBuilder Append(long value); | Yes | Append(int64)
StringBuilder Append(object value); | Yes | object::ToString()
StringBuilder Append(sbyte value); | Yes | Append(int8)
StringBuilder Append(short value); | Yes | Append(int16)
StringBuilder Append(string value); | No |
StringBuilder Append(uint value); | Yes | Append(unsigned int32)
StringBuilder Append(ulong value); | Yes | Append(unsigned int64)
StringBuilder Append(ushort value); | Yes | Append(unsigned int16)
StringBuilder Append(char value, int repeatCount); | No |
StringBuilder Append(char[] value, …); | No |
StringBuilder Append(string value, …); | No |
StringBuilder AppendFormat(…); (all five) | Yes | String::ToCharArray(); in all cases, even without args
StringBuilder AppendLine(); | No |
StringBuilder AppendLine(string value); | No |
void CopyTo(…); | No |
int EnsureCapacity(int capacity); | Yes | Only if capacity param > current capacity; StringBuilder::GetNewString()
bool Equals(StringBuilder sb); | No |
StringBuilder Insert(int index, bool value); | No |
StringBuilder Insert(int index, byte value); | Yes | Insert(int32, unsigned int8)
StringBuilder Insert(int index, char value); | Yes | String::CtorCharCount()
StringBuilder Insert(int index, char[] value); | No |
StringBuilder Insert(int index, decimal value); | Yes | Insert(int32, System.Decimal)
StringBuilder Insert(int index, double value); | Yes | Insert(int32, float64)
StringBuilder Insert(int index, float value); | Yes | Insert(int32, float32)
StringBuilder Insert(int index, int value); | Yes | Insert(int32, int32)
StringBuilder Insert(int index, long value); | Yes | Insert(int32, int64)
StringBuilder Insert(int index, object value); | Yes | object::ToString()
StringBuilder Insert(int index, sbyte value); | Yes | Insert(int32, int8)
StringBuilder Insert(int index, short value); | Yes | Insert(int32, int16)
StringBuilder Insert(int index, string value); | No |
StringBuilder Insert(int index, uint value); | Yes | Insert(int32, unsigned int32)
StringBuilder Insert(int index, ulong value); | Yes | Insert(int32, unsigned int64)
StringBuilder Insert(int index, ushort value); | Yes | Insert(int32, unsigned int16)
StringBuilder Insert(int index, string value, …); | No |
StringBuilder Insert(int index, char[] value, …); | No |
StringBuilder Remove(int startIndex, int length); | No |
StringBuilder Replace(…); (all four) | No |
override string ToString(); | Yes* | String::InternalCopy(); see my previous article
string ToString(int startIndex, int length); | Yes* | String::InternalSubString()
* Technically you could describe these as not generating garbage, since they simply allocate a new string for the client to use. For the latter of the two that is certainly fair, and the term ‘garbage’ is a little incorrect. For the former, though, the allocation is unnecessary and can be avoided with a little work, hence my previous article.


The common theme for garbage here is type conversion. Anything that isn’t a string or char type is pretty much guaranteed to generate it. ‘bool’ is an oddball, probably because it appends string literals (‘False’ and ‘True’), so no conversion is needed. The other peculiar one is Insert( int, char ), which generates garbage when logically it doesn’t need to. Oddly, the .NET library source code shows it calling Char.ToString() on the char parameter. Possibly just an oversight in the library?

The type conversions generate garbage because each one creates a temporary string from the value. This string is fed into the StringBuilder and then discarded. Whilst the conversion could have been done in-place, I think it isn’t in order to support CultureInfo modifiers, where the conversion can produce different output depending on the CultureInfo passed in. StringBuilder internally uses CultureInfo.CurrentCulture.

I think the implementation chosen was for simplicity and clarity of the type conversions. Writing a system to perform these type conversions in-place, with all the flexibility CultureInfo offers, would likely have made the code significantly more complex. I can understand their reasoning completely.
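To make the temporary string concrete, the numeric overloads behave roughly as if the conversion had been written out by hand. This is an illustrative sketch only: the class and method names are made up for the example, and it describes the framework versions discussed in this article (other runtimes may format numbers differently under the hood).

```csharp
using System.Globalization;
using System.Text;

public static class AppendEquivalenceSketch
{
    // Roughly what sb.Append(value) amounts to on the framework discussed here:
    // a temporary string is allocated, copied into the builder, then dropped.
    public static StringBuilder AppendViaTemporary(StringBuilder sb, int value)
    {
        string temporary = value.ToString(CultureInfo.CurrentCulture); // heap allocation
        return sb.Append(temporary); // Append(string) itself is garbage-free
    }
}
```

The Append(string) call is fine on its own; it is the short-lived string created just before it that the garbage collector later has to clean up.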

Appending numeric types without generating garbage

For this article I specifically wanted to detail a replacement for those type conversion methods. These are ones you’d definitely need for a game, for at least your HUD readouts. Since these could be updated every frame, using the garbage-churning .NET library methods isn’t going to be pretty.

The replacement methodology I used was pretty simple: a number of methods don’t generate garbage, so use those to build up the string. This effectively means implementing an itoa() in C#, a conversion of an integer into string form. Floating point numbers are handled in much the same way. My implementation is via C#’s extension methods, so the new garbage-free versions of Append() can be called directly on a StringBuilder object, as if they came with the original framework.
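As a rough idea of what such an itoa()-style extension method can look like, here is a minimal sketch. It is not the code from the download: the names ConcatIntSketch and ConcatSketch, and the parameter list, are assumptions made for this example. It only uses Append(char), Append(char, int) and the indexer.

```csharp
using System;
using System.Text;

public static class ConcatIntSketch
{
    // Lookup table so that bases above ten can emit 'A'-'F'.
    private static readonly char[] s_digits =
    {
        '0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F'
    };

    // Hypothetical name and signature, chosen for this sketch only.
    public static StringBuilder ConcatSketch(this StringBuilder sb, uint value,
                                             uint padAmount, char padChar, uint baseValue)
    {
        if (baseValue < 2 || baseValue > 16)
            throw new ArgumentOutOfRangeException("baseValue");

        // Count how many digits the value needs in this base.
        uint length = 0;
        uint remaining = value;
        do
        {
            remaining /= baseValue;
            length++;
        }
        while (remaining > 0);

        // Reserve room with Append(char, int), one of the garbage-free methods.
        sb.Append(padChar, (int)Math.Max(padAmount, length));

        // Overwrite the reserved characters with digits, right to left.
        int position = sb.Length;
        while (length > 0)
        {
            position--;
            sb[position] = s_digits[value % baseValue];
            value /= baseValue;
            length--;
        }
        return sb;
    }

    public static StringBuilder ConcatSketch(this StringBuilder sb, int value)
    {
        if (value < 0)
        {
            sb.Append('-');
            // Widen before negating so that int.MinValue does not overflow.
            return sb.ConcatSketch((uint)(-(long)value), 0, '0', 10);
        }
        return sb.ConcatSketch((uint)value, 0, '0', 10);
    }
}
```

With this sketch, sb.ConcatSketch( 255u, 4, '0', 16 ) appends the padded hex form of the value, and sb.ConcatSketch( -123 ) appends a signed decimal.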

Here’s the code for download:

StringBuilderExtNumeric.cs (C# file)

Since Append() is already taken, I chose Concat() as my alternative. There’s additional functionality over what’s offered by default in StringBuilder, to aid formatting of text. For floats you’re able to specify the number of decimal places. For all numeric types you can specify the amount of padding, and the padding character used (most likely zero or space). Lastly, integers can be output with a specific base value, so your code could output hex, binary and octal if so desired.
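For the float case, one possible approach is to append the whole part as an integer and then peel off the fractional digits one at a time. The sketch below is an assumption-laden illustration rather than the downloadable implementation: FloatConcatSketch, ConcatFloat and AppendUInt are hypothetical names, the value is truncated instead of rounded, and the whole part is assumed to fit in a uint.

```csharp
using System.Text;

public static class FloatConcatSketch
{
    // Hypothetical helper: appends 'value' with a fixed number of decimal places.
    // Truncates rather than rounds; assumes the whole part fits in a uint.
    public static StringBuilder ConcatFloat(this StringBuilder sb, float value, int decimalPlaces)
    {
        if (value < 0)
        {
            sb.Append('-');
            value = -value;
        }

        uint whole = (uint)value;
        AppendUInt(sb, whole);

        if (decimalPlaces > 0)
        {
            sb.Append('.');
            float remainder = value - whole;
            for (int i = 0; i < decimalPlaces; i++)
            {
                // Shift the next fractional digit in front of the decimal point.
                remainder *= 10;
                uint digit = (uint)remainder;
                if (digit > 9) digit = 9; // guard against float rounding
                sb.Append((char)('0' + digit));
                remainder -= digit;
            }
        }
        return sb;
    }

    // Appends a base-ten unsigned integer, most significant digit first.
    private static void AppendUInt(StringBuilder sb, uint value)
    {
        uint divisor = 1;
        while (value / divisor >= 10) divisor *= 10;
        while (divisor > 0)
        {
            sb.Append((char)('0' + (value / divisor) % 10));
            divisor /= 10;
        }
    }
}
```

Emitting the fractional digits individually means zeros directly after the decimal point are kept, at the cost of one Append(char) call per digit.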

Here’s a short made-up example of its usage:

StringBuilder m_hud_health_string = new StringBuilder( 64, 64 );
StringBuilder m_hud_ammo_string = new StringBuilder( 64, 64 );

private void UpdateHUDStrings()
{
    // Note: It's sensible to create a wrapper for Concat( string ), which just calls
    // Append() to avoid mixing these method names. When JIT-ing, it would be inlined.

    m_hud_health_string.Length = 0;                    //< Clear the string
    m_hud_health_string.Concat( GetCurrentHealth());   //< This method returns an int
    m_hud_health_string.Append( " / " );
    m_hud_health_string.Concat( GetMaximumHealth());   //< This method returns an int

    m_hud_ammo_string.Length = 0;                      //< Clear the string
    m_hud_ammo_string.Concat( GetBulletsInAmmoClip()); //< This method returns an int
    m_hud_ammo_string.Append( " ( " );
    m_hud_ammo_string.Concat( GetNumAmmoClips());      //< This method returns an int
    m_hud_ammo_string.Append( " )" );
}

But, what about Format(), I hear you cry? It can sometimes be easier to format more complex strings, and can result in much cleaner code than multiple appends/concats. Well, I have a garbage-free one of those too. I'll cover its implementation next time I write about C# here.

Some other things to try

The code I’ve written probably (definitely) could be optimized further. In case you need to do this, or if you’re just curious and want to play around, I’ve got a few ideas of things to try:

  • Different code for the oft-used base ten. You’re now modulo-ing and dividing by a constant, and passing one less parameter around. You can also just add characters by using ( ‘0’ + value ), instead of using the static char array. The static char array is only used because the hex digits ‘A-F’ don’t follow ‘9’ in the ASCII table.
  • Try using a static char[] array to generate the string you’re about to append, and just call Append() once on the StringBuilder. StringBuilder keeps itself thread-safe and has other overhead on most of its members. So I think this would result in a speedup, by just making one mutable interaction with the StringBuilder.
    Keep in mind that you’ll want to make this array a [ThreadStatic] (thread-local store), so that you can use the code on multiple threads without issue.
  • Take a look at these two websites for some more ideas:
    Coverage of some common C implementations, along with some performance comparisons.
    A bit of a different approach using C++, potentially may throw up something applicable to C#.
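The first two ideas in the list above can be combined into something like the following. Treat it as an untested-for-speed sketch under stated assumptions: BufferedConcatSketch and ConcatBase10 are hypothetical names, and whether it is actually faster would need profiling on the target platform.

```csharp
using System;
using System.Text;

public static class BufferedConcatSketch
{
    // One scratch buffer per thread. A [ThreadStatic] field initializer would only
    // run for the first thread, so the buffer is created lazily instead.
    [ThreadStatic]
    private static char[] t_buffer;

    // Hypothetical base-ten-only variant: digits come from ('0' + value % 10).
    public static StringBuilder ConcatBase10(this StringBuilder sb, uint value)
    {
        if (t_buffer == null)
            t_buffer = new char[10]; // uint.MaxValue has ten digits; allocated once per thread

        char[] buffer = t_buffer;
        int position = buffer.Length;

        // Fill the buffer from the end, least significant digit first.
        do
        {
            buffer[--position] = (char)('0' + (value % 10));
            value /= 10;
        }
        while (value > 0);

        // A single interaction with the StringBuilder.
        return sb.Append(buffer, position, buffer.Length - position);
    }
}
```

The one-off buffer allocation per thread happens on first use, so it is worth triggering it during loading rather than mid-game.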

Another thing worth considering is to wrap up the StringBuilder into a new type. Maybe a struct is a good idea, to save making another heap allocation when creating them. The reasoning behind this is to hide away the garbage-generating methods, so all you’re exposing are the safe, garbage-free methods. It also saves you having to awkwardly come up with a different method name, such as ‘Concat()’, like I did. Another win for this methodology is that you could overload the ‘+’ operator, so that ‘+=’ performs concatenations too.
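A wrapper along those lines might look like the sketch below. HudText and its members are hypothetical names invented for this example. Note that the ‘+’ operator here deliberately modifies the wrapped StringBuilder in place, which is unconventional for an operator, and that copies of the struct share the same underlying builder.

```csharp
using System.Text;

// Hypothetical wrapper: only garbage-free operations are exposed.
public struct HudText
{
    private readonly StringBuilder m_builder;

    public HudText(int capacity)
    {
        // Fixed maximum capacity, so the internal buffer is never reallocated.
        m_builder = new StringBuilder(capacity, capacity);
    }

    public void Clear()
    {
        m_builder.Length = 0;
    }

    // Makes 'text += "abc"' work; appends in place rather than creating a new value.
    public static HudText operator +(HudText text, string value)
    {
        text.m_builder.Append(value);
        return text;
    }

    // Makes 'text += 42' work without the temporary string Append(int) would create.
    public static HudText operator +(HudText text, int value)
    {
        StringBuilder sb = text.m_builder;
        long magnitude = value; // widened so that int.MinValue can be negated
        if (magnitude < 0)
        {
            sb.Append('-');
            magnitude = -magnitude;
        }

        int digits = 1;
        for (long probe = magnitude; probe >= 10; probe /= 10)
            digits++;

        // Reserve the space, then fill the digits in from right to left.
        sb.Append('0', digits);
        for (int i = sb.Length - 1; digits > 0; i--, digits--)
        {
            sb[i] = (char)('0' + magnitude % 10);
            magnitude /= 10;
        }
        return text;
    }

    // Allocates a string; included here only to make the result easy to inspect.
    public override string ToString()
    {
        return m_builder.ToString();
    }
}
```

Usage then reads naturally: create a HudText once, call Clear() each frame, and build the readout with text += "HP: "; text += health;.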

Comments

  • MatthewT says:

    Really nice article, thanks!

  • Brett says:

    Very nice article. I was profiling an XNA game and discovered that StringBuilder.Append(int) was creating string garbage. Your StringBuilder extensions look great, I was about to create my own if I hadn’t found yours first!

  • batchprogram says:

    Your float append method will not work with numbers that have leading zeros in the decimal portion. For example the value 123.000456 would result in the value 123.456 being appended to the StringBuilder. To support numbers with leading(or trailing) zeros, change the do-while loop to this:

    do
    {
        remainder *= 10;
        Concat((uint)remainder % 10);
        decimal_places--;
    }
    while (decimal_places > 0);

    Then remove the rounding and concatenation after the loop.

  • zylazla says:

    More efficient fix for float append method:

    // ACM: Fix for leading zeros in the decimal portion
    remainder *= 10;
    decimal_places--;

    while ( decimal_places > 0 && ((uint)remainder % 10) == 0)
    {
        remainder *= 10;
        decimal_places--;
        string_builder.Append('0');
    }

    // Multiply up to become an int that we can print
    while ( decimal_places > 0 )
    {
        remainder *= 10;
        decimal_places--;
    }

  • Hugh says:

    You forgot to test an important case: StringBuilder Append(StringBuilder value);
    Given that they messed up char, I wonder if they got this right.

  • will motil says:

    All numerical value types converted to string or char in StringBuilder generate garbage. Do it in a loop and test it at high speed, and you will instantly see a ton of trash.

    I wrote out a big wrapper for StringBuilder when using MonoGame. I unrolled all the methods completely and caught tons of edge cases; it can actually go faster than StringBuilder. I don’t have the format extensions in it though. I also wrote out a frame rate and GC display class to prove it. MgStringBuilder can be found here: https://github.com/willmotil/MonoGameUtilityClasses
