In my previous coding post, I spoke about some issues with converting a mutable StringBuilder string back to a regular ‘string’ object without generating garbage. Well, specifically without requiring an unnecessary heap allocation. One thing I hinted at was that StringBuilder has a number of other methods which generate garbage too. In fact, many of its most fundamental methods do, and they are difficult to live without.
As I’ve mentioned before, worrying about this sort of thing may not be necessary for the game you’re working on. It’s much more of a concern on Xbox 360 than PC, due to the poorly performing garbage collector on 360. If your game isn’t something that’s going to remotely push the hardware, or be impacted by dropped frames, then you really don’t need to worry. This article is just for those who may see this as an issue, and want to explore ways to eliminate this particular method of generating garbage.
Problem StringBuilder methods
So, here’s a table of all the methods StringBuilder has. My test case used a StringBuilder constructed with an initial capacity of 1024 characters. This is plenty for anything I threw at it, so any garbage I found with CLRProfiler was not something associated with reallocating the StringBuilder internal string. In the cases where there’s garbage generated I’ve commented with the most interesting or appropriate function in the callstack. In many cases it’s just exactly the same method, but with the function signature reported by CLRProfiler.
Method | Garbage | Pertinent Allocation / Notes |
StringBuilder Append(bool value); | No | |
StringBuilder Append(byte value); | Yes | Append(unsigned int8) |
StringBuilder Append(char value); | No | |
StringBuilder Append(char[] value); | No | |
StringBuilder Append(decimal value); | Yes | Append(System.Decimal) |
StringBuilder Append(double value); | Yes | Append(float64) |
StringBuilder Append(float value); | Yes | Append(float32) |
StringBuilder Append(int value); | Yes | Append(int32) |
StringBuilder Append(long value); | Yes | Append(int64) |
StringBuilder Append(object value); | Yes | object::ToString() |
StringBuilder Append(sbyte value); | Yes | Append(int8) |
StringBuilder Append(short value); | Yes | Append(int16) |
StringBuilder Append(string value); | No | |
StringBuilder Append(uint value); | Yes | Append(unsigned int32) |
StringBuilder Append(ulong value); | Yes | Append(unsigned int64) |
StringBuilder Append(ushort value); | Yes | Append(unsigned int16) |
StringBuilder Append(char value, int repeatCount); | No | |
StringBuilder Append(char[] value, …); | No | |
StringBuilder Append(string value, …); | No | |
StringBuilder AppendFormat(…); (all five) | Yes | String::ToCharArray(). In all cases, even without args |
StringBuilder AppendLine(); | No | |
StringBuilder AppendLine(string value); | No | |
void CopyTo(…); | No | |
int EnsureCapacity(int capacity); | Yes | StringBuilder::GetNewString(). Only if capacity param > current capacity |
bool Equals(StringBuilder sb); | No | |
StringBuilder Insert(int index, bool value); | No | |
StringBuilder Insert(int index, byte value); | Yes | Insert(int32, unsigned int8) |
StringBuilder Insert(int index, char value); | Yes | String::CtorCharCount() |
StringBuilder Insert(int index, char[] value); | No | |
StringBuilder Insert(int index, decimal value); | Yes | Insert(int32, System.Decimal) |
StringBuilder Insert(int index, double value); | Yes | Insert(int32, float64) |
StringBuilder Insert(int index, float value); | Yes | Insert(int32, float32) |
StringBuilder Insert(int index, int value); | Yes | Insert(int32, int32) |
StringBuilder Insert(int index, long value); | Yes | Insert(int32, int64) |
StringBuilder Insert(int index, object value); | Yes | object::ToString() |
StringBuilder Insert(int index, sbyte value); | Yes | Insert(int32, int8) |
StringBuilder Insert(int index, short value); | Yes | Insert(int32, int16) |
StringBuilder Insert(int index, string value); | No | |
StringBuilder Insert(int index, uint value); | Yes | Insert(int32, unsigned int32) |
StringBuilder Insert(int index, ulong value); | Yes | Insert(int32, unsigned int64) |
StringBuilder Insert(int index, ushort value); | Yes | Insert(int32, unsigned int16) |
StringBuilder Insert(int index, string value, …); | No | |
StringBuilder Insert(int index, char[] value, …); | No | |
StringBuilder Remove(int startIndex, int length); | No | |
StringBuilder Replace(…); (all four) | No | |
override string ToString(); | Yes* | String::InternalCopy(). See my previous article |
string ToString(int startIndex, int length); | Yes* | String::InternalSubString() |
* Technically you could describe these as not generating garbage, since they simply allocate a new string for the client to use. For the latter of the two that is certainly fair, and the term garbage is a little incorrect. But for the former the allocation is unnecessary, hence my previous article, and it can be avoided with a little work.
The common theme for garbage here is type conversion. Anything that isn’t a string or char type is pretty much guaranteed to generate it. ‘bool’ is an oddball, I think probably because it inserts string literals (‘false’ and ‘true’), so no conversion is needed. The other peculiar one is Insert( int, char ), which generates garbage when logically it doesn’t really need to. Oddly, the .NET library source code says it calls Char.ToString() on the char parameter. Possibly just an oversight in the library?
The reason the type conversions generate garbage is that each one produces a temporary string holding the converted value. This string is then fed into the StringBuilder, and then discarded. Whilst it could have been done in-place, I think the reason why it isn’t is to support CultureInfo modifiers on the conversion, where the conversion could take place in a different way based on the passed-in CultureInfo. StringBuilder internally uses CultureInfo.CurrentCulture.
I think the implementation chosen was for simplicity and clarity of the type conversions. Writing a system to perform these type conversions in-place and have the flexibility of what CultureInfo offers, would have likely made the code significantly more complex. I can understand their reasoning completely.
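To make the temporary string visible, here is a minimal sketch of what the description above amounts to for Append(int) on the .NET Framework-era runtimes this article targets. The class and method names are hypothetical, made up purely for illustration, and newer runtimes may well format numbers differently.

```csharp
using System.Globalization;
using System.Text;

public static class AppendGarbageSketch
{
    // Hypothetical stand-in that mirrors the behaviour described in the text:
    // convert to a temporary string first, then append that string.
    public static StringBuilder AppendLikeFramework(StringBuilder sb, int value)
    {
        // This is the heap allocation the profiler reports as garbage.
        string temp = value.ToString(CultureInfo.CurrentCulture);

        // Append(string) itself copies characters into the existing buffer.
        sb.Append(temp);

        // 'temp' is unreferenced from here on, so it becomes garbage.
        return sb;
    }
}
```

Calling this once per frame for each HUD number gives one short-lived string per call, which is exactly the churn the rest of this article tries to avoid.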
Appending numeric types without generating garbage
For this article I specifically wanted to detail a replacement for those type conversion methods. These are ones you’d definitely need for a game, for at least your HUD readouts. Since these could be updated every frame, using the garbage-churning .NET library methods isn’t going to be pretty.
The replacement methodology I used was pretty simple: a number of methods don’t generate garbage, so use those to build up the string. This effectively means implementing an itoa() in C#, a conversion of an integer into string form. Floating point numbers are handled in much the same way. My implementation is via C#’s extension methods, so the new garbage-free versions of Append() can be called directly on a StringBuilder object, as if they came with the original framework.
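As a rough illustration of the itoa() idea, here is a minimal sketch. It is not the contents of the downloadable file; the name ConcatSketch and the parameter names are my own placeholders, and it assumes the StringBuilder indexer and Append( char, int ) don’t allocate when the capacity is already large enough.

```csharp
using System;
using System.Text;

public static class StringBuilderSketchExtensions
{
    // Lookup table, needed because 'A'-'F' do not follow '9' in ASCII.
    private static readonly char[] Digits =
    {
        '0', '1', '2', '3', '4', '5', '6', '7',
        '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
    };

    // Appends an unsigned integer in the given base (2..16), left-padded to padAmount.
    public static StringBuilder ConcatSketch( this StringBuilder sb, uint value,
                                              int padAmount, char padChar, uint baseValue )
    {
        if ( baseValue < 2 || baseValue > 16 )
            throw new ArgumentOutOfRangeException( "baseValue" );

        // Count how many digits the value needs.
        int length = 0;
        uint temp = value;
        do { temp /= baseValue; length++; } while ( temp > 0 );

        // Reserve the characters up front, filled with the padding character.
        sb.Append( padChar, Math.Max( padAmount, length ) );

        // Overwrite the reserved characters from the right with the digits.
        int pos = sb.Length;
        do
        {
            pos--;
            sb[pos] = Digits[value % baseValue];
            value /= baseValue;
        }
        while ( value > 0 );

        return sb;
    }

    // Signed base-ten convenience overload.
    public static StringBuilder ConcatSketch( this StringBuilder sb, int value )
    {
        if ( value < 0 )
        {
            sb.Append( '-' );
            // Negate as a long so that int.MinValue does not overflow.
            return sb.ConcatSketch( (uint)( -(long)value ), 0, '0', 10 );
        }
        return sb.ConcatSketch( (uint)value, 0, '0', 10 );
    }
}
```

With this sketch, ConcatSketch( 255, 4, '0', 16 ) appends "00FF". The digits are written backwards into space that was reserved first, so no temporary string is ever created.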
Here’s the code for download:
StringBuilderExtNumeric.cs
Since Append() is already taken, I chose Concat() as my alternative. There’s additional functionality over what’s offered by default in StringBuilder, to aid formatting of text. For floats you’re able to specify the number of decimal places. For all numeric types you can specify the amount of padding, and the padding character used (most likely zero or space). Lastly, integers can be output with a specific base value, so your code could output hex, binary and octal if so desired.
Here’s a short made-up example of its usage:
StringBuilder m_hud_health_string = new StringBuilder( 64, 64 );
StringBuilder m_hud_ammo_string = new StringBuilder( 64, 64 );

private void UpdateHUDStrings()
{
    // Note: It's sensible to create a wrapper for Concat( string ), which just calls
    // Append() to avoid mixing these method names. When JIT-ing, it would be inlined.
    m_hud_health_string.Length = 0; //< Clear the string
    m_hud_health_string.Concat( GetCurrentHealth()); //< This method returns an int
    m_hud_health_string.Append( " / " );
    m_hud_health_string.Concat( GetMaximumHealth()); //< This method returns an int

    m_hud_ammo_string.Length = 0; //< Clear the string
    m_hud_ammo_string.Concat( GetBulletsInAmmoClip()); //< This method returns an int
    m_hud_ammo_string.Append( " ( " );
    m_hud_ammo_string.Concat( GetNumAmmoClips()); //< This method returns an int
    m_hud_ammo_string.Append( " )" );
}
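The decimal places option for floats mentioned earlier can be sketched in a similar spirit. This is a simplified illustration rather than the code from the download: ConcatFixed is a hypothetical name, it truncates instead of rounding, it always writes ‘.’ regardless of culture, and it assumes a finite value whose integer part fits in a uint.

```csharp
using System.Text;

public static class FloatConcatSketch
{
    // Hypothetical helper: appends 'value' with exactly 'decimalPlaces' fractional digits.
    public static StringBuilder ConcatFixed( this StringBuilder sb, float value, int decimalPlaces )
    {
        double v = value;
        if ( v < 0 )
        {
            sb.Append( '-' );
            v = -v;
        }

        // Split into whole and fractional parts (assumes the whole part fits in a uint).
        uint whole = (uint)v;
        double remainder = v - whole;

        // Whole part: reserve the characters, then fill them in from the right.
        int digits = 0;
        uint temp = whole;
        do { temp /= 10; digits++; } while ( temp > 0 );
        sb.Append( '0', digits );
        int pos = sb.Length;
        do
        {
            sb[--pos] = (char)( '0' + (int)( whole % 10 ) );
            whole /= 10;
        }
        while ( whole > 0 );

        if ( decimalPlaces <= 0 )
            return sb;

        // Fractional part: peel off one digit at a time, so zeros directly
        // after the decimal point are written out like any other digit.
        sb.Append( '.' );
        for ( int i = 0; i < decimalPlaces; i++ )
        {
            remainder *= 10.0;
            int digit = (int)remainder; // always in the range 0..9
            sb.Append( (char)( '0' + digit ) );
            remainder -= digit;
        }
        return sb;
    }
}
```

For example, ConcatFixed( 3.03125f, 3 ) appends "3.031". Emitting the fraction digit by digit costs one Append() per decimal place, but it avoids having to treat leading zeros in the fraction as a special case.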
But, what about Format(), I hear you cry? It can sometimes be easier to format more complex strings, and can result in much cleaner code than multiple appends/concats. Well, I have a garbage-free one of those too. I'll cover its implementation next time I write about C# here.
Some other things to try
The code I’ve written probably (definitely) could be optimized further. In case you need to do this, or if you’re just curious and want to play around, I’ve got a few ideas of things to try:
- Different code for the oft-used base ten. You’re now modulo-ing and dividing by a constant, and passing one less parameter around. You can also just add characters by using ( ‘0’ + value ), instead of using the static char array. The static char array is only used because the hex digits ‘A-F’ don’t follow ‘9’ in the ASCII table.
- Try using a static char[] array to generate the string you’re about to append, and just call Append() once on the StringBuilder. StringBuilder keeps itself thread-safe and has other overhead on most of its members. So I think this would result in a speedup, by just making one mutable interaction with the StringBuilder.
  Keep in mind that you’ll want to make this array [ThreadStatic] (thread-local storage), so that you can use the code on multiple threads without issue.
- Take a look at these two websites for some more ideas:
Coverage of some common C implementations, along with some performance comparisons.
A bit of a different approach using C++, potentially may throw up something applicable to C#.
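The first two ideas in the list above could be combined along these lines. This is only a sketch under my own assumptions: AppendBuffered and t_buffer are hypothetical names, and I haven’t profiled whether it is actually faster.

```csharp
using System;
using System.Text;

public static class BufferedAppendSketch
{
    // One scratch buffer per thread. A [ThreadStatic] field initializer would only
    // run for the first thread, so the buffer is created lazily instead.
    [ThreadStatic]
    private static char[] t_buffer;

    // Base-ten only: builds the digits in the scratch buffer, then appends once.
    public static StringBuilder AppendBuffered( this StringBuilder sb, uint value )
    {
        // uint.MaxValue has 10 digits. This allocates once per thread, on first use.
        char[] buffer = t_buffer ?? ( t_buffer = new char[10] );

        // Fill the buffer from the right, using '0' + digit instead of a lookup table.
        int pos = buffer.Length;
        do
        {
            buffer[--pos] = (char)( '0' + ( value % 10 ) );
            value /= 10;
        }
        while ( value > 0 );

        // A single interaction with the StringBuilder.
        return sb.Append( buffer, pos, buffer.Length - pos );
    }
}
```

The trade-off is one small allocation per thread up front, in exchange for touching the StringBuilder only once per number.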
Another thing worth considering is to wrap up the StringBuilder into a new type. Maybe a struct is a good idea, to save making another heap allocation when creating them. The reasoning behind this is to hide away the garbage-generating methods. So all you’re exposing are the safe, garbage-free methods. It also saves you having to awkwardly come up with a different method name, such as ‘Concat()’, like I did. Another win for this methodology is that you could write operator overloads to support ‘+=’, to perform concatenations too.
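Here is a minimal sketch of what such a wrapper might look like. The type name GarbageFreeText and its members are hypothetical. Note that C# doesn’t let you overload ‘+=’ directly; you overload ‘+’ and the compiler derives ‘+=’ from it. In this sketch ‘+’ mutates the left operand’s underlying StringBuilder, which is unconventional but avoids a copy.

```csharp
using System.Text;

public struct GarbageFreeText
{
    // The only heap allocation: made once, when the wrapper is constructed.
    // A default( GarbageFreeText ) has no builder and must not be used.
    private readonly StringBuilder m_builder;

    public GarbageFreeText( int capacity )
    {
        // Fixed maximum capacity, so the internal buffer is never regrown.
        m_builder = new StringBuilder( capacity, capacity );
    }

    public int Length
    {
        get { return m_builder.Length; }
    }

    public void Clear()
    {
        m_builder.Length = 0;
    }

    // Only the overloads listed as garbage-free in the table are exposed.
    public static GarbageFreeText operator +( GarbageFreeText lhs, string rhs )
    {
        lhs.m_builder.Append( rhs );
        return lhs;
    }

    public static GarbageFreeText operator +( GarbageFreeText lhs, char rhs )
    {
        lhs.m_builder.Append( rhs );
        return lhs;
    }

    // Still allocates a new string, so call it sparingly.
    public override string ToString()
    {
        return m_builder.ToString();
    }
}
```

Usage would then look like: create it once with new GarbageFreeText( 64 ), call Clear() each frame, and build the text with text += "HP: ". Garbage-free numeric appends would be added as further members or operator overloads.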
Comments
[…] Gavin Pugh’s blog post: Avoiding StringBuilder Garbage […]
Very nice article. I was profiling an XNA game and discovered that StringBuilder.Append(int) was creating string garbage. Your StringBuilder extensions look great. I was about to create my own if I hadn’t found yours first!
[…] that you can use to help you out. I eventually took to using Gavin Pugh’s garabage-free StringBuilder extension for formatting numerical values. To make sure I’d get rid of all the string problems, I […]
Your float append method will not work with numbers that have leading zeros in the decimal portion. For example the value 123.000456 would result in the value 123.456 being appended to the StringBuilder. To support numbers with leading (or trailing) zeros, change the do-while loop to this:
do
{
    remainder *= 10;
    Concat((uint)remainder % 10);
    decimal_places--;
}
while (decimal_places > 0);
Then remove the rounding and concatenation after the loop.
More efficient fix for float append method:
// ACM: Fix for leading zeros in the decimal portion
remainder *= 10;
decimal_places--;
while ( decimal_places > 0 && ((uint)remainder % 10) == 0)
{
    remainder *= 10;
    decimal_places--;
    string_builder.Append('0');
}
// Multiply up to become an int that we can print
while ( decimal_places > 0 )
{
    remainder *= 10;
    decimal_places--;
}
You forgot to test an important case: StringBuilder Append(StringBuilder value);
Given that they messed up char, I wonder if they got this right.
All numeric value types converted to string or char in StringBuilder generate garbage. Do it in a loop and test it at high speed, and you will instantly see a ton of trash.
I wrote out a big wrapper for StringBuilder when using MonoGame. I unrolled all the methods completely and caught tons of edge cases; it can actually go faster than StringBuilder. I don’t have the format extensions in it though. I also wrote out a frame rate and GC display class to prove it. MgStringBuilder can be found here. https://github.com/willmotil/MonoGameUtilityClasses
Really nice article, thanks!