XNA/C# – A garbage-free StringBuilder Format() method

5 April, 2010 at 8:42am | XNA / C#

So, another entry in this StringBuilder and garbage series… This time I’m exploring the Format() method, and implementing a new alternative that does not generate any garbage. The existing AppendFormat() method on StringBuilder generates a significant amount of garbage.

So in what way does the .NET one generate garbage? Well, the parameter type is ‘object’, so you’ll get boxing and unboxing of value types. Since integers and floats are pretty oft-used with Format(), that’s not good news. With CLRProfiler I also see temporary allocations made in ‘String::ToCharArray()’, and ‘String::CtorCharArrayStartLength()’. There’s also more garbage if you use more than three arguments; for that it requires a temporary array to be created.
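To make the boxing concrete, here is a small sketch. The class and method names are made up for illustration and are not from the .NET source; the second method simply spells out what the compiler is understood to do for the first.

```csharp
using System.Text;

public static class BoxingDemo
{
    // Each value-type argument is converted to 'object' at the call site,
    // which allocates a small heap object (a "box") per argument.
    public static string FormatScore( StringBuilder string_builder, int score, float time )
    {
        string_builder.Length = 0;
        string_builder.AppendFormat( "Score {0} Time {1}", score, time );
        // ToString() allocates too; it is only here so the result can be inspected.
        return string_builder.ToString();
    }

    // The same call with the hidden boxing written out by hand.
    public static string FormatScoreExplicit( StringBuilder string_builder, int score, float time )
    {
        object boxed_score = score; // heap allocation
        object boxed_time = time;   // heap allocation
        string_builder.Length = 0;
        string_builder.AppendFormat( "Score {0} Time {1}", boxed_score, boxed_time );
        return string_builder.ToString();
    }
}
```

Both methods produce the same text; the point is only that every call leaves two short-lived boxes behind for the collector.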

All in all, it’s not a pretty picture if you want to avoid garbage collections in your game.

Variable arguments

The AppendFormat() method takes a format string, then a series of arguments. The number of these can vary, and they can be of different types too. So how best to pass these in? Well, there are a number of different methods. Take a look at the ones StringBuilder provides as standard:

  • StringBuilder AppendFormat( string format, params object[] args );
  • StringBuilder AppendFormat( IFormatProvider provider, string format, params object[] args );
  • StringBuilder AppendFormat( string format, object arg0 );
  • StringBuilder AppendFormat( string format, object arg0, object arg1 );
  • StringBuilder AppendFormat( string format, object arg0, object arg1, object arg2 );

Each of these uses object params. Object params are subject to boxing and unboxing with value types. Since I’d want to pass integers and floats through, this isn’t good. The other thing I noticed was that array parameters using the ‘params’ keyword cause a temporary allocation too. Even without using the object type, take this simplified example:

public void TestMethod( params int[] args );

Any call to this method that passes its arguments individually generates garbage. The ‘params’ keyword essentially converts whatever arguments you specify into an array. The array of integers in this case is allocated as a temporary and then left for the garbage collector. The same thing happens when using class types too, not just value types. A different tack is required here.
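A minimal sketch of what happens behind a ‘params’ call, using hypothetical names (ParamsDemo, Sum, Sum3) that are not part of the original code:

```csharp
public static class ParamsDemo
{
    public static int Sum( params int[] args )
    {
        int total = 0;
        for ( int i = 0; i < args.Length; i++ )
        {
            total += args[i];
        }
        return total;
    }

    // A call written as Sum( 1, 2, 3 ) is compiled as if it were written like this.
    public static int SumExpanded()
    {
        int[] temp = new int[] { 1, 2, 3 }; // fresh array on every call
        return Sum( temp );
    }

    // A fixed-arity method takes its arguments directly, so no array is needed.
    public static int Sum3( int a, int b, int c )
    {
        return a + b + c;
    }
}
```

The fixed-arity version is the shape the rest of this article builds on: one method per argument count instead of one method with a hidden array.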

There were two approaches I explored to support a variable set of arguments, without generating garbage. One of these I went with and will explain now, the other one I explain later in this article in the ‘Things to try’ section. The approach I chose uses generics, and unfortunately a method is required to be implemented for each number of parameters I’d want to support. Other than that though, it’s a really good fit for this problem.

The implementation

So, there are two main problems to solve here:

  • Support a variable number of arguments of arbitrary types
  • Be able to use these types, perform any conversion required and concatenate them onto a StringBuilder

For the former, I’m just using generics. For each number of parameters I need to support, I’ll need to write a new method. Having multiple versions of the same method, each with a different number of parameters, isn’t too pretty. It reminds me of C++ template abuse, stuff like the sig_slot library.

But here it gets the job done, and it’s kind of your classic C# stuff in a way. There’s no support for default parameters in C# (not before C# 4.0, at least). So for different arguments you generally have to have wrapper methods to support different sets of parameters.
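As a rough illustration of the one-method-per-argument-count pattern, here is a sketch with made-up names (ArityDemo, AppendAll, AppendOne). It is not the code from the zip file, and the AppendOne() placeholder calls ToString(), which does allocate for numeric types; only the overload shape is the point here.

```csharp
using System;
using System.Text;

public static class ArityDemo
{
    // One overload per supported argument count; the compiler infers A, B and C.
    public static StringBuilder AppendAll<A>( this StringBuilder string_builder, A arg0 )
        where A : IConvertible
    {
        AppendOne( string_builder, arg0 );
        return string_builder;
    }

    public static StringBuilder AppendAll<A, B>( this StringBuilder string_builder, A arg0, B arg1 )
        where A : IConvertible
        where B : IConvertible
    {
        AppendOne( string_builder, arg0 );
        AppendOne( string_builder, arg1 );
        return string_builder;
    }

    public static StringBuilder AppendAll<A, B, C>( this StringBuilder string_builder, A arg0, B arg1, C arg2 )
        where A : IConvertible
        where B : IConvertible
        where C : IConvertible
    {
        AppendOne( string_builder, arg0 );
        AppendOne( string_builder, arg1 );
        AppendOne( string_builder, arg2 );
        return string_builder;
    }

    private static void AppendOne<T>( StringBuilder string_builder, T arg )
        where T : IConvertible
    {
        // Placeholder only: a garbage-free version would write the characters directly.
        string_builder.Append( arg.ToString( null ) ).Append( ' ' );
    }
}
```

Because each argument has its own type parameter, value types are passed as themselves rather than as ‘object’, so nothing is boxed at the call site.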

Casting generics is not like C++. My first thought was something like:

static public void FormatArgument<T>( StringBuilder string_builder, T arg )
{
    if ( arg.GetType() == typeof(int) )
    {
        // Compile error: a generic T cannot be cast directly to int
        int int_arg = (int)arg;
        string_builder.Concat( int_arg );
    }
}

But straight casting like this is not allowed. There do seem to be a couple of ways to convert types though. At least these were the two I found; both examples are with integers:

  • Convert.ToInt32( arg )
  • arg.ToInt32( System.Globalization.NumberFormatInfo.CurrentInfo )

Both have their downsides. Convert.ToInt32() generates garbage, which makes it unusable for my purpose. The latter doesn’t generate garbage, but it is only available on types that implement IConvertible. Fortunately the types I wanted to use (string, int, float) all implement IConvertible. However, some other types that would be nice to support, such as Math.Vector, don’t. I went with this method, and in turn had to limit the arguments to IConvertibles using the ‘where’ keyword.
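Here is a minimal sketch of how the IConvertible route can look once the ‘where’ constraint is in place. ConvertibleDemo and AppendInt() are hypothetical stand-ins (AppendInt plays the role of the Concat() method from the previous article), and only int and string are handled; null strings are not dealt with.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class ConvertibleDemo
{
    public static void FormatArgument<T>( StringBuilder string_builder, T arg )
        where T : IConvertible
    {
        // GetTypeCode() and ToInt32() are called through the constraint, so 'arg' is not boxed.
        switch ( arg.GetTypeCode() )
        {
            case TypeCode.Int32:
                AppendInt( string_builder, arg.ToInt32( NumberFormatInfo.CurrentInfo ) );
                break;
            case TypeCode.String:
                // For a string this hands back the same instance.
                string_builder.Append( arg.ToString( null ) );
                break;
            default:
                throw new NotSupportedException( "Type not handled in this sketch" );
        }
    }

    // Writes the decimal digits one char at a time, without creating a temporary string.
    private static void AppendInt( StringBuilder string_builder, int value )
    {
        uint magnitude;
        if ( value < 0 )
        {
            string_builder.Append( '-' );
            magnitude = (uint)( -(long)value ); // via long so int.MinValue works
        }
        else
        {
            magnitude = (uint)value;
        }

        // Find the largest power of ten that is not bigger than the magnitude.
        uint divisor = 1;
        while ( magnitude / divisor >= 10 )
        {
            divisor *= 10;
        }

        // Emit digits from most significant to least significant.
        while ( divisor > 0 )
        {
            string_builder.Append( (char)( '0' + ( magnitude / divisor ) % 10 ) );
            divisor /= 10;
        }
    }
}
```

Switching on GetTypeCode() rather than comparing GetType() keeps the dispatch to a simple enum check.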

That’s about it for the nitty-gritty on some of my decisions. The rest of the code is just a classic char-by-char over the formatted string. It’ll look for the ‘{‘ open curly bracket character and then use the Concat() methods from my previous article to push in the parameters. So, here’s the code:


Zip File StringBuilderExtFormat.zip
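The zip file has the real implementation. Purely as an illustration of the char-by-char scan described above, here is a much-reduced sketch that assumes only plain ‘{n}’ tokens (no format specifiers such as ‘{0:X}’) and exactly two arguments. ScanDemo, SimpleFormat and AppendArg are invented names, and AppendArg() uses ToString() as a placeholder, so unlike the real code this one is not garbage-free.

```csharp
using System;
using System.Text;

public static class ScanDemo
{
    public static StringBuilder SimpleFormat<A, B>( StringBuilder string_builder,
        string format_string, A arg0, B arg1 )
        where A : IConvertible
        where B : IConvertible
    {
        int i = 0;
        while ( i < format_string.Length )
        {
            char c = format_string[i];
            if ( c != '{' )
            {
                // Ordinary character: copy it straight across.
                string_builder.Append( c );
                i++;
                continue;
            }

            int close = format_string.IndexOf( '}', i );
            if ( close <= i + 1 )
            {
                throw new FormatException( "Empty or unterminated '{' token" );
            }

            // Parse the argument index between the braces.
            int index = 0;
            for ( int j = i + 1; j < close; j++ )
            {
                char digit = format_string[j];
                if ( digit < '0' || digit > '9' )
                {
                    throw new FormatException( "Only plain {n} tokens are supported here" );
                }
                index = index * 10 + ( digit - '0' );
            }

            switch ( index )
            {
                case 0: AppendArg( string_builder, arg0 ); break;
                case 1: AppendArg( string_builder, arg1 ); break;
                default: throw new FormatException( "Argument index out of range" );
            }
            i = close + 1;
        }
        return string_builder;
    }

    private static void AppendArg<T>( StringBuilder string_builder, T arg )
        where T : IConvertible
    {
        // Placeholder: allocates for numeric types, unlike the Concat() methods.
        string_builder.Append( arg.ToString( null ) );
    }
}
```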

Performance

I was definitely curious as to how my code performs. There’s certainly more scope for optimization, but I’m not keen on spending time on it unless it’s something to worry about. Using the Stopwatch class I measured the runtime of this piece of code:

string_builder.ConcatFormat(
    "Test {0:0.0000} Test {1:X} Test {2} {3}",
    3.45111111f, 0xBEEF, 12345678, "Hello World" );

A nice taxing format string with varied parameters. Here’s a table of the performance figures, measured from a loop calling this method 1,000,000 times.

                                        Windows PC          Xbox 360
                                        .NET’s    Mine      .NET’s    Mine
Unoptimized build – In debugger         2.43s     3.23s     27.57s    32.03s
Optimized build – In debugger           2.43s     2.61s     27.57s    28.43s
Optimized build – No debugger, jit-ed   2.38s     1.84s     20.33s    18.03s
Garbage generated                       284MB     0 bytes   381MB     0 bytes

 
For both times and garbage, smaller is better! 🙂
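For reference, a timing loop of the kind described above might look something like this. It is a sketch rather than the actual benchmark harness: TimingDemo is a made-up name, and it times the stock AppendFormat() only.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class TimingDemo
{
    public static TimeSpan TimeAppendFormat( int iterations )
    {
        StringBuilder string_builder = new StringBuilder( 256 );
        Stopwatch watch = Stopwatch.StartNew();
        for ( int i = 0; i < iterations; i++ )
        {
            // Reset rather than reallocate, so only AppendFormat() itself is measured.
            string_builder.Length = 0;
            string_builder.AppendFormat(
                "Test {0:0.0000} Test {1:X} Test {2} {3}",
                3.45111111f, 0xBEEF, 12345678, "Hello World" );
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Calling TimeAppendFormat( 1000000 ) in each build configuration gives figures comparable in kind to the .NET columns; the absolute numbers will of course depend on the machine.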

The notable thing is that the native AppendFormat() performs the same whether my build is optimized or not. I’m not entirely familiar with the ins and outs of .NET assemblies just yet, but I do wonder if I could put these routines in a separate library, and then set things up so that they are jit-ed regardless of whether my main code module is run in the debugger or not. Not something I’m too interested in right now, but just throwing it out there. I’m sure Google is my friend on this one!

Anyhow, I’m happy that on the final build of the game I’ll have better performance than the stock .NET method. Also with significantly less garbage of course. There’s plenty of scope for optimization on my methods too, they’re pretty much a naïve first-go in all cases. With a little time and effort I think a good chunk could come off these times. Knowing it performs well though is good enough for me right now, I’ll give it more attention down the road if I need to.

Things to try

A Format() that does everything?

Obviously the version I’ve written doesn’t offer anywhere near the same fully-featured functionality as the stock .NET Format(). I don’t envision myself needing any more features for a game though to be honest. If I did come across something I wanted it wouldn’t be too much of a pain to implement I’m sure. But, take a look at the .NET docs, and the AppendFormat() source code:

http://labs.developerfusion.co.uk/SourceViewer/view/SSCLI/System.Text/StringBuilder/

The AppendFormat() in StringBuilder is the one that is also used by System.String and I’d assume the rest of the .NET code. Using this source and any documentation for the formatting specifiers, it’d be possible to implement the same level of functionality but using my techniques to avoid garbage. Quite a task though I’d think!
 

Supporting arbitrary parameter types

Another thing to try is a second method of variable arguments I played around with: one that doesn’t use generics, but instead uses a base class. For an example, here’s a method prototype with two arguments:

public static StringBuilder ConcatFormat( this StringBuilder string_builder,
    String format_string, FormatArgument arg1, FormatArgument arg2 )

The ‘FormatArgument’ class is an abstract base class. What you’d do is implement subclasses for each supported type, e.g.:

public class FormatArgumentInt : FormatArgument

Then have a virtual method for appending the type into the StringBuilder. The crux of this whole technique is in the use of implicit conversion operators, so that depending on the argument you specify to ConcatFormat(), you’ll get it converted into a new FormatArgument type. Here’s an example:

public static implicit operator FormatArgument( int value )
{
    FormatArgumentInt arg = new FormatArgumentInt( value );
    return arg;
}

But hold on, doesn’t this example allocate a temporary class instance, and therefore generate garbage when used? Indeed it does, but you can use a pool container to deal with this. Pre-allocating all of these and dealing out pool items when they’re created solves the garbage issue. You’ll also need to manually dispose of them and add them back to the pool.
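A very small sketch of the pooling idea follows. The Rent() and Release() names, the fixed pool size of 16 and the fall-back allocation are all assumptions made for illustration, and the pool is not thread-safe. AppendTo() uses StringBuilder.Append( int ) as a placeholder, which is not itself garbage-free.

```csharp
using System.Text;

public abstract class FormatArgument
{
    public abstract void AppendTo( StringBuilder string_builder );
    public abstract void Release();

    public static implicit operator FormatArgument( int value )
    {
        // Take a pre-allocated instance instead of calling 'new' on every conversion.
        return FormatArgumentInt.Rent( value );
    }
}

public sealed class FormatArgumentInt : FormatArgument
{
    private static readonly FormatArgumentInt[] s_pool = CreatePool( 16 );
    private static int s_free_count = 16;

    private int m_value;

    private static FormatArgumentInt[] CreatePool( int size )
    {
        FormatArgumentInt[] pool = new FormatArgumentInt[size];
        for ( int i = 0; i < size; i++ )
        {
            pool[i] = new FormatArgumentInt();
        }
        return pool;
    }

    public static FormatArgumentInt Rent( int value )
    {
        // Falls back to a normal allocation if the pool runs dry.
        FormatArgumentInt arg = s_free_count > 0 ? s_pool[--s_free_count] : new FormatArgumentInt();
        arg.m_value = value;
        return arg;
    }

    public override void Release()
    {
        // Hand the instance back; the caller must not use it afterwards.
        if ( s_free_count < s_pool.Length )
        {
            s_pool[s_free_count++] = this;
        }
    }

    public override void AppendTo( StringBuilder string_builder )
    {
        string_builder.Append( m_value ); // placeholder for a garbage-free append
    }
}
```

A ConcatFormat() built this way would call AppendTo() and then Release() on each argument once it has been written out.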

It’s a lot of work. The big upside is that you can support any types, such as Math.Matrix and Vector. It’d be cool to be able to use those if you’re using Format() for debugging. In my experiments though, performance wasn’t too hot, it was in the region of 1.5-2 times slower than the generics method I detailed earlier. The code also becomes a lot more unwieldy; the generics method looked like a work of art by comparison!

Maybe there’s a way of using the generics method on arbitrary non-IConvertible types? Without incurring garbage on the conversions too of course.
 

Printf

I like the positional arguments of C#’s Format. C/C++’s printf() stipulates that the argument list has to be used in the same order that the params are specified in the formatting string. This makes the .NET way useful for localizations particularly, where something like:

Player {0} got {1} points!

Could become:

Allez {1} punts spiele {0}!  << Just a made-up foreign language

Essentially the translation requires those parameters to appear in a different order. With printf() it’s painful to do this correctly.
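As a quick illustration with the stock string.Format() (LocalizationDemo and Describe are made-up names), the calling code stays the same and only the template changes:

```csharp
public static class LocalizationDemo
{
    // The argument order in code is fixed; each template decides where they appear.
    public static string Describe( string template, string player, int points )
    {
        return string.Format( template, player, points );
    }
}
```

Passing "Player {0} got {1} points!" puts the name first, while passing the made-up translation "Allez {1} punts spiele {0}!" puts the points first, with no change to the call itself.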

That said though if you’ve used printf() for over a decade like I have, switching to Format() might not be quite as fun. So why not implement printf() in C#? It’s really only a minor modification over the code I provide above. I was a little tempted myself, but I really want to stick with doing things the C# way for now. 🙂

Comments

Comment from zigzag
Time: November 9, 2010, 8:21 am

I must say that these extensions are very very nice. They have helped me reduce GCs in my project. And having garbage free format methods is just great. Good job!

Comment from Jason Doucette
Time: June 7, 2011, 11:47 pm

Hello Gavin, I was just about to program a custom StringBuilder.Format() function when I found yours. While I don’t need something extravagant, I think yours will do fine. I’ve already made similar things just for custom situations, nothing as general as what you’ve made. I’m impressed by your work.

But it doesn’t cover one issue that I was having with the creation of my own — and that’s proper culture information (internationalization), since not everyone uses a period for the decimal point, or uses a comma for the thousands separator.

In any case, I’d love to help you update your source, since I’ve already done these things, and can point you quickly in the right direction. Please contact me at jason@xona.com. I’m Jason, as you can see, and I’m the lead programmer for Xona Games.

All the best,
Jason

Comment from Jason Doucette
Time: June 8, 2011, 12:03 am

Actually, I now realize that you don’t support the {0:#,0} format which is the same as {0:0} except that it inserts commas as the thousands separator — good stuff for showing large scores. That’s too bad.

I’ve already updated the code you have already for proper culture information. You just have to replace the ‘.’ (and the ‘,’ if you ever implement that) with a reference to a char (actually a string) extracted from CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator (and CultureInfo.CurrentCulture.NumberFormat.NumberGroupSeparator).

Anyway, give me a shout, if you like.
Jason

Comment from Gavin
Time: June 14, 2011, 3:39 pm

Thanks, dropping you an email now. Hopefully I can edit this page to include a link to your updated code.

Comment from twobob
Time: August 6, 2014, 7:17 pm

Very helpful on the xbox. TYVM

Comment from MrKii
Time: February 14, 2015, 1:42 pm

Would love a WinRT version to use with MonoGame! (they broke everything with the new store .NET runtime)

Comment from Alex Darby
Time: May 5, 2016, 2:27 pm

Hey Gavin.

Firstly I wanted to say thanks so much for trailblazing this stuff! Your code was a source of inspiration and saved me days of messing about.

I’m in a situation where I want to get GC churn free string handling in Unity and whilst it’s slightly different (being mono) it’s basically the same.

For various reasons (partly to do with behaviour of mono’s internal string buffer in StringBuilder) I chose to wrap StringBuilder rather than use the extension methods approach.

One quirk of this approach is that I wanted to pass instances of the wrapped stringbuilder into the generic formatting functions – as it turns out I did get this working…

Above, you say: “Maybe there’s a way of using the generics method on arbitrary non-IConvertible types? Without incurring garbage on the conversions too of course.”

Well there totally is, but it’s less clean than I’d like in an ideal world.

Essentially it’s this in overview:
1) any type you want to use as a format parameter needs to implement IConvertible (or at least IConvertible.GetTypeCode() )
2) You also need to write a function that can append this type onto a StringBuilder
3) Where your code switches on GetTypeCode(), you then need to check it’s actual type calling GetType() (GC free on mono)…
4) …once you know its type you have to pass it to an adaptor function that takes IConvertible and returns the type you want to pass to the function you wrote for point 2
5) Point 4 is the sneaky bit – it’s essentially forcing the generic type to be resolved into a type you can use to call a specific function, albeit manually.

It’s a trick I got from template meta programming pre-C++11. Happy to share the source with you via email if you wish.

Thanks again,

Alex

Comment from Alex Darby
Time: May 5, 2016, 2:30 pm

Argh! Typo! I put “it’s” instead of “its” in point 3 of my overview. Apostrophe abuse. I am so ashamed….

Comment from Julien
Time: September 5, 2016, 8:41 am

Thanks for sharing this. I’m taking the time to comment because with your extensions I now have so much less garbage related to strings in my game!

Thank you so much and very good job!
