[DotNetBook] Stackalloc: the forgotten C# command

With this article I continue publishing a series of articles that will eventually become a book on how the .NET CLR, and .NET as a whole, work. The whole book will be available on GitHub (link at the end of the article).

C# has a rather interesting and very rarely used keyword: stackalloc. It appears in code so rarely ("rarely" is an understatement; "never" would be closer) that finding a suitable example of its use is hard, and inventing one is harder still: if something is rarely used, there is too little experience with it. Why is that? Because for those who finally decide to find out what it does, stackalloc looks more frightening than useful: its dark side is unsafe code. The result it returns is not a managed pointer; it is a regular pointer to a region of unprotected memory. If you write to that address after the method has returned, you will overwrite either the local variables of some other method or a return address, after which the application will crash.

However, our task is to get into the farthest corners and find out what is hidden there. In particular, we should understand that this tool was not given to us so that we could find a hidden rake and step on it. On the contrary: it was given to us so that we could write truly fast software. I hope that has inspired you. Then let's get started.

Note

The chapter published on Habr is not updated, so it may already be somewhat outdated. For a more recent text, please refer to the original:
CLR Book: GitHub, table of contents
CLR Book: GitHub, chapter, link to the part `Memory allocation on the stack: stackalloc`
Release 0.5.2 of the book, PDF: GitHub Release

To find proper examples of this keyword in use, we should first turn to its authors, Microsoft, and see how they use it. A full-text search of the coreclr repository will do. Apart from various tests of the keyword itself, we find no more than 25 uses of it in library code. I hope the previous paragraph motivated you enough not to stop reading at the sight of this small number. To be honest, the CLR team is far more far-sighted and professional than the .NET Framework team, and if it has built something, that thing should help us a lot. And if it is not used in the .NET Framework... well, we can assume that not all engineers there know that such a powerful optimization tool exists. Otherwise it would be used far more widely.

Interop.ReadDir class

    unsafe
    {
        // s_readBufferSize is zero when the native implementation does not
        // support reading into a buffer.
        byte* buffer = stackalloc byte[s_readBufferSize];
        InternalDirectoryEntry temp;
        int ret = ReadDirR(dir.DangerousGetHandle(), buffer, s_readBufferSize, out temp);

        // We copy data into DirectoryEntry to ensure there are no dangling references.
        outputEntry = ret == 0 ?
            new DirectoryEntry() { InodeName = GetDirectoryEntryName(temp), InodeType = temp.InodeType } :
            default(DirectoryEntry);

        return ret;
    }

What is stackalloc used for here? As we can see, after allocating the memory the code calls an unsafe method that fills the created buffer with data. That is, the region the unsafe method needs for writing is allocated right on the stack, dynamically. This is a great optimization when you consider the alternatives: requesting a block of memory from the operating system, or pinning a .NET array, which in addition to loading the heap also burdens the GC, because the array has to stay pinned so that the GC does not move it while its data is being accessed. By allocating memory on the stack we risk nothing here: the allocation is almost instantaneous, and we can safely fill the buffer with data and leave the method. When the method exits, its stack frame disappears along with the buffer. Overall, the time savings are significant.
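The two alternatives can be put side by side in a minimal sketch. It assumes a hypothetical `FillBuffer` method standing in for a native call such as `ReadDirR`; the method, the class name and the buffer size are made up for illustration. Both variants need the project to allow unsafe code (`AllowUnsafeBlocks`).

```csharp
using System;

public static class StackBufferDemo
{
    private const int BufferSize = 64; // small, compile-time constant size

    // Hypothetical stand-in for a native function that fills a raw buffer.
    private static unsafe int FillBuffer(byte* buffer, int size)
    {
        for (int i = 0; i < size; i++)
        {
            buffer[i] = (byte)i;
        }
        return size;
    }

    // The buffer lives in the stack frame: no heap allocation, nothing to pin.
    public static unsafe int SumWithStackalloc()
    {
        byte* buffer = stackalloc byte[BufferSize];
        int written = FillBuffer(buffer, BufferSize);

        int sum = 0;
        for (int i = 0; i < written; i++)
        {
            sum += buffer[i];
        }
        return sum; // the buffer disappears together with the stack frame
    }

    // Alternative: a heap array pinned for the duration of the call.
    public static unsafe int SumWithPinnedArray()
    {
        byte[] array = new byte[BufferSize]; // heap allocation, extra work for the GC
        int sum = 0;
        fixed (byte* buffer = array)         // pinned so the GC cannot move it
        {
            int written = FillBuffer(buffer, BufferSize);
            for (int i = 0; i < written; i++)
            {
                sum += buffer[i];
            }
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumWithStackalloc());
        Console.WriteLine(SumWithPinnedArray());
    }
}
```

Both methods compute the same result; the difference is only where the temporary buffer lives and whether the GC ever hears about it.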

Let's take another example:

Class Number.Formatting::FormatDecimal

    public static string FormatDecimal(
        decimal value, ReadOnlySpan<char> format, NumberFormatInfo info)
    {
        char fmt = ParseFormatSpecifier(format, out int digits);

        NumberBuffer number = default;
        DecimalToNumber(value, ref number);

        ValueStringBuilder sb;
        unsafe
        {
            char* stackPtr = stackalloc char[CharStackBufferSize];
            sb = new ValueStringBuilder(new Span<char>(stackPtr, CharStackBufferSize));
        }

        if (fmt != 0)
        {
            NumberToString(ref sb, ref number, fmt, digits, info, isDecimal:true);
        }
        else
        {
            NumberToStringFormat(ref sb, ref number, format, info);
        }

        return sb.ToString();
    }

This is a number-formatting example that relies on an even more interesting one: the ValueStringBuilder class, which is built on Span<T>. The point of this code is that, to assemble the text representation of the formatted number as quickly as possible, it does not allocate heap memory for the character buffer. This fine code places the buffer directly in the method's stack frame, so the garbage collector never has to deal with the StringBuilder instances that the method would otherwise create. The method itself also runs faster, since allocating memory on the heap takes time too. And using Span<T> instead of bare pointers adds a measure of safety to code based on stackalloc.
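The same idea can be tried without any unsafe code. The sketch below is not the library's implementation; `FormatPair`, the class name and the 64-character limit are assumptions made for illustration. It relies only on `Int32.TryFormat` and the `string(ReadOnlySpan<char>)` constructor, which are available starting with .NET Core 2.1.

```csharp
using System;
using System.Globalization;

public static class StackFormatDemo
{
    private const int MaxStackChars = 64; // hypothetical upper bound for the stack buffer
    private const int MaxInt32Chars = 11; // "-2147483648" is the longest Int32 text

    // Builds "<name>=<value>" in a stack buffer; only the final string is heap-allocated.
    public static string FormatPair(ReadOnlySpan<char> name, int value)
    {
        // If the text cannot be guaranteed to fit, use ordinary heap concatenation.
        if (name.Length > MaxStackChars - 1 - MaxInt32Chars)
        {
            return name.ToString() + "=" + value.ToString(CultureInfo.InvariantCulture);
        }

        Span<char> buffer = stackalloc char[MaxStackChars];

        name.CopyTo(buffer);
        int pos = name.Length;
        buffer[pos++] = '=';

        // Write the digits straight into the rest of the stack buffer.
        if (!value.TryFormat(buffer.Slice(pos), out int written, default, CultureInfo.InvariantCulture))
        {
            // Not expected given the size check above; kept as a defensive fallback.
            return name.ToString() + "=" + value.ToString(CultureInfo.InvariantCulture);
        }
        pos += written;

        return new string(buffer.Slice(0, pos));
    }

    public static void Main()
    {
        Console.WriteLine(FormatPair("answer", 42)); // prints "answer=42"
    }
}
```

The buffer size is a compile-time constant, so the amount of stack consumed does not depend on the input.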

And lastly, let's look at one more example: the ValueStringBuilder class itself, which is designed to be used with stackalloc. Without stackalloc, this class would not exist.

Class ValueStringBuilder

    internal ref struct ValueStringBuilder
    {
        private char[] _arrayToReturnToPool;
        private Span<char> _chars;
        private int _pos;

        public ValueStringBuilder(Span<char> initialBuffer)
        {
            _arrayToReturnToPool = null;
            _chars = initialBuffer;
            _pos = 0;
        }

        public int Length
        {
            get => _pos;
            set
            {
                int delta = value - _pos;
                if (delta > 0)
                {
                    Append('\0', delta);
                }
                else
                {
                    _pos = value;
                }
            }
        }

        public override string ToString()
        {
            var s = new string(_chars.Slice(0, _pos));
            Clear();
            return s;
        }

        public void Insert(int index, char value, int count)
        {
            if (_pos > _chars.Length - count)
            {
                Grow(count);
            }

            int remaining = _pos - index;
            _chars.Slice(index, remaining).CopyTo(_chars.Slice(index + count));
            _chars.Slice(index, count).Fill(value);
            _pos += count;
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public void Append(char c)
        {
            int pos = _pos;
            if (pos < _chars.Length)
            {
                _chars[pos] = c;
                _pos = pos + 1;
            }
            else
            {
                GrowAndAppend(c);
            }
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        private void GrowAndAppend(char c)
        {
            Grow(1);
            Append(c);
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        private void Grow(int requiredAdditionalCapacity)
        {
            Debug.Assert(requiredAdditionalCapacity > _chars.Length - _pos);

            char[] poolArray = ArrayPool<char>.Shared.Rent(
                Math.Max(_pos + requiredAdditionalCapacity, _chars.Length * 2));

            _chars.CopyTo(poolArray);

            char[] toReturn = _arrayToReturnToPool;
            _chars = _arrayToReturnToPool = poolArray;
            if (toReturn != null)
            {
                ArrayPool<char>.Shared.Return(toReturn);
            }
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private void Clear()
        {
            char[] toReturn = _arrayToReturnToPool;
            // for safety, to avoid using pooled array if this instance is erroneously appended to again
            this = default;
            if (toReturn != null)
            {
                ArrayPool<char>.Shared.Return(toReturn);
            }
        }

        // Omitted methods:
        //   private void AppendSlow(string s);
        //   public bool TryCopyTo(Span<char> destination, out int charsWritten);
        //   public void Append(string s);
        //   public void Append(char c, int count);
        //   public unsafe void Append(char* value, int length);
        //   public Span<char> AppendSpan(int length);
    }

In functionality this class is similar to its older brother `StringBuilder`, but it has one interesting and very important feature: it is a value type. That is, it is passed entirely by value. The new `ref` modifier in the type declaration tells us that the type has an additional restriction: it may live only on the stack. That is, placing its instances in class fields results in a compile-time error. Why all this ceremony? To answer that question, just look at the `StringBuilder` class:

Class StringBuilder

    public sealed class StringBuilder : ISerializable
    {
        // A StringBuilder is internally represented as a linked list of
        // blocks each of which holds a chunk of the string. It turns
        // out string as a whole can also be represented as just a chunk,
        // so that is what we do.

        // The characters in this block
        internal char[] m_ChunkChars;

        // Link to the block logically before this block
        internal StringBuilder m_ChunkPrevious;

        // The index in m_ChunkChars that represent the end of the block
        internal int m_ChunkLength;

        // The logical offset (sum of all characters in previous blocks)
        internal int m_ChunkOffset;

        internal int m_MaxCapacity = 0;

        // ...

        internal const int DefaultCapacity = 16;

        // ...
    }

StringBuilder is a class that holds a reference to an array of characters. That is, creating one creates at least two objects: the StringBuilder itself and a character array of at least 16 characters (which, by the way, is why it is so important to specify the expected length of the string: otherwise it is built as a singly linked list of chunks, starting from a 16-character array. A waste, you will agree). What does this mean for ValueStringBuilder? It has no default capacity, since it borrows memory from outside; on top of that, it is a value type and pushes the user to place the character buffer on the stack. As a result, the whole instance ends up on the stack together with its contents, and the optimization question is settled. No heap allocation? No performance degradation.

But you may ask: why not always use ValueStringBuilder (or a self-written version of it, since the original is internal and not available to us)? The answer: look at the problem you are solving. Will the resulting string have a known size? Will it have a known maximum length? If the answer is "yes" and the size of the string stays within reasonable limits, you can use the value-type version of StringBuilder. Otherwise, if long strings are expected, switch to the regular version.
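A self-written version does not have to be large. The following is a heavily simplified sketch, not the real ValueStringBuilder: the `StackCharWriter` type, its `TryAppend` method and the 32-character buffer are assumptions made for illustration. It never grows; when the text does not fit, the caller falls back to the heap.

```csharp
using System;

// Hypothetical, heavily simplified cousin of ValueStringBuilder:
// it never grows and only writes into the buffer it was given.
public ref struct StackCharWriter
{
    private readonly Span<char> _buffer;
    private int _pos;

    public StackCharWriter(Span<char> buffer)
    {
        _buffer = buffer;
        _pos = 0;
    }

    // Returns false instead of growing when the text does not fit.
    public bool TryAppend(ReadOnlySpan<char> text)
    {
        if (text.Length > _buffer.Length - _pos)
        {
            return false;
        }
        text.CopyTo(_buffer.Slice(_pos));
        _pos += text.Length;
        return true;
    }

    public override string ToString() => new string(_buffer.Slice(0, _pos));
}

public static class StackCharWriterDemo
{
    // class Holder { StackCharWriter _writer; }
    // The line above would not compile: a ref struct cannot be a field of a class.

    public static string BuildGreeting(string name)
    {
        Span<char> buffer = stackalloc char[32]; // known, small maximum size
        var writer = new StackCharWriter(buffer);

        if (writer.TryAppend("Hello, ") && writer.TryAppend(name) && writer.TryAppend("!"))
        {
            return writer.ToString();
        }

        // The text did not fit: use ordinary heap concatenation instead.
        return "Hello, " + name + "!";
    }

    public static void Main()
    {
        Console.WriteLine(BuildGreeting("CLR")); // prints "Hello, CLR!"
    }
}
```

Because the type is a ref struct, the compiler rejects any attempt to box it, capture it in a lambda or store it in a heap object, which is what keeps the stack buffer from outliving its frame.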

Before moving on to the conclusions, it is worth mentioning what you must not do, or what is simply dangerous. In other words, code that may work well for a while and then blow up at the most inconvenient moment. Again, consider an example:

    void GenerateNoise(int noiseLength)
    {
        // Dangerous: the size of the stack allocation is controlled by the caller.
        Span<int> buf = stackalloc int[noiseLength];
        // generate noise
    }

The code is small but treacherous: you must not accept the size of a stack allocation from outside. If you really need an externally specified size, and your code is used only by consumers you know, accept the buffer itself instead:

    void GenerateNoise(Span<int> noiseBuf)
    {
        // generate noise
    }

This code is much more informative, since it makes the caller think and be careful when choosing the size. The first version, under unlucky circumstances, can throw a StackOverflowException even when the method sits fairly shallow in the thread's stack: passing a large number as the parameter is enough.
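A minimal sketch of the buffer-taking variant from the caller's side. The class name, the helper methods and the random-number body are assumptions added for illustration; only the `GenerateNoise(Span<int>)` signature comes from the text above.

```csharp
using System;

public static class NoiseDemo
{
    // The callee never decides how much stack to use; it only fills what it is given.
    public static void GenerateNoise(Span<int> noiseBuf)
    {
        var random = new Random();
        for (int i = 0; i < noiseBuf.Length; i++)
        {
            noiseBuf[i] = random.Next(0, 256);
        }
    }

    // The caller knows the size is a small constant, so the stack is a safe choice.
    public static int SumSmallNoise()
    {
        Span<int> buffer = stackalloc int[32];
        GenerateNoise(buffer);

        int sum = 0;
        foreach (int value in buffer)
        {
            sum += value;
        }
        return sum;
    }

    // For a size that comes from outside, the caller uses a heap array instead.
    public static int[] LargeNoise(int length)
    {
        int[] buffer = new int[length];
        GenerateNoise(buffer); // int[] converts implicitly to Span<int>
        return buffer;
    }

    public static void Main()
    {
        Console.WriteLine(SumSmallNoise());
        Console.WriteLine(LargeNoise(100_000).Length);
    }
}
```

The same method serves both callers, and the decision about where the memory lives stays with the code that knows the size.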

The second problem I see: if the buffer we allocated on the stack turns out to be too small, but we do not want to lose performance, there are several ways to go: allocate memory on the stack once more, or allocate it on the heap. In most cases the second option is preferable (it is what `ValueStringBuilder` does), since it is safer with respect to `StackOverflowException`.
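One common way to combine the two is to pick the stack for small sizes and a pooled heap array for everything else. The sketch below is only an illustration under assumptions: the `Reverse` method, the class name and the 256-character threshold are made up, while `ArrayPool<char>.Shared` is the same pool that `ValueStringBuilder` rents from.

```csharp
using System;
using System.Buffers;

public static class StackOrPoolDemo
{
    private const int StackLimit = 256; // hypothetical threshold, in chars

    // Reverses the UTF-16 code units of the input using a temporary buffer.
    public static string Reverse(string text)
    {
        char[] rented = null;

        // Small inputs use the stack; large ones rent an array from the shared pool.
        Span<char> buffer = text.Length <= StackLimit
            ? stackalloc char[StackLimit]
            : (rented = ArrayPool<char>.Shared.Rent(text.Length));

        try
        {
            Span<char> slice = buffer.Slice(0, text.Length);
            text.AsSpan().CopyTo(slice);
            slice.Reverse();
            return new string(slice);
        }
        finally
        {
            // Only the pooled array has to be given back; the stack buffer
            // vanishes with the frame.
            if (rented != null)
            {
                ArrayPool<char>.Shared.Return(rented);
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(Reverse("stackalloc")); // prints "collakcats"
    }
}
```

The stack allocation always has the same constant size, so a large input can never push the method toward a stack overflow.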

Conclusions to stackalloc

So, when and how is `stackalloc` best used?

For working with unmanaged code, when an unmanaged method has to fill a data buffer, or when you receive from an unmanaged method a data buffer that is used only within the lifetime of the method body;
For methods that need an array, but again only for the duration of the method itself. The formatting example is a very good one: such a method may be called too often to afford allocating temporary arrays on the heap.
If you use the unsafe variant, carefully check the methods that receive the pointer (void *): if they pass it on somewhere, the stack may later be corrupted, because you cannot guarantee that an external method will not decide to keep the pointer, for example for caching. If you are sure that this is ruled out, the usage is safe.
If you can use ref struct types or the Span type, working with stackalloc moves into the realm of managed code, which means the compiler will simply not let you use the type in unintended ways.

Using this allocator can greatly improve the performance of your applications.

Link to the whole book

CLR Book: GitHub
Release 0.5.0 of the book, PDF: GitHub Release

Source: https://habr.com/ru/post/348130/
