This article discusses the incremental garbage collector (incremental GC), which was introduced in Ruby 2.2. We call this algorithm RincGC. RincGC allows for a shorter pause (GC pause time) compared to Ruby 2.1.
About the author:
Koichi Sasada (Koichi Sasada) works in Heroku along with Nobu and Matz on the ruby ​​core. He wrote
YARV , Generation Garbage Collector (RgenGC) for Ruby 2.1, as well as incremental GC for ruby ​​2.2 and this article.
Premise
Ruby uses GC to automatically collect unused objects. Thanks to the garbage collector, ruby ​​programmers do not have to delete objects manually and don’t have to worry about bugs during such a deletion.
The first version of Ruby has already used the mark and sweep (M & S) algorithm. M & S is one of the most simple GC algorithms, which consists of two stages:
1. Mark: walk through all living objects and mark them as “living object”
2. Sweep: dispose of all unlabeled objects as they are no longer used.
M & S is based on the knowledge that all objects found from among living objects are living objects. The M & S algorithm is very simple and because of this it works very well.
')
Mark & ​​sweep GC algorithmThis simple and efficient algorithm (and the conservative garbage collection method) makes it easy to write extensions in C. As a result, there are many useful extensions in Ruby. However, because of this algorithm, it is difficult to apply moving GC algorithms such as compaction and copying.
Now the writing of C-extensions is not so important, because we can use the FFI (foreign function interface). But in the beginning, having a large number of extensions and providing many features via C-extensions was a big advantage, which made the Ruby interpreter more popular.
Despite the fact that the M & S algorithm is simple and works fine, there are several problems. The most important potential problems are performance (throughput) and pause time (pause time). GC slows down your Ruby program due to the overhead of garbage collection. In other words, poor GC performance increases the total execution time of your application. Each garbage collection suspends your application. Long pauses affect web application UI / UX.
To solve the performance problem, Ruby 2.1 has a generational GC. Generational GC breaks the heap space into several parts for several generations (in Ruby, we divide the heap space into: young and old space). The newly created objects are in the “young space” and are accordingly marked as “young objects”. After the young objects survive several garbage collections (3 in Ruby 2.2), they go into the category of “old objects” and will be in the “old space”. We know that in object-oriented programming, most objects die young. Therefore, we need to start garbage collection only for the “young space”. If there is not enough space in the young space to create objects, then the garbage collector for the “old space” is started. We call “Minor GC” when the garbage collector works in a young space, and “Major GC” for everyone: for the young and for the old spaces. We implemented the generational GC algorithm with some changes and called our algorithm and the implementation of garbage collection "RGenGC". You can find out more details by looking at
my presentation and
slides on EuRuKo.
RGenGC significantly increases garbage collection performance due to the very fast Minor GC. It is worth noting that Major GC suspends program execution for a long time and this time is equal to the length of the pause in Ruby 2.0 and earlier versions. Most garbage collections are done by Minor GC.
Pause in Major GC and Minor GCTo solve a problem with a long pause, an incremental garbage collector is best.
Incremental garbage collection
The incremental garbage collection algorithm divides the build process itself into several small processes and alternates GC processes and Ruby processes. Instead of a long pause, the incremental garbage collector produces many short pauses. The total duration of the pauses will remain the same (or slightly more due to the overhead of using the incremental garbage collector), but each individual pause will be shorter. This allows for more stable performance.
In Ruby 1.9.3, a “lazy sweep” GC appeared, which reduces the duration of pauses in the sweep-phase. The meaning of the laze sweep is to start the sweep phase not immediately, but step by step. Lazy sweep reduces the duration of individual pauses and is half the incremental GC algorithm. Now we need to make the Major GC stage of work incremental.
Let's introduce three concepts to explain the process of incrementally marking objects: “white object” (white object) - unlabeled objects, “gray object” (gray object) - marked, but can refer to white objects, “black object” (black object) - marked, but do not indicate any white object.
Using these three colors, we can explain the “mark and sweep” algorithm as follows:
1. All existing objects are marked as white.
2. Explicit living objects, such as objects on the stack, are marked gray.
3. Select any gray object and mark the object to which it refers also gray. Change the color of the original object to black. Repeat until there are gray objects, and there will be only black and white.
4. Collect white objects, since all living objects are painted black.
To make the whole process incremental, it is necessary to make step (3) incremental. The plan is this: select some gray objects and also mark the objects they refer to with gray, then we switch to Ruby code execution, then we return to the tagging process, etc.
The usual tagging process (STW: stop the world) versus incrementalWith the incremental marking process of objects, there is one problem. Black objects may indicate white while Ruby code is being executed. This is a problem, since a black object by definition cannot refer to white objects. To prevent this occurrence, we use a write barrier (write-barrier) to detect the creation of such a black object reference to white.
For example, the array object 'ary' is already marked black.
ary = []
The object 'obj = Object.new' will be white if we execute this code.
ary << obj
Now the black object refers to the white object. If there are no gray objects referring to 'obj', then 'obj' will be white at the end of the tagging of objects and, accordingly, disposed of by mistake. Collecting all living objects is a blunder and should be avoided.
The write barrier is called every time an object starts to reference a new object. The entry barrier determines when the link from the black object to the white one was created. When this happens, the black object changes its color to gray (or gray with a pointer to a white object). Record barriers completely solve the problem of collecting all living objects.
This is the basic meaning of the incremental algorithm. As you can see, it is not so difficult. Perhaps you have a question: “why is Ruby still not using this simple GC algorithm?”
Ruby 2.2 incremental garbage collector
When implementing the incremental tagging process in the Ruby (CRuby) interpreter, a serious problem arises - the lack of write barriers.
Generational GC, which was implemented in Ruby 2.1, also needed write barriers. To implement a generational GC, we invented a new method - “write barrier unprotected objects” (write barrier unprotected objects). This means that we have divided all objects into protected and unprotected. In this way, we can guarantee that all referenced protected objects are under control. We cannot control links from unprotected objects. With the introduction of the concept of an unprotected object, you can implement a generational GC in Ruby 2.1
We can also properly implement incremental GC using unprotected objects:
1. Color all existing objects in white
2. Paint all obviously living objects in gray, including objects on the stack.
3. Take one gray object and paint in gray all the objects to which it refers. Change its color to black. Repeat until no gray objects remain. Only black and white. This phase is carried out in stages.
4. Collect white objects, because all living objects are black
So we can guarantee that we do not have white living objects.
Rescan from the barrier to write unprotected objects (WB unp.) At the end of the tagging processUnfortunately, the fourth stage can create a long pause, which we hope to avoid. However, the total pause time is related to the number of barriers to recording of live unprotected objects. Most objects in Ruby are String, Array, Hash, or simple (pure) objects created by the programmer. They are protected objects. In practice, the pause for the barrier to write unprotected objects does not create any problems in most cases.
We have done an incremental tagging process only for major GC, since Nobody complains about pauses in minor GC. The maximum pause length in our incremental GC is less than in minor GC. If you are satisfied with the pause time in minor GC, then you do not need to worry about the pause time in major GC.
I also used a trick to implement incremental GC in Ruby. We have a set of “black and unprotected” objects. In order for the garbage collector to work quickly, we created an “unprotected” bitmap, which is an unprotected object, and a separate “tagged” bitmap that shows which objects were marked. Using a logical work, we can find “black and unprotected” objects.
Estimation of the duration of pause incremental GC
In order to measure the duration of pauses in the process of garbage collection, we will use
gc_tracer . In gc_tracer, there is a GC: Tracer module, which allows you to track parameters related to the garbage collection process. gc_tracer prints every such parameter to a file.
Garbage collection includes the following events:
start
end_mark
end_sweep
newobj
freeobj
enter
exit
As I described above, GC in Ruby has two phases: “mark” and “sweep”. The event “start” means the start of the mark-phase, and “end_mark” - its completion. The end_mark event also marks the beginning of the sweep phase. Obviously, “end_sweep” speaks of the end of the sweep phase and also means the completion of the garbage collection process.
"Newobj" and "freeobj" are events when memory is allocated for an object and released.
We use the "enter" and "exit" events to measure the duration of pauses. Incremental GC (incremental marking and lazy sweeping) uses the mark-phase and sweep-phase suspension. "Enter" means "entering GC related event". Finally, “exit” means “exitting GC related event”
The following figure shows the distribution of events over time in the current incremental GC:

We can measure the current time (in Linux, the current time is the result of calling gettimeofday ()) for each event. Thus, we can measure the duration of pauses in the GC using the “enter” and “exit” events.
I use the
ko1-test-app to compare performance. ko1-test-app is a simple Rails application written for me by Aaron Patterson.
To use the gc_tracer jam, I added the rake rule "test_gc_tracer":
diff --git a/perf.rake b/perf.rake index f336e33..7f4f1bd 100644 --- a/perf.rake +++ b/perf.rake @@ -54,7 +54,7 @@ def do_test_task app body.close end -task :test do +def test_run app = Ko1TestApp::Application.instance app.app @@ -67,6 +67,22 @@ task :test do } end +task :test do + test_run +end + +task :test_gc_tracer do + require 'gc_tracer' + require 'pp' + pp GC.stat + file = "log.#{Process.pid}" + GC::Tracer.start_logging(file, events: %i(enter exit), gc_stat: false) do + test_run + end + pp GC.stat + puts "GC tracer log: #{file}" +end + task :once do app = Ko1TestApp::Application.instance app.app
And I launched
bundle exec rake test_gc_tracer KO1TEST_CNT = 30000 . A value of "30,000" means that we will simulate 30,000 requests. The results will be written to the file "log.xxxx", where xxxx is the process id of the application. The file should look like this:
type tick major_by gc_by have_finalizer immediate_sweep state enter 1419489706840147 0 newobj 0 0 sweeping exit 1419489706840157 0 newobj 0 0 sweeping enter 1419489706840184 0 newobj 0 0 sweeping exit 1419489706840195 0 newobj 0 0 sweeping enter 1419489706840306 0 newobj 0 0 sweeping exit 1419489706840313 0 newobj 0 0 sweeping enter 1419489706840612 0 newobj 0 0 sweeping ...
My file has 1,142,907 lines.
The “type” column contains the name of the event, “tick” - the current time (the result of gettimeofday (), the number of microseconds since the beginning of the era). We can see the duration of the pause using this information. Using the first two lines above, we can measure the pause length: 10 μs (1419489706840157 - 1419489706840147).
The following small script shows the duration of each pause:
enter_tick = 0 open(ARGV.shift){|f| f.each_line{|line| e, tick, * = line.split(/\s/) case e when 'enter' enter_tick = tick.to_i when 'exit' st = tick.to_i - enter_tick puts st if st > 100
There will be many lines in the log file, as this script prints a pause length every 100ÎĽs.
The following figure shows the measurement result:

We see that generational GC has 7 huge pauses. This should be a pause caused by the launch of major GC. The maximum pause time is approximately 15ms (15KÎĽs). However, incremental GC reduces the maximum pause time to 2ms (2KÎĽs). Fine.
Conclusion
Ruby 2.2 uses the incremental garbage collection algorithm to reduce pause time during garbage collection.
Note that incremental GC is not a “silver bullet”. As I have already described, incremental GC does not affect performance. This will not affect the response time if the request is too long and causes major GC several times. The total garbage collection time will not be reduced by the incremental GC.