
Data Virtualization in WPF

Good day.

I have long been interested in writing my own class for loading information from a database efficiently, for example when the number of records exceeds 10 million: lazy loading of data, support for multiple data sources, and so on.

I did not find a post dedicated to this topic on Habr, so I present my translation of the article by Paul McClean, which became my starting point for solving these problems.

Original article: here
Project source files: here
From here on, the text is written from the author's point of view.

Introduction


WPF provides some interesting UI virtualization features for working efficiently with large collections, at least as far as the user interface is concerned, but it does not provide a general method for data virtualization. Although data virtualization is discussed in many forum posts, no one (as far as I know) has published a solution. This article presents one such solution.

Background


UI Virtualization

When a WPF ItemsControl is bound to a large source collection with UI virtualization enabled, the control creates visual containers only for the visible items (plus a few above and below). This is usually a small fraction of the source collection. As the user scrolls through the list, new visual containers are created when items become visible, and old containers are disposed of when items are no longer visible. When container recycling is enabled, the containers are reused instead, which reduces the overhead of creating and destroying objects.

UI virtualization means that a control can be bound to a large collection and still occupy little memory, because only a small number of containers exist at any time.

Data virtualization

Data virtualization is the term for achieving virtualization of the data objects bound to an ItemsControl. WPF does not provide data virtualization. For relatively small collections of simple objects, memory consumption does not matter. For large collections, however, memory consumption can become very significant. In addition, retrieving data from a database or creating the objects can take a long time, especially when network operations are involved. For these reasons, it is advisable to use some kind of data virtualization mechanism to limit the number of data objects that are fetched from the source and held in memory.

Solution


Overview

This solution relies on the fact that when an ItemsControl is bound to an IList implementation rather than an IEnumerable, it does not enumerate the entire list; it accesses only the items it needs for display. It uses the Count property to determine the size of the collection, presumably to size the scroll bar. It then accesses the on-screen items through the list indexer. Thus, you can create an IList that reports a large number of items but fetches them only when they are needed.
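The idea can be tried in isolation before any paging is added. The following is a minimal sketch, not the article's class: `OnDemandList` and its `Requests` counter are hypothetical names introduced here, and the list implements only the non-generic `IList`. It reports a large `Count` and creates an item only when the indexer is called.

```csharp
using System;
using System.Collections;

// Hypothetical illustration: a read-only list that claims a large Count
// but creates an item only when its index is requested.
public class OnDemandList : IList
{
    private readonly int _count;
    private readonly Func<int, object> _createItem;

    // Number of indexer calls so far; shows how few items are actually touched.
    public int Requests { get; private set; }

    public OnDemandList(int count, Func<int, object> createItem)
    {
        _count = count;
        _createItem = createItem;
    }

    public int Count { get { return _count; } }

    public object this[int index]
    {
        get
        {
            if (index < 0 || index >= _count)
                throw new ArgumentOutOfRangeException("index");
            Requests++;
            return _createItem(index);
        }
        set { throw new NotSupportedException(); }
    }

    public bool IsReadOnly { get { return true; } }
    public bool IsFixedSize { get { return true; } }
    public bool IsSynchronized { get { return false; } }
    public object SyncRoot { get { return this; } }

    // Enumeration still works, but it walks every item; avoiding this is the point.
    public IEnumerator GetEnumerator()
    {
        for (int i = 0; i < _count; i++)
            yield return this[i];
    }

    // Simplification for this sketch: items are never searched for.
    public int IndexOf(object value) { return -1; }
    public bool Contains(object value) { return false; }

    public void CopyTo(Array array, int index) { throw new NotSupportedException(); }
    public int Add(object value) { throw new NotSupportedException(); }
    public void Clear() { throw new NotSupportedException(); }
    public void Insert(int index, object value) { throw new NotSupportedException(); }
    public void Remove(object value) { throw new NotSupportedException(); }
    public void RemoveAt(int index) { throw new NotSupportedException(); }
}
```

Constructing `new OnDemandList(10000000, i => "Item " + i)` and reading a single index leaves `Requests` at 1, no matter how large the reported `Count` is.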

IItemsProvider<T>

To use this solution, the underlying source must be able to report the number of items in the collection and to provide a small part (a page) of the entire collection. These requirements are expressed in the IItemsProvider<T> interface.

    /// <summary>
    /// Represents a provider of collection details.
    /// </summary>
    /// <typeparam name="T">The type of items in the collection.</typeparam>
    public interface IItemsProvider<T>
    {
        /// <summary>
        /// Fetches the total number of items available.
        /// </summary>
        /// <returns></returns>
        int FetchCount();

        /// <summary>
        /// Fetches a range of items.
        /// </summary>
        /// <param name="startIndex">The start index.</param>
        /// <param name="count">The number of items to fetch.</param>
        /// <returns></returns>
        IList<T> FetchRange(int startIndex, int count);
    }

If the underlying data source is a database query, it is relatively easy to implement the IItemsProvider<T> interface using the COUNT() aggregate function and the OFFSET and LIMIT clauses offered by most database providers.
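As a rough illustration of that mapping, the sketch below builds the two SQL statements such a provider could run. `PagingSql`, the table name, and the column name are hypothetical. The `LIMIT ... OFFSET` form is the one used by PostgreSQL, MySQL, and SQLite; SQL Server spells it `OFFSET ... ROWS FETCH NEXT ... ROWS ONLY`. The helper only produces query text; executing it through a data access library is left out.

```csharp
using System;

// Hypothetical helper: builds the queries behind FetchCount() and FetchRange().
// Table and column names must come from trusted code, never from user input,
// because they are concatenated into the SQL text.
public static class PagingSql
{
    // Query for FetchCount(): the total number of rows.
    public static string CountQuery(string table)
    {
        return "SELECT COUNT(*) FROM " + table;
    }

    // Query for FetchRange(startIndex, count): one page of rows.
    // A stable ORDER BY is needed so that the same index maps to the same row.
    public static string PageQuery(string table, string orderBy, int startIndex, int count)
    {
        if (startIndex < 0) throw new ArgumentOutOfRangeException("startIndex");
        if (count <= 0) throw new ArgumentOutOfRangeException("count");

        return "SELECT * FROM " + table +
               " ORDER BY " + orderBy +
               " LIMIT " + count +
               " OFFSET " + startIndex;
    }
}
```

For example, `PagingSql.PageQuery("customers", "id", 200, 100)` yields `SELECT * FROM customers ORDER BY id LIMIT 100 OFFSET 200`, which corresponds to the third page when the page size is 100.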

VirtualizingCollection<T>

This is an implementation of the IList interface that performs the data virtualization. VirtualizingCollection<T> divides the whole collection into a series of pages. Pages are loaded into memory as needed and released when they are no longer required.

The interesting points are discussed below. For details, refer to the source code attached to this article.

The first aspect of the IList implementation is the implementation of the Count property. It is used by the ItemsControl control to estimate the size of the collection and draw the scroll bar.
    private int _count = -1;

    public virtual int Count
    {
        get
        {
            if (_count == -1)
            {
                LoadCount();
            }
            return _count;
        }
        protected set
        {
            _count = value;
        }
    }

    protected virtual void LoadCount()
    {
        Count = FetchCount();
    }

    protected int FetchCount()
    {
        return ItemsProvider.FetchCount();
    }

The Count property is implemented using the lazy load pattern. It uses the special value -1 to indicate that the count has not been loaded yet. On first access, the property loads the actual number of items from the ItemsProvider.

Another important aspect of the IList interface is the indexer implementation.
    public T this[int index]
    {
        get
        {
            // determine which page and offset within the page
            int pageIndex = index / PageSize;
            int pageOffset = index % PageSize;

            // request the primary page
            RequestPage(pageIndex);

            // if accessing the upper 50% of the page, request the next page
            if (pageOffset > PageSize / 2 && pageIndex < Count / PageSize)
                RequestPage(pageIndex + 1);

            // if accessing the lower 50% of the page, request the previous page
            if (pageOffset < PageSize / 2 && pageIndex > 0)
                RequestPage(pageIndex - 1);

            // remove stale pages
            CleanUpPages();

            // defensive check in case of asynchronous loading
            if (_pages[pageIndex] == null)
                return default(T);

            // return the requested item
            return _pages[pageIndex][pageOffset];
        }
        set { throw new NotSupportedException(); }
    }

The indexer is the most distinctive part of the solution. First, it determines which page the requested item belongs to (pageIndex) and the item's offset within that page (pageOffset). Then it calls the RequestPage() method to make sure that page has been requested.

It then requests the next or the previous page, depending on pageOffset. This is based on the assumption that if users are browsing page 0, there is a good chance they will scroll down to page 1. Fetching the data ahead of time avoids gaps in the data shown on screen.
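The arithmetic is easy to check by hand. The sketch below repeats the indexer's conditions in a standalone helper; `PageMath` and `NeighborToPrefetch` are hypothetical names, and -1 is used here simply to mean "no neighbouring page is requested".

```csharp
// Hypothetical helper mirroring the page arithmetic of the indexer.
public static class PageMath
{
    // Page that contains the item at 'index'.
    public static int PageIndex(int index, int pageSize)
    {
        return index / pageSize;
    }

    // Position of the item inside its page.
    public static int PageOffset(int index, int pageSize)
    {
        return index % pageSize;
    }

    // Neighbouring page the indexer would also request, or -1 if none.
    public static int NeighborToPrefetch(int index, int pageSize, int count)
    {
        int pageIndex = index / pageSize;
        int pageOffset = index % pageSize;

        // upper half of the page: look ahead, unless this is the last page
        if (pageOffset > pageSize / 2 && pageIndex < count / pageSize)
            return pageIndex + 1;

        // lower half of the page: look behind, unless this is the first page
        if (pageOffset < pageSize / 2 && pageIndex > 0)
            return pageIndex - 1;

        return -1;
    }
}
```

With a page size of 100 and 1,000,000 items, index 275 lies on page 2 at offset 75, so page 3 is requested as well; index 210 lies at offset 10, so page 1 is requested; index 10 is on page 0, which has no previous page, and index 250 sits exactly at the midpoint, so neither neighbour is requested.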

CleanUpPages() is called to clean up (unload) pages that are no longer in use.

Finally, there is a defensive check that the page is available. This check is required in case the RequestPage() method does not work synchronously, as is the case with the derived class AsyncVirtualizingCollection<T>.

    private readonly Dictionary<int, IList<T>> _pages =
        new Dictionary<int, IList<T>>();
    private readonly Dictionary<int, DateTime> _pageTouchTimes =
        new Dictionary<int, DateTime>();

    protected virtual void RequestPage(int pageIndex)
    {
        if (!_pages.ContainsKey(pageIndex))
        {
            _pages.Add(pageIndex, null);
            _pageTouchTimes.Add(pageIndex, DateTime.Now);
            LoadPage(pageIndex);
        }
        else
        {
            _pageTouchTimes[pageIndex] = DateTime.Now;
        }
    }

    protected virtual void PopulatePage(int pageIndex, IList<T> page)
    {
        if (_pages.ContainsKey(pageIndex))
            _pages[pageIndex] = page;
    }

    public void CleanUpPages()
    {
        List<int> keys = new List<int>(_pageTouchTimes.Keys);
        foreach (int key in keys)
        {
            // page 0 is a special case, since the WPF ItemsControl
            // accesses the first item frequently
            if (key != 0 && (DateTime.Now - _pageTouchTimes[key]).TotalMilliseconds > PageTimeout)
            {
                _pages.Remove(key);
                _pageTouchTimes.Remove(key);
            }
        }
    }

Pages are stored in a dictionary (Dictionary) keyed by page index. A second dictionary stores the time each page was last used. This time is updated every time the page is accessed. The CleanUpPages() method uses it to remove pages that have not been accessed for a significant amount of time.

    protected virtual void LoadPage(int pageIndex)
    {
        PopulatePage(pageIndex, FetchPage(pageIndex));
    }

    protected IList<T> FetchPage(int pageIndex)
    {
        return ItemsProvider.FetchRange(pageIndex * PageSize, PageSize);
    }

Finally, FetchPage() retrieves the page from the ItemsProvider, and the LoadPage() method calls PopulatePage(), which places the page in the dictionary under the given index.

It may seem that the code contains a lot of trivial methods, but they were designed this way for a reason. Each method performs exactly one task. This keeps the code readable and also makes it easy to extend and modify the functionality in derived classes, as shown later.

The VirtualizingCollection<T> class achieves the primary goal of data virtualization. Unfortunately, in use this class has one major drawback: all the data fetch methods are executed synchronously. This means they run on the user interface thread, which can make the application sluggish.

AsyncVirtualizingCollection<T>

The AsyncVirtualizingCollection<T> class inherits from VirtualizingCollection<T> and overrides the LoadCount() and LoadPage() methods to load data asynchronously. A key requirement for an asynchronous data source is that, once the data has been fetched, it must notify the user interface through its data binding. For regular objects, this is done with the INotifyPropertyChanged interface. For collections, its close relative INotifyCollectionChanged must be used. This is the interface used by the ObservableCollection<T> class.

    public event NotifyCollectionChangedEventHandler CollectionChanged;

    protected virtual void OnCollectionChanged(NotifyCollectionChangedEventArgs e)
    {
        NotifyCollectionChangedEventHandler h = CollectionChanged;
        if (h != null)
            h(this, e);
    }

    private void FireCollectionReset()
    {
        NotifyCollectionChangedEventArgs e =
            new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Reset);
        OnCollectionChanged(e);
    }

    public event PropertyChangedEventHandler PropertyChanged;

    protected virtual void OnPropertyChanged(PropertyChangedEventArgs e)
    {
        PropertyChangedEventHandler h = PropertyChanged;
        if (h != null)
            h(this, e);
    }

    private void FirePropertyChanged(string propertyName)
    {
        PropertyChangedEventArgs e = new PropertyChangedEventArgs(propertyName);
        OnPropertyChanged(e);
    }

The AsyncVirtualizingCollection<T> class implements both the INotifyPropertyChanged and INotifyCollectionChanged interfaces to provide maximum binding flexibility. There is nothing noteworthy in this implementation.

    protected override void LoadCount()
    {
        Count = 0;
        IsLoading = true;
        ThreadPool.QueueUserWorkItem(LoadCountWork);
    }

    private void LoadCountWork(object args)
    {
        int count = FetchCount();
        SynchronizationContext.Send(LoadCountCompleted, count);
    }

    private void LoadCountCompleted(object args)
    {
        Count = (int)args;
        IsLoading = false;
        FireCollectionReset();
    }

In the overridden LoadCount() method, the fetch is invoked asynchronously via the ThreadPool. On completion, the new count is set and the FireCollectionReset() method is called to update the user interface via INotifyCollectionChanged. Note that the LoadCountCompleted method is invoked on the user interface thread by means of the SynchronizationContext. The SynchronizationContext property is set in the class constructor, on the assumption that the collection instance is created on the user interface thread.

    protected override void LoadPage(int index)
    {
        IsLoading = true;
        ThreadPool.QueueUserWorkItem(LoadPageWork, index);
    }

    private void LoadPageWork(object args)
    {
        int pageIndex = (int)args;
        IList<T> page = FetchPage(pageIndex);
        SynchronizationContext.Send(LoadPageCompleted, new object[] { pageIndex, page });
    }

    private void LoadPageCompleted(object args)
    {
        int pageIndex = (int)((object[])args)[0];
        IList<T> page = (IList<T>)((object[])args)[1];

        PopulatePage(pageIndex, page);
        IsLoading = false;
        FireCollectionReset();
    }
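The constructor is not shown in the listings, so here is a minimal sketch of the capture-and-marshal pattern described above. `BackgroundLoader` and its `Load` method are hypothetical names, not part of the article's classes. The fallback to a plain `SynchronizationContext` is an assumption added so the sketch also runs outside WPF, where `SynchronizationContext.Current` may be null; the base class simply runs the callback on the calling thread, whereas a WPF application installs a context that marshals the call to the dispatcher thread.

```csharp
using System;
using System.Threading;

// Hypothetical helper showing the pattern: capture the context on the creating
// thread, fetch on a thread pool thread, report the result through the context.
public class BackgroundLoader
{
    private readonly SynchronizationContext _context;

    public BackgroundLoader()
    {
        // Assumes construction on the UI thread. Outside a UI framework
        // Current may be null, so fall back to the default context (assumption).
        _context = SynchronizationContext.Current ?? new SynchronizationContext();
    }

    public void Load<TResult>(Func<TResult> fetch, Action<TResult> completed)
    {
        ThreadPool.QueueUserWorkItem(delegate
        {
            // slow work happens off the UI thread
            TResult result = fetch();

            // Send blocks the worker until the callback has run
            _context.Send(delegate { completed(result); }, null);
        });
    }
}
```

A caller would write something like `loader.Load(() => provider.FetchCount(), count => { /* update Count and raise events */ })`, where `provider` stands for any IItemsProvider<T> instance.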

Asynchronous loading of page data follows the same pattern, and again the FireCollectionReset() method is used to update the user interface.

Note also the IsLoading property. This is a simple flag that the user interface can use to indicate that the collection is loading. When the IsLoading property changes, the FirePropertyChanged() method causes the user interface to be updated via the INotifyPropertyChanged mechanism.

    public bool IsLoading
    {
        get
        {
            return _isLoading;
        }
        set
        {
            if (value != _isLoading)
            {
                _isLoading = value;
                FirePropertyChanged("IsLoading");
            }
        }
    }

Demonstration project


To demonstrate this solution, I created a simple demonstration project (included in the project source code).

First, an implementation of the IItemsProvider<T> interface was created that provides dummy data, with a thread sleep to simulate the delay of fetching data from disk or over a network.

    public class DemoCustomerProvider : IItemsProvider<Customer>
    {
        private readonly int _count;
        private readonly int _fetchDelay;

        public DemoCustomerProvider(int count, int fetchDelay)
        {
            _count = count;
            _fetchDelay = fetchDelay;
        }

        public int FetchCount()
        {
            // simulate a slow data source
            Thread.Sleep(_fetchDelay);
            return _count;
        }

        public IList<Customer> FetchRange(int startIndex, int count)
        {
            // simulate a slow data source
            Thread.Sleep(_fetchDelay);

            List<Customer> list = new List<Customer>();
            for (int i = startIndex; i < startIndex + count; i++)
            {
                Customer customer = new Customer { Id = i + 1, Name = "Customer " + (i + 1) };
                list.Add(customer);
            }
            return list;
        }
    }

The ubiquitous Customer object is used as the collection item.

A simple WPF window with a ListView control was created to allow the user to experiment with different list implementations.
    <Window x:Class="DataVirtualization.DemoWindow"
            xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
            xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
            Title="Data Virtualization Demo - By Paul McClean" Height="600" Width="600">

        <Window.Resources>
            <Style x:Key="lvStyle" TargetType="{x:Type ListView}">
                <Setter Property="VirtualizingStackPanel.IsVirtualizing" Value="True"/>
                <Setter Property="VirtualizingStackPanel.VirtualizationMode" Value="Recycling"/>
                <Setter Property="ScrollViewer.IsDeferredScrollingEnabled" Value="True"/>
                <Setter Property="ListView.ItemsSource" Value="{Binding}"/>
                <Setter Property="ListView.View">
                    <Setter.Value>
                        <GridView>
                            <GridViewColumn Header="Id" Width="100">
                                <GridViewColumn.CellTemplate>
                                    <DataTemplate>
                                        <TextBlock Text="{Binding Id}"/>
                                    </DataTemplate>
                                </GridViewColumn.CellTemplate>
                            </GridViewColumn>
                            <GridViewColumn Header="Name" Width="150">
                                <GridViewColumn.CellTemplate>
                                    <DataTemplate>
                                        <TextBlock Text="{Binding Name}"/>
                                    </DataTemplate>
                                </GridViewColumn.CellTemplate>
                            </GridViewColumn>
                        </GridView>
                    </Setter.Value>
                </Setter>
                <Style.Triggers>
                    <DataTrigger Binding="{Binding IsLoading}" Value="True">
                        <Setter Property="ListView.Cursor" Value="Wait"/>
                        <Setter Property="ListView.Background" Value="LightGray"/>
                    </DataTrigger>
                </Style.Triggers>
            </Style>
        </Window.Resources>

        <Grid Margin="5">
            <Grid.RowDefinitions>
                <RowDefinition Height="Auto"/>
                <RowDefinition Height="Auto"/>
                <RowDefinition Height="Auto"/>
                <RowDefinition Height="*"/>
            </Grid.RowDefinitions>

            <GroupBox Grid.Row="0" Header="ItemsProvider">
                <StackPanel Orientation="Horizontal" Margin="0,2,0,0">
                    <TextBlock Text="Number of items:" Margin="5" TextAlignment="Right" VerticalAlignment="Center"/>
                    <TextBox x:Name="tbNumItems" Margin="5" Text="1000000" Width="60" VerticalAlignment="Center"/>
                    <TextBlock Text="Fetch Delay (ms):" Margin="5" TextAlignment="Right" VerticalAlignment="Center"/>
                    <TextBox x:Name="tbFetchDelay" Margin="5" Text="1000" Width="60" VerticalAlignment="Center"/>
                </StackPanel>
            </GroupBox>

            <GroupBox Grid.Row="1" Header="Collection">
                <StackPanel>
                    <StackPanel Orientation="Horizontal" Margin="0,2,0,0">
                        <TextBlock Text="Type:" Margin="5" TextAlignment="Right" VerticalAlignment="Center"/>
                        <RadioButton x:Name="rbNormal" GroupName="rbGroup" Margin="5" Content="List(T)" VerticalAlignment="Center"/>
                        <RadioButton x:Name="rbVirtualizing" GroupName="rbGroup" Margin="5" Content="VirtualizingList(T)" VerticalAlignment="Center"/>
                        <RadioButton x:Name="rbAsync" GroupName="rbGroup" Margin="5" Content="AsyncVirtualizingList(T)" IsChecked="True" VerticalAlignment="Center"/>
                    </StackPanel>
                    <StackPanel Orientation="Horizontal" Margin="0,2,0,0">
                        <TextBlock Text="Page size:" Margin="5" TextAlignment="Right" VerticalAlignment="Center"/>
                        <TextBox x:Name="tbPageSize" Margin="5" Text="100" Width="60" VerticalAlignment="Center"/>
                        <TextBlock Text="Page timeout (s):" Margin="5" TextAlignment="Right" VerticalAlignment="Center"/>
                        <TextBox x:Name="tbPageTimeout" Margin="5" Text="30" Width="60" VerticalAlignment="Center"/>
                    </StackPanel>
                </StackPanel>
            </GroupBox>

            <StackPanel Orientation="Horizontal" Grid.Row="2">
                <TextBlock Text="Memory Usage:" Margin="5" VerticalAlignment="Center"/>
                <TextBlock x:Name="tbMemory" Margin="5" Width="80" VerticalAlignment="Center"/>
                <Button Content="Refresh" Click="Button_Click" Margin="5" Width="100" VerticalAlignment="Center"/>

                <Rectangle Name="rectangle" Width="20" Height="20" Fill="Blue" Margin="5" VerticalAlignment="Center">
                    <Rectangle.RenderTransform>
                        <RotateTransform Angle="0" CenterX="10" CenterY="10"/>
                    </Rectangle.RenderTransform>
                    <Rectangle.Triggers>
                        <EventTrigger RoutedEvent="Rectangle.Loaded">
                            <BeginStoryboard>
                                <Storyboard>
                                    <DoubleAnimation Storyboard.TargetName="rectangle"
                                                     Storyboard.TargetProperty="(TextBlock.RenderTransform).(RotateTransform.Angle)"
                                                     From="0" To="360" Duration="0:0:5" RepeatBehavior="Forever" />
                                </Storyboard>
                            </BeginStoryboard>
                        </EventTrigger>
                    </Rectangle.Triggers>
                </Rectangle>

                <TextBlock Margin="5" VerticalAlignment="Center" FontStyle="Italic"
                           Text="Pause in animation indicates UI thread stalled."/>
            </StackPanel>

            <ListView Grid.Row="3" Margin="5" Style="{DynamicResource lvStyle}"/>
        </Grid>
    </Window>

There is no need to go into the details of the XAML. The only thing worth mentioning is the custom ListView style, which changes the background and the mouse cursor in response to changes of the IsLoading property.

    public partial class DemoWindow
    {
        /// <summary>
        /// Initializes a new instance of the <see cref="DemoWindow"/> class.
        /// </summary>
        public DemoWindow()
        {
            InitializeComponent();

            // use a timer to periodically update the memory usage
            DispatcherTimer timer = new DispatcherTimer();
            timer.Interval = new TimeSpan(0, 0, 1);
            timer.Tick += timer_Tick;
            timer.Start();
        }

        private void timer_Tick(object sender, EventArgs e)
        {
            tbMemory.Text = string.Format("{0:0.00} MB", GC.GetTotalMemory(true) / 1024.0 / 1024.0);
        }

        private void Button_Click(object sender, RoutedEventArgs e)
        {
            // create the demo items provider according to specified parameters
            int numItems = int.Parse(tbNumItems.Text);
            int fetchDelay = int.Parse(tbFetchDelay.Text);
            DemoCustomerProvider customerProvider = new DemoCustomerProvider(numItems, fetchDelay);

            // create the collection according to specified parameters
            int pageSize = int.Parse(tbPageSize.Text);
            int pageTimeout = int.Parse(tbPageTimeout.Text);

            if (rbNormal.IsChecked.Value)
            {
                DataContext = new List<Customer>(customerProvider.FetchRange(0, customerProvider.FetchCount()));
            }
            else if (rbVirtualizing.IsChecked.Value)
            {
                DataContext = new VirtualizingCollection<Customer>(customerProvider, pageSize);
            }
            else if (rbAsync.IsChecked.Value)
            {
                DataContext = new AsyncVirtualizingCollection<Customer>(customerProvider, pageSize, pageTimeout * 1000);
            }
        }
    }

The window layout is quite simple, but sufficient to demonstrate the solution.

The user can configure the number of items in the DemoCustomerProvider instance and the simulated fetch delay.

The demonstration lets users compare the standard List<T> implementation, the VirtualizingCollection<T> implementation with synchronous data loading, and the AsyncVirtualizingCollection<T> implementation with asynchronous data loading. When using VirtualizingCollection<T> or AsyncVirtualizingCollection<T>, the user can set the page size and the page timeout (the time after which an unused page is unloaded from memory). These should be chosen according to the characteristics of the items and the expected usage pattern.



To compare the different collection types, the window also displays the total amount of memory in use. A rotating square animation is used to visualize stalls of the user interface thread. In a fully asynchronous solution, the animation should not stutter or stop.

Source: https://habr.com/ru/post/208792/

