Collections, CollectionViews, and a WPF Binding Memory Leak

A colleague of mine recently came across some code that should work, but doesn’t:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
private ListBox SomeListBox;
private ObservableCollection<string> SomeStrings = new ObservableCollection<string>();
 
public void SomeMethod()
{
    SomeListBox.ItemsSource = null;
    ThreadPool.QueueUserWorkItem(o => AddToStrings());
}
 
private void AddToStrings()
{
    // this line will actually crash the app!
    SomeStrings.Add("blah");
 
    Dispatcher.BeginInvoke(new Action(() => SomeListBox.ItemsSource = SomeStrings);
}

So basically, detaching an ObservableCollection<T> from a view, adding an element on a background thread, and then reattaching it back on the UI thread again. Generally, view-bound objects can only be modified from the UI thread, but shouldn’t detaching the collection first should make that a non-issue?

When you actually run the code, you get one of those obnoxious “InvalidOperationException: The calling thread cannot access this object because a different thread owns it” messages, and then your app will probably die. These usually happen because you’re trying to modify a control or an object bound to a control, neither of which applies here. It also happens when you’re trying to do anything with a subclass of DispatcherObject or a DependencyObject on the wrong thread, but that doesn’t apply here either, right? ObservableCollection<T> does not have any inherent thread affinity, and it is definitely safe to pass between threads so long as access to it is synchronized.

So we took the control out of the picture by writing SomeListBox.ItemsSource = null, and ObservableCollection<T> can be modified from any thread. What’s the problem? Those are the only two objects in the picture, right?

ICollectionView

(If you already know what ICollectionView is, feel free to skip to the end of the story.)

Actually, when any IEnumerable is bound to an ItemsControl, an implementation of ICollectionView is created to wrap the collection, and it provides a means for sorting, grouping, and filtering without affecting the original collection. So when you bind a ListBox, or ListView, or the WPF Toolkit DataGrid, or some other third-party ItemsControl to a grid, a CollectionView is created specifically to maintain intermediate state on behalf of the control without having to touch the original collection.

Consider some List<string>:

Index Value
0 Cheese
1 Apple
2 Tomato
3 Salad

We bind it to a DataGrid, and then we tell the DataGrid to sort itself. The list should look like this:

Index Value
0 Apple
1 Cheese
2 Salad
3 Tomato

Note how the objects now have different indices. It’s an important detail—WPF manages to accomplish sorting, grouping, and filtering all without modifying the original collection because of the use of a CollectionView (or one of its subclasses ListCollectionView, BindingListCollectionView, or the internal CollectionViewProxy class). It also pulls off some shiny tricks occasionally, batching changes where appropriate and postponing actual CollectionView reordering (which can obviously get expensive) until subsequent pumps of the current Dispatcher. And that’s why CollectionView inherits from DispatcherObject.

And of course, if our collection changes, the CollectionView listens for those notifications and responds appropriately, causing hilarious consequences when the change happens on a background thread.

The Fix

The implementation of CollectionView is somewhat broken when collections implement INotifyCollectionChanged:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
internal CollectionView(IEnumerable collection, int moveToFirst)
{
    /* some code snipped for brevity */
    INotifyCollectionChanged changed = collection as INotifyCollectionChanged;
    if (changed != null)
    {
        // this is the problem
        changed.CollectionChanged += this.OnCollectionChanged;
 
        /* some more code */
    }
 
    /* still more code */
}

Nowhere anywhere is this event handler unwired. So collections essentially retain strong references of their corresponding CollectionViews. The data binding system that is responsible for creating the CollectionView maintains a weak reference, but as long as the original collection is alive, the data binding system will be hooked into the CollectionView, and therefore the original collection as well.

The fix actually isn’t so bad:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
using System.Collections;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.Windows;
using System.Windows.Data;
 
public class BetterObservableCollection<T> :
    ObservableCollection<T>, ICollectionViewFactory
{
    public ICollectionView CreateView()
    {
        return new BetterListCollectionView(this);
    }
}
 
public class BetterListCollectionView :
    ListCollectionView, IWeakEventListener
{
    public BetterListCollectionView(IList list) : base(list)
    {
        INotifyCollectionChanged changed = list as INotifyCollectionChanged;
        if (changed != null)
        {
            // this fixes the problem
            changed.CollectionChanged -= this.OnCollectionChanged;
            CollectionChangedEventManager.AddListener(list, this);
        }
    }
}

As soon as the collection is detached from the view, the view’s reference to the CollectionView is broken, and the CollectionView is allowed to disappear because nothing (neither the original collection nor the binding system) is holding on to a strong reference any more.

What “The Fix” Breaks

The CollectionView is responsible for managing sorting, grouping, and filtering. If you disconnect the collection from the view, the CollectionView will get garbage-collected, and the sorting, grouping, and/or filtering you had on the control will be lost. That may or may not matter for your particular use case, but it’s definitely something to think about. If that is important, it may make more sense to bind to a System.ComponentModel.BindingList<T>, which tracks its own grouping, sorting, and filtering inside the list itself—the tradeoff is then you lose the ability to have multiple views on the same collection. —DKT