Filterable Lists

To provide a broad, configurable search and filtering interface across areas of our site, we have implemented a custom StreamField block, FilterableList, that lets a user specify which filters are available, how results are ordered, and which pages are included in the search.

How It Works

The path by which a page gets a filterable form is not entirely straightforward, but it is important to understand. To start, the page must support the FilterableList block within a StreamField, as mentioned above. From there, the paths diverge: to use FilterableList, the page model must also extend one of two mixins, FilterableListMixin or CategoryFilterableMixin.


The more common mixin that pages will extend is the FilterableListMixin. This class defines several important methods, such as get_form_class, which defines the form to use. We also have some methods that retrieve relevant information for the form to use, such as get_filterable_list_wagtail_block, get_filterable_root, and get_filterable_queryset. The bulk of the work is done in the get_context method, which is responsible for getting and populating the form, processing the form, and returning the results to the user.
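The flow described above can be sketched in plain Python. This is a simplified stand-in, not the actual mixin: the method names get_form_class, get_filterable_queryset, and get_context come from the description above, while StubForm, StubPage, the dict-based pages, and passing request data as a plain dict are hypothetical placeholders so that the example runs without Django or Wagtail.

```python
class StubForm:
    """Hypothetical stand-in for a filterable list form."""

    def __init__(self, data, queryset):
        self.data = data
        self.queryset = queryset

    def is_valid(self):
        # A real form would validate and sanitize each field here.
        return isinstance(self.data.get("title", ""), str)

    def get_page_set(self):
        # Keep pages whose title contains the search term (case-insensitive).
        term = self.data.get("title", "").lower()
        return [p for p in self.queryset if term in p["title"].lower()]


class FilterableListMixin:
    """Simplified sketch of the flow; not the real implementation."""

    def get_form_class(self):
        return StubForm

    def get_filterable_queryset(self):
        # Assumption: the page exposes its candidate pages as a list.
        return self.candidate_pages

    def get_context(self, request_data):
        # 1. Build and populate the form from the request data.
        form = self.get_form_class()(request_data, self.get_filterable_queryset())
        # 2. Process the form, then 3. hand the results back to the caller.
        results = form.get_page_set() if form.is_valid() else []
        return {"form": form, "results": results}


class StubPage(FilterableListMixin):
    candidate_pages = [
        {"title": "Mortgage basics"},
        {"title": "Credit card tips"},
    ]


context = StubPage().get_context({"title": "mortgage"})
print([page["title"] for page in context["results"]])
```

In this sketch get_context owns the whole request cycle, which mirrors the division of labor described above: the smaller methods only supply pieces for it to assemble.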


The CategoryFilterableMixin is an extension of the base FilterableListMixin that exposes some new functionality. It modifies how get_filterable_queryset operates in that it gets an initial list of pages but limits them to only ones that are assigned a category within a set of initial categories, which is defined as the variable filterable_categories on a given page model. We can see this in action with both Newsroom (NewsroomLandingPage) and Recent Updates (ActivityLogPage) pages.
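A minimal sketch of that category restriction follows, again in plain Python rather than against the real models. Only CategoryFilterableMixin, get_filterable_queryset, and filterable_categories are taken from the description above; BaseListMixin, NewsroomLikePage, the category names, and the dict-based pages are hypothetical.

```python
class BaseListMixin:
    """Hypothetical stand-in for the base mixin."""

    def get_filterable_queryset(self):
        return list(self.candidate_pages)


class CategoryFilterableMixin(BaseListMixin):
    """Sketch: limit the base results to pages in filterable_categories."""

    filterable_categories = []

    def get_filterable_queryset(self):
        pages = super().get_filterable_queryset()
        allowed = set(self.filterable_categories)
        # Keep a page only if at least one of its categories is allowed.
        return [p for p in pages if allowed & set(p["categories"])]


class NewsroomLikePage(CategoryFilterableMixin):
    # Hypothetical category names and pages, for illustration only.
    filterable_categories = ["press-release", "speech"]
    candidate_pages = [
        {"title": "A press release", "categories": ["press-release"]},
        {"title": "A blog post", "categories": ["blog"]},
        {"title": "A speech", "categories": ["speech", "blog"]},
    ]


titles = [p["title"] for p in NewsroomLikePage().get_filterable_queryset()]
print(titles)
```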


As of our initial release of Elasticsearch-backed filterable lists in March 2021, our filterable forms can be broken into three specific forms: FilterableListForm, EnforcementActionsFilterForm, and EventArchiveFilterForm. The majority of our filterable lists rely on FilterableListForm; the other two are each used by a single page.


FilterableListForm is the base form used by the vast majority of filterable lists. It defines the core fields that are visible on the form, as well as functions that help set initial data and sanitize form input. Most importantly, FilterableListForm defines the function get_page_set, which is responsible for invoking a search query. The logic for passing categories into the search object exists because of the previously mentioned CategoryFilterableMixin, which modifies the initial search parameters to enforce a category search if and only if the filterable_categories list is passed into the form when it is initialized.


The EnforcementActionsFilterForm is an extension of FilterableListForm, adding on two fields specific to Enforcement Actions, and using a refined search class to provide search functionality against the new fields and a proper ordering by initial filing date.


The EventArchiveFilterForm is another extension of FilterableListForm. Its only real modification is that it invokes an event-specific search class, which lets us filter on event dates rather than page publication dates.


There is currently only one type of document defined, FilterablePagesDocument, which is based on the AbstractFilterPage class. This document is responsible for housing data related to any of the filterable page types that extend AbstractFilterPage, including EnforcementActionPage, BlogPage, EventPage, and NewsroomPage, to name a few. To index fields that are specific to a page type, such as the status list for an Enforcement Action, you define a function using the prepare_field naming syntax, where field is the name of the field being prepared. The get_instances_from_related function ensures that our index is updated automatically when a specific indexed page changes, rather than only when the related AbstractFilterPage record changes in the database.
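To illustrate the per-field preparation idea, here is a small plain-Python sketch. It assumes the convention is a method named prepare_ followed by the field name, and it does not use the real document base class: MiniDocument, FilterablePagesSketch, the stub page classes, and the statuses field name are all hypothetical.

```python
class MiniDocument:
    """Hypothetical stand-in for an index document base class."""

    fields = ()

    def prepare(self, instance):
        data = {}
        for name in self.fields:
            # Prefer a prepare_<name> method; otherwise read the attribute.
            prepare_method = getattr(self, f"prepare_{name}", None)
            if prepare_method is not None:
                data[name] = prepare_method(instance)
            else:
                data[name] = getattr(instance, name, None)
        return data


class FilterablePagesSketch(MiniDocument):
    fields = ("title", "statuses")

    def prepare_statuses(self, instance):
        # Only some page types carry statuses; others index an empty list.
        return list(getattr(instance, "statuses", []))


class StubEnforcementActionPage:
    title = "Example action"
    statuses = ["pending", "settled"]


class StubBlogPage:
    title = "Example post"


doc = FilterablePagesSketch()
print(doc.prepare(StubEnforcementActionPage()))
print(doc.prepare(StubBlogPage()))
```

The point of the pattern is that one document can serve many page types: a field that exists on only one type is computed defensively, so other types still index cleanly.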

Search is the final piece of the puzzle, where we actually leverage Elasticsearch to filter and match documents and return them in an ordered QuerySet. Before breaking down the search classes, it's important to discuss the current implementation from an Elasticsearch perspective to understand how we're gathering results.

The expanded search for filterable lists uses a multi-match query across the title, topic name, preview description, and content fields of all FilterablePagesDocuments. We use the phrase_prefix matching style with a currently configured slop of 2, which loosens the matching restrictions somewhat. We also boost the score of matches on the title and topic name fields, indicated by ^10 within the code base, so that ordering by relevance gives better results when desired. Search currently supports two methods of ordering results: relevance and date published. Relevance is calculated by the Elasticsearch engine when returning results, and date published is based on the page publication date. Enforcement Actions define their own ordering logic based on initial filing date.
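As a rough illustration, the query described above would look something like the following when written out in Elasticsearch query DSL as a Python dict. The multi_match, phrase_prefix, slop, and ^10 boost pieces follow the description; the exact field names (title, topic_name, preview_description, content) and the build_multi_match helper are assumptions made for this sketch and may not match the code base.

```python
import json


def build_multi_match(term):
    """Return a multi_match clause in Elasticsearch query DSL form."""
    return {
        "multi_match": {
            "query": term,
            "type": "phrase_prefix",
            "slop": 2,
            # Field names are hypothetical; ^10 boosts a field's score.
            "fields": [
                "title^10",
                "topic_name^10",
                "preview_description",
                "content",
            ],
        }
    }


query = build_multi_match("student loans")
print(json.dumps(query, indent=2))
```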


FilterablePagesDocumentSearch is the core search class, used for the majority of our searches. It is invoked from FilterableListForm. This class defines the common structure of our search function, the base logic for filtering against all common fields, and the logic behind our multi-match and ordering steps. The core function called from outside the class is the search function, which chains all of our filter, match, and sorting logic in order and returns the results as a Django QuerySet.
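The chaining can be pictured with a toy, in-memory version. This is only a sketch under stated assumptions: PagesSearchSketch, its constructor arguments, the match method, and the count-based relevance score are hypothetical, and it returns a plain list where the real class returns a Django QuerySet backed by Elasticsearch.

```python
from datetime import date


class PagesSearchSketch:
    """Hypothetical stand-in that chains filter, match, and ordering."""

    def __init__(self, pages, title="", from_date=None, order_by="date"):
        self.pages = pages
        self.title = title.lower()
        self.from_date = from_date
        self.order_by = order_by

    def filter_date(self, pages):
        if self.from_date is None:
            return pages
        return [p for p in pages if p["published"] >= self.from_date]

    def match(self, pages):
        # Toy relevance score: how many times the term occurs in the title.
        scored = []
        for page in pages:
            score = page["title"].lower().count(self.title) if self.title else 0
            if not self.title or score > 0:
                scored.append({**page, "score": score})
        return scored

    def order_results(self, pages):
        key = "score" if self.order_by == "relevance" else "published"
        return sorted(pages, key=lambda p: p[key], reverse=True)

    def search(self):
        # The one public entry point: filter, then match, then order.
        return self.order_results(self.match(self.filter_date(self.pages)))


PAGES = [
    {"title": "Loans and more loans", "published": date(2020, 1, 5)},
    {"title": "Student loans", "published": date(2021, 3, 1)},
    {"title": "Credit reports", "published": date(2021, 2, 1)},
]

by_date = PagesSearchSketch(PAGES, title="loans").search()
by_relevance = PagesSearchSketch(PAGES, title="loans", order_by="relevance").search()
print([p["title"] for p in by_date])
print([p["title"] for p in by_relevance])
```

Keeping each step in its own method is what makes the specialized search classes cheap to write: a subclass replaces one step and inherits the rest of the chain.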


EventFilterablePagesDocumentSearch is an extension of FilterablePagesDocumentSearch that defines behavior specific to our future and past Events listings. The class overrides one method from its parent, the filter_date function, so that filtering is based on fields specific to events: the start and end date of an event.
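The override pattern can be sketched as follows. Only the filter_date name and the idea of filtering on event start and end dates come from the description above; BaseSearchSketch, EventSearchSketch, the dict keys, and the choice to keep any event whose date range overlaps the requested window are assumptions made for illustration.

```python
from datetime import date


class BaseSearchSketch:
    """Hypothetical, trimmed-down parent search class."""

    def __init__(self, pages, from_date=None, to_date=None):
        self.pages = pages
        self.from_date = from_date or date.min
        self.to_date = to_date or date.max

    def filter_date(self, pages):
        # Default behavior: compare against the publication date.
        return [
            p for p in pages
            if self.from_date <= p["published"] <= self.to_date
        ]

    def search(self):
        return self.filter_date(self.pages)


class EventSearchSketch(BaseSearchSketch):
    def filter_date(self, pages):
        # Override: keep events whose start/end range overlaps the window.
        return [
            p for p in pages
            if p["start"] <= self.to_date and p["end"] >= self.from_date
        ]


EVENTS = [
    {
        "title": "Spring forum",
        "published": date(2021, 1, 10),
        "start": date(2021, 4, 1),
        "end": date(2021, 4, 2),
    },
    {
        "title": "Fall summit",
        "published": date(2021, 1, 15),
        "start": date(2021, 10, 5),
        "end": date(2021, 10, 6),
    },
]

window = {"from_date": date(2021, 1, 1), "to_date": date(2021, 6, 30)}
by_publication = BaseSearchSketch(EVENTS, **window).search()
by_event_dates = EventSearchSketch(EVENTS, **window).search()
print([p["title"] for p in by_publication])
print([p["title"] for p in by_event_dates])
```

With the same date window, the parent class returns both pages because both were published in January, while the event subclass returns only the event that actually takes place inside the window.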


The search class used by EnforcementActionsFilterForm is an extension of FilterablePagesDocumentSearch that exposes some additional filter logic through the apply_specific_filters function. It also overrides filter_date and order_results to use an Enforcement Action-specific field, initial filing date.