
SharePoint Search {User.Property} Query Variables and Scalability

This was something I stumbled into while working on a large global Intranet (a >100k user platform) being built on SharePoint 2013. It is a WCM publishing site using “search driven” content, leveraging Managed Metadata tagging combined with {User.Property} tokens to deliver “personalised” content. Now… if there were two buzzwords used to market SharePoint 2013 they would be “search driven content” and “personalised results”, so I was surprised at what I found.

The Problem

So we basically found that page load times were >20 seconds and our SharePoint web servers were maxed out at 100% CPU usage. Load testing showed that performance was very good under low load, but once we started ramping the load up, CPU usage climbed extremely quickly and the site rapidly became almost unusable.

It is worth bearing in mind that this is a completely “cloud friendly” solution: zero server-side components, using almost exclusively out-of-the-box web parts (mostly Search Results Web Parts; they would have been Content Search Web Parts, but this was a Standard SKU install). We also use output caching and BLOB caching, as well as minified and cached assets, to slim down the site as much as possible.

Also worth noting that we have 10 (ten) WFE servers, each with 4 CPU cores and 32GB RAM (not including a whole battery of search query servers, index servers, and other general “back-end” servers). So we weren’t exactly light on oomph in the hardware department.

We eventually found it was the search result web parts (we have several on the home page) which were flattening the web servers. This could be easily proved by removing those web parts from the page and re-running our Load Tests (at which point CPU load dropped to ~60% and page load times dropped to 0.2 sec per page even above our “maximum capacity” tests).

What was particularly weird is that the web servers were the ones maxing out their CPU. The Search Query Component servers (dedicated hardware) were not too heavily stressed at all!

Query Variables anyone?

So the next thing we considered was our quite liberal use of “Query Variables”, and in particular the {User.Property} ones. These allow you to use a “variable” in your search query which is swapped out on the fly for the values in that user’s SharePoint User Profile; for example, Author:{User.Name} would be expanded using the current user’s display name before the query executes.

In our example we had “Location” and “Function” in both content and the User Profile database, all mapped to the same MMS term sets. The crux of it is that it allows you to “tag” a news article with a specific location (region, country, city, building) and a specific function (e.g. a business unit, department or team), and when users hit the home page they only see content “targeted” at them.

To me this is what defines a “personalised” intranet… and it is the holy grail of most comms teams.

However, when we took these personalisation values out (i.e. replacing {User.Location} with some actual term ID GUID values) performance got remarkably better! We also saw a significant uplift in CPU usage on our query servers (they started approaching 100% too), presumably because the web servers were no longer the bottleneck and far more queries were actually getting through.

So it would appear that SOMETHING in the use of Query Variables was causing a lot of additional CPU load on the Web Servers!

It does what??

So, now we get technical. I used JetBrains’ dotPeek tool to decompile some of the SharePoint Server DLLs to find out what on earth happens when a Query Variable is passed in.

I was surprised at what I found!

I ended up delving into the Microsoft.Office.Server.Search.Query.SearchExecutor class, as this was where most of the search-based activity went on, in particular the PreExecuteQuery() method. This in turn referred to the Microsoft.SharePoint.Publishing.SearchTokenExpansion class and its GetTokenValue() method.

It then hits a fairly large switch statement with any {User.Property} tokens being passed over to a static GetUserProperty() method, which in turn calls GetUserPropertyInner(). This is where the fun begins!

The first thing it does is call UserProfileManager.GetUserProfile() to load up the current user’s SharePoint profile. There doesn’t appear to be any caching here, so this happens PER TOKEN instance: if you have 5 {User.Property} declarations in a single query, the profile is loaded 5 times!

The next thing that happens is that it uses profile.GetProfileValueCollection() to load the property values from the UPA database, and (if it has the IsTaxonomic flag set) calls GetTaxonomyTerms() to retrieve the term values. These are full-blown “Term” objects which get created from calls to either TaxonomySession.GetTerms() or TermStore.GetTerms(). Either way, this results in a service/database roundtrip to the Managed Metadata Service.

Finally it ends up at GetTermProperty() which is just a simple bit of logic to build out the Keyword Query Syntax for Taxonomy fields (the “#0” thing) for each Term in your value collection.

So the call stack goes something like this:

SearchExecutor::PreExecuteQuery()
  => SearchTokenExpansion::GetTokenValue()
    => GetUserProperty()
      => GetUserPropertyInner()
        => UserProfileManager::GetUserProfile()
        => UserProfile::Properties.GetPropertyByName().CoreProperty.IsTaxonomic
           (if it is taxonomic, which ours always are, then…)
        => UserProfile::GetProfileValueCollection().GetTaxonomyTerms()
          => TermStore::GetTerms()
        (then, for each term in the collection…)
        => SearchTokenExpansion::GetTermProperty()
           (this just builds out the “#0” + term.Id.ToString() query value)
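
To make the cost concrete, here is a rough TypeScript sketch of that flow. The type and function names are my own hypothetical stand-ins for the decompiled .NET calls; the point is the shape of the work per token, not the exact API.

  // Hypothetical stand-ins for the server-side .NET calls; each one
  // represents a real service/database roundtrip.
  interface UserProfile { loginName: string }
  interface TaxonomyTerm { id: string } // a term GUID

  declare function getUserProfile(loginName: string): UserProfile;                       // User Profile Service lookup
  declare function getTaxonomyTerms(p: UserProfile, property: string): TaxonomyTerm[];   // Managed Metadata Service lookup

  // Expand one {User.Property} token into Keyword Query Language (KQL).
  // There is no caching, so this whole chain runs once per token,
  // per page view, per user.
  function expandUserPropertyToken(loginName: string, managedProperty: string, profileProperty: string): string {
    const profile = getUserProfile(loginName);                // roundtrip 1: UPA
    const terms = getTaxonomyTerms(profile, profileProperty); // roundtrip 2: MMS
    // GetTermProperty(): each term becomes "#0" + its GUID, OR'd together
    return terms.map(t => `${managedProperty}:#0${t.id}`).join(" OR ");
  }

With two tokens in a query template (as in the example below), all of that happens twice for every single page view.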

So what does this really mean?

Well, let’s take a simple example.

Let’s say you want to include a simple “personalised” search query to bring back targeted News content.

{|NewsFunction:{User.Function}} AND {|NewsLocation:{User.Location}}

This looks for two search managed properties (NewsFunction and NewsLocation) and queries them using the User Profile properties “Function” and “Location” respectively. Note: this supports multiple values (the {|…} syntax will expand the query to “NewsFunction:<value1> OR NewsFunction:<value2>” as required), as in the example below.
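
For illustration, if a user’s profile carried one Function term and two Location terms, that template would expand to something like the following (the GUIDs are made-up placeholders, and the exact bracketing may differ):

  (NewsFunction:#0aaaaaaaa-1111-2222-3333-444444444444)
  AND
  (NewsLocation:#0bbbbbbbb-1111-2222-3333-444444444444 OR NewsLocation:#0cccccccc-1111-2222-3333-444444444444)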

On the Web Server this results in:

  • 2x “GetUserProfile” calls to retrieve the user’s profile
  • 2x “GetPropertyByName” calls to retrieve the attributes of the UPA property
  • 2x “GetTerms” queries to retrieve the term values bound to that profile

And this is happening PER PAGE REFRESH, PER USER. At (say) 100 home-page hits per second, that single query template generates 200 profile loads, 200 property lookups and 200 term-store queries per second before the search query itself has even been executed.

So … now it suddenly became clear.

With 100k users hitting the home page it was bottlenecking the Web Servers because every home page hit resulted in double the amount of server-side lookups to the User Profile Service and Managed Metadata Service (on top of all of the other standard processing).

So how to get round this?

The solution we are gunning for is to throw away the Search Web Parts and build our own using REST calls to the Search API and KnockoutJS for the data binding.

This allows us to cache the query client-side (including any “expanded” query variables and the user’s profile data), and we can even cache the entire search result set if needed, so repeat visits to the page don’t generate any additional server load. A rough sketch of the idea is below.
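
As a rough, modernised TypeScript sketch of the idea (we actually used KnockoutJS and jQuery at the time; the cache key scheme here is illustrative, but /_api/search/query is the standard SharePoint 2013 Search REST endpoint):

  // Run a search via the SharePoint 2013 Search REST API, with the query
  // variables already expanded client-side, and cache the raw result in
  // sessionStorage so repeat visits cost the server nothing.
  async function getPersonalisedResults(expandedQuery: string): Promise<any> {
    const cacheKey = "searchResults:" + expandedQuery;
    const cached = sessionStorage.getItem(cacheKey);
    if (cached) {
      return JSON.parse(cached); // repeat visit: zero server load
    }

    const url = "/_api/search/query?querytext='" + encodeURIComponent(expandedQuery) + "'";
    const response = await fetch(url, {
      headers: { Accept: "application/json;odata=verbose" },
      credentials: "same-origin" // send the user's SharePoint auth cookies
    });
    const data = await response.json();

    sessionStorage.setItem(cacheKey, JSON.stringify(data));
    return data; // hand off to the view model (Knockout, in our case) for binding
  }

The user’s profile values themselves only need to be fetched once per session (e.g. from the /_api/SP.UserProfiles.PeopleManager/GetMyProperties endpoint) and cached the same way, so the expensive per-token expansion work disappears from the WFEs entirely.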

Finally…

This was a fairly high-profile investigation, including Microsoft coming in for a bit of a chat about some of the problems we were facing. After some investigation they confirmed another option (which didn’t work for us, but is useful to know):

  • Query Variables in the Search Web Part are processed by the Web Server before being passed to the Query Component
  • The same query variables in a Result Source or Query Rule will be processed on the Query Server directly!

So if you have a requirement which you can compartmentalise into a Query Rule or Result Source, you might want to look at that approach instead to reduce the WFE processing load.

Cheers! And good luck!