Category Archives: Infrastructure

SharePoint Search {User.Property} Query Variables and Scalability

This was something I stumbled into while working on a large global intranet (a >100k-user platform) being built on SharePoint 2013. It is a WCM publishing site using “search driven” content, leveraging Managed Metadata tagging combined with {User.Property} tokens to deliver “personalised” content. Now... if there were two buzzwords used to market SharePoint 2013 they would be “search driven content” and “personalised results”, so I was surprised at what I found.

The Problem

We basically found that page load times were over 20 seconds and our SharePoint web servers were maxed out at 100% CPU usage. Load testing showed that performance was very good under low load, but once we started ramping the load up, CPU usage climbed extremely quickly and the site rapidly became almost unusable.

It is worth bearing in mind that this is a completely “cloud friendly” solution with zero server-side components, using almost exclusively “out of the box” web parts (mostly Search Result Web Parts; they would have been Content Search Web Parts, but this was a Standard SKU install). We also use output caching and BLOB caching, as well as minified and cached assets, to slim the site down as much as possible.

It is also worth noting that we have ten WFE servers, each with 4 CPU cores and 32GB RAM (not including a whole battery of search query servers, index servers, and other general “back-end” servers), so we weren’t exactly light on oomph in the hardware department.

We eventually found it was the search result web parts (we have several on the home page) which were flattening the web servers. This was easily proved by removing those web parts from the page and re-running our load tests, at which point CPU load dropped to ~60% and page load times dropped to 0.2 seconds per page, even above our “maximum capacity” test levels.

What was particularly weird is that it was the web servers maxing out their CPU. The Search Query Component servers (dedicated hardware) were not heavily stressed at all!

Query Variables, anyone?

So the next thing we considered was our quite liberal use of “Query Variables”, in particular the {User.Property} ones. These allow you to use a “variable” in your search query which is swapped out “on the fly” for the values in that user’s SharePoint User Profile.

In our example we had “Location” and “Function” in both the content and the User Profile database, all mapped to the same MMS term sets. The crux of it is that this allows you to “tag” a news article with a specific location (region, country, city, building) and a specific function (e.g. a business unit, department or team), and when users hit the home page they only see content “targeted” at them.

To me this is what defines a “personalised” intranet... and it is the holy grail of most comms teams.

However, when we took these personalisation values out (i.e. replacing {User.Location} with some actual Term ID GUID values), performance got remarkably better! We also saw a significant uplift in the CPU usage on our Query Servers (so they were approaching 100% too).

So it would appear that SOMETHING in the use of Query Variables was causing a lot of additional CPU load on the Web Servers!

It does what??

So, now we get technical. I used JetBrains’ dotPeek tool to decompile some of the SharePoint Server DLLs to find out what on earth happens when a Query Variable is passed in.

I was surprised at what I found!

I ended up delving into the Microsoft.Office.Server.Search.Query.SearchExecutor class, as this was where most of the search-based activity went on, in particular in the PreExecuteQuery() method. This in turn referred to the Microsoft.SharePoint.Publishing.SearchTokenExpansion class and its GetTokenValue() method.

It then hits a fairly large switch statement with any {User.Property} tokens being passed over to a static GetUserProperty() method, which in turn calls GetUserPropertyInner(). This is where the fun begins!

The first thing it does is call UserProfileManager.GetUserProfile() to load up the current user’s SharePoint profile. There doesn’t appear to be any caching here, so this happens PER TOKEN instance: if you have five {User.Property} declarations in a single query, the profile is loaded five times!

The next thing that happens is that it uses profile.GetProfileValueCollection() to load the property values from the UPA database, and (if the property has the IsTaxonomic flag set) calls GetTaxonomyTerms() to retrieve the term values. These are full-blown “Term” objects, created from calls to either TaxonomySession.GetTerms() or TermStore.GetTerms(). Either way, this results in a service/database round trip to the Managed Metadata Service.

Finally it ends up in GetTermProperty(), which is just a simple bit of logic that builds out the Keyword Query Syntax for taxonomy fields (the “#0” prefix) for each Term in your value collection.

So the call stack goes something like this:

SearchExecutor::PreExecuteQuery()
  => SearchTokenExpansion::GetTokenValue()
    => GetUserProperty()
      => GetUserPropertyInner()
        => UserProfileManager::GetUserProfile()
        => UserProfile::Properties.GetPropertyByName().CoreProperty.IsTaxonomic
           (if it is taxonomic, which ours always are, then...)
        => UserProfile::GetProfileValueCollection()::GetTaxonomyTerms()
          => TermStore::GetTerms()
        (then, for each term in the collection)
        => SearchTokenExpansion::GetTermProperty()
           (which just builds out the “#0” + term.Id.ToString() query value)

So what does this really mean?

Well, let’s walk through a simple example.

Let’s say you want to include a simple “personalised” search query to bring back targeted news content:

{|NewsFunction:{User.Function}} AND {|NewsLocation:{User.Location}}

This looks for two search managed properties (NewsFunction and NewsLocation) and queries those two fields using the User Profile properties “Function” and “Location” respectively. Note: this supports multi-value properties too, concatenating the query with “NewsFunction:… OR NewsFunction:…” as required, as shown in the expanded example below.
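To make that concrete, after token expansion the query that actually reaches the query component would look something like the following (these Term ID GUIDs are made up purely for illustration, with a multi-valued “Location” showing the OR concatenation):

(NewsFunction:#0a1b2c3d4-1111-2222-3333-444455556666) AND (NewsLocation:#0b2c3d4e5-aaaa-bbbb-cccc-dddddddddddd OR NewsLocation:#0c3d4e5f6-eeee-ffff-0000-111111111111)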

On the Web Server this results in:

  • 2x “GetUserProfile” calls to retrieve the user’s profile
  • 2x “GetPropertyByName” calls to retrieve the attributes of the UPA property
  • 2x “GetTerms” queries to retrieve the term values bound to that profile

And this is happening PER PAGE REFRESH, PER USER.

So … now it suddenly became clear.

With 100k users hitting the home page, it was bottlenecking the web servers, because every home page hit resulted in a fresh round of server-side lookups to the User Profile Service and Managed Metadata Service (on top of all of the other standard processing).
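To put rough numbers on it: the two tokens above cost six service calls (2 + 2 + 2) per page load, so 100,000 users each hitting the home page just once means roughly 600,000 uncached lookups against the User Profile and Managed Metadata services, every one of them executed on the WFEs.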

So how to get round this?

The solution we are gunning for is to throw away the Search Web Parts and build our own using REST calls to the Search API and KnockoutJS for the data binding.

This allows us to cache the query on the client (including any “expanded” query variables and the user’s profile data), and we can even cache the entire search query result if needed, so “repeat visits” to the page don’t result in additional server load.
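As a minimal sketch of the sort of thing we have in mind (the endpoint is the standard _api/search/query REST interface; the cache key, cache lifetime and view model names are illustrative assumptions, and we assume jQuery and KnockoutJS are already loaded on the page):

// Minimal sketch: call the Search REST API with client-side caching.
// CACHE_KEY and CACHE_TTL_MS are illustrative, not production values.
var CACHE_KEY = 'homePageNews';
var CACHE_TTL_MS = 15 * 60 * 1000; // cache results for 15 minutes

function getNews(queryText, callback) {
    // Serve repeat visits from sessionStorage: no server round trip at all
    var cached = sessionStorage.getItem(CACHE_KEY);
    if (cached) {
        var entry = JSON.parse(cached);
        if (Date.now() - entry.time < CACHE_TTL_MS) {
            callback(entry.rows);
            return;
        }
    }
    $.ajax({
        url: _spPageContextInfo.webAbsoluteUrl +
            "/_api/search/query?querytext='" + encodeURIComponent(queryText) + "'",
        headers: { Accept: 'application/json;odata=verbose' }
    }).done(function (data) {
        var rows = data.d.query.PrimaryQueryResult.RelevantResults.Table.Rows.results;
        sessionStorage.setItem(CACHE_KEY, JSON.stringify({ time: Date.now(), rows: rows }));
        callback(rows);
    });
}

// KnockoutJS view model bound to the results
function NewsViewModel() {
    var self = this;
    self.items = ko.observableArray([]);
    // The query variables are expanded client-side, once (GUID is the
    // hypothetical one from the example above), and the expanded query
    // is what gets cached.
    getNews('NewsFunction:#0a1b2c3d4-1111-2222-3333-444455556666', function (rows) {
        self.items(rows);
    });
}
ko.applyBindings(new NewsViewModel());

The key point is that the profile lookup and token expansion happen once per user (and can themselves be cached client-side), instead of on every page render on the WFE.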

Finally…
This was a fairly high profile investigation, including Microsoft coming in for a bit of a chat about some of the problems we were facing. After some investigation they confirmed another option (which didn’t work for us, but is useful to know):

  • Query Variables in the Search Web Part are processed by the Web Server before being passed to the Query Component
  • The same query variables in a Result Source or Query Rule will be processed on the Query Server directly!

So if you have a requirement which you can compartmentalise into a Query Rule or Result Source, you might want to look at that approach instead to reduce the WFE processing load.
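For example, using the managed property from the scenario above, the personalisation clause could be moved out of the web part and into a Result Source whose query template looks something like this ({searchTerms} is the standard placeholder for whatever query the web part itself sends):

{searchTerms} {|NewsFunction:{User.Function}}

The web part then just issues a plain query against that Result Source, and (per Microsoft’s point above) the {User.Property} expansion happens on the query server rather than the WFE.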

Cheers! And good luck!

Windows 8, Hyper-V, BitLocker and “Cannot connect to virtual machine configuration storage”

So I am now working at a new professional services company in South East England (Ballard Chalmers) who use Hyper-V throughout their DEV/TEST environments. I have previously been a VMware Workstation person myself (and I still think the simplicity and ease of its user interface is unmatched), but for the foreseeable future I will be running Windows 8.1 Pro on my laptop as a Hyper-V host.

Before we get started it is worth describing my setup:

  • Windows 8.1 Pro
  • 3rd Gen Intel i7-3820QM CPU
  • 32GB DDR3 RAM
  • Two physical disk drives
    • C:\ SYSTEM – 512GB SSD (for Operating System, Files and Applications)
    • D:\ DATA – 512GB SSD (for Hyper-V Images and MSDN ISOs) (running in an “Ultra-Bay” where the Optical Drive used to be)

Now, like most modern laptops, mine has a TPM (Trusted Platform Module), so I also have BitLocker encryption running on both my C: and D: drives (for those who are interested, I barely notice any performance drop at all... I can still get 550 MB/s sustained reads even with BitLocker enabled).

Saved-Critical – Cannot connect to virtual machine configuration storage

I noticed from time to time that my virtual machines were showing error messages when my computer started up. It happened here and there, until Thomas Vochten (@ThomasVochten) mentioned he was getting it every time he started his machine up.

[Screenshot: Hyper-V error message]

Note – You can get this error for all sorts of reasons, particularly if you have recently changed the Drive Letters, re-partitioned your hard disks or moved a VM. In this case I was getting the error without doing anything other than turning my laptop on!


64GB of RAM in a Laptop, and why I want it …

Well, the rumour mill has been churning recently about the potential for high-capacity DRAM chips which could allow laptops to have up to 64GB of memory. I was recently directed to this article (https://www.anandtech.com/show/7742/im-intelligent-memory-to-release-16gb-unregistered-ddr3-modules) from the ArsTechnica forums.

The article basically describes a new method of DRAM stacking (as opposed to the standard method of NAND stacking) which allows the production of 16GB SODIMM modules. My current laptop has four SODIMM slots (like pretty much every other high-end laptop on the market), so with the current maximum of 8GB per SODIMM it supports 32GB of RAM. If I could use 16GB SODIMMs then I could theoretically swap those modules out for a straight 4x 16GB (i.e. 64GB of RAM).

The best news is that these chips could be on the market this year!

“Mass production is set to begin in March and April, with initial pricing per 16GB module in the $320-$350 range for both DIMM and SO-DIMM, ECC being on the higher end of that range.” (source: Anandtech article linked above)
