Introduction
Why do Qlik Sense applications load lightning fast when you have just a few dashboards on the server, yet the same app needs 15 seconds, sometimes a full minute, to load once the number of applications and users grows? How painful and boring is that?
In this post, I would like to share one actionable takeaway I discovered on my journey to scale Qlik Sense to 40 000 users with over 10 000 applications. This is the first of a series of posts around scalability and performance optimization from a user-centric standpoint, so stay tuned.
An App's Journey
To start off, let’s consider what happens when you load a Qlik Sense application on your device, starting from the hub.
- The user will open the URL of the Qlik Sense application
- The server will check if the user is valid, that’s the authentication part
- The server will check if a license is valid for that user
- The server will check which streams and apps the user has access to
- The server will push a list of content to the user device, based on what’s visible for them
- The device browser will download and display the list of available streams and apps
- The user will select one app and click on it
- The browser will load the app in a new tab
- The server will send the application client, everything that makes up the user interface
- The server will also check the user’s permissions again, this time against every piece of content available within the application: pictures, extensions, sheets, stories, bookmarks, data (section access)
- The server will push the list of content to the user device, based on what’s visible for them
- The device browser will download and display everything that makes up an app
- A WebSocket connection is opened between the client device and the server so that data can be pushed instantly, based on user selections.
The list could go on and on based on what the user selects, but hang on!
Let’s stop right here and explain what really happens.
Common Behavior
It is common behavior for a web application server to send the list of files and for the client browser (Chrome, IE, Edge, Safari, Opera, etc.) to then download that content. Nothing wrong with that.
For your information, Chrome can download up to 6 elements from the same host in parallel (over HTTP/1.1). 6 is enough for most websites, but you also have to know that Chrome won’t start another download before one of the elements in the pool is completed. In other words, if for whatever reason Qlik Sense needs some time to send a file to the client browser, then the browser will be “waiting”.
A common practice is to “cache” content in the browser. The client browser would usually download the content the very first time it connects to a website. From that point onwards, the content will be loaded from the cache, creating a much better and faster user experience.
However, it does not work exactly like that with Qlik Sense, and there might be situations where applications take several minutes to load during peak hours. Again, it’s important to highlight that I am referring to large Qlik Sense sites with hundreds of applications and thousands of users.
Now what?
Well, let’s beat the status quo and see what we can do about it!
It is possible to understand how your Qlik Sense server performs by looking into the Chrome developer tools (F12), more specifically at the Network tab. Open a Qlik Sense application and look at the content that takes the most time.
You can click on any entry and check how much time it took:
- Stalled: waiting for a free slot in the pool of 6 concurrent connections.
- Waiting: waiting for Qlik Sense to send the content
- Content download: how much time it actually took to download the content. That highly depends on the Internet speed and latency.
Save all the results as a .HAR file and open it in the following website to visualize a summary: https://toolbox.googleapps.com/apps/har_analyzer/
Interesting, right? As you can see, more than 50% of the time is spent “Waiting”. What the heck!
It also takes a large amount of time to download JavaScript (probably because of powerful extensions, which ship large JavaScript libraries). Interestingly enough, nothing came from the local browser cache.
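The same kind of breakdown can be computed locally from the exported .HAR file. What follows is a minimal sketch, not the analyzer's own method. It assumes the standard HAR 1.2 layout (`log.entries[].timings`) and assumes that `blocked`, `wait` and `receive` roughly correspond to the "Stalled", "Waiting" and "Content download" labels in the developer tools. The function names `load_har` and `summarize_har` are made up for this illustration.

```python
import json
from collections import defaultdict

# Phase names from the HAR 1.2 "timings" object. "ssl" is left out on purpose:
# per the HAR format, its time is already included in "connect".
PHASES = ("blocked", "dns", "connect", "send", "wait", "receive")


def load_har(path):
    """Read a .HAR file (plain JSON) exported from the browser."""
    # utf-8-sig tolerates an optional byte order mark at the start of the file.
    with open(path, encoding="utf-8-sig") as handle:
        return json.load(handle)


def summarize_har(har):
    """Return {phase: share of the summed request time, in percent}."""
    totals = defaultdict(float)
    for entry in har["log"]["entries"]:
        timings = entry.get("timings", {})
        for phase in PHASES:
            value = timings.get(phase, -1)
            # HAR uses -1 (or omits the field) when a phase does not apply.
            if value is not None and value > 0:
                totals[phase] += value
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {phase: round(100 * ms / grand_total, 1) for phase, ms in totals.items()}
```

Calling `summarize_har(load_har("my_app.har"))` (the file name is a placeholder) returns one percentage per phase, so a large `wait` share points at the server rather than the network.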
The repository service evaluates permissions for each static content request. Yes. Each and every file a user wants to download. Such access evaluation can take a few seconds per file during peak hours. If the user has access to an extension, each file within that extension will get evaluated. If the user has access to an application, each picture access request will be evaluated by the server.
Loading a Qlik Sense application requires downloading about 100 to 150 files: in the worst case, that means about 1 minute spent just evaluating permissions.
At this point, the burden is not over yet. Once access to a file is granted, the user will download it to their device. That’s where network speed and latency play a role. For your information, the repository service is responsible for both evaluating permissions against the security rules and sending the file.
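To see where a figure like one minute can come from, here is a back-of-the-envelope sketch. It assumes the browser keeps 6 requests in flight, that every permission check takes the same time, and that the server answers the 6 in parallel. None of these are measured values and the function name is made up for this illustration.

```python
import math


def permission_wait_seconds(file_count, seconds_per_check, parallel_slots=6):
    """Rough time spent waiting on permission checks alone.

    Files are requested in batches of `parallel_slots`; each batch is assumed
    to take `seconds_per_check` (a simplification, real requests overlap).
    """
    batches = math.ceil(file_count / parallel_slots)
    return batches * seconds_per_check


# 150 files at an assumed 2.5 seconds per check: 25 batches of 6 files.
worst_case = permission_wait_seconds(150, 2.5)
```

With these assumed inputs the estimate lands at roughly one minute, before a single byte of content has been downloaded.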
If all pictures, extensions, and static files were cached on the device, then the server would only need to transfer the dynamic content. The dynamic content will change based on the application and the user’s selections.
How can we bypass all that process and force static content to be cached on the device?
My Journey
This is my journey. Just so you know, I’ll also tell you what didn’t work.
There is an HTTP header setting to control how the cache is managed; it can be configured in the virtual proxy settings of the QMC. I tried the option, but it had no effect.
Then, I came across an article from Qlik Support: the no-cache policy is actually forced by design. https://support.qlik.com/articles/000089254
That’s when I looked into ways to bypass this (Is it the right time to mention that I’m French? As a nation, we like to bypass some rules).
I considered using a CDN, at least to store extension content outside of Qlik Sense. It worked like a charm. At least, until we realized it broke the offline applications on iOS devices! Right, back to square one. Extensions stay on the Qlik Sense server.
For the educated reader, CDN remains a valid option for externally hosted mashups. That comes with another layer of complexity around authentication, a topic that deserves its own dedicated article.
Back in the Network panel of Chrome’s developer tools: the web client resources get generated server side with a dynamic name. That is actually pretty good news; it means dynamic content could not be cached anyway.
However, the static content has its definitions stored within the central node repository database. There is one entry for each file, be it an application thumbnail, a picture, an extension file (JavaScript, CSS, …) or one of the native client files.
After weeks of chasing Qlik support, adorable folks who chased Qlik R&D on my behalf, I eventually got the key to decipher the secret codes of the Cache Policy (moohaha!)
Ideally, these values and settings should be made available through an API. Let’s see when it gets released.
CachePolicy values:
- 0: Public, max-age 3600
- 1: Public, Must revalidate
- 2: Private
- 3: Private, Must revalidate
- 4: No cache
The value 0 is what I was looking for: the static content will be cached for 1 hour (3600 means 3600 seconds).
To be honest, one hour is not much, but it’s much better than no cache at all.
Application pictures and extensions don’t change very often in a production system, so I can imagine the cache could last for at least a day without causing many issues. Since we’re talking about one hour, we can live with that.
To make sure everyone understands the benefits of the Cache Policy, please consider this scenario:
- User A will load application 1: all the content access will be evaluated and then transferred to the device.
- 5 minutes later, user A decides to close the tab, because they can!
- 1 minute later, User A forgot to check something and decides to reopen the closed tab.
- All the content access will be evaluated AGAIN and then transferred AGAIN to the device.
When we set the cache policy value to 0, although the first load remains comparatively slow, the subsequent loads of the same static content are read from the cache:
- No need for the server to check permissions
- No need to download the content
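The scenario above can be replayed with a toy model of a browser cache. This is a deliberately simplified sketch: it only models a fixed freshness lifetime (the `max-age` idea) and ignores revalidation and everything else a real browser does. The class name, the URL and the timestamps are made up for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BrowserCacheModel:
    """Toy cache: a file is served locally while it is younger than max_age."""

    max_age_seconds: int  # 0 models "nothing is ever reused from the cache"
    stored_at: dict = field(default_factory=dict)
    server_requests: int = 0

    def load(self, url, now):
        """Return "cache" or "server" for a request made at time `now` (seconds)."""
        stored = self.stored_at.get(url)
        if stored is not None and now - stored < self.max_age_seconds:
            return "cache"
        # Cache miss or stale entry: the server evaluates access and sends the file.
        self.server_requests += 1
        self.stored_at[url] = now
        return "server"


# User A opens the app at t=0 and reopens the tab 6 minutes (360 s) later.
one_hour = BrowserCacheModel(max_age_seconds=3600)
first_load = one_hour.load("/extensions/demo/demo.js", now=0)
second_load = one_hour.load("/extensions/demo/demo.js", now=360)
```

With a one-hour lifetime, the second load never reaches the server. Setting `max_age_seconds=0` makes every load hit the server again, which is the behavior described in the scenario.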
So, how would you update the Cache Policy?
First, I recommend you explore your Static Content.
Attached is a Qlik Sense application that will read static content from your Qlik Sense site. With that, you will get a better understanding of the number and types of files in your extensions/applications.
You may discover that users have loaded 20MB pictures in their apps and there is possibly a lot you can do to reduce content size and therefore make applications load faster.
Great, but what if you want to force the Cache Policy under your terms?
Disclaimer: The following SQL commands should be applied on non-production systems only, after a backup, provided you know and understand what they will do. These are only examples for awareness.
I have actually used them on production sites, but I don’t want to be held responsible if things go wrong in your system.
Making the Updates
The first one will update the cache policy of the Content Libraries.
UPDATE public."StaticContentReferences"
SET "CachePolicy" = 0
WHERE "CachePolicy" IN (1, 3)
  AND "ExternalPath" LIKE '/content/%';
The second one targets extensions:
UPDATE public."StaticContentReferences" s
SET "CachePolicy" = 0
WHERE s."CachePolicy" IN (1, 3)
  AND s."ExternalPath" LIKE '/extensions/%'
  AND s."StaticContentDataType" = 0;
The third one will update the cache policy for static content within published applications.
If the application is not published, then someone is probably working on it and they want to see the content update with every change (no cache). The below script only updates the cache policy for the published applications.
UPDATE public."StaticContentReferences"
SET "CachePolicy" = 0
WHERE "StaticContentDataType" = 0
  AND "CachePolicy" IN (1, 3)
  AND "ExternalPath" LIKE '/appcontent/%'
  AND "ID" IN (
    SELECT DISTINCT r."ID"
    FROM public."StaticContentReferences" r
    INNER JOIN public."StaticContentReferenceAppContents" rac ON rac."StaticContentReference_ID" = r."ID"
    INNER JOIN public."AppContents" ac ON ac."ID" = rac."AppContent_ID"
    INNER JOIN public."Apps" a ON a."ID" = ac."App_ID"
    WHERE a."Published" = 'true'
  );
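Before touching a real repository database, it can help to rehearse the logic on throwaway data. The sketch below applies a simplified version of the content library update to an in-memory SQLite table. This is an assumption-heavy stand-in: the real repository is not SQLite, the toy table has only two of the real columns, and the paths are invented. It only shows which rows the `WHERE` clause selects.

```python
import sqlite3


def rehearse_content_update(rows):
    """Run the simplified content-library UPDATE on a toy table.

    rows: iterable of (external_path, cache_policy) tuples (made-up data).
    Returns (number_of_rows_changed, rows_after_update).
    """
    conn = sqlite3.connect(":memory:")
    # Toy table: only the two columns the WHERE clause needs.
    conn.execute(
        'CREATE TABLE "StaticContentReferences" '
        '("ExternalPath" TEXT, "CachePolicy" INTEGER)'
    )
    conn.executemany('INSERT INTO "StaticContentReferences" VALUES (?, ?)', rows)
    # Same filter as the content-library statement, minus the schema prefix.
    cursor = conn.execute(
        'UPDATE "StaticContentReferences" SET "CachePolicy" = 0 '
        'WHERE "CachePolicy" IN (1, 3) AND "ExternalPath" LIKE \'/content/%\''
    )
    changed = cursor.rowcount
    after = conn.execute(
        'SELECT "ExternalPath", "CachePolicy" FROM "StaticContentReferences" '
        'ORDER BY "ExternalPath"'
    ).fetchall()
    conn.close()
    return changed, after


sample_rows = [
    ("/content/Default/logo.png", 3),
    ("/content/Default/bg.jpg", 4),
    ("/extensions/demo/demo.js", 1),
    ("/content/Default/icon.svg", 1),
]
changed, after = rehearse_content_update(sample_rows)
```

Only the `/content/` rows with policy 1 or 3 are switched to 0. The "No cache" row and the extension row are left alone, which matches the intent of the first statement.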
I hope you found this article helpful. I have many more tips and tricks that will help you make your large Qlik Sense deployments great again!
Use the comment section below to let me know if this trick helped you.
6 Comments
Hey Armand,
Amazing approach. Thanks for your insights.
How cool would it be to not only access the cache option via API but to modify it accordingly per content type, app or even at object level?
Let’s cross our fingers.
Best regards
Paul
Hi Armand,
Nice article.
For what version of QS is this applicable?
We are on November 2019 and we experience a faster load the second time the user opens the app, so I am wondering whether Qlik has enabled it by default.
And if not, I am wondering why it is not enabled by default?
Regards,
Kashif
Hi Kashif,
Qlik R&D have indeed introduced server side caching for static content.
Unfortunately, it does not work for all content and still requires the repository service to check the request (even though the response gets returned much faster).
The performance you will experience very much depends on the load of your Qlik Sense site.
I believe my client’s situation is quite extreme and creates exponential performance bottlenecks on the repository service. That’s why skipping requests to the Qlik Sense server for static content resources made a significant difference, a positive one.
On a small Qlik Sense site it would make almost no difference.
The more apps, the more users, the more user attributes (AD groups) and security rules based on the previous, the slower a Qlik Sense site becomes.
I don’t know why Qlik does not enable this by default, I suggested it and shared the same SQL months ago with them.
Probably because only a minority of their clients are impacted by slow performance (on static content, leaving engine performance aside) and they would be afraid of side impacts with browser caching.
Hi Armand,
Great post. Performance issues remain one of the great challenges for large Qlik applications.
Regards
Anand
Insightful article, Armand, and very helpful in our case, as you already know 🙂 It is very surprising to learn that static caching is not enabled by default in Qlik.
In general web applications, there is also the concept of a reverse cache, where the server can implement its own cache in front of backend services. If something like this can be implemented with Qlik, then content can be served from this cache rather than going to the repository service.
Looking forward to hearing your thoughts on this.
Regards,
Abhijit
Hi Abhijit,
After I chased Qlik R&D on the matter for months, you may have read in the release notes about the introduction of repository request caching.
That newly introduced backend cache will avoid a number of report DB queries and security rules evaluation for identical user requests.
I see it as a great improvement, but after going through the repository performance logs, actually not enough to make Qlik lightning fast.
We are considering adding a layer of reverse proxy cache on top of our Qlik Sense infrastructure, typically with NGINX.
This will require extensive testing before we roll it out; there are indeed many exceptions to be properly managed, since a significant number of requests involve dynamic content.
Kind regards,
Armand