An occasional experience I have using large Qlik Sense applications is typing in the global search bar (AKA Smart Search) and waiting, while a progress bar says Qlik is preparing Smart Search.
This isn’t a feature of the software or how it was intended to work, but the result of a developer’s decision to disable search indexing on reload. Why handicap this Qlik differentiator by skipping the index build during the reload? Why make me wait?
The explanation is inevitably that the developer wanted to shorten the reload time, and search indexing was adding time at the end of a reload. But instead of this blunt method for speeding up reloads, let’s focus on speeding up search indexing, which also speeds up reloads, speeds up Smart Searches, reduces noise in search results, and prevents users from having to see the search indexing progress bar.
Disclaimer: unless your data are large (tens of millions of rows or more) and frequently reloaded (many times per day), don’t worry about search indexing time, unless you’re a perfectionist.
What does search indexing index?
The search index built at the end of the reload applies only to the Smart Search at the top of the user interface. Regardless of search indexing settings, you can still search and filter by any field individually in dimensions, filter panes, or the selection pane.
Exploring the search indexing options
Option | Script complexity | Reload time | Extra user wait time | Search time | Search result quality |
---|---|---|---|---|---|
Create search index during reload | ✓ | ✕ | ✓ | − | − |
Create search index on first search | ✓ | ✓ | ✕ | − | − |
Index during reload, but only included fields | − | ✓ | ✓ | ✓ | ✓ |
Index everything on reload
How to implement
SET CreateSearchIndexOnReload = 1;
When you create a new app, this variable and value are already set, by default, in the system variables. So do nothing, basically.
Result and recommendation
Some time is added to the end of the reload while all fields in the data model are indexed for Smart Search. Users can get immediate results, but search may take longer and there may be more noise in the results because every field was indexed.
For small-to-medium data sets, this may be perfectly fine.
Delay indexing until the first search is executed
How to implement
SET CreateSearchIndexOnReload = 0;
Result and recommendation
No time is added to the end of the reload for search indexing. Instead, the first user to run a search after the reload waits while the index is built. The searches may also take longer and there may be more noise in the results because every field was indexed.
While this is generally a bad combination of outcomes and one we do not recommend for a typical user-facing application, disabling indexing on reload is sometimes appropriate. We disable it in all our QVD generators as a standard practice: although the cost is tiny, indexing whatever metadata is in a QVD generator’s data model is wasted compute. Some Qlik solutions, like mashups, embedded solutions, or custom web apps, may not expose the global search bar at all, so there may be no cost to disabling it in those scenarios. And of course you can disable it during development, before you publish.
Refine which fields are indexed for Smart Search
How to exclude only certain fields
SET CreateSearchIndexOnReload = 1;
SEARCH EXCLUDE [*Field List];
How to include only certain fields
SET CreateSearchIndexOnReload = 1;
SEARCH EXCLUDE *;
SEARCH INCLUDE [*Field List];
Note that the syntax for including and excluding fields enables you to use wildcards when referring to field names.
Result and recommendation
A very small amount of time is added to the end of the reload while only the included fields in the data model are indexed for Smart Search. The fields to include will likely be dimensions and perhaps not much else. (The excluded fields will never be part of Smart Search indexing or results, but remember that those fields can still be searched individually.) Users can get immediate results, searching will be faster, and the results will be less noisy, based on the smaller domain of fields/field values searched.
This is the best all-around outcome for the effort for large, user-facing applications. If your application is small enough, you can decide whether it’s worthwhile.
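As a concrete sketch of the include-only approach, assume a hypothetical data model whose user-facing dimensions are [Customer Name], [Product Name], and [Region]. These field names are illustrations only, not taken from any real app:

```qlik
// Build the Smart Search index at the end of the reload
// (already the default in a new app).
SET CreateSearchIndexOnReload = 1;

// Start from nothing, then opt in only the fields users are
// likely to search. All field names below are hypothetical.
Search Exclude *;
Search Include [Customer Name], [Product Name], [Region];
```

The order matters in this sketch: the blanket exclude comes first so that the include statement that follows can add fields back.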
Development best practices that also make it easy to tune search indexing
There are some general best practices that create win-win outcomes if you’re considering trying to speed up search indexing.
Exclude unused fields from data model
Qlik won’t index a field that isn’t there.
Exclude unnecessary detail in field values
Reducing the number of distinct values in fields creates smaller, faster search indexing and searches. Some examples might be fractions of seconds or field values that are not of interest and could be grouped into an “other” bucket.
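Both examples can be handled in the load script. The following is a minimal sketch, assuming a hypothetical Orders source with an OrderTimestamp field and a Category field; the table, field names, category values, and library path are all made up for illustration:

```qlik
// Hypothetical table, fields, and path, for illustration only.
Orders:
LOAD
    OrderID,
    // Truncate to whole seconds (1/86400 of a day) so fractions
    // of a second do not create extra distinct values.
    Timestamp(Floor(OrderTimestamp, 1/86400)) AS [Order Timestamp],
    // Keep the categories users care about; group the rest.
    If(Match(Category, 'Hardware', 'Software') > 0, Category, 'Other') AS [Category Group]
FROM [lib://Data/Orders.qvd] (qvd);
```

If users never need the time of day at all, splitting the timestamp into a date field and dropping the time portion would reduce the distinct values further.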
Autonumber keys with non-user-facing data values; use HidePrefix or HideSuffix for metadata fields
If you already have a practice of Autonumbering fields to reduce the data model size, you will be happy to hear that Autonumbered fields are automatically excluded from search indexing. Remember you should even Autonumber fields that are already numeric (like system-maintained keys) that are not meaningful to users.
Hidden fields are also automatically excluded from search indexing. That includes fields holding long text values, like you might find in a data dictionary or change log.
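A minimal sketch of both practices, assuming an underscore as the hide prefix and hypothetical table and field names (none of these come from a real app):

```qlik
// Any field whose name starts with this prefix is hidden.
// The underscore is an assumed convention; use whatever your team prefers.
SET HidePrefix = '_';

// Hypothetical fact table: the key is replaced by a compact integer.
Orders:
LOAD
    AutoNumber(CustomerID, 'Customer') AS CustomerKey,
    OrderID,
    Amount
FROM [lib://Data/Orders.qvd] (qvd);

// Hypothetical metadata table: long descriptive text is kept in hidden fields.
DataDictionary:
LOAD
    FieldName   AS _FieldName,
    Description AS _FieldDescription
FROM [lib://Data/DataDictionary.qvd] (qvd);
```

For the key to link tables, the other table sharing it would need the same AutoNumber call with the same 'Customer' counter name.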
Name fields consistently
For other fields you may wish to exclude, consistent field naming, along with wildcards, makes it easy to exclude many fields at the same time. With consistent field naming, the following are possible:
[* Timestamp] // exclude all timestamp fields
[* Amount], [* Count] // exclude many measure fields
[Customer *] // include all Customer fields