Choosing the indexing engine

In release 3.5, we introduced back-end indexing. How should an administrator choose between the two indexing engines?

In release 4.0, we made it mandatory.

How should a customer prepare for the transition to Indexation V2?

  • Check whether you are using baselines. If yes, make your key users aware that the contents of requirements indexed before and after the move to Indexation V2 won’t be comparable, due to many tiny changes in the output HTML.

  • Check whether you are using the app named “Scaffolding”. If so, note that support for Scaffolding was only implemented in 3.8.3, with the limitations described in https://requirementyogi.atlassian.net/browse/RY-1160

What is the recommended setting?

  • Engine Version 2, "Using the queue", is the only recommended setting for Data Center instances. It has less impact on the performance of Confluence.

What is the indexing engine?

When a user enters requirements on a page and saves it, the indexing engine extracts the requirements from text, tables and layouts, and puts them in a database.

What is the engine Version 1, “Legacy JS”?

Up until version 3.4, requirements were extracted when the page was displayed for the first time. This had two side effects:

  • When saving the page, the user had to wait until we had identified all requirements on the page. This was seamless and very fast for small pages, but it could take a long time when there were hundreds of requirements.

  • The JavaScript that extracted the requirements, running while the user viewed the page, was designed to simplify the extracted text. Over time, we could no longer change it, because some customers rely on the comparability of extracted texts across versions of requirements. Also, since the text was rendered at runtime, there could be slight differences that made comparisons difficult.

We keep Version 1 to give our customers more time to migrate but, since we now support Scaffolding, we don’t believe anything is holding you back!

What is the engine Version 2, "Using the queue"?

When a page is saved and the new mode is selected, we don't process it immediately. Rather, we put it in a queue (the same mechanism we already use to send events to Jira) and we process it whenever Confluence is available.

  • It means saving the page is much faster.

  • It also means we had to rewrite the indexing engine and, this time, we correctly support bullet-point lists, tables, images, etc.

Of course, since this is a new engine, we may have missed some text formatting, so please give us feedback if you see any formatting that you would like us to support.
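The queue-based flow described above can be illustrated with a minimal sketch. It is not Requirement Yogi's actual code: the names `save_page`, `extract_requirements` and `indexed` are hypothetical, the extractor is a placeholder that treats each non-empty line as a requirement, and an in-memory dictionary stands in for the database. The point is only to show that saving merely enqueues the page, while a single background worker indexes pages one at a time.

```python
import queue
import threading

# All names below are hypothetical; this is an illustration, not the real engine.
index_queue = queue.Queue()
indexed = {}            # stands in for the database: page id -> requirements
processing_order = []   # records the order in which pages were indexed


def extract_requirements(body):
    """Placeholder extractor: treats every non-empty line as one requirement."""
    return [line.strip() for line in body.splitlines() if line.strip()]


def save_page(page_id, body):
    """Called when a user saves: it only enqueues, so it returns immediately."""
    index_queue.put((page_id, body))


def worker():
    """Single consumer: pages are indexed one at a time, in save order."""
    while True:
        item = index_queue.get()
        if item is None:          # sentinel used to stop the worker
            index_queue.task_done()
            break
        page_id, body = item
        indexed[page_id] = extract_requirements(body)
        processing_order.append(page_id)
        index_queue.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

save_page("page-1", "REQ-001 Users can log in.\nREQ-002 Users can log out.")
save_page("page-2", "REQ-003 Reports can be exported.")

index_queue.join()      # wait until the background worker has caught up
index_queue.put(None)   # ask the worker to stop
thread.join()
```

Because there is only one worker, saving many pages at once lengthens the queue rather than triggering many parsing jobs in parallel.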

Why did it have to change?

Atlassian's Data Center line of products aims for enterprise-level quality. First, slow page saves were an issue for the user experience: we needed users to be able to save the page and view it instantly. Second, dealing with HTML is difficult and often introduces whitespace changes even when the original text hasn't changed. The new indexing engine produces much more deterministic output.
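To illustrate why whitespace matters when comparing extracted texts, here is a small sketch. It does not describe how either engine works internally; `normalize_whitespace` is a hypothetical helper, and the two strings are invented examples of the same requirement extracted with different spacing.

```python
import re


def normalize_whitespace(text):
    """Collapse every run of whitespace (spaces, newlines, tabs,
    non-breaking spaces) into a single space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


# The same requirement text, extracted twice with different spacing.
extracted_before = "The system  shall\nexport\u00a0reports."
extracted_after = "The system shall export reports. "

raw_equal = extracted_before == extracted_after
normalized_equal = (
    normalize_whitespace(extracted_before) == normalize_whitespace(extracted_after)
)
```

A naive string comparison reports a change here even though the wording is identical, which is the kind of false difference a deterministic extraction avoids.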

Is there an impact to changing modes?

YES. We don't recommend switching modes back and forth while creating baselines. When users create a baseline, they take a snapshot of the requirements at a given time. If an administrator switches the mode, the requirements in the next baseline will not be comparable with the requirements in the previous baseline.

No performance impact. The new mode was lightning-fast on our test instances, for four reasons: 1. we don’t make the user wait for the reindexing before displaying pages, since we process everything in the queue; 2. the queue means only one task is executed at any time, so we never suddenly parse dozens of pages at the same time; 3. the reindexing is much shorter thanks to the change of architecture; and 4. the algorithm itself is faster at parsing XML (although that may vary depending on complexity).

Should an existing customer switch to the new mode?

New customers should be on the new mode. Should existing customers switch?

  • YES, existing customers should use the new mode, for the performance.

  • BUT check whether your users have baselines. If they do, explain to them that requirements indexed after the baseline will be shown as slightly different.

How do I change the mode?

Switching is manual. The option is available in the Confluence administration → Requirement Yogi → Options → section "Indexing" → Change mode.

See the screenshot at the top of this page.

Did we answer all your questions?

Feel free to ask questions on Requirement Yogi support.

Related pages