Description

Many macros allow external content to be displayed in Confluence, for instance the macros from SQL, HTML, and XL. Even though the content appears on a Confluence page, the data is not included in the Confluence search index because it is not part of the wiki markup of the page. One way to make the content searchable in Confluence is to use the cache macro.

How to index external content

...

This means that users can use Confluence search to find information that comes from external sources, including databases and web pages. Using the RUN macro is an alternative way to do this. The advantage of the cache macro technique is that the search result returned is the actual page the content is on.

Considerations

  • The content should be independent of the user accessing the page. Otherwise, the content will change depending on who last visited the page and caused the cache to be updated.
  • Consider automating the updating of the content by using CSOAP to render the page on a regular basis.
  • Additional processing is required each time the cache is refreshed. Specifically, the page will be indexed more often than before.
  • The behavior of the cache macro does not change when the new index feature is not used.
  • An additional index extractor is involved in indexing operations.
    • Very low overhead for pages that do not use the new feature
    • Can be disabled for the site from the plugin screen
  • Requires Cache plugin release 3.3 or above, available in the coming weeks. A beta is available for testing.

Beta

This support is currently only available by downloading and installing a beta version of the cache plugin. The support will be rolled into the next release (3.3.0).
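
The consideration about automating content updates can be sketched as follows. This is a minimal illustration only: it swaps CSOAP for a plain HTTP GET of each page, on the assumption that requesting a page causes it to be rendered and an expired cache to be refreshed. The names `fetch_status` and `refresh_pages` are hypothetical, the pages are assumed to be viewable without login, and scheduling (cron or similar) is left to the reader.

```python
import urllib.request
from typing import Callable, Dict, Iterable


def fetch_status(url: str, timeout: float = 30.0) -> int:
    """Request a page and return the HTTP status code.

    Assumption: the page can be viewed anonymously. A real site
    would probably need authentication added here.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def refresh_pages(
    urls: Iterable[str],
    fetch: Callable[[str], int] = fetch_status,
) -> Dict[str, bool]:
    """Request each page once and report which requests succeeded."""
    results: Dict[str, bool] = {}
    for url in urls:
        try:
            results[url] = fetch(url) == 200
        except OSError:
            # urllib's URLError and HTTPError are subclasses of OSError.
            results[url] = False
    return results
```

Passing `fetch` as a parameter keeps the loop testable without network access. The interval between runs should be chosen to match the refresh setting on the cache macro, since requesting a page whose cache is still valid changes nothing.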

New parameter

The following additional parameter is available on the cache macro:

  • index - If index=true, the cached content will be added to the Confluence search index. Default is "false". Whenever the cached data is updated, the page will be re-indexed.

Examples

SQL queries

No Format
{cache:index=true}
The results from the SQL query will be indexed for search.
{sql-query:dataSource=ReportDS}
select * from report
{sql-query}
{cache}

Web pages

Using the HTML macro.

No Format
{cache:index=true}
The page pointed to by the url will be indexed for search.
{html:script=#http://www.atlassian.com/about/}
{html}
{cache}

How does it work?

Assume there is a page that has a cache macro instance that specifies index=true.

  1. Whenever the page is viewed, one of the following applies:
    • cache is valid (cache is current according to the refresh parameter and the age of the cached data) - the cached data is used as normal
    • cache is expired - the data within the cache macro is rendered and the rendered data is stored in the cache as normal. In addition, the page is flagged as needing to be re-indexed since the dynamic content of the page has changed due to the new rendering.
  2. The standard indexing queue is processed as normal by all the registered content extractors. The cache macro has added a new content extractor to process pages with cached content.
    • The page will be recognized by the cache macro extractor and cause the appropriate cached content to be added to the content that will be indexed for the page
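
The two steps above can be sketched as a small model. This is not the plugin's code. `Site`, `CachedPage`, and `CacheEntry` are hypothetical names, `refresh_seconds` stands in for the refresh parameter, and the "search index" is just a dictionary keyed by page title.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class CacheEntry:
    html: str
    stored_at: float


@dataclass
class CachedPage:
    """Hypothetical model of a page containing {cache:index=true}."""
    title: str
    render: Callable[[], str]   # renders the body of the cache macro
    refresh_seconds: float      # stands in for the refresh parameter
    index: bool = True
    entry: Optional[CacheEntry] = None


class Site:
    def __init__(self, clock: Callable[[], float] = time.time):
        self.clock = clock
        self.index_queue: List[CachedPage] = []
        self.search_index: Dict[str, str] = {}

    def view(self, page: CachedPage) -> str:
        """Step 1: serve from the cache, or re-render and flag for indexing."""
        now = self.clock()
        entry = page.entry
        if entry is not None and now - entry.stored_at < page.refresh_seconds:
            return entry.html                   # cache is valid
        page.entry = CacheEntry(page.render(), now)
        if page.index:
            self.index_queue.append(page)       # content changed: re-index
        return page.entry.html

    def process_index_queue(self) -> None:
        """Step 2: the extra extractor adds cached content to the index.

        Simplification: only the cached content is stored here, not the
        rest of the page.
        """
        while self.index_queue:
            page = self.index_queue.pop(0)
            if page.entry is not None:
                self.search_index[page.title] = page.entry.html
```

In this model a page is queued for indexing only when its cache was actually refreshed, which mirrors the point that viewing a page with a valid cache causes no extra indexing work.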

What data is indexed?

The cache macro renders the contents of the macro into HTML and stores the HTML in the cache. The cache content extractor processes the HTML data from the cache and extracts only the text and attribute fields using the Jericho HTML Parser.
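
To give a feel for what "text and attribute fields" means, here is a rough sketch that uses Python's standard `html.parser` as a stand-in for the Jericho HTML Parser. It assumes that every non-empty attribute value is kept, which may differ from what the plugin actually selects. `TextAndAttributeExtractor` and `extract_indexable_text` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import List


class TextAndAttributeExtractor(HTMLParser):
    """Collects text nodes and attribute values, dropping the tags."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Assumption: every non-empty attribute value is indexable.
        for _name, value in attrs:
            if value:
                self.parts.append(value)

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)


def extract_indexable_text(html: str) -> str:
    """Return the text and attribute values of an HTML fragment."""
    parser = TextAndAttributeExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


cached_html = '<table><tr><td title="Region">North</td><td>42</td></tr></table>'
print(extract_indexable_text(cached_html))  # prints: Region North 42
```

The markup itself never reaches the index in this sketch, so a search for `td` or `table` would not match the page, while a search for `North` or `Region` would.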

Considerations

...