How to globally search and modify content

Description

Often the need arises to find pages that contain certain content and modify it slightly. If the number of pages is large, making the modifications manually is painful and error-prone. Automation can help, but you have to make sure that you are modifying exactly what you intend and nothing else. See also this user question for a discussion on this topic.

Updating storage format

Always be careful when updating storage format data, and make sure you test before doing mass updates.
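
One way to test before a mass update is to copy the storage format of a single page and preview the change offline. The following is a minimal Python sketch of that idea, not part of the CLI: the function name `preview_replace` and the sample body are assumptions made for illustration.

```python
import difflib


def preview_replace(body: str, find: str, replace: str) -> list[str]:
    """Return unified-diff lines showing what a plain text replace would change."""
    updated = body.replace(find, replace)
    return list(
        difflib.unified_diff(
            body.splitlines(),
            updated.splitlines(),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )


# Hypothetical storage format copied from a test page.
sample = "<p>aaa http://myjira.com bbb</p>"

for line in preview_replace(sample, "http://myjira.com", "http://mynewjira.com"):
    print(line)
```

An empty result means the find string does not occur in the body, so a replace would leave the page unchanged.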

Regex searching of body content

Starting with Release 6.4, getPageList can list pages whose body content matches a regex pattern. This provides a way to find specific content in the page source that is not amenable to normal Confluence searching via indexing.


Links

Example 1: Changing a URL

This example changes a link, for instance from http://myjira.com to http://mynewjira.com.

Steps

  1. Set up an example page

    --action storePage --space xxx --title at --parent @home --content "aaa http://myjira.com bbb"
  2. Construct a modifyPage action for a single page using a simple text replace. Because : (colon) appears in the text and is the CLI's default key:value separator, use # as the separator instead by setting the special parameter (spaces are significant!)

    --action modifyPage --space xxx --title at --findReplace "http://myjira.com#http://mynewjira.com" --special " #"
  3. Run against all pages with the link (using Unix-style escaping - see Tips). Run against your test space first before using @all.

    --action runFromContentList --search "\"http://myjira.com\"" --space @all --common "--action modifyPage --id @pageId@ --findReplace \"http://myjira.com#http://mynewjira.com\" --special \" #\" "
  4. Results 

    Run: --action modifyPage --id 112197680 --findReplace "http://myjira.com#http://mynewjira.com" --special " #" 
    Page modified: 'at' in space: xxx. Page has id: 112197680
    ...

Use runFromPageList instead

It is sometimes easier to use runFromPageList than to work out the right search to use with runFromContentList. If the find string is not on a page, that page is not changed. The downside is that it may take a bit longer to go through and check each page.
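
The trade-off can be pictured with a small Python sketch. It is only a model of the behavior described above, not the CLI's implementation: every page is visited, and pages without the find string are skipped. The function name `replace_in_pages` and the page data are hypothetical.

```python
def replace_in_pages(pages: dict[str, str], find: str, replace: str) -> dict[str, str]:
    """Return only the pages whose body actually changes, mapped to the new body."""
    changed = {}
    for title, body in pages.items():
        if find not in body:
            continue  # find string absent: the page is left alone
        changed[title] = body.replace(find, replace)
    return changed


# Hypothetical page titles and bodies.
pages = {
    "at": "aaa http://myjira.com bbb",
    "other": "no links here",
}

result = replace_in_pages(pages, "http://myjira.com", "http://mynewjira.com")
print(sorted(result))
```

Checking every page costs time but removes the risk of a search that misses pages.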

 

Example 2: Renaming a macro

There are a few reasons why you may need to rename a macro:

  1. A user macro may need a better name or may conflict with a macro from a new plugin
  2. A macro may no longer be valid because you have discontinued a plugin or removed a user macro, but you want to save the body of the macro

Steps

  1. Determine the kind of body the existing macro has
    1. Plain text like the noformat macro
    2. Rich text like the panel macro
  2. Make sure the target macro has the same body type
  3. Use the techniques of Example 1
    • runFromPageList may be more appropriate in many cases

Use something similar to the following findReplace strings:

One rich text macro:
--findReplace "ac:name=~table-plus~#ac:name=~panel~" --special " #  ~"
 
One rich text macro and one plain text macro:
--findReplace "ac:name=~table-plus~#ac:name=~panel~,ac:name=~csv~#ac:name=~noformat~" --special " #  ~"
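
To check such a string before running it, it can help to expand it by hand. The Python sketch below is a rough model based only on the conventions used in these examples (comma between pairs, # between find and replace, ~ standing in for a double quote). It is not the CLI's actual parser, and the function names and sample storage format are assumptions.

```python
def expand_pairs(spec: str) -> list[tuple[str, str]]:
    """Split a findReplace-style string into (find, replace) pairs."""
    pairs = []
    for item in spec.split(","):          # comma separates the pairs
        find, replace = item.split("#", 1)  # '#' separates find from replace
        # '~' stands in for a double quote in these examples
        pairs.append((find.replace("~", '"'), replace.replace("~", '"')))
    return pairs


def apply_pairs(body: str, pairs: list[tuple[str, str]]) -> str:
    """Apply each plain text replacement in order."""
    for find, replace in pairs:
        body = body.replace(find, replace)
    return body


spec = "ac:name=~table-plus~#ac:name=~panel~,ac:name=~csv~#ac:name=~noformat~"

# Hypothetical storage format for a page using the table-plus macro.
storage = (
    '<ac:structured-macro ac:name="table-plus">'
    "<ac:rich-text-body><p>x</p></ac:rich-text-body>"
    "</ac:structured-macro>"
)

print(apply_pairs(storage, expand_pairs(spec)))
```

Only the name attribute changes and the body element is untouched, which is why the target macro needs the same body type as the original.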

Finding pages with macros

This answer describes a technique using content search to find pages with a macro. In summary, use something like: "macroName: table-plus*"

 

Example 3: Changing XHTML content

This site has many pages created via automation, often by a Bamboo build process. Confluence has an annoying feature that converts simple wiki text like @entry@ into an XHTML reference to a template variable. There doesn't seem to be a way to escape this behavior. The result is a number of pages with incorrect data. This example shows how this was corrected.

Steps

  1. Identify the problem - usually, someone notices a problem on a specific page when viewing the page
  2. Set up an example page for testing
  3. Find all content with the problem. You need to have a unique way to identify the text that is in error without finding extraneous text. This can be difficult in some cases.
    1. Look at the storage view of the page with the problem
    2. In this example: <at:var at:name="entry" /> should have been @entry@
    3. In the UI, search for something like: "<at:var" and verify that the pages found represent the problem you are trying to solve
  4. Construct the same search using the CLI and verify the results. We suggest starting with a single space before using @all. Note that you need to escape the double quotes using your system-specific escaping syntax - see Tips

    --action getContentList --search "\"<at:var\"" --space @all
  5. Construct a find/replace string. In this case, the text between the @ signs could be anything, so we will use findReplaceRegex instead of the simple text replacement that findReplace does. You need to know a little regex syntax, including a find (capture) group (in this example: any number of alphabetic characters) and replacement text that references the group: @$1@. Test your regex first. How to use regular expressions has some pointers.

    <at:var at:name="([a-zA-Z]*)" /> needs to be replaced with @$1@
  6. Now construct the CLI command to make a single page fix.
    1. Yuk! XHTML has a bunch of characters that are considered special characters for the CLI, regex processing, and/or command line processing. Tips has some rules for escaping some of the CLI-related areas, but it is still a pain, especially given the differences between Windows and non-Windows systems.
    2. Construct the parameters for find and replace. Use ~ instead of " so we can avoid escaping the double quotes. Use # as the separator between key and value instead of the default : (colon). For Unix-based command lines, the $ must be escaped.

      --findReplaceRegex "<at:var at:name=~([a-zA-Z]*)~ />#@\$1@" --special " #  ~" 
    3. Construct the CLI command for a single action and test on a single page

      --action modifyPage --space xxx --title "my title" --findReplaceRegex "<at:var at:name=~([a-zA-Z]*)~ />#@\$1@" --special " #  ~" 
       
  7. Global change
    1. Put everything together using runFromContentList (this example uses Unix-based escaping)

      --action runFromContentList --search "\"<at:var\"" --space @all --common "--action modifyPage --id @pageId@ --findReplaceRegex \"<at:var at:name=~([a-zA-Z]*)~ />#@\$1@\" --special \" #  ~\" " 
    2. Result

      Run: --action modifyPage --id 28901397 --findReplaceRegex "<at:var at:name=~([a-zA-Z]*)~ />#@$1@" --special " #  ~" 
      Page modified: '3.0.0 - Documentation' in space: BCLI. Page has id: 28901397
      Run: --action modifyPage --id 28901413 --findReplaceRegex "<at:var at:name=~([a-zA-Z]*)~ />#@$1@" --special " #  ~" 
      Page modified: '3.0.0 - Documentation' in space: CSOAP. Page has id: 28901413
      ...
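
The advice in step 5 to test the regex first can be followed offline against a copy of the storage format. Below is a minimal Python sketch under stated assumptions: the function name `restore_placeholders` and the sample text are hypothetical, and Python writes the group reference as \1 where the CLI examples above use $1.

```python
import re

# Same pattern as in step 5, with the real double quotes instead of '~'.
PATTERN = re.compile(r'<at:var at:name="([a-zA-Z]*)" />')


def restore_placeholders(body: str) -> str:
    """Turn <at:var at:name="entry" /> back into @entry@."""
    return PATTERN.sub(r"@\1@", body)


# Hypothetical storage format with one purely alphabetic name and one that is not.
sample = '<p>Build <at:var at:name="entry" /> done, see <at:var at:name="x1" /></p>'

print(restore_placeholders(sample))
```

The second variable is left untouched because its name contains a digit, which [a-zA-Z]* does not match. A quick test like this shows whether the character class needs widening before the global change is run.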
