This article explains how to extract URLs that match a regex pattern from Confluence pages using the Confluence Command Line Interface (CLI) app.

...

  • Run the following command to get all links from the pages in all spaces of a Confluence instance:

    --action runFromPageList --space "@all" --clearFileBeforeAppend --common "--action getSource --id @pageId@ --special: "" #"" --append --file spaceLinkList.txt"  --input "--findReplaceRegex ""(?s).*?href=.(http[s]{0,1}://[./a-z]+)#$1,"" --findReplaceRegex ""(?s)(((http[s]{0,1}://[./a-z]+),)*).*#$1"" --findReplaceRegex "",#' \n '"" "


  • Run the following command to get all links from a single page in a space:

    --action getSource --space "ZCLI" --title "PAGE2" --special " #" --findReplaceRegex "(?s).*?href=\"(http[s]{0,1}://[./a-z]+)#\$1 \n"



    Output from the above commands:
    [Image: output of the above commands]

The parameters used in the actions are:

  • --space: Name of the space. In the first command, @all selects all spaces in the instance.
  • --clearFileBeforeAppend: Clears an existing file automatically on the first append request.
  • --id: ID of the page. In the first command, @pageId@ is replaced with the ID of each page found in the selected spaces.
  • --file: Path and name of the file that receives the output.
  • --findReplaceRegex: The regex patterns, with their replacements, used to match the links in the page source.
  • --special: Characters used for specialized processing of some specific parameters. In these commands, # separates each regex from its replacement in --findReplaceRegex.
  • --append: Appends the output to the existing file instead of overwriting it.
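
The three --findReplaceRegex steps in the first command can be hard to follow inside the nested quoting. The sketch below reproduces the same chain in Python to show what each step does to the page source. It is only an illustration: Python's re module stands in for the CLI's regex engine, the function name extract_links and the sample HTML are hypothetical, and the last step writes a plain newline rather than the exact replacement string used in the command.

```python
import re

# URL pattern copied from the commands above.
URL = r"http[s]{0,1}://[./a-z]+"


def extract_links(page_source: str) -> str:
    """Mimic the three find/replace steps of the first command (illustrative only)."""
    # Step 1: replace everything up to and including each href URL with "URL,".
    text = re.sub(r"(?s).*?href=.(" + URL + r")", r"\1,", page_source)
    # Step 2: keep the leading run of "URL," items and drop the trailing markup.
    text = re.sub(r"(?s)(((" + URL + r"),)*).*", r"\1", text)
    # Step 3: put each URL on its own line (simplified replacement string).
    return text.replace(",", "\n")


# Hypothetical page source, not taken from a real Confluence page.
sample = (
    '<p>See <a href="https://example.com/docs">docs</a> and '
    '<a href="http://atlassian.com/">home</a>.</p>'
)
print(extract_links(sample))
```

A page without any matching href yields an empty string, because the second step discards everything that is not part of the leading URL list.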


Info
  • The above action is available in CLI v9.3 and later.
  • Make sure the regex pattern matches the format of the links on your pages; otherwise the output will not be meaningful.
  • The above commands are written for Windows. Update the quoting and escaping accordingly for Linux-based machines.
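
To illustrate why the pattern has to fit the links on the page: the character class [./a-z] used in the commands above accepts only lowercase letters, dots, and slashes, so a URL containing digits, uppercase letters, hyphens, or a query string is cut off at the first other character. The sketch below shows this in Python with a hypothetical link. The BROADER pattern is an assumption offered for comparison only; it has not been tested with the CLI and would need the appropriate quoting there.

```python
import re

# Pattern fragment copied from the commands above.
ORIGINAL = r'href=.(http[s]{0,1}://[./a-z]+)'
# Hypothetical broader pattern: capture up to the next quote, space, or angle bracket.
BROADER = r'href=.(https?://[^"\'\s<>]+)'

# Hypothetical link containing uppercase letters, a hyphen, digits, and a query string.
html = '<a href="https://example.com/Page-2?id=42">report</a>'

original_hits = re.findall(ORIGINAL, html)
broader_hits = re.findall(BROADER, html)

print(original_hits)  # URL truncated at the first character outside [./a-z]
print(broader_hits)   # full URL
```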

...