This article explains how to get URLs from Confluence pages based on a regex pattern using the Confluence Command Line Interface (CLI) app.
Instructions
Use the runFromPageList and getSource actions to export all the URLs from Confluence pages to a CSV file based on the given parameters.
Execute the below command to get all links from all pages in all spaces of a Confluence instance:
--action runFromPageList --space "@all" --clearFileBeforeAppend --common "--action getSource --id @pageId@ --special: "" #"" --append --file spaceLinkList.txt" --input "--findReplaceRegex ""(?s).*?href=.(http[s]{0,1}://[./a-z]+)#$1,"" --findReplaceRegex ""(?s)(((http[s]{0,1}://[./a-z]+),)*).*#$1"" --findReplaceRegex "",#' \n '"" "
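The three --findReplaceRegex steps in this command can be hard to follow when read inline. The following is a minimal Python sketch of the same chain, offered only as an illustration: it assumes that Python's re module behaves like the CLI's regex engine for these patterns, and the function name extract_links and the sample page source are hypothetical. The patterns are copied from the command, with $1 written as \1 and the # separator removed.

```python
import re

# The three find/replace steps from the command, rewritten for Python's re
# module: "$1" becomes r"\1", and "#" (the find/replace separator) is dropped.
STEPS = [
    # 1. Replace everything up to and including each link with "<url>,".
    (r'(?s).*?href=.(http[s]{0,1}://[./a-z]+)', r'\1,'),
    # 2. Keep the leading run of "<url>," items and drop the trailing text.
    (r'(?s)(((http[s]{0,1}://[./a-z]+),)*).*', r'\1'),
    # 3. Put each URL on its own line.
    (r',', ' \n '),
]


def extract_links(source: str) -> list[str]:
    """Apply the steps in order and return the URLs that remain."""
    text = source
    for pattern, replacement in STEPS:
        text = re.sub(pattern, replacement, text)
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # Hypothetical page source, used only to exercise the patterns.
    sample = (
        '<p><a href="https://example.com/docs">Docs</a> and '
        '<a href="http://wiki.example.org/a/b">Wiki</a></p>'
    )
    for url in extract_links(sample):
        print(url)
```

Note that the character class [./a-z]+ accepts only lowercase letters, dots, and slashes, so a URL is cut off at its first digit, uppercase letter, hyphen, or query character. Widen the class if the pages link to such URLs.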
Execute the below command to get all the links from a single page of a space:
--action getSource --space "ZCLI" --title "PAGE2" --special " #" --findReplaceRegex "(?s).*?href=\"(http[s]{0,1}://[./a-z]+)#\$1 \n"
The output added to the CSV file by the above commands is similar to:
The parameters used in the actions are:
- --space: Name of the space. In this case, @all represents all spaces in an instance.
- --clearFileBeforeAppend: This option clears an existing file on the first append requested.
- --id: IDs of pages. In this case, it searches for all the pages in the spaces using @pageId@.
- --file: Path and name of the file that receives the output.
- --findReplaceRegex: The regex find-and-replace pairs used to match and extract the links. In these commands, # separates each pattern from its replacement.
- --special: Characters used for specialized processing of some specific parameters.
- --append: This option appends output to the existing file.
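The interplay of --clearFileBeforeAppend and --append described above (clear once, then keep appending) can be modeled in a few lines of Python. This is a sketch of the described behavior only, not the CLI's implementation; the class name AppendTarget and its parameters are hypothetical.

```python
import tempfile
from pathlib import Path


class AppendTarget:
    """Models an output file that may be cleared once before the first append."""

    def __init__(self, path: Path, clear_before_append: bool = False) -> None:
        self.path = path
        self._clear_pending = clear_before_append

    def append(self, text: str) -> None:
        # The first append truncates the file when clearing was requested;
        # every later append adds to the end of the file.
        mode = "w" if self._clear_pending else "a"
        self._clear_pending = False
        with self.path.open(mode, encoding="utf-8") as handle:
            handle.write(text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        target_path = Path(folder) / "spaceLinkList.txt"
        # Content left over from an earlier run.
        target_path.write_text("stale links\n", encoding="utf-8")
        target = AppendTarget(target_path, clear_before_append=True)
        target.append("https://example.com/a\n")  # clears the stale content first
        target.append("https://example.com/b\n")  # appended after the first write
        print(target_path.read_text(encoding="utf-8"))
```

Without the clearing option, links from an earlier run would stay in the file and the new results would be added after them.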