This article explains how to extract, in bulk, all the URLs contained in Confluence pages using the Confluence Command Line Interface (CLI) app.
Instructions
Use the runFromPageList and getSource actions to write all the URLs found in the pages to a file.
The parameters used in the command below are:
- space: Name of the space. In this case, @all represents all spaces in the Confluence instance.
- clearFileBeforeAppend: This option will automatically clear an existing file on the first append requested.
- id: ID of the page. In this case, @pageId@ stands for the ID of each page found in the spaces.
- file: Path and name of the file that receives the output.
- findReplaceRegex: The regex find-and-replace patterns used to match the links.
Execute the command below to get all the links in the pages from all spaces of a Confluence instance.
Code Block |
---|
--action runFromPageList --space "@all" --clearFileBeforeAppend --common "--action getSource --id @pageId@ --special: "" #"" --append --file spaceLinkList.txt" --input "--findReplaceRegex ""(?s).*?href=.(http[s]{0,1}://[./a-z]+)#$1,"" --findReplaceRegex ""(?s)(((http[s]{0,1}://[./a-z]+),)*).*#$1"" --findReplaceRegex "",#' \n '"" " |
Results for the above command: the extracted links are appended to the file spaceLinkList.txt.
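To see what the three findReplaceRegex steps in the command do, the following Python sketch applies equivalent substitutions to a made-up page source. It is only an illustration under stated assumptions: `SAMPLE_SOURCE` and `extract_links` are hypothetical names that are not part of the CLI, Python's `re` module stands in for the regex engine the app uses, and the third step is assumed to be intended to put each link on its own line.

```python
import re

# Hypothetical sample of page source; not taken from a real Confluence page.
SAMPLE_SOURCE = (
    '<p>See <a href="https://example.com/docs">docs</a> '
    'and <a href="http://foo.org/a/b">foo</a>.</p>'
)


def extract_links(source: str) -> list[str]:
    """Mimic the three find/replace steps from the command (illustrative only)."""
    # Step 1: replace the text up to and including each href URL with "URL,".
    text = re.sub(r"(?s).*?href=.(http[s]{0,1}://[./a-z]+)", r"\1,", source)
    # Step 2: keep only the leading run of "URL," items and drop the trailing markup.
    text = re.sub(r"(?s)(((http[s]{0,1}://[./a-z]+),)*).*", r"\1", text)
    # Step 3: turn the comma separators into line breaks (assumed intent).
    text = text.replace(",", "\n")
    # Return the non-empty lines as a list of links.
    return [line.strip() for line in text.split("\n") if line.strip()]


if __name__ == "__main__":
    for link in extract_links(SAMPLE_SOURCE):
        print(link)
```

Because the character class `[./a-z]` covers only lowercase letters, dots, and slashes, a URL that contains digits, hyphens, uppercase letters, or a query string is cut off at the first such character. Widening the class is one possible adjustment if those links matter.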