...
As an example benchmark, Smarter Search for Bitbucket indexed the Linux code base (approximately 15 million lines of code and half a million commits) in 2-3 minutes, using 6 GB of RAM, non-SSD hard drives, and an internal node. That said, for large codebases we recommend using an external Elasticsearch node. For any concerns or questions, feel free to contact us.
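To put the benchmark figures in perspective, the arithmetic can be sketched in a few lines of Python. The function names are hypothetical, and the linear scaling in `estimate_seconds` is an assumption: real indexing time also depends on commit count, hardware, and load, so treat the result as a rough order of magnitude only.

```python
# Back-of-the-envelope throughput from the benchmark figures quoted above.
BENCHMARK_LINES = 15_000_000      # approximate size of the Linux code base
BENCHMARK_SECONDS = (120, 180)    # 2-3 minutes


def throughput_range(lines=BENCHMARK_LINES, seconds=BENCHMARK_SECONDS):
    """Return (slowest, fastest) indexing rate in lines per second."""
    fastest_time, slowest_time = seconds
    return lines / slowest_time, lines / fastest_time


def estimate_seconds(lines_of_code):
    """Scale the benchmark linearly to another codebase.

    Assumption: indexing time grows in proportion to lines of code.
    Returns (best_case, worst_case) in seconds.
    """
    slow_rate, fast_rate = throughput_range()
    return lines_of_code / fast_rate, lines_of_code / slow_rate


if __name__ == "__main__":
    slow, fast = throughput_range()
    print(f"{slow:,.0f} to {fast:,.0f} lines/second")
    best, worst = estimate_seconds(1_000_000)
    print(f"1M lines: roughly {best:.0f} to {worst:.0f} seconds")
```

Under these assumptions the benchmark corresponds to roughly 83,000 to 125,000 lines per second.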
External Node
For Bitbucket instances with large codebases and heavy indexing activity, we recommend setting up a separate Elasticsearch service. This reduces the indexing load on Bitbucket and should significantly improve performance. You can configure an external node on the settings page.
...
After installing the plugin in your Bitbucket instance, you must enable indexing and trigger a reindex:
1. Go to the Smarter Search for Bitbucket Global Settings page in the Bitbucket admin panel.
2. Enable Indexing by clicking the check box.
3. Click Save and Reindex to save the settings and subsequently reindex all repositories.
External Node (Elasticsearch 2.4.6)
...
    cluster.name: stash-codesearch
    network.host: 0.0.0.0 # if network access is required
    script.inline: on
    script.indexed: on
    script.engine.groovy.inline.update: on
    script.engine.groovy.inline.aggs: on
    index.query.bool.max_clause_count: 10240
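As a sanity check, the cluster name, script settings, and clause limit listed above can be compared against an existing configuration file. The following is a minimal Python sketch, not part of the plugin: the function names are hypothetical, the parser handles only flat `key: value` lines rather than general YAML, and the comparison is literal, so any other spelling that Elasticsearch may accept for a value is reported as a mismatch.

```python
# Settings and values taken from the configuration listed above.
REQUIRED_SETTINGS = {
    "cluster.name": "stash-codesearch",
    "script.inline": "on",
    "script.indexed": "on",
    "script.engine.groovy.inline.update": "on",
    "script.engine.groovy.inline.aggs": "on",
    "index.query.bool.max_clause_count": "10240",
}


def parse_flat_settings(text):
    """Parse 'key: value' lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()   # drop trailing comments
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


def find_problems(settings, required=REQUIRED_SETTINGS):
    """Return messages for settings that are missing or differ."""
    problems = []
    for key, expected in required.items():
        actual = settings.get(key)
        if actual is None:
            problems.append(f"missing: {key}")
        elif actual != expected:
            problems.append(f"{key} is {actual!r}, expected {expected!r}")
    return problems


if __name__ == "__main__":
    # Illustrative input; in practice, read the text of your elasticsearch.yml.
    sample = "cluster.name: stash-codesearch\nscript.inline: on\n"
    for problem in find_problems(parse_flat_settings(sample)):
        print(problem)
```

An empty result from `find_problems` means every listed setting is present with the expected value.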
...
Once the node is set up, you must configure Smarter Search for Bitbucket:
1. Go to the Smarter Search for Bitbucket Global Settings page in the Bitbucket admin panel.
2. Enable Indexing by clicking the check box.
3. Uncheck the Internal ES Node checkbox.
4. Click Save and Reindex to save the settings and subsequently reindex all repositories.
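If the reindex does not seem to start after these steps, one thing worth checking is whether the external node is reachable and reports the expected cluster name and version. The sketch below is an illustration under stated assumptions: it assumes the node's HTTP interface is on Elasticsearch's default port 9200 on `localhost`, and the function names are hypothetical. It reads the node's root endpoint, which returns a JSON document containing `cluster_name` and `version.number`.

```python
import json
from urllib.request import urlopen

EXPECTED_CLUSTER = "stash-codesearch"   # cluster.name from the node configuration
EXPECTED_VERSION = "2.4.6"              # Elasticsearch version named in this section


def check_node_info(info, cluster=EXPECTED_CLUSTER, version=EXPECTED_VERSION):
    """Compare the JSON body of the node's root endpoint with expectations."""
    problems = []
    actual_cluster = info.get("cluster_name")
    if actual_cluster != cluster:
        problems.append(f"cluster_name is {actual_cluster!r}, expected {cluster!r}")
    actual_version = info.get("version", {}).get("number")
    if actual_version != version:
        problems.append(f"version is {actual_version!r}, expected {version!r}")
    return problems


def fetch_node_info(base_url="http://localhost:9200", timeout=5):
    """GET the root endpoint. The default URL is an assumption; adjust it."""
    with urlopen(base_url, timeout=timeout) as response:
        return json.load(response)
```

Calling `check_node_info(fetch_node_info())` returns an empty list when the node matches. A connection error from `fetch_node_info` suggests the node is not reachable from the machine running the check.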
Smarter Search for Bitbucket will then start indexing all of your Bitbucket repositories in the background. Indexing takes from a few seconds to a few minutes, depending on the number and size of your repositories. For large Bitbucket instances, we recommend indexing during off-peak hours.
By default, only the master and develop branches are indexed. Individual repo admins may modify these settings. See the Administration documentation page for instructions.