You've documented the CLI command options near the source code with comments. The range of possible values is defined as enumerations. And when users run a command with the --help option, they can check how to use it from the terminal. However, you also maintain reference docs on a site that you must update every time there is a new release.

My client ScyllaDB faced this problem with their CLI sctool. While the commands were properly documented in the code, users frequently complained when they found inconsistencies between the actual CLI implementation and the latest version of their docs.

In this article, I'll review how we autogenerated reference docs from the code for Scylla's CLI using Sphinx. Sphinx is a static site generator for docs, like Jekyll, Gatsby, Docusaurus, or Antora. Even if you are not using Sphinx to document your software, I think you'll still find this article useful, since you can apply the same general principles to other documentation tools.

Step 1 - Generate structured YAML

There are different ways to autogenerate documentation for a CLI. For example, you could run all commands with the option --help, and leave the output of each command in a folder that you could later import from the documentation.
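As a rough illustration of this first approach, the sketch below runs each command with --help and saves the output to a folder. It is a minimal Python sketch under stated assumptions: the function name `dump_help` and the mapping of file names to commands are hypothetical and not part of sctool.

```python
import subprocess
from pathlib import Path


def dump_help(commands, out_dir):
    """Run each command with --help and save its output as a text file.

    `commands` maps an output file stem to the argv list of the command,
    e.g. {"sctool_cluster_list": ["sctool", "cluster", "list"]} (hypothetical).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for stem, argv in commands.items():
        # Capture both streams as text; some CLIs print help to stderr.
        result = subprocess.run(
            argv + ["--help"], capture_output=True, text=True, check=False
        )
        target = out_dir / f"{stem}.txt"
        target.write_text(result.stdout or result.stderr, encoding="utf-8")
        written.append(target)
    return written
```

Note that the CLI binary has to be built and available on the machine that runs this script.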

However, we quickly discarded this option because it required building and installing the CLI for every documentation update, which substantially increased the build time. Additionally, this approach gave us little flexibility to render the documentation differently from its original output.

Instead, together with @michalmatczuk and @TzachL, we decided to create a new CLI command that generates the documentation for all the commands in YAML form:

    sctool docs [options]

    Generates documentation for the CLI in YAML form.

    Options:
      --output,-o (string)
        Path to generate output to (absolute or relative to the root project directory). By default is the current directory.

The command produces a YAML file for each command with the following syntax:

sctool_cluster_list.yaml

    name: sctool cluster list
    synopsis: Show managed clusters
    description: |
      This command displays a list of managed clusters.
    usage: sctool cluster list [flags]
    options:
    - name: help
      shorthand: h
      default_value: "false"
      usage: help for list
    inherited_options:
    - name: api-cert-file
      usage: |
        File `path` to HTTPS client certificate used to access the Scylla Manager server when client certificate validation is enabled (envvar SCYLLA_MANAGER_API_CERT_FILE).
    - name: api-key-file
      usage: |
        File `path` to HTTPS client key associated with --api-cert-file flag (envvar SCYLLA_MANAGER_API_KEY_FILE).
    - name: api-url
      default_value: http://127.0.0.1:5080/api/v1
      usage: |
        Base `URL` of Scylla Manager server (envvar SCYLLA_MANAGER_API_URL).
        If running sctool on the same machine as server, it's generated based on '/etc/scylla-manager/scylla-manager.yaml' file.
    see_also:
    - sctool cluster - Add or delete clusters

As you can see, each YAML file defines the command name, a synopsis with a description, and lists options in a structured form. This allows us to read the file with less effort so that we can render the documentation in the format we prefer.

Step 2 - Convert YAML to reStructuredText (or Markdown)

Once we have the commands' docs as YAML files, it's time to outline how we want the reference documentation to be displayed.

To do so, we created a new template file that defines how to iterate over an object (in this case, a single YAML file) and format the resulting document as reStructuredText. For example, we divided our template for an sctool command into the following sections:

  • The synopsis, which describes command usage for the end user.
  • The list of command options with a description and their default values.
  • The list of inherited options, which are common to all commands.
  • An example of use.

command.tmpl

    .. -*- mode: rst -*-

    {{ data['description'] }}

    {% if data['usage'] %}
    Syntax
    ......

    .. code-block:: none

       {{ data['usage'] }}

    {% endif %}
    {% if data['options'] %}
    Command options
    ...............

    {% for item in data['options'] %}
    ``{% if item.shorthand %}-{{item.shorthand}}, {% endif %}--{{item.name}}``
      {{item.usage}}

      {% if item.default_value %}**Default value:** ``{{ item.default_value }}``{% endif %}

    {% endfor %}
    {% endif %}
    {% if data['inherited_options'] %}
    .. collapse:: Inherited Options

      {% for item in data['inherited_options'] %}
      ``{% if item.shorthand %}-{{item.shorthand}}, {% endif %}--{{item.name}}``
        {% set usage = item.usage.split('\n') %}
        {% for line in usage %}
        {{line}}
        {% endfor %}

        {% if item.default_value %}**Default value:** ``{{ item.default_value }}``{% endif %}

      {% endfor %}
    {% endif %}
    {% if data['example'] %}
    Example
    .......

    {% set example = data['example'].split('\n') %}
    .. code-block::

       {% for line in example %}{{line}}
       {% endfor %}
    {% endif %}

This file follows the Jinja2 syntax. Here are some of the most common Jinja2 delimiters we used:

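=====================  ===========================================  ======================================================================
Delimiter              Example                                      Description
=====================  ===========================================  ======================================================================
Variable substitution  {{ data['key'] }}                            Inserts the value from the object with key key in the document.
Conditional            {% if condition %}...{% endif %}             Shows the block between the brackets if the condition evaluates to true.
Loop                   {% for o in data['key'] %}...{% endfor %}    If data['key'] is an array, iterates over all its elements.
=====================  ===========================================  ======================================================================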

👉 For a complete Jinja2 reference, see Template designer documentation.
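To see the three delimiters working together, here is a minimal Python sketch that renders a small template with the Jinja2 package. The template is a simplified stand-in rather than our actual command.tmpl, and the `data` dictionary is a hand-trimmed copy of the YAML example above; the `render` helper is a hypothetical name.

```python
from jinja2 import Template

# Trimmed-down data mirroring the structure of sctool_cluster_list.yaml.
data = {
    "usage": "sctool cluster list [flags]",
    "options": [
        {"name": "help", "shorthand": "h", "default_value": "false"},
        {"name": "api-url"},  # no shorthand, so the conditional is skipped
    ],
}

# Conditional, variable substitution, and loop in one small template.
TEMPLATE = """\
{% if data['usage'] %}Usage: {{ data['usage'] }}
{% endif %}{% for item in data['options'] %}{% if item.shorthand %}-{{ item.shorthand }}, {% endif %}--{{ item.name }}
{% endfor %}"""


def render(data):
    """Render the template, exposing the dictionary under the name `data`."""
    return Template(TEMPLATE).render(data=data)


rendered = render(data)
print(rendered)
```

Missing keys such as `shorthand` evaluate as false in a Jinja2 conditional, which is why the template can test them without checking first whether they exist.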

Step 3 - Autogenerate the documentation

Next, we need to pass the YAML files to the Jinja template to generate reStructuredText.

One option would have been to create a script that runs a tool like Jinja2 CLI for each CLI command:

    jinja2 command.tmpl sctool_cluster_list.yaml --format=yaml > sctool_cluster_list.rst

This command takes the YAML file sctool_cluster_list.yaml as input and produces sctool_cluster_list.rst by rendering it with the template command.tmpl.
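The same loop can also be written directly in Python instead of shelling out to the Jinja2 CLI. This is a minimal sketch, assuming the PyYAML and Jinja2 packages are installed; the function name `render_commands` and the directory layout are hypothetical. The parsed YAML is passed to the template under the name `data`, which is what command.tmpl expects.

```python
from pathlib import Path

import yaml
from jinja2 import Environment, FileSystemLoader


def render_commands(yaml_dir, template_path, out_dir):
    """Render every *.yaml file in yaml_dir to an .rst file in out_dir."""
    template_path = Path(template_path)
    env = Environment(loader=FileSystemLoader(str(template_path.parent)))
    template = env.get_template(template_path.name)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for yaml_file in sorted(Path(yaml_dir).glob("*.yaml")):
        # safe_load avoids executing arbitrary YAML tags.
        data = yaml.safe_load(yaml_file.read_text(encoding="utf-8"))
        target = out_dir / f"{yaml_file.stem}.rst"
        target.write_text(template.render(data=data), encoding="utf-8")
        written.append(target)
    return written
```

A call such as `render_commands("partials", "command.tmpl", "generated")` would then produce one .rst file per command.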

Then, we could include the output in the documentation. For example, Sphinx supports the include directive to pull an entire reStructuredText file into another one (literalinclude, by contrast, would display the file as an unrendered code listing).

    Cluster
    =======

    list
    ----

    .. include:: sctool_cluster_list.rst

In our case, we used the Sphinx extension sphinxcontrib-datatemplates instead. Its datatemplate directive renders the YAML content with the Jinja2 template and includes the result in the page, all in a single step.

For example:

    Cluster
    =======

    cluster list
    ------------

    .. datatemplate:yaml:: partials/sctool_cluster_list.yaml
       :template: command.tmpl

This results in the following page after building the docs:

[Image: CLI documentation preview]

Preview: https://manager.docs.scylladb.com/master/sctool/cluster.html

Step 4 - Keep documentation up to date

To keep docs up to date, we added a mechanism that checks if the source YAML files stored in the repository match the latest version of the CLI.

A CI workflow triggers every time a new pull request edits the CLI code. Here's a sample workflow implementation using GitHub Actions:

.github/workflows/verify-cli-docs.yaml

    name: Verify CLI docs
    on:
      push:
        branches:
          - main
      pull_request:
        branches:
          - main
    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v2
            with:
              ref: ${{ github.head_ref }}
          - name: Install CLI
            run: npm install .
          - name: Run command to generate YAML files
            run: sctool docs --output docs/commands/_partials
          - name: Check for uncommitted changes
            id: check-changes
            uses: mskri/check-uncommitted-changes-action@v1.0.1
          - name: Evaluate if there are changes
            if: steps.check-changes.outputs.outcome == failure()
            run: exit 1

In short, the workflow runs the command sctool docs. If there are uncommitted changes after running the command, the repository does not have the latest version of the YAML files. Therefore, the workflow raises an error.
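The same idea can be sketched without Git: regenerate the YAML files into a scratch directory and compare them with the committed ones. This is an alternative illustration using only the Python standard library, not the check the workflow performs; `find_stale_files` and both directory arguments are hypothetical names.

```python
import filecmp
from pathlib import Path


def find_stale_files(committed_dir, generated_dir):
    """Return the names of YAML files that differ between both directories.

    A file counts as stale if it exists on only one side or if its
    content differs from the freshly generated version.
    """
    committed_dir, generated_dir = Path(committed_dir), Path(generated_dir)
    committed = {p.name for p in committed_dir.glob("*.yaml")}
    generated = {p.name for p in generated_dir.glob("*.yaml")}

    # Files added or removed since the docs were last regenerated.
    stale = committed ^ generated

    for name in committed & generated:
        # shallow=False forces a byte-by-byte comparison of the contents.
        if not filecmp.cmp(committed_dir / name, generated_dir / name, shallow=False):
            stale.add(name)
    return sorted(stale)
```

In CI, a non-empty result would be the signal to fail the build, playing the same role as the `exit 1` step.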

By doing so, we ensure the docs and the implementation stay in sync as long as the CI passes.

Conclusions

After implementing these changes in Scylla's project, the cost of maintaining the reference documentation tends to zero, as long as devs remember to update the comments in the source code. Besides, the docs always match the latest source code version, which reduces the number of complaints and keeps a single source of truth. Finally, autogenerating the reference documentation made the CLI docs easier to follow, since every command now uses the same standard format.