Command-line interface

sGDML provides a fully featured command-line interface (CLI) for all tasks related to force field reconstruction. You can get help for any command with the -h flag:

$ sgdml <command> -h

List of commands



Reconstruct a force field from beginning to end

sgdml all <dataset_file> <n_train> <n_valid> [<n_test>]

Create training tasks

sgdml create <dataset_file> <n_train> <n_valid>

Train models from training tasks

sgdml train <task_dir_or_file> <valid_dataset_file>

Validate models

sgdml validate <model_dir_or_file> <valid_dataset_file>

Select best performing model

sgdml select <model_dir>

Test a model

sgdml test <model_dir_or_file> <test_dataset_file> [<n_test>]

Show details for dataset, task or model file

sgdml show <file>

Purge all caches

sgdml reset
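The individual commands above (create, train, validate, select, test) can be chained to reproduce, step by step, what sgdml all does in one call. The following is a minimal sketch that only assembles the command lines as argument lists and runs nothing. The names task_dir, model_dir and model_file are hypothetical placeholders, since the paths written by each step are not specified here, and the sample sizes 200 and 1000 are purely illustrative. Using the same dataset for training, validation and testing is also an assumption of this sketch.

```python
import shlex


def pipeline_commands(dataset, n_train, n_valid, task_dir, model_dir, model_file):
    """Return the five step-by-step sgdml commands as argv lists.

    Nothing is executed here. task_dir, model_dir and model_file are
    hypothetical placeholders: substitute the paths that the previous
    step actually produced.
    """
    return [
        ["sgdml", "create", dataset, str(n_train), str(n_valid)],
        ["sgdml", "train", task_dir, dataset],
        ["sgdml", "validate", model_dir, dataset],
        ["sgdml", "select", model_dir],
        ["sgdml", "test", model_file, dataset],
    ]


if __name__ == "__main__":
    steps = pipeline_commands(
        "d_ethanol.npz", 200, 1000,
        task_dir="TASK_DIR", model_dir="MODEL_DIR", model_file="MODEL_FILE",
    )
    for argv in steps:
        # shlex.join quotes each argument so the line can be pasted into a shell.
        print(shlex.join(argv))
```

Keeping each command as a list rather than a single string means the same lists could later be passed to subprocess.run without any extra shell quoting.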


Dataset files can also be referenced by their fingerprint instead of their file name. For example, ./d_ethanol.npz is equivalent to ./f03a68c944d70bd7083c951e7f77aaac within the CLI. Because fingerprints are guaranteed to be unique, referencing datasets this way is less error-prone when working with multiple similar datasets.

List of optional arguments

Some commands have optional arguments to configure model/training parameters, file handling or compute resources. Use the -h flag to see which arguments each command supports.

Generic arguments



Print a description of all command line options

-h, --help

Print the sGDML version number and exit


Model/training configuration



Path to a separate validation dataset file

-v <valid_dataset_file>, --validation_dataset <valid_dataset_file>

Path to a separate test dataset file

-t <test_dataset_file>, --test_dataset <test_dataset_file>

Integer list and/or range <start>:[<step>:]<stop> to search for the optimal kernel width sigma

-s <s1> [<s2> ...], --sig <s1> [<s2> ...]

Ignore symmetries in the model (as in the original GDML variant)


Take permutations from existing file (e.g. other dataset or model files; key: perms)

--perms_from <file>

Reconstruct force field without its corresponding potential energy surface


Include the energy constraints in the kernel (CPU-implementation only)


Give up on unfinished tasks (e.g. due to timeouts or crashes) if multiple are trained
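To make the <start>:[<step>:]<stop> syntax of the -s/--sig argument concrete, here is a small sketch of how such tokens could be expanded into a list of candidate sigma values. The helper expand_sigmas is hypothetical and is not part of sGDML. Whether the stop value is included in the range is an assumption here (exposed as the inclusive_stop parameter), so check the actual CLI behavior before relying on it.

```python
def expand_sigmas(tokens, inclusive_stop=True):
    """Expand tokens such as "10" or "10:10:100" into sorted, unique integers.

    Hypothetical helper, not sGDML code. inclusive_stop encodes the
    assumption that <stop> itself belongs to the range.
    """
    values = set()
    for token in tokens:
        parts = [int(p) for p in token.split(":")]
        if len(parts) == 1:
            # A plain integer is a single candidate value.
            values.add(parts[0])
            continue
        if len(parts) == 2:
            # <start>:<stop> with the step omitted, so it defaults to 1.
            start, stop = parts
            step = 1
        elif len(parts) == 3:
            # <start>:<step>:<stop>
            start, step, stop = parts
        else:
            raise ValueError(f"not a valid list entry or range: {token!r}")
        if step <= 0:
            raise ValueError(f"step must be positive: {token!r}")
        end = stop + 1 if inclusive_stop else stop
        values.update(range(start, end, step))
    return sorted(values)
```

Under these assumptions, expand_sigmas(["2", "10:10:50"]) would yield the candidates 2, 10, 20, 30, 40 and 50, which corresponds to passing -s 2 10:10:50 on the command line.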


File management



Specify a custom training directory

--task_dir <task_dir>

Specify a custom model output file

--model_file <model_file>

Overwrite outputs/allow changing existing files

-o, --overwrite

Compute resource management



Limit memory usage (whenever possible) [GB]

-m <max_memory>, --max_memory <max_memory>

Limit number of processes

-p <max_processes>, --max_processes <max_processes>

Use CPU implementation (no PyTorch dependency)
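When the CLI is driven from a script, the resource limits listed above can be appended to an existing command line. The sketch below uses only the -m and -p flags from this section. The helper with_resource_limits is hypothetical, and whether a particular command accepts these flags should be confirmed with its -h output.

```python
def with_resource_limits(argv, max_memory=None, max_processes=None):
    """Return a copy of argv with -m/-p appended for the limits that are set.

    Hypothetical helper, not sGDML code. max_memory is given in GB,
    matching the unit stated for -m/--max_memory.
    """
    limited = list(argv)  # copy, so the caller's list is left untouched
    if max_memory is not None:
        limited += ["-m", str(max_memory)]
    if max_processes is not None:
        limited += ["-p", str(max_processes)]
    return limited


if __name__ == "__main__":
    base = ["sgdml", "all", "d_ethanol.npz", "200", "1000"]
    print(" ".join(with_resource_limits(base, max_memory=8, max_processes=4)))
```

The dataset name and the sample sizes in the usage lines are illustrative values only.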