Command-line interface

sGDML provides a fully featured command-line interface (CLI) for all tasks related to force field reconstruction. You can get help for any command with the -h flag:

$ sgdml <command> -h

List of commands

Task	Command
Reconstruct a force field from beginning to end	`sgdml all <dataset_file> <n_train> <n_valid> [<n_test>]`
Create training tasks	`sgdml create <dataset_file> <n_train> <n_valid>`
Train models from training tasks	`sgdml train <task_dir_or_file> <valid_dataset_file>`
Validate models	`sgdml validate <model_dir_or_file> <valid_dataset_file>`
Select best performing model	`sgdml select <model_dir>`
Test a model	`sgdml test <model_dir_or_file> <test_dataset_file> [<n_test>]`
Show details for dataset, task or model file	`sgdml show <file>`
Purge all caches	`sgdml reset`
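The individual commands in the table can be chained by hand to approximate what `sgdml all` does in one go. The following is a minimal dry-run sketch, not the tool's internal procedure: it prints each call instead of executing it. The sample sizes and the `TASK_DIR`, `MODEL_DIR` and `MODEL_FILE` paths are hypothetical placeholders; in a real run, substitute the paths that each step reports.

```shell
#!/bin/sh
# Dry-run sketch: print each sgdml call instead of executing it.
# Set DRY_RUN=0 to actually run the commands (requires sGDML to be installed).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Assumed example values (placeholders, not defaults of sGDML).
DATASET="d_ethanol.npz"
N_TRAIN=200
N_VALID=1000
N_TEST=500
TASK_DIR="tasks_ethanol"          # hypothetical: directory written by `create`
MODEL_DIR="models_ethanol"        # hypothetical: directory written by `train`
MODEL_FILE="model_ethanol.npz"    # hypothetical: file written by `select`

pipeline() {
    run sgdml create "$DATASET" "$N_TRAIN" "$N_VALID"
    run sgdml train "$TASK_DIR" "$DATASET"
    run sgdml validate "$MODEL_DIR" "$DATASET"
    run sgdml select "$MODEL_DIR"
    run sgdml test "$MODEL_FILE" "$DATASET" "$N_TEST"
}

pipeline
```

Here the same dataset file serves for training, validation and testing; separate files can be passed in the corresponding positions instead.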

Tip

Dataset files can also be referenced by their fingerprint instead of their file name, e.g. ./d_ethanol.npz is equivalent to ./f03a68c944d70bd7083c951e7f77aaac within the CLI. Since fingerprints are guaranteed to be unique, this practice is less prone to user input error when dealing with multiple similar datasets.

List of optional arguments

Some commands accept optional arguments to configure model/training parameters, file handling, or compute resources. Use the -h flag to see which arguments each command supports.

Generic arguments

Option	Argument
Print a description of all command line options	`-h`, `--help`
Print the sGDML version number and exit	`--version`

Model/training configuration

Option	Argument
Path to a separate validation dataset file	`-v <valid_dataset_file>`, `--validation_dataset <valid_dataset_file>`
Path to a separate test dataset file	`-t <test_dataset_file>`, `--test_dataset <test_dataset_file>`
Integer list and/or range `<start>:[<step>:]<stop>` to search for the optimal kernel width sigma	`-s <s1> [<s2> ...]`, `--sig <s1> [<s2> ...]`
Ignore symmetries in the model (as in the original GDML variant)	`--gdml`
Take permutations from existing file (e.g. other `dataset` or `model` files; key: `perms`)	`--perms_from <file>`
Reconstruct the force field without its corresponding potential energy surface	`--no_E`
Include the energy constraints in the kernel (CPU implementation only)	`--E_cstr`
Give up on unfinished tasks (e.g. due to timeouts or crashes) when multiple tasks are trained	`--lazy`
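To make the `<start>:[<step>:]<stop>` notation for `-s` concrete, the sketch below expands such a specification into the list of sigma values it stands for. The helper `expand_sigma` is hypothetical and not part of sGDML; it assumes that the step defaults to 1 and that the stop value is included in the range, which is worth confirming with a short test run. The dataset name and sample sizes are example values.

```shell
#!/bin/sh
# Hypothetical helper (not part of sGDML): expand "<start>:[<step>:]<stop>"
# into a space-separated list of integers.
# Assumptions: step defaults to 1, and <stop> is included in the range.
expand_sigma() {
    spec="$1"
    case "$spec" in
        *:*:*)  # start:step:stop
            start="${spec%%:*}"
            rest="${spec#*:}"
            step="${rest%%:*}"
            stop="${rest#*:}"
            ;;
        *:*)    # start:stop
            start="${spec%%:*}"
            step=1
            stop="${spec#*:}"
            ;;
        *)      # single integer
            echo "$spec"
            return 0
            ;;
    esac

    # Reject a non-positive step, which would never reach <stop>.
    [ "$step" -gt 0 ] || return 1

    out=""
    i="$start"
    while [ "$i" -le "$stop" ]; do
        out="${out:+$out }$i"
        i=$((i + step))
    done
    echo "$out"
}

# Example call and the sigma values the range is assumed to cover.
echo "sgdml all d_ethanol.npz 200 1000 -s 10:10:30"
echo "sigmas searched (assumed): $(expand_sigma 10:10:30)"
```

Since `-s` takes a variable number of values, the example places it after the positional arguments so the two cannot be confused.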

File management

Option	Argument
Specify a custom training directory	`--task_dir <task_dir>`
Specify a custom model output file	`--model_file <model_file>`
Overwrite outputs/allow changing existing files	`-o`, `--overwrite`
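As an illustration of the file management options, the sketch below assembles a `create` call with a custom training directory and a `select` call with a custom model output file, and adds `--overwrite` only when that file already exists. It only prints the commands. All paths and sample sizes are hypothetical, and which sub-command accepts which option is an assumption here that should be checked with `-h`.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them.
TASK_DIR="./tasks_ethanol"          # hypothetical custom training directory
MODEL_DIR="./models_ethanol"        # hypothetical directory with trained models
MODEL_FILE="./ethanol_model.npz"    # hypothetical custom model output file

# Assumption: `create` accepts --task_dir and `select` accepts --model_file.
CREATE_CMD="sgdml create d_ethanol.npz 200 1000 --task_dir $TASK_DIR"
SELECT_CMD="sgdml select $MODEL_DIR --model_file $MODEL_FILE"

# Request overwriting only when the target model file already exists.
if [ -e "$MODEL_FILE" ]; then
    SELECT_CMD="$SELECT_CMD --overwrite"
fi

echo "$CREATE_CMD"
echo "$SELECT_CMD"
```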

Compute resource management

Option	Argument
Limit memory usage (whenever possible) [GB]	`-m <max_memory>`, `--max_memory <max_memory>`
Limit number of processes	`-p <max_processes>`, `--max_processes <max_processes>`
Use CPU implementation (no PyTorch dependency)	`--cpu`
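One possible way to choose the resource limits is to derive the process count from the machine and set a fixed memory cap. The sketch below only prints the resulting commands. Leaving one core free and the 8 GB cap are arbitrary example choices, not recommendations from sGDML, and the dataset name and sample sizes are placeholders.

```shell
#!/bin/sh
# Number of online CPU cores; fall back to 1 if getconf cannot report it.
N_CORES="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

# Example policy (an assumption): leave one core free for other work.
if [ "$N_CORES" -gt 1 ]; then
    MAX_PROC=$((N_CORES - 1))
else
    MAX_PROC=1
fi

MAX_MEM_GB=8   # hypothetical memory cap in GB

CMD="sgdml all d_ethanol.npz 200 1000 -p $MAX_PROC -m $MAX_MEM_GB"

echo "$CMD"          # default implementation
echo "$CMD --cpu"    # CPU implementation, no PyTorch dependency
```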