Command-line interface

sGDML provides a fully featured command-line interface (CLI) for all tasks related to force field reconstruction. You can get help for any command with the -h flag:

$ sgdml <command> -h

List of commands

Task                                              Command
Reconstruct a force field from beginning to end   sgdml all <dataset_file> <n_train> <n_valid> [<n_test>]
Create training tasks                             sgdml create <dataset_file> <n_train> <n_valid>
Train models from training tasks                  sgdml train <task_dir_or_file> <valid_dataset_file>
Validate models                                   sgdml validate <model_dir_or_file> <valid_dataset_file>
Select best performing model                      sgdml select <model_dir>
Test a model                                      sgdml test <model_dir_or_file> <test_dataset_file> [<n_test>]
Show details for dataset, task or model file      sgdml show <file>
Purge all caches                                  sgdml reset
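
As a rough illustration of the usage patterns listed above, the sketch below assembles an end-to-end sgdml all call from its positional arguments. The dataset name d_ethanol.npz and the point counts (200 training, 1000 validation, 5000 test) are illustrative assumptions, not recommendations. The command is only printed, so the snippet runs without sGDML installed.

```shell
# Illustrative values (assumptions): adjust to your own dataset.
dataset=d_ethanol.npz
n_train=200
n_valid=1000
n_test=5000   # optional argument; may be omitted

# Assemble the command from the usage pattern
#   sgdml all <dataset_file> <n_train> <n_valid> [<n_test>]
cmd="sgdml all $dataset $n_train $n_valid $n_test"

# Print the command for inspection; run it with: eval "$cmd"
echo "$cmd"
```

The other commands in the list follow the same pattern, with the task, model or dataset paths taken from the preceding step.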

Tip

Dataset files can also be referenced by their fingerprint instead of their file name, e.g. ./d_ethanol.npz is equivalent to ./f03a68c944d70bd7083c951e7f77aaac within the CLI. Since fingerprints are guaranteed to be unique, this practice is less prone to user input error when dealing with multiple similar datasets.
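
To make the equivalence concrete, the following sketch prints two sgdml show calls, one per reference style. Both references are copied from the tip above; pairing them with the show command is an illustrative choice. Nothing is executed, only printed.

```shell
# Both references come from the tip above and should name the same dataset.
by_name=./d_ethanol.npz
by_fingerprint=./f03a68c944d70bd7083c951e7f77aaac

# Build the two equivalent calls; remove the echo to actually run them.
shown=$(for ref in "$by_name" "$by_fingerprint"; do
    echo "sgdml show $ref"
done)

printf '%s\n' "$shown"
```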

List of optional arguments

Some commands accept optional arguments that configure model and training parameters, control file handling, or manage compute resources. Use the -h flag to see which arguments a given command supports.

Generic arguments

Description                                       Argument
Print a description of all command-line options   -h, --help
Print the sGDML version number and exit           --version

Model/training configuration

Description                                                                                      Argument
Path to a separate validation dataset file                                                       -v <valid_dataset_file>, --validation_dataset <valid_dataset_file>
Path to a separate test dataset file                                                             -t <test_dataset_file>, --test_dataset <test_dataset_file>
Integer list and/or range <start>:[<step>:]<stop> to search for the optimal kernel width sigma   -s <s1> [<s2> ...], --sig <s1> [<s2> ...]
Ignore symmetries in the model (as in the original GDML variant)                                 --gdml
Reconstruct the force field without its corresponding potential energy surface                   --no_E
Include the energy constraints in the kernel (CPU implementation only)                           --E_cstr
Give up on unfinished tasks (e.g. due to timeouts or crashes) if multiple tasks are trained      --lazy
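
The sketch below shows how some of these options might be appended to a call. It assumes that sgdml all accepts -s and --gdml (confirm with sgdml all -h); the dataset name, point counts and sigma values are illustrative. Whether the stop value of a range is itself included in the search is not stated on this page, so the example makes no claim about it. The commands are printed rather than run.

```shell
# Illustrative positional arguments (assumptions).
base="sgdml all d_ethanol.npz 200 1000 5000"

# Explicit list of kernel widths: -s <s1> [<s2> ...]
cmd_list="$base -s 10 20 30"

# Range form <start>:<step>:<stop>, here start 10, step 10, stop 100
cmd_range="$base -s 10:10:100"

# Original GDML variant, i.e. symmetries are ignored
cmd_gdml="$base --gdml"

# Print all three for inspection; run one with e.g.: eval "$cmd_range"
printf '%s\n' "$cmd_list" "$cmd_range" "$cmd_gdml"
```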

File management

Description                                       Argument
Specify a custom training directory               --task_dir <task_dir>
Specify a custom model output file                --model_file <model_file>
Overwrite outputs/allow changing existing files   -o, --overwrite
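
A minimal sketch of the file management options follows. The paths ./my_tasks and ./my_model.npz are hypothetical names chosen for illustration, and the example assumes that sgdml all accepts all three options together; check sgdml all -h before relying on it. The command is printed, not executed.

```shell
# Hypothetical output locations (assumptions, not sGDML defaults).
task_dir=./my_tasks
model_file=./my_model.npz

# Redirect task and model output and allow existing files to be overwritten.
cmd="sgdml all d_ethanol.npz 200 1000 --task_dir $task_dir --model_file $model_file -o"

# Print the command for inspection; run it with: eval "$cmd"
echo "$cmd"
```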

Compute resource management

Description                                       Argument
Limit memory usage in GB (whenever possible)      -m <max_memory>, --max_memory <max_memory>
Limit the number of processes                     -p <max_processes>, --max_processes <max_processes>
Use CPU implementation (no PyTorch dependency)    --cpu
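
As an example of combining the resource options, the sketch below caps memory at 8 GB and the process count at 4 while selecting the CPU implementation. The limits, dataset name and point counts are illustrative assumptions, and support for these options on sgdml all should be confirmed with sgdml all -h. The command is printed rather than run.

```shell
# Illustrative limits (assumptions): tune to the machine at hand.
max_memory=8      # in GB, as stated in the table above
max_processes=4

cmd="sgdml all d_ethanol.npz 200 1000 5000 -m $max_memory -p $max_processes --cpu"

# Print the command for inspection; run it with: eval "$cmd"
echo "$cmd"
```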