This document provides instructions on how to use the polymon command-line interface (CLI).
The polymon CLI has three main modes:
Train: Train a machine learning or deep learning model.
Merge: Merge two datasets.
Predict: Predict labels for a given dataset.
Train
This command is used to train a model.
Usage:
polymon train [OPTIONS]
Arguments:
| Argument | Type | Default | Description |
|---|---|---|---|
| | str | | Path to the raw csv file. |
| | str (multiple) | | Sources to use for training. |
| | str | | Tag to use for training. |
| | str (multiple) | Required | Labels to use for training. |
| | str (multiple) | | Feature names to use for training. |
| | int | | Number of trials to run for hyperparameter optimization. |
| | str | | Path to the output directory. |
| | str | | Path to the hparams file. Allowed formats: .json, .pt, .pkl. |
| | int | | Number of folds to use for cross-validation. |
| | str | | Mode to split the data into training, validation, and test sets. |
| | int | | Seed to use for training. |
| | bool | | Whether to remove hydrogens from the molecules. |
| | str (multiple) | | Descriptors to use for training. For ML models, this must be specified. |
| | str | | Model to use for training. |
| | int | | Hidden dimension of the model. |
| | int | | Number of layers of the model. |
| | int | | Batch size to use for training. |
| | float | | Learning rate to use for training. |
| | int | | Number of epochs to use for training. |
| | int | | Number of epochs to wait before early stopping. |
| | str | | Device to use for training. |
| | bool | | Whether to run the training in production mode, which means train:val:test splits will be forced to 0.95:0.05:0.0. |
| | bool | | Whether to finetune the model. |
| | str | | Path to the csv file to finetune the model on. |
| | str | | Path to the pretrained model. |
| | int | | Number of estimators to use for training. |
| | str (multiple) | | Additional features to use for training. |
| | bool | | Whether to skip the training step. |
| | str | | Path to the low fidelity model. |
| | str | | Name of the estimator to give base predictions. |
| | str | | Name of the embedding model for base graph embeddings. |
| | str | | Type of ensemble to use for training. |
| | bool | | Whether to train the residual of the model. |
| | str | | Type of normalizer to use for training. Choices: |
| | bool | | Whether to use data augmentation. |
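
The production-mode option described above forces the train:val:test split to 0.95:0.05:0.0. The following Python sketch only illustrates what those ratios mean in row counts for a dataset of a given size. The function name `production_split_sizes` and the rounding rule (validation and test counts rounded down, training set takes the remainder) are assumptions made for this example and may not match how polymon itself computes the split.

```python
def production_split_sizes(n_rows, ratios=(0.95, 0.05, 0.0)):
    """Return (train, val, test) row counts for the given split ratios.

    Assumption: validation and test counts are rounded down and the
    training set takes the remainder, so the counts always sum to n_rows.
    polymon's own rounding may differ.
    """
    _, val_ratio, test_ratio = ratios
    n_val = int(n_rows * val_ratio)
    n_test = int(n_rows * test_ratio)
    n_train = n_rows - n_val - n_test
    return n_train, n_val, n_test


# With 1000 rows, production mode leaves no held-out test rows.
sizes = production_split_sizes(1000)
print(sizes)  # (950, 50, 0)
```

Because the test fraction is 0.0, a model trained in production mode has no held-out test set to report metrics on; evaluation would have to come from an earlier, non-production run.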
Merge
This command is used to merge two datasets.
Usage:
polymon merge [OPTIONS]
Arguments:
| Argument | Type | Default | Description |
|---|---|---|---|
| | str (multiple) | Required | Sources to merge. |
| | str | Required | Label to merge. |
| | str | Required | Path to the hparams file. |
| | str | Required | Acquisition function to use for merging. Choices: |
| | int | | Sample size to use for merging. |
| | float | | Uncertainty threshold to use for merging. |
| | float | | Difference threshold to use for merging. |
| | int | | Target size to use for merging. |
| | str | | Path to the base csv file. |
Predict
This command is used to predict labels for a given dataset.
Usage:
polymon predict [OPTIONS]
Arguments:
| Argument | Type | Default | Description |
|---|---|---|---|
| | str | Required | Path to the model. |
| | str | Required | Path to the csv file. |
| | str | Required | Name of the smiles column. |
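
The predict command needs a csv file and the name of the column that holds the SMILES strings. The Python sketch below shows one way to prepare such a file and check that the column is present before running a prediction. The column name `SMILES`, the helper names, and the two example SMILES strings are hypothetical and chosen only for illustration; the exact SMILES conventions polymon expects are not specified here.

```python
import csv
import io

SMILES_COLUMN = "SMILES"  # hypothetical column name, chosen for this example

# Illustrative rows only; replace with your own structures.
rows = [
    {SMILES_COLUMN: "*CC*"},
    {SMILES_COLUMN: "*CC(*)c1ccccc1"},
]


def write_smiles_csv(handle, rows, column):
    """Write rows to a csv handle with a single SMILES column and a header."""
    writer = csv.DictWriter(handle, fieldnames=[column])
    writer.writeheader()
    writer.writerows(rows)


def read_smiles_column(handle, column):
    """Read the named column back, failing early if the header lacks it."""
    reader = csv.DictReader(handle)
    if column not in (reader.fieldnames or []):
        raise KeyError(f"column {column!r} not found in csv header")
    return [row[column] for row in reader]


# An in-memory buffer stands in for a file on disk.
buffer = io.StringIO()
write_smiles_csv(buffer, rows, SMILES_COLUMN)
buffer.seek(0)
smiles = read_smiles_column(buffer, SMILES_COLUMN)
print(smiles)
```

Checking the header up front gives a clear error when the column name passed to the command does not match the file.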