4. Input script

The input file of AESP is a JSON file composed of two main parts: the dflow configuration (dflow_config, dflow_s3_config) and the configuration of the AESP algorithms (aesp_config).

Every aesp command requires the dflow global configuration. It is generally set as follows:

"dflow_config" : {
    "mode" : "debug"
},
"dflow_s3_config" : {}

aesp simply passes all keys of “dflow_config” to dflow.config and all keys of “dflow_s3_config” to dflow.s3_config.
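
The hand-off described above can be pictured with the minimal Python sketch below. It assumes the input script is parsed with the standard json module. The two plain dictionaries are stand-ins for dflow.config and dflow.s3_config so that the snippet runs without dflow installed, and apply_dflow_settings is a hypothetical helper name, not part of aesp.

```python
import json

# Stand-ins for dflow.config and dflow.s3_config
# (assumption: both behave like ordinary dictionaries).
dflow_config = {}
dflow_s3_config = {}


def apply_dflow_settings(input_script):
    """Copy every key of the two dflow sections into the matching settings dict."""
    dflow_config.update(input_script.get("dflow_config", {}))
    dflow_s3_config.update(input_script.get("dflow_s3_config", {}))


# The same two sections as in the example above, written as a JSON document.
input_script = json.loads(
    '{"dflow_config": {"mode": "debug"}, "dflow_s3_config": {}}'
)
apply_dflow_settings(input_script)
```

No key is renamed or filtered in this sketch, which mirrors the "pass all keys" behaviour described in the text.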

aesp_config has two main modes. The first is the conventional genetic-algorithm structure prediction process, which evaluates structures with empirical potentials, quantum mechanical calculations, or trained machine learning potentials. The second combines active learning with machine learning potentials to accelerate structure prediction.

4.1. Standard mode

"aesp_config" : {
    "mode": "std-csp"
}

First, set the population and generation parameters:

"aesp_config" : {
    "opt_params" : {
        "generation" : {
            "gen_size" : 50,
            "adaptive" : {
                "type" : "rca",
                "size_change_ratio" : 0.5
            }
        },
        "population" : {
            "pop_size" : 40,
            "adaptive" : {
                "type" : "rca",
                "size_change_ratio" : 0.5
            }
        }
    }
}

The generation and population parameters are essentially the same, but the population size should preferably be kept larger than the generation size. The adaptive block sets the self-adaptation scheme, for which two types are available. Both types share the parameter size_change_ratio, the ratio by which the size may change; for example, with a size of 50 and a size_change_ratio of 0.5, the size may vary within [50*(1-0.5), 50*(1+0.5)], that is, [25, 75].
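
The range implied by size_change_ratio can be checked with a short Python sketch. The function name size_range is hypothetical and only reproduces the arithmetic stated above; how aesp picks a size inside this range is not covered here.

```python
def size_range(size, size_change_ratio):
    """Return the (lower, upper) bounds implied by size_change_ratio."""
    lower = size * (1 - size_change_ratio)
    upper = size * (1 + size_change_ratio)
    return lower, upper


# Bounds for the example in the text: size 50, size_change_ratio 0.5.
lower, upper = size_range(50, 0.5)
```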

Next are the operator parameters:

"aesp_config" : {
    "opt_params" : {
        "type" : "std",
        "operator" : {
            "type" : "bulk",
            "generator" : {
                "prob" : 0.4,
                "random_gen_prob" : 1,
                "random_gen_params" : {
                    "composition" : {"B": 1, "C": 3},
                    "_spgnum" : ["1-230"],
                    "factor" : 1.1,
                    "_thickness" : 2,
                    "max_count" : 50
                }
            },
            "crossover" : {
                "prob" : 0.3,
                "plane_cross_prob" : 0.333,
                "sphere_cross_prob" : 0.333,
                "cylinder_cross_prob" : 0.334,
                "plane_cross_params" : {
                    "stddev" : 0.1,
                    "max_count" : 5
                },
                "sphere_cross_params" : {
                    "max_count" : 5
                },
                "cylinder_cross_params" : {
                    "max_count" : 5
                }
            },
            "mutation" : {
                "prob" : 0.3,
                "continuous_mut_factor" : 2,
                "strain_mut_prob" : 0.333,
                "permutation_mut_prob" : 0.333,
                "ripple_mut_prob" : 0.334,
                "strain_mut_params" : {
                    "stddev" : 0.1,
                    "max_count" : 5
                },
                "permutation_mut_params" : {
                    "max_count" : 5
                },
                "ripple_mut_params" : {
                    "max_count" : 5,
                    "rho" : 0.3,
                    "miu" : 2,
                    "eta" : 1
                }
            },
            "adaptive" : {
                "type": "adjustment",
                "use_recent_gen" : 2
            },
            "hard_constrains" : {
                "alpha" : [30, 150],
                "beta" : [30, 150],
                "gamma" : [30, 150],
                "chi" : [0, 180],
                "psi" : [0, 180],
                "phi" : [0, 180],
                "a" : [0, 100],
                "b" : [0, 100],
                "c" : [0, 100],
                "tol_matrix" : {
                    "_tuples" : [["Cl", "Na", 12], ["Cl", "Cl", 12], ["Na", "Na", 12]],
                    "prototype" : "atomic",
                    "factor" : 1.0
                }
            }
        }
    }
}

The operator type depends on the system and is either bulk or cluster. The operator consists of three components: generator, mutation and crossover. The adaptive block that follows them sets the adaptive parameters of these three operators; its type has two options. hard_constrains specifies constraints on the structures generated by each operator: limits on the lattice angles (including dihedral angles) and on the lattice constants, while tol_matrix sets constraints on the interatomic distances. The prob in generator, mutation and crossover is the probability of choosing that operator, and the three probabilities sum to 1. Each operator in turn has several modes of operation, whose probabilities are xxx_gen_prob, xxx_mut_prob and xxx_cross_prob; within each operator these sum to 1. xxx_xxx_params holds the parameters of the corresponding operation.
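
The sum-to-1 rules above are easy to check before a run. The Python sketch below is an illustration only: prob_sums is a hypothetical helper, not an aesp function, and it assumes that the mode probabilities can be recognised by the key suffixes _gen_prob, _cross_prob and _mut_prob, as in the example configuration.

```python
import math


def prob_sums(operator):
    """Return every probability sum that the text says should equal 1."""
    suffixes = {
        "generator": "_gen_prob",
        "crossover": "_cross_prob",
        "mutation": "_mut_prob",
    }
    # Probability of choosing each of the three operators.
    sums = {"operators": sum(operator[name]["prob"] for name in suffixes)}
    # Probabilities of the modes inside each operator.
    for name, suffix in suffixes.items():
        sums[name] = sum(
            value for key, value in operator[name].items() if key.endswith(suffix)
        )
    return sums


# Probabilities copied from the example configuration above.
operator = {
    "generator": {"prob": 0.4, "random_gen_prob": 1},
    "crossover": {
        "prob": 0.3,
        "plane_cross_prob": 0.333,
        "sphere_cross_prob": 0.333,
        "cylinder_cross_prob": 0.334,
    },
    "mutation": {
        "prob": 0.3,
        "strain_mut_prob": 0.333,
        "permutation_mut_prob": 0.333,
        "ripple_mut_prob": 0.334,
    },
}

sums = prob_sums(operator)
all_ok = all(math.isclose(total, 1.0, abs_tol=1e-6) for total in sums.values())
```

A tolerance is used instead of an exact comparison because values such as 0.333 + 0.333 + 0.334 are subject to floating-point rounding.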

We also need to define the convergence conditions of the algorithm through cvg_criterion:

"aesp_config" : {
    "opt_params" : {
        "cvg_criterion" : {
            "max_gen_num" : 10,
            "continuous_opt_num" : null
        }
    }
}
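
The two keys are not described further in this section. The Python sketch below shows one plausible reading, inferred from the key names only and not from documented aesp behaviour: max_gen_num is taken as an upper limit on the number of generations, and continuous_opt_num as the number of consecutive generations without improvement after which the search stops, with null disabling that test. The function should_stop and the list best_per_gen are hypothetical.

```python
def should_stop(best_per_gen, max_gen_num, continuous_opt_num=None):
    """Decide whether to stop after the generations recorded in best_per_gen.

    best_per_gen holds the best (lowest) fitness value of each finished generation.
    """
    # Assumed meaning of max_gen_num: upper limit on the number of generations.
    if len(best_per_gen) >= max_gen_num:
        return True
    # Assumed meaning of continuous_opt_num: stop once the best value has not
    # improved during this many consecutive generations; None disables the test.
    n = continuous_opt_num
    if n is not None and len(best_per_gen) > n:
        return min(best_per_gen[-n:]) >= min(best_per_gen[:-n])
    return False
```

With the values in the example above (max_gen_num of 10, continuous_opt_num of null), only the generation limit would apply under this reading.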

We also need to specify how each structure is calculated.

The execution units of aesp are dflow Steps. How each step is executed is defined by “step_configs”.

"aesp_config" : {
    "step_configs" : {}
}

Configs can be given for each of the following steps: prepare training, run training, prepare exploration, run exploration, prepare fp, run fp, select configurations, collect data and concurrent learning.

Any config in “step_configs” can be omitted. An omitted step config is set to the default step config, which is provided by the “default_step_config” section, for example:

"aesp_config" : {
    "default_step_config" : {
        "template_slice_config" : {
            "group_size": 8,
            "pool_size" : 1
        },
        "executor" : {
            "type" : "dispatcher",
            "host" : "127.0.0.1",
            "image_pull_policy" : "IfNotPresent",
            "username" : "clqin",
            "password" : "clqin",
            "machine_dict" : {
                "batch_type" : "Shell",
                "context_type" : "local",
                "local_root" : "./",
                "remote_root" : "/home/zhao/work"
            },
            "resources_dict" : {
                "cpu_per_node" : 8,
                "gpu_per_node" : 1,
                "group_size" : 1
            }
        }
    }
}

“default_step_config” is written in the same way as any step config in “step_configs”.
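
The fallback rule can be summarised in a few lines of Python. This is a sketch of the rule as stated in this section, not aesp's actual implementation: resolve_step_config is a hypothetical helper, and the step names "example_step_a" and "example_step_b" are placeholders rather than real aesp keys.

```python
def resolve_step_config(aesp_config, step_name):
    """Return the config of a step, falling back to default_step_config."""
    step_configs = aesp_config.get("step_configs", {})
    if step_name in step_configs:
        return step_configs[step_name]
    # Omitted steps use the default step config as a whole.
    return aesp_config["default_step_config"]


aesp_config = {
    "step_configs": {
        # Placeholder step name with its own settings.
        "example_step_a": {"template_slice_config": {"group_size": 2, "pool_size": 1}},
    },
    "default_step_config": {
        "template_slice_config": {"group_size": 8, "pool_size": 1},
    },
}

explicit = resolve_step_config(aesp_config, "example_step_a")
fallback = resolve_step_config(aesp_config, "example_step_b")
```

In this sketch a step that has its own entry uses it unchanged; whether aesp additionally merges individual missing keys from the default is not stated here.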

4.2. Active learning mode

Note

Under test