4. Input script
The input file of AESP is a JSON file with two main parts: the dflow configuration (dflow_config, dflow_s3_config) and the configuration of the AESP algorithms (aesp_config).
All aesp commands require the dflow global configuration. It is generally set as follows:
"dflow_config" : {
    "mode" : "debug"
},
"dflow_s3_config" : {}
aesp simply passes all keys of “dflow_config” to dflow.config and all keys of “dflow_s3_config” to dflow.s3_config.
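The key-by-key pass-through described above can be sketched as follows. This is only an illustration, not the actual aesp code: the helper name apply_dflow_settings is hypothetical, and plain dictionaries stand in for dflow.config and dflow.s3_config so that the example runs without dflow installed.

```python
import json


def apply_dflow_settings(settings: dict, config: dict, s3_config: dict) -> None:
    """Copy every key of the two dflow sections into the target mappings."""
    # Hypothetical helper: in real use the targets would be dflow.config
    # and dflow.s3_config, as named in the text above.
    config.update(settings.get("dflow_config", {}))
    s3_config.update(settings.get("dflow_s3_config", {}))


# Stand-ins (assumption) for dflow.config and dflow.s3_config.
fake_config, fake_s3_config = {}, {}

settings = json.loads('{"dflow_config": {"mode": "debug"}, "dflow_s3_config": {}}')
apply_dflow_settings(settings, fake_config, fake_s3_config)
print(fake_config)
```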
aesp_config has two main modes. The first is the conventional genetic-algorithm structure prediction process, which uses empirical potentials, quantum mechanical calculations, and trained machine learning potentials. The second combines active learning with machine learning potentials to speed up structure prediction.
4.1. Standard mode
"aesp_config" : {
    "mode": "std-csp"
}
First, set the population and generation parameters:
"aesp_config" : {
    "opt_params" : {
        "generation" : {
            "gen_size" : 50,
            "adaptive" : {
                "type" : "rca",
                "size_change_ratio" : 0.5
            }
        },
        "population" : {
            "pop_size" : 40,
            "adaptive" : {
                "type" : "rca",
                "size_change_ratio" : 0.5
            }
        }
    }
}
The parameters of generation and population are essentially the same, but the population size should be kept larger than the generation size. adaptive holds the self-adaptation settings, for which two types are available. Both types share the parameter size_change_ratio, which sets how far the size may change: for example, if the size is 50 and size_change_ratio is 0.5, the size varies within [50*(1-0.5), 50*(1+0.5)], i.e. [25, 75].
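The range arithmetic above can be checked with a few lines of Python. The function name size_bounds is illustrative only and is not part of aesp.

```python
def size_bounds(size, size_change_ratio):
    """Return the (lower, upper) limits of an adaptively changing size."""
    lower = size * (1 - size_change_ratio)
    upper = size * (1 + size_change_ratio)
    return lower, upper


# Values from the example in the text: size 50, size_change_ratio 0.5.
lower, upper = size_bounds(50, 0.5)
print(lower, upper)  # 25.0 75.0
```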
Next are the operator parameters:
"aesp_config" : {
    "opt_params" : {
        "type" : "std",
        "operator" : {
            "type" : "bulk",
            "generator" : {
                "prob" : 0.4,
                "random_gen_prob" : 1,
                "random_gen_params" : {
                    "composition" : {"B": 1, "C": 3},
                    "_spgnum" : ["1-230"],
                    "factor" : 1.1,
                    "_thickness" : 2,
                    "max_count" : 50
                }
            },
            "crossover" : {
                "prob" : 0.3,
                "plane_cross_prob" : 0.333,
                "sphere_cross_prob" : 0.333,
                "cylinder_cross_prob" : 0.334,
                "plane_cross_params" : {
                    "stddev" : 0.1,
                    "max_count" : 5
                },
                "sphere_cross_params" : {
                    "max_count" : 5
                },
                "cylinder_cross_params" : {
                    "max_count" : 5
                }
            },
            "mutation" : {
                "prob" : 0.3,
                "continuous_mut_factor" : 2,
                "strain_mut_prob" : 0.333,
                "permutation_mut_prob" : 0.333,
                "ripple_mut_prob" : 0.334,
                "strain_mut_params" : {
                    "stddev" : 0.1,
                    "max_count" : 5
                },
                "permutation_mut_params" : {
                    "max_count" : 5
                },
                "ripple_mut_params" : {
                    "max_count" : 5,
                    "rho" : 0.3,
                    "miu" : 2,
                    "eta" : 1
                }
            },
            "adaptive" : {
                "type": "adjustment",
                "use_recent_gen" : 2
            },
            "hard_constrains" : {
                "alpha" : [30, 150],
                "beta" : [30, 150],
                "gamma" : [30, 150],
                "chi" : [0, 180],
                "psi" : [0, 180],
                "phi" : [0, 180],
                "a" : [0, 100],
                "b" : [0, 100],
                "c" : [0, 100],
                "tol_matrix" : {
                    "_tuples" : [["Cl", "Na", 12], ["Cl", "Cl", 12], ["Na", "Na", 12]],
                    "prototype" : "atomic",
                    "factor" : 1.0
                }
            }
        }
    }
}
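As a rough illustration of how the two-element ranges in the hard_constrains block above might be read, the sketch below checks a set of cell parameters against them. It assumes that each pair is an inclusive [min, max] interval. That reading and the helper name violated_constraints are assumptions made for this example, not documented aesp behaviour.

```python
def violated_constraints(cell, hard_constrains):
    """Return the names of cell parameters outside their [min, max] range."""
    violated = []
    for name, value in cell.items():
        bounds = hard_constrains.get(name)
        # Only two-element lists are treated as ranges; nested entries
        # such as "tol_matrix" are skipped.
        if isinstance(bounds, list) and len(bounds) == 2:
            low, high = bounds
            if not (low <= value <= high):
                violated.append(name)
    return violated


# Ranges copied from the example configuration.
hard_constrains = {
    "alpha": [30, 150], "beta": [30, 150], "gamma": [30, 150],
    "a": [0, 100], "b": [0, 100], "c": [0, 100],
}
# Made-up cell parameters for demonstration.
cell = {"a": 4.0, "b": 4.0, "c": 120.0, "alpha": 90.0, "beta": 90.0, "gamma": 20.0}
print(violated_constraints(cell, hard_constrains))  # ['c', 'gamma']
```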
The operator type depends on the system and is either bulk or cluster. There are three operators: generator, mutation, and crossover. The adaptive block holds the adaptive parameters of the three operators; its type has two options. hard_constrains sets constraints on the structures generated by each operator: limits on the lattice angles (including dihedral angles) and on the lattice constants, while tol_matrix constrains the interatomic distances. The prob in generator, mutation, and crossover is the probability of choosing that operator, and the three probabilities sum to 1. Each operator has its own modes of operation, selected with the probabilities xxx_gen_prob, xxx_cross_prob, and xxx_mut_prob, which sum to 1 within each operator. xxx_xxx_params holds the parameters of the corresponding mode.
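The sum-to-1 rules can be checked before a run with a short script. The sketch below is not part of aesp. The function names are hypothetical, and it assumes that the mode probabilities are exactly the keys ending in "_prob" inside each operator block, as in the example configuration.

```python
import math

OPERATORS = ("generator", "crossover", "mutation")


def prob_sums(operator):
    """Collect the probability sums that are expected to equal 1."""
    sums = {"operators": sum(operator[name]["prob"] for name in OPERATORS)}
    for name in OPERATORS:
        # Mode probabilities end in "_prob", e.g. "plane_cross_prob".
        # The plain "prob" key does not match this suffix.
        sums[name] = sum(
            value for key, value in operator[name].items() if key.endswith("_prob")
        )
    return sums


def invalid_groups(operator, tol=1e-6):
    """Return the names of the groups whose probabilities do not sum to 1."""
    return [
        group
        for group, total in prob_sums(operator).items()
        if not math.isclose(total, 1.0, abs_tol=tol)
    ]


# Probabilities copied from the example configuration (params omitted).
operator = {
    "generator": {"prob": 0.4, "random_gen_prob": 1},
    "crossover": {"prob": 0.3, "plane_cross_prob": 0.333,
                  "sphere_cross_prob": 0.333, "cylinder_cross_prob": 0.334},
    "mutation": {"prob": 0.3, "continuous_mut_factor": 2,
                 "strain_mut_prob": 0.333, "permutation_mut_prob": 0.333,
                 "ripple_mut_prob": 0.334},
}
print(invalid_groups(operator))  # []
```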
We also need to define the convergence conditions of the algorithm, cvg_criterion:
"aesp_config" : {
    "opt_params" : {
        "cvg_criterion" : {
            "max_gen_num" : 10,
            "continuous_opt_num" : null
        }
    }
}
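The text does not spell out the meaning of the two keys, so the following is only one plausible reading, given as a hypothetical sketch. It assumes that max_gen_num caps the number of generations and that continuous_opt_num, when not null, stops the search once the best value has not improved for that many consecutive generations. The function name and the lower-is-better convention are likewise assumptions.

```python
def has_converged(best_values, max_gen_num, continuous_opt_num=None):
    """Decide whether to stop, given the best value of each generation so far.

    Assumed semantics (not confirmed by the aesp documentation):
    lower values are better, and a null/None continuous_opt_num
    disables the stagnation check.
    """
    if len(best_values) >= max_gen_num:
        return True
    if not continuous_opt_num or len(best_values) <= continuous_opt_num:
        return False
    recent_best = min(best_values[-continuous_opt_num:])
    earlier_best = min(best_values[:-continuous_opt_num])
    # Converged if the recent generations brought no improvement.
    return recent_best >= earlier_best


print(has_converged([-1.0, -2.0, -2.0, -2.0], max_gen_num=10, continuous_opt_num=2))
```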
We also need to specify how each structure is calculated.
The execution units of aesp are dflow Steps. How each step is executed is defined by “step_configs”.
"aesp_config" : {
    "step_configs" : {}
}
The configs for prepare training, run training, prepare exploration, run exploration, prepare fp, run fp, select configurations, collect data and concurrent learning steps are given correspondingly.
Any of the configs in “step_configs” can be omitted. If so, that step uses the default step config, which is provided by the following section, for example,
"aesp_config" : {
    "default_step_config" : {
        "template_slice_config" : {
            "group_size": 8,
            "pool_size" : 1
        },
        "executor" : {
            "type" : "dispatcher",
            "host" : "127.0.0.1",
            "image_pull_policy" : "IfNotPresent",
            "username" : "clqin",
            "password" : "clqin",
            "machine_dict" : {
                "batch_type" : "Shell",
                "context_type" : "local",
                "local_root" : "./",
                "remote_root" : "/home/zhao/work"
            },
            "resources_dict" : {
                "cpu_per_node" : 8,
                "gpu_per_node" : 1,
                "group_size" : 1
            }
        }
    }
}
The “default_step_config” is written in the same way as any step config in “step_configs”.
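The fallback rule for omitted step configs can be pictured with the sketch below. It is an illustration only: the helper resolve_step_config and the step key "run_fp_config" are hypothetical names chosen for the example, and the real key names used by aesp may differ.

```python
def resolve_step_config(aesp_config, step_name):
    """Return the config of a step, falling back to the default step config."""
    step_configs = aesp_config.get("step_configs", {})
    default = aesp_config.get("default_step_config", {})
    # An omitted step uses the default config as a whole.
    return step_configs.get(step_name, default)


aesp_config = {
    "step_configs": {
        # Hypothetical step key, for illustration only.
        "run_fp_config": {"template_slice_config": {"group_size": 2, "pool_size": 1}}
    },
    "default_step_config": {
        "template_slice_config": {"group_size": 8, "pool_size": 1}
    },
}

explicit = resolve_step_config(aesp_config, "run_fp_config")
fallback = resolve_step_config(aesp_config, "some_other_step")
print(explicit["template_slice_config"]["group_size"])
print(fallback["template_slice_config"]["group_size"])
```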
4.2. Active learning
Note
Under test