Workflows#
Note
This notebook can be downloaded as workflows.ipynb and workflows.md
The WorkChain class is the central piece that enables workflows to be run with aiida-vasp. By composing one or several WorkChain classes, one can make a workflow. A single WorkChain class may launch one or several calculations, or it may launch child WorkChains to achieve the designed functionality.
For any short-running Python code, the workchain can run it directly as a calcfunction, and the provenance will be recorded accordingly.
Note, however, that long-running computational tasks should not be run directly in the code, as this will delay or block the operation of the daemon.
We would like to encourage users to build workchains and/or compose existing ones into more advanced workflows that we can all share and benefit from. You may want to visit this page to learn more about WorkChains and how to build them.
One should note that the advantage of using a provenance-preserving engine like AiiDA is that you do
not have to define a workflow in order to have the calculation steps recorded and stored.
It is perfectly fine to conduct exploration studies using the basic workchains and use calcfunction to link the outputs/inputs together for provenance.
Design principles#
The rest of the bundled workchains are designed to run VaspWorkChain as the basic unit of work.
This means that they expect error-correction functionalities to be embedded in the VaspWorkChain so they
do not need to explicitly handle errors.
We use the expose_inputs and expose_outputs methods of the WorkChain class to expose the inputs and outputs of the VaspWorkChain.
For example, the inputs to the relax workchain look like this:
VaspRelaxWorkChain
|
|- structure (StructureData of the input structure)
|- vasp (exposed VaspWorkChain inputs)
|- static_calc_settings (settings to override for the static calculation)
|- static_calc_options (options to override for the static calculation)
|- static_calc_parameters (parameters to override for the static calculation)
|- relax_settings (settings controlling the relaxation)
|- verbose
Here, the inputs specific to the VaspWorkChain to be launched are nested inside the vasp namespace.
For example, to set the parameters one can do the following:
from aiida.orm import Dict  # Dict is used below, so it must be imported
from aiida.plugins import WorkflowFactory
builder = WorkflowFactory('vasp.v2.relax').get_builder()
builder.vasp.parameters = Dict(dict={'incar': {'encut': 500, 'isif': 2, 'nsw': 5, 'potim': 0.01}})
When using VaspWorkChain directly, one can instead use:
from aiida.plugins import WorkflowFactory
builder = WorkflowFactory('vasp.v2.vasp').get_builder()
builder.parameters = {'incar': {'encut': 500, 'isif': 2, 'nsw': 5, 'potim': 0.01}} # This gets converted to a Dict automatically
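The difference between the two layouts can be pictured with plain dictionaries. The following is only a sketch: it does not use AiiDA at all, the `flatten` helper is hypothetical, and the value given for `relax_settings` is purely illustrative.

```python
# Plain-dict picture of the two input layouts (no AiiDA objects involved).
incar = {'encut': 500, 'isif': 2, 'nsw': 5, 'potim': 0.01}

# Relax workchain: VaspWorkChain inputs live under the 'vasp' namespace.
relax_inputs = {
    'vasp': {'parameters': {'incar': incar}},
    'relax_settings': {'steps': 5},  # illustrative value only
}

# VaspWorkChain used directly: the same port sits at the top level.
vasp_inputs = {'parameters': {'incar': incar}}


def flatten(nested, prefix=''):
    """Return {'a.b.c': value} for every leaf of a nested dict."""
    flat = {}
    for key, value in nested.items():
        path = f'{prefix}{key}'
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f'{path}.'))
        else:
            flat[path] = value
    return flat


relax_flat = flatten(relax_inputs)
vasp_flat = flatten(vasp_inputs)
print(sorted(relax_flat))
```

The same INCAR value appears as `vasp.parameters.incar.encut` in the first layout and as `parameters.incar.encut` in the second.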
The other options at the top level are specific to the workchain and are used to control its behavior.
The relax_settings input is a Dict that contains the settings for the relaxation.
These settings are validated at submission time using the pydantic library.
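To illustrate what validation at submission time buys you, here is a minimal stand-in that uses a standard-library dataclass instead of pydantic. The class name, the three fields chosen and their default values are placeholders for illustration, not the plugin's actual model, and the type check is stricter than what pydantic would do.

```python
from dataclasses import dataclass, fields


@dataclass
class RelaxSettingsSketch:
    """Hypothetical subset of the relax settings; defaults are placeholders."""
    steps: int = 60
    positions: bool = True
    force_cutoff: float = 0.03

    def __post_init__(self):
        # Reject values whose type does not match the field annotation.
        for field in fields(self):
            value = getattr(self, field.name)
            if not isinstance(value, field.type):
                raise TypeError(f'{field.name} must be {field.type.__name__}, got {value!r}')


def validate(settings):
    """Fail early if a key is unknown or a value has the wrong type."""
    return RelaxSettingsSketch(**settings)  # unknown keys raise TypeError


checked = validate({'steps': 10})
print(checked)
```

A misspelled key such as `stepz` or a value such as `'ten'` for `steps` raises an error before anything is submitted, rather than failing later on the cluster.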
To see the available settings, one can use:
from aiida.plugins import WorkflowFactory
opt = WorkflowFactory('vasp.v2.relax').option_class
# opt.<tab> to see all available options
print(opt.aiida_description())
algo: str
Default: The algorithm to use for relaxation
energy_cutoff: Optional
Default: The cut off energy difference when the relaxation is stopped (e.g. EDIFF)
force_cutoff: float
Default: The maximum force when the relaxation is stopped (e.g. EDIFFG)
steps: int
Default: Number of relaxation steps to perform (eg. NSW)
positions: bool
Default: If True, perform relaxation of the atomic positions
shape: bool
Default: If True, perform relaxation of the cell shape
volume: bool
Default: If True, perform relaxation of the cell volume
convergence_on: bool
Default: If True, perform convergence checks within the workchain
convergence_absolute: bool
Default: If True, use absolute values where possible when performing convergence checks
convergence_max_iterations: int
Default: Maximum iterations for convergence checking
convergence_positions: float
Default: The cutoff value for the convergence check on positions in Angstram. A negative value by pass the check.
convergence_volume: float
Default: The cutoff value for the convergence check on volume between the two structures. A negative value by pass the check.
convergence_shape_lengths: float
Default: The cutoff value for the convergence check on the lengths of the unit cell vectors, between input and the outputs structure. A negative value by pass the check.
convergence_shape_angles: float
Default: The cutoff value for the convergence check on the angles of the unit cell vectors, between input and the outputs structure. A negative value by pass the check.
convergence_mode: str
Default: Mode of the convergence check for positions. 'inout' for checking input/output structure, or 'last' to check only the change of the last step.
reuse: bool
Default: Whether reuse the previous calculation by copying over the remote folder
clean_reuse: bool
Default: Whether to perform a final cleaning of the reused calculations
keep_sp_workdir: bool
Default: Whether to keep the workdir of the final singlepoint calculation
perform: bool
Default: Do not perform any relaxation if set to 'False'
hybrid_calc_bootstrap: bool
Default: Whether to bootstrap hybrid calculation by performing standard DFT first
hybrid_calc_bootstrap_wallclock: int
Default: Wall time limit in second for the bootstrap calculation
keep_magnetization: bool
Default: Whether to keep magnetization from the previous calculation if possible
double_relax_mode: bool
Default: Experimental: Run in double relax mode - launch of the sub workflow is only performed up to two times without checking convergence in the end. This is useful for cases where the convergence is difficult due to change of basis set with variable cell and high-throughput studies.
residual_forces_check: bool
Default: Whether to perform residual force check after relaxation to ensure the forces are below threshold
By default, every input to the workchain has to be specified in full before submission, which can be quite tedious for day-to-day calculations.
To simplify the input, we have implemented the VaspInputGenerator class that can automatically update the builder with default values.
See this page for more information.
The user may write default values and store them in a YAML file to ensure consistent settings are used across multiple projects.
P.S. You can also print the input and output ports of the workchain using:
from aiida.plugins import WorkflowFactory
!verdi plugin list aiida.workflows vasp.v2.relax
Description:
Structure relaxation workchain.
Inputs:
relax_settings Dict algo: str
Default: The algorithm to use for relaxation
energy_cutoff: Optional Default:
The cut off energy difference when the relaxation is stopped (e.g. EDIFF)
force_cutoff: float Default: The
maximum force when the relaxation is stopped (e.g. EDIFFG)
steps: int Default: Number of
relaxation steps to perform (eg. NSW) positions:
bool Default: If True, perform
relaxation of the atomic positions shape: bool
Default: If True, perform relaxation of the cell shape
volume: bool Default: If True,
perform relaxation of the cell volume convergence_on:
bool Default: If True, perform
convergence checks within the workchain convergence_absolute:
bool Default: If True, use
absolute values where possible when performing convergence checks
convergence_max_iterations: int
Default: Maximum iterations for convergence checking
convergence_positions: float
Default: The cutoff value for the convergence check on positions in
Angstram. A negative value by pass the check.
convergence_volume: float
Default: The cutoff value for the convergence check on volume between the
two structures. A negative value by pass the check.
convergence_shape_lengths: float
Default: The cutoff value for the convergence check on the lengths of the
unit cell vectors, between input and the outputs structure. A negative
value by pass the check. convergence_shape_angles: float
Default: The cutoff value for the convergence check on the angles of the
unit cell vectors, between input and the outputs structure. A negative
value by pass the check. convergence_mode: str
Default: Mode of the convergence check for positions. 'inout' for checking
input/output structure, or 'last' to check only the change of the last
step. reuse: bool
Default: Whether reuse the previous calculation by copying over the remote
folder clean_reuse: bool
Default: Whether to perform a final cleaning of the reused calculations
keep_sp_workdir: bool Default:
Whether to keep the workdir of the final singlepoint calculation
perform: bool Default: Do not
perform any relaxation if set to 'False' hybrid_calc_bootstrap:
bool Default: Whether to
bootstrap hybrid calculation by performing standard DFT first
hybrid_calc_bootstrap_wallclock: int
Default: Wall time limit in second for the bootstrap calculation
keep_magnetization: bool
Default: Whether to keep magnetization from the previous calculation if
possible double_relax_mode: bool
Default: Experimental: Run in double relax mode - launch of the sub
workflow is only performed up to two times without checking convergence in
the end. This is useful for cases where the convergence is difficult due to
change of basis set with variable cell and high-throughput studies.
residual_forces_check: bool
Default: Whether to perform residual force check after relaxation to ensure
the forces are below threshold
structure StructureData, CifData
vasp Data
metadata
static_calc_options Dict, NoneType The full options Dict to be used in the final static
calculation.
static_calc_parameters Dict, NoneType The parameters (INCAR) to be used in the final static
calculation.
static_calc_settings Dict, NoneType The full settings Dict to be used in the final static
calculation.
verbose Bool, NoneType Increased verbosity.
Required inputs are displayed in bold red.
Outputs:
misc Dict The output parameters containing smaller quantities that do not depend on
system size.
relax
remote_folder RemoteData Input files necessary to run the process will be stored in this folder
node.
retrieved FolderData Files that are retrieved by the daemon will be stored in this node. By
default the stdout and stderr of the scheduler will be added, but one can
add more by specifying them in `CalcInfo.retrieve_list`.
arrays ArrayData The output trajectory data.
bands BandsData The output band structure.
born_charges ArrayData The output {name} data.
chgcar ChargedensityData The output charge density CHGCAR file.
dielectrics ArrayData The output {name} data.
dos ArrayData The output dos.
dynmat ArrayData The output {name} data.
energies ArrayData Energies of the calculation at each ionic/electronic step.
hessian ArrayData The output {name} data.
kpoints KpointsData The output k-points.
parallel_settings Dict
parameters Dict All input parameters including the default values.
projectors ArrayData The projectors for the calculation.
remote_stash RemoteStashData Contents of the `stash.source_list` option are stored in this remote folder
after job completion.
structure StructureData The output structure.
trajectory TrajectoryData The output trajectory data.
wavecar WavefunData The output plane wave coefficients file.
Required outputs are displayed in bold red.
Exit codes:
0 The process finished successfully.
0 the sun is shining
1 The process has failed with an unspecified error.
2 The process failed with legacy failure mode.
10 The process returned an invalid output.
11 The process did not register a required output.
300 the called workchain does not contain the necessary relaxed output
structure
420 no called workchain detected
500 unknown error detected in the relax workchain
502 there was an error overriding the parameters
600 Ionic relaxation was not converged after the maximum number of iterations
has been spent
601 The final singlepoint calculation has increased residual forces. This may
be caused by electronic solver converging to a different solution. Care
should be taken to investigate the results.
Exit codes that invalidate the cache are marked in bold red.
Workflows included in aiida-vasp#
There are several workflows bundled with aiida-vasp. They can be referred to using entry points starting with vasp.
For example, the following code loads the standard VaspWorkChain in a shell launched with the command verdi shell:
from aiida.plugins import WorkflowFactory # This can be omitted as it is imported by default with verdi shell
vasp_wc = WorkflowFactory('vasp.vasp')
Hint
You may see something like vasp.v2.vasp as an entry point in the documentation:
The first part, vasp, means the entry point is from the aiida-vasp plugin.
The second part, v2, is a version tag and refers to the v2 version.
The last part, vasp, refers to the vasp workchain/calculation included in the plugin.
The latest version of the workchain is selected if the v2 is omitted. We use this syntax to allow
some backward compatibility during the development.
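The naming scheme described in the hint can be written down as a small parser. This is a hypothetical helper for illustration only; it is not part of aiida-vasp or AiiDA, and it only splits the string without resolving which version is the latest.

```python
def parse_entry_point(entry_point):
    """Split 'plugin[.version].name' into parts; version is None if omitted."""
    parts = entry_point.split('.')
    if len(parts) == 3:
        plugin, version, name = parts
    elif len(parts) == 2:
        plugin, name = parts
        version = None  # the latest version would be selected
    else:
        raise ValueError(f'unexpected entry point format: {entry_point!r}')
    return {'plugin': plugin, 'version': version, 'name': name}


print(parse_entry_point('vasp.v2.vasp'))
print(parse_entry_point('vasp.vasp'))
```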
The VaspWorkChain is the main workchain that performs a VASP calculation from start to finish.
One can view it as an improved version of the VaspCalculation, as it takes care of input generation and validation.
It also includes several error handling mechanisms to ensure that the calculation is successful and that the output is valid.
For example, if a geometry optimization run fails to converge due to insufficient wall time requested, the workchain will resubmit a new calculation starting from the last geometry.
The main objective is to ensure the completion of the calculation with the parameters originally specified.
VaspWorkChain will not change any parameters that may render the calculated energies incompatible, such as the energy cut off or the k-point grid. However, it may change the electronic solver,
the geometry optimisation algorithm or the step size.
The VaspWorkChain is designed to be general-purpose, so it should support any type of VASP calculation.
If it reports false-positive errors, please report them as issues on the aiida-vasp issue tracker.
You can also try to turn off the process handler that raises the error.
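The restart behaviour described above (resubmitting from the last geometry while leaving the originally specified parameters untouched) can be sketched in plain Python. Everything here is hypothetical: `run_with_restarts`, `fake_run` and the dictionary layout are illustrative stand-ins, not the process handlers actually used by the workchain.

```python
def run_with_restarts(run_once, incar, structure, max_restarts=3):
    """Call run_once until it finishes, restarting from the last geometry."""
    for attempt in range(max_restarts + 1):
        result = run_once(incar, structure)
        if result['finished']:
            return result, attempt
        # Wall time ran out: continue from the last geometry, same INCAR.
        structure = result['last_structure']
    raise RuntimeError('calculation did not finish within the allowed restarts')


def fake_run(incar, structure):
    """Stand-in for a VASP job: at most 2 ionic steps per job, 5 needed in total."""
    done = min(structure['steps_done'] + 2, 5)
    return {'finished': done == 5, 'last_structure': {'steps_done': done}}


incar = {'encut': 500}
result, restarts = run_with_restarts(fake_run, incar, {'steps_done': 0})
print(restarts, result['last_structure'], incar)
```

Note that the `incar` dictionary is passed through unchanged on every attempt, mirroring the rule that settings such as the cut off energy are never altered by a restart.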
In this section we give a brief introduction to the bundled workflows in AiiDA-VASP.
Convergence workchain#
The convergence workchain is a simple workflow that runs a series of VASP calculations with different parameters and checks if the results converge.
Convergence tests for the cut off energy and the k-points are currently implemented.
As mentioned above, the inputs to the VaspWorkChain should be placed into the vasp namespace.
The convergence settings are specified using the convergence_settings input which is a Dict containing the following keys:
print(WorkflowFactory('vasp.v2.relax').option_class.aiida_description())
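The idea of a convergence test can be sketched without running VASP. The function below uses one possible criterion (the energy change on going to the next value is below a threshold); this is an assumption for illustration and not necessarily the criterion used by the workchain. The energies are made-up numbers.

```python
def first_converged(values, energies, threshold):
    """Return the first value whose energy changes by less than threshold at the next value."""
    for i in range(len(values) - 1):
        if abs(energies[i + 1] - energies[i]) < threshold:
            return values[i]
    return None  # not converged within the scanned range


cutoffs = [300, 400, 500, 600]                   # cut off energies in eV
energies = [-10.10, -10.40, -10.405, -10.406]    # made-up energies per atom in eV

print(first_converged(cutoffs, energies, threshold=0.01))
```

The same helper works for a k-point scan by passing, for example, a list of k-point spacings instead of cut off energies.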
Relaxation workchain#
The VaspRelaxWorkChain is a simple workflow that runs a VASP relaxation calculation.
It will run VASP geometry optimizations until the specified convergence criteria are met.
This may involve one or more actual VASP calculations. This is because:
A single VASP calculation may not fully relax the structure, especially when the maximum number of ionic steps is set to a relatively small value.
For variable-cell geometry optimization, multiple VASP calculations are required because the basis set is only reset on restart; otherwise the effective cut off energy can change.
A final singlepoint calculation may be needed to ensure that the energy is consistent with the cut off, if the lattice has been changed.
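The loop implied by these points can be sketched as follows. This is a simplified illustration: structures are reduced to a volume and a list of scalar coordinates, `fake_vasp_run` is a stand-in for one VASP relaxation, only two of the convergence checks are shown, and the final singlepoint calculation is left out. The setting names are taken from the option list, but the logic is an assumption, not the workchain's implementation.

```python
def is_converged(before, after, settings):
    """Compare two structures; a negative cutoff bypasses that check."""
    deltas = {
        'convergence_volume': abs(after['volume'] - before['volume']),
        'convergence_positions': max(
            abs(a - b) for a, b in zip(after['positions'], before['positions'])
        ),
    }
    return all(settings[key] < 0 or delta <= settings[key] for key, delta in deltas.items())


def relax_until_converged(run_once, structure, settings):
    """Restart the relaxation from the last output until input and output agree."""
    for iteration in range(1, settings['convergence_max_iterations'] + 1):
        relaxed = run_once(structure)
        if is_converged(structure, relaxed, settings):
            return relaxed, iteration
        structure = relaxed
    raise RuntimeError('relaxation not converged within convergence_max_iterations')


def fake_vasp_run(structure):
    """Stand-in for one VASP run: moves halfway towards volume 10 and positions 0."""
    return {
        'volume': 10.0 + (structure['volume'] - 10.0) / 2,
        'positions': [x / 2 for x in structure['positions']],
    }


settings = {
    'convergence_max_iterations': 5,
    'convergence_volume': 0.3,
    'convergence_positions': -1.0,  # negative: skip the position check
}
final, n_runs = relax_until_converged(
    fake_vasp_run, {'volume': 12.0, 'positions': [0.4]}, settings
)
print(n_runs, final['volume'])
```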
The inputs to the VaspWorkChain should be placed into the vasp namespace.
The relaxation settings are specified using the relax_settings input, which is a Dict containing the following keys:
from aiida.plugins import WorkflowFactory
print(WorkflowFactory('vasp.v2.relax').option_class.aiida_description())
algo: str
Default: The algorithm to use for relaxation
energy_cutoff: Optional
Default: The cut off energy difference when the relaxation is stopped (e.g. EDIFF)
force_cutoff: float
Default: The maximum force when the relaxation is stopped (e.g. EDIFFG)
steps: int
Default: Number of relaxation steps to perform (eg. NSW)
positions: bool
Default: If True, perform relaxation of the atomic positions
shape: bool
Default: If True, perform relaxation of the cell shape
volume: bool
Default: If True, perform relaxation of the cell volume
convergence_on: bool
Default: If True, perform convergence checks within the workchain
convergence_absolute: bool
Default: If True, use absolute values where possible when performing convergence checks
convergence_max_iterations: int
Default: Maximum iterations for convergence checking
convergence_positions: float
Default: The cutoff value for the convergence check on positions in Angstram. A negative value by pass the check.
convergence_volume: float
Default: The cutoff value for the convergence check on volume between the two structures. A negative value by pass the check.
convergence_shape_lengths: float
Default: The cutoff value for the convergence check on the lengths of the unit cell vectors, between input and the outputs structure. A negative value by pass the check.
convergence_shape_angles: float
Default: The cutoff value for the convergence check on the angles of the unit cell vectors, between input and the outputs structure. A negative value by pass the check.
convergence_mode: str
Default: Mode of the convergence check for positions. 'inout' for checking input/output structure, or 'last' to check only the change of the last step.
reuse: bool
Default: Whether reuse the previous calculation by copying over the remote folder
clean_reuse: bool
Default: Whether to perform a final cleaning of the reused calculations
keep_sp_workdir: bool
Default: Whether to keep the workdir of the final singlepoint calculation
perform: bool
Default: Do not perform any relaxation if set to 'False'
hybrid_calc_bootstrap: bool
Default: Whether to bootstrap hybrid calculation by performing standard DFT first
hybrid_calc_bootstrap_wallclock: int
Default: Wall time limit in second for the bootstrap calculation
keep_magnetization: bool
Default: Whether to keep magnetization from the previous calculation if possible
double_relax_mode: bool
Default: Experimental: Run in double relax mode - launch of the sub workflow is only performed up to two times without checking convergence in the end. This is useful for cases where the convergence is difficult due to change of basis set with variable cell and high-throughput studies.
residual_forces_check: bool
Default: Whether to perform residual force check after relaxation to ensure the forces are below threshold
Note that keys such as algo, steps and force_cutoff are translated into INCAR tags (IBRION, NSW, EDIFFG, etc.), so one should not explicitly set these tags in the parameters input.
Hint
This means one can quickly reuse the parameters from a single point calculation for a relaxation and vice versa.
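A rough picture of such a translation is given below. The ISIF and IBRION meanings and the sign convention of EDIFFG follow the VASP documentation, but the function itself, the accepted `algo` names (`'cg'`, `'rd'`) and the exact mapping used by the workchain are assumptions made for this sketch.

```python
# (positions, shape, volume) -> ISIF, following the VASP definition of ISIF
ISIF_TABLE = {
    (True, False, False): 2,
    (True, True, True): 3,
    (True, True, False): 4,
    (False, True, False): 5,
    (False, True, True): 6,
    (False, False, True): 7,
}
# Assumed algo names: conjugate gradient (IBRION=2) and RMM-DIIS (IBRION=1)
ALGO_TO_IBRION = {'cg': 2, 'rd': 1}


def relax_settings_to_incar(settings):
    """Translate a few relax settings into INCAR tags (illustrative only)."""
    dof = (settings['positions'], settings['shape'], settings['volume'])
    if dof not in ISIF_TABLE:
        raise ValueError(f'unsupported combination of degrees of freedom: {dof}')
    return {
        'ibrion': ALGO_TO_IBRION[settings['algo']],
        'nsw': settings['steps'],
        'isif': ISIF_TABLE[dof],
        # A negative EDIFFG tells VASP to stop on forces rather than energy.
        'ediffg': -abs(settings['force_cutoff']),
    }


tags = relax_settings_to_incar({
    'algo': 'cg', 'steps': 60, 'force_cutoff': 0.03,
    'positions': True, 'shape': False, 'volume': False,
})
print(tags)
```

Because these tags are derived from the settings, the INCAR dictionary in the parameters input can stay identical between a singlepoint and a relaxation.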
See this tutorial for an example of how to run the VaspRelaxWorkChain.
Band structure workflow#
The VaspBandsWorkChain is a workflow for calculating the band structure of a material using VASP.
A band structure typically involves computing the ground state electron density then using this fixed density to
solve for the eigenvalues of the Kohn-Sham equation at specific k-points in the Brillouin zone.
Typically, a path along which the eigenvalues are computed is generated based on the point group symmetry of the
input structure.
There are approaches to generate this path automatically; here we default to using seekpath, but it can be
switched to using the paths generated by sumo.
Another complication is that the generated path is for a specific primitive-cell configuration (as there are infinitely many ways of choosing the primitive cell). Hence, a common mistake is to blindly use the path with the input cell, which may not be the standardized primitive cell. Here, the workchain handles this internally, and the generated standardized primitive cell is returned by the workchain as one of the outputs.
In addition, an exposed relax namespace for running the VaspRelaxWorkChain exists, and the workchain will perform
geometry optimization before the band structure calculation if it is specified.
The parameters for the scf calculation (for generating the charge density) and the actual band structure calculation should be specified under the exposed VaspWorkChain namespaces called scf and bands.
An additional dos namespace is also exposed for calculating the density of states and can be specified if desired.
Note
The scf namespace should always be specified, while specifying bands namespace is only needed if the
input nodes should be different from that in the scf namespace. The same rule applies to the dos namespace.
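The fallback rule in the note can be pictured as a dictionary merge. The helper below is hypothetical and works on plain dictionaries; it only illustrates the rule that anything not given for bands or dos is taken from scf.

```python
def resolve_namespace(scf, override=None):
    """Inputs for a follow-up step: start from scf, replace only what is given."""
    resolved = dict(scf)
    resolved.update(override or {})
    return resolved


scf = {'parameters': {'encut': 500}, 'kpoints_spacing': 0.05}  # illustrative inputs
bands = resolve_namespace(scf)                                 # nothing overridden
dos = resolve_namespace(scf, {'kpoints_spacing': 0.02})        # denser mesh for DOS
print(bands)
print(dos)
```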
Similar to the VaspRelaxWorkChain, the behavior of the VaspBandsWorkChain can be controlled using the band_settings input:
from aiida.plugins import WorkflowFactory
print(WorkflowFactory('vasp.v2.bands').option_class.aiida_description())
symprec: float
Default: Precision of the symmetry determination
band_mode: str
Default: Mode for generating the band path. Choose from: bradcrack, pymatgen,seekpath-aiida and latimer-munro.
band_kpoints_distance: float
Default: Spacing for band distances for automatic kpoints generation, used by seekpath-aiida mode.
line_density: float
Default: Density of the point along the path, used by the sumo interface.
dos_kpoints_distance: float
Default: Kpoints for running DOS calculations in A^-1 * 2pi. Will perform non-SCF DOS calculation is supplied.
only_dos: bool
Default: Flag for running only DOS calculations
run_dos: bool
Default: Flag for running DOS calculations
additional_band_analysis_parameters: dict
Default: Additional keyword arguments for the seekpath/ interface, available keys are: ['with_time_reversal', 'reference_distance', 'recipe', 'threshold', 'symprec', 'angle_tolerance']
kpoints_per_split: int
Default: Number of kpoints per split for the band structure calculation
hybrid_reuse_wavecar: bool
Default: Whether to reuse the WAVECAR from the previous relax/singlepoint calculation
The VaspHybridBandsWorkChain is a variant of the VaspBandsWorkChain for running band structure calculations with hybrid functionals.
In this case, the potential is not completely determined by the electron density, hence one cannot use the standard
approach that first computes the ground state electron density and then uses it to solve the Kohn-Sham equation.
Instead, the Kohn-Sham equation has to be solved self-consistently, and the k-points along the path are inserted
as zero-weighted k-points.
The VaspHybridBandsWorkChain is designed for this purpose.
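The zero-weight construction can be sketched with plain lists. The function name and the data layout are hypothetical; the point is only that the path k-points are appended to the regular mesh with weight zero, so they are solved for without contributing to the self-consistent density.

```python
def add_zero_weight_kpoints(mesh_kpoints, mesh_weights, path_kpoints):
    """Append band-path k-points to an SCF mesh with zero weight."""
    kpoints = list(mesh_kpoints) + list(path_kpoints)
    weights = list(mesh_weights) + [0.0] * len(path_kpoints)
    return kpoints, weights


mesh = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]   # illustrative two-point mesh
mesh_weights = [0.5, 0.5]
path = [(0.0, 0.0, 0.0), (0.25, 0.0, 0.0), (0.5, 0.0, 0.0)]

kpoints, weights = add_zero_weight_kpoints(mesh, mesh_weights, path)
print(len(kpoints), weights)
```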
In addition, the large compute cost of hybrid functional means it may be advantageous to split the full k-point path into smaller sub-paths,
and run multiple self-consistent calculations in parallel instead of doing a single large calculation,
given the constraints of the available computing resources.
The number of kpoints included in each sub-path can be specified using the kpoints_per_subpath input.
Hint
Set kpoints_per_subpath to a very large number to run a single self-consistent calculation with all k-points.
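Splitting the path is a simple chunking operation. The helper below is a hypothetical illustration, using integers in place of real k-points; each resulting chunk would go into its own self-consistent calculation.

```python
def split_path(path_kpoints, kpoints_per_subpath):
    """Split a k-point path into consecutive chunks of at most the given size."""
    if kpoints_per_subpath < 1:
        raise ValueError('kpoints_per_subpath must be at least 1')
    return [
        path_kpoints[start:start + kpoints_per_subpath]
        for start in range(0, len(path_kpoints), kpoints_per_subpath)
    ]


path = list(range(10))          # stand-in for ten k-points along the path
print(split_path(path, 4))
print(split_path(path, 1000))   # a very large value gives a single calculation
```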
See this tutorial for an example of how to run the band structure workchain.