Workflows#

Note

This notebook can be downloaded as workflows.ipynb and workflows.md

The WorkChain class is the central piece that enables workflows to be run with aiida-vasp. By composing one or several WorkChain classes, one can make a workflow. A single WorkChain class may launch one or several calculations, or it may launch child WorkChains to achieve the designed functionality.

Short-running Python code can be run directly by the workchain as a calcfunction or workfunction, and the provenance will be recorded accordingly.

It is important to note, however, that long-running computational tasks should not be run directly in the code, as they will delay or block the operation of the daemon.

We would like to encourage users to build workchains and/or compose existing ones into more advanced workflows that we can all share and benefit from. You may want to visit this page to learn more about WorkChains and how to build them.

One should note that the advantage of using a provenance-preserving engine like AiiDA is that you do not have to define a workflow in order to have the calculation steps recorded and stored. It is perfectly fine to conduct exploratory studies using the basic workchains and use calcfunction to link the outputs and inputs together for provenance.

Design principles#

The rest of the bundled workchains are designed to run VaspWorkChain as the basic unit of work. This means that they expect error-correction functionality to be embedded in the VaspWorkChain, so they do not need to handle errors explicitly.

We use the expose_inputs and expose_outputs methods of the WorkChain class to expose the inputs and outputs of the VaspWorkChain.

For example, the inputs to the relax workchain look like this:

VaspRelaxWorkChain
|
|- structure (StructureData of the input structure)
|- vasp (exposed VaspWorkChain inputs)
|- static_calc_settings (settings to override for the static calculation)
|- static_calc_options (options to override for the static calculation)
|- static_calc_parameters (parameters to override for the static calculation)
|- relax_settings (settings controlling the relaxation)
|- verbose

Here the inputs specific to the VaspWorkChain to be launched are nested inside the vasp namespace. For example, to set the parameters one can do the following:

from aiida.orm import Dict
from aiida.plugins import WorkflowFactory
builder = WorkflowFactory('vasp.v2.relax').get_builder()
# Inputs of the inner VaspWorkChain live under the `vasp` namespace
builder.vasp.parameters = Dict(dict={'incar': {'encut': 500, 'isif': 2, 'nsw': 5, 'potim': 0.01}})

When using VaspWorkChain directly, the same parameters are set at the top level of the builder:

from aiida.plugins import WorkflowFactory
builder = WorkflowFactory('vasp.v2.vasp').get_builder()
builder.parameters = {'incar': {'encut': 500, 'isif': 2, 'nsw': 5, 'potim': 0.01}}  # This gets converted to a Dict automatically
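To make the nesting concrete, the sketch below uses plain Python dictionaries rather than a real builder. The helper `nest_under` is hypothetical and only illustrates where VaspWorkChain inputs end up when they are passed to a workchain that exposes them under a namespace.

```python
def nest_under(namespace, inputs, **top_level):
    """Return a new input dict with `inputs` placed under `namespace`.

    `top_level` holds inputs that belong to the outer workchain itself.
    (Hypothetical helper, for illustration only.)
    """
    nested = {namespace: dict(inputs)}
    nested.update(top_level)
    return nested


# Inputs as they would be given to VaspWorkChain directly
vasp_inputs = {'parameters': {'incar': {'encut': 500, 'isif': 2, 'nsw': 5, 'potim': 0.01}}}

# The same inputs for the relax workchain: nested under `vasp`,
# with workchain-specific inputs kept at the top level
relax_inputs = nest_under('vasp', vasp_inputs, relax_settings={'steps': 5})

print(relax_inputs['vasp']['parameters']['incar']['encut'])
print(sorted(relax_inputs))
```

The point of the layout is that everything under `vasp` can be copied unchanged between a single-point setup and a relaxation setup.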

The other options at the top level are specific to the workchain and are used to control its behavior.

The relax_settings input is a Dict that contains the settings for the relaxation. These settings are validated at submission time using the pydantic library.
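The benefit of validating at submission time is that a misspelled key or a wrongly typed value is caught before any job reaches the queue. The sketch below mimics this with the standard library only. It is not the plugin's pydantic model: the class name `RelaxSettingsSketch` and all default values are assumptions, and only a few of the documented keys are covered.

```python
from dataclasses import dataclass, fields


@dataclass
class RelaxSettingsSketch:
    # Field names follow the documented keys; defaults are illustrative only.
    algo: str = 'cg'
    force_cutoff: float = 0.03
    steps: int = 60
    positions: bool = True
    shape: bool = False
    volume: bool = False

    def __post_init__(self):
        for f in fields(self):
            value = getattr(self, f.name)
            expected = (int, float) if f.type is float else f.type
            is_bool = isinstance(value, bool)
            # bool is a subclass of int, so only accept it for bool fields
            if (is_bool and f.type is not bool) or not isinstance(value, expected):
                raise TypeError(f'{f.name} must be {f.type.__name__}, got {value!r}')


def validate_relax_settings(settings):
    """Raise ValueError on unknown keys or wrong types, else return the settings object."""
    try:
        return RelaxSettingsSketch(**settings)
    except TypeError as exc:
        raise ValueError(f'invalid relax_settings: {exc}') from exc


ok = validate_relax_settings({'steps': 80, 'shape': True})
print(ok.steps, ok.shape)

for bad in ({'stpes': 80}, {'steps': 'many'}):
    try:
        validate_relax_settings(bad)
    except ValueError as exc:
        print('rejected:', exc)
```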

To see the available settings, one can use:

from aiida.plugins import WorkflowFactory
opt = WorkflowFactory('vasp.v2.relax').option_class
# opt.<tab> to see all available options
print(opt.aiida_description())
                             algo:  str        
                                    Default: The algorithm to use for relaxation
                    energy_cutoff:  Optional   
                                    Default: The cut off energy difference when the relaxation is stopped (e.g. EDIFF)
                     force_cutoff:  float      
                                    Default: The maximum force when the relaxation is stopped (e.g. EDIFFG)
                            steps:  int        
                                    Default: Number of relaxation steps to perform (eg. NSW)
                        positions:  bool       
                                    Default: If True, perform relaxation of the atomic positions
                            shape:  bool       
                                    Default: If True, perform relaxation of the cell shape
                           volume:  bool       
                                    Default: If True, perform relaxation of the cell volume
                   convergence_on:  bool       
                                    Default: If True, perform convergence checks within the workchain
             convergence_absolute:  bool       
                                    Default: If True, use absolute values where possible when performing convergence checks
       convergence_max_iterations:  int        
                                    Default: Maximum iterations for convergence checking
            convergence_positions:  float      
                                    Default: The cutoff value for the convergence check on positions in Angstram. A negative value by pass the check.
               convergence_volume:  float      
                                    Default: The cutoff value for the convergence check on volume between the two structures. A negative value by pass the check.
        convergence_shape_lengths:  float      
                                    Default: The cutoff value for the convergence check on the lengths of the unit cell vectors, between input and the outputs structure. A negative value by pass the check.
         convergence_shape_angles:  float      
                                    Default: The cutoff value for the convergence check on the angles of the unit cell vectors, between input and the outputs structure. A negative value by pass the check.
                 convergence_mode:  str        
                                    Default: Mode of the convergence check for positions. 'inout' for checking input/output structure, or 'last' to check only the change of the last step.
                            reuse:  bool       
                                    Default: Whether reuse the previous calculation by copying over the remote folder
                      clean_reuse:  bool       
                                    Default: Whether to perform a final cleaning of the reused calculations
                  keep_sp_workdir:  bool       
                                    Default: Whether to keep the workdir of the final singlepoint calculation
                          perform:  bool       
                                    Default: Do not perform any relaxation if set to 'False'
            hybrid_calc_bootstrap:  bool       
                                    Default: Whether to bootstrap hybrid calculation by performing standard DFT first
  hybrid_calc_bootstrap_wallclock:  int        
                                    Default: Wall time limit in second for the bootstrap calculation
               keep_magnetization:  bool       
                                    Default: Whether to keep magnetization from the previous calculation if possible
                double_relax_mode:  bool       
                                    Default: Experimental: Run in double relax mode - launch of the sub workflow is only performed up to two times without checking convergence in the end. This is useful for cases where the convergence is difficult due to change of basis set with variable cell and high-throughput studies.
            residual_forces_check:  bool       
                                    Default: Whether to perform residual force check after relaxation to ensure the forces are below threshold
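The convergence_* cutoffs listed above share one rule: a measured change has to stay within the cutoff, and a negative cutoff bypasses that particular check. A minimal sketch of that rule follows. The function name and the way the changes are passed in are assumptions made for illustration, not the workchain's internal code.

```python
def is_converged(changes, cutoffs):
    """Return True if every enabled check passes.

    `changes` maps a quantity name to the measured change between two
    structures; `cutoffs` maps the same names to the allowed maximum.
    A negative cutoff disables the check for that quantity.
    """
    for name, cutoff in cutoffs.items():
        if cutoff < 0:
            continue  # check bypassed
        if abs(changes[name]) > cutoff:
            return False
    return True


cutoffs = {'positions': 0.1, 'volume': 0.01, 'shape_lengths': -1.0}
changes = {'positions': 0.02, 'volume': 0.004, 'shape_lengths': 0.5}

# The large change in shape_lengths is ignored because its cutoff is negative
print(is_converged(changes, cutoffs))
# Enabling the check makes the same pair of structures fail
print(is_converged(changes, {**cutoffs, 'shape_lengths': 0.1}))
```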

By default, every input to the workchain has to be specified in full before submission, which can be quite tedious for day-to-day calculations. To simplify the input, we have implemented the VaspInputGenerator class, which can automatically update the builder with default values. See this page for more information.

The user may write default values and store them in a YAML file to ensure that consistent settings are used across multiple projects.

You can also print the input and output ports of the workchain using:
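The idea of shared defaults plus per-calculation overrides can be sketched as a recursive dictionary merge. This is not how VaspInputGenerator is implemented: the helper `merge_defaults` is hypothetical, and the dictionary below merely stands in for the content of a parsed YAML file whose layout is assumed.

```python
import copy


def merge_defaults(defaults, overrides):
    """Recursively merge `overrides` into a copy of `defaults`."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_defaults(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Stand-in for the content of a parsed YAML defaults file (assumed layout)
project_defaults = {
    'incar': {'encut': 500, 'ediff': 1e-6},
    'relax_settings': {'steps': 60, 'force_cutoff': 0.03},
}

# Only the value that differs from the project defaults is given explicitly
inputs = merge_defaults(project_defaults, {'incar': {'encut': 600}})
print(inputs['incar'])
print(project_defaults['incar'])  # the defaults themselves are left untouched
```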

from aiida.plugins import WorkflowFactory
!verdi plugin list aiida.workflows vasp.v2.relax
Description:

    Structure relaxation workchain.

Inputs:
        relax_settings  Dict                    algo:  str
                                                Default: The algorithm to use for relaxation
                                                energy_cutoff:  Optional                                        Default:
                                                The cut off energy difference when the relaxation is stopped (e.g. EDIFF)
                                                force_cutoff:  float                                           Default: The
                                                maximum force when the relaxation is stopped (e.g. EDIFFG)
                                                steps:  int                                             Default: Number of
                                                relaxation steps to perform (eg. NSW)                         positions:
                                                bool                                            Default: If True, perform
                                                relaxation of the atomic positions                             shape:  bool
                                                Default: If True, perform relaxation of the cell shape
                                                volume:  bool                                            Default: If True,
                                                perform relaxation of the cell volume                    convergence_on:
                                                bool                                            Default: If True, perform
                                                convergence checks within the workchain              convergence_absolute:
                                                bool                                            Default: If True, use
                                                absolute values where possible when performing convergence checks
                                                convergence_max_iterations:  int
                                                Default: Maximum iterations for convergence checking
                                                convergence_positions:  float
                                                Default: The cutoff value for the convergence check on positions in
                                                Angstram. A negative value by pass the check.
                                                convergence_volume:  float
                                                Default: The cutoff value for the convergence check on volume between the
                                                two structures. A negative value by pass the check.
                                                convergence_shape_lengths:  float
                                                Default: The cutoff value for the convergence check on the lengths of the
                                                unit cell vectors, between input and the outputs structure. A negative
                                                value by pass the check.          convergence_shape_angles:  float
                                                Default: The cutoff value for the convergence check on the angles of the
                                                unit cell vectors, between input and the outputs structure. A negative
                                                value by pass the check.                  convergence_mode:  str
                                                Default: Mode of the convergence check for positions. 'inout' for checking
                                                input/output structure, or 'last' to check only the change of the last
                                                step.                             reuse:  bool
                                                Default: Whether reuse the previous calculation by copying over the remote
                                                folder                       clean_reuse:  bool
                                                Default: Whether to perform a final cleaning of the reused calculations
                                                keep_sp_workdir:  bool                                            Default:
                                                Whether to keep the workdir of the final singlepoint calculation
                                                perform:  bool                                            Default: Do not
                                                perform any relaxation if set to 'False'             hybrid_calc_bootstrap:
                                                bool                                            Default: Whether to
                                                bootstrap hybrid calculation by performing standard DFT first
                                                hybrid_calc_bootstrap_wallclock:  int
                                                Default: Wall time limit in second for the bootstrap calculation
                                                keep_magnetization:  bool
                                                Default: Whether to keep magnetization from the previous calculation if
                                                possible                 double_relax_mode:  bool
                                                Default: Experimental: Run in double relax mode - launch of the sub
                                                workflow is only performed up to two times without checking convergence in
                                                the end. This is useful for cases where the convergence is difficult due to
                                                change of basis set with variable cell and high-throughput studies.
                                                residual_forces_check:  bool
                                                Default: Whether to perform residual force check after relaxation to ensure
                                                the forces are below threshold
             structure  StructureData, CifData
                  vasp  Data
              metadata
   static_calc_options  Dict, NoneType          The full options Dict to be used in the final static
                                                calculation.
static_calc_parameters  Dict, NoneType          The parameters (INCAR) to be used in the final static
                                                calculation.
  static_calc_settings  Dict, NoneType          The full settings Dict to be used in the final static
                                                calculation.
               verbose  Bool, NoneType          Increased verbosity.

Required inputs are displayed in bold red.

Outputs:
             misc  Dict               The output parameters containing smaller quantities that do not depend on
                                      system size.
            relax
    remote_folder  RemoteData         Input files necessary to run the process will be stored in this folder
                                      node.
        retrieved  FolderData         Files that are retrieved by the daemon will be stored in this node. By
                                      default the stdout and stderr of the scheduler will be added, but one can
                                      add more by specifying them in `CalcInfo.retrieve_list`.
           arrays  ArrayData          The output trajectory data.
            bands  BandsData          The output band structure.
     born_charges  ArrayData          The output {name} data.
           chgcar  ChargedensityData  The output charge density CHGCAR file.
      dielectrics  ArrayData          The output {name} data.
              dos  ArrayData          The output dos.
           dynmat  ArrayData          The output {name} data.
         energies  ArrayData          Energies of the calculation at each ionic/electronic step.
          hessian  ArrayData          The output {name} data.
          kpoints  KpointsData        The output k-points.
parallel_settings  Dict
       parameters  Dict               All input parameters including the default values.
       projectors  ArrayData          The projectors for the calculation.
     remote_stash  RemoteStashData    Contents of the `stash.source_list` option are stored in this remote folder
                                      after job completion.
        structure  StructureData      The output structure.
       trajectory  TrajectoryData     The output trajectory data.
          wavecar  WavefunData        The output plane wave coefficients file.

Required outputs are displayed in bold red.

Exit codes:

  0  The process finished successfully.
  0  the sun is shining
  1  The process has failed with an unspecified error.
  2  The process failed with legacy failure mode.
 10  The process returned an invalid output.
 11  The process did not register a required output.
300  the called workchain does not contain the necessary relaxed output
     structure
420  no called workchain detected
500  unknown error detected in the relax workchain
502  there was an error overriding the parameters
600  Ionic relaxation was not converged after the maximum number of iterations
     has been spent
601  The final singlepoint calculation has increased residual forces. This may
     be caused by electronic solver converging to a different solution. Care
     should be taken to investigate the results.

Exit codes that invalidate the cache are marked in bold red.

Workflows included in aiida-vasp#

There are several workflows bundled with aiida-vasp. They can be referred to using entry points that start with vasp.

For example, the following code loads the standard VaspWorkChain in a shell launched with the command verdi shell:

from aiida.plugins import WorkflowFactory  # This can be omitted as it is imported by default with verdi shell
vasp_wc = WorkflowFactory('vasp.vasp')

Hint

You may see something like vasp.v2.vasp as an entry point in the documentation:

  • The first part, vasp, means the entry point comes from the aiida-vasp plugin.

  • The second part, v2, is a version tag and refers to the v2 version of the workchain.

  • The last part, vasp, refers to the vasp workchain/calculation included in the plugin.

The latest version of the workchain is selected if the v2 tag is omitted. We use this syntax to allow some backward compatibility during development.
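The naming scheme can be illustrated with a few lines of string handling. The function `parse_entry_point` is a hypothetical helper, not part of aiida-vasp or AiiDA, and it assumes that none of the three parts contains a dot.

```python
def parse_entry_point(entry_point):
    """Split an entry point such as 'vasp.v2.vasp' into its parts.

    Returns (plugin, version_tag, name); version_tag is None when omitted.
    """
    parts = entry_point.split('.')
    if len(parts) == 3:
        plugin, version_tag, name = parts
    elif len(parts) == 2:
        plugin, name = parts
        version_tag = None
    else:
        raise ValueError(f'unexpected entry point format: {entry_point!r}')
    return plugin, version_tag, name


print(parse_entry_point('vasp.v2.relax'))
print(parse_entry_point('vasp.vasp'))
```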

The VaspWorkChain is the main workchain that performs a VASP calculation from start to finish. One can view it as an improved version of the VaspCalculation, as it takes care of input generation and validation. It also includes several error-handling mechanisms to ensure that the calculation is successful and that the output is valid. For example, if a geometry optimization run fails to converge because the requested wall time was insufficient, the workchain will submit a new calculation starting from the last geometry. The main objective is to ensure the completion of the calculation with the parameters originally specified.

VaspWorkChain will not change any parameters that may render the calculated energies incompatible, such as the energy cut off or the k-point grid. However, it may change the electronic solver, the geometry optimisation algorithm or the step size.

The VaspWorkChain is designed to be general-purpose, so it should support any type of VASP calculation. If it reports errors that are false positives, please report them as issues on the aiida-vasp issue tracker. You can also try to turn off the process handler that raises the error.

In this section we give a brief introduction to the workflows bundled with aiida-vasp.
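The distinction between parameters that may and may not be touched can be pictured as a guard around any automatic correction. The sketch below is an illustration only: the function and the two sets are assumptions, not the handlers shipped with the plugin, and the INCAR tag `kspacing` merely stands in for the k-point grid.

```python
# Changing these would make energies incomparable between runs (assumed set)
PROTECTED_TAGS = {'encut', 'kspacing'}


def apply_correction(incar, changes):
    """Return a corrected copy of `incar`, refusing to touch protected tags."""
    blocked = PROTECTED_TAGS.intersection(changes)
    if blocked:
        raise ValueError(f'refusing to modify protected tags: {sorted(blocked)}')
    corrected = dict(incar)
    corrected.update(changes)
    return corrected


incar = {'encut': 500, 'algo': 'Fast', 'ibrion': 2, 'potim': 0.5}

# Switching the electronic solver and reducing the step size is allowed
corrected = apply_correction(incar, {'algo': 'Normal', 'potim': 0.1})
print(corrected)

# Lowering the cut off energy is not
try:
    apply_correction(incar, {'encut': 400})
except ValueError as exc:
    print('rejected:', exc)
```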

Convergence workchain#

The VaspConvergenceWorkChain is a simple workflow that runs a series of VASP calculations with different parameters and checks whether the results converge. Convergence tests for the cut off energy and the k-points are currently implemented.
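What "checking whether the results converge" means can be sketched as follows: given energies computed at increasing cut off energies, pick the smallest setting from which all results stay within a tolerance of the most accurate one. The function name, the criterion and the numbers are all made up for illustration; the workchain's own analysis may differ.

```python
def first_converged(values, energies, threshold):
    """Return the smallest setting from which all energies stay within
    `threshold` of the energy at the largest (most accurate) setting."""
    reference = energies[-1]
    for index, value in enumerate(values):
        if all(abs(energy - reference) <= threshold for energy in energies[index:]):
            return value
    return None  # only reached for empty input


cutoffs = [300, 400, 500, 600, 700]                        # eV
energies = [-10.120, -10.305, -10.349, -10.352, -10.353]   # made-up values, eV per atom

print(first_converged(cutoffs, energies, threshold=0.005))
```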

As mentioned above, the inputs to the VaspWorkChain should be placed into the vasp namespace. The convergence settings are specified using the convergence_settings input, which is a Dict containing the following keys:

from aiida.plugins import WorkflowFactory
print(WorkflowFactory('vasp.v2.converge').option_class.aiida_description())

Relaxation workchain#

The VaspRelaxWorkChain is a simple workflow that runs a VASP relaxation calculation. It will run VASP geometry optimizations until the specified convergence criteria are met.

This may involve one or more actual VASP calculations. This is because:

  • A single VASP calculation may not fully relax the structure, especially when the maximum number of ionic steps is set to a relatively small value.

  • For variable cell geometry optimization, multiple VASP calculations are required as each restart resets the basis set, otherwise the effective cut off energy can change.

  • A final singlepoint calculation may be needed to ensure that the energy is consistent with the cut off, if the lattice has been changed.
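The restart logic described above can be sketched as a loop that repeats a relaxation until the structure stops changing or an iteration limit is hit. Everything here is a toy: `toy_relax` stands in for a VASP run, only the cell volume is tracked, and the final singlepoint calculation is left out. The exit code 600 is borrowed from the list of exit codes shown earlier.

```python
def run_relaxation(volume, relax_once, cutoff=0.01, max_iterations=5):
    """Repeat `relax_once` until the relative volume change drops below `cutoff`.

    Returns (final_volume, iterations, exit_code). Exit code 600 mirrors the
    'not converged after the maximum number of iterations' code of the workchain.
    """
    for iteration in range(1, max_iterations + 1):
        new_volume = relax_once(volume)
        change = abs(new_volume - volume) / volume
        volume = new_volume
        if change < cutoff:
            return volume, iteration, 0
    return volume, max_iterations, 600


def toy_relax(volume, target=100.0):
    # Stand-in for one VASP run that only gets part of the way to the minimum
    return volume + 0.8 * (target - volume)


final_volume, iterations, exit_code = run_relaxation(120.0, toy_relax)
print(iterations, exit_code, round(final_volume, 3))

# With too few iterations allowed, the loop gives up with exit code 600
print(run_relaxation(120.0, toy_relax, max_iterations=2)[2])
```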

The inputs to the VaspWorkChain should be placed into the vasp namespace. The relaxation settings are specified using the relax_settings input, which is a Dict containing the following keys:

from aiida.plugins import WorkflowFactory
print(WorkflowFactory('vasp.v2.relax').option_class.aiida_description())
                             algo:  str        
                                    Default: The algorithm to use for relaxation
                    energy_cutoff:  Optional   
                                    Default: The cut off energy difference when the relaxation is stopped (e.g. EDIFF)
                     force_cutoff:  float      
                                    Default: The maximum force when the relaxation is stopped (e.g. EDIFFG)
                            steps:  int        
                                    Default: Number of relaxation steps to perform (eg. NSW)
                        positions:  bool       
                                    Default: If True, perform relaxation of the atomic positions
                            shape:  bool       
                                    Default: If True, perform relaxation of the cell shape
                           volume:  bool       
                                    Default: If True, perform relaxation of the cell volume
                   convergence_on:  bool       
                                    Default: If True, perform convergence checks within the workchain
             convergence_absolute:  bool       
                                    Default: If True, use absolute values where possible when performing convergence checks
       convergence_max_iterations:  int        
                                    Default: Maximum iterations for convergence checking
            convergence_positions:  float      
                                    Default: The cutoff value for the convergence check on positions in Angstram. A negative value by pass the check.
               convergence_volume:  float      
                                    Default: The cutoff value for the convergence check on volume between the two structures. A negative value by pass the check.
        convergence_shape_lengths:  float      
                                    Default: The cutoff value for the convergence check on the lengths of the unit cell vectors, between input and the outputs structure. A negative value by pass the check.
         convergence_shape_angles:  float      
                                    Default: The cutoff value for the convergence check on the angles of the unit cell vectors, between input and the outputs structure. A negative value by pass the check.
                 convergence_mode:  str        
                                    Default: Mode of the convergence check for positions. 'inout' for checking input/output structure, or 'last' to check only the change of the last step.
                            reuse:  bool       
                                    Default: Whether reuse the previous calculation by copying over the remote folder
                      clean_reuse:  bool       
                                    Default: Whether to perform a final cleaning of the reused calculations
                  keep_sp_workdir:  bool       
                                    Default: Whether to keep the workdir of the final singlepoint calculation
                          perform:  bool       
                                    Default: Do not perform any relaxation if set to 'False'
            hybrid_calc_bootstrap:  bool       
                                    Default: Whether to bootstrap hybrid calculation by performing standard DFT first
  hybrid_calc_bootstrap_wallclock:  int        
                                    Default: Wall time limit in second for the bootstrap calculation
               keep_magnetization:  bool       
                                    Default: Whether to keep magnetization from the previous calculation if possible
                double_relax_mode:  bool       
                                    Default: Experimental: Run in double relax mode - launch of the sub workflow is only performed up to two times without checking convergence in the end. This is useful for cases where the convergence is difficult due to change of basis set with variable cell and high-throughput studies.
            residual_forces_check:  bool       
                                    Default: Whether to perform residual force check after relaxation to ensure the forces are below threshold

Note that keys such as algo, steps and force_cutoff are translated into INCAR tags (IBRION, NSW, EDIFFG, etc.), so one should not set these tags explicitly in the parameters input.

Hint

This means one can quickly reuse the parameters from a single point calculation for a relaxation and vice versa.
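A rough picture of the translation from relaxation settings to INCAR tags is given below. The ISIF values follow the VASP documentation for the listed combinations of degrees of freedom; the names accepted for `algo` and the function itself are assumptions, and the plugin's real mapping may cover more cases.

```python
# (positions, shape, volume) -> ISIF, as documented for VASP
ISIF_TABLE = {
    (True, False, False): 2,
    (True, True, True): 3,
    (True, True, False): 4,
    (False, True, False): 5,
    (False, True, True): 6,
    (False, False, True): 7,
}
# Assumed names for the `algo` setting: conjugate gradient and RMM-DIIS
IBRION_TABLE = {'cg': 2, 'rd': 1}


def relax_settings_to_incar(settings):
    """Translate a subset of relax settings into INCAR tags (illustrative)."""
    dof = (settings['positions'], settings['shape'], settings['volume'])
    if dof not in ISIF_TABLE:
        raise ValueError(f'unsupported combination of degrees of freedom: {dof}')
    return {
        'ibrion': IBRION_TABLE[settings['algo']],
        'nsw': settings['steps'],
        # A negative EDIFFG makes VASP stop on the force criterion
        'ediffg': -abs(settings['force_cutoff']),
        'isif': ISIF_TABLE[dof],
    }


settings = {'algo': 'cg', 'steps': 60, 'force_cutoff': 0.03,
            'positions': True, 'shape': True, 'volume': True}
print(relax_settings_to_incar(settings))
```

Because these tags are derived from the settings, the `incar` dictionary in the parameters input can stay identical to that of a single-point calculation.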

See this tutorial for an example of how to run the VaspRelaxWorkChain.

Band structure workflow#

The VaspBandsWorkChain is a workflow for calculating the band structure of a material using VASP. A band structure calculation typically involves computing the ground state electron density and then using this fixed density to solve for the eigenvalues of the Kohn-Sham equation at specific k-points in the Brillouin zone.

Typically, a path along which the eigenvalues are computed is generated based on the point group symmetry of the input structure. There are several approaches to generating this path automatically. Here we default to using seekpath, but it can be switched to the paths generated by sumo.

Another complication is that the generated path is valid for a specific primitive-cell configuration (as there are infinitely many ways of choosing the primitive cell). Hence, a common mistake is to blindly use the path with the input cell, which may not be the standardized primitive cell. Here, the workchain handles this internally, and the generated standardized primitive cell is returned by the workchain as one of the outputs.

In addition, an exposed relax namespace for running VaspRelaxWorkChain exists, and the workchain will perform a geometry optimization before the band structure calculation if it is specified.

The parameters for the scf calculation (which generates the charge density) and the actual band structure calculation should be specified under the exposed VaspWorkChain namespaces called scf and bands. An additional dos namespace is also exposed for calculating the density of states and can be specified if desired.

Note

The scf namespace should always be specified, while specifying bands namespace is only needed if the input nodes should be different from that in the scf namespace. The same rule applies to the dos namespace.
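The fallback rule can be read as "start from the scf inputs and override only what is given". The sketch below expresses that with plain dictionaries. The helper `resolve_namespace` is hypothetical and the override is done per top-level input; how the workchain combines the namespaces internally may differ.

```python
def resolve_namespace(scf_inputs, namespace_inputs=None):
    """Inputs for a bands/dos step: start from scf, override where given."""
    resolved = dict(scf_inputs)
    resolved.update(namespace_inputs or {})
    return resolved


scf = {
    'parameters': {'incar': {'encut': 500}},
    'options': {'max_wallclock_seconds': 3600},
}

# Nothing specified for bands: the scf inputs are reused as they are
print(resolve_namespace(scf) == scf)

# Only the options differ for the bands step
bands = resolve_namespace(scf, {'options': {'max_wallclock_seconds': 7200}})
print(bands['parameters'], bands['options'])
```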

Similar to the VaspRelaxWorkChain, the behavior of the VaspBandsWorkChain can be controlled using the band_settings input:

from aiida.plugins import WorkflowFactory
print(WorkflowFactory('vasp.v2.bands').option_class.aiida_description())
                              symprec:  float      
                                        Default: Precision of the symmetry determination
                            band_mode:  str        
                                        Default: Mode for generating the band path. Choose from: bradcrack, pymatgen,seekpath-aiida and latimer-munro.
                band_kpoints_distance:  float      
                                        Default: Spacing for band distances for automatic kpoints generation, used by seekpath-aiida mode.
                         line_density:  float      
                                        Default: Density of the point along the path, used by the sumo interface.
                 dos_kpoints_distance:  float      
                                        Default: Kpoints for running DOS calculations in A^-1 * 2pi. Will perform non-SCF DOS calculation is supplied.
                             only_dos:  bool       
                                        Default: Flag for running only DOS calculations
                              run_dos:  bool       
                                        Default: Flag for running DOS calculations
  additional_band_analysis_parameters:  dict       
                                        Default: Additional keyword arguments for the seekpath/ interface, available keys are:  ['with_time_reversal', 'reference_distance', 'recipe', 'threshold', 'symprec', 'angle_tolerance']
                    kpoints_per_split:  int        
                                        Default: Number of kpoints per split for the band structure calculation
                 hybrid_reuse_wavecar:  bool       
                                        Default: Whether to reuse the WAVECAR from the previous relax/singlepoint calculation

The VaspHybridBandsWorkChain is a variant of the VaspBandsWorkChain for running band structure calculations with hybrid functionals. In this case, the potential is not completely determined by the electron density, hence one cannot use the standard approach that first computes the ground state electron density and then uses it to solve the Kohn-Sham equation. Instead, the Kohn-Sham equation has to be solved self-consistently, and the k-points along the path are inserted as zero-weighted k-points.

The VaspHybridBandsWorkChain is designed for this purpose. In addition, the large computational cost of hybrid functionals means it may be advantageous to split the full k-point path into smaller sub-paths and run multiple self-consistent calculations in parallel instead of a single large calculation, given the constraints of the available computing resources. The number of k-points included in each sub-path can be specified using the kpoints_per_split key of the band_settings input.

Hint

Set kpoints_per_split to a very large number to run a single self-consistent calculation with all k-points.
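The two ideas above, splitting the path and adding the path points with zero weight, can be sketched with plain lists. The function names and the toy mesh are assumptions for illustration; the chunk size plays the role of the kpoints_per_split setting listed in the band settings.

```python
def split_path(path_kpoints, kpoints_per_split):
    """Split the band path into consecutive chunks of at most `kpoints_per_split`."""
    if kpoints_per_split < 1:
        raise ValueError('kpoints_per_split must be at least 1')
    return [path_kpoints[i:i + kpoints_per_split]
            for i in range(0, len(path_kpoints), kpoints_per_split)]


def with_zero_weights(mesh, chunk):
    """Combine a weighted SCF mesh with zero-weighted path k-points."""
    return [(kpoint, weight) for kpoint, weight in mesh] + [(kpoint, 0.0) for kpoint in chunk]


# Toy SCF mesh as (k-point, weight) pairs and a toy path of six points
mesh = [((0.0, 0.0, 0.0), 0.25), ((0.5, 0.0, 0.0), 0.75)]
path = [(0.0, 0.0, i / 10) for i in range(6)]

chunks = split_path(path, 4)
print([len(chunk) for chunk in chunks])

# Each self-consistent run sees the full mesh plus one chunk of the path
kpoints = with_zero_weights(mesh, chunks[0])
print(len(kpoints), sum(weight for _, weight in kpoints))

# A very large chunk size keeps the whole path in a single calculation
print(len(split_path(path, 10**6)))
```

Since the path points carry zero weight, they do not affect the self-consistent density, which is why each chunk can be run independently.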

See this tutorial for an example of how to run the VaspBandsWorkChain.