Usage¶
Fig. 2 Overall structure of pyHMA package.
As shown in the diagram, pyHMA post-processes VASP AIMD data in two stages: first, reading vasprun.xml output file(s) (pyhma.vasp_reader module), and then process the data to compute anharmonic properties, using conventional (Conv) and harmonically-mapped averaging (HMA) approaches (pyhma.processor module).
Below is a detailed description of each stage along with an example of fcc aluminum at high pressure (\(P_{\rm lat}=114.4\) GPa; corresponds to \(V=10 A^3\)/atom) and temperature (\(1000\) K).
Attention
- It is worth emphasizing here that \(U_{\rm lat}\) , \(P_{\rm lat}\) , and \(P_{\rm qh}\) inputs needed for the anharmonic calculations (see Table 1) must be obtained using the same setting with with AIMD (e.g., system size, DFT parameters, etc.) in order to ensure having the same potential-energy surface.
- However, lattice and quasiharmonic contributions needed for computing the full property (lat+qh+ah) should be obtained from separate calculations as those components can converge at a different rates from the anharmonic one. The full property \(X\) can be then decomposed to \(X = X^{*}_{\rm lat} + X^{*}_{\rm qh} + X_{\rm ah}\), where the asterisk refers to using different DFT setting.
- To summarize points 1 and 2 above: \(X_{\rm lat}\) and \(X_{\rm qh}\) used as inputs to
pyHMAshould not be used as the lat and qh contributions of the full property. On the other hand, \(X^{*}_{\rm lat}\) and \(X^{*}_{\rm qh}\) used for computing the full property should not be used as the lat and qh input parameters forpyHMA. Mixing between these two uses will result in inconsistent and inaccurate results.
1. Interactive usage¶
Reading
In this stage, pyhma.vasp_reader.read() function parses vasprun.xml AIMD simulation file and returns a data dictionary with simulation information to be processed in the second stage.
In the below example, the AIMD simulation consists of two consecutive runs of the same simulation, where the initial configuration of the first run is the fcc lattice positions.
>>> import pyhma
>>> data = pyhma.read(['vasprun-1.xml', 'vasprun-2.xml'])
The data dictionary contains the following keys:
- box_row_vecs (Å): box edge row vectors
- num_atoms: total number of atoms
- volume_atom (Å^3/atom): average volume per atom
- basis: list of atomic fractional positions of first configuration
- position: instantaneous atomic fractional positions
- force (eV/Å): instantaneous atomic forces
- energy (eV/atom): instantaneous potential energy (E0)
- pressure (GPa): instantaneous pressure
- pressure_ig (GPa): ideal gas pressure
- timestep (fs): MD timestep
- temperature (K): NVT set temperature
This function also takes optional arguments: raw_files, force_tol, and verbose. By setting raw_files=True (default is False), the following .dat files will be generated, which contain raw data from vasprun.xml file(s). These files are only for diagnostics purposes, and not used in the process stage.
- poscar_eq.dat: initial (must be the equilibrium) POSCAR file (in fractional coordinates)
- energy.dat: instantaneous potential energy, E0 (in eV/atom)
- pressure.dat: instantaneous pressure (in GPa)
- posfor.dat: instantaneous atomic positions and forces (in Å and eV/Å)
The second argument (force_tol) is the maximum force allowed on any atom in the initial configuration (default is 0.001 eV/Å). This is used to make sure the initial configuration is the minimized (or, equilibrium). If this condition is not matched, the function will be interrupted and prints a warning with a list of atoms having large initial forces.
To print the lattice vectors, and the initial atomic positions and forces, you can add verbose=True (default is False) to the arguments.
Below is the output resulted from using all optional arguments:
>>> data = pyhma.read(['vasprun-1.xml', 'vasprun-2.xml'], raw_files=True, force_tol=0.002, verbose=True)
Reading vasprun-1.xml vasprun-2.xml
==============================================
first configuration data from vasprun-1.xml
----------------------------------------------
32 atoms (total)
Box edge (row) vectors
6.83990379 0.00000000 0.00000000
0.00000000 6.83990379 0.00000000
0.00000000 0.00000000 6.83990379
atom xyz (direct) coordinates (A) xyz forces (eV/A)
1 0.00000000 0.00000000 0.00000000 0.00000098 0.00000047 -0.00000207
2 0.50000000 0.00000000 0.00000000 -0.00000114 -0.00000068 -0.00000010
3 0.00000000 0.50000000 0.00000000 0.00000092 -0.00000001 0.00000009
4 0.50000000 0.50000000 0.00000000 -0.00000075 0.00000133 0.00000002
5 0.00000000 0.00000000 0.50000000 0.00000156 -0.00000124 0.00000148
6 0.50000000 0.00000000 0.50000000 -0.00000228 -0.00000113 0.00000060
7 0.00000000 0.50000000 0.50000000 0.00000058 0.00000088 -0.00000025
8 0.50000000 0.50000000 0.50000000 -0.00000022 0.00000147 0.00000020
9 0.25000000 0.25000000 0.00000000 0.00000626 0.00000445 0.00000093
10 0.75000000 0.25000000 0.00000000 -0.00000698 0.00000368 -0.00000093
11 0.25000000 0.75000000 0.00000000 0.00000876 -0.00000401 -0.00000012
12 0.75000000 0.75000000 0.00000000 -0.00000828 -0.00000414 -0.00000230
13 0.25000000 0.25000000 0.50000000 0.00000749 0.00000298 -0.00000160
14 0.75000000 0.25000000 0.50000000 -0.00000890 0.00000465 0.00000053
15 0.25000000 0.75000000 0.50000000 0.00000879 -0.00000420 0.00000048
16 0.75000000 0.75000000 0.50000000 -0.00000810 -0.00000479 0.00000260
17 0.00000000 0.25000000 0.25000000 0.00000300 0.00000494 0.00000561
18 0.50000000 0.25000000 0.25000000 -0.00000351 0.00000432 0.00000600
19 0.00000000 0.75000000 0.25000000 0.00000149 -0.00000550 0.00000703
20 0.50000000 0.75000000 0.25000000 -0.00000171 -0.00000398 0.00000669
21 0.00000000 0.25000000 0.75000000 0.00000072 0.00000397 -0.00000567
22 0.50000000 0.25000000 0.75000000 -0.00000041 0.00000300 -0.00000674
23 0.00000000 0.75000000 0.75000000 -0.00000135 -0.00000323 -0.00000732
24 0.50000000 0.75000000 0.75000000 0.00000067 -0.00000412 -0.00000650
25 0.25000000 0.00000000 0.25000000 0.00000892 0.00000095 0.00000758
26 0.75000000 0.00000000 0.25000000 -0.00000866 -0.00000109 0.00000571
27 0.25000000 0.50000000 0.25000000 0.00000780 -0.00000037 0.00000582
28 0.75000000 0.50000000 0.25000000 -0.00000778 0.00000087 0.00000677
29 0.25000000 0.00000000 0.75000000 0.00000777 -0.00000093 -0.00000665
30 0.75000000 0.00000000 0.75000000 -0.00000764 -0.00000253 -0.00000547
31 0.25000000 0.50000000 0.75000000 0.00000795 0.00000070 -0.00000629
32 0.75000000 0.50000000 0.75000000 -0.00000705 0.00000248 -0.00000653
Reading vasprun-1.xml ( 1 out of 2 )
Reading vasprun-2.xml ( 2 out of 2 )
>>>
Note
- The read() function can handle incomplete vasprun.xml file(s) generated from interrupted AIMD runs (by the user, or due to some time constraint). The was possible with using the recovery option of LXML parser.
- If your MD simulation starts from a thermalized/equilibrated (not lattice) configuration, you can just run a single-point energy calculation on the lattice configuration (using the same DFT parameters used with AIMD) and use the output as your
vasprun-1.xmlinput to pyHMA, followed by your thermalizedvasprun.xmlfiles.
Processing
In this stage, the data from previous step are processed to compute anharmonic properties. This is done, first, by creating a processor instance (proc) of the pyhma.processor.Processor() class, using the data dictionary and the quasiharmonic pressure (GPa) at the given \(V\) and \(T\). The class takes one optional argument, meV, to specify whether to report the energy results in meV (meV=True) or eV (meV=False, default). At this point, the proc object carries the same information exist in the data dictionary.
>>> proc = pyhma.Processor(data, pressure_qh=4.94154, meV=True)
Then, the instantaneous properties are obtained by calling the pyhma.processor.Processor.process() method, which takes two optional arguments: steps_tot and verbose. The steps_tot is the total number of MD steps to be used for ensemble averages (default is all steps found in vasprun.xml) and the verbose (default is False) directs pyHMA to print information while running.
The output is saved to a 2D array (proc.out_data attribute) of length equal to all MD steps (or, steps_tot if set) and contains four columns: Conv and HMA anharmonic energies and pressures.
>>> proc.process(steps_tot=10000, verbose=True)
Simulation data
===============
Set temperature (K): 1000.00000
Volume (A^3/atom): 10.00000
MD timestep (fs): 2.00000
Lattice energy (eV/atom): -2.21324
Harmonic energy (eV/atom): 0.12522
Lattice pressure (GPa): 114.44281
Harmonic pressure (GPa): 4.94525
Found 11036 total MD steps
Using 10000 user-set MD steps
Computing instantaneous properties ...
The method also generates energy_ah.out and pressure_ah.out output files for the instantaneous anharmonic energy (eV/atom; or meV/atom if meV=True) and pressure (GPa), respectively. Each file contains three columns; time (in fs), Conv, and HMA estimates of the property. This data is plotted below.
Fig. 3 Time vartaion of the anharmonic energy (energy_ah.out) and pressure (pressure_ah.out).
Lastly, ensemble statistics (average, uncertainty, and block correlation) are obtained using block averaging technique. This is done by invoking the pyhma.processor.Processor.get_stats() method, which takes two required arguments (steps_eq and blocksize) and one optional argument (verbose). The steps_eq is the number of MD steps used for equilibaration and blocksize is the number of MD steps in each block used for block averaging; so, steps_tot/blocksize is the number of blocks to be used. If True, the verbose flag will direct pyHMA to print samples information.
The method returns the statistics output in a form of a dictionary (stats) of four entries: Conv and HMA anharmonic energies (e_ah_conv and e_ah_hma) and pressures (p_ah_conv and p_ah_hma), each with three elements of average (avg), uncertainty (err), and adjacent blocks correlation (cor).
The output can be presented in a more user-friendly format by using pyhma.processor.Processor.print_stats() method, which yields the output shown below.
>>> stats = proc.get_stats(steps_eq=1000, blocksize=90, verbose=True)
Block averaging statistics
==========================
9000 production steps (after 1000 equilibration steps)
100 blocks (blocksize = 90 steps)
Computing statistics ...
>>> proc.print_stats(stats)
e_ah_conv (meV/atom): 2.10911 +/- 1.1e+00 cor: 0.35
e_ah_hma (meV/atom): 0.42650 +/- 4.3e-02 cor: 0.11
p_ah_conv (GPa): 0.01371 +/- 3.1e-02 cor: 0.36
p_ah_hma (GPa): -0.03419 +/- 4.1e-03 cor: 0.26
Note
- The correlation should be as small as possible (less than \(\lessapprox 0.2\)) to ensure accurate estimate of uncertainty. Although increasing the
blocksizereduces the correlations, the number of blocks should be large enough (\(\gtrapprox 50\)) to yield meaningful statistics. - The Conv and HMA should be statistically consistent, as long as the results are converged with respect to timestep. However, the above example has inconsistent results due to using relatively large timestep (\(\Delta t=2\) fs), though the HMA estimate is still accurate as it converges faster than Conv (see our JCP2018 work for details).
2. pyhma script¶
Anharmonic properties can be computed in one step from the command-line using pyhma script, which uses the same arguments as those used above, except for the use of r and v short forms of raw_files and verbose options, respectively. The usage of pyhma is given here, where the square brackets represent optional keys:
$ # Usage:
$ # pyhma --pressure_qh=qh pressure (GPa) --steps_eq=equilib. steps --blocksize=block size
$ # [--steps_tot=used steps] [--force_tol=force tolerance] [--raw_files|-r] [--meV]
$ # [--verbose|-v] vasprun-1.xml vasprun-2.xml ...
Using the pyhma script (with default option) to compute anharmonic energy and pressure of the above fcc aluminum example yields:
$ pyhma --pressure_qh=4.94525 --steps_eq=1000 --steps_tot=10000 --blocksize=90 -r --meV
vasprun-1.xml vasprun-2.xml
e_ah_conv (meV/atom): 2.10911 +/- 1.1e+00 cor: 0.35
e_ah_hma (meV/atom): 0.42650 +/- 4.3e-02 cor: 0.11
p_ah_conv (GPa): 0.01371 +/- 3.1e-02 cor: 0.36
p_ah_hma (GPa): -0.03419 +/- 4.1e-03 cor: 0.26
3. Parameters table¶
The Table below gives a summary of both required and optional arguments used by pyHMA.
