Welcome to the Mapped-Averaging project

This project is a Python implementation of the Mapped-Averaging method for precise estimation of ensemble averages using molecular simulation.

_images/ep_ah_col.png

Theory

Absolute free energy

The classical Helmholtz configurational free energy \(A\) of a system at temperature \(T\) and volume \(V\) is related to its configurational partition function \(Q\) via:

\[A = -k_{\rm B}T \ln{Q} \qquad {\rm where} \qquad Q = \int_{V} e^{-\beta U\left({\bf x}\right)} {\rm d} {\bf x}\]

with \({\bf x}\) representing coordinates of all atoms. For simplicity, from now on we will be using unitless energy \({\cal U}\equiv \beta U\) and free energy \({\cal A}\equiv \beta A\) (and their derivatives; e.g., force), where \(\beta = 1/k_{\rm B}T\).

Free energy derivatives

Derivative of free energy w.r.t external perturbation or distortion (e.g., temperature or volume) is related to material properties. For example, average energy \(U\) and pressure \(P\) are given by

\[U = \partial_{\beta}{\cal A} \qquad {\rm and} \qquad P = -k_{\rm B}T \; \partial_{V}{\cal A}\]

(using \(\partial_{\nu}\) to represent the derivative operator \(\equiv\frac{\partial}{\partial \nu}\)). To get general expression of free energy derivative, we will use the vector \({\bf \lambda}\) to represent all perturbations of interest; e.g., \(\lambda=\left(\beta, V\right)\). The unitless free energy \({\cal A}\) at some \(\lambda\) is then given by

\[{\cal A}\left(\lambda\right) = - \ln{Q\left(\lambda\right)}\]

and its first and second derivatives are given by

\[\partial_{\nu}{\cal A} = -\frac{\partial_{\nu} Q}{Q} \qquad {\rm and} \qquad \partial_{\mu\nu}{\cal A} = -\frac{\partial_{\mu\nu}Q }{Q} + \frac{\partial_{\nu} Q}{Q} \; \frac{\partial_{\mu} Q}{Q}\]

To get \(Q\left(\lambda\right)\) derivatives, we need to recognize that, in general, the integration boundary \(\Omega\) is a function of \(\lambda\),

\[Q\left(\lambda\right) = \int_{\Omega\left(\lambda\right)} e^{-{\cal U}\left({\bf y},\lambda\right)} {\rm d} {\bf y}\]

where \({\bf y}\) represents the new coordinates that span the general phase space \(\Omega\left(\lambda\right)\), at some general \(\lambda\). However, we can use the change of variables technique to carry out the integration over some \(\lambda\)-independent phase-space boundary (we will use \(\Omega(\lambda)=V\)) and use the Jacobian determinant \(J\) to transform from the \(\bf x\) configurations (which span \(V\)) to the mapped coordinates \(\bf y({\bf x},\lambda)\), using \({\rm d}{\bf y} = J {\rm d}{\bf x}\)

\[Q\left(\lambda\right) = \int_{V} e^{-{\cal U}\left({\bf y}\left({\bf x},\lambda\right),\lambda\right)} J {\rm d} {\bf x}\]

now, \(Q\) can be written as a function of the “normal” (current) coordinates \(\bf x\)

\[Q\left(\lambda\right) = \int_{V} e^{-{\cal U'}} {\rm d} {\bf x}\]

where we define a modified potential \({\cal U'} \equiv {\cal U} - \ln{J}\), which accounts for the mapping. Now, the \(Q\) derivatives can be easily evaluated

\[\begin{split}\partial_{\nu} Q &=& - \int_{V} e^{-{\cal U'}} \; {\rm D}_{\nu} {\cal U'} \;\; {\rm d}{\bf x}\\ \partial_{\mu\nu}Q &=& \int_{V} e^{-{\cal U'}}\left[ \left({\rm D}_{\nu} {\cal U'}\right) \left({\rm D}_{\mu} {\cal U'}\right) - {\rm D}_{\mu\nu} {\cal U'} \right] \; {\rm d}{\bf x}\end{split}\]

where we used the \({\rm D}_{\nu}\) operator on some function \(f({\bf y}({\bf x},\lambda),\lambda)\) to represent the total (Lagrangian or material) derivative (i.e., \({\rm D}_{\nu} f = \partial_{\nu} f + \left(\partial_{\nu} {\bf y}\right) \cdot \nabla f\)). Accordingly, the derivatives of \({\cal A}\) are simply given by

\[\partial_{\nu}{\cal A} = \; \left< {\rm D}_{\nu} {\cal U'} \right> \qquad {\rm and} \qquad \partial_{\mu\nu}{\cal A} = \; \left< {\rm D}_{\mu\nu} {\cal U'} \right> - {\rm Cov}\left({\rm D}_{\nu} {\cal U'} \;,\; {\rm D}_{\mu} {\cal U'} \right)\]

where \({\rm Cov}\left(X,Y\right)\equiv \left<XY\right> - \left<X\right> \left<Y\right>\) is the covariance between the stochastic variables \(X\) and \(Y\). We are left with evaluating derivatives of \({\cal U'}\), which are related to \(U\) derivatives via \({\rm D}_{\nu} {\cal U'} = {\rm D}_{\nu} {\cal U} - {\rm D}_{\nu} J \; {\rm and} \; {\rm D}_{\mu\nu} {\cal U'} = {\rm D}_{\mu\nu} {\cal U} - {\rm D}_{\mu\nu} J\).

  1. Evaluation of energy derivatives

First, \(\cal U\) derivatives can be directly evaluated using the relation between the total (Lagrangian) and partial (Eulerian) derivatives:

\[{\rm D}_{\nu} {\cal U} = \partial_{\nu} {\cal U} - {\cal F} \cdot {\dot {\bf x}}^{\nu}\]

where \({\cal F}\equiv -\nabla {\cal U}=-\beta \nabla U\) is the force vector on all atoms and \({\dot {\bf x}}^{\nu}\equiv \partial_{\nu} {\bf y}\) represents the mapping “velocity” of the external perturbation \(\nu\). Applying this operator twice, we get the second derivative

\[{\rm D}_{\mu\nu}{\cal U} = \partial_{\mu\nu} {\cal U} - \left({\ddot {\bf x}}^{\mu\nu} + {\dot {\bf x}}^{\mu}\cdot \nabla {\dot {\bf x}}^{\nu} \right)\cdot {\cal F} + {\dot {\bf x}}^{\nu} \cdot {\Phi} \cdot {\dot {\bf x}}^{\mu} - \left({\dot {\bf x}}^{\nu} \cdot \partial_{\mu} {\cal F} + {\dot {\bf x}}^{\mu} \cdot \partial_{\nu} {\cal F} \right)\]

where \({\Phi}\equiv \nabla \nabla {\cal U} = \beta \nabla \nabla {\cal U}\;\) is the force constant matrix and \({\ddot {\bf x}}^{\mu\nu} \equiv \partial_{\mu} {\dot {\bf x}}^{\nu}\) is the \(\mu\nu\) “acceleration”, or the rate of change of \({\dot {\bf x}}^{\nu}\) w.r.t. \(\mu\) (note that \(\;{\ddot {\bf x}}^{\mu\nu} \neq {\ddot {\bf x}}^{\nu\mu}\)).


  1. Evaluation of Jacobian derivatives

Now, in order to get \({\rm D}_{\nu}J\), we need to do two steps. First, perform differentiation of the phase space volume, using (again) the change of variables technique

\[\partial_{\nu} {\Omega(\lambda)} = \partial_{\nu} \int_{\Omega(\lambda)} 1\; {\rm d} {\bf y} = \partial_{\nu} \int_{V} {\rm D}_{\nu}J \; {\rm d} {\bf x}\]

Second,use the Reynolds transport theorem along with the divergence theorem in our multidimensional space

\[\partial_{\nu} \int_{\Omega(\lambda)} f\left({\bf y},\lambda\right){\rm d} {\bf y} = \int_{\Omega(\lambda)} \left[\partial_{\nu} f + \nabla \cdot \left({\dot {\bf x}}^{\nu} f\right)\right] {\rm d} {\bf y}\]

Applying this theorem to our case of interest (i.e., \(f=1\); hence, \(\partial_{\nu}f=0\)), we get

\[\partial_{\nu} \int_{\Omega(\lambda)} 1\; {\rm d} {\bf y} = \int_{\Omega(\lambda)} \nabla \cdot \left({\dot {\bf x}}^{\nu} f\right) {\rm d} {\bf y} = \int_{V} \nabla \cdot \left({\dot {\bf x}}^{\nu} f\right) J {\rm d} {\bf x}\]

where we used the change of variables in the last term on the right-hand side. Now, equating both derivatives we directly get an expression for \({\rm D}_{\nu}J\)

\[{\rm D}_{\nu}J = J \nabla \cdot {\dot {\bf x}}^{\nu}\]

Repeating the same process with another derivative w.r.t. \(\mu\), we directly get

\[{\rm D}_{\mu\nu}J = J \left[\nabla \cdot \left(\partial_{\mu}{\dot {\bf x}}^{\nu}\right) + {\dot {\bf x}}^{\mu}\cdot \nabla\left(\nabla\cdot{\dot {\bf x}}^{\nu}\right)\right]\]

Since we are interested in evaluating the derivatives at \({\bf y}={\bf x}\), then \(J=1\); hence \({\rm D}_{\nu}J = \nabla \cdot {\dot {\bf x}}^{\nu}\) and \({\rm D}_{\mu\nu}J = \nabla \cdot \left(\partial_{\mu}{\dot {\bf x}}^{\nu}\right) + {\dot {\bf x}}^{\mu}\cdot \nabla\left(\nabla\cdot{\dot {\bf x}}^{\nu}\right)\).

Mapping velocity

Since \(Q\) is a function only of \(\lambda\), average free energy derivatives do not depend on how \({\bf x}\) get mapped into the \({\bf y}\) coordinates; or, in other words, they do not depend on the mapping velocity \({\dot {\bf x}}^{\nu}\). However, the fluctuations (or uncertainty) in these averages do depend on the mapping. Therefore, for the purposes of molecular simulation measurements we need to choose \({\dot {\bf x}^{\nu}}\) that reduces the stochastic uncertainty as much as possible.

To develop such a mapping we need to recognize that free energy derivatives are given as ensemble averages over \({\rm D}_{\nu} {\cal U'}\) (and its derivative, \({\rm D}_{\mu\nu} {\cal U'}\)). Therefore, a perfect mapping is such that \({\rm D}_{\nu} {\cal U'}\) is independent on coordinates \(\bf x\); hence

\[\partial_{\nu}{\cal A} = \; \left< {\rm D}_{\nu} {\cal U'} \right> = {\rm D}_{\nu} {\cal U'}\]

Using the above energy and Jacobian derivatives, we get

\[\partial_{\nu}{\cal A} = \partial_{\nu} {\cal U} - \nabla \cdot {\dot {\bf x}}^{\nu} - {\cal F}\cdot {\dot {\bf x}}^{\nu}\]

Solving this equation yields the unique mapping that yields no fluctuations; however, there are two problems. First of all, \(\partial_{\nu}{\cal A}\) is the very quantity that we need to measure. Second, since \({\dot {\bf x}}^{\nu}\) is a multidimensional vector (\(3N\) for the case of atomic systems) we have under-determined system as we only have one equation to solve.

The first problem is solved using the fact that \({\dot {\bf x}}^{\nu}\) does not affect average estimates; hence, it can be derived from another (known) system that approximates \({\mathcal A}\), which we will call reference.

\[\partial_{\nu}{\cal A}^{\rm ref} = \partial_{\nu} {\cal U}^{\rm ref} - \nabla \cdot {\dot {\bf x}}^{\nu} - {\cal F}^{\rm ref}\cdot {\dot {\bf x}}^{\nu}\]

where \(\partial_{\nu}{\cal A}^{\rm ref}\) is a reference-dependent constant (function only of \(\lambda\)), named \(c\). Because the reference only approximates \({\mathcal A}\), the \({\dot {\bf x}}^{\nu}\) obtained from this formula will not yield a zero-fluctuation average; however, if the reference is a good approximation, we can expect substantially smaller fluctuations in the average.

To address the second problem, we will assert that each degree of freedom (dof) is mapped with the same amount (scaling); so

\[\partial_{\nu} {\cal u}^{\rm ref} - \partial_{x} {\dot x}^{\nu} - {\cal f}^{\rm ref} {\dot x}^{\nu} = \partial_{\nu}{\cal a}^{\rm ref} \equiv c(\lambda)\]

where small symbols represent an intensive quantities (e.g., \(u\equiv U/{\rm dof}\)), and \(x\) is one of the coordinates of \({\bf x}\). For a given \(\lambda\), this is a standard first-order differential equation, with the unknown being the velocity of mapping \({\dot x}(x,\lambda)\). For simplicity, we will drop the \(\lambda\) dependency from all terms, hence

\[\partial_{x} {\dot x}^{\nu}\left( x\right) + {\cal f}\left( x\right)^{\rm ref} {\dot x}^{\nu}\left( x\right) = \partial_{\nu}{\cal u}\left( x\right)^{\rm ref} - c \equiv g\left( x\right)\]

where \(g(x)\) is a known function once a reference system is chosen. The solution of this equation is given by

\[{\dot x}^{\nu} = e^{-I(x)} \left(\int g \; e^{I(x)}{\rm d}x + {\rm constant} \right)\]

where \(I(x) \equiv \int f(x)^{\rm ref} {\rm d}x\). The integration constant can be evaluated by requiring the mapping to have some value at some coordinate \(x\).

Application to Crystals

For crystalline systems, a reference for mapping can be chosen to be non-interacting harmonic system, or Einstein crystal (EC). Using the offset from the lattice sites (i.e., \(r-r_{\rm lattice}\)) to represent our coordinate \(x\), the potential energy for each dof is then given by

\[u^{\rm ref} = \alpha(\lambda) x^2\]

hence, the free energy is given by

\[a^{\rm ref} = \frac{1}{2} \ln \left(\alpha(\lambda)/\pi\right)\]

Accordingly, \(f^{\rm ref} = -2 \alpha(\lambda) x\;\) , \(\; I=- \alpha(\lambda) x^2\), and \(g=(\partial_{\nu}\alpha)\left[ x^2 - 1/2\alpha\right]\). Requiring the mapping velocity to vanish when atoms are at their lattice sites; i.e., \({\dot x}^{\nu} =0\) at \(x = 0\), the solution of the velocity mapping equation is reduced to

\[\begin{split}{\dot x}^{\nu} &=& \; e^{-I(x)} \int_{0}^{x} g \; e^{I(x)}{\rm d}x \\ &=& \; e^{\alpha(\lambda) x^2} \int_{0}^{x} g \; e^{-\alpha(\lambda) x^2}{\rm d}x \\ &=& \; - \frac{\partial_{\nu} \alpha}{2 \alpha} \; x\end{split}\]

We will now consider two cases: temperature and volume free energy derivative; or energy and pressure, consequently. Note that all energy quantities (and its derivatives, like forces) are multiplied by \(\beta\); hence, the force constant \(\alpha \rightarrow \beta\alpha\), where \(\alpha\) is now temperature independent.

  • case 1: \(\nu = \beta\)
\[\partial_{\beta} (\beta\alpha) = \alpha\]

hence,

\[{\dot x}^{\beta} = -\frac{1}{2\beta} \; x\]
  • case 2: \(\nu = V\)

Here, we need to use the \(L\)-scaled coordinates (i.e., \(x\rightarrow x/L\)); hence \(\alpha \rightarrow L^2 \beta\alpha = V^{2/3}\beta\alpha(V)\)

\[\begin{split}{\dot x}^{V} &=& - \frac{1}{2} \frac{\partial_{\nu} (V^{2/3}\alpha(V))}{V^{2/3}\alpha(V)} \; x \\ &=& \left(-\frac{1}{2} \frac{\partial_{V}\alpha(V)}{\alpha(V)} - \frac{1}{3V}\right) \; x \\ &=& \left(\beta\, p^{\rm harm}- \frac{1}{3V}\right) \; x\end{split}\]

Where \(p^{\rm harm}\) is the harmonic pressure per each dof. For more details you can refer to our PRE and JCTC work.

Anharmonic energy

Conventional (no mapping):

\[U^{\rm ah} = \left< U \right> - \frac{d(N-1)}{2} k_{\rm B} T - U^{\rm lat}\]

Mapped averaging (Einstein crystal reference):

\[U^{\rm ah} = \left< U + \frac{1}{2} {\bf F}\cdot\Delta{\bf r}\right> - U^{\rm lat}\]

Anharmonic pressure

Conventional (uniform scaling):

\[P^{\rm ah} = \left< P^{\rm vir} \right> + \rho k_{\rm B}T - P^{\rm qh} - P^{\rm lat}\]

Mapped averaging (Einstein crystal reference):

\[P^{\rm ah} = \left< P^{\rm vir} + c \; {\bf F}\cdot\Delta{\bf r} \right> - P^{\rm lat}\]

where \(c\) is a constant and given by, \(c = \frac{\beta P^{\rm qh} - \rho}{d\left(N-1\right)}\)

The formulas for both anharmonic energy and pressure are summarized in Figure 1.

_images/pyhma_eqs.png

Conventional and HMA formulas for anharmonic energy and pressure.

Equivalence of Conv and HMA

The equivalence between both conventional and mapped-averaging expressions can be easily seen by recognizing this equality for crystalline systems:

\[\left<{\bf F}\cdot\Delta{\bf r} \right> = - d\left(N-1\right) k_{\rm B} T\]

Plugging this expression into the HMA expressions yields the conventional average expression.

Proof:

The general expression for the configurational partition function is given by:

\[Q = \int e^{-\beta U} {\rm d} {\bf x}\]

For crystalline systems, we use \(\Delta {\bf x} \equiv {\bf x} - {\bf x}^{\rm lat}\)

\[Q = \int_{\rm WS} e^{-\beta U} \, {\rm d}^{dN}\Delta x\]

Where the integration is carried out withing the Wigner-Seitz (WS) volume of each atom. This can be written as

\[Q = \int_{\rm WS} {\rm d}^{dN-1}\Delta x \; \int_{\rm WS} e^{-\beta U} \, {\rm d} \Delta x_1\]

Using integration by parts:

\[Q = \int_{\rm WS} \; \left[\Delta x_1 e^{-\beta U}\right]_{\Delta x_1^{\rm min}}^{\Delta x_1^{\rm max}} {\rm d}^{dN-1}\Delta x \;\; -\beta \int_{\rm WS} F_1 \Delta x_1 \; e^{-\beta U} {\rm d}^{dN}\Delta x\]

The surface (first) term on the right-hand side vanishes due to large values of \(U\) at the surface of the WS volume. Dividing by Q, we finally get:

\[\left<F_1 \Delta x_1 \right> = - k_{\rm B} T\]

For \(d(N-1)\) degrees-of-freedom, we get: \(\left<{\bf F}\cdot\Delta{\bf r} \right> = - d\left(N-1\right) k_{\rm B} T\)

pyHMA

pyHMA is a VASP post-processor (written in Python 3) for precise measurment of crystalline anharmonic properties using Harmonically Mapped Averaging (HMA) method. It is based on post-processing vasprun.xml output file(s) obtained from NVT Born-Oppenheimer ab initio molecular dynamics (AIMD) simulation. See Table 1 as an example for HMA expressions for anharmonic energy and pressure, along with direct/conventional (Conv) counterpart.

pyHMA is free software: you can modify and/or redistribute it under the terms of the Mozilla Public License (MPL 2.0).

Please cite this paper when using pyHMA package in your research:

Sabry G. Moustafa, Apoorva Purohit, Andrew J. Schultz, and David A. Kofke, pyHMA: A VASP Post-processor for Precise Measurement of Crystalline Anharmonic Properties using Harmonically Mapped Averaging, Comput. Phys. Commun., 2020.

Note

  • The term anharmonicity is commonly used in literature to qualitatively describe a system with no equilibrium configuration at \(0\) K (i.e., imaginary frequencies); in other words, it refers to a “non-harmonic” potential energy surface.
  • Here, however, we define anharmonic contribution of some property \(X\) as the residual in excess of the harmonic approximation; i.e., \(X_{\rm ah} \equiv X - (X_{\rm lat} + X_{\rm qh})\). Therefore, this specific definition is meaningless if the system does not have equilibrium lattice configuration at \(T=0\) K. For this reason, pyHMA checks forces on the first configuration to make sure the system has an equilibrium configuration (i.e., zero forces).

Installation

Attention

pyHMA is written in Python 3 syntax; hence, only compatible pip installer can be used.

From PyPI (Recommended)

pyHMA can be directlly installed from Python Package Index (PyPI) using pip command:

$ pip install pyhma

From Github (development version)

First, clone the source code from the Github repository:

$ git clone https://github.com/etomica/mapped-averaging.git

and then, go to the mapped-averaging/pyhma directory to install pyHMA locally using pip command:

$ pip install --user -e .

Usage

_images/pyhma_fig.png

Overall structure of pyHMA package.

As shown in the diagram, pyHMA post-processes VASP AIMD data in two stages: first, reading vasprun.xml output file(s) (pyhma.vasp_reader module), and then process the data to compute anharmonic properties, using conventional (Conv) and harmonically-mapped averaging (HMA) approaches (pyhma.processor module). Below is a detailed description of each stage along with an example of fcc aluminum at high pressure (\(P_{\rm lat}=114.4\) GPa; corresponds to \(V=10 A^3\)/atom) and temperature (\(1000\) K).

Attention

  1. It is worth emphasizing here that \(U_{\rm lat}\) , \(P_{\rm lat}\) , and \(P_{\rm qh}\) inputs needed for the anharmonic calculations (see Table 1) must be obtained using the same setting with with AIMD (e.g., system size, DFT parameters, etc.) in order to ensure having the same potential-energy surface.
  2. However, lattice and quasiharmonic contributions needed for computing the full property (lat+qh+ah) should be obtained from separate calculations as those components can converge at a different rates from the anharmonic one. The full property \(X\) can be then decomposed to \(X = X^{*}_{\rm lat} + X^{*}_{\rm qh} + X_{\rm ah}\), where the asterisk refers to using different DFT setting.
  3. To summarize points 1 and 2 above: \(X_{\rm lat}\) and \(X_{\rm qh}\) used as inputs to pyHMA should not be used as the lat and qh contributions of the full property. On the other hand, \(X^{*}_{\rm lat}\) and \(X^{*}_{\rm qh}\) used for computing the full property should not be used as the lat and qh input parameters for pyHMA. Mixing between these two uses will result in inconsistent and inaccurate results.

1. Interactive usage

Reading

In this stage, pyhma.vasp_reader.read() function parses vasprun.xml AIMD simulation file and returns a data dictionary with simulation information to be processed in the second stage. In the below example, the AIMD simulation consists of two consecutive runs of the same simulation, where the initial configuration of the first run is the fcc lattice positions.

>>> import pyhma
>>> data = pyhma.read(['vasprun-1.xml', 'vasprun-2.xml'])

The data dictionary contains the following keys:

  • box_row_vecs (Å): box edge row vectors
  • num_atoms: total number of atoms
  • volume_atom (Å^3/atom): average volume per atom
  • basis: list of atomic fractional positions of first configuration
  • position: instantaneous atomic fractional positions
  • force (eV/Å): instantaneous atomic forces
  • energy (eV/atom): instantaneous potential energy (E0)
  • pressure (GPa): instantaneous pressure
  • pressure_ig (GPa): ideal gas pressure
  • timestep (fs): MD timestep
  • temperature (K): NVT set temperature

This function also takes optional arguments: raw_files, force_tol, and verbose. By setting raw_files=True (default is False), the following .dat files will be generated, which contain raw data from vasprun.xml file(s). These files are only for diagnostics purposes, and not used in the process stage.

  • poscar_eq.dat: initial (must be the equilibrium) POSCAR file (in fractional coordinates)
  • energy.dat: instantaneous potential energy, E0 (in eV/atom)
  • pressure.dat: instantaneous pressure (in GPa)
  • posfor.dat: instantaneous atomic positions and forces (in Å and eV/Å)

The second argument (force_tol) is the maximum force allowed on any atom in the initial configuration (default is 0.001 eV/Å). This is used to make sure the initial configuration is the minimized (or, equilibrium). If this condition is not matched, the function will be interrupted and prints a warning with a list of atoms having large initial forces.

To print the lattice vectors, and the initial atomic positions and forces, you can add verbose=True (default is False) to the arguments.

Below is the output resulted from using all optional arguments:

>>> data = pyhma.read(['vasprun-1.xml', 'vasprun-2.xml'], raw_files=True, force_tol=0.002, verbose=True)

Reading vasprun-1.xml vasprun-2.xml
==============================================
 first configuration data from vasprun-1.xml
 ----------------------------------------------
 32 atoms (total)
 Box edge (row) vectors
  6.83990379   0.00000000   0.00000000
  0.00000000   6.83990379   0.00000000
  0.00000000   0.00000000   6.83990379

 atom       xyz (direct) coordinates (A)                   xyz forces (eV/A)
   1    0.00000000   0.00000000   0.00000000      0.00000098   0.00000047  -0.00000207
   2    0.50000000   0.00000000   0.00000000     -0.00000114  -0.00000068  -0.00000010
   3    0.00000000   0.50000000   0.00000000      0.00000092  -0.00000001   0.00000009
   4    0.50000000   0.50000000   0.00000000     -0.00000075   0.00000133   0.00000002
   5    0.00000000   0.00000000   0.50000000      0.00000156  -0.00000124   0.00000148
   6    0.50000000   0.00000000   0.50000000     -0.00000228  -0.00000113   0.00000060
   7    0.00000000   0.50000000   0.50000000      0.00000058   0.00000088  -0.00000025
   8    0.50000000   0.50000000   0.50000000     -0.00000022   0.00000147   0.00000020
   9    0.25000000   0.25000000   0.00000000      0.00000626   0.00000445   0.00000093
  10    0.75000000   0.25000000   0.00000000     -0.00000698   0.00000368  -0.00000093
  11    0.25000000   0.75000000   0.00000000      0.00000876  -0.00000401  -0.00000012
  12    0.75000000   0.75000000   0.00000000     -0.00000828  -0.00000414  -0.00000230
  13    0.25000000   0.25000000   0.50000000      0.00000749   0.00000298  -0.00000160
  14    0.75000000   0.25000000   0.50000000     -0.00000890   0.00000465   0.00000053
  15    0.25000000   0.75000000   0.50000000      0.00000879  -0.00000420   0.00000048
  16    0.75000000   0.75000000   0.50000000     -0.00000810  -0.00000479   0.00000260
  17    0.00000000   0.25000000   0.25000000      0.00000300   0.00000494   0.00000561
  18    0.50000000   0.25000000   0.25000000     -0.00000351   0.00000432   0.00000600
  19    0.00000000   0.75000000   0.25000000      0.00000149  -0.00000550   0.00000703
  20    0.50000000   0.75000000   0.25000000     -0.00000171  -0.00000398   0.00000669
  21    0.00000000   0.25000000   0.75000000      0.00000072   0.00000397  -0.00000567
  22    0.50000000   0.25000000   0.75000000     -0.00000041   0.00000300  -0.00000674
  23    0.00000000   0.75000000   0.75000000     -0.00000135  -0.00000323  -0.00000732
  24    0.50000000   0.75000000   0.75000000      0.00000067  -0.00000412  -0.00000650
  25    0.25000000   0.00000000   0.25000000      0.00000892   0.00000095   0.00000758
  26    0.75000000   0.00000000   0.25000000     -0.00000866  -0.00000109   0.00000571
  27    0.25000000   0.50000000   0.25000000      0.00000780  -0.00000037   0.00000582
  28    0.75000000   0.50000000   0.25000000     -0.00000778   0.00000087   0.00000677
  29    0.25000000   0.00000000   0.75000000      0.00000777  -0.00000093  -0.00000665
  30    0.75000000   0.00000000   0.75000000     -0.00000764  -0.00000253  -0.00000547
  31    0.25000000   0.50000000   0.75000000      0.00000795   0.00000070  -0.00000629
  32    0.75000000   0.50000000   0.75000000     -0.00000705   0.00000248  -0.00000653

  Reading vasprun-1.xml  ( 1 out of 2 )
  Reading vasprun-2.xml  ( 2 out of 2 )
>>>

Note

  • The read() function can handle incomplete vasprun.xml file(s) generated from interrupted AIMD runs (by the user, or due to some time constraint). The was possible with using the recovery option of LXML parser.
  • If your MD simulation starts from a thermalized/equilibrated (not lattice) configuration, you can just run a single-point energy calculation on the lattice configuration (using the same DFT parameters used with AIMD) and use the output as your vasprun-1.xml input to pyHMA, followed by your thermalized vasprun.xml files.

Processing

In this stage, the data from previous step are processed to compute anharmonic properties. This is done, first, by creating a processor instance (proc) of the pyhma.processor.Processor() class, using the data dictionary and the quasiharmonic pressure (GPa) at the given \(V\) and \(T\). The class takes one optional argument, meV, to specify whether to report the energy results in meV (meV=True) or eV (meV=False, default). At this point, the proc object carries the same information exist in the data dictionary.

>>> proc = pyhma.Processor(data, pressure_qh=4.94154, meV=True)

Then, the instantaneous properties are obtained by calling the pyhma.processor.Processor.process() method, which takes two optional arguments: steps_tot and verbose. The steps_tot is the total number of MD steps to be used for ensemble averages (default is all steps found in vasprun.xml) and the verbose (default is False) directs pyHMA to print information while running. The output is saved to a 2D array (proc.out_data attribute) of length equal to all MD steps (or, steps_tot if set) and contains four columns: Conv and HMA anharmonic energies and pressures.

>>> proc.process(steps_tot=10000, verbose=True)

   Simulation data
   ===============
    Set temperature       (K): 1000.00000
    Volume         (A^3/atom):   10.00000
    MD timestep          (fs):    2.00000
    Lattice energy  (eV/atom):   -2.21324
    Harmonic energy (eV/atom):    0.12522
    Lattice pressure    (GPa):  114.44281
    Harmonic pressure   (GPa):    4.94525

    Found 11036  total MD steps
    Using 10000  user-set MD steps

    Computing instantaneous properties ...

The method also generates energy_ah.out and pressure_ah.out output files for the instantaneous anharmonic energy (eV/atom; or meV/atom if meV=True) and pressure (GPa), respectively. Each file contains three columns; time (in fs), Conv, and HMA estimates of the property. This data is plotted below.

_images/ep_ah.png

Time vartaion of the anharmonic energy (energy_ah.out) and pressure (pressure_ah.out).

Lastly, ensemble statistics (average, uncertainty, and block correlation) are obtained using block averaging technique. This is done by invoking the pyhma.processor.Processor.get_stats() method, which takes two required arguments (steps_eq and blocksize) and one optional argument (verbose). The steps_eq is the number of MD steps used for equilibaration and blocksize is the number of MD steps in each block used for block averaging; so, steps_tot/blocksize is the number of blocks to be used. If True, the verbose flag will direct pyHMA to print samples information.

The method returns the statistics output in a form of a dictionary (stats) of four entries: Conv and HMA anharmonic energies (e_ah_conv and e_ah_hma) and pressures (p_ah_conv and p_ah_hma), each with three elements of average (avg), uncertainty (err), and adjacent blocks correlation (cor). The output can be presented in a more user-friendly format by using pyhma.processor.Processor.print_stats() method, which yields the output shown below.

>>> stats = proc.get_stats(steps_eq=1000, blocksize=90, verbose=True)

Block averaging statistics
==========================
 9000 production steps (after 1000 equilibration steps)
 100 blocks (blocksize = 90  steps)

 Computing statistics ...

>>> proc.print_stats(stats)

 e_ah_conv (meV/atom):    2.10911 +/- 1.1e+00    cor: 0.35
 e_ah_hma  (meV/atom):    0.42650 +/- 4.3e-02    cor: 0.11
 p_ah_conv      (GPa):    0.01371 +/- 3.1e-02    cor: 0.36
 p_ah_hma       (GPa):   -0.03419 +/- 4.1e-03    cor: 0.26

Note

  • The correlation should be as small as possible (less than \(\lessapprox 0.2\)) to ensure accurate estimate of uncertainty. Although increasing the blocksize reduces the correlations, the number of blocks should be large enough (\(\gtrapprox 50\)) to yield meaningful statistics.
  • The Conv and HMA should be statistically consistent, as long as the results are converged with respect to timestep. However, the above example has inconsistent results due to using relatively large timestep (\(\Delta t=2\) fs), though the HMA estimate is still accurate as it converges faster than Conv (see our JCP2018 work for details).

2. pyhma script

Anharmonic properties can be computed in one step from the command-line using pyhma script, which uses the same arguments as those used above, except for the use of r and v short forms of raw_files and verbose options, respectively. The usage of pyhma is given here, where the square brackets represent optional keys:

$ # Usage:
$ # pyhma --pressure_qh=qh pressure (GPa) --steps_eq=equilib. steps --blocksize=block size
$ #      [--steps_tot=used steps] [--force_tol=force tolerance] [--raw_files|-r] [--meV]
$ #      [--verbose|-v] vasprun-1.xml vasprun-2.xml ...

Using the pyhma script (with default option) to compute anharmonic energy and pressure of the above fcc aluminum example yields:

$ pyhma --pressure_qh=4.94525 --steps_eq=1000 --steps_tot=10000 --blocksize=90  -r --meV
        vasprun-1.xml  vasprun-2.xml

  e_ah_conv (meV/atom):    2.10911 +/- 1.1e+00    cor: 0.35
  e_ah_hma  (meV/atom):    0.42650 +/- 4.3e-02    cor: 0.11
  p_ah_conv      (GPa):    0.01371 +/- 3.1e-02    cor: 0.36
  p_ah_hma       (GPa):   -0.03419 +/- 4.1e-03    cor: 0.26

3. Parameters table

The Table below gives a summary of both required and optional arguments used by pyHMA.

_images/parameters_tab.png

Modules

pyhma.vasp_reader

pyhma.processor

Indices and tables