QMMM workflow using LAMMPS and VOTCA-XTP

What is this tutorial about

In this tutorial, we will learn how to set and perform excited state calculation using the Votca XTP library. We will use thiophene as our QM region.

Requirements

  • You will need to install VOTCA using the instructions described here

  • Once the installation is completed you need to activate the VOTCA enviroment by running the VOTCARC.bash script that has been installed at the bin subfolder for the path that you have provided for the installation step above

Interacting with the XTP command line interface

The XTP package offers the following command line interface that the user can interact with: * xtp_map * xtp_parallel * xtp_run * xtp_tools

Run the following command to view the help message of xtp_tools:

[1]:
!xtp_tools -h
/bin/sh: 1: xtp_tools: not found

Note

  • In Jupyter the ! symbol means: run the following command as a standard unix command

  • In Jupyter the command %env set an environmental variable

Setting the environment

Remove previous hdf5 file

[2]:
!rm -f state.hdf5

Generate the topology from the Gromacs file

We will first generate the mapping from MD coordinates to segments, creating an hdf5 file to store the results. You can explore the generated state.hdf5 file with e.g. hdf5itebrowser. In Python, you can use the h5py library. The command to generate the mapping is the following,

[3]:
!xtp_map -v -t MD_FILES/newfile.data -c MD_FILES/traj.dump -s system.xml -f state.hdf5 -i 99 > mapping.out
/bin/sh: 1: xtp_map: not found

Check the mapping

Let us first output .pdb files for the segments, qmmolecules and classical segments in order to check the mapping. So we have to pass the calculator the filename. Votca has two ways to specify options for calculators. Using a file with the -o option or for quick things using the -c option on the command line, we will use both.

In the mapchecker section of the manual you can find a table with the mapchecker input variables and their corresponding defaults. Finally, the following command run the check

[4]:
!xtp_run -e mapchecker -c map_file=system.xml -f state.hdf5
/bin/sh: 1: xtp_run: not found

Neighborlist Calculation

The following step is to determine the neighbouring pairs for exciton transport.

We will use a cutoff of 1.5 nm. If you want to have a look at an option just the -d option with the calculator name

[5]:
!xtp_run -d neighborlist
/bin/sh: 1: xtp_run: not found
[6]:
!xtp_run -e neighborlist -c constant=1.5 -f state.hdf5
/bin/sh: 1: xtp_run: not found

Read reorganization energies

In this step we will read the in site reorganization energies and store them in the state.hdf5 file. We just need to copy the input file and execute the calculation.

[7]:
!xtp_run -e einternal -f state.hdf5
/bin/sh: 1: xtp_run: not found

Compute site energy

In this step we will perform some QMMM calculations to compute the site energies. The qmmm_mm.xml file contains more options to perform the MM calculations.

[8]:
!xtp_parallel -e qmmm -o qmmm_mm.xml -f state.hdf5 -j "write"
/bin/sh: 1: xtp_parallel: not found

The previous command generates a qmmm_mm_jobs.xml containing 3000 MM jobs to compute, if you examine that file, it should look something like:

<jobs>
        <job>
                <id>0</id>
                <tag>thiophene_0:n</tag>
                <input>
                        <site_energies>0:n</site_energies>
                        <regions>
                                <region>
                                        <id>0</id>
                                        <segments>0:n</segments>
                                </region>
                        </regions>
                </input>
                <status>AVAILABLE</status>
        </job>

Let us run just the first 4 jobs by settings all jobs status to COMPLETE except for the first four. This can be easily done with sed as follows,

[9]:
!sed -i "s/AVAILABLE/COMPLETE/g" qmmm_mm_jobs.xml
!sed -i '0,/COMPLETE/s/COMPLETE/AVAILABLE/' qmmm_mm_jobs.xml
!sed -i '0,/COMPLETE/s/COMPLETE/AVAILABLE/' qmmm_mm_jobs.xml
!sed -i '0,/COMPLETE/s/COMPLETE/AVAILABLE/' qmmm_mm_jobs.xml
!sed -i '0,/COMPLETE/s/COMPLETE/AVAILABLE/' qmmm_mm_jobs.xml
sed: can't read qmmm_mm_jobs.xml: No such file or directory
sed: can't read qmmm_mm_jobs.xml: No such file or directory
sed: can't read qmmm_mm_jobs.xml: No such file or directory
sed: can't read qmmm_mm_jobs.xml: No such file or directory
sed: can't read qmmm_mm_jobs.xml: No such file or directory

Now we can run the jobs and save the results in the state file

[10]:
!xtp_parallel -e qmmm -o qmmm_mm.xml -f state.hdf5 -j "run"
!xtp_parallel -e qmmm -o qmmm_mm.xml -f state.hdf5 -j "read"
/bin/sh: 1: xtp_parallel: not found
/bin/sh: 1: xtp_parallel: not found

Site energy and pair energy analysis

In this step we generate an histogram and compute the correlation function of site energies and pair energy differences.

[11]:
!xtp_run -e eanalyze -f state.hdf5
/bin/sh: 1: xtp_run: not found

You should now see a set of files prefixed with eanalyze containing the histrogram and correlation functions.

[12]:
!ls eanalyze*
ls: cannot access 'eanalyze*': No such file or directory

QM energy calculation

Our next task is to perform the qm calculations for each segment that we have stored in the hdf5 file. The calculations take place in 3 stages: write the jobs to a file, perform the computation and finally save the results to the state file. We created a small option file to make the calculation cheaper.

[13]:
!cat eqm.xml
cat: eqm.xml: No such file or directory

For the sake of computational time let just compute the gw approximation and the singlet. You can also request the triplet or all, see the gwbse sectionfor the eqm calculator.

First we will write the job in a file and enable only the first 2 jobs

[14]:
!xtp_parallel -e eqm -o eqm.xml -f state.hdf5 -j "write"
!sed -i "s/AVAILABLE/COMPLETE/g" eqm.jobs
!sed -i '0,/COMPLETE/s/COMPLETE/AVAILABLE/' eqm.jobs
!sed -i '0,/COMPLETE/s/COMPLETE/AVAILABLE/' eqm.jobs
/bin/sh: 1: xtp_parallel: not found
sed: can't read eqm.jobs: No such file or directory
sed: can't read eqm.jobs: No such file or directory
sed: can't read eqm.jobs: No such file or directory

Now, let run these 2 jobs

Here we used some more options. -o allows us to read in a file with options. -j changes the writing to running in this case. -x determines how many cores should be used for each job. We can also run multiple jobs in parallel using -p

[15]:
!xtp_parallel -e eqm -o eqm.xml -f state.hdf5 -j run -x 4
/bin/sh: 1: xtp_parallel: not found

QM calculation for pairs

In the following step we will run QM calculations for each pair in the hdf5 file. As the calculations on the previous step, we will first write the jobs in a file, then run them and finally store the results in the state file.

As in the previous section, we set the GWBSE mode to G0W0and the ranges to full, but we compute only the gw approximation. We do not need the BSE results for the coupling calculations. For more information, check the iqm calculator options. We also want to compute the singlet couplings.

Before running the calculations, we need to specify in the iqm input which states to read into the jobfile for each segment type.

Now, let’s write the jobs to the file

[16]:
!xtp_parallel -e iqm -o iqm.xml -f state.hdf5 -s 0 -j "write"
/bin/sh: 1: xtp_parallel: not found

From the jobs that we just write down, let’s make available only the first job

[17]:
!sed -i "s/AVAILABLE/COMPLETE/g" iqm.jobs
!sed -i '0,/COMPLETE/s/COMPLETE/AVAILABLE/' iqm.jobs
sed: can't read iqm.jobs: No such file or directory
sed: can't read iqm.jobs: No such file or directory

Now we can run and store the jobs results

[18]:
!xtp_parallel -e iqm -o iqm.xml -f state.hdf5 -s 0 -j run -q 1 -x 4
/bin/sh: 1: xtp_parallel: not found

Finally, we read the results into the state

[19]:
!xtp_parallel -e iqm -o iqm.xml -f state.hdf5 -j "read"
/bin/sh: 1: xtp_parallel: not found

QMMM Calculations

We will run the QMMM calculations we will use the pregenerated qmmm.jobs file in the current work directory, so we can directly run the calculations. We also provide an option file in the OPTIONFILES folder. In qmmm calculations you can use the jobfile tag inside the optionfile to modify options from the jobfile. Here we modify the size of the staticregion.

[20]:
!cat qmmm.xml
cat: qmmm.xml: No such file or directory

In the jobfile we then provide the specific option

[21]:
!cat qmmm.jobs
cat: qmmm.jobs: No such file or directory
[22]:
!xtp_parallel -e qmmm -o qmmm.xml -f state.hdf5 -j run -x 4
/bin/sh: 1: xtp_parallel: not found

We can if we want plot the spectra from both calculations, for which we have to read the energies and oscillator strengths from the checkpoint files. We need the h5py package for python for it.

[23]:
import h5py
import numpy as np

def getEnergies(orb):
    a=orb['region_0']['orbitals']['BSE_singlet']['eigenvalues'][()]
    a.flatten()
    return a.flatten()

def trans_sort(index):
    return int(index[3:])

def getOscillators(orb):
    energies=getEnergies(orb)
    transdip=[]
    for k in sorted(orb['region_0']['orbitals']['transition_dipoles'].keys(),key=trans_sort):
        transdip.append(np.array(orb['region_0']['orbitals']['transition_dipoles'][k][()]))
    d2=[]
    for b in transdip:
        d2.append(np.sum(b**2))
    d2=np.array(d2)
    oscs=2/3.0*energies*d2
    return oscs

def getSpectrum(filename):
    orb=h5py.File(filename,'r')
    e=getEnergies(orb)*27.2114
    osc=getOscillators(orb)
    return e,osc

You will find the orb files in the QMMM/frame_10000 folder.

[24]:
spectrum_static=getSpectrum("QMMM/frame_10000/job_1_static/checkpoint_iter_1.hdf5")
spectrum_vacuum=getSpectrum("QMMM/frame_10000/job_0_vacuum/checkpoint_iter_1.hdf5")

import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = [12, 8]
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/tmp/ipykernel_4064029/3734934265.py in <module>
----> 1 spectrum_static=getSpectrum("QMMM/frame_10000/job_1_static/checkpoint_iter_1.hdf5")
      2 spectrum_vacuum=getSpectrum("QMMM/frame_10000/job_0_vacuum/checkpoint_iter_1.hdf5")
      3
      4 import matplotlib.pyplot as plt
      5 get_ipython().run_line_magic('matplotlib', 'inline')

/tmp/ipykernel_4064029/1345282021.py in getSpectrum(filename)
     23
     24 def getSpectrum(filename):
---> 25     orb=h5py.File(filename,'r')
     26     e=getEnergies(orb)*27.2114
     27     osc=getOscillators(orb)

/usr/lib/python3/dist-packages/h5py/_debian_h5py_serial/_hl/files.py in __init__(self, name, mode, driver, libver, userblock_size, swmr, rdcc_nslots, rdcc_nbytes, rdcc_w0, track_order, fs_strategy, fs_persist, fs_threshold, fs_page_size, page_buf_size, min_meta_keep, min_raw_keep, locking, **kwds)
    505                                  fs_persist=fs_persist, fs_threshold=fs_threshold,
    506                                  fs_page_size=fs_page_size)
--> 507                 fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
    508
    509             if isinstance(libver, tuple):

/usr/lib/python3/dist-packages/h5py/_debian_h5py_serial/_hl/files.py in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
    218         if swmr and swmr_support:
    219             flags |= h5f.ACC_SWMR_READ
--> 220         fid = h5f.open(name, flags, fapl=fapl)
    221     elif mode == 'r+':
    222         fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)

h5py/_debian_h5py_serial/_objects.pyx in h5py._debian_h5py_serial._objects.with_phil.wrapper()

h5py/_debian_h5py_serial/_objects.pyx in h5py._debian_h5py_serial._objects.with_phil.wrapper()

h5py/_debian_h5py_serial/h5f.pyx in h5py._debian_h5py_serial.h5f.open()

FileNotFoundError: [Errno 2] Unable to open file (unable to open file: name = 'QMMM/frame_10000/job_1_static/checkpoint_iter_1.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
[25]:
plt.vlines(spectrum_static[0],0,spectrum_static[1],label="static",color='r')
plt.vlines(spectrum_vacuum[0],0,spectrum_vacuum[1],label="vacuum",color='g')
plt.xlabel("energy [eV]")
plt.ylabel("intensity")
plt.legend()
plt.show()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
/tmp/ipykernel_4064029/3258046128.py in <module>
----> 1 plt.vlines(spectrum_static[0],0,spectrum_static[1],label="static",color='r')
      2 plt.vlines(spectrum_vacuum[0],0,spectrum_vacuum[1],label="vacuum",color='g')
      3 plt.xlabel("energy [eV]")
      4 plt.ylabel("intensity")
      5 plt.legend()

NameError: name 'plt' is not defined
[ ]: