June 09, 2019

Minimum requirements for publishing a numerical study using "blackbox" algorithms/software

In my editorial work, I often encounter manuscripts in which authors use more or less well-established numerical software packages to produce scientific results. These software packages may be of a commercial nature or already have a long development history. Typical examples of such codes are VASP, COMSOL, CASTEP, Siesta, GROMACS, etc.

Usually the authors of the manuscript are not developers of the software packages and simply use them as is. There is of course nothing wrong with this. However, in some cases, these manuscripts (i) contain results calculated for only a single set of input settings, (ii) give numerical data without any indication of the accuracy of these estimates and/or (iii) give no indication of how sensitive these data are to the chosen parameter settings.

To be concrete, imagine a numerical DFT study of a certain material under strain and the determination of one of its lattice parameters as "a"= 3.1234. Clearly, in many codes this number will depend on, e.g., the chosen energy cut-off as well as the number of k-points used. In order to ascertain the accuracy of the chosen value for "a" one could, e.g., increase the number of k-points and observe how much "a" changes. Similarly, "a" usually changes when the cut-off energy changes, say from 500 eV to 700 eV. Both changes result in a new value for "a", say "a"= 3.1500 from the new energy cut-off and "a"= 3.1034 from the change in the chosen k-points. Indeed, one can usually get yet other values by choosing other k-point meshes and larger energy cut-offs. Hence the quoted final result should be something like

"a"= 3.1234 +/- 0.03 +/- 0.02 = 3.12 +/- 0.05

accompanied by an explanation as to how these error bars were obtained. Only with these error estimates can readers of a scientific article see how accurate the data really are and how stable they are to variations in input parameters. Note that these values are still subject to further sources of error due to (i) other input parameter dependencies and (ii) systematic errors that might be present in the chosen software package itself. Nevertheless, the final result as given above at least provides some insight into the validity of the quoted numbers.
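
As a rough illustration of the arithmetic behind such an error bar, here is a minimal Python sketch. It assumes one simple convention, namely that each error contribution is the largest absolute deviation of "a" from the reference value when a single input parameter is varied, and that the contributions are then added linearly. The function name sensitivity_error and the variable names are purely illustrative and not part of any of the packages mentioned above; the numbers are the example values from the text.

```python
def sensitivity_error(reference, variations):
    """Largest absolute deviation of the varied results from the reference.

    Hypothetical helper: one simple convention for an error contribution,
    not a prescription from any particular DFT code.
    """
    return max(abs(value - reference) for value in variations)


# Example values from the text: reference run and the two re-runs.
a_ref = 3.1234        # original input settings
a_cutoff = [3.1500]   # result(s) with a larger energy cut-off
a_kpoints = [3.1034]  # result(s) with a denser k-point mesh

err_cutoff = sensitivity_error(a_ref, a_cutoff)
err_kpoints = sensitivity_error(a_ref, a_kpoints)

# Assumption: contributions are added linearly (a conservative choice);
# adding them in quadrature would be another common convention.
err_total = err_cutoff + err_kpoints

summary = (
    f'"a"= {a_ref:.4f} +/- {err_cutoff:.2f} +/- {err_kpoints:.2f} '
    f"= {a_ref:.2f} +/- {err_total:.2f}"
)
print(summary)
```

If more than one additional cut-off or k-point mesh has been tried, the extra values simply go into the respective lists, and the largest deviation is then used for that contribution.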
