epsman.epsJob module

Class for ePolyScat job generation, including file IO and multi-E generation.

Main function is local and remote path management (via pathlib), job configuration set-up (manual), and local & remote file IO (via Fabric/Invoke).

19/02/21 Removed _repo, _web and _epsProc methods, these to be set as a separate post-processing class.

16/02/21 Tidying up…
  • Added setHost() and setJob() methods to handle set & reset of parameters. Pretty basic, may need work, also quite a lot of boilerplate in current implementation.

  • Moved job-specific paths functionality to setJobPaths() method.

  • Tidied up setGenFile() and rationalised methods.

class epsman.epsJob.epsJob(host=None, user=None, IP=None, password=None, mol=None, orb=None, batch=None, genFile=None, verbose=1)

Bases: object

Class for ePolyScat job generation, including file IO and multi-E generation.

Main function is local and remote path management (via pathlib), job configuration set-up (manual), and local & remote file IO (via Fabric/Invoke).

19/02/21: removed _repo, _web and _epsProc methods, these to be set as a separate post-processing class.

checkFiles(fileList, scanDir='', verbose=False)

Check files exist on host.

Parameters:
  • fileList (list of strs or Path objects) – Files to check on host. Names only, or full paths. Can optionally set directory with scanDir variable.

  • scanDir (str or Path, optional, default = '') – Directory to scan, defaults to Fabric default (home dir).

  • verbose (bool, optional, default = False) – Print results to screen.

Returns:

checkList – List of Fabric results, bool.

Return type:

list

Note

This uses self.c.run(), as set at self.initConnection(), so will only work for self.host.

checkLocalFiles(fileList, scanDir=None, verbose=False)

Check files on local machine. Use checkFiles() to check files on remote host.

Very basic!

Parameters:
  • fileList (list of strs or Path objects) – Files to check on local machine. Names only, or full paths. Can optionally set directory with scanDir variable.

  • scanDir (str or Path, optional, default = None) – Directory to scan, if not passed this is not set (so current working directory will be used unless full file paths are passed).

  • verbose (bool, optional, default = False) – Print results to screen.

Returns:

checkList – List of results (bool), one per file.

Return type:

list

Note

This uses path().exists(), so will only work for local machine.
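A minimal sketch of this kind of local check, assuming only pathlib. The helper name check_local_files and its snake_case arguments are hypothetical and not part of epsman; the actual method may differ in detail.

```python
from pathlib import Path


def check_local_files(file_list, scan_dir=None):
    """Return one bool per entry: True if the file exists locally.

    Hypothetical helper for illustration, not the epsman implementation.
    """
    results = []
    for item in file_list:
        # Prepend scan_dir only when one is given; otherwise the path is
        # used as passed (relative paths resolve against the current
        # working directory). A full path in item overrides scan_dir.
        path = Path(scan_dir, item) if scan_dir is not None else Path(item)
        results.append(path.exists())
    return results
```

Because Path(scan_dir, item) discards scan_dir when item is absolute, names and full paths can be mixed in one list.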

createJobDirTree(localHost=False)

Basic routine to create dir tree for new system (molecule).

Parameters:
  • self (epsJob structure) –

    Contains path and job settings:

    • Molecule/system/job name for dir structure. Subdirs wrkdir/mol/electronic_structure and wrkdir/mol/generators will be created if not present.

    • c (fabric connection object) – Fabric connection used to run commands (over ssh). For local use, set to use local machine, e.g. c = Connection('user@localhost')

    • genFile (str, optional, default = None) – Generator file, will be uploaded to wrkdir/mol/generators if passed.

  • localHost (bool, default = False) – Set to true in order to build dir tree on local host instead of remote host.
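For the local case, the directory layout described above can be sketched with pathlib alone. This is an illustration under stated assumptions: the function create_job_dir_tree is hypothetical, it does not use Fabric, and it does not upload a generator file.

```python
from pathlib import Path


def create_job_dir_tree(wrkdir, mol):
    """Create wrkdir/mol/electronic_structure and wrkdir/mol/generators locally.

    Hypothetical helper; returns the two directory paths.
    """
    created = []
    for sub in ("electronic_structure", "generators"):
        target = Path(wrkdir, mol, sub)
        # parents=True builds wrkdir/mol as needed; exist_ok=True leaves
        # directories that are already present untouched.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

Calling the function twice is harmless, which matches the "will be created if not present" behaviour described for the subdirs.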

getFileList(scanDir, fileType='out', subDirs=True, verbose=True)

Get a file list from host: scan directory scanDir for files of type fileType.

Parameters:
  • scanDir (string) – Directory to scan.

  • fileType (string, default = 'out') – File ending to match.

  • subDirs (bool, optional, default = True) – Include subDirs in processing.

  • verbose (bool, optional, default = True) – Print jobList to screen.
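A local analogue of this scan, as a rough sketch. The method itself runs on the remote host; the helper below (get_file_list, hypothetical) works on the local file system only and assumes fileType is a file extension without the leading dot.

```python
from pathlib import Path


def get_file_list(scan_dir, file_type="out", sub_dirs=True):
    """List files under scan_dir whose names end in .<file_type> (local only)."""
    pattern = f"*.{file_type}"
    root = Path(scan_dir)
    # rglob descends into subdirectories; glob stays at the top level.
    matches = root.rglob(pattern) if sub_dirs else root.glob(pattern)
    return sorted(p for p in matches if p.is_file())
```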

initConnection(host=None, user=None, IP=None, password=None, home=None, overwriteHost=False)

Init connection to selected machine & test.

Parameters:
  • host

  • user

  • IP

  • password

  • home

multiEChunck(Estart=0.1, Estop=30.1, dE=2.5, EJob=None, EJobRange=None, precision=2)

Basic multi-E job set-up, with adaptive chunking into sub-jobs.

Parameters:
  • Estart (int, float, optional, default = 0.1) –

  • Estop (int, float, optional, default = 30.1) –

  • dE (int, float, optional, default = 2.5) – Overall energy ranges (eV) for job. Defaults for basic survey-scan. Jobs will be chunked into sub-jobs with EJob elements if possible.

  • EJob (int, optional, default = None) – Number of energy points per input file (sub-jobs). If set to None, this will be set automatically. If set to an int, this will determine job chunk size, but may be overridden in some cases to nearest common divisor. If set = 1 this will force full Elist return (maximum chunking).

  • EJobRange (list or np.array, optional, default = [5,21]) – Set [min, max] E per chunk. If EJob < min, EJob will be used. If EJob > max, max will be used.

  • precision (int, optional, default = 2) – Precision for energies, generate via np.round. TODO: automate this from dE.

Returns:

  • Elist (np.array, 2D) – Final energy list, corresponding to one row per input file.

23/02/22: updated to self.multiEChunck method… may want to set options in a dictionary rather than self.attribs?
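The idea of chunking an energy grid into equal-sized sub-jobs can be sketched as follows. This is not the epsman algorithm: the function name chunk_energies is hypothetical, plain lists are used instead of np.array, the divisor-based rule is an assumption, and EJobRange is only applied when EJob is None.

```python
def chunk_energies(Estart=0.1, Estop=30.1, dE=2.5, EJob=None,
                   EJobRange=(5, 21), precision=2):
    """Split an energy grid into equal-length rows, one row per input file.

    Illustrative only: the divisor-based rule below is an assumption.
    """
    n = int(round((Estop - Estart) / dE)) + 1  # number of grid points
    energies = [round(Estart + i * dE, precision) for i in range(n)]
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    if EJob is None:
        # Prefer the largest divisor inside [min, max]; else a single chunk.
        lo, hi = EJobRange
        in_range = [d for d in divisors if lo <= d <= hi]
        size = max(in_range) if in_range else n
    else:
        # Snap the requested size to the nearest divisor of n.
        size = min(divisors, key=lambda d: abs(d - EJob))
    return [energies[i:i + size] for i in range(0, n, size)]
```

With the default arguments the grid has 13 points, and since 13 is prime and lies inside [5, 21] the sketch returns a single row. Restricting chunk sizes to divisors keeps every row the same length, which is what a 2D array return requires.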

pullFile(fileLocal, fileRemote, overwritePrompt=True)

Routine to check and pull file from remote

22/02/21 - implemented hacky pullFile(), just modified from pushFile() above, but should consolidate methods here - lots of boilerplate!

Existing notes:

Follows basic method from genFile handling in createJobDirTree() (1) Test if file already exists on remote, prompt for overwrite if so (unless overwritePrompt is set to False or None). (2) Push file. (3) Check file on remote to verify.

Parameters:
  • fileLocal (Path object) – Local destination for the pulled file. Full path, or file in working dir.

  • fileRemote (Path object) – Remote location. Full path, with or without filename. (If missing, filename will be unchanged.)

  • overwritePrompt (bool, default = True) – If set to True, prompt user for file overwrite. If set to False, overwrite existing files. If set to None, do not overwrite.

Returns:

  • bool, True if successful.

  • Fabric object with details if failed.

pullFileDict(fileKey, **kwargs)

Wraps self.pullFile([localhost][fileKey], [host][fileKey], **kwargs)

pushFile(fileLocal, fileRemote, overwritePrompt=True, mkdir='prompt')

Routine to check and push file to remote

Follows basic method from genFile handling in createJobDirTree() (1) Test if file already exists on remote, prompt for overwrite if so (unless overwritePrompt is set to False or None). (2) Push file. (3) Check file on remote to verify.

Parameters:
  • fileLocal (Path object) – Local file to push. Full path, or file in working dir.

  • fileRemote (Path object) – Remote location. Full path, with or without filename. (If missing, filename will be unchanged.)

  • overwritePrompt (bool, default = True) – If set to True, prompt user for file overwrite. If set to False, overwrite existing files. If set to None, do not overwrite.

  • mkdir (bool or str, default = 'prompt') – If set to True, create remote dir (and parents) if missing. If set to ‘prompt’, prompt user for remote dir creation. If set to False, don’t create remote dir.

Returns:

  • bool, True if successful.

  • Fabric object with details if failed.

TODO: fix fileRemote.is_file() part - this currently appends filename regardless, but OK if dir only passed.

TODO: add mkdir stage? Or option for this at least.
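The three overwritePrompt cases (True, False, None) can be summarised as a small decision function. This is a sketch of the documented behaviour only: should_transfer and its ask hook are hypothetical names, and no file transfer is performed.

```python
def should_transfer(target_exists, overwritePrompt=True, ask=input):
    """Decide whether to write the file, following the three overwritePrompt cases.

    `ask` is a hypothetical hook (defaults to input) so the prompt can be stubbed.
    """
    if not target_exists:
        return True            # nothing to overwrite
    if overwritePrompt is None:
        return False           # never overwrite
    if overwritePrompt is False:
        return True            # overwrite without asking
    # overwritePrompt is True: ask the user
    reply = ask("File exists, overwrite? (y/n): ")
    return reply.strip().lower() in ("y", "yes")
```

Testing for None before False matters here, since both are falsy but mean opposite things.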

pushFileDict(fileKey, **kwargs)

Wraps self.pushFile([localhost][fileKey], [host][fileKey], **kwargs)

runJobs(runScript=None)

Basic wrapper for running ePS jobs remotely.

Parameters:

runScript (str, optional, default = None) – Set script to use on remote machine. If None, use self.runScript if set, or set as 'ePS_batch_nohup.sh'.

setAttribute(attrib, newVal=None, overwriteFlag=False, printFlag=True)

Basic check & set attribute routine.

Set self.attrib = newVal if:
  • attrib doesn’t exist,

  • attrib exists
    • but is None,

    • if overwriteFlag = True is passed.

printFlag: if True, print values, otherwise just confirm value set. (Only if self.verbose and setFlag.)

TODO: consider attrs library here, https://www.attrs.org/en/stable/examples.html#validators
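The check & set rule above can be sketched with getattr and setattr. The Settings class and the set_attribute name are hypothetical stand-ins, and printing is omitted.

```python
class Settings:
    """Hypothetical stand-in for an object with settable attributes."""

    def set_attribute(self, attrib, new_val=None, overwrite=False):
        """Set self.<attrib> only if missing, None, or overwrite is True."""
        current = getattr(self, attrib, None)  # None if attrib doesn't exist
        if current is None or overwrite:
            setattr(self, attrib, new_val)
            return True   # value was set
        return False      # existing value kept
```

Usage: a first call sets the value, a second call without overwrite leaves it alone.

```python
s = Settings()
s.set_attribute("mol", "N2")                  # set (attribute was missing)
s.set_attribute("mol", "CO")                  # ignored, s.mol stays "N2"
s.set_attribute("mol", "CO", overwrite=True)  # forced overwrite
```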

setAttributesFromDict(itemsDict, overwriteFlag=False, printFlag=True)

setGenFile(genFile=None)

Set GenFile & propagate to all hosts.

NOTE: currently have issue with wrkdir vs. genDir, and assumptions on where genFile is located on local machine.

Default filename is set as Path(f'{self.mol}.{self.batch}.{self.orb}.conf')
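As an illustration of the default naming pattern, with hypothetical values for mol, batch and orb (these labels are examples, not epsman defaults):

```python
from pathlib import Path

# Hypothetical job labels, for illustration only.
mol, batch, orb = "N2", "wf", "orb5"

gen_file = Path(f"{mol}.{batch}.{orb}.conf")
print(gen_file)       # N2.wf.orb5.conf
print(gen_file.stem)  # N2.wf.orb5
```

The stem (the name without the final .conf) is the part tidyJobs() uses as its default searchString.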

setHost(host=None, user=None, IP=None, password=None, overwriteFlag=True)

Very basic host setting.

TODO: add conditional reset case, and differentiate from overwriting existing details case. (Should preserve non-None settings?)

setHostDefns(overwriteFlag=True, host=None, **kwargs)

Update paths in self.hostDefns.

Note that changes to dependent paths WILL NOT be propagated.

Parameters:
  • overwriteFlag (bool, default = True) – Set True to overwrite existing entries.

  • host (str, list, default = None) – Host(s) to update. If None, update all hosts.

  • kwargs (keyword args, or dict) – Define entries to update. E.g. setHostDefns(elecDir='estDir') will set self.hostDefns[all hosts]['elecDir'].
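A sketch of this kind of update on a plain nested dict of the form {host: {key: value}}. The function set_host_defns is a hypothetical standalone version; like the method, it only touches the keys passed and does not propagate changes to dependent paths.

```python
def set_host_defns(host_defns, overwrite=True, host=None, **kwargs):
    """Update entries in a {host: {key: value}} dict for one, several or all hosts."""
    if host is None:
        hosts = list(host_defns)          # all hosts
    elif isinstance(host, str):
        hosts = [host]                    # single host name
    else:
        hosts = list(host)                # list of host names
    for h in hosts:
        for key, value in kwargs.items():
            # Keep existing entries unless overwriting is allowed.
            if overwrite or key not in host_defns[h]:
                host_defns[h][key] = value
    return host_defns
```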

setJob(mol=None, orb=None, batch=None, jobNote=None, elecStructure=None, genFile=None, jobSettings=None, writeScript=None, runScript=None, overwriteFlag=False)

Init job settings. This is crude, but ultimately all these parameters are required.

Parameters:
  • mol (str, default = None) – Molecule name to use.

  • orb (str, default = None) – Orbital name/label.

  • batch (str, default = None) – Batch to use as job label. By default, jobs will be set with directory structure wrkdir/mol/batch/orb

  • jobNote (str, default = None) – Additional note to be included in job input file.

  • elecStructure (str, default = None) – Filename for electronic structure input file (Gamess .log or Molden .mol). This is assumed to be in Path(self.hostDefn[host]['systemDir'], 'electronic_structure'). (TODO: better file handling here)

  • genFile (str, default = None) – Filename for generator (settings) file. Will be set to default by setGenFile() if not passed.

  • jobSettings (str, default = None) – Job string, not yet fully implemented.

  • writeScript (str, optional, default = None) –

  • runScript (str, optional, default = None) – Set which script to use to run ePS on remote. Used by self.runJobs(); if None default will be used.

  • overwriteFlag (bool, default = False) – Set to True to force overwrite of existing values with passed params.

setJobPaths()

Set default job paths.

This requires self.mol etc. to be set, and builds from self.hostDefn[host]['wrkdir'].

setPaths()

Set default (system) paths.

setScripts()

Set list of utility scripts & templates.

setWrkDir(wrkdir=None, host=None)

Reset wrkdir, and related job paths, for host.

Defaults to self.host if specific host name is not passed.

syncFilesDict(fileKey, pushPrompt=True, **kwargs)

Synchronise file between local and host, including push/pull missing files.

Sync for self.hostDefn[host][fileKey], between localhost and remote host (self.host).

TODO:

  • Set for file lists?

  • adapt for multiple hosts? Probably easier to find existing/library code for this case however.

  • Check file paths exist. Currently just flags an error.

  • Methods for updating files, currently only handles missing files.
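The push/pull decision for a single file can be sketched as below, given whether the file exists on each side. The function name sync_action and its return labels are hypothetical; the existence checks themselves (e.g. via checkLocalFiles() and checkFiles()) are left out.

```python
def sync_action(local_exists, remote_exists):
    """Return the transfer needed to make a file present on both sides."""
    if local_exists and not remote_exists:
        return "push"      # missing on remote
    if remote_exists and not local_exists:
        return "pull"      # missing locally
    if local_exists and remote_exists:
        return "none"      # present on both; updating is not handled
    return "error"         # missing everywhere: nothing to sync
```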

tidyJobs(chkFlag=True, mvFlag=True, cpFlag=False, owFlag=None, tol=0.05, searchDir=None, searchDirKey='jobComplete', searchString=None)

Check files for job completion (crudely). Move completed jobs to main job folder.

Parameters:
  • chkFlag (bool, optional, default = True) – Perform basic job file batch check for completeness.

  • mvFlag (bool, optional, default = True) – Move files from completed to job dir if True.

  • cpFlag (bool, optional, default = False) – Make a local copy of job files if True.

  • owFlag (bool, optional, default = None) – Overwrite local files on get. If set to None, user will be prompted if local files exist.

  • tol (float, optional, default = 0.05) – Tolerance (%age) for filesize tests.

  • searchDir (string or path, optional, default = None) – Pass to search a custom dir. If None, will use searchDirKey for preset paths instead, default case self.hostDefn[self.host]['jobComplete']

  • searchDirKey (string, optional, default = 'jobComplete') – Dir for completed jobs. Default case searches in self.hostDefn[self.host]['jobComplete']

  • searchString (string or Path, optional, default = None) – File name string to use for file search. If None, use self.genFile.stem
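One plausible reading of the tol parameter is a relative file-size comparison across a batch of output files. The sketch below is an assumption, not the epsman test: the function flag_incomplete is hypothetical, tol is treated as a fraction (0.05 = 5%), and the batch median is chosen here as the reference size.

```python
from statistics import median


def flag_incomplete(sizes, tol=0.05):
    """Return names whose size differs from the batch median by more than tol.

    `sizes` maps file name -> size in bytes; tol is a fraction (0.05 = 5%).
    Assumed criterion for illustration only.
    """
    ref = median(sizes.values())
    return sorted(name for name, size in sizes.items()
                  if abs(size - ref) > tol * ref)
```

Files from a multi-E batch with the same number of energy points per input file would be expected to have similar sizes, so an outlier is a crude hint that a job did not finish.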

writeGenFile()

writeInp(scrType=None, wLog=True)

Write ePS input files from job structure, in multi-E chunks.

Parameters:
  • scrType (str, default = None) – Type of shell script to call, as defined in self.scrDefn (see setScripts() function for details). If not set, try self.writeScript, then default to 'basic'.

  • wLog (bool, default = True) – Write local log file from script run stdout. Log file will be written using self.genFile path & name.