msl.io.utils module

General functions.

msl.io.utils.checksum(file, algorithm='sha256', chunk_size=65536, shake_length=256)[source]

Get the checksum of a file.

A checksum is a sequence of numbers and letters that acts as a fingerprint for a file. Later comparisons against the checksum can detect errors or changes in the file, so it can be used to verify the integrity of the data.

Parameters:
  • file (path-like or file object) – A file to get the checksum of.

  • algorithm (str, optional) – The hash algorithm to use to compute the checksum. See hashlib for more details.

  • chunk_size (int, optional) – The number of bytes to read at a time from the file. It is useful to tweak this parameter when reading a large file to improve performance.

  • shake_length (int, optional) – The digest length to use for the SHAKE algorithm. See hashlib.shake.hexdigest() for more details.

Returns:

str – The checksum containing only hexadecimal digits.
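The chunked reading that chunk_size controls can be sketched with hashlib alone. This is a minimal sketch under stated assumptions, not the msl.io implementation: the name sketch_checksum is hypothetical, it accepts only a path (not a file object), and it assumes that shake_length is forwarded unchanged to hexdigest().

```python
import hashlib


def sketch_checksum(path, algorithm='sha256', chunk_size=65536, shake_length=256):
    """Hash a file in fixed-size chunks (hypothetical helper, not msl.io code)."""
    hasher = hashlib.new(algorithm)
    with open(path, 'rb') as fp:
        # Read at most chunk_size bytes at a time so that a large file
        # never has to fit in memory all at once.
        for chunk in iter(lambda: fp.read(chunk_size), b''):
            hasher.update(chunk)
    if algorithm.lower().startswith('shake'):
        # SHAKE digests have a variable length, so hexdigest() needs one.
        # Assumption: shake_length is passed through unchanged.
        return hasher.hexdigest(shake_length)
    return hasher.hexdigest()
```

The value of chunk_size changes only how the file is read, not the resulting checksum.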

msl.io.utils.copy(source, destination, overwrite=False, include_metadata=True)[source]

Copy a file.

Parameters:
  • source (path-like object) – The path to a file to copy.

  • destination (path-like object) – A directory to copy the file to or a full path (i.e., includes the basename). If the directory does not exist then it, and all intermediate directories, will be created.

  • overwrite (bool, optional) – Whether to overwrite the destination file if it already exists. If destination already exists and overwrite is False then a FileExistsError is raised.

  • include_metadata (bool, optional) – Whether to also copy information such as the file permissions, the latest access time and latest modification time with the file.

Returns:

str – The path to where the file was copied.
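The rules for destination, overwrite and include_metadata can be sketched with the standard library. This is an illustration only, not the msl.io source: sketch_copy is a hypothetical name, and the sketch assumes that a destination which does not yet exist is a full file path rather than a directory.

```python
import os
import shutil


def sketch_copy(source, destination, overwrite=False, include_metadata=True):
    """Copy a file following the rules described above (hypothetical helper)."""
    # An existing directory receives a file with the basename of the source.
    if os.path.isdir(destination):
        destination = os.path.join(destination, os.path.basename(source))
    if not overwrite and os.path.isfile(destination):
        raise FileExistsError('Will not overwrite {!r}'.format(destination))
    # Create the parent directory and any intermediate directories.
    parent = os.path.dirname(destination)
    if parent:
        os.makedirs(parent, exist_ok=True)
    if include_metadata:
        # copy2() also attempts to preserve permissions and timestamps.
        shutil.copy2(source, destination)
    else:
        # copyfile() copies the file contents only.
        shutil.copyfile(source, destination)
    return destination
```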

msl.io.utils.get_basename(obj)[source]

Get the basename() of a file.

Parameters:

obj (path-like or file-like) – The object to get the basename() of. If the object does not support the basename() function then the __name__ of the obj is returned.

Returns:

str – The basename of obj.

msl.io.utils.git_head(directory)[source]

Get information about the HEAD of a repository.

This function requires that git is installed and that it is available on PATH.

Parameters:

directory (str) – A directory that is under version control.

Returns:

dict or None – Information about the most recent commit on the current branch. If directory is not a directory that is under version control then returns None.
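One way to obtain this kind of information is to call git in a subprocess and return None when the call fails. The sketch below is an assumption about the approach, not the msl.io source: sketch_git_head is a hypothetical name, and the dictionary keys 'hash' and 'datetime' are chosen for illustration.

```python
import subprocess


def sketch_git_head(directory):
    """Return the hash and commit date of HEAD, or None (hypothetical helper)."""
    # %H is the full commit hash, %cI the committer date in strict ISO 8601 format.
    cmd = ['git', 'log', '-1', '--format=%H %cI']
    try:
        out = subprocess.check_output(cmd, cwd=directory, stderr=subprocess.DEVNULL)
    except (OSError, subprocess.CalledProcessError):
        # git is not on PATH, the directory does not exist,
        # or the directory is not under version control.
        return None
    commit_hash, committed = out.decode().strip().split(' ', 1)
    # Assumption: these key names are illustrative only.
    return {'hash': commit_hash, 'datetime': committed}
```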

msl.io.utils.is_admin()[source]

Check if the current process is being run as an administrator.

Returns:

bool – Whether the current process is being run as an administrator.
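A common way to perform this check is platform dependent. The sketch below shows one approach and is not necessarily how msl.io does it; sketch_is_admin is a hypothetical name.

```python
import ctypes
import os


def sketch_is_admin():
    """Return whether the process has elevated privileges (hypothetical helper)."""
    if os.name == 'nt':
        # Windows: ask the shell whether the current user is an administrator.
        return ctypes.windll.shell32.IsUserAnAdmin() != 0
    # POSIX: the effective user ID of root is 0.
    return os.geteuid() == 0
```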

msl.io.utils.is_dir_accessible(path, strict=False)[source]

Check if a directory exists and is accessible.

An accessible directory is one that the user has permission to access.

Parameters:
  • path (str) – The directory to check.

  • strict (bool, optional) – Whether to raise the exception (if one occurs).

Returns:

bool – Whether the directory exists and is accessible.

msl.io.utils.is_file_readable(file, strict=False)[source]

Check if a file exists and is readable.

Parameters:
  • file (str) – The file to check.

  • strict (bool, optional) – Whether to raise the exception (if one occurs).

Returns:

bool – Whether the file exists and is readable.
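The strict parameter of is_dir_accessible() and is_file_readable() follows a common pattern: attempt the operation, then either re-raise the exception or convert it to False. The sketch below illustrates that pattern with hypothetical helper names. The probes used (os.scandir() and open()) are assumptions, not necessarily what msl.io uses.

```python
import os


def sketch_is_dir_accessible(path, strict=False):
    """Probe a directory by trying to list it (hypothetical helper)."""
    try:
        with os.scandir(path):
            pass
    except OSError:
        if strict:
            raise  # strict=True: let the caller see why the check failed
        return False
    return True


def sketch_is_file_readable(file, strict=False):
    """Probe a file by trying to open it for reading (hypothetical helper)."""
    try:
        with open(file, 'rb'):
            pass
    except OSError:
        if strict:
            raise
        return False
    return True
```

With strict=False the caller gets a plain bool. With strict=True the original exception (for example FileNotFoundError or PermissionError) propagates, which is useful when the reason for the failure matters.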

msl.io.utils.register(reader_class)[source]

Use as a decorator to register a Reader subclass.

See Create a New Reader for an example of how to use the @register decorator.

Parameters:

reader_class (Reader) – A Reader subclass.

Returns:

Reader – The Reader.

msl.io.utils.remove_write_permissions(path)[source]

Remove all write permissions of a file.

On Windows, this function will set the file attribute to be read only.

On Linux and macOS, write permission is removed for the User, Group and Others. The read and execute permissions are preserved.

Parameters:

path (path-like object) – The path to remove the write permissions of.
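The behaviour described above can be sketched with os.chmod() and the stat constants. This is a minimal sketch with a hypothetical name, not the msl.io source. On Windows, os.chmod() can only change the read-only flag, so clearing the write bits has the effect of marking the file as read only.

```python
import os
import stat


def sketch_remove_write_permissions(path):
    """Clear every write bit of a path (hypothetical helper)."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    # Mask out the write bit for User, Group and Others.
    # The read and execute bits are left as they were.
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    os.chmod(path, current & ~write_bits)
```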

msl.io.utils.run_as_admin(args=None, executable=None, cwd=None, capture_stderr=False, blocking=True, show=False, **kwargs)[source]

Run a process as an administrator and return its output.

Parameters:
  • args (str or list of str, optional) – A sequence of program arguments or else a single string. Providing a sequence of arguments is generally preferred, as it allows the module to take care of any required escaping and quoting of arguments (e.g., to permit spaces in file names).

  • executable (str, optional) – The executable to pass the args to.

  • cwd (str, optional) – The working directory for the elevated process.

  • capture_stderr (bool, optional) – Whether to send the stderr stream to stdout.

  • blocking (bool, optional) – Whether to wait for the process to finish before returning to the calling program.

  • show (bool, optional) – Whether to show the elevated console (Windows only). If True then the stdout stream of the process is not captured.

  • kwargs – If the current process already has admin privileges or if the operating system is not Windows then all additional keyword arguments are passed to check_output(). Otherwise, only a timeout keyword argument is used (Windows).

Returns:

bytes, int or Popen – The returned object depends on whether the process is executed in blocking or non-blocking mode. If blocking then bytes are returned (the stdout stream of the process). If non-blocking, then the returned object will either be the Popen instance that is running the process (POSIX) or an int which is the process ID (Windows).

Examples

Import the modules

>>> import sys
>>> from msl.io import run_as_admin

Run a shell script

>>> run_as_admin(['./script.sh', '--message', 'hello world'])

Run a Python script

>>> run_as_admin([sys.executable, 'script.py', '--verbose'], cwd='D:\\My Scripts')

Create a service in the Windows registry and in the Service Control Manager database

>>> run_as_admin(['sc', 'create', 'MyLogger', 'binPath=', 'C:\\logger.exe', 'start=', 'auto'])

msl.io.utils.search(folder, pattern=None, levels=0, regex_flags=0, exclude_folders=None, ignore_permission_error=True, ignore_hidden_folders=True, follow_symlinks=False)[source]

Search for files starting from a root folder.

Parameters:
  • folder (str) – The root folder to begin searching for files.

  • pattern (str, optional) –

    A regex string to use to filter the filenames. If None then no filtering is applied and all files are yielded. Examples:

    • r'data' → find all files with the word data in the filename

    • r'\.png$' → find all files with the extension .png

    • r'\.jpe*g$' → find all files with the extension .jpeg or .jpg

  • levels (int, optional) – The number of sub-folder levels to recursively search for files. If None then search all sub-folders.

  • regex_flags (int, optional) – The flags to use to compile regex strings.

  • exclude_folders (str or list of str, optional) –

    The pattern of folder names to exclude from the search. Can be a regex string. If None then include all folders in the search. Examples:

    • r'bin' → exclude all folders that contain the word bin

    • r'^My' → exclude all folders that start with the letters My

    • [r'bin', r'^My'], which is equivalent to r'(bin|^My)' → exclude all folders that contain the word bin or start with the letters My

  • ignore_permission_error (bool, optional) – Whether to ignore PermissionError exceptions when reading the items within a folder.

  • ignore_hidden_folders (bool, optional) – Whether to ignore hidden folders from the search. A hidden folder starts with a . (a dot).

  • follow_symlinks (bool, optional) – Whether to search for files by following symbolic links.

Yields:

str – The path to a file.
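How pattern, levels, regex_flags and ignore_hidden_folders interact can be sketched with a small recursive generator. This is a simplified illustration, not the msl.io implementation: sketch_search is a hypothetical name, it supports only a subset of the parameters, and it assumes that the pattern is searched for within the filename only, not within the full path.

```python
import os
import re


def sketch_search(folder, pattern=None, levels=0, regex_flags=0,
                  ignore_hidden_folders=True):
    """Yield file paths below folder (hypothetical helper, subset of the options)."""
    regex = re.compile(pattern, flags=regex_flags) if pattern else None

    def walk(current, depth):
        with os.scandir(current) as it:
            entries = sorted(it, key=lambda e: e.name)
        for entry in entries:
            if entry.is_file():
                # Assumption: the regex is applied to the filename only.
                if regex is None or regex.search(entry.name):
                    yield entry.path
            elif entry.is_dir(follow_symlinks=False):
                if ignore_hidden_folders and entry.name.startswith('.'):
                    continue
                # levels=0 stays in the root folder, levels=None has no limit.
                if levels is None or depth < levels:
                    yield from walk(entry.path, depth + 1)

    yield from walk(folder, 0)
```

Since the function is a generator, nothing is read from disk until the result is iterated over, for example with list(sketch_search('.', r'\.png$', levels=None)).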

msl.io.utils.send_email(config, recipients, sender=None, subject=None, body=None)[source]

Send an email.

Parameters:
  • config

    A path-like object or file-like object of an INI-style configuration file that contains information on how to send an email. There are two ways to send an email: the Gmail API or an SMTP server.

    An example INI file to use the Gmail API is the following (see GMail for more details). Although all key-value pairs are optional, a [gmail] section must exist to use the Gmail API.

    [gmail]
    account = work [default: None]
    credentials = path/to/client_secrets.json [default: None]
    scopes =       [default: None]
      https://www.googleapis.com/auth/gmail.send
      https://www.googleapis.com/auth/gmail.metadata
    domain = @gmail.com [default: None]
    

    An example INI file for an SMTP server is the following. Only the host and port key-value pairs are required.

    [smtp]
    host = hostname or IP address of the SMTP server
    port = port number to connect to on the SMTP server
    starttls = true|yes|1|on -or- false|no|0|off [default: false]
    username = the username to authenticate with [default: None]
    password = the password for username [default: None]
    domain = @company.com [default: None]
    

    Warning

    Since this information is specified in plain text in the configuration file, you should set the file permissions provided by your operating system to ensure that your authentication credentials are safe.

  • recipients (str or list of str) – The email address(es) of the recipient(s). Can omit the @domain.com part if a domain key is specified in the config file. Can be the value 'me' if sending an email to yourself via Gmail.

  • sender (str, optional) – The email address of the sender. Can omit the @domain.com part if a domain key is specified in the config file. If not specified then it equals the value of the first recipient if using SMTP or the value 'me' if using Gmail.

  • subject (str, optional) – The text to include in the subject field.

  • body (str, optional) – The text to include in the body of the email. The text can be enclosed in <html></html> tags to use HTML elements to format the message.
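To illustrate how an [smtp] section and the domain key could be interpreted, the sketch below parses a configuration with configparser and builds a message with the standard email package. Nothing is sent. It is an illustration under stated assumptions, not the msl.io implementation: the helper name, the host smtp.example.com and the email addresses are all hypothetical.

```python
import configparser
from email.message import EmailMessage

# Hypothetical configuration; only host and port are required.
EXAMPLE_INI = """
[smtp]
host = smtp.example.com
port = 587
starttls = yes
domain = @example.com
"""


def sketch_build_message(ini_text, recipients, sender=None, subject=None, body=None):
    """Parse the [smtp] section and build a message (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    smtp = parser['smtp']
    domain = smtp.get('domain', '')

    def full_address(name):
        # Append the domain only when the address has no @ part.
        return name if '@' in name else name + domain

    if isinstance(recipients, str):
        recipients = [recipients]
    to = [full_address(r) for r in recipients]

    msg = EmailMessage()
    msg['To'] = ', '.join(to)
    # For SMTP, the sender defaults to the first recipient.
    msg['From'] = full_address(sender) if sender else to[0]
    msg['Subject'] = subject or ''
    msg.set_content(body or '')

    settings = {
        'host': smtp['host'],
        'port': smtp.getint('port'),
        # getboolean() accepts true/yes/1/on and false/no/0/off.
        'starttls': smtp.getboolean('starttls', fallback=False),
    }
    return settings, msg
```

Calling sketch_build_message(EXAMPLE_INI, 'alice', subject='Test') would address the message to alice@example.com, because the domain key is appended to any address that lacks an @ part.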