{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 03: Useful standard library modules\n", "(pathlib, shutil, sys, os, subprocess, zipfile, etc.)\n", "\n", "These modules are part of the Python standard library and provide very useful functionality for working with your operating system and files. This notebook explores these modules and demonstrates some of their functionality. Online documentation is at https://docs.python.org/3/library/.\n", "\n", "\n", "#### Topics covered:\n", "* **pathlib**:\n", "    * listing files\n", "    * creating, moving and deleting files\n", "    * absolute vs relative paths\n", "    * useful path object attributes\n", "* **shutil**: \n", "    * copying, moving and deleting files AND folders\n", "* **sys**: \n", "    * Python and platform information\n", "    * command line arguments\n", "    * modifying the Python path to import code from other locations\n", "* **os**:\n", "    * changing the working directory\n", "    * recursive iteration through folder structures\n", "    * accessing environment variables\n", "* **subprocess**: \n", "    * running system commands and checking the results\n", "* **zipfile**:\n", "    * creating and extracting from zip archives" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "from pathlib import Path\n", "import shutil\n", "import subprocess\n", "import sys\n", "import zipfile" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ``pathlib`` — Object-oriented filesystem paths\n", "``pathlib`` provides convenient \"path-like\" objects for working with file paths across platforms: paths and operations written with ``pathlib`` work the same on Windows and on POSIX systems (Linux, macOS, etc.). 
The main entry point for users is the ``Path()`` class.\n", "\n", "Further reading:\n", "\n", "* https://treyhunner.com/2018/12/why-you-should-be-using-pathlib/\n", "* https://docs.python.org/3/library/pathlib.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Make a ``Path()`` object for the current folder" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cwd = Path('.')\n", "cwd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Listing files" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for f in cwd.iterdir():\n", "    print(f)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### List just the notebooks using the ``.glob()`` method" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for nb in cwd.glob('*.ipynb'):\n", "    print(nb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Note: ``.glob()`` works across folders too\n", "List all notebooks for both class components" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for nb in cwd.glob('../*/*.ipynb'):\n", "    print(nb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### But ``glob`` results aren't sorted alphabetically!\n", "(and the order is platform-dependent)\n", "\n", "https://arstechnica.com/information-technology/2019/10/chemists-discover-cross-platform-python-scripts-not-so-cross-platform/?comments=1&post=38113333\n", "\n", "We can easily sort them by casting the results to a list and passing it to ``sorted()``." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sorted(list(cwd.glob('../*/*.ipynb')))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note:** There is also a ``glob`` module in the Python standard library that works directly with string paths." ] }, { "cell_type": "code", "execution_count": null, 
"metadata": {}, "outputs": [], "source": [ "import glob\n", "sorted(list(glob.glob('../*/*.ipynb')))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### List just the subfolders" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[f for f in cwd.iterdir() if f.is_dir()]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Create a new path for the data subfolder" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_path = cwd / 'data'\n", "data_path" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### or an individual file" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f = cwd / '00_python_basics_review.ipynb'\n", "f" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### check if it exists, or if it's a directory" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f.exists(), f.is_dir()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating files and folders\n", "\n", "#### make a new subdirectory" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_folder = cwd / 'more_files'\n", "new_folder" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_folder.exists()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_folder.mkdir(); new_folder.exists()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that if you try to run the above cell twice, you'll get an error that the folder already exists\n", "``exist_ok=True`` suppresses these errors." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_folder.mkdir(exist_ok=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Make a new subfolder within a new subfolder\n", "The ``parents=True`` argument creates any missing parent folders along the way, so you can make a subfolder inside a folder that doesn't exist yet." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(new_folder / 'subfolder').mkdir(exist_ok=True, parents=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Absolute vs. relative paths\n", "\n", "Get the absolute location of the current working directory" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "abs_cwd = Path.cwd()\n", "abs_cwd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Go up two levels to the course repository" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class_root = (abs_cwd / '../../')\n", "class_root" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Simplify or resolve the path" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class_root = class_root.resolve()\n", "class_root" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get the cwd relative to the course repository" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "abs_cwd.relative_to(class_root)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check if this is an absolute or relative path" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "abs_cwd.relative_to(class_root).is_absolute()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "abs_cwd.is_absolute()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Gotcha:** `Path.relative_to()` only works when 
the first path is a subpath of the second path (with both paths absolute or both relative); otherwise it raises a `ValueError`\n", "\n", "For example, try executing this line: \n", "\n", "```python\n", "Path('../part1_flopy/').relative_to('data')\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you need a relative path between two arbitrary locations in a script, `os.path.relpath` might be a better choice, because it inserts `..` segments where needed." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "os.path.relpath('../part1_flopy/', 'data')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "os.path.relpath('data', '../part1_flopy/')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Useful attributes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "abs_cwd.parent" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "abs_cwd.parent.parent" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f.name" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f.suffix" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f.with_suffix('.junk')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f.stem" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Moving and deleting files\n", "\n", "Make a file" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fname = Path('new_file.txt')\n", "with open(fname, 'w') as dest:\n", "    dest.write(\"A new text file.\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fname.exists()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Move the file" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, 
"outputs": [], "source": [ "fname2 = Path('new_file2.txt')\n", "fname.rename(fname2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fname.exists()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Delete the file" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fname2.unlink()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fname2.exists()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Delete the empty folder we made above\n", "Note: this only works for empty directories (use ``shutil.rmtree()`` very carefully for removing folders and all contents within)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Path('more_files/subfolder/').rmdir()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ``shutil`` — High-level file operations\n", "module for copying, moving, and deleting files and directories.\n", "\n", "https://docs.python.org/3/library/shutil.html\n", "\n", "The functions from shutil that you may find useful are:\n", "\n", " shutil.copy()\n", " shutil.copy2() # this preserves most metadata (i.e. dates); unlike copy()\n", " shutil.copytree()\n", " shutil.move()\n", " shutil.rmtree() #obviously, you need to be careful with this one!\n", " \n", "Give these guys a shot and see what they do. Remember, you can always get help by typing:\n", "\n", " help(shutil.copy)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#try them here. Be careful!" 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "shutil.rmtree(new_folder)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ``sys`` — System-specific parameters and functions\n", "\n", "### Getting information about Python and the OS\n", "Where Python is installed" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(sys.prefix)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(sys.version_info)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sys.platform" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adding command line arguments to a script\n", "Here the command line arguments reflect that we're running a Jupyter Notebook. \n", "\n", "In a Python script, the first item in ``sys.argv`` is the script name, and the command line arguments follow it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sys.argv" ] }, { "cell_type": "markdown", "metadata": { "jp-MarkdownHeadingCollapsed": true }, "source": [ "### Exercise: Make a script with a command line argument using sys.argv\n", "\n", "1) Using a text editor such as VSCode, make a new ``*.py`` file with the following contents:\n", "\n", "```python\n", "import sys\n", "\n", "if len(sys.argv) > 1:\n", "    for argument in sys.argv[1:]:\n", "        print(argument)\n", "else:\n", "    print(\"usage is: python