02: Namespace, modules, packages, and objects

There are a variety of ways to import existing code into a Python script or interactive session.

There is a lot of flexibility in how this is done, but a few suggested practices will be covered here.

[1]:
import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

We can learn a couple of things from the Easter egg above. First, the last line highlights that namespaces are important!

Also, importing this actually executed some code (printing out the Zen of Python). This means Python knew where to find a module called this and executed it upon import.
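
A small sketch of what "executed it upon import" means in practice: Python runs a module's code the first time it is imported and then keeps the module object in the sys.modules cache, so later imports reuse it instead of running the code again. The example uses the standard-library module math rather than this, purely for illustration.

```python
import sys

import math  # first import: Python locates the module and runs its code

# The imported module object is cached under its name.
cached = sys.modules["math"]

import math as math_again  # a later import reuses the cached module object

same_module = (cached is math) and (math_again is math)
print(same_module)
```

This is also why running import this a second time in the same session prints nothing.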

Namespaces

There’s a nice explanation of namespaces here.

First, we need to understand what a name is in Python. A name is a general label referencing something. As in many languages, think of a variable:

[2]:
a=5
a
[2]:
5

In Python, we can also use a name for a function.

[3]:
def funky(description):
    print (f'this {description} function is funky!')
[4]:
funky
[4]:
<function __main__.funky(description)>
[5]:
funky('Town')
this Town function is funky!
[6]:
f = funky
f
[6]:
<function __main__.funky(description)>
[7]:
f("Skunk")
this Skunk function is funky!

So, we assigned f to, in a sense, point to the function funky.

Names (and therefore variables) can refer to values of various types and can be reused without any declaration.

[8]:
a=5
print (a)
a = [12.3, 44.9]
print (a)
a = 'stuff in quotes'
print (a)
5
[12.3, 44.9]
stuff in quotes

So, a namespace is just a space containing all the names in use during a Python session.
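
As a rough illustration of this idea, a namespace can be inspected as a dictionary that maps names to objects. The sketch below looks at the namespace of the standard-library math module and at the local namespace of a small function; show_local_names is a made-up helper, not part of the lesson.

```python
import math

# A module's namespace is a dictionary mapping names to objects.
math_names = vars(math)
print("sqrt" in math_names)             # True
print(math_names["sqrt"] is math.sqrt)  # True

def show_local_names():
    # Names assigned inside a function live in that function's local namespace.
    radius = 2.0
    label = "circle"
    return sorted(locals())

local_names = show_local_names()
print(local_names)  # ['label', 'radius']
```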

An important caution with names:

Since you can think of the name of a variable as a tag attached to an object, lists (and other mutable objects) behave in a way that can cause massive grief!

First, what happens when a single value is associated with a name (like a variable)?

[9]:
a = 5
b = a
print(f"{a=}, {b=}")
b = 6
print(f"{a=}, {b=}")
a=5, b=5
a=5, b=6

Now what happens when we have a list and change an element in b?

[10]:
a=[1.0, 2.0, 3.5, 4.9]
print (f'{a=}')
b=a
print (f'{b=}')
print ('_'*15)
b[2]=999
print (f'{a=}')
print (f'{b=}')
a=[1.0, 2.0, 3.5, 4.9]
b=[1.0, 2.0, 3.5, 4.9]
_______________
a=[1.0, 2.0, 999, 4.9]
b=[1.0, 2.0, 999, 4.9]

Oh no! Changing b also changed a!

The reason for this is that a and b are both pointing to the same memory location that’s storing the information (in this case, starting with the list [1.0, 2.0, 3.5, 4.9] and later becoming the list [1.0, 2.0, 999, 4.9]). This same behavior happens when using numpy arrays.
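
One way to check whether two names point to the same object is the is operator. The following sketch contrasts a second name for the same list with a new list that merely holds equal values.

```python
a = [1.0, 2.0, 3.5, 4.9]
b = a        # b is a second name for the same list object
c = list(a)  # c is a new list holding the same values

print(a is b)  # True: one object, two names
print(a is c)  # False: two separate objects
print(a == c)  # True: the contents are equal

b[2] = 999
print(a[2])  # 999: the change made through b is visible through a
print(c[2])  # 3.5: the separate list is unaffected
```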

The way around this is to make a copy of the information (by value rather than by reference). In typical Python, this means importing a module called copy and using either the function copy.copy (a shallow copy, which is enough for a flat list like the one here) or copy.deepcopy (which also copies any nested objects). In numpy, arrays have a built-in copy method.

[11]:
import copy
a = [1,2,3]
b = copy.copy(a)
b[2] = 99
print (a)
print (b)
[1, 2, 3]
[1, 2, 99]
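
The difference between copy.copy and copy.deepcopy only shows up when the list contains other mutable objects. A minimal sketch with a nested list:

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)   # new outer list, but the inner lists are shared
deep = copy.deepcopy(nested)  # new outer list and new inner lists

nested[0][0] = 99

print(shallow[0][0])  # 99: the shallow copy still shares the inner list
print(deep[0][0])     # 1: the deep copy is fully independent
```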

Namespaces: Global to Local

Python contains four common namespaces. We are going to investigate the behaviour of three of these:

  • The global namespace contains all variables that are defined at the “main” level of the program.

  • The enclosing namespace contains variables that are accessible to a function/method and are also accessible to any functions defined inside it.

  • The local namespace contains variables that are only accessible to the function or method that they are defined in.

[12]:
x = "global"

def f():
    x = "enclosing"
    print(x)

    def g():
        x = "local"
        print(x)

    g()
    print(x)

print(x)
f()
print(x)
global
enclosing
local
enclosing
global

Now remove x = "enclosing" and/or x = "local" and run the code. What’s happening here?

[13]:
x = "global"

def f():
    print(x)

    def g():
        x = "local"
        print(x)

    g()
    print(x)

print(x)
f()
print(x)
global
global
local
global
global
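
What is happening is a name lookup that works outward: Python looks for x in the local namespace first, then in any enclosing function, then in the global namespace, and finally in the built-in namespace (the fourth one, which holds names such as len and print). The sketch below isolates each case; the function names are made up for illustration.

```python
x = "global"

def reads_global():
    def inner():
        # No local or enclosing x exists, so Python falls back to the global x.
        return x
    return inner()

def reads_enclosing():
    x = "enclosing"
    def inner():
        # No local x exists, so Python uses the x of the enclosing function.
        return x
    return inner()

def reads_local():
    x = "enclosing"
    def inner():
        x = "local"  # a local x hides the enclosing and global ones
        return x
    return inner()

print(reads_global())     # global
print(reads_enclosing())  # enclosing
print(reads_local())      # local
print(len(x))             # 6: len itself is found in the built-in namespace
```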

Objects

Python supports object-oriented programming. This is, in fact, awesome! It can, however, be confusing at first. Let’s break it down…

Starting with a few definitions: first of all, basically everything in Python is an object. You can think of the word “object” to mean “thing”. Any of these things (or objects) can have both attributes and methods.

Attributes are just data associated with (or stored by) an object.

Methods are functions that do something with that data (or with other data).

Properties are special functions that do something with data, but behave like attributes.

A class is a set of definitions for the data structure and methods of an object. You can think of this like a blueprint.

An instance is an object using the definitions of a class. You can think of this as a building made from the blueprint.

Let’s try out some examples.

[14]:
def hola():
    print("hello world")
    return 42, "african or european?"
[15]:
hola
[15]:
<function __main__.hola()>
[16]:
meaning_of_life, sparrow_velocity = hola()
hello world
[17]:
print(f"{meaning_of_life=}, {sparrow_velocity=}")
meaning_of_life=42, sparrow_velocity='african or european?'

[18]:
class Person:
    def __init__(self, input_name, input_fav):
        self.name = input_name
        self.fav = input_fav

    def introduce_yourself(self):
        print (f"Hi, I'm {self.name}. I like {self.fav}")
[19]:
Person
[19]:
__main__.Person
[20]:
Fred = Person('Fredrick', 'beer')
Fred.name
[20]:
'Fredrick'
[21]:
Fred
[21]:
<__main__.Person at 0x1106c4b50>
[22]:
Fred.introduce_yourself()
Hi, I'm Fredrick. I like beer

A More Useful Class*

*marginally more useful

[23]:
class Rectangle(object):
    """
    this is a doc string
    """
    #this is just a comment
    def __init__(self, x, y, ID):
        self.length = x
        self.width = y
        self.ID = ID
[24]:
print(Rectangle)
<class '__main__.Rectangle'>
[25]:
r1 = Rectangle(2,3,'f')
print(r1.length)
print(r1.ID)
r2 = Rectangle(5,5,'dd')
2
f
[26]:
r2.length
[26]:
5
[27]:
all_my_rectangles = [r1,r2]
[28]:
all_my_rectangles[0].length
[28]:
2
[29]:
for rect in all_my_rectangles:
    print(rect.ID)
f
dd

Here, we’ve set up a class from which we can create instances later. Note that the syntax looks like a function. There are a couple of strange things that deserve an explanation.

  • The argument object is optional and has to do with inheritance (which will only be briefly introduced in this class).

  • It is common to include at least one method.

  • __init__ is a special method that initializes each new instance of the class.

  • The first argument of __init__, and really of any method of a class, is self.

More about self

self is the instance of the class that is being operated on. One could use a different name, but it is convention (deeply seated!!) to use self. A nice explanation is found on Stack Overflow and Guido van Rossum wrote an essay on why explicit self can’t go away.

Here’s one more explanation of the use and need for self: self history.

Basically, it comes down to “Explicit is better than implicit.” We want to know explicitly that we are working on an attribute of the object we are defining rather than some other function or variable that might be globally defined.
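
To make the role of self concrete, the sketch below shows that calling a method on an instance is equivalent to calling the function on the class and passing the instance explicitly as the first argument. Greeter is a made-up class used only for this illustration.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        # self is whichever instance the method was called on
        return f"Hi, I'm {self.name}"

fred = Greeter("Fredrick")

via_instance = fred.greet()      # Python passes fred as self automatically
via_class = Greeter.greet(fred)  # the same call with self passed explicitly

print(via_instance)
print(via_instance == via_class)  # True
```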

Now let’s make an instance and try all this out.

[30]:
big_rectangle = Rectangle(25, 35, 'rectangle one')
big_rectangle
vars(big_rectangle)
[30]:
{'length': 25, 'width': 35, 'ID': 'rectangle one'}
[31]:
big_rectangle.length
[31]:
25

We see now that we’ve made an instance and it is of the type Rectangle. We can check out the attributes using a dot (.).

[32]:
print(big_rectangle.width)
print(big_rectangle.length)
print(big_rectangle.ID)
35
25
rectangle one

We can use this set of attributes as a kind of database.

Questions

How could we make a group of rectangles of varying lengths and widths?

  • We could make each attribute a list.

  • We could make a list of instances with a member for each rectangle.

[33]:
myrectangles = Rectangle([25, 78], [44, 42], ['r1', 'r2'])
print(myrectangles)
myrectangles.width[0]

<__main__.Rectangle object at 0x111a111d0>
[33]:
44
[34]:
myrectangles_better = list()
myrectangles_better.append(Rectangle(25, 44, 'r1'))
myrectangles_better.append(Rectangle(78, 42, 'r2'))
myrectangles_better
[34]:
[<__main__.Rectangle at 0x111a1b050>, <__main__.Rectangle at 0x111a1afd0>]
[35]:
myrectangles_better[1].length
[35]:
78

Test your skills

Could we do this in a dictionary rather than a list?

keys —> ‘R1’ ‘R2’

[36]:
d = dict()
d["R1"] = Rectangle(3, 4, 'rect1')
d["R2"] = Rectangle(5, 99, "rect2")
[37]:
d["R1"].length
[37]:
3
[38]:
d["R2"].width
[38]:
99

There are advantages to both approaches. It would also be possible to define each attribute as a list or dictionary and make a single class. This is a bit more cumbersome, though, and part of the flexibility of dynamic lists and dictionaries is the ability to define multiple objects within them on the fly.

Methods

Now say we want to operate on these data, like to calculate the area of each rectangle.

Test your skills

In the blank code block below, calculate the areas for each rectangle using a loop.

[39]:
for k, val in d.items():
    area = val.length * val.width
    print(area)
12
495

Is there a more efficient way to do this if we know, for example, that area will be of interest?

We can define a method in the class that, when called, uses the length and width attributes to derive the area and stores it as an additional attribute.

[40]:
class Rectangle(object):
    def __init__(self, x, y, ID):
        self.length = x
        self.width = y
        self.ID = ID

    def calc_area(self):
        # we only pass self because everything we need is already stored on the instance
        self.area = self.length * self.width
[41]:
rr = Rectangle(3,4,'this')
vars(rr)
[41]:
{'length': 3, 'width': 4, 'ID': 'this'}
[42]:
rr.calc_area()
rr.__dict__
[42]:
{'length': 3, 'width': 4, 'ID': 'this', 'area': 12}

Let’s make our list of rectangle objects again, but use a method to calculate the areas.

[43]:
all_rectangles = list()
all_rectangles.append(Rectangle(35,25,'rectangle one'))
all_rectangles.append(Rectangle(150, 1000, 'big dog'))
vars(all_rectangles[0])
all_rectangles[0].__dict__
[43]:
{'length': 35, 'width': 25, 'ID': 'rectangle one'}
[44]:
for csqr in all_rectangles:
    csqr.calc_area()
[45]:
# let's loop over again to show the area
for csqr in all_rectangles:
    print(f"{csqr.ID} area={csqr.area}")
vars(all_rectangles[0])
rectangle one area=875
big dog area=150000
[45]:
{'length': 35, 'width': 25, 'ID': 'rectangle one', 'area': 875}

We could even incorporate calculations into the __init__ constructor so the area is calculated on instantiation.

[46]:
class Rectangle(object):
    def __init__(self, x, y, ID):
        self.length = x
        self.width = y
        self.area = x*y
        self.ID = ID
[47]:
rr = Rectangle(4,5,'bummer')
rr.__dict__
[47]:
{'length': 4, 'width': 5, 'area': 20, 'ID': 'bummer'}
[48]:
rr.length = 100
rr.__dict__
[48]:
{'length': 100, 'width': 5, 'area': 20, 'ID': 'bummer'}

Uh oh! The length changed, but the area of the rectangle didn’t.

Properties to the rescue!!!

Properties are special methods that behave like attributes. They allow for on-the-fly (“dynamic”) calculations and variable construction, among other things.

Properties are defined with a special decorator (@property). Decorators are an advanced topic and won’t be covered in this course. More information about decorators, how they are used, and how they work can be found here.

[49]:
class Rectangle(object):
    def __init__(self, x, y, ID):
        self.length = x
        self.width = y
        self.ID = ID

    @property
    def area(self):
        return self.length * self.width
[50]:
rect = Rectangle(5, 9, "awesome rectangle")
print(rect.__dict__)
print(rect.area)
{'length': 5, 'width': 9, 'ID': 'awesome rectangle'}
45

What happens if the length of the rectangle grows?

[51]:
for new_length in range(5, 100, 11):
    rect.length = new_length
    print(f"ID={rect.ID}, length={rect.length}, width={rect.width}, area={rect.area}")
ID=awesome rectangle, length=5, width=9, area=45
ID=awesome rectangle, length=16, width=9, area=144
ID=awesome rectangle, length=27, width=9, area=243
ID=awesome rectangle, length=38, width=9, area=342
ID=awesome rectangle, length=49, width=9, area=441
ID=awesome rectangle, length=60, width=9, area=540
ID=awesome rectangle, length=71, width=9, area=639
ID=awesome rectangle, length=82, width=9, area=738
ID=awesome rectangle, length=93, width=9, area=837
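
One consequence worth knowing: a property defined with only @property is read-only, so it cannot drift out of sync with the data it is computed from. The sketch below uses a made-up class name, SketchRectangle, to avoid clashing with the Rectangle class in the lesson.

```python
class SketchRectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    @property
    def area(self):
        # recomputed from the current length and width on every access
        return self.length * self.width

rect = SketchRectangle(5, 9)

try:
    rect.area = 100  # no setter was defined for area
    assignment_failed = False
except AttributeError:
    assignment_failed = True

print(assignment_failed)  # True

rect.width = 10
print(rect.area)  # 50
```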

Test your skills!!!

Build a Rectangle class that includes area and perimeter properties.

[52]:
class Rectangle(object):
    def __init__(self, x, y, ID):
        self.length = x
        self.width = y
        self.ID = ID

    @property
    def area(self):
        return self.length * self.width

    @property
    def perimeter(self):
        return 2 * (self.length + self.width)
[53]:
r = Rectangle(5, 6, "f")
r.perimeter
[53]:
22

Operator and Special Method Overloading*

One special thing we can do is overload operators and special methods. This means we can customize the behavior that an object (as an instance of a class) exhibits when an operator or built-in function is applied to it. We have already done something similar with the __init__ constructor: when an instance is made from a class, whatever is defined in __init__ is performed as part of creating the instance, which is effectively overloading the special method __init__.

Another really common example is __str__. This function provides a string for Python to display when print or str is called on an object.

Let’s look at the example of rectangle objects we used above.

*N.B. –> The concept of overloading is different in Python than in FORTRAN. Also, there appears to be a lack of precision about the use of the terms overload and override. I’m using the terminology from the O’Reilly book here, but note that you might find other people using override instead. The concept in this case is the same.

[54]:
class Rectangle(object):
    def __init__(self, x, y, ID):
        self.length = x
        self.width = y
        self.ID = ID

    @property
    def area(self):
        return self.length * self.width

    def __str__(self):
        return f"I'm a Rectangle and my name is {self.ID}, " \
            f"length={self.length}, width={self.width}, area={self.area}"
[55]:
rect = Rectangle(3, 4, 'super_awesome_rectangle')
[56]:
rect
[56]:
<__main__.Rectangle at 0x111a3f990>
[57]:
print(rect)
I'm a Rectangle and my name is super_awesome_rectangle, length=3, width=4, area=12

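Notice that displaying rect on its own (without print) still shows a default string telling us the rectangle is an object in this code (e.g. under __main__). We can overload this too and provide a more useful string using __repr__. This serves a similar purpose to __str__, but it is also used in other circumstances, such as when the object is displayed at the interactive prompt or inside a container, and print falls back to it when __str__ is not defined.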

What’s the difference between __repr__ and __str__? The Diet Mountain Dew crew has discussed this here

[58]:
class Rectangle(object):
    def __init__(self, x, y, ID):
        self.length = x
        self.width = y
        self.ID = ID

    @property
    def area(self):
        return self.length * self.width

    def __repr__(self):
        return f"I'm a Rectangle and my name is {self.ID}, " \
            f"length={self.length}, width={self.width}, area={self.area}"
[59]:
rect = Rectangle(3, 4, 'super_awesome_rectangle')
[60]:
rect
[60]:
I'm a Rectangle and my name is super_awesome_rectangle, length=3, width=4, area=12
[61]:
print(rect)
I'm a Rectangle and my name is super_awesome_rectangle, length=3, width=4, area=12
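
When a class defines both methods, str and print use __str__, while repr and containers such as lists use __repr__. A minimal sketch with a made-up Point class:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # unambiguous form, aimed at developers
        return f"Point({self.x}, {self.y})"

    def __str__(self):
        # friendlier form, used by print() and str()
        return f"({self.x}, {self.y})"

p = Point(1, 2)

print(str(p))    # (1, 2)
print(repr(p))   # Point(1, 2)
print(str([p]))  # [Point(1, 2)]: containers display their items with __repr__
```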

We can even overload other operators like __add__ which will control what happens when this object is added to another.

A complete list of which special methods and operators can be overloaded is found here.

[62]:
class Rectangle(object):
    def __init__(self, x, y, ID):
        self.length = x
        self.width = y
        self.ID = ID

    @property
    def area(self):
        return self.length * self.width

    def __add__(self, other):
        print("what should we add today???")
[63]:
r1 = Rectangle(1, 1, "Small rect")
r2 = Rectangle(3, 4, "bigger one")
[64]:
r1 + r2
what should we add today???

Test your skills – overload __add__ so that adding two rectangles adds their area

Start with the definition we just made. HINT: you will need to represent the other object with other.

[65]:
class Rectangle(object):
    def __init__(self, x, y, ID):
        self.length = x
        self.width = y
        self.ID = ID

    @property
    def area(self):
        return self.length * self.width

    def __add__(self, other):
        print(self.area + other.area)
        return self.area + other.area

Operator overloading is very powerful, but with great power comes great responsibility. Use with caution.
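
One way to exercise that caution is to handle only the operands the class understands and return NotImplemented for everything else, so that Python raises a normal TypeError. This is a sketch of one possible design, not the only answer to the exercise; AreaRectangle is a made-up name.

```python
class AreaRectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    @property
    def area(self):
        return self.length * self.width

    def __add__(self, other):
        # Only rectangles can be added; anything else is handed back to Python.
        if not isinstance(other, AreaRectangle):
            return NotImplemented
        return self.area + other.area

total = AreaRectangle(1, 1) + AreaRectangle(3, 4)
print(total)  # 13

try:
    AreaRectangle(1, 1) + "not a rectangle"
    raised_type_error = False
except TypeError:
    raised_type_error = True

print(raised_type_error)  # True
```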

Object-oriented programming (OOP)

The O’Reilly book Learning Python, 5th Edition has a great discussion about object-oriented programming. The author makes the distinction that much of what we are doing with Python is object-based, but to truly be object-oriented, we also need to use something called inheritance.

Inheritance

Let’s revisit our class for rectangles without the overloading of __add__.

[66]:
class Rectangle(object):
    def __init__(self, x, y, ID):
        self.length = x
        self.width = y
        self.ID = ID
        self._type = "Rectangle"

    @property
    def area(self):
        return self.length * self.width

    def __repr__(self):
        return f"I'm a {self._type} and my name is {self.ID}, " \
            f"length={self.length}, width={self.width}, area={self.area}"

We can inherit these characteristics (the methods and properties) in a new kind of class that has a custom bit of functionality. Say we would like to create a second class that is specifically for representing squares.

We can define a new class that inherits the Rectangle attributes and methods. We can even add new functionality on top of it.

[67]:
# Rectangle is the parent class
class Square(Rectangle):
    """
    Doc strings for Square
    """
    def __init__(self, x, ID):
        # initialize the parent class using the super().__init__ call
        super().__init__(x, x, ID)
        self._type = "Square"

    @property
    def perimeter(self):
        return 2 * (self.length + self.width)

    def __repr__(self):
        # get the __repr__ from Rectangle using super()
        s = super().__repr__()
        s += f", perimeter={self.perimeter}"
        return s
[68]:
# live code example
rect = Rectangle(2, 4, "Tony")
square = Square(5, "Ravioli")
[69]:
rect
[69]:
I'm a Rectangle and my name is Tony, length=2, width=4, area=8
[70]:
square
[70]:
I'm a Square and my name is Ravioli, length=5, width=5, area=25, perimeter=20
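
The parent/child relationship can also be checked directly with the built-in functions isinstance and issubclass. The sketch below uses two stripped-down, made-up classes (BaseRectangle and BaseSquare) so that it runs on its own.

```python
class BaseRectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    @property
    def area(self):
        return self.length * self.width

class BaseSquare(BaseRectangle):
    def __init__(self, side):
        # reuse the parent initializer with equal sides
        super().__init__(side, side)

sq = BaseSquare(5)

print(sq.area)                                  # 25, inherited from the parent
print(isinstance(sq, BaseSquare))               # True
print(isinstance(sq, BaseRectangle))            # True: a square is also a rectangle
print(issubclass(BaseSquare, BaseRectangle))    # True
print(isinstance(BaseRectangle(2, 4), BaseSquare))  # False: not every rectangle is a square
```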

Modules, Packages, and the Standard Python Library

The Python Standard Library is the set of modules that come with Python by default.

More technically, names point to “objects”. A “module” is a file (typically with extension .py) that contains Python code. If there are functions in that code, they can be accessed using the name of the module and a dot (.).

Packages are collections of modules and are often “installed” to be accessible to Python from anywhere. More on that at the end of the lesson.

Let’s import a module and find a function within it.

[71]:
# live code example
import random
[72]:
random.random
[72]:
<function Random.random()>
[73]:
random.random()
[73]:
0.8012558485047404

Importing code and handling namespaces

There are several main ways to import a module.

The most straightforward way is to just use import <somepackage> as we did above.

[74]:
import math
math
[74]:
<module 'math' from '/Users/mnfienen/miniforge3/envs/pyclass/lib/python3.11/lib-dynload/math.cpython-311-darwin.so'>

This then shows that math is a module. Whenever you want to use a function from math, you just use the dot, like math.sqrt.

The main advantage to this approach is you always know the provenance of any function. Also, you could (bad idea!) make your own function called sqrt without it clashing with math.sqrt.

[75]:
def sqrt(numb):
    # newton's method
    def f(i0, numb):
        return (i0 ** 2) - numb

    def f_prime(i0):
        return 2 * i0

    err = 100000
    i0 = 5
    while err > 0.01:
        i1 = i0 - f(i0, numb) / f_prime(i0)
        err = abs(i1 - i0)
        i0 = i1

    print (f'my complicated function estimates sqrt as--> {i1}!')

[76]:
# live code example
math.sqrt
[76]:
<function math.sqrt(x, /)>
[77]:
sqrt(5)
my complicated function estimates sqrt as--> 2.2360688956433634!
[78]:
math.sqrt(5)
[78]:
2.23606797749979

Another option is to import only the functions you need from a module, like from math import sqrt. The problem here is that, later in the code, we don’t necessarily know where sqrt came from. Whichever was either imported or created most recently gets that name in the namespace. DANGER!

[79]:
from math import sqrt
sqrt
[79]:
<function math.sqrt(x, /)>
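
The "most recent wins" behavior can be seen in a few lines. In this sketch a home-made sqrt is defined first and then silently replaced by the import; the home-made function is a placeholder written only for this illustration.

```python
def sqrt(value):
    # a stand-in for our own function that happens to share the name
    return "my own sqrt"

first_result = sqrt(4)

from math import sqrt  # rebinds the name sqrt to math.sqrt

second_result = sqrt(4)

print(first_result)   # my own sqrt
print(second_result)  # 2.0
```

Nothing warns us that the first definition is no longer reachable, which is the danger described above.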

You can also use an alias to import a specific function like from math import sqrt as math_sqrt. In this case, and in the case above, you can get the provenance from the import statements at the top of the code, but if the code gets really long, this can be hard to keep track of.

[80]:
from math import sqrt as math_sqrt
math_sqrt
[80]:
<function math.sqrt(x, /)>

Living really dangerously, you can import all functions from a module like from math import *

[81]:
from math import *
sqrt, log, log10, floor, ceil
[81]:
(<function math.sqrt(x, /)>,
 <function math.log>,
 <function math.log10(x, /)>,
 <function math.floor(x, /)>,
 <function math.ceil(x, /)>)

The problem here is, you now have access to all these functions, but you also don’t know provenance at all. Some modules, like numpy, which will be covered later in this class, are large and have many functions (many of which may have common names that you might use yourself and that you might not be aware of).

So, really, the safest way is like the first way, but that can get long. For example, if you use import matplotlib, then every time you use a function from the module you have to type matplotlib.<some function>, and that gets verbose. A compromise is importing an entire module but assigning it an alias, like import numpy as np.

[82]:
import numpy as np

There is a commonly accepted set of aliases for some common scientific computing modules that we recommend:

  • import matplotlib.pyplot as plt

  • import numpy as np

  • import matplotlib as mpl

  • import pandas as pd

In addition to keeping the provenance straight, adopting this protocol helps make your code more readable by other people. Remember the Zen of Python!!!

[83]:
import this
print("".join([this.d.get(c, c) for c in this.s]))
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Paths for importing and installation

From the official documentation, the hierarchy of searching for modules and packages is:

  • the directory containing the input script (or the current directory).

  • PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).

  • the installation-dependent default.

On Windows, PYTHONPATH is often set as a system environment variable, which requires an administrative account to change or add to. You can see your search path using the built-in sys module.

[84]:
import sys
sys.path
[84]:
['/Users/mnfienen/Documents/GIT/python-for-hydrology/notebooks/part0_python_intro/solutions',
 '/usr/local/condor/lib/python3',
 '/Users/mnfienen/miniforge3/envs/pyclass/lib/python311.zip',
 '/Users/mnfienen/miniforge3/envs/pyclass/lib/python3.11',
 '/Users/mnfienen/miniforge3/envs/pyclass/lib/python3.11/lib-dynload',
 '',
 '/Users/mnfienen/miniforge3/envs/pyclass/lib/python3.11/site-packages']
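
Because sys.path is an ordinary list, a directory can also be added to the search path from within a session. The sketch below writes a tiny module into a temporary directory, adds that directory to sys.path, and imports the module. The module name hypothetical_shapes and its function square_area are invented for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write a one-function module into a directory Python does not know about yet.
    module_file = Path(tmp_dir) / "hypothetical_shapes.py"
    module_file.write_text("def square_area(side):\n    return side * side\n")

    sys.path.insert(0, tmp_dir)  # put the directory at the front of the search path
    try:
        importlib.invalidate_caches()
        shapes = importlib.import_module("hypothetical_shapes")
        area = shapes.square_area(3)
    finally:
        sys.path.remove(tmp_dir)  # leave the search path as we found it

print(area)  # 9
```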

Exercise: putting it all together

In this exercise we’ll create our first module and import it into this notebook.

Open the Python file “exercise_xx.py” in an IDE or text editor and create a class called Circle. Inputs to Circle should be a radius and ID. Include in the Circle class a way to calculate the area and the circumference. After building the class, try importing it into this notebook and using it.

Bonus exercise: Find a way to make the Circle objects divisible and compare the difference in area between a 12” and 14” pizza.

[85]:
from circle_module import Circle
[86]:
med = Circle(12, 'medium')
large = Circle(14, "large")

print(med/large)
0.7346938775510204
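
The contents of circle_module are not shown here. One possible sketch of such a class, assuming that dividing two circles should return the ratio of their areas, overloads the special method __truediv__ (the method behind the / operator). The numbers 12 and 14 are passed as the radius, as in the cell above; the ratio is the same whether they are read as radii or diameters.

```python
import math

class Circle:
    def __init__(self, radius, ID):
        self.radius = radius
        self.ID = ID

    @property
    def area(self):
        return math.pi * self.radius ** 2

    @property
    def circumference(self):
        return 2 * math.pi * self.radius

    def __truediv__(self, other):
        # dividing two circles gives the ratio of their areas (an assumed design choice)
        if not isinstance(other, Circle):
            return NotImplemented
        return self.area / other.area

med = Circle(12, "medium")
large = Circle(14, "large")

ratio = med / large
print(ratio)
```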