Introduction
This tutorial was presented at the NGCM Summer Academy 2016 Basics B course. It teaches Python programming concepts that extend those covered in a Software Carpentry style basics course. The material is taught in Python 3, and the code outputs displayed are the result of running a Python 3.5 Jupyter notebook.
The concepts of OOP are often covered only lightly in undergraduate and basics courses. Despite this, a large number of scientific Python packages are written in an object oriented manner. The skills learned in this workshop should therefore enable scientists to write (and more importantly, read) the packages they are using, and also serve as a starting platform for those wanting to learn more advanced OOP languages such as C++ or Java.
Software Requirements
Setup instructions for a range of users are available here. Click here for more info on how to start a live Jupyter notebook.
Getting Started
Teaching material is largely in Jupyter Notebook format, and should be downloaded or cloned from GitHub - data included in the links will be necessary for completing the exercises.
Material Outline¶
Basics Revision
- Lists, loops, functions, Numpy, Matplotlib
Tuples, mutability
Dictionaries
- Associative data, function argument unpacking
Material Outline (ctd)¶
- Object Oriented Programming
Classes
- Initialization and __init__, self
- Encapsulation
- Inheritance
- Magic operators
Prerequisites¶
- Python >= 3.4, Jupyter (Notebook)
- Numpy, Matplotlib
- Download material: https://github.com/p-chambers/Intermediate_Python/archive/master.zip
If you are following this course and do not know how to obtain the above requirements, see Setup Instructions.
# Run this cell before trying examples
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
Basics Refresher¶
- Lists
- For loops
- Functions
- Numpy
- Matplotlib
External Material¶
- If you missed the basics, try these:
Lists¶
- Mutable Container
- Often appended to in a loop with 'append'
- Handles mixed data types
- Allows indexing and slicing
Example: List indexing and slicing¶
lists = [10, 12, 14, 16, 18]
print(lists[0]) # Index starts at 0
print(lists[-1]) # Last index at -1
print(lists[0:3]) # Slicing: exclusive of end value
# i.e. get i=(0, 1, .. n-1)
print(lists[3:]) # "slice from i=3 to end"
Example: Methods of lists¶
# List construction example
a = []
print(a)
a.append('Hello world')
print(a)
a.extend([1, 2, 3, 4])
print(a)
a.remove(1) # Remove value 1 from a
print(a)
a.pop(0)
print(a)
# All methods
# a. # Tab complete behaviour?
For Loops¶
- Need to iterate through something in Python
- Not an explicit counting loop as used in lower level languages
- Syntax (var is user defined): for var in iterable:
Basic Example:¶
powers = [0, 1, 2, 3]
for power in powers:
    value = 10 ** (power)
    print("10 to the power of {} is {}".format(power, value))
# Better to use Python's built-in 'range' here:
for i in range(4):
    print("10 to the power of {} is {}".format(i, 10**i))
List comprehension¶
x = [i**2 for i in range(10)]
y = [i*10 for i in range(10)]
print(x)
print(y)
#### How would you create a list from zero to 100 in increments of 5 in one line
Functions¶
- def followed by function name and parameters in parentheses
- return output_var(s) ends the function
- Args passed by assignment: Mutability dictates whether the function can update args directly
Basic function¶
def square_root(x):
    """Useful docstring: Calculates and returns square root of x"""
    i = x ** 0.5
    return i
x = 10
y = square_root(x)
print('The square root of {} is {}'.format(x, y))
# We can set a default value to the function
def square_root(x=20):
    i = x ** 0.5
    return i
print(square_root())
# Loops, functions and appending
mylist = []
for i in range(1,5):
    mylist.append(square_root(i))
print(mylist)
Example: Arguments and mutability¶
def update_integer(i):
    # Attempt to update i: integers are immutable, so the caller's value is unchanged
    i += 1
def update_list_end(arglist):
    arglist[-1] = 50 # Lists are mutable: updates args directly!
a = 1
update_integer(a)
print(a)
mylist = [0, 1, 2, 3, 4]
update_list_end(mylist)
print(mylist)
Note above that no return statement is required: implicitly, these functions return the Python built-in value None.
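As a quick check of the note above, the sketch below shows that a function without a return statement evaluates to None. The function name `no_return` is hypothetical and is not part of the course material.

```python
def no_return(values):
    """Modify the list in place; there is deliberately no return statement."""
    values[-1] = 50


mylist = [0, 1, 2, 3, 4]
result = no_return(mylist)

print(result)   # the implicit return value
print(mylist)   # the list itself was updated in place
```

Storing the result of such a function is a common source of bugs, since the caller ends up holding None rather than the updated list.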
Numpy¶
- Arrays and array operations
- Mathematical evaluations - fast on np.array
- Linear algebra
- Useful functions
- Allows integration with C/C++ and Fortran
Examples: Basic functions¶
import numpy as np
# basic usage: arange, linspace, array ops
x = np.linspace(0, 10, 11) # use 11 points
print(x)
y = np.arange(0, 10, 1) # use step size of 1
print(y)
print('The average of x is', np.average(x))
Example: 2D arrays¶
M1 = np.array([[2,3],[6,3]])
M2 = np.array([[5,6],[2,9]])
print('M1:')
print(M1)
print('M2:')
print(M2)
M3 = M1 * M2 # Element-wise multiplication
print(M3, '\n')
M4 = np.dot(M1, M2) # Matrix multiplication
print(M4)
# Given array [0, np.pi/2., np.pi, 3*np.pi/4.] what would you
# expect passing it to np.sin ????
# live coding show some numpy functions.
Matplotlib¶
- Popular plotting library
- Can produce publication quality plots
- Allows embedded LaTeX formatting
Example: 2D plotting¶
x = np.linspace(0, 2*np.pi)
y = np.sin(x)
fig = plt.figure(figsize=(12, 5))
ax = fig.add_subplot(111)
ax.plot(x, y,'o-')
ax.margins(0.1)
ax.set_title('2D plot')
ax.set_xlabel('$x$')
ax.set_ylabel(r'$sin(x)$')
What if we want increments of $\pi$ on our x axis¶
xtick_values = np.linspace(0, 2*np.pi, 5)
xtick_labels = ['$0$', r'$\frac{\pi}{2}$', r'$\pi$', r'$\frac{3\pi}{2}$',
r'$2\pi$']
fig = plt.figure(figsize=(12, 5))
ax = fig.add_subplot(111); ax.plot(x, y,'-o')
ax.set_title('2D plot')
ax.margins(0.1)
ax.set_xlabel('$x$'); ax.set_ylabel(r'$sin(x)$')
ax.set_xticks(xtick_values)
ax.set_xticklabels(xtick_labels, fontsize=25);
Example: 3D plot¶
x = np.linspace(-1, 1, 101)
y = np.linspace(-1, 1, 101)
X, Y = np.meshgrid(x, y)
Z = np.sin(X + Y)**2
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure(figsize=(12, 5))
ax = fig.add_subplot(111, projection='3d')
surf = ax.plot_surface(X, Y, Z)
ax.set_xlabel(r'$X$')
ax.set_ylabel(r'$Y$')
ax.set_zlabel(r'$Z$')
plt.show()
# Live coding Bay....
Tuples¶
- An Immutable List
- Faster than a List (fixed memory)
- Useful for structured data
- No append method - bad for sequential data
Example: Tuple Syntax¶
# Create a 'Name, Age' Tuple using bracket notation
my_tuple = ('Dave', 42)
print(type(my_tuple))
print(my_tuple)
# Create Tuple using bracket-less notation
my_tuple2 = 'Bob', 24
print(type(my_tuple2))
print(my_tuple2)
Example: Usage¶
# Tuple indexing
my_tuple = ('Dave', 42)
print(my_tuple[0])
print(my_tuple[1])
# Could make a list of tuples:
tups = [('Dave', 42), ('Bob', 24)]
# ... and then iterate over it
for tup in tups:
    print("{} is {} years old".format(tup[0], tup[1]))
Example: Tuple Unpacking¶
# Store multiple variables using tuples:
my_tuple = 'Dave', 42
a, b = my_tuple
print('a = {}'.format(a))
print('b = {}'.format(b))
# Swap Variables using tuples:
b, a = a, b
print('a = {}'.format(a))
print('b = {}'.format(b))
Example: When NOT to use a Tuple (1)¶
# extending or overwriting contents
my_tuple = 'Dave', 42
# my_tuple[0] = 'Steve' # Will give an error
Example: When NOT to use a Tuple (2)¶
# Sequences: Stick with a list
seq = [] # tuples have no append method, so need a list []
for i in range(10):
    seq.append(i**2)
print(seq)
# Or a numpy array:
print(np.arange(10)**2)
# Create a tuple of lists 'a' - can you change the values in
# the lists?
# Live coding Bay....
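One possible answer to the question above, as a minimal sketch: the tuple itself cannot be changed, but the lists it refers to are still mutable. The values used here are arbitrary.

```python
# A tuple holding two (mutable) lists
a = ([1, 2, 3], [4, 5, 6])

# The lists inside the tuple can still be changed in place
a[0][0] = 100
a[1].append(7)
print(a)

# Rebinding an element of the tuple itself is not allowed
try:
    a[0] = [0, 0, 0]
except TypeError as err:
    print("TypeError:", err)
```

Immutability therefore applies to which objects the tuple holds, not to the contents of those objects.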
Dictionaries¶
- Set of key:value pairs
- Ordering follows hash table rules (in Python 3.5), not so intuitive to humans
- Use curly braces - {} or the dict function
Example: Fruit Prices Lookup Table - Construction¶
# Using the dict function:
fruit = [('apples', 2), ('bananas', 5), ('pears', 10)]
price_table = dict(fruit)
print(price_table)
# Short hand (Arguably neater)
price_table = {'apples': 2, 'pears': 10, 'bananas': 5}
print(price_table)
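Note: notice the consistent order in which the dictionaries are printed, even though the inputs are reordered. In the Python 3.5 used here, the ordering of hash tables is well defined, but not in a human-intuitive sense. (From Python 3.7 onwards, dictionaries preserve insertion order, so the two printouts would differ.) We should therefore treat the data as if it were unordered.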
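A short sketch of what this means in practice, reusing the fruit prices from the example above. The variable names `table_a` and `table_b` are hypothetical. Equality between dictionaries does not depend on ordering, and sorting the keys explicitly gives a predictable order on any Python 3 version.

```python
import sys

table_a = {'apples': 2, 'bananas': 5, 'pears': 10}
table_b = {'apples': 2, 'pears': 10, 'bananas': 5}

# Equality ignores ordering: only the key: value pairs matter
print(table_a == table_b)

# From Python 3.7 onwards, iteration follows insertion order
if sys.version_info >= (3, 7):
    print(list(table_b))

# Sort the keys explicitly when a particular order is needed
for key in sorted(table_b):
    print(key, table_b[key])
```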
Example: Accessing values from keys¶
price_table = {'apples': 2, 'bananas': 5, 'pears': 10}
akey = 'apples'
print("The price of {} is {}p".format(akey, price_table[akey]))
Example: Iterating over a dictionary¶
# Iterating over the dictionary will iterate over its keys
price_table = {'apples': 2, 'bananas': 5, 'pears': 10}
for key in price_table:
    print("{} cost {}p".format(key, price_table[key]))
# Or use the items method:
for key, val in price_table.items():
    print("{} cost {}p".format(key, val))
Example: Shopping list using dictionary price lookup¶
# I don't like pears, so let's buy apples and bananas
shopping_list = [('apples', 50), ('bananas', 20)]
total = 0
for item, quantity in shopping_list:
    price = price_table[item]
    print('Adding {} {} at {}p each'.format(quantity, item, price))
    total += price * quantity
print(total)
Example: When NOT to use a Dictionary¶
# Hoping for ordered data:
alpha_num = {'a': 0, 'b': 1, 'c': 2}
for i, key in enumerate(alpha_num.keys()):
    print("{} has a value of {}".format(key, i)) # This is wrong
Example: Dictionary unpacking using '**'¶
mydict = {'a':1, 'b':2, 'c':3}
def myFunc(a,b,c):
    return a*2, b*2, c*2
myFunc(**mydict)
# Live coding Bay....
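The same idea applies to positional arguments, where a single '*' unpacks a sequence such as a tuple or list. A minimal sketch, using a hypothetical function `scale` that is not part of the course material:

```python
def scale(a, b, c, factor=2):
    """Return the three inputs multiplied by factor."""
    return a * factor, b * factor, c * factor


args = (1, 2, 3)                 # positional arguments as a tuple
kwargs = {'factor': 10}          # keyword arguments as a dictionary

doubled = scale(*args)           # same as scale(1, 2, 3)
scaled = scale(*args, **kwargs)  # same as scale(1, 2, 3, factor=10)
print(doubled)
print(scaled)
```

The dictionary keys must match the parameter names of the function, otherwise a TypeError is raised.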
Structured data: Numpy dtypes¶
- Not a class, but motivational example
- Structured associative data
- Multiple data accessible from a single data type
- Identifiers to indicate data type
Example: Data about people¶
with open('data/structured_data.txt', 'w') as f:
    f.write('#Name Height Weight\n')
    f.write('John 180 80.5\n')
    f.write('Paul 172 75.1\n')
    f.write('George 185 78.6\n')
    f.write('Ringo 170 76.5\n')
# Notice that the argument is a list of tuples
dt = np.dtype([('Name', np.str_, 16), ('Height', np.int32),
               ('Weight', np.float64)])
data = np.loadtxt('data/structured_data.txt', dtype=dt)
print(data)
print(data['Name'])
print("{} has weight {}".format(data[0]['Name'], data[0]['Weight']))
# Live coding Bay....
Exercise: Numpy dtypes¶
- Data is structured, but not elegant
- no methods etc
Intro to Python OOP¶
(For The Classy Programmer)¶
"Object-oriented programming (OOP) refers to a type of computer programming in which programmers define not only the data type of a data structure, but also the types of operations (functions) that can be applied to the data structure."
Source: Webopedia
Why OOP?¶
- Naturally structured data
- Functions used in context
- Reduce duplicate code
- Maintainability in large codes/software
- Many other reasons
OOP in Scientific Computing¶
- Java, C++ and Python designed for OOP
- Everything in Python is an object
- Scientific libraries, visualisation tools etc.
- Pseudo Object Orientation in C and Fortran
What will I learn about OOP here?¶
- Language in OOP is very different
- Learn language used in eg. C++, Java
- Ability to read code is essential
- Write/migrate code for community library
- Better world! Work recognition etc...
OOP: Four Fundamental Concepts¶
Inheritance
- Reuse code by deriving from existing classes
Encapsulation
- Data hiding
OOP: Four Fundamental Concepts (2)¶
Abstraction
- Simplified representation of complexity
Polymorphism
- API performs differently based on data type
Note: Encapsulation is sometimes also used in OOP to describe the grouping of data with methods. It is however more common for texts to use it to describe the hiding of data as will be done here.
Useful explanations of these concepts for Python can also be found here
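To make the polymorphism bullet more concrete, here is a small sketch in which the same call behaves differently depending on the type of the object. The classes `Circle` and `Square` are hypothetical and only serve as an illustration; class syntax is introduced properly in the sections that follow.

```python
import math


class Circle(object):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Square(object):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


# The same call, shape.area(), behaves differently for each type
shapes = [Circle(1.0), Square(2.0)]
areas = [shape.area() for shape in shapes]
print(areas)
```

The loop does not need to know which kind of shape it is handling, which is also a simple form of abstraction.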
Classes: Basics¶
- Attributes (data)
- Methods (Functions operating on the attributes)
- 'First class citizens': Same rights as core types
- pass to functions, store as variables etc.
Example: Numpy arrays showing how it's done¶
# Numpy arrays are classes
import numpy as np
a = np.array([0, 1, 6, 8, 12])
print(a.__class__)
print(type(a))
# We want to operate on the array: try numpy cumulative sum function
print(np.cumsum(a))
# np.cumsum('helloworld') # Should we expect this to work?
Example: Numpy arrays showing how it's done (ctd.)¶
We only know what a cumulative sum means for a narrow scope of data types
Group them together with an object!
# cumsum is a method belonging to a
a.cumsum()
Example: Simple class¶
- For now, assume all classes are defined by: class ClassName(object)
class Greeter(object):
    def hello(self): # Method (more on 'self' later)
        print("Hello World")
agreeter = Greeter() # 'Instantiate' the class
print(agreeter)
# agreeter. # Tab complete?
There are a few things here which haven't been introduced yet, but all will become clear in the remainder of this workshop.
# Note that we don't pass an argument to hello!
agreeter.hello()
Classes: Initialisation and self¶
- __init__ class method
- Called on creation of an instance
- Convention: self = instance
- Implicit passing of self, explicit receive
Note: Receiving self is also done implicitly in other languages, e.g. C++ and Java, and proponents of those languages may argue that this is better. "Explicit is better than implicit" is simply the Python way.
What is an "instance"?¶
- Class is like a type
- Instance is a specific realisation of that type
- eg. "Hello World" is an instance of string
- Instance attributes are not shared
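The last two bullets can be sketched as follows. The class `Sample` is hypothetical, and it uses `__init__` and `self`, which are explained in the next sections.

```python
class Sample(object):
    def __init__(self, value):
        self.value = value  # stored on this particular instance


first = Sample(1)
second = Sample(2)

first.value = 10      # only changes the attribute of 'first'
print(first.value, second.value)

# A string literal is an instance of the built-in str type
print(isinstance("Hello World", str))
```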
Classes: Initialisation vs Construction¶
- Initialisation changes the instance when it is made
- ... __init__ is not technically Construction (see: C++)
- __new__ 'constructs' the instance before __init__
- __init__ then initialises the content
More info: The Constructor creates the instance, and the Initialiser initialises its contents. Most languages, e.g. C++, refer to these interchangeably and perform these steps together; however, the new style classes in Python split the process.
The difference is quite fine, and for most purposes we do not need to redefine the behaviour of __new__. This is discussed in several Stack Overflow threads, e.g.
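The order of the two steps can be observed with a small sketch. The class `Tracked` and the list `calls` are hypothetical names; overriding `__new__` like this is rarely needed in practice and is done here purely to record when each method runs.

```python
calls = []  # records the order in which the methods run


class Tracked(object):
    def __new__(cls, *args, **kwargs):
        calls.append('__new__')
        # object.__new__ creates and returns the (still empty) instance
        return super().__new__(cls)

    def __init__(self, value):
        calls.append('__init__')
        self.value = value  # initialise the content of the new instance


t = Tracked(5)
print(calls)
print(t.value)
```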
Example: Class Initialisation¶
class A(object):
    def __init__(self):
        print("Hello")
a_instance = A()
print(type(a_instance))
Instance attributes and Class attributes¶
- Instance attributes definition: self.attribute = value
- Class attributes defined outside functions (class scope)
- Class attributes are shared by all instances
- Be careful with class attributes
Example: Defining Instance attributes/methods¶
class Container(object):
    """Simple container which stores an array as an instance attribute
    and defines an instance method"""
    def __init__(self, N):
        self.data = np.linspace(0, 1, N)
    def plot(self):
        fig = plt.figure()
        ax = fig.add_subplot(111)
        ax.plot(self.data, 'bx')
mydata = Container(11) # 11 is passed as 'N' to __init__
print(mydata.__dict__) # __dict__ is where the attr: value
                       # pairs are stored!
mydata.plot()
Your turn!¶
- Implement a class which takes an input number 'N', and doubles and stores it as an instance attribute
- Test it!
# Code solution here:
Example: Class attributes vs instance attributes¶
class Container(object):
    data = np.linspace(0, 1, 5) # class attribute
    def __init__(self):
        pass
a, b = Container(), Container()
print(a.data)
print(b.data)
a.data = 0 # Creates INSTANCE attribute
Container.data = 100 # Overwrites CLASS attribute
print(a.data)
print(b.data)
Class vs Instance attributes: priority¶
- Source: toptotal.com
Note: There are a couple of things going on in this example which are worth elaborating on. By specifying ClassName.attribute, in this case Container.data = 100, we've overwritten the value of data that EVERY instance of the Container class will access. Hence printing b.data gives the expected result.
By setting a.data beforehand, we created an instance attribute on a. Instance attributes take priority in attribute lookup, so a.data still returns 0 even though we overwrote the class attribute afterwards.
This could create a hard to track bug. To avoid it:
- Stick to instance variables unless you specifically need to share data e.g. constants, total number or list of things that are shared
- Don't overwrite things you know are class attributes with instance.attr unless you really know what you're doing (even then, it's probably better and more readable to make it an instance attribute)
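A related pitfall, sketched below under the assumption that the shared data is a mutable object such as a list: modifying it in place through one instance changes it for every instance. The classes `SharedLog` and `OwnLog` are hypothetical.

```python
class SharedLog(object):
    entries = []              # class attribute: one list shared by all instances

    def add(self, item):
        self.entries.append(item)   # mutates the shared list in place


class OwnLog(object):
    def __init__(self):
        self.entries = []     # instance attribute: a new list per instance

    def add(self, item):
        self.entries.append(item)


a, b = SharedLog(), SharedLog()
a.add('from a')
print(b.entries)              # b sees the entry added through a

c, d = OwnLog(), OwnLog()
c.add('from c')
print(d.entries)              # d is unaffected
```

Sharing can be exactly what is wanted, for example for a registry of all instances, but it should be a deliberate choice.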
For a really in depth explanation of class vs instance attributes, see either of the following links:
Example: Implicit vs Explicit passing Instance¶
class Container(object):
    def __init__(self, N):
        self.data = np.linspace(0, 1, N)
    def print_data(self):
        print(self.data)
a = Container(11)
a.print_data() # <<< This is better
Container.print_data(a)
Classes: Encapsulation¶
- Hiding data from users (and developers)
- Use underscores '_' or '__'
- Useful if data changing should be controlled
- Convention only in Python - not enforced!
Single vs Double Underscore¶
Single underscore
- Nobody outside this class or derived classes should access or change
Double underscore
- Stronger attempt to enforce the above
- Also 'mangles' the attribute name to instance._ClassName__Attribute
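The effect of name mangling can be inspected through the instance `__dict__`. A minimal sketch, with a hypothetical `Account` class and made-up attribute names:

```python
class Account(object):
    def __init__(self):
        self._balance = 10      # single underscore: convention only
        self.__pin = 1234       # double underscore: name is mangled


acc = Account()
print(sorted(acc.__dict__))     # the attribute names as actually stored

# The mangled name is reachable from outside, the plain one is not
print(acc._Account__pin)
print(hasattr(acc, '__pin'))
```

Mangling only renames the attribute, so it makes accidental access less likely rather than preventing access.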
Example: Data hiding "protected", single underscore¶
class Fruit(object):
    def __init__(self):
        self._hasjuice = True
    def juice(self):
        if not self.isfull(): raise ValueError('No juice!')
        self._hasjuice = False
    def isfull(self):
        return self._hasjuice
orange = Fruit()
print(orange.isfull())
orange.juice()
print(orange.isfull())
# orange. # tab completion behaviour?
# orange._ # tab completion behaviour now?
orange._hasjuice = True # bad!
orange.isfull()
Example: Data hiding "private", double underscore¶
class Fruit(object):
    def __init__(self):
        self.__hasjuice = True
    def juice(self):
        if not self.isfull(): raise ValueError('No juice!')
        self.__hasjuice = False
    def isfull(self):
        return self.__hasjuice
apple = Fruit()
# apple._ # tab completion behaviour?
apple.juice()
apple._Fruit__hasjuice = False # Definitely bad!
apple.isfull()
Note: This behaviour can be overused in Python. Programmers from C++ or Java backgrounds may want to make all data hidden or private and access the data with 'getter' or 'setter' functions; however, it's generally accepted by Python programmers that getters and setters are unnecessary. The Pythonista phrase is "we are all consenting adults here", meaning you should trust the programmer to interact with your classes and they should trust you to document/indicate which parts of the data not to touch unless they know what they're doing (hence the underscore convention). See the top answer on this Stack Overflow thread.
For an entertaining view of encapsulation, see this blog
# Live coding Bay....
Classes: Inheritance¶
- Group multiple objects and methods
- Child/Derived class inherits from Parent/Base
- Reduce duplicate code
- Maintainable: changes to base fall through to all
- Beware multiple inheritance rules - we won't cover this
Example: Simple inheritance¶
class Parent(object):
    # Note the base __init__ is overridden in
    # Child class
    def __init__(self):
        pass
    def double(self):
        return self.data*2
class Child(Parent):
    def __init__(self, data):
        self.data = data
achild = Child(np.array([0, 1, 5, 10]))
achild.double()
Example: Calling parent methods with super¶
class Plottable(object):
    def __init__(self, data):
        self.data = data
    def plot(self, ax):
        ax.plot(self.data)
class SinWave(Plottable):
    def __init__(self):
        super().__init__(
            np.sin(np.linspace(0, np.pi*2, 101)))
class CosWave(Plottable):
    def __init__(self):
        super().__init__(
            np.cos(np.linspace(0, np.pi*2, 101)))
fig = plt.figure()
ax = fig.add_subplot(111)
mysin = SinWave(); mycos = CosWave()
mysin.plot(ax); mycos.plot(ax)
Notes:
- We didn't need any arguments to super here, as Python 3 allows this
- super requires additional arguments in Python 2, e.g. super(Class, self).method(args...)
If you were wondering why we should use super().method instead of BaseClass.method, other than the convenience of renaming classes, it relates to multiple inheritance, which is beyond the scope of this course. If you need to write programs with multiple inheritance (and there are strong arguments against this), you may want to look at this blog for advanced use of super.
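The two forms of super mentioned in the notes can be compared side by side. This is a sketch with hypothetical class names; the explicit two-argument form is also accepted by Python 3, so both classes below run in a Python 3 notebook.

```python
class Base(object):
    def __init__(self, data):
        self.data = data


class ChildPy3(Base):
    def __init__(self):
        # Argument-free form: Python 3 only
        super().__init__([1, 2, 3])


class ChildPy2Style(Base):
    def __init__(self):
        # Explicit form: required in Python 2, still valid in Python 3
        super(ChildPy2Style, self).__init__([1, 2, 3])


print(ChildPy3().data)
print(ChildPy2Style().data)
```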
Classes: Magic Methods¶
- The workhorse of how things 'just work' in Python
- object & builtin types come with many of these
- Surrounded by double underscores
- We've seen one already: __init__!
Example: The magic methods of object¶
dir(object)
Magic Methods: Closer look¶
- __lt__: Called when evaluating a < b
- __str__: Called by the print function
- All magics can be overridden
Example: overriding __str__ and __lt__ magic methods¶
class Wave(object):
    def __init__(self, freq):
        self.freq = freq
        self._data = np.sin(np.linspace(0, np.pi, 101)
                            * np.pi*2 * freq)
    def __str__(self):
        """RETURNS the string for printing"""
        return "Wave frequency: {}".format(self.freq)
    def __lt__(self, wave2):
        return self.freq < wave2.freq
wav_low = Wave(10)
wav_high = Wave(50) # A high frequency wave
print(wav_high)
wav_low < wav_high
# Live coding Bay....
Note: Magic methods are very briefly introduced here. For an extensive overview of magic methods for Python classes, view Rafe Kettler's blog.
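One practical consequence of defining __lt__ is that built-in functions which compare items with '<', such as sorted and min, start working on your own objects. The sketch below uses a cut-down version of the Wave class (without the data array) so that it runs on its own.

```python
class Wave(object):
    """Cut-down version of the Wave class above (no data array)."""
    def __init__(self, freq):
        self.freq = freq

    def __str__(self):
        return "Wave frequency: {}".format(self.freq)

    def __lt__(self, other):
        return self.freq < other.freq


waves = [Wave(50), Wave(10), Wave(30)]

# sorted() and min() compare items with '<', so they call __lt__
ordered = sorted(waves)
for wave in ordered:
    print(wave)          # print() calls __str__
print(min(waves))
```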
Exercise: Inheritance and Magic Methods¶
Python 2 vs 3¶
- Don't need to inherit object in Python 3
- New classes inherit from object
- Inheritance behaviour is different
- Old classes removed in Python 3
Example: New class default Python 3¶
class OldSyntax:
    pass
class NewSyntax(object): # This means 'inherit from object'
    pass
print(type(OldSyntax)) # Would give <type 'classobj'>
                       # in Python 2
print(type(NewSyntax))
- Backwards compatibility: inherit object in Py3
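When run under Python 3, the base classes can be inspected directly to confirm that both syntaxes give the same result. This is a small sketch that reuses the class names from the example above.

```python
class OldSyntax:
    pass


class NewSyntax(object):
    pass


# In Python 3 both classes have 'object' as their only base class
print(OldSyntax.__bases__)
print(NewSyntax.__bases__)
print(issubclass(OldSyntax, object))
```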
Notes: There are other differences affecting classes which we have not included, such as metaclasses and iterator behaviour, but here is a link to a more complete comparison:
- Other things worth looking at (an incredibly biased opinion):
- Building Pythonic Packages with setuptools
- Unit testing with py.test
- Conda Environments
- Play with a visualisation/GUI package!
Summary¶
- Refreshed your basic Python skills
- Python builtins not included in standard basics course
- Tuples, Dictionaries, Generators
- Software data structuring
Summary (ctd)¶
- Covered the language and concepts for Object Orientation (Transferable!)
- OOP Implementation in Python
- Read the packages you use daily
- Create maintainable packages for yourself and scientific community
Thank You¶
P.R.Chambers@soton.ac.uk