#+GIS Programming with Python & Julia:
GIS Programming, Dr. Qiusheng Wu
Jupyter Notebooks
-
The name is a reference to Julia, Python, and R
-
It has the UI to edit code and text
-
It has the kernel that executes the code
- And the underlying JSON file format with the ".ipynb" extension.
- Cells can be moved, executed, and deleted, which goes beyond a plain read-eval-print loop (REPL)
-
Project Jupyter provides two options, both in the form of a web app
- Jupyter Notebook
-
JupyterLab
- a fully featured IDE
- Many other options exist, both web-based and client-based
Marimo https://marimo.io/
-
is an open source reactive Python notebook:
-
run a cell or interact with a UI element, and marimo automatically runs dependent cells
- keeping code and outputs consistent and preventing bugs before they happen.
Emacs IPython Notebook (EIN) : https://millejoh.github.io/emacs-ipython-notebook/
-
is a Jupyter client for all languages in Emacs
- Copy/paste cells in and between notebooks.
- Console integration: You can easily connect to a kernel via a console application. This enables you to start debugging in the same kernel. It is even possible to connect a console over ssh.
- An IPython kernel can be “connected” to a buffer. This enables you to evaluate buffer/region using same kernel as notebook. Notebook goodies such as tooltip help, help browser and code completion are available in these buffers.
- Jump to definition (go to the definition by executing M-. over an object).
- Execute code from an org-mode source block in a running kernel.
XJupyter
- no heavyweight notebook server
- no ipynb files
- notebooks are saved like regular text files, with the '*.jpr' extension
- XJupyter uses mode overlays to intersperse python mode blocks in a non-python buffer
- fully fledged undo
Commercial Emacs
-
Gnus is rewritten to be non-blocking
- Gnus (Gnus Network User Services) is a message reader that supports news and mail, is MIME-compliant, etc.
-
Process management is rewritten
-
GNU Emacs process management encompasses creating, controlling, and communicating with subprocesses, network connections, serial port connections, and pipe connections
-
process object types and representations
- child processes of Emacs (shell commands)
- TCP and UDP network connections
- connections to serial ports
- communication through pipes
**All these connection types are represented by the same C structure: Lisp_Process**
-
See more features at gh repo https://github.com/commercial-emacs/commercial-emacs
-
also a moving garbage collector
-
moving collectors relocate Lisp values in memory
-
in GNU Emacs, an allocated value such as a cons cell remains at its birth address in perpetuity
-
a cons cell, also known as a dotted pair, is simply a pair of two objects
-
the car of a cons cell is its first object; for a list, that is the first item
-
the cdr is its second object; for a list, that is the part of the list that follows the first item
(cons "a" 4)        ;;=> ("a" . 4)
(car (cons "a" 4))  ;;=> "a"
(cdr (cons "a" 4))  ;;=> 4
-
non-moving collectors cannot do generational sequestration
-
that is, keeping the youngest cohort of Lisp values separated from older ones
- this allows for fast intermediary cycles which only scan the nursery generation
- a non-moving collector must traverse the full set on each cycle since its allocations are interleaved
Quickstart
https://learnxinyminutes.com/julia/
https://learnxinyminutes.com/python
Overview of variables and data types
-
variables allow you to store and manipulate information
-
in Julia, a variable is a name associated (or bound) to a value
- they can be assigned using the `=` operator
- naming: lowercase for variables and functions, CamelCase for types and modules
-
in Python, a variable is a name that refers to an object
- they can be assigned using the `=` operator
- naming: snake_case
-
in R, variables are named storage locations that hold data values
-
they can be assigned a value using operators like `<-` or `=`
- naming: snake_case
-
data types define the kind of operations you can perform on this information
- Julia, Python, and R are all dynamically typed.
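To make the Python side of these notes concrete (a name refers to an object, and the type belongs to the object rather than to the name), here is a minimal sketch. The names `num_points`, `coords`, and `alias` are made up for illustration.

```python
# A name is bound to an object with `=`; the type lives on the object, not the name.
num_points = 42
first_type = type(num_points).__name__

# Rebinding the same name to an object of another type is allowed (dynamic typing).
num_points = "forty-two"
second_type = type(num_points).__name__

# Two names can refer to the same object.
coords = [1.0, 2.0]
alias = coords
alias.append(3.0)  # mutates the single shared list

print(first_type, second_type, coords)
```

Because `alias` and `coords` refer to one list, the append is visible through both names.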
Stylistic Conventions
Style Guide for Python Code
-
Variable names must start with a letter or an underscore
-
The remainder of the variable name can consist of letters, numbers, and underscores
-
variable names are case-sensitive, so numpoints and NumPoints are different variables
-
variable names should be descriptive and meaningful, such as `numpoints` instead of `n`
- avoid using Python keywords and built-in functions as variable names
-
Names to avoid: never use the characters `l` (lowercase letter el), `O` (uppercase letter oh), or `I` (uppercase letter eye) as single-character variable names
-
ASCII compatibility
-
Package and Module names:
- modules should have short, lowercase names, w/ underscores if readability is improved
- packages should have short, lowercase names, and underscores are discouraged
-
when an extension module written in C or C++ has an accompanying Python module that provides a higher-level (e.g. more object-oriented) interface
- the C/C++ module has a leading underscore
-
Class names: use CapWords convention
-
Type Variable names: use CapWords convention and short names; add the suffixes `_co` or `_contra` to declare covariant or contravariant behavior
-
Exception names: CapWords using Error suffix
-
Global Variable names: lowercase w/ underscores for readability
-
Modules designed for use via `from M import *` should use the `__all__` mechanism to prevent exporting globals
- the older convention is prefixing such globals with an underscore, which can also be used to indicate that they are "module non-public"
-
Function and Variable names: lowercase w/ underscores for readability
-
Function and Method Arguments: always use `self` for the first argument to instance methods, always use `cls` for the first argument to class methods.
- if a function argument's name clashes with a reserved keyword, it is best to append a single trailing underscore
-
Method names and Instance Variables: lowercase with words separated by underscores as necessary to improve readability
-
use one leading underscore only for non-public methods and instance variables
- to avoid name clashes with subclasses, use two leading underscores to invoke Python's name mangling rules
-
Constants: usually defined on a module level and written in all capital letters with underscores separating words
-
Designing for Inheritance: Always decide whether a class's methods and instance variables (collectively: attributes) should be public or non-public
-
public attributes have no leading underscores
- if clashing with a reserved keyword then append a trailing underscore
-
for simple public data attributes, it is best to expose just the attribute name
-
use properties to hide functional implementation behind simple data attribute access syntax
- avoid using properties for computationally expensive operations
-
if your class is to be subclassed and there are attributes that you do not want subclasses to use,
- consider naming them with double leading underscores and no trailing underscores
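The naming conventions listed above can be seen together in one small sketch. Every name here (`MAX_RETRIES`, `TileLoaderError`, `TileLoader`, `load_tile`, and so on) is hypothetical and chosen only to illustrate the conventions.

```python
MAX_RETRIES = 3  # constant: module level, all capitals with underscores


class TileLoaderError(Exception):
    """Exception name: CapWords with an Error suffix."""


class TileLoader:
    """Class name: CapWords."""

    def __init__(self, class_=None):
        # A trailing underscore avoids a clash with the keyword `class`.
        self.class_ = class_
        self._cache = {}      # one leading underscore: non-public
        self.__token = "abc"  # two leading underscores: name-mangled

    def load_tile(self, tile_id):
        """Method name: lowercase with underscores; first argument is self."""
        return self._cache.get(tile_id)

    @classmethod
    def from_defaults(cls):
        """Class methods take cls as their first argument."""
        return cls(class_="raster")


loader = TileLoader.from_defaults()
mangled_names = [name for name in vars(loader) if name.endswith("__token")]
print(mangled_names)
```

The printed list shows how name mangling rewrites `__token` to include the class name, which is what protects it from accidental clashes in subclasses.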
Python Objects, Values, Types, Functions, Classes, Coroutines,
object, garbage collection, truth value, etc.
-
objects are python's abstraction for data
- every object has an identity, a type, and a value; in CPython, id(x) is the memory address where x is stored
-
values of some objects can change and these are mutable
-
some objects are unchangeable and these are immutable
- for instance, numbers, strings, and tuples are immutable
- dictionaries and lists are mutable
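A short sketch of the difference between mutable and immutable objects, using `id()` to show that changing a list in place keeps its identity. The variable names are illustrative only.

```python
point = (1, 2)       # tuple: immutable
layers = ["roads"]   # list: mutable

layers_id_before = id(layers)
layers.append("rivers")                       # changes the value in place
same_object = id(layers) == layers_id_before  # identity is unchanged

try:
    point[0] = 10    # immutable objects reject in-place changes
except TypeError as exc:
    error_name = type(exc).__name__

print(same_object, error_name, layers)
```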
-
objects are never explicitly destroyed
- when they become unreachable they may be garbage-collected; see the gc module for information on controlling the collection of cyclic garbage in CPython
-
when objects contain references to 'external' resources like open files or windows
-
garbage collection is not guaranteed to happen
-
programs are strongly recommended to explicitly close such objects
- the try … finally statement and the with statement provide convenient ways to do this
-
some objects contain references to other objects and these are containers
-
the references are part of a container's value
-
the mutability of a container is implied through the identities of the immediately contained objects
-
practically all objects can be compared for equality
- and converted to a string using the `repr()` function or `str()` function
-
Any object can be tested for truth value, for use in an if or while condition or as operand of Boolean operations
-
by default an object is considered true
-
unless its class defines a `__bool__()` method that returns False or
-
a `__len__()` method that returns zero, when called with the object
built-in objects considered false (None, False, 0, 0.0, 0j, Decimal(0), Fraction(0, 1), '', (), [], {}, set(), range(0))
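The truth-testing rules above can be checked with three tiny classes. `Layer`, `Flag`, and `Plain` are hypothetical names used only for this sketch.

```python
class Layer:
    """Hypothetical container whose truthiness follows its length."""

    def __init__(self, features):
        self.features = list(features)

    def __len__(self):
        return len(self.features)


class Flag:
    """Hypothetical class that defines __bool__ directly."""

    def __init__(self, enabled):
        self.enabled = enabled

    def __bool__(self):
        return self.enabled


class Plain:
    """No __bool__ or __len__: instances are always considered true."""


results = [bool(Layer([])), bool(Layer(["a"])), bool(Flag(False)), bool(Plain())]
print(results)
```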
-
python provides a built-in object called Ellipsis to be used as a placeholder
-
can be used in comparisons or custom logic
- placeholder for 'defined but not yet implemented'
-
NumPy shorthand for accessing and slicing high-dimensional arrays
-
expands to as many full slices (`:`) as are needed to cover the dimensions that are not indexed explicitly
- no need to specify each index for every dimension
-
an ellipsis can appear only once in an index
- using it multiple times will raise an IndexError
- type hinting that a callable accepts any parameters (`Callable[..., T]`) or that a tuple has arbitrary length (`tuple[int, ...]`)
- `...` is also the secondary prompt in Python's REPL, indicating that the interpreter is expecting more input
- can be used as a default argument to distinguish between a value not being provided and it being explicitly set to None
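Two of the Ellipsis uses listed above, sketched in plain Python: as a sentinel default that is distinct from None, and as a placeholder body. The functions `set_crs` and `reproject` are hypothetical and not part of any library.

```python
def set_crs(layer, crs=...):
    """Hypothetical function: `...` marks "argument not supplied"."""
    if crs is ...:
        return f"{layer}: keep current CRS"
    if crs is None:
        return f"{layer}: clear CRS"
    return f"{layer}: set CRS to {crs}"


def reproject(layer):
    ...  # placeholder body: defined but not yet implemented


messages = [set_crs("roads"), set_crs("roads", None), set_crs("roads", "EPSG:4326")]
placeholder_result = reproject("roads")
print(messages, placeholder_result, ... is Ellipsis)
```

The sentinel works because `...` is a singleton, so an identity check with `is` distinguishes "not provided" from an explicit None.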
-
types affect almost all aspects of object behavior
- below are the standard types that are built into the interpreter
boolean
-
Booleans represent truth values, True and False
-
bool() converts any value to a boolean
- and, or, and != should be preferred over &, |, and ^
-
bool is a subclass of int
- please explicitly convert using int() for integer behavior
# short-circuit operators
x or y   # if x is true, then x, else y
x and y  # if x is false, then x, else y
not x    # if x is false, then True, else False
not a == b
# is interpreted as
not (a == b)
# but the line below is a syntax error, because not has a lower priority than non-Boolean operators
a == not b
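The short-circuit behavior and the bool/int relationship described above can be observed directly. The helper `check` is a made-up function that records which operands were actually evaluated.

```python
calls = []


def check(name, value):
    """Record that the operand was evaluated, then return it."""
    calls.append(name)
    return value


first = check("a", 0) or check("b", "fallback")   # a is false, so b is evaluated
second = check("c", []) and check("d", "unused")  # c is false, so d is skipped

is_int_subclass = issubclass(bool, int)
count_true = sum([True, False, True])  # works because True == 1
explicit = int(True) + int(True)       # explicit conversion states the intent

print(first, second, calls, is_int_subclass, count_true, explicit)
```

Note that `or` and `and` return one of their operands rather than a strict boolean, which is why `second` is the empty list.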
comparison
-
there are eight comparison operators
-
can be chained arbitrarily
x < y <= z
# is equivalent to
x < y and y <= z
# except that y is evaluated only once
# but in both cases z is not evaluated at all when x < y is false
# the eight comparison operators: <, <=, >, >=, ==, !=, is, is not
-
Objects of different types, except different numeric types, NEVER compare equal
-
the == operator is always defined but for some object types is equivalent to is.
- <, <=, >, >= operators are only defined where they make sense
-
Non-identical instances of a class normally compare as non-equal unless the class defines the `__eq__()` method
-
instances of a class cannot be ordered with respect to other instances unless the class defines enough of the methods `__lt__()`, `__le__()`, `__gt__()`, and `__ge__()`
-
behavior of the is and is not operators cannot be customized
- they can also be applied to any two objects and never raise an exception
-
in and not in are operations with the same syntactic priority
- supported by types that are iterable or implement the `__contains__()` method
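A minimal sketch of customized comparison, assuming a made-up value class `Elevation` that defines `__eq__()` and `__lt__()`. It also shows that the middle operand of a chained comparison is evaluated only once.

```python
class Elevation:
    """Hypothetical value class that defines equality and one ordering method."""

    def __init__(self, meters):
        self.meters = meters

    def __eq__(self, other):
        if not isinstance(other, Elevation):
            return NotImplemented
        return self.meters == other.meters

    def __lt__(self, other):
        if not isinstance(other, Elevation):
            return NotImplemented
        return self.meters < other.meters


low, high = Elevation(10), Elevation(250)
equal_value = Elevation(10) == low  # __eq__ compares values
same_object = Elevation(10) is low  # `is` compares identity and cannot be customized
ordered = low < high                # uses __lt__

evaluations = []


def middle():
    evaluations.append("y")
    return 5


chained = 1 < middle() <= 9         # middle() is called a single time
print(equal_value, same_object, ordered, chained, evaluations)
```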
numerics
- numeric objects are immutable
numbers.Number
- created by numeric literals and returned by arithmetic operators and arithmetic built-in functions
- integers, floating-point numbers, complex numbers
numbers.Integral
- represents elements from the mathematical set of integers (positive and negative)
- int, bool
numbers.Real
- represents machine-level double precision floating-point numbers
- float
numbers.Complex
- represents complex numbers as a pair of machine-level double precision floating-point numbers
- complex
-
three distinct types
-
integers
- represents numbers in an unlimited range, subject to available (virtual) memory only
-
floating-point numbers implemented using double in C
- use sys.float_info for the precision of f-p nums on the host machine
- the standard library includes additional numeric types: fractions.Fraction, for rationals, and decimal.Decimal, for f-p nums w/ user-definable precision
-
complex numbers have a real and an imaginary part; for a complex number z, use z.real and z.imag
- numbers that can be expressed in the form \(a + bi\)
-
a is the real part, b is the imaginary part
- i is the imaginary unit, defined as \(i = \sqrt{-1}\)
-
Numbers are created by numeric literals or as the result of built-in functions and operators
-
Supports mixed arithmetic
-
when a binary arithmetic operator has operands of different numeric types
-
the "narrower" type is widened to that of the other
- int is narrower than float, which is narrower than complex
- the constructors int(), float(), and complex() can be used to produce numbers of a specific type
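The widening rule for mixed arithmetic can be checked by looking at the result types. This sketch also includes `fractions.Fraction`, which is not part of the int/float/complex chain but follows the same idea when combined with an int.

```python
from fractions import Fraction

int_plus_float = 2 + 0.5                # int is widened to float
float_plus_complex = 0.5 + 2j           # float is widened to complex
int_plus_fraction = 1 + Fraction(1, 3)  # int is combined as a Fraction

result_types = [
    type(value).__name__
    for value in (int_plus_float, float_plus_complex, int_plus_fraction)
]
# Equal values compare equal across numeric types.
equal_across_types = 1 == 1.0 == Fraction(1, 1) == complex(1, 0)
print(result_types, equal_across_types)
```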
All numeric types (except complex) support the following operations
All numbers.Real types (int and float) also include the following operations
- math.trunc(x) x truncated to Integral
- round(x[, n]) x rounded to n digits, rounding half to even. If n is omitted, it defaults to 0.
- math.floor(x) the greatest Integral <= x
- math.ceil(x) the least Integral >= x
see more numeric operations on math and cmath modules
Deprecated since version 3.12: The use of the bitwise inversion operator ~ on bool values is deprecated and will raise an error in Python 3.16; ~ on int is unaffected.
All bitwise operators
x | y # bitwise /or/ of x and y
x ^ y # bitwise /exclusive or/ of x and y
x & y # bitwise /and/ of x and y
x << n # x shifted left by n bits
x >> n # x shifted right by n bits
~x # the bits of x inverted
- negative shift counts will cause a ValueError to be raised
- left shift by n bits is equivalent to multiplication by `pow(2, n)`
- right shift by n bits is equivalent to floor division by `pow(2, n)`
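The three notes on shifts can be verified with a few integers. The values 37 and 3 are arbitrary.

```python
x, n = 37, 3

left = x << n
right = x >> n
negative_right = -37 >> n  # floor division rounds toward negative infinity

checks = [
    left == x * pow(2, n),
    right == x // pow(2, n),
    negative_right == -37 // pow(2, n),
    ~x == -x - 1,  # inversion of an int
]

try:
    1 << -1
except ValueError as exc:
    shift_error = type(exc).__name__

print(left, right, negative_right, checks, shift_error)
```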
Additional methods
-
int.bit_length()
- # of bits necessary to represent an int in binary, excluding the sign and leading zeros
-
int.bit_count()
- # of ones in the binary representation of the abs value of the int
-
int.to_bytes(length=1, byteorder='big', *, signed=False)
-
an array of bytes representing an int
- OverflowError raised if the int is not representable with the given number of bytes
-
int.from_bytes(bytes, byteorder='big', *, signed=False)
- the int represented by the given array of bytes
-
int.as_integer_ratio()
-
returns a pair of ints
- whose ratio is equal to the original int and has a positive denominator
-
int.is_integer()
- always returns True; exists for duck type compatibility with float.is_integer()
-
float.as_integer_ratio()
-
returns a pair of ints
- whose ratio is exactly equal to the original float
-
float.is_integer()
- True if the float is finite with an integral value
-
float.hex()
- returns a representation of a floating-point number as a hexadecimal string
-
float.fromhex()
- class method that returns the float represented by a hexadecimal string
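A brief sketch exercising several of the methods above. It deliberately sticks to methods available in older Python 3 releases (so `int.bit_count()` and `int.is_integer()` are left out), and the sample values are arbitrary.

```python
n = 1024
bits = n.bit_length()
raw = n.to_bytes(2, byteorder="big")
round_trip = int.from_bytes(raw, byteorder="big")

ratio = (0.75).as_integer_ratio()
whole = (3.0).is_integer()
hex_text = (0.1).hex()
restored = float.fromhex(hex_text)  # hex round trip is exact

try:
    (300).to_bytes(1, byteorder="big")  # 300 does not fit in one byte
except OverflowError as exc:
    overflow = type(exc).__name__

print(bits, raw, round_trip, ratio, whole, restored, overflow)
```

The hexadecimal form is useful when a float has to be written to text and read back without any rounding.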
Hashing of Numeric Types
-
Python's hash for numeric types is based on a single mathematical function that's defined for any rational number
-
hence it applies to all instances of int and fractions.Fraction, and all finite instances of float and decimal.Decimal
-
this function is given by reduction modulo P for a fixed prime P
- the value of P is made available to Python as the modulus attribute of sys.hash_info
Here’s some example Python code, equivalent to the built-in hash
import sys, math

def hash_fraction(m, n):
    """Compute the hash of a rational number m / n.

    Assumes m and n are integers, with n positive.
    Equivalent to hash(fractions.Fraction(m, n)).

    """
    P = sys.hash_info.modulus
    # Remove common factors of P.  (Unnecessary if m and n already coprime.)
    while m % P == n % P == 0:
        m, n = m // P, n // P

    if n % P == 0:
        hash_value = sys.hash_info.inf
    else:
        # Fermat's Little Theorem: pow(n, P-1, P) is 1, so
        # pow(n, P-2, P) gives the inverse of n modulo P.
        hash_value = (abs(m) % P) * pow(n, P - 2, P) % P
    if m < 0:
        hash_value = -hash_value
    if hash_value == -1:
        hash_value = -2
    return hash_value

def hash_float(x):
    """Compute the hash of a float x."""
    if math.isnan(x):
        return object.__hash__(x)
    elif math.isinf(x):
        return sys.hash_info.inf if x > 0 else -sys.hash_info.inf
    else:
        return hash_fraction(*x.as_integer_ratio())

def hash_complex(z):
    """Compute the hash of a complex number z."""
    hash_value = hash_float(z.real) + sys.hash_info.imag * hash_float(z.imag)
    # do a signed reduction modulo 2**sys.hash_info.width
    M = 2**(sys.hash_info.width - 1)
    hash_value = (hash_value & (M - 1)) - (hash_value & M)
    if hash_value == -1:
        hash_value = -2
    return hash_value
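The practical consequence of this scheme is that numbers that compare equal hash equally, whatever their type. A small sketch using only the built-in `hash()`:

```python
import sys
from decimal import Decimal
from fractions import Fraction

values = [1, 1.0, Fraction(1, 1), Decimal(1), complex(1, 0)]
hashes = {hash(value) for value in values}

# Equal numbers hash equally, so they collapse to a single dict key.
lookup = {1: "int key"}
lookup[1.0] = "float key"

half_match = hash(Fraction(1, 2)) == hash(0.5)
modulus = sys.hash_info.modulus
print(hashes, lookup, half_match, modulus)
```

Assigning through the key `1.0` replaces the value stored under `1` but keeps the original key object, which is why the dictionary still has a single entry.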
iterables
-
Python supports a concept of iteration over containers
- sequences always support iteration methods
-
one method needs to be defined for container objects to provide iterable support: container.__iter__()
- returns an iterator object
-
iterator objects are required to support the following two methods, which together form the iterator protocol
-
iterator.__iter__()
- returns the iterator object itself, which allows both containers and iterators to be used with the for and in statements
-
iterator.__next__()
- returns the next item; if there are no further items, raises the StopIteration exception
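The iterator protocol can be implemented by hand in a few lines. `Countdown` is a hypothetical class written only to show the two required methods.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself, so it works in for loops and `in` tests.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


collected = list(Countdown(3))

it = iter(Countdown(2))
manual = [next(it), next(it)]
finished = next(it, "done")  # the default is returned instead of raising StopIteration
contains_two = 2 in Countdown(3)
print(collected, manual, finished, contains_two)
```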
-
generators
-
allow you to create iterators using the yield keyword
-
they produce a series of values over time, pausing their execution after each yield
-
Python's generators provide a convenient way to implement the iterator protocol
-
if a container object's __iter__() method is implemented as a generator, it will automatically return an iterator (technically, a generator object)
-
the yield expression is used when defining a generator function or an asynchronous generator function
- thus it can only be used in the body of a function definition
-
using a yield expression in a function's body causes that function to be a generator function
-
using it in an async def function's body causes that coroutine function to be an asynchronous generator function
def gen():
    yield 6366

async def async_gen():
    yield 6366

# generator expressions
g = (n for n in range(6, 8))
next(g)  # 6
next(g)  # 7
next(g)
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# StopIteration
A generator object is created once, but its code is not run all at once
-
only calls to next() run the code
-
code in the generator stops once a yield has been reached
- the next call to next() causes execution to continue from the last yield
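The pause-and-resume behavior is easiest to see by recording when each part of a generator body runs. `survey_points` is a made-up generator function for this sketch.

```python
events = []


def survey_points():
    """Hypothetical generator that records when each part of its body runs."""
    events.append("started")
    yield "p1"
    events.append("resumed after p1")
    yield "p2"
    events.append("finished")


gen = survey_points()
before_first_next = list(events)  # creating the generator runs no code
first = next(gen)
after_first_next = list(events)   # the body ran only up to the first yield
remaining = list(gen)             # drives the generator to completion
print(before_first_next, first, after_first_next, remaining, events)
```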
-
itertools
This module implements iterator building blocks inspired by APL, Haskell, and SML
-
memory-efficient tools that form iterator algebra
- construct specialized tools succinctly in pure Python
SML provides a tabulation tool tabulate(f) which produces a sequence f(0), f(1), …
- the same can be achieved with map() and count() to form map(f, count())
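The `map(f, count())` idiom mentioned above can be wrapped in a helper. The name `tabulate` mirrors the SML tool but is defined here for illustration; it is not an itertools function. Since the result is infinite, `islice()` is used to take a finite prefix.

```python
from itertools import count, islice


def tabulate(f, start=0):
    """Lazily yield f(start), f(start + 1), ... as an infinite iterator."""
    return map(f, count(start))


squares = tabulate(lambda n: n * n)
first_five = list(islice(squares, 5))  # take a finite prefix
next_value = next(squares)             # the iterator continues where it stopped
print(first_five, next_value)
```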
-
infinite iterators
-
count() makes an iterator that returns evenly spaced values beginning with start; use it with zip() to add sequence numbers or with map() to generate consecutive data points
def count(start=0, step=1):
    # count(10) → 10 11 12 13 14 ...
    # count(2.5, 0.5) → 2.5 3.0 3.5 ...
    n = start
    while True:
        yield n
        n += step
-
cycle() makes an iterator that returns elements from the iterable and saves a copy of each; when the iterable is exhausted, it returns elements from the saved copy and repeats indefinitely
def cycle(iterable):
    # cycle('ABCD') → A B C D A B C D A B C D ...
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element
-
repeat() makes an iterator that returns object over and over again; it runs indefinitely unless the times argument is specified
def repeat(object, times=None):
    # repeat(10, 3) → 10 10 10
    if times is None:
        while True:
            yield object
    else:
        for i in range(times):
            yield object
-
iterators terminating on the shortest input sequence
-
accumulate() makes an iterator that returns accumulated sums or accumulated results from other binary functions
def accumulate(iterable, function=operator.add, *, initial=None):
    'Return running totals'
    # accumulate([1,2,3,4,5]) → 1 3 6 10 15
    # accumulate([1,2,3,4,5], initial=100) → 100 101 103 106 110 115
    # accumulate([1,2,3,4,5], operator.mul) → 1 2 6 24 120
    iterator = iter(iterable)
    total = initial
    if initial is None:
        try:
            total = next(iterator)
        except StopIteration:
            return
    yield total
    for element in iterator:
        total = function(total, element)
        yield total
-
batched() batches data from the iterable into tuples of length n; the last batch may be shorter than n
def batched(iterable, n, *, strict=False):
    # batched('ABCDEFG', 3) → ABC DEF G
    if n < 1:
        raise ValueError('n must be at least one')
    iterator = iter(iterable)
    while batch := tuple(islice(iterator, n)):
        if strict and len(batch) != n:
            raise ValueError('batched(): incomplete batch')
        yield batch
-
chain() makes an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all are exhausted
def chain(*iterables):
    # chain('ABC', 'DEF') → A B C D E F
    for iterable in iterables:
        yield from iterable
-
chain.from_iterable() is an alternate constructor for chain(); it gets chained inputs from a single iterable argument that is evaluated lazily
def from_iterable(iterables):
    # chain.from_iterable(['ABC', 'DEF']) → A B C D E F
    for iterable in iterables:
        yield from iterable
-
compress() makes an iterator that returns elements from data where the corresponding element in selectors is true; it stops when either the data or selectors iterables have been exhausted
def compress(data, selectors):
    # compress('ABCDEF', [1,0,1,0,1,1]) → A C E F
    return (datum for datum, selector in zip(data, selectors) if selector)
-
dropwhile() makes an iterator that drops elements from the iterable while the predicate is true and afterwards returns every element
def dropwhile(predicate, iterable):
    # dropwhile(lambda x: x<5, [1,4,6,3,8]) → 6 3 8
    iterator = iter(iterable)
    for x in iterator:
        if not predicate(x):
            yield x
            break
    for x in iterator:
        yield x
-
filterfalse() makes an iterator that filters elements from the iterable, returning only those for which the predicate returns a false value; if predicate is None, it returns the items that are false
def filterfalse(predicate, iterable):
    # filterfalse(lambda x: x<5, [1,4,6,3,8]) → 6 8
    if predicate is None:
        predicate = bool
    for x in iterable:
        if not predicate(x):
            yield x
-
groupby() makes an iterator that returns consecutive keys and groups from the iterable; the key is a function computing a key value for each element, and it defaults to an identity function that returns the element unchanged
**the operation of groupby() is similar to the uniq filter in Unix**
-
it generates a break or new group every time the value of the key function changes
-
this is different from SQL's GROUP BY, which aggregates common elements regardless of their input order
def groupby(iterable, key=None):
    # [k for k, g in groupby('AAAABBBCCDAABBB')] → A B C D A B
    # [list(g) for k, g in groupby('AAAABBBCCD')] → AAAA BBB CC D

    keyfunc = (lambda x: x) if key is None else key
    iterator = iter(iterable)
    exhausted = False

    def _grouper(target_key):
        nonlocal curr_value, curr_key, exhausted
        yield curr_value
        for curr_value in iterator:
            curr_key = keyfunc(curr_value)
            if curr_key != target_key:
                return
            yield curr_value
        exhausted = True

    try:
        curr_value = next(iterator)
    except StopIteration:
        return
    curr_key = keyfunc(curr_value)

    while not exhausted:
        target_key = curr_key
        curr_group = _grouper(target_key)
        yield curr_key, curr_group
        if curr_key == target_key:
            for _ in curr_group:
                pass
-
islice() makes an iterator that returns selected elements from the iterable
def islice(iterable, *args):
    # islice('ABCDEFG', 2) → A B
    # islice('ABCDEFG', 2, 4) → C D
    # islice('ABCDEFG', 2, None) → C D E F G
    # islice('ABCDEFG', 0, None, 2) → A C E G

    s = slice(*args)
    start = 0 if s.start is None else s.start
    stop = s.stop
    step = 1 if s.step is None else s.step
    if start < 0 or (stop is not None and stop < 0) or step <= 0:
        raise ValueError

    indices = count() if stop is None else range(max(start, stop))
    next_i = start
    for i, element in zip(indices, iterable):
        if i == next_i:
            yield element
            next_i += step
-
pairwise() returns successive overlapping pairs taken from the input iterable
def pairwise(iterable):
    # pairwise('ABCDEFG') → AB BC CD DE EF FG
    iterator = iter(iterable)
    a = next(iterator, None)
    for b in iterator:
        yield a, b
        a = b
-
starmap() makes an iterator that computes the function using arguments obtained from the iterable; it is used instead of map() when the argument parameters have already been pre-zipped into tuples
def starmap(function, iterable):
    # starmap(pow, [(2,5), (3,2), (10,3)]) → 32 9 1000
    for args in iterable:
        yield function(*args)
-
takewhile() makes an iterator that returns elements from the iterable as long as the predicate is true
def takewhile(predicate, iterable):
    # takewhile(lambda x: x<5, [1,4,6,3,8]) → 1 4
    for x in iterable:
        if not predicate(x):
            break
        yield x
-
tee() returns n independent iterators from a single iterable
def tee(iterable, n=2):
    if n < 0:
        raise ValueError
    if n == 0:
        return ()
    iterator = _tee(iterable)
    result = [iterator]
    for _ in range(n - 1):
        result.append(_tee(iterator))
    return tuple(result)

class _tee:

    def __init__(self, iterable):
        it = iter(iterable)
        if isinstance(it, _tee):
            self.iterator = it.iterator
            self.link = it.link
        else:
            self.iterator = it
            self.link = [None, None]

    def __iter__(self):
        return self

    def __next__(self):
        link = self.link
        if link[1] is None:
            link[0] = next(self.iterator)
            link[1] = [None, None]
        value, self.link = link
        return value
-
zip_longest() makes an iterator that aggregates elements from each of the iterables; shorter iterables are padded with fillvalue
def zip_longest(*iterables, fillvalue=None):
    # zip_longest('ABCD', 'xy', fillvalue='-') → Ax By C- D-

    iterators = list(map(iter, iterables))
    num_active = len(iterators)
    if not num_active:
        return

    while True:
        values = []
        for i, iterator in enumerate(iterators):
            try:
                value = next(iterator)
            except StopIteration:
                num_active -= 1
                if not num_active:
                    return
                iterators[i] = repeat(fillvalue)
                value = fillvalue
            values.append(value)
        yield tuple(values)
-
combinatoric iterators
- product() Cartesian product of the input iterables
def product(*iterables, repeat=1):
    # product('ABCD', 'xy') → Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) → 000 001 010 011 100 101 110 111

    if repeat < 0:
        raise ValueError('repeat argument cannot be negative')
    pools = [tuple(pool) for pool in iterables] * repeat

    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]

    for prod in result:
        yield tuple(prod)
-
permutations() returns successive r-length permutations of elements from the iterable
def permutations(iterable, r=None):
    # permutations('ABCD', 2) → AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) → 012 021 102 120 201 210

    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return

    indices = list(range(n))
    cycles = list(range(n, n-r, -1))
    yield tuple(pool[i] for i in indices[:r])

    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return
-
combinations() returns r-length subsequences of elements from the input iterable; the output is a subsequence of product() keeping only entries that are subsequences of the iterable
def combinations(iterable, r):
    # combinations('ABCD', 2) → AB AC AD BC BD CD
    # combinations(range(4), 3) → 012 013 023 123

    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))

    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield tuple(pool[i] for i in indices)
-
combinations_with_replacement() returns r-length subsequences of elements from the input iterable, allowing individual elements to be repeated more than once
def combinations_with_replacement(iterable, r):
    # combinations_with_replacement('ABC', 2) → AA AB AC BB BC CC

    pool = tuple(iterable)
    n = len(pool)
    if not n and r:
        return
    indices = [0] * r

    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != n - 1:
                break
        else:
            return
        indices[i:] = [indices[i] + 1] * (r - i)
        yield tuple(pool[i] for i in indices)
sequences
-
common sequences
-
sequences represent finite ordered sets indexed by non-negative numbers
-
the length of a sequence is n
-
the index set contains the numbers 0,1, …, n-1
-
the item i of sequence s is selected by s[i]
- sequences also support slicing: s[i:j] selects all items with index k such that i <= k < j
-
there are three basic sequence types (lists, tuples, and ranges), plus dedicated binary and text sequence types
- sequence types of the same type also support comparisons
-
for s * n, values of n less than 0 are treated as 0
-
which yields an empty sequence of the same type as sequence s
-
items in sequence s are not copied; they are referenced multiple times
lists = [[]] * 3      # [[], [], []]
lists[0].append(3)    # [[3], [3], [3]]
-
[[]] is a one-element list containing an empty list
- so all three elements of [[]] * 3 are references to this single empty list
- to create a list of distinct lists, use a list comprehension instead
lists = [[] for i in range(3)]
lists[0].append(3)
lists[1].append(5)
lists[2].append(7)
# lists is now [[3], [5], [7]]
- if the index (i) of a sequence is negative, len(s) + i is substituted
-
the slice of sequence s from i to j is defined as the sequence of items with index k
- such that i <= k < j, if i or j is greater than len(s), use len(s)
-
if i is omitted or None, use 0
-
if j is omitted or None, use len(s)
- if i is greater than or equal to j, the slice is empty
-
the slice of sequence s from i to j with step k is defined as the sequence of items with index x = i + n*k
-
such that 0 <= n < (j-i)/k
- if k is None, it is treated like 1
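The indexing and slicing rules above, applied to a seven-item list. The sample data is arbitrary.

```python
s = list("ABCDEFG")  # len(s) == 7

last_item = s[-1]          # negative index: len(s) + i is substituted
clipped = s[4:100]         # j greater than len(s) is treated as len(s)
empty = s[5:2]             # i >= j gives an empty slice
every_other = s[1:6:2]     # indices 1, 3, 5
defaults = s[:3] + s[3:]   # omitted bounds default to 0 and len(s)

print(last_item, clipped, empty, every_other, defaults == s)
```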
-
concatenating immutable sequences results in a new object
-
building up a sequence by repeated concatenation will have a quadratic runtime cost in the total sequence length
- for str objects, use str.join() at the end or write to an io.StringIO instance and retrieve its value when complete
-
for bytes objects, you can similarly use bytes.join() or io.BytesIO or you do in-place concatenation with a bytearray object
- bytearray objects are mutable and have an efficient overallocation mechanism
- for tuple objects, extend a list instead
- IndexError is raised if i is outside the sequence range
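The indexing and slicing rules above can be checked with a short sketch. It uses a plain list of the integers 0 to 9; the variable names are chosen only for illustration.

```python
s = list(range(10))          # [0, 1, ..., 9]

# a negative index i is replaced by len(s) + i
last = s[-1]

# slice bounds greater than len(s) are clamped instead of raising
clamped = s[7:100]

# omitted bounds default to 0 and len(s); i >= j gives an empty slice
head = s[:3]
empty = s[5:2]

# extended slice: items at indices i + n*k
evens = s[0:10:2]
reversed_copy = s[::-1]

# plain indexing outside the range raises IndexError
try:
    s[10]
except IndexError as exc:
    error = type(exc).__name__

# build a string with one join() call instead of repeated concatenation
joined = ",".join(str(n) for n in s)
```

Slicing never raises for out-of-range bounds, which is why only the plain index access needs the try block.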
-
immutable sequences
- the listing below gives the common sequence operations sorted in ascending priority; s and t are sequences of the same type, n, i, j and k are integers, and x is an arbitrary object that meets any type and value restrictions imposed by s
x in s                # True if an item of s is equal to x, else False
x not in s            # False if an item of s is equal to x, else True
s + t                 # the concatenation of s and t
s * n                 # equivalent to adding s to itself n times
s[i]                  # ith item of s, origin 0
s[i:j]                # slice of s from i to j
s[i:j:k]              # slice of s from i to j with step k
len(s)                # length of s
min(s)                # smallest item of s
max(s)                # largest item of s
s.index(x[, i[, j]])  # index of the first occurrence of x in s (at or after index i and before index j)
s.count(x)            # total number of occurrences of x in s
-
generally supports the built-in hash()
- this allows immutable sequences, such as tuple instances, to be used as dict keys and stored in set and frozenset instances
- hashing an immutable sequence containing unhashable values will result in TypeError
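A minimal sketch of the hashing behaviour described above; the names point, labels and seen are illustrative only.

```python
point = (3, 4)
labels = {point: "corner"}           # a tuple of hashable items works as a dict key
seen = {(1, 2), (1, 2), (2, 3)}      # equal tuples collapse to one set element

try:
    hash(([1, 2], 3))                # the inner list is unhashable
except TypeError:
    unhashable = True
else:
    unhashable = False
```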
-
tuples
-
tuples are immutable sequences to store collections of heterogeneous data
-
or for cases where an immutable sequence of homogeneous data is needed
-
construct using a pair of parentheses for the empty tuple, a trailing comma for a singleton tuple, items separated by commas, or the type constructor
()                            # empty tuple
a, or (a,)                    # singleton tuple
a, b, c or (a, b, c)          # items separated by commas
tuple() or tuple(iterable)    # type constructor
-
it is actually the comma which makes a tuple, not the parentheses because they are optional
-
parentheses are required only for the empty tuple or to avoid syntactic ambiguity
f(a, b, c)      # function call with 3 arguments
f((a, b, c))    # function call with a single 3-tuple argument
-
-
for heterogeneous collections of data where access by name is clearer than access by index,
- collections.namedtuple() may be an appropriate choice over a simple tuple object
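As a sketch of access by name versus access by index, the example below defines a hypothetical Station record type (the field names and values are made up for illustration).

```python
from collections import namedtuple

# Station is a hypothetical record type, not part of any library
Station = namedtuple("Station", ["name", "lat", "lon"])
st = Station("A", 35.5, -84.0)

by_name = st.lat                  # access by field name
by_index = st[1]                  # still an ordinary tuple underneath
name, lat, lon = st               # unpacking works as usual
moved = st._replace(lon=-85.0)    # returns a new tuple; st itself is unchanged
```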
-
ranges
-
the range type represents an immutable sequence of numbers
class range(stop)
class range(start, stop[, step])
-
range constructor args must be integers
- step defaults to 1
- start defaults to 0
-
ranges containing absolute values larger than sys.maxsize are permitted, but some features (such as len()) may raise OverflowError
list(range(1))            # [0]
list(range(1, 11))        # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
list(range(0, 30, 5))     # [0, 5, 10, 15, 20, 25]
list(range(5, -1, -1))    # [5, 4, 3, 2, 1, 0]
the advantage of range type over a regular list or tuple is that a range object will always take the same amount of memory
- it only stores the start, stop, and step values
- they provide features such as containment tests, element index lookup, slicing, and support for negative indices
-
two range objects are considered equal if they represent the same sequence of values
- even if they have different start, stop, and step attributes
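The range features listed above can be exercised directly. This is a small sketch with illustrative names; the last two comparisons show ranges that are equal despite different attributes.

```python
r = range(0, 20, 2)

contains = 10 in r             # containment test, no list is built
position = r.index(10)         # element index lookup
sub = r[:3]                    # slicing returns another range
last = r[-1]                   # negative indices are supported

# equality compares the sequences of values, not start/stop/step
both_empty = range(0) == range(2, 1, 3)
same_values = range(0, 3, 2) == range(0, 4, 2)
```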
-
-
strings
-
strings are immutable sequences of Unicode code points
-
string literals are written in single quotes, double quotes, or triple quotes
- triple-quoted strings may span multiple lines
- string literals that are part of a single expression and separated only by whitespace are implicitly concatenated into a single string literal
- strings may be created from other objects using the str constructor
-
string formatting with str.format() and f-strings supports a large degree of flexibility and customization
- strings also support C printf-style formatting, which handles a narrower range of types but is often faster for the cases it can handle
string methods
-
str.capitalize()
- return a copy of the string with its first character capitalized and the rest lowercased
-
str.casefold()
-
return a casefolded copy of the string
- casefolding is similar to lowercasing but more aggressive because it is intended to remove all case distinctions in a string
-
str.center(width[, fillchar])
-
return the string centered in a string of length width
- padding is done using the fillchar
-
str.count(sub[, start[, end]])
-
return the number of non-overlapping occurrences of substring sub in the range [start, end]
-
if sub is empty, returns the number of empty strings between characters
- which is the length of the string plus one
-
str.encode(encoding='utf-8', errors='strict')
- return the string encoded to bytes
-
str.endswith(suffix[, start[, end]])
-
returns True if the string ends with the specified suffix, otherwise return False
- suffix can also be a tuple of suffixes to look for
-
str.expandtabs(tabsize=8)
-
return a copy of the string where all tab characters are replaced by one or more spaces
- depending on the current column and given tab size
-
str.find(sub[, start[, end]])
-
return the lowest index in the string where substring sub is found within the slice s[start:end]
- return -1 if sub not found
-
str.format(*args, **kwargs)
-
perform a string formatting operation
-
the string on which this method is called can contain literal text or replacement fields delimited by braces {}
-
each replacement field contains either the numeric index of a positional argument, or the name of a keyword argument
-
returns a copy of the string where each replacement field is replaced with the string value of corresponding argument
"The sum of 1 + 2 is {0}".format(1+2) # 'The sum of 1 + 2 is 3'
-
str.format_map(mapping, /)
-
similar to str.format(**mapping), except that mapping is used directly and not copied to a dict
class Default(dict):
    def __missing__(self, key):
        return key
'{name} was born in {country}'.format_map(Default(name='Guido'))   # 'Guido was born in country'
-
-
str.index(sub[, start[, end]])
- like find() but raises ValueError when the substring is not found
-
str.isalnum()
- returns True if all characters in the string are alphanumeric and there is at least one character, False otherwise
-
str.isalpha()
- returns True if all characters in the string are alphabetic and there is at least one character, False otherwise
-
str.isascii()
- returns True if the string is empty or all characters in the string are ASCII, False otherwise
-
str.isdecimal()
- returns True if all characters in the string are decimal characters and there is at least one character, False otherwise
-
str.isdigit()
- returns True if all characters in the string are digits and there is at least one character, False otherwise
-
str.isidentifier()
- returns True if the string is a valid identifier according to the language definition
-
str.islower()
- returns True if all cased characters in the string are lowercase and there is at least one cased character, False otherwise
-
str.isnumeric()
- returns True if all the characters in the string are numeric characters and there is at least one character, False otherwise
-
str.isprintable()
- returns True if all characters in the string are printable, False if it contains at least one non-printable character
-
str.isspace()
- returns True if there are only whitespace characters in the string and there is at least one character, False otherwise
-
str.istitle()
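- returns True if the string is a titlecased string and there is at least one character, False otherwise
-
str.isupper()
- returns True if all cased characters in the string are uppercase and there is at least one cased character, False otherwise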
-
str.join(iterable)
-
return a string which is the concatenation of the strings in iterable
- TypeError is raised if there are any non-string values in iterable
-
str.ljust(width[, fillchar])
-
return the string left justified in a string of length width
- padding is done using the specified fillchar
-
str.lower()
- return a copy of the string with all the cased characters converted to lowercase
-
str.lstrip([chars])
-
return a copy of the string with leading characters removed
- chars argument is a string specifying the set of characters to be removed
-
static str.maketrans(x[, y[, z]])
- this static method returns a translation table usable for str.translate()
-
str.partition(sep)
-
split the string at the first occurrence of sep
- and return a 3-tuple containing the part before the separator, the separator itself, and the part after the separator
-
str.removeprefix(prefix, /)
-
if the string starts with the prefix string, return string[len(prefix):]
- otherwise return a copy of the original string
-
str.replace(old, new, count=-1)
-
return a copy of the string with all occurrences of substring old replaced by new
-
if count is given, only the first count occurrences are replaced
- if count is not specified or -1, then all occurrences are replaced
-
str.rfind(sub[, start[, end]])
-
return the highest index in the string where sub is found
- such that sub is contained within s[start:end]
-
str.rindex(sub[, start[, end]])
- like rfind() but raises ValueError when the substring sub is not found
-
str.rjust(width[, fillchar])
-
return the string right justified in a string of length width
- padding is done using the fillchar
-
str.rpartition(sep)
-
split the string at the last occurrence of sep
- and return a 3-tuple containing the part before the separator, the separator itself, and the part after the separator
-
str.rstrip([chars])
-
return a copy of the string with the trailing characters removed
- chars argument is a string specifying the set of characters to be removed
-
str.split(sep=None, maxsplit=-1)
-
return a list of the words in the string, using the sep as the delimiter string
- at most maxsplit splits are done
-
str.splitlines(keepends=False)
-
return a list of the lines in the string, breaking at line boundaries
- line breaks are not included in the resulting list unless keepends is given and true
-
str.startswith(prefix[, start[, end]])
- returns True if string starts with the prefix, otherwise return False
-
str.strip([chars])
- return a copy of the string with the leading and trailing characters removed
-
str.swapcase()
- return a copy of the string with uppercase characters converted to lowercase and vice versa
-
str.title()
- returns a titlecased version of the string where words start with an uppercase character and the remaining characters are lowercase
-
str.translate(table)
-
returns a copy of the string in which each character has been mapped through the given translation table
- table is typically a mapping or sequence
-
str.upper()
- return a copy of the string with all the cased characters converted to uppercase
-
str.zfill(width)
-
return a copy of the string left filled with ASCII '0' digits to make a string of length width
- a leading sign prefix ('+'/'-') is handled by inserting the padding after the sign character rather than before
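A few of the methods listed above, applied to made-up sample strings. This is a sketch only; str.removeprefix() assumes Python 3.9 or later.

```python
title = "  Landsat-8 Surface Reflectance  "    # illustrative sample text

clean = title.strip()                          # leading and trailing whitespace removed
before, sep, after = clean.partition(" ")      # split at the first space
count_e = clean.count("e")                     # non-overlapping occurrences of 'e'
missing = clean.find("xyz")                    # -1 when the substring is absent

stem = "scene_0042.tif".removeprefix("scene_")
padded = "-42".zfill(6)                        # padding goes after the sign
same = "Straße".casefold() == "STRASSE".casefold()
fields = "a,b,,c".split(",")                   # empty fields are kept with an explicit sep
once = "aaaa".replace("a", "b", 1)             # only the first occurrence is replaced
```

Note that find() signals a miss with -1, whereas index() would raise ValueError for the same input.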
An f-string, formatted string literal, is a string literal that is prefixed with f or F
-
allows embedding arbitrary Python expressions within replacement fields, which are delimited by curly brackets ({})
-
these are evaluated at runtime, and converted into regular str objects
who = 'nobody'
nationality = 'Spanish'
f'{who.title()} expects the {nationality} Inquisition!'   # 'Nobody expects the Spanish Inquisition!'
-
-
any non-string expression is converted using str() by default
-
to use an explicit conversion, use the ! operator followed by any of the valid formats
- !a ascii()
- !r repr()
- !s str()
from fractions import Fraction
f'{Fraction(1, 3)!s}'    # '1/3'
f'{Fraction(1, 3)!r}'    # 'Fraction(1, 3)'
question = '¿Dónde está el Presidente?'
print(f'{question!a}')   # '\xbfD\xf3nde est\xe1 el Presidente?'
printf-style String Formatting, similar to the sprintf() function in the C language
- the % operator (modulo) is a built-in operation unique to string objects, known as the string formatting or interpolation operator
print('%s has %d quote types.' % ('Python', 2)) # Python has 2 quote types.
-
mutable sequences
-
In the table s is an instance of a mutable sequence type,
- t is any iterable object and x is an arbitrary object that meets any type and value restrictions imposed by s
s[i] = x          # item i of s is replaced by x
del s[i]          # removes item i of s
s[i:j] = t        # slice of s from i to j is replaced by the contents of the iterable t
del s[i:j]        # same as s[i:j] = []
s[i:j:k] = t      # the elements of s[i:j:k] are replaced by those of t
del s[i:j:k]      # removes the elements of s[i:j:k] from the list
s.append(x)       # appends x to the end of the sequence
s.clear()         # removes all the items from s
s.copy()          # creates a shallow copy of s
s.extend(t)       # or s += t; extends s with the contents of t
s *= n            # updates s with its contents repeated n times
s.insert(i, x)    # inserts x into s at the index given by i
s.pop()           # or s.pop(i); retrieves the item at i (default: the last item) and also removes it from s
s.remove(x)       # removes the first item from s where s[i] is equal to x
s.reverse()       # reverses the items of s in place
-
lists
-
lists are mutable sequences to store homogeneous items
-
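construct using square brackets, a list comprehension, or the type constructor
[]                          # empty list
[a], [a, b, c]              # items separated by commas
[x for x in iterable]       # list comprehension
list() or list(iterable)    # type constructor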
-
- if iterable is already a list, a copy is made and returned, similar to iterable[:]
additional list methods
-
sort(*, key=None, reverse=False)
- sorts list in place using < comparisons between items
- there is also sorted() that builds a new sorted list from an iterable
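A brief sketch contrasting sorted() with the in-place sort() method; the word list is illustrative.

```python
words = ["pear", "Apple", "fig", "banana"]

# sorted() builds a new list and leaves the input untouched
by_length = sorted(words, key=len)

# sort() reorders the list in place and returns None
returned = words.sort(key=str.lower)
case_insensitive = list(words)

# reverse=True sorts as if each < comparison were reversed
words.sort(reverse=True)
```

Because sort() returns None, writing words = words.sort() is a common mistake that discards the list.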
-
binary sequences
-
built-in types for manipulating binary data, bytes and bytearray
-
supported by memoryview
-
which uses the buffer protocol to access the memory of other binary objects without needing to make a copy
-
Python provides facilities to access an underlying memory array or buffer
-
provided at the C and Python level
- on the producer side of the buffer protocol, a type can export a "buffer interface" which allows objects of that type to expose information about their underlying buffer
- on the consumer side, several means are available to obtain a pointer to the raw underlying data of an object
-
buffer structures can be used as a zero-copy slicing mechanism and expose binary data
- the memory could be a large, constant array in a C extension, it could be a raw block of memory for manipulation before passing to an OS library, or could be used to pass around structured data in its native, in-memory format
- buffers are not PyObject pointers but simple C structures
- the array module supports efficient storage of basic data types like 32-bit integers and IEEE754 double precision floating values
-
bytes
-
bytes objects are immutable sequences of single bytes
-
offers several methods for ASCII compatible data
class bytes([source[, encoding[, errors]]])
-
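write bytes literals like string literals, except with a b prefix
b'still allows embedded "double" quotes'
b"still allows embedded 'single' quotes"
b'''3 single quotes''', b"""3 double quotes"""
bytes(10)           # zero-filled bytes object of length 10
bytes(range(20))    # from an iterable of integers
bytes(obj)          # copies existing binary data via the buffer protocol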
-
-
- any binary values over 127 must be entered into bytes literals using the appropriate escape sequence
2 hexadecimal digits correspond to a single byte
- the bytes type has a class method to read from hexadecimal: classmethod fromhex(string)
- the reverse conversion, transforming a bytes object into its hexadecimal representation, is hex([sep[, bytes_per_sep]])
- you can always convert a bytes object into a list of integers using list(b)
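The hexadecimal round trip described above, as a minimal sketch. The sample bytes are arbitrary, and the sep argument of hex() assumes Python 3.8 or later.

```python
raw = bytes.fromhex("2ef0 f1f2")    # whitespace between byte pairs is ignored

as_hex = raw.hex()                  # two hexadecimal digits per byte
grouped = raw.hex(":")              # optional separator between bytes
as_ints = list(raw)                 # each byte as an integer in range(256)

# bytes is immutable; a bytearray copy can be modified in place
buf = bytearray(raw)
buf[0] = 0x00
```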
-
bytearray
bytearray objects are a mutable counterpart to bytes objects
class bytearray([source[, encoding[, errors]]])
- always created by calling the constructor
bytearray()                   # empty instance
bytearray(range(20))          # from an iterable of integers
bytearray(b'hello world!')    # copies existing binary data via the buffer protocol
2 hexadecimal digits correspond to a single byte
- the bytearray type has a class method to read from hexadecimal: classmethod fromhex(string)
- the reverse conversion, transforming a bytearray object into its hexadecimal representation, is hex([sep[, bytes_per_sep]])
- you can always convert a bytearray object into a list of integers using list(b)
-
bytes and bytearray operations
-
*.count(sub[, start[, end]])
- returns the number of non-overlapping occurrences of subsequence sub in the range [start, end]
-
*.removeprefix(prefix, /)
-
if the binary data starts with the prefix, return bytes[len(prefix):]
- otherwise return a copy of the original binary data
- the remaining operations are largely the same as the string methods
-
memoryview
-
memoryview objects allow Python code to access the internal data of an object that supports the buffer protocol without copying
class memoryview(object)
-
creates a memoryview that references an object that supports the buffer protocol, like bytes and bytearray
-
a memoryview has the notion of an element
- the atomic memory unit handled by the originating object
-
the itemsize attribute will give you the number of bytes in a single element
-
memoryview also supports slicing and indexing to expose its data
v = memoryview(b'abcefg')
v[1]             # 98
v[-1]            # 103
v[1:4]           # <memory at 0x...>
bytes(v[1:4])    # b'bce'
-
-
-
non-byte format
import array
a = array.array('l', [-11111111, 22222222, -33333333, 44444444])
m = memoryview(a)
m[0]               # -11111111
m[-1]              # 44444444
m[::2].tolist()    # [-11111111, -33333333]
-
memoryview supports one-dimensional slice assignment if the underlying object is writable; resizing is not allowed
data = bytearray(b'abcefg')
v = memoryview(data)
v.readonly          # False
v[0] = ord(b'z')
data                # bytearray(b'zbcefg')
v[1:4] = b'123'
data                # bytearray(b'z123fg')
v[2:3] = b'spam'    # ValueError: memoryview assignment: lvalue and rvalue have different structures
v[2:6] = b'spam'
data                # bytearray(b'z1spam')
-
methods available to memoryview
__eq__(exporter)
-
a memoryview and a PEP 3118 exporter are equal if their shapes are equivalent
- and if all corresponding values are equal when the operands' respective format codes are interpreted using struct syntax
tobytes(order='C')
-
return the data in the buffer as a bytestring
- equivalent to calling the bytes constructor on the memoryview
hex([sep[, bytes_per_sep]])
- return a string object containing two hexadecimal digits for each byte in the buffer
tolist()
- return the data in the buffer as a list of elements
toreadonly()
- return a readonly version of the memoryview object
release()
- release the underlying buffer exposed by the memoryview object; after this call, any further operation on the view raises ValueError
cast(format[, shape])
-
cast a memoryview to a new format or shape
-
shape defaults to [bytelength//newitemsize]
- which means the result view will be one-dimensional
- return value is a new memoryview but the buffer itself is not copied
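A sketch of cast() under the rules above: the same eight bytes are viewed as 16-bit items and as a 2 x 4 grid, and a write through one view is visible through the others because nothing is copied. The variable names are illustrative.

```python
data = bytearray(range(8))              # bytes 0..7
view = memoryview(data)

as_u16 = view.cast("H")                 # unsigned 16-bit items over the same buffer
grid = view.cast("B", shape=[2, 4])     # 2 x 4 view of unsigned bytes
rows_before = grid.tolist()

view[0] = 255                           # write through the original view
shared = (data[0], grid[0, 0])          # both observe the change
```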
-
readonly attributes available to memoryview
-
obj
- the underlying object of the memoryview
-
nbytes
- the amount of space in bytes that the array would use in a contiguous representation
-
readonly
- a bool indicating whether the memory is read only
-
format
- a string containing the format, in struct module style, for each element in the view
-
itemsize
- the size in bytes of each element in the memoryview
-
ndim
- an integer indicating how many dimensions of a multi-dimensional array the memory represents
-
shape
- a tuple of integers the length of ndim giving the shape of the memory as an N-dimensional array
-
strides
- a tuple of integers the length of ndim giving the size in bytes to access each element for each dimension of the array
-
suboffsets
- used internally for PIL-style arrays
-
c_contiguous
- a bool indicating whether the memory is C-contiguous
-
f_contiguous
- a bool indicating whether the memory is Fortran contiguous
-
contiguous
- a bool indicating whether the memory is contiguous
sets
-
a set object is an unordered collection of distinct hashable objects
-
an object is hashable if it has a hash value which never changes during its lifetime
- it needs a __hash__() method, and an __eq__() method so that it can be compared to other objects
-
being an unordered collection, sets do not record element position or order of insertion
- sets do not support indexing, slicing, or other sequence-like behavior
-
the set type is mutable
- it has no hash value and cannot be used as either a dictionary key or as an element of another set
-
the frozenset type is immutable and hashable
- its contents cannot be altered after it is created; therefore can be used as a dictionary key or as element of another set
-
non-empty sets (not frozensets) can be created by placing a comma-separated list of elements within braces, in addition to using the set constructor
class set([iterable])
class frozenset([iterable])
-
return a new set or frozenset object whose elements are taken from iterable
-
to represent sets of sets, the inner sets must be frozenset objects
{'jack', 'sjoerd'}                              # comma-separated list of elements within braces
{c for c in 'abracadabra' if c not in 'abc'}    # set comprehension
set(), set('foobar')                            # type constructor
-
two sets are equal if and only if every element of each set is contained in the other (each is a subset of the other)
- a set is less than another set if and only if the first set is a proper subset of the second set (a subset, but not equal)
- a set is greater than another set if and only if the first set is a proper superset of the second set (a superset, but not equal)
set operations; the mutating operations (update() through clear()) do not apply to frozenset
s = set(['a', 'b', 'foo'])
len(s)                                               # 3
s.isdisjoint('c')                                    # True
s.issubset(['a', 'b', 'foo', 'bar'])                 # True
s.issuperset(['a', 'b', 'foo', 'bar'])               # False
u = s.union(['a', 'b', 'foo', 'bar'])                # {'b', 'foo', 'bar', 'a'}
t = s.intersection(u)                                # {'b', 'foo', 'a'}
d = s.difference('a')                                # {'b', 'foo'}
sd1 = s.symmetric_difference(u)                      # {'bar'}
sd2 = s.symmetric_difference('a')                    # {'b', 'foo'}
scopy = s.copy()                                     # {'b', 'foo', 'a'}
scopy.update('bar')                                  # {'b', 'r', 'foo', 'a'}
scopy.intersection_update({'foo', 'bar'})            # {'foo'}
scopy.difference_update({'foo'})                     # set()
scopy.symmetric_difference_update({'foo', 'bar'})    # {'foo', 'bar'}
scopy.add('eels')                                    # {'eels', 'foo', 'bar'}
scopy.remove('foobar')                               # KeyError: 'foobar'
scopy.discard('foobar')                              # no error if the element is absent
scopy.pop()
scopy.pop()
scopy.pop()
scopy.pop()                                          # KeyError: 'pop from an empty set'
scopy.clear()
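The comparison rules and the role of frozenset can be sketched as follows; all names and values are illustrative.

```python
a = {1, 2}
b = {1, 2, 3}

proper_subset = a < b                          # subset and not equal
unrelated = {1, 4}
neither = not (unrelated < b) and not (unrelated > b) and unrelated != b

# set and frozenset compare by their members
mixed_equal = {1, 2} == frozenset([2, 1])

# a set of sets needs frozenset elements, because set is unhashable
groups = {frozenset({"a", "b"}), frozenset({"c"})}
try:
    {{"a"}}
except TypeError:
    nested_set_fails = True
else:
    nested_set_fails = False

# frozenset also works as a dictionary key
lookup = {frozenset({"x", "y"}): "pair"}
found = lookup[frozenset({"y", "x"})]
```

Subset ordering is only partial: two sets can be unequal with neither one less than the other.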
mappings
-
a mapping object maps hashable values to arbitrary objects; mappings are mutable
- dictionary is the one standard mapping type
-
a dictionary's keys are almost arbitrary values, but they must be hashable
-
that is, values containing lists, dictionaries or other mutable types may not be used as keys
class dict(**kwargs)
class dict(mapping, **kwargs)
class dict(iterable, **kwargs)
-
return a new dictionary
- initialized from an optional positional argument and a possibly empty set of keyword arguments
-
dictionaries are constructed by
-
use a comma-separated list of key:value pairs within braces
-
use a dict comprehension
-
use the type constructor
{'jack': 4098, 'sjoerd': 4127}        # key: value pairs within braces
{4098: 'jack', 4127: 'sjoerd'}
{}                                    # empty dict
{x: x ** 2 for x in range(10)}        # dict comprehension
dict()                                # type constructor
dict([('foo', 100), ('bar', 200)])
dict(foo=100, bar=200)
-
dictionaries compare equal only if they have the same (key, value) pairs regardless of order
-
operations that dictionaries support
c = dict(zip(['one', 'two', 'three'], [1, 2, 3]))
# {'one': 1, 'two': 2, 'three': 3}
list(c)
# ['one', 'two', 'three']
len(c)
# 3
c['one']
# 1
c[1]
# KeyError: 1
iter(c)
# <dict_keyiterator object at 0x738f45d3b9c0>
ccopy = c.copy()
# shallow copy: {'one': 1, 'two': 2, 'three': 3}
ccopy.clear()
# ccopy is now {}; c is unchanged
z = c.fromkeys(iter(c))
# {'one': None, 'two': None, 'three': None}
z.items()
# dict_items([('one', None), ('two', None), ('three', None)])
z.keys()
# dict_keys(['one', 'two', 'three'])
z.pop('one')
# returns None (the removed value); z is now {'two': None, 'three': None}
z.popitem()
# ('three', None); z is now {'two': None}
reversed(z)
# <dict_reversekeyiterator object at 0x738f45d611c0>
z.setdefault('one', 0)
# returns 0; z is now {'two': None, 'one': 0}
z.update(c)
# z is now {'two': 2, 'one': 1, 'three': 3}
z.values()
# dict_values([2, 1, 3])
y = {'four': 4}
w = z | y
# {'two': 2, 'one': 1, 'three': 3, 'four': 4}
y |= c
# y is now {'four': 4, 'one': 1, 'two': 2, 'three': 3}
-
dictionary view objects are returned by dict.keys(), dict.values(), dict.items()
-
len(dictview)
- return the number of entries in the dict
-
iter(dictview)
-
return an iterator over the keys, values or items
- items are represented as tuples of (key, value)
- keys views are set-like since their entries are unique and hashable
-
items views also have set-like operations since the (key, value) pairs are unique and the keys are hashable
- if all values in an items view are hashable as well, then the items view can interoperate with other sets
dishes = {'eggs': 2, 'sausage': 1, 'bacon': 1, 'spam': 500}
keys = dishes.keys()
values = dishes.values()
# iteration
n = 0
for val in values:
    n += val
print(n)
# keys and values are iterated over in the same order (insertion order)
list(keys)
list(values)
# view objects are dynamic and reflect dict changes
del dishes['eggs']
del dishes['sausage']
list(keys)
# set operations
keys & {'eggs', 'bacon', 'salad'}
keys ^ {'sausage', 'juice'} == {'juice', 'sausage', 'bacon', 'spam'}
keys | ['juice', 'juice', 'juice'] == {'bacon', 'spam', 'juice'}
# get back a read-only proxy for the original dictionary
values.mapping
values.mapping['spam']
context manager
-
the with statement supports the concept of a runtime context defined by a context manager
-
allows user-defined classes to define a runtime context
- that is entered before the statement body is executed and exited when the statement ends
contextmanager.__enter__()
-
enter the runtime context and return either this object or another object related to the runtime context
- the returned value is bound to the identifier in the as clause of with statements
contextmanager.__exit__(exc_type, exc_val, exc_tb)
-
exit the runtime context and return a boolean flag indicating if any exception that occurred should be suppressed
- the arguments contain the exception type, value, and traceback information
see the contextlib module for examples of several context managers
-
Python's generators and the contextlib.contextmanager decorator provide a convenient way to implement these protocols
-
a generator function with that decorator will return a context manager implementing the necessary __enter__() and __exit__() methods
- rather than the iterator produced by an undecorated generator function
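Both ways of writing a context manager can be sketched side by side. Recorder and recorder are hypothetical names invented for this example; they only append markers to a list so the order of events is visible.

```python
from contextlib import contextmanager


class Recorder:
    """Hypothetical class-based context manager that logs enter and exit."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("enter")
        return self                      # bound to the name after "as"

    def __exit__(self, exc_type, exc_val, exc_tb):
        name = exc_type.__name__ if exc_type else None
        self.log.append(f"exit:{name}")
        return exc_type is ValueError    # a true value suppresses the exception


@contextmanager
def recorder(log):
    """Hypothetical generator-based equivalent."""
    log.append("enter")                  # runs on entry
    try:
        yield log                        # value bound by "as"
    finally:
        log.append("exit")               # runs on exit, even after an error


class_log = []
with Recorder(class_log) as r:
    raise ValueError("suppressed by __exit__")

gen_log = []
with recorder(gen_log) as target:
    target.append("body")
```

The class form makes suppression explicit through the return value of __exit__(); in the generator form an exception propagates unless the generator catches it around the yield.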
type annotation types
-
type annotations are a label associated with a variable, a class attribute or a function parameter or return value, used by convention as a type hint
- type hint specifies the expected type, not enforced by python but useful to static type checkers
-
GenericAlias
-
GenericAlias objects are generally created by subscripting a class
- subscripting a generic container class (list, tuple, dict, etc.) will generally return a GenericAlias object
-
GenericAlias object acts as a proxy for a generic type, a type that can be parameterized, implementing parameterized generics
T[X, Y, …]
-
creates a GenericAlias representing a type T parameterized by types X, Y, and more depending on the T used
def send_post_request(url: str, body: dict[str, int]) -> None: ...
-
-
parameterized generics erase type parameters during object creation
special attributes of GenericAlias objects
-
genericalias.__origin__
- this points at the non-parameterized generic class
-
genericalias.__args__
- this is a tuple of generic types passed to the original __class_getitem__() of the generic class
-
genericalias.__parameters__
- this is a lazily computed tuple of unique type variables found in __args__
-
genericalias.__unpacked__
- a boolean that is true if the alias has been unpacked using the * operator
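The special attributes above can be inspected on a concrete alias. This is a sketch assuming Python 3.9 or later, where built-in containers are subscriptable; the variable names are illustrative.

```python
import types
from typing import TypeVar

alias = dict[str, int]
is_alias = isinstance(alias, types.GenericAlias)
origin = alias.__origin__            # the non-parameterized class
args = alias.__args__                # the type arguments

T = TypeVar("T")
params = list[T].__parameters__      # unique type variables found in __args__

# type parameters are erased at object creation: the result is a plain dict
d = alias(a=1)
erased = type(d) is dict
```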
Union
-
a union object holds the value of the | bitwise or operation on multiple type objects
- intended for type annotations
X | Y | …
-
defines a union object which holds types X, Y, and so forth
- X | Y means either X or Y, equivalent to typing.Union[X, Y]
def square(number: int | float) -> int | float:
    return number ** 2
modules
-
modules are a basic organizational unit of Python code, and created by the import system
-
one module gains access to the code in another module by the process of importing it
-
the basic import statement is executed in two steps
-
find a module, loading and initializing it if necessary
-
define a name or names in the local namespace for the scope where the import statement occurs
-
if the requested module is retrieved successfully
- if the module name is followed by as, then the name following as is bound directly to the imported module
- if no other name is specified, and the module being imported is a top level module, the module's name is bound in the local namespace as a reference to the imported module
- if the module being imported is not a top level module, then the name of the top level package that contains the module is bound in the local namespace as a reference to the top level package; the imported module must be accessed using its full qualified name rather than directly
-
the from form uses a slightly more complex process:
- find the module specified in the from clause, loading and initializing it if necessary
-
for each of the identifiers specified in the import clauses:
- check if the imported module has an attribute by that name:
- if not attempt to import a submodule with that name and then check the imported module again for that attribute
- if the attribute is not found, ImportError is raised
- otherwise, a reference to that value is stored in the local namespace, using the name in the as clause if it is present, otherwise using the attribute name
import foo # foo imported and bound locally
import foo.bar.baz # foo, foo.bar, and foo.bar.baz imported, foo bound locally
import foo.bar.baz as fbb # foo, foo.bar, and foo.bar.baz imported, foo.bar.baz bound as fbb
from foo.bar import baz # foo, foo.bar, and foo.bar.baz imported, foo.bar.baz bound as baz
from foo import attr # foo imported and foo.attr bound as attr
from foo import * # all public names defined in the foo module are bound in the local namespace
# will throw SyntaxError if used in a class or function
-
public names are determined by checking the module's namespace for a variable named __all__
- if __all__ is not defined, the public names are all names found in the module's namespace that do not begin with an underscore character ('_')
-
the only special operation on a module is attribute access: m.name
-
m is a module and name accesses a name defined in m's symbol table
-
the __dict__ attribute contains the symbol table
-
modifying this dictionary changes the module's symbol table, but direct assignment to the __dict__ attribute itself is not possible
module built into the interpreter: <module 'sys' (built-in)>
module loaded from a file: <module 'os' from '/usr/local/lib/pythonX.Y/os.pyc'>
- if a named module cannot be found, a ModuleNotFoundError is raised
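The binding rules above can be tried with standard-library modules. This is only a sketch: `no_such_module_for_demo` is a made-up name assumed not to be installed, and `missing` is an illustrative variable.

```python
import os.path                 # imports os and os.path, binds only "os" locally
import os.path as osp          # binds the submodule itself as "osp"
from os import path            # binds the attribute os.path as "path"
import math

print(osp is os.path)          # both names refer to the same module object
print(path is os.path)

# The module's symbol table is exposed through __dict__
print(math.__dict__["pi"] == math.pi)

missing = None
try:
    import no_such_module_for_demo   # hypothetical module name
except ModuleNotFoundError as err:
    missing = err.name               # name of the module that was not found
print("not found:", missing)
```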
classes
-
classes provide a means of bundling data and functionality together.
-
a new class creates a new type of object
- allowing new instances of that type to be made
-
each class instance can have attributes attached to it for maintaining its state
- can also have methods defined by its class for modifying its state
-
object-oriented paradigm
-
compared to Simula, a general-purpose, object-oriented programming language for doing simulations
-
the concept of record class construct
-
for pre-defined system classes and subclasses
- and for declaring a complex type that uses built-in types or references user-defined types
-
python's class inheritance mechanism allows multiple base classes
-
a derived class can override any methods of its base class(es)
- and a method can call the method of a base class with the same name
-
classes are created at runtime, as is true for modules
- and can be further modified after creation
-
scopes and namespaces
-
a namespace is a mapping from names to objects
-
most namespaces are implemented as Python dictionaries
- examples are the set of built-in names (including built-in exception names), the global names in a module, and the local names in a function invocation
-
the set of attributes of an object also forms a namespace
- there is no relation between names in different namespaces
-
attributes may be read-only or writable
-
module objects have a secret read-only attribute called __dict__
- returns the dict used to implement a module's namespace
-
the namespace containing built-in names is created when the python interpreter starts up, and is never deleted
- the built-in names live in the builtins module
-
the global namespace for a module is created when the module definition is read in
- module namespaces normally last until the interpreter quits
-
statements executed by the top-level invocation of the interpreter
- either read from a script file or interactively, are considered part of a module called __main__, so they have their own global namespace
-
the local namespace is created for a function when it is called
-
when the function returns or raises an exception that is not handled within the function, the namespace is deleted
- recursive invocations each have their own local namespace
-
a scope is a textual region of a python program where a namespace is directly accessible
-
scopes are determined statically but used dynamically
- at any time during execution, there can be nested scopes whose namespaces are directly accessible
-
if a name is declared global
- then all references and assignments go directly to the next-to-last scope containing the module's global names
-
if a name is declared nonlocal
- then it allows encapsulated code to rebind variables found outside of the innermost scope
-
if no global or nonlocal statement is in effect, assignments to names always go into the innermost scope
- assignments do not copy data; they just bind names to objects
-
the local scope references the local names of the current function
- the local scope outside of functions references the same namespace as the global scope: the module's namespace
- class definitions place yet another namespace in the local scope
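The effect of plain assignment, nonlocal, and global can be seen side by side in a small sketch, adapted from the familiar `scope_test` pattern. The function and variable names are illustrative only.

```python
def scope_test():
    def do_local():
        spam = "local spam"      # binds a new name in do_local's own scope

    def do_nonlocal():
        nonlocal spam            # rebinds spam in the enclosing scope_test scope
        spam = "nonlocal spam"

    def do_global():
        global spam              # binds spam in the module's global namespace
        spam = "global spam"

    spam = "test spam"
    do_local()
    after_local = spam           # unchanged by do_local
    do_nonlocal()
    after_nonlocal = spam        # rebound by do_nonlocal
    do_global()
    after_global = spam          # scope_test's spam is untouched by do_global
    return after_local, after_nonlocal, after_global

results = scope_test()
print(results)
print(spam)                      # the module-level name created by do_global
```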
-
syntax
-
class definitions must be executed before they have any effect
- a class definition could conceivably be placed in a branch of an if statement or inside a function
-
definition
class ClassName:
    <statement-1>
    .
    .
    .
    <statement-N>
-
when a class definition is entered
- a new namespace is created and used as the local scope
-
when a class definition is exited
-
a class object is created
- a wrapper around the contents of the namespace created by the class definition
-
original local scope is reinstated
- the class object is bound here to the class name given in the class definition
-
class object
class MyClass:
    """A simple example class"""
    i = 12345

    def f(self):
        return 'hello world'

x = MyClass()

class Complex:
    def __init__(self, realpart, imagpart):
        self.r = realpart
        self.i = imagpart

x = Complex(3.0, -4.5)
x.r, x.i
-
class objects support attribute references and instantiation
-
attribute references use the standard syntax used for all attribute references in python
-
obj.name
- valid attribute names are all the names that were in the class's namespace when the class object was created
- class attributes can be assigned to
- __doc__ is a valid attribute, returning the docstring belonging to a class
-
class instantiation uses function notation
- it creates a new instance of the class; in x = MyClass() this object is assigned to the variable x
-
classes may define a method named __init__() to create instances customized to a specific initial state
- class instantiation automatically invokes __init__() for newly created class instances
- __init__() may take arguments, which are passed from the class instantiation operator
-
instance object
-
the only operations understood by instance objects are attribute references
-
there are two kinds of valid attribute names: data attributes and methods
x.counter = 1
while x.counter < 10:
    x.counter = x.counter * 2
print(x.counter)
del x.counter
-
data attributes need not be declared
- they spring into existence when they are first assigned to
-
methods are functions that belong to an object
- all attributes of a class that are function objects define corresponding methods of its instances
-
method objects
-
usually a method is called right after it is bound
- a method need not be called right away: x.f is a method object that can be stored away and called at a later time
x.f()
xf = x.f
while True:
    print(xf())
- the special thing about methods is that the instance object is passed as the first argument of the function
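A small sketch of what a bound method object carries with it, reusing a cut-down `MyClass` for illustration. `__self__` and `__func__` are the standard attributes of bound methods.

```python
class MyClass:
    """A simple example class"""

    def f(self):
        return 'hello world'

x = MyClass()
xf = x.f                          # bound method object, stored for later use

print(xf())                       # calling it later still works
print(xf.__self__ is x)           # the instance the method is bound to
print(xf.__func__ is MyClass.f)   # the underlying function object
print(MyClass.f(x) == x.f())      # x.f() is equivalent to MyClass.f(x)
```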
-
class and instance variables
- instance variables are for data unique to each instance
-
class variables are for attributes and methods shared by all instances of the class
-
mutable objects such as lists and dicts should not be used as class variables in the way shown below
class Dog:

    kind = 'canine'         # class variable shared by all instances
    tricks = []             # mistaken use of a class variable

    def __init__(self, name):
        self.name = name

    def add_trick(self, trick):
        self.tricks.append(trick)

>>> d = Dog('Fido')
>>> e = Dog('Buddy')
>>> d.kind                  # shared by all dogs
'canine'
>>> e.kind                  # shared by all dogs
'canine'
>>> d.name                  # unique to d
'Fido'
>>> e.name                  # unique to e
'Buddy'
>>> d.add_trick('roll over')
>>> e.add_trick('play dead')
>>> d.tricks                # unexpectedly shared by all dogs
['roll over', 'play dead']
- instead use an instance variable
-
class Dog:

    def __init__(self, name):
        self.name = name
        self.tricks = []    # creates a new empty list for each dog

    def add_trick(self, trick):
        self.tricks.append(trick)

>>> d = Dog('Fido')
>>> e = Dog('Buddy')
>>> d.add_trick('roll over')
>>> e.add_trick('play dead')
>>> d.tricks
['roll over']
>>> e.tricks
['play dead']
-
if the same attribute name occurs in both an instance and in a class, then attribute lookup prioritizes the instance
class Warehouse:
    purpose = 'storage'
    region = 'west'

w1 = Warehouse()
print(w1.purpose, w1.region)
w2 = Warehouse()
w2.region = 'east'
print(w2.purpose, w2.region)
-
a function definition need not be textually enclosed in the class definition: assigning a function object to a local variable in the class is also ok
# Function defined outside the class
def f1(self, x, y):
    return min(x, x+y)

class C:
    f = f1

    def g(self):
        return 'hello world'

    h = g
-
this practice usually only serves to confuse the reader of a program
-
methods may call other methods by using method attributes of the self argument
- self is the conventional name for the first argument of a method
class Bag:
    def __init__(self):
        self.data = []

    def add(self, x):
        self.data.append(x)

    def addtwice(self, x):
        self.add(x)
        self.add(x)
-
each value is an object and therefore has a class (also called its type)
- it is stored as object.__class__
-
inheritance
class DerivedClassName(BaseClassName):
    <statement-1>
    .
    .
    .
    <statement-N>
-
the BaseClassName must be defined in a namespace directly accessible from the scope containing the derived class definition
-
execution of a derived class definition proceeds the same as for a base class
- when the class object is constructed, the base class is remembered
-
if a requested attribute is not found in the class
-
the search proceeds to look in the base class
- this rule is applied recursively if the base class is itself derived from some other class
-
derived classes may override methods of their base classes
-
an overriding method may want to extend rather than replace the base class method of the same name
-
just call BaseClassName.methodname(self, arguments)
- this only works if the base class is accessible as BaseClassName in the global scope
-
python provides two built-in functions that work with inheritance
-
isinstance() to check an instance's type
- isinstance(obj, int) will be True only if obj.__class__ is int or some class derived from int
-
issubclass() to check class inheritance
- issubclass(bool, int) is True since bool is a subclass of int
- issubclass(float, int) is False since float is not a subclass of int.
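A minimal sketch of overriding with extension, plus the two built-in checks. The classes `Feature` and `Road` are hypothetical and exist only for this illustration.

```python
class Feature:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"feature {self.name}"

class Road(Feature):
    def describe(self):
        # Extend, rather than replace, the base class method
        base = Feature.describe(self)
        return base + " (road)"

r = Road("A1")
print(r.describe())
print(isinstance(r, Feature))       # a Road is also a Feature
print(issubclass(Road, Feature))
print(issubclass(bool, int), issubclass(float, int))
```

`__init__` is not defined in `Road`, so the attribute search falls through to `Feature.__init__`, as described above.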
-
multiple inheritance
class DerivedClassName(Base1, Base2, Base3):
    <statement-1>
    .
    .
    .
    <statement-N>
-
in the simplest cases, the search for attributes inherited from a parent class is depth-first, left-to-right,
-
not searching twice in the same class where there is an overlap in the hierarchy
-
in fact, it is slightly more complex: the method resolution order changes dynamically to support cooperative calls to super()
-
dynamic ordering is necessary because all cases of multiple inheritance exhibit one or more diamond relationships
- where at least one of the parent classes can be accessed through multiple paths from the bottommost class
-
to keep base classes from being accessed more than once
-
the dynamic algorithm linearizes the search order in a way that preserves left-to-right ordering specified in each class, that calls each parent only once, and that is monotonic
- meaning that a class can be subclassed without affecting the precedence order of its parents
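The linearized search order can be inspected through the `__mro__` attribute. The diamond below is a sketch with made-up class names (`Base`, `Left`, `Right`, `Bottom`); each `visit` method cooperates through super().

```python
class Base:
    def visit(self):
        return ["Base"]

class Left(Base):
    def visit(self):
        return ["Left"] + super().visit()

class Right(Base):
    def visit(self):
        return ["Right"] + super().visit()

class Bottom(Left, Right):
    def visit(self):
        return ["Bottom"] + super().visit()

# Base is reachable through both Left and Right, but appears only once
print([cls.__name__ for cls in Bottom.__mro__])

# super() follows the same order, so each parent is called exactly once
print(Bottom().visit())
```

Inside `Left.visit`, `super()` resolves to `Right` rather than `Base` when the instance is a `Bottom`, which is what keeps `Base.visit` from running twice.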
-
private variables
-
private instance variables that cannot be accessed except from inside of an object do not exist in Python
- a name prefixed with an underscore should be treated as a non-public part of the API
-
class-private members are made possible using name mangling
-
any identifier of the form __thing (at least two leading underscores, at most one trailing underscore)
-
is textually replaced with _classname__thing
-
classname is the current class name with leading underscores stripped
-
this is done without regard to the syntactic position of the identifier, as long as it occurs within the definition of a class
class Mapping:
    def __init__(self, iterable):
        self.items_list = []
        self.__update(iterable)

    def update(self, iterable):
        for item in iterable:
            self.items_list.append(item)

    __update = update   # private copy of original update() method

class MappingSubclass(Mapping):

    def update(self, keys, values):
        # provides new signature for update()
        # but does not break __init__()
        for item in zip(keys, values):
            self.items_list.append(item)
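The mangled name can be observed directly on an instance. This is a sketch with a hypothetical class `Layer`; the attribute name `__secret` is illustrative.

```python
class Layer:
    def __init__(self):
        self.__secret = 1          # stored on the instance as _Layer__secret

    def reveal(self):
        return self.__secret       # mangled the same way inside the class body

m = Layer()
print(m.reveal())
print(sorted(vars(m)))             # only the mangled name exists
print(m._Layer__secret)            # still reachable: mangling is not access control
print(hasattr(m, "__secret"))      # the unmangled name is not an attribute
```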
-
dataclass
-
the dataclasses module provides a decorator and functions
- for automatically adding generated special methods to user-defined classes
-
similar to Pascal "record" or C "struct"
- useful for bundling together a few named data items
from dataclasses import dataclass

@dataclass
class Employee:
    name: str
    dept: str
    salary: int

john = Employee('john', 'computer lab', 1000)
john.dept    # 'computer lab'
john.salary  # 1000
-
iterators
-
container objects can be looped over using a for statement
-
the for statement calls iter() on the container object
-
returns an iterator object that defines the method __next__(), callable via the built-in function next()
- which accesses elements in the container one at a time
- when there are no more elements, __next__() raises a StopIteration exception, which tells the for loop to terminate
s = 'abc'
it = iter(s)
it
next(it)   # 'a'
next(it)   # 'b'
next(it)   # 'c'
next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    next(it)
StopIteration
to add iterator behavior to your classes, define an __iter__() method that returns an object with a __next__() method
class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

rev = Reverse('spam')
iter(rev)
for char in rev:
    print(char)
# m
# a
# p
# s
-
generators
-
a tool for creating iterators
-
written like regular functions but use the yield statement
- whenever they want to return data
-
generators are very compact because the __iter__() and __next__() methods are created automatically
-
each time next() is called on it
-
the generator resumes where it left off
-
remembering all the data values and which statement was last executed
- local variables and execution state are saved automatically between calls, which makes a generator easier to write than a class-based iterator that uses instance variables like self.index and self.data
def reverse(data):
    for index in range(len(data)-1, -1, -1):
        yield data[index]

for char in reverse('golf'):
    print(char)
# f
# l
# o
# g
-
some simple generators can be coded succinctly as generator expressions
-
designed to be used right away by an enclosing function
-
tend to be more memory friendly than equivalent list comprehensions
sum(i*i for i in range(10))                 # sum of squares
# 285
xvec = [10, 20, 30]
yvec = [7, 5, 3]
sum(x*y for x,y in zip(xvec, yvec))         # dot product
# 260
unique_words = set(word for line in page for word in line.split())
valedictorian = max((student.gpa, student.name) for student in graduates)
data = 'golf'
list(data[i] for i in range(len(data)-1, -1, -1))
# ['f', 'l', 'o', 'g']
instances
errors, exceptions
-
for syntax errors, the parser repeats the offending line
-
and displays little arrows pointing at the place where the error was detected
- the arrows are not always at the true source of the error, so check the entire line
-
errors detected during execution are called exceptions
- and are not unconditionally fatal
- exceptions come in different types and the type is printed as part of the error message
-
handling exceptions
while True:
    try:
        x = int(input("Please enter a number: "))
        break
    except ValueError:
        print("Oops! That was no valid number. Try again...")
-
the try statement works as follows
- first the try clause is executed
- if no exception occurs, the except clause is skipped and execution of the try statement is finished
- if an exception occurs during execution of the try clause, the rest of the clause is skipped. Then if its type matches the exception named after the except keyword, the except clause is executed, and then execution continues after the try/except block
- if an exception occurs which does not match the exception named in the except clause, it is passed on to outer try statements; if no handler is found, it is an unhandled exception and execution stops with an error message
-
the try statement may have more than one except clause, to specify different handlers for different exceptions
- an except clause may name multiple exceptions as a parenthesized tuple
... except (RuntimeError, TypeError, NameError):
...     pass
-
a class in an except clause matches exceptions
- which are instances of the class itself or one of its derived classes
class B(Exception):   # base class
    pass

class C(B):           # subclass
    pass

class D(C):           # subsubclass
    pass

for cls in [B, C, D]:
    try:
        raise cls()
    except D:
        print("D")
    except C:
        print("C")
    except B:
        print("B")
-
prints out B, C, D in that order
- if the except clauses were reversed (with except B: first), it would have printed B, B, B, because the first matching except clause is triggered
-
the except clause may specify a variable after the exception name
-
the variable is bound to the exception instance, which typically has an args attribute
-
built-in exception types define __str__() to print all arguments without explicitly accessing .args
- __str__() output is printed as the last part ('detail') of the message for unhandled exceptions
try:
    raise Exception('spam', 'eggs')
except Exception as inst:
    print(type(inst))    # the exception type
    print(inst.args)     # arguments stored in .args
    print(inst)          # __str__ allows args to be printed directly,
                         # but may be overridden in exception subclasses
    x, y = inst.args     # unpack args
    print('x =', x)
    print('y =', y)
# <class 'Exception'>
# ('spam', 'eggs')
# ('spam', 'eggs')
# x = spam
# y = eggs
BaseException is the common base class of all exceptions
SystemExit (fatal) which is raised by sys.exit()
- signals an intention to exit the interpreter
KeyboardInterrupt (fatal) which is raised when a user wishes to interrupt the program
Exception is the base class of all the non-fatal exceptions
-
the most common pattern for handling Exception is to print or log the exception
- and then re-raise it, which allows a caller to handle the exception as well
import sys

try:
    f = open('myfile.txt')
    s = f.readline()
    i = int(s.strip())
except OSError as err:
    print("OS error:", err)
except ValueError:
    print("Could not convert data to an integer.")
except Exception as err:
    print(f"Unexpected {err=}, {type(err)=}")
    raise
-
the try … except statement has an optional else clause
- which when present must follow all except clauses
- using the else clause is better than adding additional code to the try clause, because it avoids accidentally catching an exception that was not raised by the code being protected
- exception handlers also handle exceptions that occur inside functions that are called (even indirectly) in the try clause
import sys

for arg in sys.argv[1:]:
    try:
        f = open(arg, 'r')
    except OSError:
        print('cannot open', arg)
    else:
        print(arg, 'has', len(f.readlines()), 'lines')
        f.close()

def this_fails():
    x = 1/0

try:
    this_fails()
except ZeroDivisionError as err:
    print('Handling run-time error:', err)
-
raising exceptions
- the raise statement allows the programmer to force a specified exception to occur
raise NameError('HiThere')

raise ValueError  # shorthand for 'raise ValueError()'

try:
    raise NameError('HiThere')
except NameError:
    print('An exception flew by!')
    raise
-
exception chaining
-
if an unhandled exception occurs inside an except section
- it will have the exception being handled attached to it and included in the error message
try:
    open("database.sqlite")
except OSError:
    raise RuntimeError("unable to handle error")

# exc must be exception instance or None.
raise RuntimeError from exc

def func():
    raise ConnectionError

try:
    func()
except ConnectionError as exc:
    raise RuntimeError('Failed to open database') from exc

try:
    open('database.sqlite')
except OSError:
    raise RuntimeError from None
-
the raise statement allows an optional from clause
- allows disabling automatic exception chaining using the from None idiom
-
user-defined exceptions
-
programs may name their own exceptions by creating a new exception class
- typically derived from Exception class
-
exception classes can be defined which do anything any other class can do
- but are usually simple
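A minimal sketch of a user-defined exception. `ProjectionError`, `check_latitude`, and the `epsg` attribute are hypothetical names chosen for illustration and are not part of any library.

```python
class ProjectionError(Exception):
    """Raised when a coordinate cannot be used (hypothetical example)."""

    def __init__(self, message, epsg=None):
        super().__init__(message)   # keeps the message in .args and str()
        self.epsg = epsg            # extra context for the handler

def check_latitude(lat):
    if not -90 <= lat <= 90:
        raise ProjectionError(f"latitude {lat} is out of range", epsg=4326)
    return lat

caught = None
try:
    check_latitude(123.0)
except ProjectionError as err:
    caught = err                    # keep a reference beyond the except block
    print(type(err).__name__, err, err.epsg)
```

Deriving from Exception means a generic `except Exception` handler higher up would also catch it.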
-
clean-up actions
-
the try statement has another optional clause which is intended to define clean-up actions
- that must be executed under all circumstances
-
if a finally clause is present, the finally clause will execute as the last task
- before the try statement completes
- the finally clause runs whether or not the try statement produces an exception
def divide(x, y):
    try:
        result = x / y
    except ZeroDivisionError:
        print("division by zero!")
    else:
        print("result is", result)
    finally:
        print("executing finally clause")

divide(2, 1)
result is 2.0
executing finally clause
divide(2, 0)
division by zero!
executing finally clause
divide("2", "1")
executing finally clause
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    divide("2", "1")
    ~~~~~~^^^^^^^^^^
  File "<stdin>", line 3, in divide
    result = x / y
             ~~^~~
TypeError: unsupported operand type(s) for /: 'str' and 'str'
-
some objects define standard clean-up actions to be undertaken when the object is no longer needed
-
the with statement allows objects like files to be used in a way that ensures they are always cleaned up promptly and correctly
with open("myfile.txt") as f:
    for line in f:
        print(line, end="")
-
raising and handling multiple exceptions
-
needed in concurrency frameworks, when several tasks may have failed in parallel
- but also in other cases where it is desirable to continue execution and collect multiple errors rather than raise the first exception
-
the built-in ExceptionGroup wraps a list of exception instances so that they can be raised together
-
it is an exception itself, so it can be caught like any other exception
def f():
    excs = [OSError('error 1'), SystemError('error 2')]
    raise ExceptionGroup('there were problems', excs)

f()
  + Exception Group Traceback (most recent call last):
  |   File "<stdin>", line 1, in <module>
  |     f()
  |     ~^^
  |   File "<stdin>", line 3, in f
  |     raise ExceptionGroup('there were problems', excs)
  | ExceptionGroup: there were problems (2 sub-exceptions)
  +-+---------------- 1 ----------------
    | OSError: error 1
    +---------------- 2 ----------------
    | SystemError: error 2
    +------------------------------------

try:
    f()
except Exception as e:
    print(f'caught {type(e)}: e')

caught <class 'ExceptionGroup'>: e
-
using except* instead of except
- we can selectively handle only the exceptions in the group that match a certain type
-
def f():
    raise ExceptionGroup(
        "group1",
        [
            OSError(1),
            SystemError(2),
            ExceptionGroup(
                "group2",
                [
                    OSError(3),
                    RecursionError(4)
                ]
            )
        ]
    )

try:
    f()
except* OSError as e:
    print("There were OSErrors")
except* SystemError as e:
    print("There were SystemErrors")

There were OSErrors
There were SystemErrors
  + Exception Group Traceback (most recent call last):
  |   File "<stdin>", line 2, in <module>
  |     f()
  |     ~^^
  |   File "<stdin>", line 2, in f
  |     raise ExceptionGroup(
  |     ...<12 lines>...
  |     )
  | ExceptionGroup: group1 (1 sub-exception)
  +-+---------------- 1 ----------------
    | ExceptionGroup: group2 (1 sub-exception)
    +-+---------------- 1 ----------------
      | RecursionError: 4
      +------------------------------------
-
enriching exceptions with notes
-
exceptions have a method add_note(note)
- that accepts a string and adds it to the exception's notes list
def f():
    raise OSError('operation failed')

excs = []
for i in range(3):
    try:
        f()
    except Exception as e:
        e.add_note(f'Happened in Iteration {i+1}')
        excs.append(e)

raise ExceptionGroup('We have some problems', excs)
  + Exception Group Traceback (most recent call last):
  |   File "<stdin>", line 1, in <module>
  |     raise ExceptionGroup('We have some problems', excs)
  | ExceptionGroup: We have some problems (3 sub-exceptions)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "<stdin>", line 3, in <module>
    |     f()
    |     ~^^
    |   File "<stdin>", line 2, in f
    |     raise OSError('operation failed')
    | OSError: operation failed
    | Happened in Iteration 1
    +---------------- 2 ----------------
    | Traceback (most recent call last):
    |   File "<stdin>", line 3, in <module>
    |     f()
    |     ~^^
    |   File "<stdin>", line 2, in f
    |     raise OSError('operation failed')
    | OSError: operation failed
    | Happened in Iteration 2
    +---------------- 3 ----------------
    | Traceback (most recent call last):
    |   File "<stdin>", line 3, in <module>
    |     f()
    |     ~^^
    |   File "<stdin>", line 2, in f
    |     raise OSError('operation failed')
    | OSError: operation failed
    | Happened in Iteration 3
    +------------------------------------
Geospatial Data Science with Julia, Dr. Júlio Hoffimann
Style Guide for Julia Code
-
Variable names must begin with a letter (A-Z,a-z), underscore, or a subset of unicode code points greater than 00A0
-
variable names are case-sensitive, and have no semantic meaning
-
Unicode names (UTF-8 encoding) are allowed; in the Julia REPL they can be entered by typing the backslashed LaTeX symbol name followed by tab
-
you can shadow existing exported constants and functions, as long as you don't redefine a built-in constant or function that is already in use
-
variable names that contain only underscores are write-only, and the values assigned are immediately discarded
-
the only explicitly disallowed variable names are the names of built-in keywords
- Names of variables are in lowercase
-
Word separation can be indicated by underscores, but use of underscores is discouraged
- unless the name would be hard to read otherwise
-
Names of `Types` and `Modules` begin with a capital letter
- word separation is shown with upper camel case instead of underscores
- Names of `functions` and `macros` are in lowercase, without underscores
-
Functions that write to their arguments have names that end in `!`.
-
These are called "mutating" or "in-place" functions
- they are intended to produce changes in their arguments after the function is called, not just return a value.
Julia Data Types
-
Julia comes with a rich set of built-in data types
-
These types help Julia manage memory efficiently
-
all values in Julia are true objects having a type that belongs to a single, fully connected type graph
- all nodes of which are equally first-class as types
-
Only values, not variables, have types
- variables are simply names bound to values in Julia
-
Data types in Julia form a single, fully connected type graph
-
At the top is Any
- Below it sit many common types, such as Number, AbstractString, Bool, and Char
-
The three principal types (Abstract, Primitive, Composite)
-
are explicitly declared
-
have names
-
have explicitly declared supertypes
- may have parameters
-
These types are internally represented as instances of the same concept, DataType
-
DataType may be abstract or concrete
- a concrete DataType has a specified size, storage layout, and (optionally) field names
- a composite type is a DataType that has field names or is empty (zero size)
Numeric
Boolean
Character
String
Collections
Abstract
Composite
Parametric
Exercises
-
easy: create a list of tuples, each representing geographic coordinates, meaning latitude (-90<val<90) and longitude (-180<val<180), and calculate the centroid of these coordinates, then create a dictionary to store the centroid's latitude and longitude
-
medium: create a list of tuples, each representing coordinates in 3D cartographic space, meaning latitude and longitude are in radians and height is given in meters, and calculate the local ENU transformation of these coordinates, then create a set to store the 64-bit floating-point values of these coordinates
- hard: calculate the precision loss when converting between geographic coordinates and ENU coordinates by comparing the tuples
-
easy: create a dictionary to store attributes of a geographic feature, and include keys for the name, length, and location of the feature, then add an additional attribute and print the dictionary
- medium
- Write an EPSG converter for WKT1/2, WKB, proj-string, PROJJSON