Coding challenge learning note (2)

Sat, Jun 26, 2021 5-minute read python

This post continues the previous one. Topics covered here are: iterables, dates, regex, reading files, and other miscellaneous items.

🌀 About list, tuple, string

— 1. Tuple ordering and deep comparison in Python.
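A small sketch of what tuple ordering and deep comparison mean in practice (the sample data is made up for illustration): tuples compare element by element from left to right, and nested containers are compared by value.

```python
# Tuples are compared lexicographically: first elements first,
# then second elements only if the first ones are equal, and so on.
assert (1, 2) < (1, 3)
assert (1, 2) < (1, 2, 0)     # a tuple sorts before a longer tuple it is a prefix of
assert (2, 'a') > (1, 'z')    # decided by the first element alone

# Comparison is "deep": nested tuples and lists are compared by value.
assert (1, [2, (3, 4)]) == (1, [2, (3, 4)])

# This makes tuples handy as sort keys (hypothetical sample data).
people = [('Bob', 25), ('alice', 30), ('Carol', 25)]
by_age_then_name = sorted(people, key=lambda p: (p[1], p[0].lower()))
print(by_age_then_name)
# [('Bob', 25), ('Carol', 25), ('alice', 30)]
```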

— 2. string ljust() method.
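For reference, a quick sketch of how ljust() behaves (the strings are arbitrary examples): it pads on the right up to the requested width and leaves longer strings untouched.

```python
# str.ljust(width, fillchar) pads on the right; fillchar defaults to a space.
padded = 'abc'.ljust(6)
dotted = 'abc'.ljust(6, '.')
unchanged = 'abcdefgh'.ljust(6)   # already longer than width: returned as-is

print(repr(padded))     # 'abc   '
print(repr(dotted))     # 'abc...'
print(repr(unchanged))  # 'abcdefgh'
```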

— 3. control the number of splits with the maxsplit argument: string.split(sep, maxsplit=1).
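A short sketch of the difference maxsplit makes, using a made-up key=value line:

```python
line = 'key=value=with=equals'   # hypothetical input

print(line.split('='))               # ['key', 'value', 'with', 'equals']
print(line.split('=', maxsplit=1))   # ['key', 'value=with=equals']

# Handy for unpacking exactly two parts.
key, value = line.split('=', maxsplit=1)

# rsplit does the same, counting splits from the right.
print(line.rsplit('=', maxsplit=1))  # ['key=value=with', 'equals']
```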

— 4. concatenate tuple:

t = ()
for i in range(4):
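    t += (i,)  # the trailing comma makes (i,) a one-element tuple
t  # (0, 1, 2, 3)

t = (5)   # still the int 5: parentheses alone do not make a tuple
t = (5,)  # a one-element tuple
t = [5]   # a list [5]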

— 5. check if an item is iterable

# solution 1: use try/except

# if the for loop raises TypeError, then item is not iterable
try:
    for x in item:
        pass
except TypeError:
    pass

# solution 2: use isinstance()

from collections.abc import Iterable

isinstance(item, Iterable)

# solution 3: use hasattr()
hasattr(item, '__iter__')
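The three checks do not always agree. As a sketch, the hypothetical Countdown class below supports only the older __getitem__ sequence protocol: Python can still loop over it, but the isinstance() and hasattr() checks report it as not iterable. The is_iterable helper is an assumed name for illustration, not a standard function.

```python
from collections.abc import Iterable

class Countdown:
    """Hypothetical class that only supports the old sequence protocol."""
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return 3 - index

def is_iterable(item):
    """EAFP check: ask iter() directly instead of inspecting the type."""
    try:
        iter(item)
    except TypeError:
        return False
    return True

c = Countdown()
print(list(c))                  # [3, 2, 1]
print(is_iterable(c))           # True
print(isinstance(c, Iterable))  # False
print(hasattr(c, '__iter__'))   # False
print(is_iterable(5))           # False
```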

— 6. The dictionary setdefault method will only set a key-value pair if the key isn’t in the dictionary yet. This can be useful if there are duplicate keys and we want the (key, value) pair that appears first.

dic = {}
for key, value in iterable:
    dic.setdefault(key, value)
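With concrete (made-up) data, the difference from plain dictionary construction looks like this:

```python
pairs = [('a', 1), ('b', 2), ('a', 3)]  # hypothetical data with a duplicate key

first_wins = {}
for key, value in pairs:
    first_wins.setdefault(key, value)   # ignored if the key is already present

last_wins = dict(pairs)                 # plain construction keeps the last value

print(first_wins)  # {'a': 1, 'b': 2}
print(last_wins)   # {'a': 3, 'b': 2}
```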

— 7. The shlex.split function splits in a way that is quote-aware.

import shlex

shlex.split('This is a "quoted sentence"')
# ['This', 'is', 'a', 'quoted sentence']

# not quote-aware
'This is a "quoted sentence"'.split()
# ['This', 'is', 'a', '"quoted', 'sentence"']

🌀 About date

— 1. Use the monthcalendar function from the calendar module to get a month as a list of weeks. Each week is a list of seven day numbers, starting on Monday by default, with 0 for days that fall outside the month.

The Calendar class gives an object that represents a calendar in which each week starts on the given weekday.


from datetime import date, timedelta
from calendar import monthcalendar, THURSDAY, Calendar

monthcalendar(2021, 6) #year, month

# [[0, 1, 2, 3, 4, 5, 6],
#  [7, 8, 9, 10, 11, 12, 13],
#  [14, 15, 16, 17, 18, 19, 20],
#  [21, 22, 23, 24, 25, 26, 27],
#  [28, 29, 30, 0, 0, 0, 0]]

thursday_calendar = Calendar(THURSDAY).monthdatescalendar(2021, 6)
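One possible use of these structures, sketched here as an illustration: listing the Thursdays of June 2021, and looking at what the Thursday-first calendar contains.

```python
from calendar import monthcalendar, THURSDAY, Calendar

# Day numbers of every Thursday in June 2021; 0 means "not in this month".
thursdays = [week[THURSDAY] for week in monthcalendar(2021, 6) if week[THURSDAY]]
print(thursdays)        # [3, 10, 17, 24]

# With Calendar(THURSDAY), each week is a list of datetime.date objects
# that starts on a Thursday, including days from adjacent months.
weeks = Calendar(THURSDAY).monthdatescalendar(2021, 6)
print(weeks[0][0])      # 2021-05-27
print(weeks[-1][-1])    # 2021-06-30
```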

🌀 About regex

— 1. Write good docstrings: semantic linefeeds.

Below is an example that replaces the spaces after every sentence-ending character (., ?, !) with a newline character, keeping the punctuation itself.


import re

# use group capturing 
re.sub(r'([.?!])[ ]+', r"\1\n", text)

# OR use positive lookbehind
re.sub(r'(?<=[.?!])[ ]+', "\n", text)
# Look for one or more spaces that directly follow a period, question mark, or exclamation mark. The punctuation characters aren't actually matched here, just the spaces after them, so the replacement is only the newline (there is no group 1 to refer back to).

# if we want to handle a closing double quote right after the punctuation mark
re.sub(r'([.?!]"?)[ ]+', r"\1\n", text)
# "? in the capture group indicates that " is an optional character

# We can't use a lookbehind here because Python's re module requires that lookbehinds match a fixed number of characters, and the optional " makes the width variable.
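To see the patterns in action, here is a sketch on a made-up sample text. Note that the lookbehind variant replaces with a bare newline, since it has no capture group.

```python
import re

text = 'First sentence. Second one?  "Quoted ending!" Last.'  # made-up sample

with_group = re.sub(r'([.?!])[ ]+', r'\1\n', text)
with_lookbehind = re.sub(r'(?<=[.?!])[ ]+', '\n', text)
with_quotes = re.sub(r'([.?!]"?)[ ]+', r'\1\n', text)

# The first two patterns give the same result.
assert with_group == with_lookbehind

print(with_group.splitlines())
# ['First sentence.', 'Second one?', '"Quoted ending!" Last.']
print(with_quotes.splitlines())
# ['First sentence.', 'Second one?', '"Quoted ending!"', 'Last.']
```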

🌀 About read file

— 1. Instead of using the open function in a with block, we can use the pathlib module. The pathlib module’s Path object accepts a filename and creates an object which can be used in a number of ways. The read_text method opens the file, reads the contents, closes the file, and returns the file contents.

from pathlib import Path

Path(file_name).read_text()
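A self-contained sketch of the round trip, using a temporary directory and a hypothetical file name so that it can run anywhere:

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'example.txt'       # hypothetical file name
    path.write_text('line 1\nline 2\n', encoding='utf-8')

    # read_text opens, reads, and closes the file in one call.
    contents = path.read_text(encoding='utf-8')
    lines = contents.splitlines()

print(lines)  # ['line 1', 'line 2']
```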

— 2. Print messages to standard error (sys.stderr) instead of standard output, so that if someone redirects the output to a file, the message does not end up in that file.

import sys

print(message, file=sys.stderr)
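The separation of the two streams can be checked with a small sketch. The report helper is a hypothetical function, and the redirects only stand in for a shell redirection such as `> out.txt`.

```python
import io
import sys
from contextlib import redirect_stderr, redirect_stdout

def report(result, warning):
    """Hypothetical helper: data goes to stdout, diagnostics to stderr."""
    print(result)
    print(warning, file=sys.stderr)

out, err = io.StringIO(), io.StringIO()
with redirect_stdout(out), redirect_stderr(err):
    report('42', 'warning: input was truncated')

print(repr(out.getvalue()))  # '42\n'
print(repr(err.getvalue()))  # 'warning: input was truncated\n'
```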

🌀 Other

— 1. make a namespace by creating a class:


class example:
  attr_a = 's'
  attr_b = 1
  attr_c = [1, 2, 3]

example.attr_a # access the attribute

# if the attributes are numeric
from enum import IntEnum
from itertools import count

class example(IntEnum):
  attr_a = 0
  attr_b = 1
  attr_c = 2

# OR
example = IntEnum('example', 'attr_a attr_b attr_c') # start count from 1

# OR 
example = IntEnum('example', zip(['attr_a', 'attr_b', 'attr_c'], count())) # start count from 0
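A runnable sketch of the functional forms, with made-up enum names. The start= keyword shown on the third line is another way to begin counting from 0.

```python
from enum import IntEnum
from itertools import count

# Hypothetical enums, only for illustration.
Color = IntEnum('Color', 'RED GREEN BLUE')                      # values 1, 2, 3
Level = IntEnum('Level', zip(['LOW', 'MID', 'HIGH'], count()))  # values 0, 1, 2
Zero = IntEnum('Zero', 'A B C', start=0)                        # same effect via start=

print([member.value for member in Color])      # [1, 2, 3]
print([member.value for member in Level])      # [0, 1, 2]
print([member.value for member in Zero])       # [0, 1, 2]
print(Level.HIGH > Level.LOW, Level.MID == 1)  # True True
print(Level(2) is Level.HIGH)                  # True
```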

— 2. Why doesn't Python have switch/case?
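One common substitute, besides a chain of if/elif, is a dictionary that maps each case to a function. The sketch below uses hypothetical command names and handlers to show the idea.

```python
def on_start():
    return 'starting'

def on_stop():
    return 'stopping'

# Hypothetical command table: each "case" is a key, each branch a function.
handlers = {'start': on_start, 'stop': on_stop}

def dispatch(command):
    handler = handlers.get(command)
    if handler is None:            # plays the role of the "default" branch
        return f'unknown command: {command}'
    return handler()

print(dispatch('start'))   # starting
print(dispatch('reboot'))  # unknown command: reboot
```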

— 3. The built-in round function doesn’t always round upward at .5

To always round half up instead:

# solution 1: use ceil and floor
from math import ceil, floor

if value % 1 >= 0.5:
    value = ceil(value)
else:
    value = floor(value)

# solution 2: only correct for non-negative values, because int() truncates toward zero
int(value + 0.5)

# solution 3: 
from decimal import Decimal, ROUND_HALF_UP

value = Decimal(value).quantize(0, ROUND_HALF_UP)
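A sketch comparing the built-in behavior with the Decimal approach. The round_half_up name is an assumed helper for illustration. Note that ROUND_HALF_UP in the decimal module breaks ties away from zero, so -2.5 becomes -3.

```python
from decimal import Decimal, ROUND_HALF_UP

# The built-in round() rounds exact halves to the nearest even integer.
print([round(x) for x in (0.5, 1.5, 2.5, 3.5)])          # [0, 2, 2, 4]

def round_half_up(value):
    """Round to the nearest integer, with exact halves going away from zero."""
    return int(Decimal(value).quantize(Decimal('1'), rounding=ROUND_HALF_UP))

print([round_half_up(x) for x in (0.5, 1.5, 2.5, 3.5)])  # [1, 2, 3, 4]
print(round_half_up(-2.5))                               # -3
```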

— 4. programming style: EAFP (easier to ask forgiveness than permission), LBYL (look before you leap), duck typing
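The three ideas can be contrasted in a few lines. This is only a sketch: the config dictionary and the first_line helper are made up for illustration.

```python
import io

config = {'host': 'localhost'}   # hypothetical settings dict

# LBYL: check first, then act.
if 'port' in config:
    port_lbyl = config['port']
else:
    port_lbyl = 8080

# EAFP: just try it, and handle the failure.
try:
    port_eafp = config['port']
except KeyError:
    port_eafp = 8080

# Duck typing: accept anything that has a .read() method, whatever its type.
def first_line(readable):
    return readable.read().splitlines()[0]

print(port_lbyl, port_eafp)                      # 8080 8080
print(first_line(io.StringIO('hello\nworld')))   # hello
```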

— 5. a better way to detect whether an optional function argument was passed: use a unique object() as the default


default_value = object()
# default_value = None  # warning: with None as the sentinel, callers could not pass None as a real value

def function(arg, default=default_value):

    if default is not default_value:
        pass  # the caller passed a value explicitly (None included)
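A fuller sketch of the sentinel pattern. The first_item function and the _MISSING name are hypothetical, chosen only to show that None remains a legitimate value for the caller to pass.

```python
_MISSING = object()   # hypothetical name for the sentinel

def first_item(iterable, default=_MISSING):
    """Return the first item, or default if given, else raise ValueError."""
    for item in iterable:
        return item
    if default is _MISSING:
        raise ValueError('iterable is empty and no default was given')
    return default

print(first_item([1, 2, 3]))          # 1
print(first_item([], default=None))   # None
```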

— 6. The functions below are two different ways to flatten an iterable of iterables (example taken from Python Morsels). The first one uses recursion, and the second uses a stack of iterators. It took me a long time to figure out the logic of the second function. The key is to keep in mind that an iterator is lazy and can be looped over only once, so a new loop over the same iterator continues where the previous one stopped. I found this example very instructive, so I am including it here.


from collections.abc import Iterable

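# use recursion (note: a str inside the input would recurse until RecursionError, because each character is itself an iterable str; the stack version below guards against this)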
def deep_flatten(iterable):
    for item in iterable:
        if isinstance(item, Iterable):
            yield from deep_flatten(item)
        else:
            yield item

# use stack
def deep_flatten(iterable):
    iterators = [iter(iterable)]  # the tricky part: the stack holds iterators, not the iterables themselves
    while iterators:
        for item in iterators[-1]:  # resumes where this iterator left off
            if (isinstance(item, Iterable)
                    and not isinstance(item, (str, bytes))):
                iterators.append(iter(item))  # descend into the nested iterable
                break
            else:
                yield item
        else:  # no break: the top iterator is exhausted
            iterators.pop()
            
# example
t = [1, (2, 3)]
list(deep_flatten(t))  # deep_flatten returns a generator, so collect it into a list
# [1, 2, 3]
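The stack version depends on the fact that an iterator remembers its position. A minimal sketch of that behavior on its own, with arbitrary numbers:

```python
numbers = iter([1, 2, 3, 4, 5])

first_pass = []
for n in numbers:
    if n == 3:
        break              # stop early; 1, 2 and 3 are now consumed
    first_pass.append(n)

second_pass = list(numbers)   # a new loop resumes after the 3
third_pass = list(numbers)    # the iterator is now exhausted

print(first_pass, second_pass, third_pass)   # [1, 2] [4, 5] []
```

This is why breaking out of the inner for loop and later looping over iterators[-1] again does not repeat any items.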

Xiang Li