upper
lower
capitalize
title
swapcase
Examples
>>> fox = "tHe qUICk bROWn fOx"
>>> fox.upper()
'THE QUICK BROWN FOX'
>>> fox.title()
'The Quick Brown Fox'
>>> fox.swapcase()
'ThE QuicK BrowN FoX'
strip
rstrip
lstrip
center
ljust
rjust
Example:
>>> line = ' stuff '
>>> line.strip()
'stuff'
The strip methods can also remove arbitrary characters:
>>> num = "000000000042"
>>> num.strip('0')
'42'
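As a brief sketch of the same idea: the argument is treated as a set of characters rather than a literal prefix or suffix, and lstrip and rstrip apply it to one side only. The sample strings here are illustrative and not from the original examples.

```python
num = "000000000042000"

# lstrip and rstrip remove the given characters from one side only
left_stripped = num.lstrip('0')    # '42000'
right_stripped = num.rstrip('0')   # '000000000042'

# the argument is a set of characters, not a literal prefix or suffix
word = 'xyxhelloyyx'.strip('xy')   # 'hello'

print(left_stripped, right_stripped, word)
```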
Examples:
>>> line = 'stuff'
>>> line.center(20)
'       stuff        '
>>> line.ljust(20)
'stuff               '
>>> '42'.rjust(10, '0')
'0000000042'
find
rfind
index
rindex
endswith
startswith
replace
Examples:
>>> line = 'Hello world'
>>> line.find('world')
6
>>> line.rfind('o')
7
>>> line.startswith('He')
True
>>> line.replace('o', '---')
'Hell--- w---rld'
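The find and index methods differ only in how they report a missing substring. A minimal sketch of that difference, reusing the line string from the examples (the substring 'xyz' and the variable names are illustrative):

```python
line = 'Hello world'

# find returns -1 when the substring is absent
missing_find = line.find('xyz')

# index raises ValueError in the same situation
try:
    line.index('xyz')
    index_raised = False
except ValueError:
    index_raised = True

# endswith (like startswith) also accepts a tuple of candidates
ends = line.endswith(('world', 'there'))

print(missing_find, index_raised, ends)
```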
partition
rpartition
split
splitlines
join
Examples:
>>> line = 'The quick brown fox jumped over a lazy dog'
>>> line.partition('fox')
('The quick brown ', 'fox', ' jumped over a lazy dog')
>>> line.split()
['The', 'quick', 'brown', 'fox', 'jumped', 'over', 'a', 'lazy', 'dog']
>>> '--'.join(['1', '2', '3'])
'1--2--3'
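The remaining methods in this group, rpartition and splitlines, behave analogously. A short sketch with made-up sample strings:

```python
# rpartition splits around the LAST occurrence of the separator
path = 'archive.tar.gz'
head, sep, tail = path.rpartition('.')   # ('archive.tar', '.', 'gz')

# splitlines splits on line boundaries and drops the newline characters
text = 'first\nsecond\nthird'
lines = text.splitlines()                # ['first', 'second', 'third']

# join is the inverse operation
rejoined = '\n'.join(lines)

print(head, tail, lines)
```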
format Basics
The format method requires that the string have format fields indicated with curly braces.
The format fields are replaced with objects passed into the format method.
Example:
>>> '{} and {}'.format('Alice', 'Bob')
'Alice and Bob'
format Arguments
The format fields can be replaced by position or keyword
Position example:
>>> '{1} and {0}'.format('Alice', 'Bob')
'Bob and Alice'
Keyword example:
>>> '{a} and {b}'.format(a='Alice', b='Bob')
'Alice and Bob'
Mixed example:
>>> '{b} and {0}'.format('Alice', b='Bob')
'Bob and Alice'
format
>>> for x in range(1, 11):
...     print('{:2} {:3} {:4}'.format(x, x**2, x**3))
...
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000
format Specifiers
An optional ':' and format specifier can follow the format field name
Format specifiers enable greater control over formatted values
Syntax (all are optional): {:[1][2][3][4][5][6][7][8]}
1. fill and alignment
2. sign
3. '#' – alternate number form
4. '0' – sign aware zero padding
5. width
6. grouping
7. '.' followed by integer – precision
8. type
format Specifiers: Width Option
The width option is a positive integer that specifies the minimum field width; shorter values are padded
If the width is smaller than the value, the value is left unchanged (it is never truncated)
Example:
>>> '{:10}'.format('hello')
'hello     '
>>> '{:2}'.format('hello')
'hello'
format Specifiers: Fill and Alignment
Alignment options:
'<' : left align
'>' : right align
'=' : pad after sign (if any) but before digits
'^' : center align
A fill character can optionally be specified before the alignment character
Example:
>>> '{:^11}'.format('hello')
'   hello   '
>>> '{:>11}'.format('hello')
'      hello'
>>> '{:*>11}'.format('hello')
'******hello'
>>> '{:-^11}'.format('hello')
'---hello---'
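The '=' alignment is not covered by the examples above. It applies only to numbers, so a small sketch with illustrative values may help:

```python
# '=' places the padding between the sign and the digits
padded = '{:=8}'.format(-42)        # '-     42'

# with an explicit fill character of '0'
zero_filled = '{:0=8}'.format(-42)  # '-0000042'

# combined with the '+' sign option and a '*' fill
starred = '{:*=+8}'.format(42)      # '+*****42'

print(padded, zero_filled, starred)
```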
format Specifiers: Sign Option
The sign option is only valid for number types
'+' : sign should be used for both positive and negative numbers
'-' : sign should be used for only negative numbers (default)
' ' (space): leading space for positive numbers and minus sign for negative numbers
Examples
>>> '{:+}'.format(123)
'+123'
>>> '{: }'.format(123)
' 123'
>>> '{:+}'.format(1.414)
'+1.414'
format Specifiers: #
The # option causes a type specific "alternate form" to be used for the conversion
The # can only be used for integer, float, complex, and Decimal types
Example
>>> # print 123 in binary
>>> '{:b}'.format(123)
'1111011'
>>> '{:#b}'.format(123)
'0b1111011'
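The same option works with the other integer bases. A minimal sketch, using 255 as an arbitrary sample value:

```python
value = 255

plain_hex = '{:x}'.format(value)    # 'ff'
alt_hex = '{:#x}'.format(value)     # '0xff'
alt_upper = '{:#X}'.format(value)   # '0XFF'
alt_octal = '{:#o}'.format(value)   # '0o377'

print(plain_hex, alt_hex, alt_upper, alt_octal)
```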
format Specifiers: 0
The 0 (preceding the width option) enables sign aware zero-padding for numeric types
Example
>>> '{:010}'.format(123)
'0000000123'
>>> '{:010}'.format(1.414)
'000001.414'
format Specifiers: Grouping Option
The grouping option specifies a character for separating thousands in numbers
The grouping option can be either '_' or ','
Example
>>> '{:_}'.format(1000000)
'1_000_000'
>>> '{:,}'.format(1000000)
'1,000,000'
format Specifiers: Precision Option
The precision option specifies how many digits to be displayed:
after the decimal point for fixed point floating point values
before and after the decimal point (significant digits) for general floating point values
Example
>>> import math
>>> x = math.sqrt(2)
>>> '{:.2}'.format(x)
'1.4'
>>> '{:.2f}'.format(x)
'1.41'
>>> '{:.6}'.format(x)
'1.41421'
>>> '{:.6f}'.format(x)
'1.414214'
format Specifiers: Type Option
String option:
's' : string
Common integer options:
'b' : binary
'c' : character
'd' : decimal
'o' : octal
'x' : hex (lower case)
'X' : hex (upper case)
format Specifiers: Type Option (Continued)
Common float options:
'e' : exponent notation
'E' : exponent notation (upper case E)
'f' : fixed point
'F' : fixed point (upper case NAN and INF)
'g' : general format
'%' : percent (multiply by 100)
format Specifiers: Type Option
Integer examples
>>> '{:d}'.format(123)
'123'
>>> '{:b}'.format(123)
'1111011'
>>> '{:X}'.format(123)
'7B'
Floating point examples
>>> '{:f}'.format(1.414)
'1.414000'
>>> '{:E}'.format(1.414)
'1.414000E+00'
>>> '{:%}'.format(1.414)
'141.400000%'
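The options can be combined in a single specifier, provided they appear in the order given by the syntax (fill and alignment, sign, #, 0, width, grouping, precision, type). A brief sketch with made-up values:

```python
amount = 1234567.891

# right align, width 15, ',' grouping, 2 digits of precision, fixed point
money = '{:>15,.2f}'.format(amount)      # '   1,234,567.89'

# explicit sign, grouping, 1 digit of precision
signed = '{:+,.1f}'.format(-9876.54)     # '-9,876.5'

# zero padding to width 8 with 3 digits of precision
padded = '{:08.3f}'.format(3.14159)      # '0003.142'

# percent type with 1 digit of precision
percent = '{:.1%}'.format(0.256)         # '25.6%'

print(money, signed, padded, percent)
```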
Regular expressions are a means of flexible pattern matching in strings
The Python interface to regular expressions is the re module
re methods:
compile
split
match
search
sub
Example
>>> import re
>>> regex = re.compile(r'\s+')
>>> line = 'the quick brow fox jumped over a lazy dog'
>>> regex.split(line)
['the', 'quick', 'brow', 'fox', 'jumped', 'over', 'a', 'lazy', 'dog']
The pattern \s+ matches one or more whitespace characters
Example: comparing string methods
>>> line.index('fox')
15
>>> regex = re.compile('fox')
>>> match = regex.search(line)
>>> match.start()
15
>>> line.replace('fox', 'BEAR')
'the quick brow BEAR jumped over a lazy dog'
>>> regex.sub('BEAR', line)
'the quick brow BEAR jumped over a lazy dog'
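The match method is listed above but not demonstrated. A minimal sketch of how it differs from search, reusing the same line: match only succeeds at the beginning of the string, while search scans the whole string.

```python
import re

line = 'the quick brow fox jumped over a lazy dog'
regex = re.compile('fox')

# match only tries the start of the string, so 'fox' is not found
at_start = regex.match(line)            # None

# search scans forward until it finds a match
anywhere = regex.search(line)           # match object starting at 15

# a pattern that does occur at the start succeeds with match
first_word = re.compile('the').match(line)

print(at_start, anywhere.start(), first_word.group())
```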
Simple strings are matched exactly
>>> regex = re.compile('ion')
>>> regex.findall('Great Expectations')
['ion']
Some characters have special meanings:
. ^ $ * + ? { } [ ] \ | ( )
These characters must be escaped (with a backslash \) to match them
Example: the 'r' prefix indicates a raw string
>>> regex = re.compile(r'\$')
>>> regex.findall('the cost is $20')
['$']
Example: strings versus raw strings
>>> print('a\tb\tc')
a       b       c
>>> print(r'a\tb\tc')
a\tb\tc
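When a literal string contains several special characters, escaping each one by hand is error prone. One option, sketched here with made-up sample text, is the standard library function re.escape, which adds the backslashes automatically:

```python
import re

literal = '$20.00'

# re.escape backslash-escapes the special characters in the string
pattern = re.compile(re.escape(literal))

# unescaped, '.' would match any character and '$' would anchor at the end
found = pattern.findall('pay $20.00 now, not $20x00')

print(found)
```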
The backslash can be used to give normal characters special meaning
character | description |
---|---|
"\d" | match any digit |
"\D" | match any non-digit |
"\s" | match any whitespace |
"\S" | match any non-whitespace |
"\w" | match any alphanumeric char |
"\W" | match any non-alphanumeric char |
Example
>>> regex = re.compile(r'\w\s\w')
>>> regex.findall('the fox is 9 years old')
['e f', 'x i', 's 9', 's o']
The square brackets specify a set of characters
>>> regex = re.compile('[aeiou]')
>>> regex.split('consequential')
['c', 'ns', 'q', '', 'nt', '', 'l']
The dash can be used to specify a range
>>> regex = re.compile('[A-Z][0-9]')
>>> regex.findall('1043879, G2, H6')
['G2', 'H6']
Curly braces with a number specify repetition
>>> regex = re.compile(r'\w{3}')
>>> regex.findall('The quick brown fox')
['The', 'qui', 'bro', 'fox']
character | description |
---|---|
? | match zero or one |
* | match zero or more |
+ | match one or more |
{n} | match n repetitions |
{m,n} | match between m and n repetitions |
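A short sketch of the repetition characters in action; the sample strings are illustrative:

```python
import re

text = 'color colour colouur'

# ? : the 'u' may appear zero or one time
zero_or_one = re.findall(r'colou?r', text)    # ['color', 'colour']

# * : the 'u' may appear zero or more times
zero_or_more = re.findall(r'colou*r', text)   # ['color', 'colour', 'colouur']

# + : the 'u' must appear at least once
one_or_more = re.findall(r'colou+r', text)    # ['colour', 'colouur']

# {m,n} : between 2 and 3 digits, matched greedily
digits = re.findall(r'\d{2,3}', '7 42 365 2024')   # ['42', '365', '202']

print(zero_or_one, zero_or_more, one_or_more, digits)
```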
Example: email address matcher
>>> email = re.compile(r'[\w.]+@\w+\.[a-z]{3}')
"[\w.]+" : one or more alphanumeric characters or periods
"@" : at sign
"\w+" : one or more alphanumeric characters
"\." : period
"[a-z]{3}" : exactly three lower case characters
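A usage sketch for the matcher, with made-up addresses. It also shows one limitation of this simple pattern: a domain that does not end in exactly three lower case letters after a single period is not matched.

```python
import re

email = re.compile(r'[\w.]+@\w+\.[a-z]{3}')

text = 'Write to guido@python.org or first.last@example.com today.'
found = email.findall(text)

# the pattern requires "name.xyz" after the @, so this address is missed
missed = email.findall('user@example.co.uk')

print(found, missed)
```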