Basic Pandas Types

Pandas

Pandas Series

Pandas Series Example

>>> import numpy as np
>>> import pandas as pd
>>> data = pd.Series([0.25, 0.5, 0.75, 1.0])
>>> data
0    0.25
1    0.50
2    0.75
3    1.00
dtype: float64
>>> data[1]
0.5
>>> data[1:3]
1    0.50
2    0.75
dtype: float64

Pandas Series with Index Example

>>> data = pd.Series([0.25, 0.5, 0.75, 1.0],
           index = ['a', 'b', 'c', 'd'])
>>> data
a    0.25
b    0.50
c    0.75
d    1.00
dtype: float64
>>> data['b']
0.5
>>> data['b':'d']
b    0.50
c    0.75
d    1.00
dtype: float64

Pandas Series Attributes
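
A minimal sketch of a few commonly used Series attributes, assuming a labeled Series like the one in the previous example. The selection of attributes is illustrative, not exhaustive.

```python
import pandas as pd

# Same labeled Series as in the index example above.
data = pd.Series([0.25, 0.5, 0.75, 1.0], index=['a', 'b', 'c', 'd'])

values = data.values   # the underlying NumPy array
index = data.index     # the Index object holding the labels
dtype = data.dtype     # element type shared by all values
shape = data.shape     # a 1-tuple, since a Series is one-dimensional
size = data.size       # number of elements
```

The `values` and `index` attributes together describe the whole Series: one holds the data, the other the labels attached to it.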

Pandas DataFrame Object

Pandas DataFrame Construction Example

>>> df = pd.DataFrame([[2,4,6], [1,3,5]])
>>> df
   0  1  2
0  2  4  6
1  1  3  5
>>> df.index
RangeIndex(start=0, stop=2, step=1)
>>> df.columns
RangeIndex(start=0, stop=3, step=1)

Pandas DataFrame Construction Examples

>>> pd.DataFrame(np.ones((3,2)),
                 columns=['one', 'two'],
                 index=['a', 'b', 'c'])
   one  two
a  1.0  1.0
b  1.0  1.0
c  1.0  1.0
>>> pd.DataFrame([{'a': i, 'b': 2 * i}
                  for i in range(3)])
   a  b
0  0  0
1  1  2
2  2  4

Adding/Removing Columns from a DataFrame
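
A minimal sketch of common ways to add and remove columns. The column names ('a' through 'd') and the derived values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2], 'b': [0, 2, 4]})

# Add a column by assignment, dictionary-style.
df['c'] = df['a'] + df['b']

# assign() returns a new DataFrame with the extra column.
df = df.assign(d=df['c'] * 2)

# drop() returns a copy without the named columns; df is unchanged.
without_b = df.drop(columns=['b'])

# del removes a column in place.
del df['a']

# pop() removes a column in place and returns it as a Series.
popped = df.pop('d')
```

Assignment, `del`, and `pop` modify the DataFrame in place, while `assign` and `drop` return new objects by default.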

Pandas Index Object
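
A minimal sketch of Index behavior, assuming two small integer indexes invented for illustration. It shows that an Index can be read like an array, supports set-style operations, and rejects item assignment.

```python
import pandas as pd

ind_a = pd.Index([1, 3, 5, 7, 9])
ind_b = pd.Index([2, 3, 5, 7, 11])

# Array-like access: scalar indexing and slicing.
second = ind_a[1]
every_other = ind_a[::2]

# Set-like operations between two indexes.
common = ind_a.intersection(ind_b)
combined = ind_a.union(ind_b)

# An Index is immutable: item assignment raises TypeError.
try:
    ind_a[1] = 0
    mutable = True
except TypeError:
    mutable = False
```

Immutability makes it safe to share one Index between several Series and DataFrames.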

Pandas Indexers

Pandas Indexer Examples

>>> data
        one       two
a  0.495141  0.965454
b  0.673145  0.246473
c  0.716398  0.730835

>>> data.loc[:'b', :'one']
        one
a  0.495141
b  0.673145

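# equivalent to the above
>>> data.iloc[:2, :1]
# .ix mixed label and integer indexing; it was deprecated in
# pandas 0.20 and removed in 1.0. The same mix using .loc:
>>> data.loc[data.index[:2], :'one']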
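
A short sketch of why the explicit indexers matter, assuming a Series whose labels are themselves integers (the values and labels here are invented). With such an index, `loc` always refers to labels and `iloc` always refers to positions.

```python
import pandas as pd

# Integer labels that do not match the integer positions.
s = pd.Series(['a', 'b', 'c'], index=[1, 3, 5])

by_label = s.loc[1]       # label 1, the first element
by_position = s.iloc[1]   # position 1, the second element

# Label slices include both endpoints: labels 1 and 3.
label_slice = s.loc[1:3]

# Position slices exclude the stop: positions 1 and 2.
position_slice = s.iloc[1:3]
```

Using `loc` and `iloc` explicitly avoids having to remember which convention plain `[]` indexing applies in each case.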

Summary of Selection on DataFrames

Operation                        Syntax          Result Type
select a column                  df[col]         Series
select row by label              df.loc[label]   Series
select row by integer location   df.iloc[loc]    Series
slice rows                       df[5:10]        DataFrame
select rows by boolean vector    df[bool_vec]    DataFrame
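
The table can be checked with a small sketch. The DataFrame below, with its labels and values, is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape((4, 3)),
                  columns=['one', 'two', 'three'],
                  index=['a', 'b', 'c', 'd'])

column = df['one']                # select a column -> Series
row_by_label = df.loc['b']        # select row by label -> Series
row_by_position = df.iloc[1]      # select row by integer location -> Series
row_slice = df[1:3]               # slice rows -> DataFrame
bool_rows = df[df['one'] > 3]     # select rows by boolean vector -> DataFrame
```

Selecting a single row or column reduces the dimension and yields a Series; slices and boolean masks keep the result two-dimensional.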

Pandas and UFuncs

Pandas and UFuncs Examples

>>> A = pd.DataFrame(
          np.arange(4).reshape((2,2)),
          columns=['one', 'two'])
>>> B = pd.DataFrame(
          np.arange(9).reshape((3,3)),
          columns=['three', 'two', 'one'])
>>> A + B
   one  three  two
0  2.0    NaN  2.0
1  7.0    NaN  7.0
2  NaN    NaN  NaN
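
A sketch of two related behaviors, assuming a 2x2 frame and a 3x3 frame shaped like `A` and `B` in the example above: a NumPy ufunc applied to a pandas object keeps its labels, and the `add` method accepts a `fill_value` to stand in for entries missing from one operand.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame(np.arange(4).reshape((2, 2)),
                 columns=['one', 'two'])
B = pd.DataFrame(np.arange(9).reshape((3, 3)),
                 columns=['three', 'two', 'one'])

# A unary ufunc preserves the index and columns of its input.
roots = np.sqrt(B)

# The + operator aligns on both axes and leaves NaN where
# either operand has no entry.
aligned = A + B

# add() with fill_value treats an entry missing from one operand as 0.
filled = A.add(B, fill_value=0)
```

Here `aligned` contains NaN wherever `A` has no matching row or column, while `filled` uses 0 in place of the missing `A` entries.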

Missing Data
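
A minimal sketch of basic missing-data handling, assuming a small float Series invented for illustration. In a float Series, both `np.nan` and `None` are stored as NaN.

```python
import numpy as np
import pandas as pd

data = pd.Series([1.0, np.nan, 3.0, None])

mask = data.isna()        # boolean mask marking missing entries
dropped = data.dropna()   # copy with missing entries removed
filled = data.fillna(0)   # copy with missing entries replaced by 0
total = data.sum()        # aggregations skip NaN by default
```

`dropna` and `fillna` return new objects and leave `data` unchanged.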