Series

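Pandas has historically offered three main data structures: Series, DataFrame, and Panel. Panel was deprecated in pandas 0.20 and has since been removed, so current code works with Series and DataFrame. These data structures are built on top of NumPy, which makes them fast.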

Dimension and Description

A useful way to think of these data structures is that each higher-dimensional structure is a container for the lower-dimensional one: a DataFrame is a container of Series, and a Panel is a container of DataFrames.

Building and handling arrays of two or more dimensions is tedious, because the burden is placed on the user to consider the orientation of the data set when writing functions. Pandas data structures reduce that mental effort.

For example, with tabular data (DataFrame) it is more semantically helpful to think of the index (the rows) and the columns rather than axis 0 and axis 1.
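As a rough illustration of thinking in terms of rows and columns rather than axis numbers, the sketch below sums a small DataFrame along each axis using the axis names. The data and the labels "r1", "r2", "a", and "b" are made up for this example.

```python
import pandas as pd

# Illustrative data; the row and column labels are assumptions for this sketch.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["r1", "r2"])

# Collapse the rows (axis 0): one total per column.
col_totals = df.sum(axis="index")      # same as axis=0

# Collapse the columns (axis 1): one total per row.
row_totals = df.sum(axis="columns")    # same as axis=1

print(col_totals.to_dict())  # {'a': 3, 'b': 7}
print(row_totals.to_dict())  # {'r1': 4, 'r2': 6}
```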

A Series is a one-dimensional labeled array capable of holding any data type (strings, integers, floats, Python objects, etc.). The axis labels are collectively referred to as the index. A Series holds homogeneous data: all of its values share a single data type, for example a Series of integers.

  • A Series holds homogeneous data
  • Its size is immutable
  • Its values are mutable
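A minimal sketch of two of the properties listed above, assuming a small Series of integers (the values are illustrative): the whole Series shares one dtype, and a value can be overwritten in place while the length stays the same.

```python
import pandas as pd

s = pd.Series([10, 20, 30])

# All values share a single dtype (an integer dtype such as int64).
print(s.dtype)

# Values are mutable: overwrite the element at position 1 in place.
s.iloc[1] = 99

print(s.tolist())  # [10, 99, 30]
print(len(s))      # 3
```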

 

The basic way to create a Series is to call the Series constructor of the pandas module.

pd.Series(data, index, dtype, copy)

  • data: the values, which can take various forms such as an ndarray, a list, a dictionary, or a scalar constant.
  • index: the axis labels. Index values must be hashable and the index must be the same length as data. Defaults to 0, 1, ..., n-1 (equivalent to np.arange(n)) if no index is passed.
  • dtype: the data type of the Series. If nothing is passed, the data type is inferred from data.
  • copy: whether to copy the input data. Defaults to False.

Note: Labels of a Series need not be unique, but they must be of a hashable type.
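The parameters above can be sketched as follows. The variable names and sample values are illustrative only. The example passes each parameter by keyword, shows the default integer index, and shows that duplicate labels are accepted.

```python
import numpy as np
import pandas as pd

values = np.array([1.5, 2.5, 3.5])

# Explicit labels and dtype; copy=True makes the Series independent of `values`.
labeled = pd.Series(values, index=["a", "b", "c"], dtype="float64", copy=True)
values[0] = 100.0      # does not affect `labeled`, because the data was copied
print(labeled["a"])    # 1.5

# With no index, the labels default to 0, 1, ..., n-1.
default_index = pd.Series([7, 8, 9])
print(list(default_index.index))   # [0, 1, 2]

# Labels need not be unique; selecting a repeated label returns every match.
duplicated = pd.Series([1, 2, 3], index=["x", "x", "y"])
print(duplicated["x"].tolist())    # [1, 2]
```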


Here data can be:

  • a Python dictionary
  • an ndarray
  • a scalar value, such as 5

The index is the list of axis labels. When data is an ndarray or a list, the length of the index must equal the length of the data.

The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been overridden to automatically exclude missing data (currently represented as NaN).
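The following sketch, using made-up values, shows label-based and integer-based access side by side, and how the statistical methods skip NaN while the equivalent plain ndarray computation does not.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0], index=["a", "b", "c"])

# Label-based and integer (position-based) access.
print(s.loc["c"])   # 3.0
print(s.iloc[0])    # 1.0

# Statistical methods skip missing values by default.
print(s.sum())      # 4.0
print(s.mean())     # 2.0
print(s.count())    # 2 (number of non-missing values)

# The same mean on the underlying ndarray propagates NaN instead.
print(np.isnan(s.to_numpy().mean()))  # True
```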

Series from Dict

A Series can also be created from a dictionary. In this case, the keys of the dictionary become the row labels (the index).
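A minimal sketch, assuming a small dictionary of made-up prices: the keys become the index and the values become the data.

```python
import pandas as pd

# Illustrative dictionary; keys become labels, values become data.
prices = {"apple": 3, "banana": 1, "cherry": 7}
from_dict = pd.Series(prices)

print(list(from_dict.index))  # ['apple', 'banana', 'cherry']
print(from_dict.tolist())     # [3, 1, 7]
print(from_dict["banana"])    # 1
```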


If no index is provided, the Series keeps the order of the dictionary. If an index is passed and its order differs from the dictionary's, the resulting Series follows the order of the provided index.
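To illustrate the ordering rule, the sketch below builds a Series from the same made-up dictionary twice, once without an index and once with the labels in a different order.

```python
import pandas as pd

prices = {"apple": 3, "banana": 1, "cherry": 7}

# No index: the Series follows the dictionary's own order.
dict_order = pd.Series(prices)
print(list(dict_order.index))   # ['apple', 'banana', 'cherry']

# Explicit index: the Series follows the order of the index that was passed.
custom_order = pd.Series(prices, index=["cherry", "apple", "banana"])
print(list(custom_order.index)) # ['cherry', 'apple', 'banana']
print(custom_order.tolist())    # [7, 3, 1]
```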


If no data is provided, the Series fills every index label with NaN (Not a Number). Likewise, any label in the index that has no matching key in the dictionary gets NaN.
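Both cases can be sketched as follows, with illustrative labels. The dtype is passed explicitly in the first case because the inferred dtype of a data-less Series has differed between pandas versions.

```python
import pandas as pd

# No data at all: every label is NaN. The explicit dtype is a choice made for this sketch.
no_data = pd.Series(index=["a", "b", "c"], dtype="float64")
print(no_data.isna().tolist())   # [True, True, True]

# A label ("z") with no matching dictionary key is filled with NaN.
partial = pd.Series({"a": 1, "b": 2}, index=["a", "b", "z"])
print(partial.isna().tolist())   # [False, False, True]
```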


Series From Scalar Value

A Series can also be created from a scalar value. If data is a scalar, that value is broadcast to every label in the index.
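A short sketch of scalar broadcasting, with an illustrative value and labels:

```python
import pandas as pd

# The scalar 5 is repeated once for every label in the index.
constant = pd.Series(5, index=["a", "b", "c"])

print(constant.tolist())      # [5, 5, 5]
print(list(constant.index))   # ['a', 'b', 'c']
```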


Indexing and Slicing on Series

Just as with an ndarray, we can index and slice a Series.
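The sketch below shows a few common patterns on a made-up Series. It uses .iloc for positions and .loc for labels to keep the two kinds of access explicit; note that a label slice includes its end point, while a position slice does not.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=["a", "b", "c", "d", "e"])

print(s.iloc[0])                 # 10, first element by position
print(s.iloc[1:3].tolist())      # [20, 30], positions 1 and 2 (end excluded)
print(s.iloc[-2:].tolist())      # [40, 50], last two elements
print(s.loc["b":"d"].tolist())   # [20, 30, 40], label slice (end included)
print(s[s > 25].tolist())        # [30, 40, 50], boolean mask, as with an ndarray
```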


Series is Dict Like

A Series is like a fixed-size dict in that you can get and set values by index label:
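A minimal sketch of the dict-like behavior, using illustrative labels: reading and writing by label, membership tests against the index, and get() with a fallback for a missing label.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

print(s["a"])          # 1, get a value by label
s["b"] = 20            # set a value by label
print(s.tolist())      # [1, 20, 3]

# Membership tests check the index labels, as with dict keys.
print("c" in s)        # True
print("z" in s)        # False

# get() returns a default instead of raising KeyError for a missing label.
print(s.get("z"))      # None
print(s.get("z", 0))   # 0
```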


Vectorized Operation On Series

Arithmetic operations can be applied to a whole Series at once, so there is no need to loop over the elements one by one.
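For instance, with a small made-up Series, each of the following operations acts on every element without an explicit loop. NumPy functions such as np.sqrt can also be applied directly.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 4, 9, 16])

doubled = s * 2         # scalar applied to every element
summed = s + s          # element-wise addition
squared = s ** 2        # element-wise power
roots = np.sqrt(s)      # NumPy ufuncs return a Series

print(doubled.tolist())  # [2, 8, 18, 32]
print(summed.tolist())   # [2, 8, 18, 32]
print(squared.tolist())  # [1, 16, 81, 256]
print(roots.tolist())    # [1.0, 2.0, 3.0, 4.0]
```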


A key difference between Series and ndarray is that operations between Series automatically align the data based on the label. Thus, you can write computations without considering whether the Series involved have the same labels.

Operations between Series (+, -, /, *) align values based on their associated index values, so the operands need not be the same length. The resulting index will be the sorted union of the two indexes.
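The alignment rule can be sketched with two made-up Series of different lengths that share only the label "y". Labels present in only one operand produce NaN. The add() method with fill_value is shown as one way to treat a missing side as 0 instead.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20], index=["y", "w"])

# Values are matched by label, not by position.
total = a + b
print(total["y"])                # 12.0 (2 + 10)
print(int(total.isna().sum()))   # 3, since "w", "x", "z" appear in only one operand

# Treat a label missing from one side as 0 rather than producing NaN.
filled = a.add(b, fill_value=0)
print(filled["w"], filled["x"])  # 20.0 1.0
```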

Naming a Series

A Series can also have a name attribute.
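A brief sketch with illustrative names: the name can be set at construction, changed with rename() (which returns a new Series), and it becomes the column label when the Series is converted to a DataFrame.

```python
import pandas as pd

counts = pd.Series([1, 2, 3], name="counts")
print(counts.name)               # counts

# rename() with a scalar returns a new Series with a different name.
totals = counts.rename("totals")
print(totals.name)               # totals
print(counts.name)               # counts (the original is unchanged)

# The name is used as the column label in a DataFrame.
print(list(counts.to_frame().columns))  # ['counts']
```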
