Pandas for Data Analysis, by Jenny Dcruz


Learn how to use pandas, the Python package for analyzing your data easily


Pandas is an open-source Python package for data analysis and manipulation. It provides fast, flexible, and expressive data structures designed to make working with data easy, and it is fundamental to practical data analysis in Python.

You need a good understanding of the nature of your datasets while working on them, and pandas is well suited to help you get it. So let’s get into it and learn some of the functions and features pandas provides.

First, make sure pandas is installed on your system.

Using conda:

conda install pandas

Using pip:

pip install pandas
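
Once the install finishes, a quick import is a reasonable way to confirm it worked. This is only a minimal check; it assumes the conventional pd alias, and the version printed will depend on your environment.

```python
import pandas as pd

# If the import succeeds, pandas is installed and usable.
# The exact version string varies from one environment to another.
print(pd.__version__)
```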

Pandas provides two primary data structures:

  1. Series
  2. DataFrames

1. Series

A Series is similar to a list. You can think of it as a one-dimensional array in which every item has an index label. By default, the labels run from 0 to (n-1), where n is the size of the Series. Let’s create a Series from an arbitrary list of names.

>>> import pandas as pd
>>> s = pd.Series(['Jen', 'Neil', 'Jay', 'Dan', 'Kev', 'Mo'])
>>> print(s)
0     Jen
1    Neil
2     Jay
3     Dan
4     Kev
5      Mo
dtype: object

All of the names are indexed by numbers from 0 to n-1.
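
To see that default index directly, you can inspect the index attribute of the Series. The sketch below rebuilds the same list of names so that it runs on its own.

```python
import pandas as pd

s = pd.Series(['Jen', 'Neil', 'Jay', 'Dan', 'Kev', 'Mo'])

# With no index given, pandas builds a RangeIndex from 0 to n-1.
print(s.index)        # RangeIndex(start=0, stop=6, step=1)
print(len(s))         # 6
print(list(s.index))  # [0, 1, 2, 3, 4, 5]
```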

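As for the ‘dtype’ line at the bottom of the output, it reports the data type of the values in the Series. A Series has a single dtype; a DataFrame, by contrast, has one per column, and its ‘dtypes’ property gives back a Series with the data type of each column. Strings and mixed-type values are stored with the object dtype, although recent pandas releases may report a dedicated string dtype for text.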
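
A small sketch of how the dtype changes with the values stored. The three Series below are made-up examples and are not part of the names data used above.

```python
import pandas as pd

# Illustrative values only, chosen to produce different dtypes.
ints = pd.Series([1, 2, 3])
floats = pd.Series([1.5, 2.0, 3.25])
mixed = pd.Series([1, 'two', 3.0])

print(ints.dtype)    # int64
print(floats.dtype)  # float64
print(mixed.dtype)   # object (values of mixed types)
```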

Integer indexing

Next up, select a specific item from the Series. You can use integer indexing to do this. With the default index, the label and the position of an item are the same number, so s[1] returns the second item.

>>> print(s[1])
Neil
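
The same lookup can also be written with the explicit accessors loc (by label) and iloc (by position). This is a minimal sketch, assuming the default 0 to n-1 index, where labels and positions coincide.

```python
import pandas as pd

s = pd.Series(['Jen', 'Neil', 'Jay', 'Dan', 'Kev', 'Mo'])

# Label and position are the same number under the default index.
print(s[1])        # Neil  (lookup by index label)
print(s.loc[1])    # Neil  (explicitly by label)
print(s.iloc[1])   # Neil  (explicitly by position)
print(s.iloc[-1])  # Mo    (positions can count from the end)
```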

Slicing

To select the third and fourth items (index 2 and 3), we can use the slicing technique, which selects a range of items from the Series.

>>> print(s[2:4])
2    Jay
3    Dan
dtype: object

This does not include the item at index 4, because a slice stops just before its end value. It returns only the items indexed at 2 and 3.
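
A few more slices on the same Series show the stop-exclusive behaviour. The sketch below also contrasts it with label-based slicing through loc, which includes both endpoints.

```python
import pandas as pd

s = pd.Series(['Jen', 'Neil', 'Jay', 'Dan', 'Kev', 'Mo'])

subset = s[2:4]           # positions 2 and 3; the stop value 4 is excluded
explicit = s.iloc[2:4]    # the same selection, explicitly positional
by_label = s.loc[2:4]     # label-based slicing includes the stop label

print(subset.tolist())    # ['Jay', 'Dan']
print(explicit.tolist())  # ['Jay', 'Dan']
print(by_label.tolist())  # ['Jay', 'Dan', 'Kev']
print(s[:3].tolist())     # ['Jen', 'Neil', 'Jay']
print(s[3:].tolist())     # ['Dan', 'Kev', 'Mo']
```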


