Pandas Cheat Sheet For Data Science In Python

Bacteria, datalators,

For working with data in python, Pandas is an essential tool you must use. This is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.

But even when you’ve learned pandas in python, it’s easy to forget the specific syntax for doing something. That’s why today I am giving you a cheat sheet to help you easily reference the most common pandas tasks.

It’s also a good idea to check to the official pandas documentation from time to time, even if you can find what you need in the cheat sheet. Reading documentation is a skill every data professional needs, and the documentation goes into a lot more detail than we can fit in a single sheet anyway!

Importing Data:

Use these commands to import data from a variety of different sources and formats.

pd.read_csv(filename)Read From CSV file
pd.read_table(filename)From delimited text file (like TSV)
pd.read_excel(filename)From Excel file
pd.read_sql(query, connection_object)Read from SQL table/database
pd.read_json(json_string)Read from a JSON fromatted string, URL or file
pd.read_html(url)Parse an html URL, string or file and extract tables to a list of dataframes
pd.read_clipboard()Takes the contents of your clipboard adn passes it to read table()
pd.DataFrame(dict/list)From a dictionary, keys for columns name, values for data as list

Exporting Data:

Use these commands to export a DataFrame to CSV, .xlsx, SQL, or JSON.

df.to_csv(filename)Write teh df dataframe to a CSV file
df.to_excel(filename)write to an Excel file
df.to_sql(filename)Write to a SQL table
df.to_json(filename)Write to a file in JSON format

Viewing/Inspecting Data:

Use these commands to take a look at specific sections of your pandas DataFrame or Series.

df.head(n)First n rows of the dataframe (default is 5)
df.tail(n)Last n rows of the dataframe
df.shapeKnow the dimension of the dataframe
df.info()index, Datatype and Memory information
df.describe()summary statistics for numerical columns / pass include = ‘all’ to see whole summary
s.value_counts(dropna=False)View unique values and counts
df.apply(pd.Series.value_counts)Unique values and counts for all columns

Selection:

Use these commands to select a specific subset of your data.

df[col]Returns Column with label col as Series
df[[col1, col2]]Return Columns as a new DataFrame
s.iloc[0]Selection by position
s.loc[‘index_one’]Selection by index
df.iloc[0,:]First row
df.iloc[0,0]First element of first column

Data Cleaning:

Use these commands to perform a variety of data cleaning tasks.

df.columns = [‘a’,’b’,’c’]Rename columns
pd.isnull()Checks for null values, Return Boolean Array
pd.notnullCheck for non null values
df.dropna()Drop all rows that contains null values
df.dropna(axis=1)Drop all columns that contain null values
df.dropna(axis=1,thresh=n)Drop all rows that have less than n non values
df.fillna(x)Replace all null values with x
s.fillna(s.mean())Replace all null values with the mean (can be replaced with almost any function from statistics module)
s.astype(fload)Convert the datatype of the series to float
s.replace(1,’one’)Replace something with something else
s.replace([1,3],[‘one’,’three’])Replace multiple items
df.rename(colums=lambda x: x+1)Mass renaming of columns
df.rename(columns = {‘old_name’:’new_name})Selective renaming
df.set_index(‘column_one’)Change the index
df.rename(index = lambda x: x+1)Mass renaming of index

Filter, Sort, and Groupby:

Use these commands to filter, sort, and group your data.

df[df[clo]>0.5]Rows where column col is greater than 0.5
df[(df[col] > 0.5) & (df[col] < 0.7)]Rows where 0.7 > col > 0.5
df.sort_values(col1)Sort values by col1 in ascending order
df.sort_values(col2,ascending=False)Sort values by col2 in descending order
df.sort_values([col1,col2],ascending=[True,False])Sort values by col1 in ascending order then col2 in descending order
df.groupby(col)Returns a groupby object for values from one column
df.groupby([col1,col2])Returns groupby object for values from multiple columns
df.groupby(col1)[col2]Returns the mean of the values in col2, grouped by the values
in col1 (mean can be replaced with almost any function from the Statistical M)
df.pivot_table(index=col1,values=[col2,col3],aggfunc=mean)
Create a pivot table that groups by col1 and calculates the mean of col2 and col3
df.groupby(col1).agg(np.mean)Find the average across all columns for every unique col1 group
df.apply(np.mean)Apply the function np.mean() across each column
nf.apply(np.max,axis=1)Apply the function np.max() across each row

Join/Combine:

Use these commands to combine multiple dataframes into a single one.

df1.append(df2)Add the rows in df1 to the end of df2 (columns should be identical)
pd.concat([df1, df2],axis=1)Add the columns in df1 to the end of df2 (rows should be identical)
df1.join(df2,on=col1,how=’inner’)Selection by position
s.loc[‘index_one’]SQL-style join the columns in df1 with the columns on df2 where the rows for col have identical values. ‘how’ can be one

Statistics:

These commands perform various statistical tests. (They can be applied to a series as well)

df.describe()Summary statistics for numerical columns
df.mean()Returns the mean of all columns
df.corr()Returns the correlation between columns in a DataFrame
df.count()Returns the number of non-null values in each DataFrame column
df.max()Returns the highest value in each column
df.min()Returns the lowest value in each column
df.median()Returns the median of each column
df.std()Returns the standard deviation of each column
df.transpose()Transposing rows and columns

I hope this cheat sheet will be useful to you no matter you are new to python who is learning python for data science or a data professional. Happy Programming.

You can also download the printable PDF file from here.

Down Arrow on Twitter Twemoji 13.0.1

.

40 thoughts on “Pandas Cheat Sheet For Data Science In Python”

  1. I enjoy what you guys are up too. This sort of clever work and exposure!
    Keep up the amazing works guys I’ve incorporated you
    guys to my personal blogroll.

  2. Somebody essentially assist to make seriously posts I would state.

    That is the first time I frequented your website page and to this point?
    I surprised with the research you made to create this actual publish extraordinary.

    Wonderful task!

  3. Great post. I was checking continuously this blog and I’m impressed!
    Very useful information particularly the last part 🙂 I care for such information a
    lot. I was seeking this certain info for a very
    long time. Thank you and good luck.

  4. Pretty part of content. I just stumbled upon your website and in accession capital to say that I
    acquire actually loved account your blog posts.
    Any way I will be subscribing on your augment or even I fulfillment you get admission to persistently quickly.

  5. Generally I don’t learn post on blogs, but I would like to say that this write-up very pressured me
    to check out and do it! Your writing style has been amazed me.
    Thanks, quite nice post.

  6. This is really interesting, You’re a very skilled blogger.

    I have joined your feed and look forward to seeking more of your wonderful post.
    Also, I have shared your web site in my social networks!

  7. Thanks , I’ve just been searching for information about this topic for a while and yours is the greatest I’ve found out
    so far. However, what concerning the bottom line?
    Are you positive about the source? 0mniartist asmr

  8. I loved as much as you will receive carried out right here.
    The sketch is tasteful, your authored material stylish. nonetheless, you command get got
    an nervousness over that you wish be delivering the following.
    unwell unquestionably come further formerly again since exactly the same nearly a
    lot often inside case you shield this hike.

  9. After looking at a number of the blog posts on your blog, I truly like your way of writing a blog.
    I saved as a favorite it to my bookmark site list and will be checking
    back in the near future. Please visit my website too and let me know your
    opinion.

  10. scoliosis
    Do you mind if I quote a few of your posts as long as I provide
    credit and sources back to your site? My
    website is in the exact same area of interest as yours and my visitors would genuinely benefit from some of the information you present
    here. Please let me know if this ok with you. Many
    thanks! scoliosis

  11. scoliosis
    Asking questions are actually pleasant thing if you are not understanding anything entirely, except this piece of writing offers
    nice understanding yet. scoliosis

  12. scoliosis
    I’m really loving the theme/design of your website.
    Do you ever run into any web browser compatibility issues?
    A few of my blog audience have complained about my website not operating correctly
    in Explorer but looks great in Opera. Do
    you have any tips to help fix this issue? scoliosis

  13. Thanks for every other informative website. The place else may I
    am getting that kind of information written in such an ideal way?
    I’ve a venture that I am simply now working on, and I have been on the glance out for such info.

  14. Good day! I could have sworn I’ve been to this website before but after looking at many of the articles I realized it’s new to me.
    Regardless, I’m definitely pleased I came across it and I’ll
    be book-marking it and checking back frequently!

  15. Hello There. I found your blog the usage of msn. That is a really well written article.
    I will be sure to bookmark it and return to learn extra of your helpful
    info. Thanks for the post. I’ll definitely comeback.

  16. Pingback: Google

  17. Pingback: Google

Leave a Comment

Your email address will not be published.