Complete Python Pandas Data Science Tutorial! (Reading CSV/Excel files, Sorting, Filtering, Groupby)
YouTube Viewers YouTube Viewers
216K subscribers
3,051,215 views
0

 Published On Oct 25, 2018

Practice your Python Pandas data science skills with problems on StrataScratch!
https://stratascratch.com/?via=keith

Data & code used in this Tutorial: https://github.com/KeithGalli/pandas
Python Pandas Documentation: http://pandas.pydata.org/pandas-docs/...

Let me know if you have any questions!

In this video we walk through many of the fundamental concepts to use the Python Pandas Data Science Library. We start off by installing pandas and loading in an example csv. We then look at different ways to read the data. Read a column, rows, specific cell, etc. Also ways to read data based on conditioning. We then move into some more advanced ways to sort & filter data. We look at making conditional changes to our data. We also start doing aggregate stats using the groupby function. We finished the video talking about how you would work with a very large dataset (many gigabytes)

I realized as I upload this video there are some additional things I want to talk about in a later video. The first thing that comes to mind immediately is using the apply() function on a dataframe to alter the data using a custom or lambda function. If you have questions on this or anything else before I get around to making a part 2, feel free to write me a note in the comments.

If you enjoyed this video, be sure to throw it a like and make sure to subscribe to not miss any future videos!

Thanks for watching friends! Happy coding! :)

Join the Python Army to get access to perks!
YouTube -    / @keithgalli  
Patreon -   / keithgalli  

---------------------------------------------
Follow me on social media!
Instagram |   / keithgalli  
Twitter |   / keithgalli  

---------------------------------------------
Link to original source of data from Kaggle: https://www.kaggle.com/abcsds/pokemon

---------------------------------------------
Video Outline!
0:00 - Why Pandas?
1:46 - Installing Pandas
2:03 - Getting the data used in this video
3:50 - Loading the data into Pandas (CSVs, Excel, TXTs, etc.)
8:49 - Reading Data (Getting Rows, Columns, Cells, Headers, etc.)
13:10 - Iterate through each Row
14:11 - Getting rows based on a specific condition
15:47 - High Level description of your data (min, max, mean, std dev, etc.)
16:24 - Sorting Values (Alphabetically, Numerically)
18:19 - Making Changes to the DataFrame
18:56 - Adding a column
21:22 - Deleting a column
22:14 - Summing Multiple Columns to Create new Column.
24:14 - Rearranging columns
28:06 - Saving our Data (CSV, Excel, TXT, etc.)
31:47 - Filtering Data (based on multiple conditions)
35:40 - Reset Index
37:41 - Regex Filtering (filter based on textual patterns)
43:08 - Conditional Changes
47:57 - Aggregate Statistics using Groupby (Sum, Mean, Counting)
54:53 - Working with large amounts of data (setting chunksize)

-------------------------
If you are curious to learn how I make my tutorials, check out this video:    • How to Make a High Quality Tutorial V...  

*I use affiliate links on the products that I recommend. I may earn a purchase commission or a referral bonus from the usage of these links.

show more

Share/Embed