Dan's Data & Programming Knowledge Base

Hey! I'm Dan Friedman. I built this site to clearly document important concepts I've learned in data, programming, and career advice. There are lots of real-world examples, small digestible code snippets, and plentiful visualizations to make learning (hopefully) fun and easy! Enjoy!

Articles
  - Data Science: Reality Doesn't Meet Expectations
  - A Recipe for Doing Great Data Science Work
  - Shoot for the job you can't get [how I got a job at Boosted Boards]

Data Analysis with pandas
  Data Wrangling with Pandas
    - Create Year-Month Column from Dates
    - cut() Method: Bin Values into Discrete Intervals
    - shift() Method: Shift Values in Column Up or Down
    - Pandas rank() Method: Equivalent to ROW_NUMBER(), RANK(), DENSE_RANK() and NTILE() SQL Window Functions
    - Self Join
    - Create New Columns Based on Operations
    - groupby() Method: Split Data into Groups, Apply a Function to Groups, Combine the Results
    - value_counts() Method: Count Unique Occurrences of Values in a Column
    - melt() Method: Unpivot a DataFrame
    - pivot() Method: Pivot DataFrame Without Aggregation Operation
    - query() Method: Query/Filter Columns
    - pivot_table() Method: Pivot DataFrame with Aggregation Operation
    - crosstab() Function: Compute Aggregated Metrics Across Categorical Columns
    - Categorical Data
  Business Metrics
    - Popular Summary Business Metrics

Data Visualization
  Best Practices
    - When to Use a Cumulative Frequency Graph
    - How to Format Large Tick Values
    - When to Use a Logarithmic Scale
    - Visualize Historical Time Comparisons
    - When to Use Categorical Scatterplots
    - When to Use Heatmaps
    - When to Use Horizontal Bar Charts
    - When to Use Histogram Plots
    - When to Use Box Plots
    - When to Use Line Plots
    - When to Use Vertical Grouped Barplots
    - When to Use Vertical Stacked Bar Charts
    - When to Use a Vertical Bar Chart
    - When to Use a Pie Chart
    - When to Use a Scatter Plot
  Matplotlib Plotting
    - Customize Scatter Plot Styles using Matplotlib
    - Line Plots using Matplotlib
    - Scatter Plots using Matplotlib
    - Style Line Plots using Matplotlib
    - Style Plots using Matplotlib
  Pandas Plots
    - Bar Plot using Pandas
    - Line Plot using Pandas
    - Histogram Plot using Pandas

Python
  Beginner Concepts
    - Read in CSV Files for Data Analysis
    - String Formatting
    - Tuple Basics
    - Zip Function
    - Dictionary Methods
    - Dictionaries Basics
    - String Methods
    - Strings Basics
    - Incremental Development
    - Validate Arguments Passed to a Function
    - Conditional Logic with If Statements
    - Build a Number Guessing Game with Keyboard Input
    - Docstrings Best Practices in Functions
    - Generalizing Functions to Be More Reusable
    - Iterate Over Sequences Using For and While Loops
    - Build Functions to Easily Perform Repeated Operations
    - Fundamental Programming Terms
    - Count Occurrences of Each Unique Element in a List
    - Iterate over Index Numbers and Elements in a List Using Enumerate
    - Types and Values
    - Math Operations
    - List Methods
    - Lists - Intro to the Data Structure & Common Operations
  Intermediate Concepts
    - Intro to Designing Classes
    - Intro to Multithreading and Multiprocessing
    - List Comprehensions
    - APIs
  Beginner Algorithms
    - Partition Array Into Three Parts With Equal Sum (via Leetcode)
    - Create Target Array in the Given Order (via Leetcode)
    - Find Words Formed by Characters (via Leetcode)
    - Two Sum II (via Leetcode)
    - Check if Double of Value Exists (via Leetcode)
    - Height Checker (via Leetcode)
    - Minimum Absolute Difference (via Leetcode)
    - Squares of a Sorted List (via Leetcode)
    - Rank Transform of Array (via Leetcode)
    - How Many Numbers Are Smaller Than the Current Number (via Leetcode)
    - Intersection of Two Arrays (via Leetcode)
    - Subtract the Product and Sum of Digits of an Integer (via Leetcode)
    - Find All Numbers Disappeared in an Array (via Leetcode)
    - Number of Steps to Reduce a Number to Zero (via Leetcode)
    - Longest Common Prefix (via Leetcode)
    - Unique Number of Occurrences (via Leetcode)
    - Jewels and Stones (via Leetcode)
    - Valid Parentheses (via Leetcode)
    - Largest Substring Without Repeating Characters (via Leetcode)
    - Two Sums (via Leetcode)
  Intermediate Algorithms
    - Command Line Basic Minesweeper
  Advanced Algorithms
    - Find the Median of Two Sorted Arrays (via Leetcode)

Machine Learning
  Clustering
    - K-Means Algorithm from Scratch
    - Segmentation vs. Clustering
  Classification
    - Visual Introduction to Classification and Logistic Regression

Math
  Descriptive Statistics
    - Spearman's Correlation
    - Pearson's Correlation
    - Bessel's Correction
    - Outliers
    - Correlation
    - Skewness
    - Standard Deviation
    - Mean, Median and Mode
    - Central Limit Theorem
    - Z-scores
  Inferential Statistics
    - Correlation Does Not Imply Causation
    - Independent Samples t-tests
    - Dependent Samples t-tests
    - Measures of Effect Size for t-tests
    - T-Tests: Intro to Key Terms & One Sample t-test
    - Type I and Type II Errors in Hypothesis Testing
    - Intro to Hypothesis Testing and z-tests
    - Confidence Intervals
  Distributions
    - Exponential Distribution
    - Probability Plot
    - Normal Distribution
  Miscellaneous
    - Introduction to Math Symbols Through Simple Examples

Santa Clara University Teaching
  - 2018 Summer MSIS 2629 Syllabus

Thank you for reading my content!