


Since this article isn't so much about clustering as it is about visualization, I'll use a simple k-means for the following examples. We'll calculate three clusters, get their centroids, and set some colors.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# df is the DataFrame of Pokemon stats with 'Attack' and 'Defense' columns
# k means
kmeans = KMeans(n_clusters=3, random_state=0)
df['cluster'] = kmeans.fit_predict(df[['Attack', 'Defense']])
# get centroids
centroids = kmeans.cluster_centers_
cen_x = [i[0] for i in centroids]
cen_y = [i[1] for i in centroids]
# add colors to df
# (placeholder color names: the original color list did not survive extraction)
colors = ['tab:red', 'tab:green', 'tab:blue']
df['c'] = df.cluster.map(dict(enumerate(colors)))
```

With the clusters computed, we can plot the points, mark the centroids, and add a legend, a title, and axis labels.

```python
from matplotlib.lines import Line2D

fig, ax = plt.subplots(1, figsize=(8,8))
# plot data and centers
plt.scatter(df.Attack, df.Defense, c=df.c, alpha=0.6, s=10)
plt.scatter(cen_x, cen_y, marker='^', c=colors, s=70)
# legend entries (label text and the cluster marker size are assumptions;
# only the tail of this block survived extraction)
legend_elements = [Line2D([0], [0], marker='o', color='w', label='Cluster {}'.format(i+1),
                          markerfacecolor=mcolor, markersize=5)
                   for i, mcolor in enumerate(colors)]
cent_leg = [Line2D([0], [0], marker='^', color='w', label='Centroid {}'.format(i+1),
                   markerfacecolor=mcolor, markersize=10)
            for i, mcolor in enumerate(colors)]
legend_elements.extend(cent_leg)
plt.legend(handles=legend_elements, loc='upper right', ncol=2)
# x and y limits
plt.xlim(0,200)
plt.ylim(0,200)
# title and labels
plt.title('Pokemon Stats\n', loc='left', fontsize=22)
plt.xlabel('Attack')
plt.ylabel('Defense')
```

Another option to help us visualize our clusters' size or spread is to draw a shape or a shadow around them. Doing so manually would take forever and surely wouldn't be worth the effort. Luckily, there are ways to automate that.

The convex hull is the smallest convex polygon that encloses all of our data points, with its corners taken from the points themselves, and there are ways to find it systematically. That is to say, we can use SciPy to get the contour of our dataset.

```python
from scipy.spatial import ConvexHull
import numpy as np

fig, ax = plt.subplots(1, figsize=(8,8))
# plot data
plt.scatter(df.Attack, df.Defense, c=df.c, alpha=0.6, s=10)
# plot centers
plt.scatter(cen_x, cen_y, marker='^', c=colors, s=70)
# draw enclosure
for i in df.cluster.unique():
    points = df[df.cluster == i][['Attack', 'Defense']].values
    # get convex hull
    hull = ConvexHull(points)
    # get x and y coordinates
    # repeat the first point at the end to close the polygon
    x_hull = np.append(points[hull.vertices, 0], points[hull.vertices, 0][0])
    y_hull = np.append(points[hull.vertices, 1], points[hull.vertices, 1][0])
    # plot shape
    plt.fill(x_hull, y_hull, alpha=0.3, color=colors[i])

plt.xlim(0,200)
plt.ylim(0,200)
```

Highlighted scatter plot - Image by the author

We can even interpolate the lines of our polygon to make a smoother shape around our data.

```python
from scipy import interpolate

fig, ax = plt.subplots(1, figsize=(8,8))
plt.scatter(df.Attack, df.Defense, c=df.c, alpha=0.6, s=10)
plt.scatter(cen_x, cen_y, marker='^', c=colors, s=70)

for i in df.cluster.unique():
    # get the convex hull
    points = df[df.cluster == i][['Attack', 'Defense']].values
    hull = ConvexHull(points)
    x_hull = np.append(points[hull.vertices, 0], points[hull.vertices, 0][0])
    y_hull = np.append(points[hull.vertices, 1], points[hull.vertices, 1][0])

    # interpolate, using the distance travelled along the hull as the parameter
    dist = np.sqrt((x_hull[:-1] - x_hull[1:])**2 + (y_hull[:-1] - y_hull[1:])**2)
    dist_along = np.concatenate(([0], dist.cumsum()))
    spline, u = interpolate.splprep([x_hull, y_hull], u=dist_along, s=0, per=1)
    interp_d = np.linspace(dist_along[0], dist_along[-1], 50)
    interp_x, interp_y = interpolate.splev(interp_d, spline)
    # plot shape
    plt.fill(interp_x, interp_y, color=colors[i], alpha=0.2)

plt.xlim(0,200)
plt.ylim(0,200)
```

*The interpolation method was based on replies from this thread.

**I've added the argument per=1 to the splprep function, as pointed out by Dragan Vidovic in the comments. It makes the spline periodic, so the curve closes smoothly on itself.
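To try the hull and smoothing steps without the Pokemon data, here is a minimal sketch on a synthetic cluster. The helper name `smooth_hull`, its `n_samples` parameter, and the random test data are illustrative assumptions, not part of the original code; the SciPy calls are the same ones used in the article.

```python
import numpy as np
from scipy import interpolate
from scipy.spatial import ConvexHull


def smooth_hull(points, n_samples=50):
    """Return (x, y) arrays tracing a smoothed, closed outline around points.

    points is an array of shape (n, 2). The function name and n_samples
    are hypothetical, chosen for this sketch only.
    """
    hull = ConvexHull(points)
    # For 2-D input, hull.vertices lists the corners in counterclockwise
    # order; append the first corner again so the outline is closed.
    idx = np.append(hull.vertices, hull.vertices[0])
    x_hull, y_hull = points[idx, 0], points[idx, 1]

    # Parameterize the outline by cumulative distance along its edges.
    dist = np.sqrt(np.diff(x_hull) ** 2 + np.diff(y_hull) ** 2)
    dist_along = np.concatenate(([0.0], dist.cumsum()))

    # per=1 requests a periodic spline, so the curve joins itself smoothly.
    spline, _ = interpolate.splprep([x_hull, y_hull], u=dist_along, s=0, per=1)
    interp_d = np.linspace(dist_along[0], dist_along[-1], n_samples)
    interp_x, interp_y = interpolate.splev(interp_d, spline)
    return interp_x, interp_y


# Synthetic stand-in for one cluster (not the Pokemon data).
rng = np.random.default_rng(0)
cluster = rng.normal(loc=(100.0, 100.0), scale=15.0, size=(200, 2))
outline_x, outline_y = smooth_hull(cluster)
```

The returned arrays can be passed straight to `plt.fill(outline_x, outline_y, alpha=0.2)`. Note that a smoothed outline can bulge slightly beyond the straight hull edges, so it is a visual aid and not an exact boundary.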
