import numpy as np

# Initial centroids given by the task
c1 = np.array([0, 0])
c2 = np.array([2, 2])

n = int(input())
X1 = []  # points assigned to centroid (0, 0)
X2 = []  # points assigned to centroid (2, 2)
for i in range(n):
    X = [float(x) for x in input().split()]
    X0 = np.array(X)
    # Ties go to the (0, 0) cluster, hence <=
    if np.linalg.norm(X0 - c1) <= np.linalg.norm(X0 - c2):
        X1.append(X)
    else:
        X2.append(X)

d1 = np.array(X1)
d2 = np.array(X2)
# Check for an empty cluster by size; .any() would also be False
# for a cluster that contains only the point (0, 0)
if d1.size == 0:
    print('None')
else:
    print(np.around(np.mean(d1, axis=0), 2))
if d2.size == 0:
    print('None')
else:
    print(np.around(np.mean(d2, axis=0), 2))
===============
Data Science - Pandas Pandas Pandas
Finding the next centroid
Clustering, an unsupervised learning technique, involves repeatedly updating the centroid of each cluster. Here we find the next centroids for given data points and initial centroids.
Task
Assume that there are two clusters among the given two-dimensional data points, and that the two points (0, 0) and (2, 2) are the initial cluster centroids. Calculate the Euclidean distance between each data point and each centroid, assign each data point to its nearest centroid, then calculate the new centroids. If there is a tie, assign the data point to the cluster with centroid (0, 0). If no data points are assigned to a centroid, return None for that centroid.
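The assignment and update steps in this task can also be sketched in vectorized NumPy. This is a minimal sketch and not the reference solution; the function name `next_centroids` and its parameters are hypothetical, introduced here only for illustration.

```python
import numpy as np

def next_centroids(points, centroids=((0.0, 0.0), (2.0, 2.0))):
    """Return one updated centroid per initial centroid (None if it got no points)."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    cts = np.asarray(centroids, dtype=float)
    # dists[i, j] = Euclidean distance from point i to centroid j
    dists = np.linalg.norm(pts[:, None, :] - cts[None, :, :], axis=2)
    # argmin returns the first minimum, so a tie goes to centroid 0, i.e. (0, 0)
    labels = np.argmin(dists, axis=1)
    result = []
    for j in range(len(cts)):
        members = pts[labels == j]
        result.append(np.around(members.mean(axis=0), 2) if len(members) else None)
    return result

# Hypothetical usage with the three sample points from this task
new_c = next_centroids([[1, 0], [0, 0.5], [4, 0]])
```

Relying on `np.argmin` for the tie rule works only because (0, 0) is listed first; reordering the centroids would change which cluster wins a tie.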
Input Format
First line: an integer to indicate the number of data points (n)
Next n lines: two numeric values per line, representing a data point in two-dimensional space.
Output Format
Two lists for two centroids. Numbers are rounded to the second decimal place.
Sample Input
3
1 0
0 .5
4 0
Sample Output
[0.5 0.25]
[4. 0.]
Explanation
There are 3 data points and we would like to identify two clusters among them. The initial centroids are given as (0, 0) and (2, 2). The distances between the first data point (1, 0) and each of the centroids are 1.0 and 2.24, rounded to the second decimal place. The first data point is closer to (0, 0), so it is assigned to the 0th cluster. Similarly, data point (0, .5) is closer to (0, 0) than to (2, 2), so it is also assigned to the 0th cluster, while (4, 0) is closer to (2, 2) and is assigned to the 1st cluster. To calculate the new centroids, take the mean of all data points in the 0th and 1st clusters, respectively. Hence the results are [0.5 0.25] and [4. 0.].
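The arithmetic in this explanation can be checked with a few lines of NumPy. This is a small sketch using only the sample values; the variable names are illustrative.

```python
import numpy as np

centroids = np.array([[0.0, 0.0], [2.0, 2.0]])
point = np.array([1.0, 0.0])

# Euclidean distance from (1, 0) to each centroid: sqrt(1) and sqrt(1 + 4)
dists = np.linalg.norm(centroids - point, axis=1)
print(np.around(dists, 2))

# New centroid of the 0th cluster: mean of the two points assigned to it
cluster0 = np.array([[1.0, 0.0], [0.0, 0.5]])
print(cluster0.mean(axis=0))
```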