by Tirthajyoti Sarkar
How to set up PySpark for your Jupyter notebook
Apache Spark is one of the hottest frameworks in data science. It realizes the potential of bringing together both Big Data and machine learning. This is because:
Spark is fast (up to 100x faster than traditional Hadoop MapReduce) due to in-memory operation.
It offers robust, distributed, fault-tolerant data objects called RDDs (Resilient Distributed Datasets).
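To make "in-memory" and "fault-tolerant" a little more concrete, here is a toy, single-process sketch in plain Python. It is not Spark's API or implementation: the ToyRDD class and its methods are hypothetical names invented for illustration. It only models the general idea that an RDD remembers the transformations that produced it (its lineage), keeps computed partitions in memory, and can rebuild a lost partition by replaying that lineage on the source data.

```python
from typing import Callable, Dict, List, Tuple


class ToyRDD:
    """Hypothetical single-process stand-in for an RDD; not Spark's API."""

    def __init__(
        self,
        partitions: List[List[int]],
        lineage: Tuple[Callable[[int], int], ...] = (),
    ) -> None:
        self.partitions = partitions  # source data, split into partitions
        self.lineage = lineage        # ordered record of transformations applied
        self.cache: Dict[int, List[int]] = {}  # in-memory results per partition

    def map(self, fn: Callable[[int], int]) -> "ToyRDD":
        # Transformations are lazy: only the lineage grows, nothing is computed yet.
        return ToyRDD(self.partitions, self.lineage + (fn,))

    def compute(self, index: int) -> List[int]:
        # Serve from memory when possible; otherwise replay the lineage on the source.
        if index not in self.cache:
            data = self.partitions[index]
            for fn in self.lineage:
                data = [fn(x) for x in data]
            self.cache[index] = data
        return self.cache[index]

    def collect(self) -> List[int]:
        # Gather every partition's result into one flat list.
        return [x for i in range(len(self.partitions)) for x in self.compute(i)]


rdd = ToyRDD([[1, 2], [3, 4]]).map(lambda x: x * 10).map(lambda x: x + 1)
first = rdd.collect()   # computes and caches both partitions
del rdd.cache[1]        # simulate losing the in-memory copy of partition 1
second = rdd.collect()  # partition 1 is rebuilt from its lineage
print(first == second)
```

In real Spark the partitions live on different machines and the scheduler decides where to recompute them; the sketch keeps everything in one process so the recompute-from-lineage idea is easy to see.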
Translated from: https://www.freecodecamp.org/news/how-to-set-up-pyspark-for-your-jupyter-notebook-7399dd3cb389/