PySpark_Training_Repository/README.md
2024-01-12 17:28:45 +01:00

33 lines
No EOL
952 B
Markdown

Python PySpark Training Repository
==============
**Author:** *Yûki VACHOT*
**Updated:** **10/01/24**
# CONTENT TABLE
---
# Installation
`python -m venv `
- [Python 3.11.7](https://www.python.org/downloads/)
- [Spark 3.5.0 with Hadoop 3.0.0](https://spark.apache.org/downloads.html)
- [winutils.exe, .pdb and hadoop.dll](https://github.com/steveloughran/winutils/tree/master/hadoop-3.0.0/bin)
- [Java JDK 17](https://www.azul.com/downloads/?version=java-17-lts&package=jdk#zulu)
- [pygraphviz](https://pygraphviz.github.io/documentation/stable/install.html#windows) install in x86
- `pip install --global-option=build_ext --global-option="-IC:\Program Files (x86)\Graphviz\include" --global-option="-LC:\Program Files (x86)\Graphviz\lib" pygraphviz`
---
# Run Python PySpark
- `python init.py`
---
# Run Python Test
- path from src/test_pyspark_training
- `pytest -k test_`
---
# Run pylint for code check
---
# Run Python doc with Sphinx
---