Assistant Professor - Faculty of Information Technology
Kalinga University, Raipur
What is Python?
Python is a powerful tool with many modules (libraries) that can be imported to extend it. It is an open-source programming language that is reasonably straightforward to learn.
Python is a programming language that can be used to automate tasks and process large volumes of data.
Python is compatible with Macs, PCs, Linux, and high-performance computing environments (Polaris, Andes, Discovery machines here at Dartmouth).
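As a minimal sketch of how a module is brought in to extend plain Python, the example below imports the standard-library `statistics` module. The `readings` values are hypothetical and serve only as an illustration.

```python
# Bring in a standard-library module to extend what plain Python offers.
import statistics

# Hypothetical sample values, used only for illustration.
readings = [12.5, 14.0, 13.2, 15.8, 14.5]

# Functions supplied by the imported module.
mean_value = statistics.mean(readings)
median_value = statistics.median(readings)

print(f"mean={mean_value:.2f}, median={median_value:.2f}")
```

Third-party libraries are imported the same way once they are installed.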
Why Python for Data Analysis?
Python’s importable libraries make it an appealing language for data analysis, since they make it easy to load datasets and work with them. Commonly used libraries include:
NumPy
SciPy
Statsmodels
Pandas
Matplotlib
Natural Language Toolkit (NLTK)
Python can import and export common data formats such as CSV
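The CSV import and export mentioned above can be sketched with the standard-library `csv` module. This is one possible approach, not the only one. The sample data, the column names `name` and `score`, and the threshold of 80 are all assumptions made for the example, and an in-memory string stands in for a file on disk.

```python
import csv
import io

# Hypothetical CSV content; in practice this would come from a file on disk.
raw_text = "name,score\nAsha,82\nRavi,75\nMeena,91\n"

# Import: parse each row into a dictionary keyed by the header line.
rows = list(csv.DictReader(io.StringIO(raw_text)))

# Convert the score column from text to integers.
for row in rows:
    row["score"] = int(row["score"])

# Export: write only the rows with a score of 80 or more back out as CSV.
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=["name", "score"], lineterminator="\n")
writer.writeheader()
writer.writerows(row for row in rows if row["score"] >= 80)

exported_text = output.getvalue()
print(exported_text)
```

To work with a real file, replace the `io.StringIO` objects with `open(path, newline="")` for reading and `open(path, "w", newline="")` for writing.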
Development Environments
Python can be used in several settings:
Using a command line (older versions of macOS shipped with Python preinstalled)
Using a Windows terminal
Using a Linux terminal
Using an Integrated Development Environment (IDE) such as Eclipse or PyCharm
Using a web-hosted “sandbox” environment
Some Examples of Development Environments
Conclusion
Data analysis with Python covers several stages. It begins with importing datasets, then cleaning and preparing the data (identifying and handling missing values, data formatting, data normalization, binning, indicator variables). Next comes exploring the data frame in context (descriptive statistics, basics of grouping, ANOVA, correlation). Model development follows (simple and multiple regression, model evaluation via visualisation, polynomial regression and pipelines, R-squared and MSE for in-sample evaluation, prediction and decision making). Finally, the model is tested and refined (over-fitting, under-fitting and model selection, ridge regression, grid search).
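A few of the steps named above can be sketched in plain Python: filling a missing value, min-max normalization, binning, a simple linear regression, and in-sample MSE and R-squared. This is a minimal sketch using only the standard library rather than the data analysis libraries listed earlier. The `xs` and `ys` observations are hypothetical, and mean imputation and a two-bin split are assumptions chosen to keep the example short.

```python
from statistics import mean

# Hypothetical (x, y) observations; None marks a missing y value.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, None, 8.0, 10.0]

# Handle missing values: replace None with the mean of the observed values.
observed = [y for y in ys if y is not None]
fill = mean(observed)
ys_clean = [fill if y is None else y for y in ys]

# Normalization: min-max scale x into the range [0, 1].
x_min, x_max = min(xs), max(xs)
xs_scaled = [(x - x_min) / (x_max - x_min) for x in xs]

# Binning: label each y as "low" or "high" relative to the fill value.
bins = ["low" if y < fill else "high" for y in ys_clean]

# Simple linear regression by ordinary least squares: y = intercept + slope * x.
x_bar, y_bar = mean(xs), mean(ys_clean)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys_clean))
sxx = sum((x - x_bar) ** 2 for x in xs)
slope = sxy / sxx
intercept = y_bar - slope * x_bar
predictions = [intercept + slope * x for x in xs]

# In-sample evaluation: mean squared error and R-squared.
ss_res = sum((y - p) ** 2 for y, p in zip(ys_clean, predictions))
ss_tot = sum((y - y_bar) ** 2 for y in ys_clean)
mse = ss_res / len(ys_clean)
r_squared = 1 - ss_res / ss_tot

print(f"slope={slope:.3f}, intercept={intercept:.3f}, MSE={mse:.3f}, R^2={r_squared:.3f}")
```

Because these metrics are computed on the same data used to fit the line, they say nothing about over-fitting. Checking that requires held-out data, as in the model testing stage.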