INTRODUCTION TO DATA SCIENCE - STATISTICAL LEARNING AND DATA ANALYTICS
Academic year 2022/2023
- Course code
- SEM0125B
- Instructors
- Pierpaolo De Blasi (Course lecturer)
Giovanni Rebaudo (Course lecturer) - Integrated course
- Degree programme
- Economics - Economics and Data Science track
- Year
- 3rd year
- Teaching period
- Second semester
- Type
- Core (caratterizzante)
- Credits
- 6
- Scientific disciplinary sector (SSD)
- SECS-S/01 - Statistics
- Delivery mode
- Traditional
- Language of instruction
- English
- Attendance
- Optional
- Type of examination
- Written
Course summary
Learning objectives
The course introduces the fundamental techniques of statistical learning for building a model that predicts a response variable from one or more independent variables (or covariates). Special attention is devoted to implementing these techniques in statistical software and to interpreting the results of the analyses.
Expected learning outcomes
- Knowledge and understanding
The student will learn the most common methodologies for analyzing a data set, together with their implementation in R. The student will also be able to interpret the results of the analysis and present them through both visual and numerical summaries.
- Applying knowledge and understanding
The student will be able to discuss various methods and techniques for statistical learning.
- Making judgements
The student will be able to select the appropriate supervised learning method for analyzing a data set, with the support of the R software.
- Communication skills
Students will use statistical language properly to communicate their findings.
Teaching methods
The course consists of 48 hours of class lectures. Examples and exercises will be worked through in R.
Classes are held in person.
Assessment methods
The final examination consists of a written test with open-ended questions: some concern the interpretation of a previously prepared data analysis, others are more theoretical and cover the topics discussed in class.
Syllabus
Statistical learning
- Goals
- Accuracy vs. interpretability
- Bias-variance trade-off
Linear regression
- Simple linear regression
- Multiple linear regression
- Discussion and comparisons
Classification
- Logistic regression
- Linear discriminant analysis
- Discussion and comparisons
Validation and resampling
- Cross-validation
- The bootstrap
Model selection and regularization
- Subset selection
- Shrinkage methods (ridge, lasso)
- Dimension reduction
Non-linear models
- Polynomial regression
- Regression splines
- Generalized additive models
Recommended texts and bibliography
- Book
- Title:
- An introduction to statistical learning (2nd ed)
- Year of publication:
- 2021
- Publisher:
- Springer
- Authors:
- Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani
- Notes:
- E-book available on the Springer platform (ask at the Library)
- Required:
- Yes