1. Basic Background Knowledge of Machine Learning
This section covers the machine learning methods and techniques that readers need in order to create their own task and model.
1.1. Machine Learning Definition
Machine learning is the automatic process of learning useful and meaningful information from a given dataset. There are two main types of machine learning methods: supervised and unsupervised learning.
1.2. Supervised Machine Learning Models
- Models learn characteristics and features of the data with the goal of predicting the outcome variable as accurately as possible. Supervised ML models are designed to answer two types of problems: regression and classification.
- Regression problems deal with predictions of continuous numeric values such as house prices, gas emissions, water levels, etc. Regression ML models include Linear Regression, Support Vector Regression, etc.
- Classification problems deal with predictions of nominal or ordinal values such as movie rating (very bad/bad/average/good/very good), high school completion (yes/no), weather forecast (sunny/cloudy/rainy), etc. Classification ML models include Decision Trees, Support Vector Machines, Naive Bayes, etc.
Linear Regression Model:
Linear Regression captures the linear relationship between an outcome variable and the features. The relationship is expressed as a hyperplane:
$$
y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n
$$
- n is the number of features
- y is the outcome variable
- $b_0$ is the intercept
- $b_1, b_2, \dots, b_n$ are the weights calculated by the model during training to minimize the error when predicting y for an instance X.
- $x_1, x_2, \dots, x_n$ are the values of the features for that instance.
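To make the hyperplane equation above concrete, here is a minimal sketch of how the intercept and weights could be fitted by ordinary least squares, using only the Python standard library. The function names `fit_linear_regression` and `predict`, the toy data, and the choice of solving the normal equations by Gauss-Jordan elimination are assumptions made for illustration; libraries typically use more robust numerical solvers.

```python
def fit_linear_regression(X, y):
    """Least-squares fit of y ~ b0 + b1*x1 + ... + bn*xn (illustrative sketch)."""
    # Prepend 1.0 to every row so the intercept b0 is learned with the weights.
    A = [[1.0] + [float(v) for v in row] for row in X]
    m = len(A[0])
    # Build the augmented normal-equation system (A^T A | A^T y).
    M = [
        [sum(r[i] * r[j] for r in A) for j in range(m)]
        + [sum(r[i] * t for r, t in zip(A, y))]
        for i in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        pivot_val = M[col][col]
        M[col] = [v / pivot_val for v in M[col]]
        for r in range(m):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    coeffs = [row[-1] for row in M]
    return coeffs[0], coeffs[1:]  # (b0, [b1, ..., bn])


def predict(x, b0, b):
    """Evaluate y = b0 + b1*x1 + ... + bn*xn for one instance x."""
    return b0 + sum(w * v for w, v in zip(b, x))


# Hypothetical toy data generated from y = 1 + 2*x1 + 3*x2 (no noise),
# so the fit should recover approximately b0 = 1 and b = [2, 3].
X_train = [[1, 2], [2, 0], [3, 1], [4, 3]]
y_train = [9, 5, 10, 18]
b0, b = fit_linear_regression(X_train, y_train)
y_new = predict([5, 1], b0, b)
```

Here `n = 2`, and "training" amounts to choosing `b0` and `b` so that the squared prediction error over the training instances is as small as possible.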
K-Nearest Neighbors (KNN) Model:
KNN finds the K data points closest to the test instance we’re trying to classify, then uses majority voting among their class labels to determine the class label of the test instance. Closeness is calculated using measures such as Euclidean distance, Manhattan distance, cosine similarity, etc.
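The KNN procedure described above can be sketched in a few lines of standard-library Python. The function name `knn_predict`, the toy points, and the labels "A" and "B" are hypothetical and chosen only for illustration; this sketch uses Euclidean distance, and another measure could be swapped in.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    scored = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep the labels of the k closest points, nearest first.
    scored.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in scored[:k]]
    # Majority vote; if counts tie, the label seen first (the nearer one) wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
train_X = [[1, 1], [1, 2], [2, 1], [6, 6], [7, 6], [6, 7]]
train_y = ["A", "A", "A", "B", "B", "B"]
label = knn_predict(train_X, train_y, [1.5, 1.5], k=3)
```

The choice of K matters: a small K makes the prediction sensitive to individual noisy points, while a large K can let a frequent class outvote the truly nearby points. An odd K is a common way to reduce voting ties in two-class problems.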