Notes for Week 1 - Intro + Linear Regression

Status Last Update Fields
Published 10/15/2024 [Image-occlusion card - source image not included in this export]
Published 10/15/2024 Supervised Learning requires the following: {{c1::Labeled Data}} and {{c2::Direct Feedback}}, in order to {{c3::predict an outcome/future}}.
Published 10/15/2024 Unsupervised Learning has {{c1::no labels/targets}} and {{c2::no feedback}}, and is used to {{c3::find hidden structures in data}}.
Published 10/15/2024 [Three image-occlusion cards - source images not included in this export]
Published 10/15/2024 {{c1::Machine learning (ML)}} is a branch of artificial intelligence that focuses on using data and algorithms to mimic human learning and gradually improve accuracy.
Published 10/15/2024 {{c1::Supervised learning}} is a machine learning task where the algorithm is trained on labeled data (input-output pairs).
Published 10/15/2024 The two common types of supervised learning models are {{c1::regression}} (predicting numerical values) and {{c2::classification}} (predicting categories).
Published 10/15/2024 The linear regression model in supervised learning is represented by the equation {{c1::\( f_{w,b}(x) = wx + b \)}}.
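A minimal sketch of this model in Python/NumPy (an assumption here, since the deck itself ships no code; the data and names below are hypothetical):

import numpy as np

def predict(x, w, b):
    # Univariate linear model: f_{w,b}(x) = w * x + b
    return w * x + b

x_train = np.array([1.0, 2.0, 3.0])    # hypothetical inputs
print(predict(x_train, w=2.0, b=0.5))  # -> [2.5 4.5 6.5]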
Published 10/15/2024 The {{c1::training set}} is the data used to train the machine learning model, consisting of input-output pairs.
Published 10/15/2024 The goal of regression is to minimize the difference between the predicted values \( \hat{y} \) and the actual values \( y \), often done using the {{c1::squared error cost function}}.
Published 10/15/2024 The {{c1::cost function}} for linear regression is the squared error cost function, represented as the equation {{c2::\( J(w, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)^2 \)}}.
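The same cost function written out in NumPy, keeping the 1/(2m) factor from the formula above (hypothetical data):

import numpy as np

def compute_cost(x, y, w, b):
    # J(w, b) = (1 / (2m)) * sum over i of (f_{w,b}(x_i) - y_i)^2
    m = x.shape[0]
    errors = (w * x + b) - y
    return np.sum(errors ** 2) / (2 * m)

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
print(compute_cost(x, y, w=2.0, b=0.0))  # 0.0 for a perfect fit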
Published 10/15/2024 The process of adjusting the model's parameters to minimize the cost function is called {{c1::model training}}.
Published 10/15/2024 The {{c1::gradient descent}} algorithm is used to iteratively update parameters \( w \) and \( b \) to minimize the cost function.
Published 10/15/2024 In gradient descent, the learning rate \( \alpha \) controls the size of the {{c1::steps}} taken during the update process.
Published 10/15/2024 The update rules for {{c1::gradient descent}} are \( w := w - \alpha \frac{\partial}{\partial w} J(w, b) \) and \( b := b - \alpha \frac{\partial}{\partial b} J(w, b) \).
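Both rules combined into a bare-bones training loop; note the gradients are computed before either parameter is updated, matching the simultaneous-update convention (hypothetical data and hyperparameters):

import numpy as np

def gradient_descent(x, y, w, b, alpha, num_iters):
    m = x.shape[0]
    for _ in range(num_iters):
        errors = (w * x + b) - y
        dj_dw = np.sum(errors * x) / m   # dJ/dw
        dj_db = np.sum(errors) / m       # dJ/db
        w -= alpha * dj_dw               # w := w - alpha * dJ/dw
        b -= alpha * dj_db               # b := b - alpha * dJ/db
    return w, b

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
print(gradient_descent(x, y, w=0.0, b=0.0, alpha=0.1, num_iters=1000))
# converges toward w close to 2, b close to 0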
Published 10/15/2024 The {{c1::contour plot}} is a 2D plot used to visualize the cost function's values for different values of \( w \) and \( b \).
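One way to draw such a contour plot, assuming matplotlib (hypothetical data; each ring connects (w, b) pairs with equal cost):

import numpy as np
import matplotlib.pyplot as plt

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
W, B = np.meshgrid(np.linspace(-1, 5, 100), np.linspace(-3, 3, 100))
J = sum((W * xi + B - yi) ** 2 for xi, yi in zip(x, y)) / (2 * len(x))
plt.contour(W, B, J, levels=30)
plt.xlabel("w"); plt.ylabel("b"); plt.title("J(w, b)")
plt.show()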
Published 10/15/2024 In machine learning, the term {{c1::overfitting}} refers to a model that performs well on the training data but poorly on new, unseen data.
Published 10/15/2024 The {{c1::test set}} is a separate set of data used to evaluate the performance of the trained model.
Published 10/15/2024 In linear regression, minimizing the cost function ensures that the model's predictions are {{c1::as close as possible}} to the actual target values.
Published 10/15/2024 The term \( J(w, b) \) represents the {{c1::cost function}} of a linear regression model.
Published 10/15/2024 In machine learning, a model is said to have reached {{c1::convergence}} when the updates made by gradient descent become very small.
Published 10/15/2024 The impact of the learning rate \( \alpha \): if it is too small, gradient descent will be slow; if too large, it may {{c1::overshoot}} the minimum.
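A self-contained sketch that makes both failure modes visible; the three alpha values and the data are hypothetical:

import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

for alpha in (0.001, 0.1, 1.5):  # too small / reasonable / too large
    w, b = 0.0, 0.0
    for _ in range(30):
        errors = (w * x + b) - y
        w, b = w - alpha * np.mean(errors * x), b - alpha * np.mean(errors)
    cost = np.mean((w * x + b - y) ** 2) / 2
    print(f"alpha={alpha}: cost after 30 steps = {cost:.3e}")
# alpha=0.001 is still far from the minimum, alpha=0.1 gets close,
# alpha=1.5 overshoots and the cost explodes.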
Published 10/15/2024 {{c1::Multiple linear regression}} extends the simple linear regression model to handle more than one input feature.
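With multiple features the model becomes a dot product, \( f_{w,b}(x) = w \cdot x + b \); a vectorized sketch (hypothetical shapes and data):

import numpy as np

def predict(X, w, b):
    # f_{w,b}(x) = w . x + b, for X of shape (m examples, n features)
    return X @ w + b

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
w = np.array([0.5, -1.0])    # one weight per feature
print(predict(X, w, b=2.0))  # -> [ 0.5 -0.5]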
Published 10/15/2024 The {{c1::normal equation}} is an alternative method to gradient descent for finding the parameters \( w \) and \( b \) in linear regression.
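A sketch of the closed-form route, solved here with np.linalg.lstsq rather than an explicit matrix inverse for numerical stability (hypothetical data):

import numpy as np

X = np.array([[1.0], [2.0], [3.0]])             # single feature
y = np.array([2.0, 4.0, 6.0])
Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # column of 1s so b becomes a weight
theta, *_ = np.linalg.lstsq(Xb, y, rcond=None)  # solves (X^T X) theta = X^T y
w, b = theta[0], theta[1]
print(w, b)  # -> ~2.0, ~0.0; no learning rate, no iterations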
Published 10/15/2024 When a model has high bias, it may perform poorly on both training and test data, indicating it has not learned the {{c1::underlying patterns}} in the data.
Published 10/15/2024 In contrast to high bias, a model with high variance may perform well on training data but poorly on {{c1::test data}}.
Published 10/15/2024 In linear regression, the best-fit line minimizes the sum of the {{c1::squared differences}} between predicted and actual values.
Published 10/15/2024 The {{c1::mean absolute error (MAE)}} is another performance metric that measures the average magnitude of errors in predictions.
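A one-line MAE for comparison with the squared error cost (hypothetical arrays):

import numpy as np

def mean_absolute_error(y_true, y_pred):
    # MAE = (1/m) * sum of |y_i - yhat_i|; less sensitive to outliers than squared error
    return np.mean(np.abs(y_true - y_pred))

print(mean_absolute_error(np.array([2.0, 4.0]), np.array([2.5, 3.0])))  # 0.75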
Published 10/15/2024 A {{c1::learning curve}} is a plot that shows the model's performance as the amount of training data increases.
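A hand-rolled learning-curve sketch on synthetic data, fitting on growing prefixes and scoring a held-out slice (everything below is hypothetical; np.polyfit and matplotlib are assumptions):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 2 * x + rng.normal(0, 1, 200)       # noisy linear data
x_val, y_val = x[160:], y[160:]         # held-out slice

sizes, val_err = [], []
for m in range(10, 161, 10):
    w, b = np.polyfit(x[:m], y[:m], 1)  # least-squares line on first m examples
    sizes.append(m)
    val_err.append(np.mean((w * x_val + b - y_val) ** 2))

plt.plot(sizes, val_err)
plt.xlabel("training set size"); plt.ylabel("validation error")
plt.show()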
Published 10/15/2024 A model with {{c1::high capacity}} has a large number of parameters and can fit more complex functions.
Published 10/15/2024 The {{c1::training loss}} measures the model's error on the training data, while the {{c2::validation loss}} measures its error on a separate validation set.
Published 10/15/2024 The {{c1::regularization}} technique helps to prevent overfitting by adding a penalty term to the cost function.
Published 10/15/2024 The two main types of regularization in linear regression are L1 regularization ({{c1::Lasso}}) and L2 regularization ({{c2::Ridge}}).
Published 10/15/2024 The penalty term added to the cost function in L2 regularization is proportional to the {{c1::square of the weights}}.
Published 10/15/2024 In {{c1::Lasso regression}}, the penalty term is proportional to the absolute value of the weights, promoting sparsity in the model.
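Both penalties written against the same squared error cost; the lambda/(2m) and lambda/m scalings below are common conventions but not the only ones, and the bias b is conventionally left unpenalized:

import numpy as np

def ridge_cost(X, y, w, b, lam):
    # squared error + L2 penalty: (lambda / (2m)) * sum(w_j^2)
    m = X.shape[0]
    mse = np.sum((X @ w + b - y) ** 2) / (2 * m)
    return mse + lam * np.sum(w ** 2) / (2 * m)

def lasso_cost(X, y, w, b, lam):
    # squared error + L1 penalty: (lambda / m) * sum(|w_j|); drives weights to exactly 0
    m = X.shape[0]
    mse = np.sum((X @ w + b - y) ** 2) / (2 * m)
    return mse + lam * np.sum(np.abs(w)) / m

X = np.array([[1.0, 2.0], [3.0, 4.0]])  # hypothetical data
y = np.array([1.0, 2.0])
w = np.array([0.3, -0.2])
print(ridge_cost(X, y, w, b=0.5, lam=1.0), lasso_cost(X, y, w, b=0.5, lam=1.0))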