IT/Machine Learning

Machine Learning Keywords

묵회 2019. 1. 24. 12:29

https://www.coursera.org/learn/machine-learning/resources/JXWWS


Supervised vs. Unsupervised Learning

Supervised

- data set with correct output

- categorized into "regression" and "classification"


Unsupervised

- with little or no idea what the result should look like

- clustering


Linear Regression

trying to fit the output onto a continuous expected result function


univariate linear regression: predict a single output y from a single input


Hypothesis Function

h_\theta(x) = \theta_0 + \theta_1 x


Cost Function

J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} (\hat{y}_i - y_i)^2 = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x_i) - y_i)^2
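The hypothesis and cost function above can be sketched in NumPy (the toy data and function names are my own, not from the course):

```python
import numpy as np

def h(theta0, theta1, x):
    """Hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def cost(theta0, theta1, x, y):
    """Squared-error cost J = (1/2m) * sum((h(x_i) - y_i)^2)."""
    m = len(y)
    return np.sum((h(theta0, theta1, x) - y) ** 2) / (2 * m)

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
print(cost(0.0, 1.0, x, y))   # 0.0, since theta0=0, theta1=1 fits y = x exactly
```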


Gradient Descent

repeat until convergence:

\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)

\theta_j := \theta_j - \alpha [\text{slope of tangent, i.e. the derivative in the } j \text{ dimension}]

repeat until convergence: {
\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x_i) - y_i)
\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} ((h_\theta(x_i) - y_i) x_i)
}
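A minimal sketch of this update loop, assuming a fixed learning rate alpha and a fixed iteration count instead of a true convergence test:

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, iters=1000):
    """Simultaneous update of theta0 and theta1 using the partial derivatives above."""
    m = len(y)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iters):
        err = theta0 + theta1 * x - y      # h_theta(x_i) - y_i for all i
        grad0 = err.sum() / m              # (1/m) * sum of errors
        grad1 = (err * x).sum() / m        # (1/m) * sum of errors * x_i
        theta0 -= alpha * grad0            # update both parameters together
        theta1 -= alpha * grad1
    return theta0, theta1

t0, t1 = gradient_descent(np.array([1.0, 2.0, 3.0]), np.array([2.0, 4.0, 6.0]))
# y = 2x exactly, so (t0, t1) converges toward (0, 2)
```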


Linear Regression with Multiple Variables

h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \cdots + \theta_n x_n

X = \begin{bmatrix} x^{(1)}_0 & x^{(1)}_1 \\ x^{(2)}_0 & x^{(2)}_1 \\ x^{(3)}_0 & x^{(3)}_1 \end{bmatrix}, \quad \theta = \begin{bmatrix} \theta_0 \\ \theta_1 \end{bmatrix}


h_\theta(X) = X\theta

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2


J(\theta) = \frac{1}{2m} (X\theta - y)^T (X\theta - y)


repeat until convergence: { \theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x^{(i)}_j } for j := 0..n

\theta := \theta - \alpha \nabla J(\theta)


\theta := \theta - \frac{\alpha}{m} X^T (X\theta - y)
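The vectorized update can be sketched as follows; the bias column of ones and the toy data are assumptions for illustration:

```python
import numpy as np

def gradient_descent_vec(X, y, alpha=0.1, iters=1000):
    """theta := theta - (alpha/m) * X^T (X theta - y); X already contains the x0 = 1 column."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        theta -= (alpha / m) * X.T @ (X @ theta - y)
    return theta

X = np.c_[np.ones(3), np.array([1.0, 2.0, 3.0])]   # prepend the bias column x0 = 1
y = np.array([2.0, 4.0, 6.0])
theta = gradient_descent_vec(X, y)                 # converges toward [0, 2]
```

One line replaces the whole per-parameter loop, which is why the vectorized form is preferred in practice.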


Feature Normalization

feature scaling and mean normalization


x_i := \frac{x_i - \mu_i}{s_i}

Where μ_i is the average of all the values for feature (i) and s_i is the range of values (max - min), or s_i is the standard deviation.
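A sketch of mean normalization, here choosing the standard deviation for s_i (the range max - min would work equally well):

```python
import numpy as np

def normalize(X):
    """Mean normalization: (x_i - mu_i) / s_i, with s_i = standard deviation of feature i."""
    mu = X.mean(axis=0)          # per-feature average
    s = X.std(axis=0)            # per-feature spread
    return (X - mu) / s, mu, s

X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0]])     # features on very different scales
Xn, mu, s = normalize(X)         # each column of Xn now has mean 0 and std 1
```

Keep mu and s around: the same shift and scale must be applied to any new example before predicting.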

Normal Equation

\theta = (X^T X)^{-1} X^T y
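In NumPy the normal equation can be sketched as below; solving the linear system is numerically preferable to forming the inverse explicitly (the toy data is an assumption for illustration):

```python
import numpy as np

X = np.c_[np.ones(3), np.array([1.0, 2.0, 3.0])]   # design matrix with bias column
y = np.array([2.0, 4.0, 6.0])

# Solve (X^T X) theta = X^T y rather than computing the inverse of X^T X.
theta = np.linalg.solve(X.T @ X, X.T @ y)          # approximately [0, 2]
```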


Gradient Descent              Normal Equation
Need to choose alpha          No need to choose alpha
Needs many iterations         No need to iterate
O(k n^2)                      O(n^3), need to calculate inverse of X^T X
Works well when n is large    Slow if n is very large

X^T X may be noninvertible. The common causes are:

  • Redundant features, where two features are very closely related (i.e. they are linearly dependent)
  • Too many features (e.g. m ≤ n). In this case, delete some features or use "regularization" (to be explained in a later lesson).
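When X^T X is singular, e.g. because of a redundant, linearly dependent feature as below, the pseudo-inverse still returns a least-squares solution:

```python
import numpy as np

# The third feature is exactly twice the second, so X^T X is singular (noninvertible).
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 4.0],
              [1.0, 3.0, 6.0]])
y = np.array([2.0, 4.0, 6.0])

theta = np.linalg.pinv(X.T @ X) @ X.T @ y   # pseudo-inverse still yields a usable theta
```

The fitted values X @ theta still match y here, even though no unique parameter vector exists.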


Logistic Regression


Decision Boundary


Regularization


Neural Network


Backpropagation Algorithm


Gradient Checking


Sigmoid Function


Train/Validation/Test set

- Cross Validation (CV)


Bias (underfitting) vs. Variance (overfitting)


Skewed Classes


Precision vs Recall

- True Positive, True Negative, False Positive, False Negative


F Score


Support Vector Machine (SVM)


Large Margin Classification


Kernels


Clustering


K-Means Algorithm


elbow method


Dimensionality Reduction


Principal Component Analysis (PCA)


covariance matrix


eigenvectors


Anomaly Detection


Gaussian Distribution


Recommender System


Content-Based Recommendations


Collaborative Filtering


Mean Normalization


Stochastic Gradient Descent


Mini-Batch Gradient Descent


Online Learning


Map Reduce