
classical-machine-learning-algorithms

Here I am trying to reproduce classic machine learning algorithms using numpy. The code is based on the materials of the course "Machine Learning Algorithms from scratch" on the Stepik platform and on the video lectures of the Yandex SHAD course "Machine Learning". However, the algorithms may not always be optimized, since my goal is to understand the idea and sketch an approximate implementation rather than to produce perfectly verified, sklearn-level solutions. All code contains comments and type annotations. A brief description of the algorithms is below:

Linear Models

  • Linear Regression
    • MSE loss function
    • available metrics: MAE, MSE, RMSE, MAPE, R2
    • available loss regularizations: Lasso, Ridge, ElasticNet
    • stochastic gradient descent with configurable batch sizes can be used
    • the learning step can be computed dynamically if you pass a callable to the learning_rate parameter, for example lambda iter: 0.5 * (0.85 ** iter) (see the sketch after this list)

  • Binary Linear Regression
    • Log loss function
    • available metrics: Accuracy, Precision, Recall, F1, ROC AUC
    • available loss regularizations: Lasso, Ridge, ElasticNet
    • stochastic gradient descent with configurable batch sizes can be used
    • the learning step can be computed dynamically if you pass a callable to the learning_rate parameter, for example lambda iter: 0.5 * (0.85 ** iter)
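To illustrate the dynamic learning rate idea, here is a minimal sketch of batch gradient descent on the MSE loss where learning_rate can be either a constant or a callable of the iteration number. The function fit_linear_regression below is a hypothetical helper written for this README, not the repository's actual class or API:

```python
import numpy as np

def fit_linear_regression(X, y, n_iter=100,
                          learning_rate=lambda it: 0.5 * (0.85 ** it)):
    # hypothetical helper, not the repo's API: plain gradient descent on MSE
    X = np.hstack([np.ones((X.shape[0], 1)), X])       # prepend a bias column
    w = np.zeros(X.shape[1])
    for it in range(1, n_iter + 1):
        grad = 2.0 / X.shape[0] * X.T @ (X @ w - y)    # gradient of MSE w.r.t. w
        lr = learning_rate(it) if callable(learning_rate) else learning_rate
        w -= lr * grad                                 # the step shrinks as iterations grow
    return w

# toy usage
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.7]) + 3.0
w = fit_linear_regression(X, y)                        # w[0] ≈ bias, w[1:] ≈ coefficients
```

A stochastic/mini-batch variant would only differ in computing the gradient on a random subset of rows at each iteration, and regularization (Lasso, Ridge, ElasticNet) would add the corresponding penalty term to the loss and its gradient.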

KNearestNeighbours

  • KNNClassification
    • available metrics/distances: euclidean, chebyshev, manhattan, cosine
    • available weights of the nearest neighbours: uniform (default, takes the mode of the neighbours' classes), rank, distance
    • you can also use predict_proba, which returns the confidence of the prediction based on the chosen method of weighting the neighbours (see the sketch after this list)

  • KNNRegression
    • available metrics/distances: euclidean, chebyshev, manhattan, cosine
    • available weights of the nearest neighbours: uniform (default, takes the mean of the neighbours' targets), rank, distance
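Below is a minimal sketch of how neighbour weighting can be turned into class probabilities. Again, knn_predict_proba is a hypothetical function written for illustration, not the repo's KNNClassification API:

```python
import numpy as np

def knn_predict_proba(X_train, y_train, x, k=5, weights="distance"):
    # hypothetical helper: class probabilities for a single sample x
    dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))   # euclidean distances to all train points
    idx = np.argsort(dists)[:k]                         # indices of the k nearest neighbours
    if weights == "uniform":
        w = np.ones(k)
    elif weights == "rank":
        w = 1.0 / np.arange(1, k + 1)                   # weight by neighbour rank
    else:  # "distance"
        w = 1.0 / (dists[idx] + 1e-12)                  # weight by inverse distance
    classes = np.unique(y_train)
    scores = np.array([w[y_train[idx] == c].sum() for c in classes])
    return classes, scores / scores.sum()               # normalised per-class scores

# toy usage
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [0.2, 0.1], [3.0, 3.0], [2.8, 3.1]])
y_train = np.array([0, 0, 0, 1, 1])
classes, proba = knn_predict_proba(X_train, y_train, np.array([2.5, 2.5]), k=3)
```

For regression the same weights would be used to compute a weighted mean of the neighbours' targets instead of per-class scores.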
