The applied math most directly useful for machine learning is:
- Linear algebra / Matrix Algebra (See How do I learn linear algebra? and How do I learn matrix algebra?)
- Probability Theory (See How do I learn probability?)
- Statistics (See How do I learn statistics for data science? and What statistics book do you recommend to a wannabe data scientist who is familiar with basic statistics and mathematics?)
- Optimization (See How do I learn optimization?)
If you're interested in an accessible introduction to matrix algebra, Coursera is running a course on it right now: Coding the Matrix: Linear Algebra through Computer Science Applications
going through my Machine Learning course last semester, I felt like I
had the most catching up to do with Linear Algebra. I felt key ideas
from LinAlg are harder to remember over time than Probability. I found
myself to be mostly working with probability distributions, Bayes' rule,
MLEs and MAPs, while the algebra side of it was mostly optimization in
higher dimensions, which was mostly matrix calculus.
I discovered that
the Matrix Cookbook was popular with most students for working with
Matrix Calculus as it seems to have a never-ending list of matrix
derivatives:
http://www2.imm.dtu.dk/pubdb/vie...
As far as brushing up on the rest of your Linear Algebra knowledge is concerned, I highly recommend Strang's lectures/book:
http://ocw.mit.edu/courses/mathe...
Highly
relevant topics include knowing about rank and inversion, SVD, and also
make sure you're very comfortable with eigenvalues and eigenvectors,
amongst other things.
Finally, with Analysis, I don't think ML
requires a formal introduction to Analysis at all. It's important to know
higher dimensional calculus well, especially parts related to
optimization, such as Lagrange multipliers, the primal-dual form, and in
general, the calculus of Matrices, and you should be good to go.
Overall,
I think the case with Linear Algebra and Calculus is to work your way
through an ML book/course, and stop and look at the relevant math when
necessary, whereas you need a strong foundation in Probability right
from the beginning, and most textbooks on ML tend to talk a lot about
probability while skimming over the mathematical details of LinAlg and
Calculus.
Let me first caveat what I’m about to say with this: go to graduate school. †
To show you just how super-serious I am about this, I’m even going to
separate this caveat from the rest of the answer with one of the
ultra-cool line breaks.
Alright,
at this point, I’m assuming that you are still solely considering
graduate school preparation without an undergraduate education. Let’s
go.
My background consists of an undergraduate
BS in mathematics, a minor in physics, and a few years of research
experience that has spanned from charged particle detectors (physics/EE)
to autonomous vehicle system design for collision detection and
evasion. Long story short: I’m far more qualified to answer your question when robotics is emphasized, so that’s what I’m going to do.
Robotics is Multi-Disciplinary
Robotics
is a highly multi-disciplinary field. In fact, I’d argue that it could
well be the academic field which encompasses the largest quantity of
distinct domains into its core structure. When we’re talking about
robotics, we’re really talking about
- Computer science
- Mathematics
- Computer engineering
- Electrical engineering
- Control engineering
- Systems engineering
- Mechanical engineering
- Physics (mechanics, more specifically)
What’s
even more impressive about the above list than its size is the depth of
each field. Aside from control and systems engineering, which are a bit
more specialized and less fundamental than the others, each of the
above domains is extremely broad, indicating that if you were to break
down robotics concepts into a networked graph, it would resemble
something like this:
[1]
Needless
to say, roboticists ultimately specialize in a much narrower range so
that expertise in a topic can be attained. But that doesn’t change the
fact that to pursue robotics, high breadth and versatility in
engineering and math is a tool whose utility can’t be overstated.
Specific Areas of Research
Now,
regardless of whether you want to pursue a masters or a Ph.D., you will
ultimately have to carve out a niche for yourself. As I mentioned
above, mastery of all robotics is a hopelessly daunting task; it’s
impossible. Therefore, it’s important that you expose yourself to the
different areas of robotics, and gradually home in on your desired path
according to the topics in which you’re interested and at which you’re
talented.
Here’s my breakdown of robotics
research, in increasing order of mathematical abstraction and decreasing
order of hands-on engineering and building:
- Sensors. About
as applied and hands-on as you can get, the domain of sensors works on
expanding the current technical constraints that robotics hardware
faces. It’s because of these guys that the iPhone magically gets smaller
and smaller every year, while also increasing its technological
capacities. An example of the importance of this domain which is even
more specific to robotics is radar-evading drones. Remember when Osama
Bin Laden got taken out because we flew a helicopter in Afghanistan that
magically evades radars? Thanks, sensors.
- Nano-robotics. Focusing
on developing robotic systems on the micro-level, nano-robotics
explores how robotic agents can be built and implemented on a scale
sufficiently small that they can be directly inserted into your body.
Sound scary? It shouldn’t. Nano-robotics has a plethora of game-changing
medical applications, some of which include legitimately curing cancer
and preventing aging.
- Machine vision. While the ability
to process and interpret visual information comes very intuitively to
humans, translating our abilities to an algorithmic environment in this
matter has proven to be an intimidating process. In fact, I’d argue that
the largest obstacle facing self-driving cars is machine vision. Just
take a look at the self-driving car expert at Tesla who died because his
car failed to distinguish between the bright sky and an incoming white
truck. [2]
- Robotic learning. When machine learning is
applied in a robotic context, it basically becomes robotic learning.
Robotic learning is the overlap between robotics and machine learning;
it approaches the problem of developing tools for adaptation and
learning in robotic systems. Very cool field, with a lot of promising
application, and very well suited for someone interested in machine
learning and robotics.
- Robotic control. This is the area
in which I’m currently nested. Control represents a mathematical
approach to modeling the behavior and evolution of a dynamical system
in relation to inputs, which can be used to affect the system’s output.
The goal here is to mathematically demonstrate that a certain approach
for input selection guarantees that the system’s output will quickly
converge to a stabilized desired range, as illustrated in this kick-a**
picture. [3]
Because
you have stated that robotics and machine learning are your interests,
I’m going to assume your interests align with the #3–5 end of the
spectrum. But even when your interests are honed in on these two areas,
there is still a massive range of topics and skill sets spanned by these
two very broad domains.
Developing Skills for Robotic Learning
Again,
I’m far from an expert in robotic learning and machine learning, but
I’ll do my best to show some helpful tips for pursuing this domain. The
fundamental fields from which machine learning constantly draws, as I
understand it, are the following:
- Probability
- Statistics
- Algorithms
- Optimization
- Systems
The
last one is a bit more of a stretch in comparison to the others, but
I’ve heard that a high portion of machine learning can actually be
approached from a systems perspective, and that its inception actually
arose from system theory modeling.
For probability and statistics,
both intuition and rigorous technicality will be important. I had a
horrible textbook which provided very little conceptual basis for the
theorems, and mostly included a bunch of isolated problems which were
crudely connected in a very disjointed way. I recommend Introduction to Probability by
Grinstead and Snell, [4] which provides a lot of clear,
well-articulated conceptual explanations which enhance both intuition
and precise reasoning on the subject. It’s also free and available
online, which ya’ know, is always a big plus.
Becoming comfortable with algorithms is
a task which can more easily be achieved in a college setting, but one
which is also very feasible to execute independently. Regarding a
textbook to guide you through key concepts to algorithm theory, I
recommend looking no further than the classic Introduction to Algorithms by Cormen, Leiserson, Rivest and Stein. [5]
Additionally, I would look to two additional sources to continually expand algorithmic skills: Project Euler and Topcoder.
Project Euler encompasses a diverse range of mathematical problems for
algorithmic development which will strengthen your mathematical
algorithmic thinking and your “out of the box” creativity. Topcoder
provides challenges which will improve your technical programming
skills, and diversify and expand your problem solving breadth.
Of course, once you have a solid background in the above topics, you’ll want to receive a comprehensive introduction to robot learning, for which I’ve been told that Robot Learning by Connell and Mahadevan is a solid choice. [6]
Although
robotic learning and robotic control are distinct domains, robotic
learning is intrinsically tied in to concepts from control theory. In
fact, one of the most challenging problems facing the robotic learning
community is that it lacks the rigorous analysis and descriptions that
the control and systems theories possess.
For
example, a self-driving car that implements a series of clever robotic
learning algorithms will never be implemented without tools from control
systems. Why? Because without tools from control and systems theory,
you will never get close to demonstrating rigorous, mathematically
demanding qualities such as robustness, safety guarantees, stability,
etc., without which, the government wouldn’t let your self-driving car
see the light of day.
Robotic Control
I think that optimization, control, and systems are all presented and integrated very concisely in Design of Optimal Control Systems by
Bini. [7] This book covers more than the minimum knowledge of these
topics that machine learning requires. But a deep
understanding of at least some of the ideas shown in this book will
allow for insights to be drawn between these domains which most others
will likely not be capable of seeing.
Note that
I recommend the above for someone interested in both machine learning
and robotic control. If you’re primarily interested in robotic control,
then your mathematical skills need to be more sophisticated than the
vast majority of other engineers. This is likely the only engineering
discipline in which highly abstract mathematical fields play a
fundamental role. They include
- Real analysis
- Systems of Differential Equations
- Dynamical Systems (similar to Systems of Differential Equations, but distinct from it)
- Advanced Linear Algebra
- Advanced Optimization
- Basic Topology
- Set Theory (more than the basics, but not quite “advanced” set theory)
Clearly,
your mathematical skills have to be beyond the more applied end of the
spectrum in which things like formalities, proofs, theorems, and rigor
are almost never relevant.
For a comprehensive introduction to real analysis and topology that isn’t esoteric (difficult to find), I recommend Basic Analysis by
Lebl. [8] While the book isn’t intended for studying topology
specifically, it covers nearly all of the fundamentals which are
relevant to control. Note that real analysis is the most important item
in the above list.
Advanced Linear algebra is
the most difficult field for which to find an accessible, engaging
textbook, IMO. The majority of the texts are far too focused on
minute, irrelevant details and burdensome proofs whose understanding
gains little insight regarding the deeper concepts. More importantly,
most textbooks totally fail to connect the ideas to deeper concepts
which are both cool and incredibly useful. After a lot of searching, I
found hope in an unexpected place: online lecture notes. [9] If you
master this book, and its difficult problems, to the point where you can
comfortably walk through the main concepts with a high school student,
then you’ll be five steps ahead of me.
As for dynamical systems, I’d say that Dynamical Systems by Sternberg does the trick. [10] Until you get to the more theoretical content like stability and invariance,
you really want to focus more on the concepts; the details aren’t
particularly important, surprisingly. You really just need to know what
kind of assumptions you have to make about the system you’re modeling.
Once you’re comfortable with most of the above, you can get your hands dirty with some actual control theory. For this, I recommend Mathematical Control Theory by Sontag. [11]
†:
I have a hunch that’s not what you want to hear, since you didn’t ask
for advice regarding this matter. So I’m sorry if this caveat irks you
in any way, but it’s the best advice I can give, and I think it’s
important for you to hear.
I’m a firm believer
in pragmatic optimism, and while it’s optimistic to believe that
admittance into graduate school—especially in a technical field—is
feasible without an undergraduate degree, it is far from pragmatic.
Without an undergraduate degree, you are immediately excluded from
consideration for all departments at the majority of universities.
I
can’t find any specific statistics on this matter, so you’ll have to
choose whether or not to take my word for it. But trust me when I say
that I can currently think of one graduate school that doesn’t
necessitate an undergraduate degree as a strict requirement.
Even
putting the strict requirements aside, for deeply embedded
multidisciplinary fields like robotics and machine learning, an
undergraduate education is crucial. Although I do think that the ability
to interact with professors; learn with faculty and peers in person;
and receive a curriculum designed by experts on which you are tested in a
competitive environment are all vital assets for initiating the
engineering experience in any field, they are especially true for
robotics.
Another important distinction regarding your question is whether you are planning for a master's or a Ph.D.
[1] Pawel Pralat: Graph Theory
[2] Tesla driver killed while using autopilot was watching Harry Potter, witness says
[3] Vehicle stability control systems: An overview of the integrated ...
[4] https://www.dartmouth.edu/~chanc...
[6] Robot Learning | J. H. Connell | Springer
[7] http://retis.sssup.it/~bini/math...
[8] http://www.jirka.org/ra/realanal...
[9] https://www.math.uh.edu/~climenh...
[10] Dynamical Systems (Dover Books on Mathematics): Shlomo Sternberg: 9780486477053: Amazon.com: Books
[11] http://www.mit.edu/~esontag/FTPD...
- Convex Optimization (Convex Optimization - Boyd and Vandenberghe)
- Linear algebra
- Some rudimentary Calculus (especially use of the Lagrangian)
- Lots of Probability and Statistics
http://courses.washington.edu/cs...
A couple of years ago, based on his experience, Bradford Cross
gave a comprehensive list of the best resources on machine learning and
the prerequisites in his blog ("Measuring measures"). Unfortunately, it
appears to be down right now.
UPD:
Here is the blog post at the Web Archive mirror: http://web.archive.org/web/20101...
Bradford's lists at Amazon:
- Analysis [1]
- Linear Algebra [2]
- Probability [3]
- Statistics [4, 5]
- Optimization [6]
- Machine learning [7]
- Feature Selection [8]
I hope Mr. Cross will be able to join the discussion.
[1] http://www.amazon.com/Analysis/l...
[2] http://www.amazon.com/Matrix-Fu/...
[3] http://www.amazon.com/Probabilit...
[4] http://www.amazon.com/Statistics...
[5] http://www.amazon.com/Nonparamet...
[6] http://www.amazon.com/Heuristic-...
[7] http://www.amazon.com/Machine-Le...
[8] http://www.amazon.com/Feature-Se...
UPD 2:
Here
is the list of must-read books for theoretical machine learning [1],
which is attributed to prof. Michael Jordan (UC Berkeley). The sources
are [2] and [3].
[1] https://www.goodreads.com/review...
[2] Learning About Statistical Learning
[3] AMA: Michael I Jordan • /r/MachineLearning
These
course materials are old, by the way. The good news is that you can find the
book (a compilation of all the materials) easily by searching. If I am not
wrong, the last revised version of this book is dated 6th May, 2012.
You need linear algebra as well. For this I recommend Gilbert Strang's "Linear Algebra and Its Applications". It may be a little bit tough, but it is a great book.
If you want to dive into the probabilistic approach, you can enroll in the Probabilistic Graphical Models course: https://www.coursera.org/course/pgm. I heard that it is a very good course. The textbook for that course looks very useful: http://www.amazon.com/Probabilis...
The current machine learning (ML) algorithms are based upon mapping functions.
$$F: X \to Y$$
The function $F$ can be anything such as a support vector machine (SVM), a restricted
Boltzmann machine (RBM), a deep neural network (DNN) or anything else
that you can hand engineer yourself. In application areas, $X$ represents the input space while $Y$
represents the output space.
In speech recognition $X$ might be a set of spectrograms while $Y$ is a set of identities representing the speakers. In image recognition, $X$ is the raw image pixel space while $Y$ is the categorization consisting of the different classes into which $x_i \in X$ can fall.
Each ML model has parameters $w$ that affect the behavior of $F$ and that we can normally adjust in order to change the behavior of that
function. We can thus write the mapping more conveniently as:
$$\hat{y}_i = f(x_i, w)$$
where $\hat{y}_i \in Y$.
We will focus on supervised ML models, where we have a dataset $T$ of training input-output pairs in the form:
$$T = [(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)]$$
The goal of supervised machine learning is to find the best parameter values $\hat{w}$ that make the function $F$ map the input-output pairs with the least error. So in supervised ML we have three main issues:
- A fitness measure that tells us how well the ML model is performing on the training set $T$.
- Generalization: We can run the same fitness measure on the test set after training is complete in order to measure how well the model generalizes to novel inputs. This is a very important concept in modern ML.
- A learning algorithm to update the weights, $w \to \hat{w}$.
This is where the maths comes in: to understand the underlying maths concepts,
you first need to understand what ML is trying to solve.
The aim here is to find solutions to the three issues mentioned above, and
maths can help us with that.
1: A fitness measure:
This is normally done by an objective function, also known as the loss/cost function:
$$L(\hat{y}_i, y_i)$$
where $\hat{y}_i$ = actual output and $y_i$ = desired output.
In empirical risk minimization[1] (ERM) the goal is to minimize the overall loss as defined by the risk $R$:
$$R_{emp}(w) = \frac{1}{N} \sum_{i=1}^{N} L(f(x_i, w), y_i)$$
ERM states that the learning algorithm should choose the hypothesis function $\hat{f}$ such that the empirical risk is minimized. In simple mathematical terms we need to solve:
$$\hat{w} = \arg\min_{w} R_{emp}(w)$$
where $\hat{f} = f(x, \hat{w})$.
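To make the ERM recipe concrete, here is a minimal sketch in plain Python. The one-parameter model `f(x, w) = w * x`, the squared-error loss, the toy training set and the grid search standing in for the arg min are all assumptions chosen for illustration; none of them come from the answer above.

```python
# Hypothetical toy setup: a one-parameter linear model and squared-error loss.

def f(x, w):
    """Model prediction y_hat = f(x, w)."""
    return w * x

def loss(y_hat, y):
    """Squared-error loss L(y_hat, y)."""
    return (y_hat - y) ** 2

def empirical_risk(w, data):
    """R_emp(w) = (1/N) * sum_i L(f(x_i, w), y_i)."""
    return sum(loss(f(x, w), y) for x, y in data) / len(data)

# Training set T of (x_i, y_i) pairs, generated here by y = 2x.
T = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# Crude stand-in for the arg min: evaluate the risk on a grid of candidates.
candidates = [i / 10 for i in range(0, 41)]  # 0.0, 0.1, ..., 4.0
w_hat = min(candidates, key=lambda w: empirical_risk(w, T))

print(w_hat, empirical_risk(w_hat, T))
```

A grid search only works for a handful of parameters; it is used here purely so that the definition of the empirical risk and of its minimizer can be read off directly.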
2: Generalization:
The above naive ERM can result in the function $\hat{f}$ just memorizing the training examples, which can cause what is called overfitting, that is, fitting the function $F$ to each and every noisy/outlier data point. That is not ideal, so instead we normally use structural risk minimization[2] (SRM), whereby we add a regularization term $C(w)$, weighted by a coefficient $\lambda$, to the risk. Thus we get the regularized risk:
$$R_{stru}(w) = \frac{1}{N} \sum_{i=1}^{N} L(f(x_i, w), y_i) + \lambda C(w)$$
$$R_{stru}(w) = R_{emp}(w) + \lambda C(w)$$
Then in SRM we need to solve:
$$\hat{w} = \arg\min_{w} R_{stru}(w)$$
Regularization simplifies the weight parameters so that they don't model too
much of the outliers or noise. That is done by penalizing large weight
values in $w$, which are a cause of most overfitting issues. Thus the $L_0$ norm can be used in order to favor a very sparse set of weights whereby most weight values are zero. You can also use $L_1$ or $L_2$ regularization instead, as the $L_0$
norm is hard to optimize. Other weird regularization methods have since
popped up, such as dropout, which is used in learning algorithms for
DNNs: neurons are randomly dropped out and restored during training
so that the overall network becomes robust to noise. Dropout can be
loosely seen as an ensemble method.
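The effect of the penalty term can be seen in a small sketch. It assumes a one-parameter linear model, squared-error loss, an L2 penalty `C(w) = w**2` and `lam = 1.0`; these are illustrative choices only, and the grid search is again a crude stand-in for a real optimizer.

```python
# Hypothetical toy setup: one-parameter linear model, squared-error loss,
# and an L2 penalty C(w) = w^2.

def empirical_risk(w, data):
    """R_emp(w) for the model f(x, w) = w * x with squared-error loss."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def penalty(w):
    """Regularization term C(w): here the squared L2 norm of a scalar weight."""
    return w ** 2

def structural_risk(w, data, lam):
    """R_stru(w) = R_emp(w) + lam * C(w)."""
    return empirical_risk(w, data) + lam * penalty(w)

T = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
candidates = [i / 100 for i in range(0, 401)]  # 0.00, 0.01, ..., 4.00

w_erm = min(candidates, key=lambda w: empirical_risk(w, T))
w_srm = min(candidates, key=lambda w: structural_risk(w, T, lam=1.0))

print(w_erm, w_srm)  # the regularized weight is pulled towards zero
```

With these choices the regularized minimizer is smaller in magnitude than the unregularized one, which is exactly the "penalize large weights" behaviour described above.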
3: A learning algorithm:
Learning in current ML can be viewed as a way to update the weights in order to
find the optimal parameters. ERM and SRM both rely on the
existence of a learning algorithm for weight adjustment. We need an
algorithm to find the weights that solve
$$\hat{w} = \arg\min_{w} R_{emp}(w)$$
or
$$\hat{w} = \arg\min_{w} R_{stru}(w)$$
We need a way to update the model such that
$$\hat{w} \leftarrow w$$
In current ML systems we just look to the old idea of gradient descent (GD)
from numerical optimization. In GD we simply move down the
steepest slope on the error (risk) surface defined by the risk $R$. That means we can just use the update rule defined by
$$w_{t+1} = w_t - \alpha \frac{\partial R}{\partial w_t}$$
where $t$ = step count and $\alpha$ = learning rate.
Here we assume a convex surface defined by $R$. In practice, especially for DNNs, the surface is highly non-convex,
but almost any local minimum is good enough, and we can
add momentum to the update rule so that it can escape from local
minima more easily. Also, the sheer number of parameters makes it harder
for the DNN model to get trapped in a local minimum, as there are
many possible escape routes through the many other dimensions.
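The update rule above can be sketched in a few lines. This assumes the same hypothetical one-parameter model `f(x, w) = w * x` with squared-error loss, so the risk is convex and its derivative can be written by hand; the learning rate and step count are arbitrary illustrative values.

```python
# Hypothetical toy setup: R(w) = (1/N) * sum_i (w * x_i - y_i)^2.

def risk_gradient(w, data):
    """dR/dw = (1/N) * sum_i 2 * (w * x_i - y_i) * x_i."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

def gradient_descent(w0, data, alpha, steps):
    """Repeatedly apply w_{t+1} = w_t - alpha * dR/dw_t."""
    w = w0
    for _ in range(steps):
        w = w - alpha * risk_gradient(w, data)
    return w

T = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by y = 2x
w_hat = gradient_descent(w0=0.0, data=T, alpha=0.05, steps=100)

print(w_hat)  # approaches the minimizer w = 2
```

If `alpha` is made too large for this data the iterates overshoot and diverge, which is why the learning rate is usually the first thing to tune.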
In DNNs the gradient computations can become cumbersome even for a modern machine, as the number of gradient steps needed to reach $\hat{w}$
is normally large. Thus we need fast ways to accelerate gradient
computations for layered architectures. The backpropagation (backprop)
algorithm is a way of computing gradients extremely
efficiently in any differentiable computational graph. Backprop uses
the chain rule: it starts from the output layer, which is directly connected
to the loss function and whose derivatives are therefore easier to evaluate, and
then moves towards the layers far away from the output
layer (ending at the input layer) while chaining the derivatives. It is called backprop because
errors are passed from the back layers towards the front layers, which
saves a lot of repeated computation.
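As a sketch of the chain-rule bookkeeping, consider a deliberately tiny network: one scalar input, one tanh hidden unit and one linear output, with loss 0.5 * (y_hat - y)^2. The architecture, the weight names `w1` and `w2`, and the numbers are assumptions made for illustration. The hand-derived backward pass is compared against a finite-difference estimate.

```python
import math

def forward(x, w1, w2):
    """Forward pass: hidden activation h, then output y_hat."""
    h = math.tanh(w1 * x)
    y_hat = w2 * h
    return h, y_hat

def loss(y_hat, y):
    return 0.5 * (y_hat - y) ** 2

def backprop(x, y, w1, w2):
    """Chain derivatives from the loss back towards the input layer."""
    h, y_hat = forward(x, w1, w2)
    d_yhat = y_hat - y               # dL/dy_hat, at the output
    d_w2 = d_yhat * h                # dL/dw2
    d_h = d_yhat * w2                # dL/dh, reusing d_yhat
    d_w1 = d_h * (1 - h ** 2) * x    # dL/dw1, since tanh'(z) = 1 - tanh(z)^2
    return d_w1, d_w2

def numeric_grad(x, y, w1, w2, eps=1e-6):
    """Central finite differences, used only to check the analytic result."""
    def total_loss(a, b):
        return loss(forward(x, a, b)[1], y)
    g1 = (total_loss(w1 + eps, w2) - total_loss(w1 - eps, w2)) / (2 * eps)
    g2 = (total_loss(w1, w2 + eps) - total_loss(w1, w2 - eps)) / (2 * eps)
    return g1, g2

x, y, w1, w2 = 0.5, 1.0, 0.3, -0.7
analytic = backprop(x, y, w1, w2)
numeric = numeric_grad(x, y, w1, w2)
print(analytic, numeric)
```

Note how `d_yhat` is computed once and reused for both weights; in a deeper network that reuse is where the savings over naive differentiation come from.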
GD
requires that all training pairs are considered before taking a single
small update step, which is not scalable. Thus in practice we have the so-called
stochastic gradient descent (SGD), which takes a step after just
one example. This is so efficient that it is normally the standard
learning algorithm for DNNs, together with backprop. There are batch
variants of SGD which you can consider as being in between SGD and GD:
the batch gradient descent approach (often called mini-batch gradient descent) uses a small random set of training examples, known as the
batch, to approximate the gradient
via the backprop algorithm. Thus SGD can be seen as the batch variant
with just 1 example in the batch.
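The relationship between GD, SGD and the batch variant can be shown with a single routine whose `batch_size` argument selects the behaviour. This is a sketch under assumptions: a one-parameter linear model with squared-error loss, noise-free toy data, and hypothetical values for the learning rate, epoch count and seed.

```python
import random

def batch_gradient(w, batch):
    """Gradient of the mean squared error of f(x, w) = w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def minibatch_sgd(data, w0, alpha, batch_size, epochs, seed=0):
    """batch_size=1 gives SGD; batch_size=len(data) gives full GD."""
    rng = random.Random(seed)
    data = list(data)
    w = w0
    for _ in range(epochs):
        rng.shuffle(data)  # visit the examples in a new random order
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            w -= alpha * batch_gradient(w, batch)
    return w

T = [(float(x), 2.0 * x) for x in range(1, 9)]  # generated by y = 2x

w_sgd = minibatch_sgd(T, 0.0, alpha=0.005, batch_size=1, epochs=200)
w_mini = minibatch_sgd(T, 0.0, alpha=0.005, batch_size=4, epochs=200)
w_full = minibatch_sgd(T, 0.0, alpha=0.005, batch_size=len(T), epochs=200)
print(w_sgd, w_mini, w_full)
```

On this noise-free data all three settings reach the same weight; on real data the smaller batches give noisier but much cheaper steps.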
So
to learn the maths theory behind ML start from the underlying goals of
ML which we have looked at in this discussion. Of course this was just the
tip of the iceberg, but the best way to see most ML models is that they
are function approximators and we wish to recover those approximations
from input-output training pairs alone, which we call end-to-end
learning.
It also helps to visualize ML as just
optimization theory. We have a loss function and all we need is an
algorithm that helps us find the right settings such that the loss is
minimized. In practice SGD+backprop works very well for training modern
ML models.
You also need to try to implement
some of these algorithms yourself from scratch. Try to implement
backprop and SGD for a multi-layer neural network (NN), not a deep one for now,
then try it on the MNIST dataset. You can only learn via practice. Before
implementation, make sure you go through backprop and derive it for
multi-layer NNs and convolutional neural networks (convNet).
Don't
be too much in a hurry though, concepts take time to make sense. In
order to help yourself assimilate the stuff a bit easily, solve some
problems and try to also explain the systems to others via platforms
like Quora, that way you will start to have more and more confidence in
your understanding of the maths behind ML algorithms.
Hope this helps.
Footnotes
[1] Empirical risk minimization - Wikipedia
[2] Structural risk minimization - Wikipedia
Some people say that mathematics is useless for a software engineer; machine learning proves them wrong.
Mathematics
is the prerequisite for machine learning because machine learning is
math. The computer is only useful for doing the calculations.
You'll mainly need to learn calculus, matrix calculation, linear and non-linear algebra, statistics and graph calculus.
Let's take a basic ML algorithm, the linear regression.
The
goal is to use some data to find a function which takes parameters and
gives an output. Data are used to find the function and test it. In the
future, we will use the function with some parameters and we will obtain
an approximate output.
Let's say our data are
about planes, as input we have the number of miles travelled by the
plane and its age. As output we have the price of the plane. I don't
normalize data to keep things simple.
A sample of our data could be :
miles;age;price
120000;12;120000
48000;4;1500000
...
Our question is : Given the miles travelled by a plane and its age, give a price.
Using
linear regression (gradient descent) we will find a vector theta. This
vector has two values, theta[0] and theta[1]. To find an approximate
price we will multiply the miles by theta[0] and the age by theta[1] to
obtain a result, which is an approximate price.
For
instance our algorithm could find theta = [2;-10 000] and if we have a
plane 5 years old with 78 000 miles, we can then approximate the price
by computing 78 000 * 2 + 5 * -10 000, which gives 106 000 dollars.
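The arithmetic in this example is just a dot product between theta and the inputs. A minimal sketch, using the theta values and the plane from the text (the function name `predict_price` is a hypothetical helper):

```python
# theta[0] multiplies the miles, theta[1] multiplies the age (values from the text).
theta = [2, -10_000]

def predict_price(miles, age, theta):
    """Approximate price = miles * theta[0] + age * theta[1]."""
    return miles * theta[0] + age * theta[1]

price = predict_price(miles=78_000, age=5, theta=theta)
print(price)  # 106000
```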
The hard part is to find the good values for theta. To do that you need some maths.
You
have a cost function that tells you how good your theta is. This cost
function tests your theta values using your data (which already has a
price for each plane given its miles and age).
So your goal is to minimize the cost function by adjusting your theta values.
The cost function to minimize is this one :
where
Using the batch gradient descent algorithm each iteration adjusts the theta values using this formula :
then
you test the theta values with the previous function J(theta) and you'll
see that the cost (i.e. the difference between the predicted value and the
real one) will decrease at each iteration.
As you can see this simple ML algorithm is math. The computer will be useful to compute the previous formulas.
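Here is a rough sketch of what such a batch gradient descent loop could look like. It assumes the common least-squares cost J(theta) = 1/(2m) * sum((theta . x - y)^2) and the update theta_j = theta_j - alpha * dJ/dtheta_j; the answer's own formulas are not reproduced here, so this form is an assumption. Unlike the text, the made-up data is rescaled (miles in units of 100 000, age in decades, price in units of 100 000 dollars) so that plain gradient descent converges with a simple learning rate, and the prices are generated from a known theta so the result can be checked.

```python
def predict(theta, row):
    """Hypothesis: dot product of theta and the input row."""
    return sum(t * x for t, x in zip(theta, row))

def cost(theta, X, y):
    """Assumed cost J(theta) = 1/(2m) * sum((prediction - target)^2)."""
    m = len(X)
    return sum((predict(theta, r) - t) ** 2 for r, t in zip(X, y)) / (2 * m)

def gradient_step(theta, X, y, alpha):
    """One batch update: every training row contributes to the gradient."""
    m = len(X)
    errors = [predict(theta, r) - t for r, t in zip(X, y)]
    grad = [sum(e * r[j] for e, r in zip(errors, X)) / m
            for j in range(len(theta))]
    return [t - alpha * g for t, g in zip(theta, grad)]

# Made-up, rescaled planes: [miles / 100 000, age / 10].
X = [[0.78, 0.5], [1.2, 1.2], [0.48, 0.4], [2.0, 0.8]]
theta_true = [2.0, -1.0]  # the text's [2, -10 000] expressed in these units
y = [predict(theta_true, r) for r in X]  # prices in units of 100 000 dollars

theta = [0.0, 0.0]
costs = [cost(theta, X, y)]
for _ in range(2000):
    theta = gradient_step(theta, X, y, alpha=0.5)
    costs.append(cost(theta, X, y))

print(theta, costs[0], costs[-1])
```

Recording `costs` makes it easy to confirm the claim that the cost goes down from one iteration to the next; with unscaled miles in the tens of thousands, a learning rate this large would diverge.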
is too vast a subject to be considered for this question. The breadth
and depth of mathematical awareness you require for machine learning
totally depends on what you are learning in the subject. Keeping this in
mind, let's deal with what you need to know in "mathematics" for
machine learning.
1. Probability and mathematical statistics
This is a fundamental requirement for machine learning, so you need
to know it well. When I say probability, it's more than what you studied in
high school and almost everything you probably did not pay attention to
during your undergrad. You need to know about random variables, their
distributions, probabilistic convergence, and estimation theory. That
covers a major part of what you need to know here.
Two of my favourite resources are:-
1. Joseph Blitzstein - Harvard Stat 110 lectures
2. Larry Wasserman's book - All of statistics
2. Linear algebra
Linear
algebra will pop up every now and then in ML. PCA, SVD, LU
decomposition, QR decomposition, symmetric matrices, orthogonalization,
projections, and matrix operations are needed many times. The good thing
is that there are countless resources available on linear algebra.
My all-time favourite is Gilbert Strang's MIT lectures on linear algebra.
3. Optimisation
Though
only a few things from optimisation are needed most of the time, a
strong foundational knowledge will help a long way. You need to know
Lagrange multipliers, gradient descent, and the primal-dual formulation.
The best resource on this is Boyd and Vandenberghe's course on Convex
optimisation from Stanford.
4. Calculus
I
wanted to put this at the top, but I'm putting it last just to
emphasise the fact that only a fundamental knowledge is needed in
terms of calculus. Know about 3-D geometry, integration, and
differentiation and you'll survive. It's the easiest to start with
amongst the topics I've mentioned here. MIT has good lectures on
calculus.
I think with these 4 tools you'll most likely find ML
easy to understand. Other than these you may find real analysis and
functional analysis relevant too, but they are just formal
generalisations of the topics mentioned before.
From a beginner.
An introductory Linear Algebra course will generally include the following:
- Vectors
- Vector Spaces
- Matrices
- Inner Product Spaces
- Orthogonality
- Projection
- Linear transformations
- eigenvectors, eigenvalues
- change of bases
- Various decompositions: LU, Polar, SVD.
I also had some geometric algebra, but haven't found that useful so far.
Probability and statistics:
- probabilities
- combinations
- permutations
- distributions
- Understanding of hypothesis testing
- Descriptive statistics: Means, modes, standard deviations, variances etc.
If you can get through:
https://www.khanacademy.org/math...
And
https://www.khanacademy.org/math...
You are good to go.
which includes a lot of the math for machine learning. There's also a
draft textbook there which is well worth grabbing a copy of.
Understanding machine learning requires the following mathematics background.
- Calculus and in my view the following reference is very good,
- The linear algebra and matrix calculation and the following reference is very relevant,
- The statistics and probability background, and the following books are very good,
- All of Statistics: A Concise Course in Statistical Inference (Springer Texts in Statistics), Larry Wasserman
- A First Course in Probability (9th Edition), Sheldon Ross
- The knowledge of optimization and the following textbook is very good,
you are truly looking for a one-stop reference, the best that I can
suggest is Chris Bishop's Pattern Recognition and Machine Learning (http://www.amazon.com/Pattern-Re...).
Although it is quite difficult to start with, it will cover the
majority of your interests until you are well versed enough in the
subject to be able to read publications and more specific texts.
When
in doubt, MIT OpenCourseWare is always a good source -- I believe they
even offer one or two machine learning courses at the graduate level.
Good general reference/tutorial texts:
- Information Theory, Inference and Learning Algorithms -- MacKay
- Introduction to Probability Models -- Ross
- AI: A Modern Approach -- Russell & Norvig
- Algorithms -- Kleinberg & Tardos
Christopher
Bishop - Pattern Recognition & Machine Learning. First time I
picked this book up it was pretty daunting, but once you get a bit of
the maths under your belt, I found that it presents clearer
explanations than other texts. I found it really clearly laid out and
it seems to progress pretty well. It also covers a lot of stuff.
Linear Algebra:
Gilbert Strang's videos on linear algebra are excellent, and so are the
Khan Academy ones. The Gilbert Strang book doesn't seem to get particularly great
reviews. On the basis of reviews, I picked up a copy of Howard Anton's
Elementary Linear Algebra, which seems to be very highly regarded. I
would recommend it. I also have David Poole: A Modern
Introduction...which feels a bit more......modern than Anton and I have
tended to use it more. Doesn't seem to be a particularly well known
book on t'internet, but I find it very clear (more so than Anton).
If you want to practise, then there's Schaum's Outline of Theory and
Problems of Linear Algebra. (Good for practising but insufficient as a
standalone text on the subject.)
If you have the luxury of having
some time before starting on Machine Learning, I would suggest really
focusing on linear algebra in a very hands on way (working through
structured examples) and getting a good understanding of orthogonality,
vector spaces, eigenvectors, transformations. From my experience,
trying to learn the maths at the same time as learning Machine Learning
was overwhelming and I would have got a lot more out of ML lectures if I
had already got a grasp of the maths.
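As one example of the kind of hands-on exercise I mean for orthogonality, here is a minimal sketch of Gram-Schmidt orthonormalization in plain Python. The input vectors are hypothetical and the function names are my own; it assumes nothing beyond the standard library.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the component of w along every basis vector found so far.
        for q in basis:
            coeff = dot(w, q)
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > 1e-12:  # skip vectors that are (numerically) dependent
            basis.append([wi / norm for wi in w])
    return basis

# Two hypothetical, linearly independent vectors in the plane.
q1, q2 = gram_schmidt([[3.0, 1.0], [2.0, 2.0]])
print(dot(q1, q2))  # close to 0: the two basis vectors are orthogonal
```

Checking by hand that `dot(q1, q1)` and `dot(q2, q2)` are both 1 is a good way to convince yourself what "orthonormal" means.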
You'll
want to know calculus up to vector calculus, a first course in linear
algebra, and a good course in calculus-based statistics that actually
explains what the concepts mean (as opposed to "if you're trying to do
this, you should press the chi-squared-test button" like you see in a
lot of classes.) A discrete math course would be nice just for
background on notation, although you don't actually need to know any
nontrivial discrete math.
Mathematics
for ML is no different from what you learn in high school or in
under-grad studies. If you have that mathematics base, most of the time
it is sufficient to understand what's going on in those creepy equations
you see in books and research papers. However, sometimes more than that
is required and you may have to take some advanced courses in
statistics, calculus, linear algebra etc. You may also like to read in
general more about How do I learn machine learning?
Please see How do I learn mathematics for machine learning?
which has some good answers. I believe the Witten et al. book is one
of the most accessible introductions. I guess a basic book on statistics
and probability and another one on Linear Algebra (For example Strang,
4th edition [1]) will take you most of the way there.
[1] http://math.mit.edu/linearalgebra/
See Teach yourself Machine Learning the hard way! and the follow-up, Teach yourself Machine Learning the hard way! (Part 2)
It lists many prerequisites that you need to understand, and also some of the advanced stuff in Part 2.
I hope this helps.
With regards to mathematics for machine learning, I reckon all of the following skills are important:
(1) Some Basic Mathematical Skills (Linear Algebra, Probability, Optimization)
(2) Knowing how those mathematical skills are exploited for machine learning algorithms
(3) Developing a way to understand mathematics, so that any advanced maths for modern machine learning can be well comprehended.
While
one would generally recommend all sorts of linear algebra and
probability books for machine learning, I feel those are not always
worth the time, at least for machine learning. I would recommend the
following texts to read through (perhaps in order), which should cater
to the above three mentioned points.
(a) Pattern Recognition and Machine Learning by Christopher Bishop (Will cater to 1 and 2 above)
(b) Deep Learning book by Goodfellow, Bengio and Courville (Will again strengthen 1 and build on 2)
(c)
Understanding Machine Learning by Shai Shalev Shwartz and Shai Ben
David (Will advance your skills in 1, strengthen 2, and give an insight
to 3)
(d) Ankur Moitra’s rather short but useful book on Algorithmic aspects on Machine Learning (Will mainly cater to 3)
(e) Optimization for Machine Learning by Sra, Nowozin and Wright & Off the convex path by Sanjeev Arora and collaborators (Will cater to 3 and advance 1 and 2)
I
truly believe if one can properly understand the above stuff in machine
learning, he will develop all the Maths basics needed for machine
learning, that too in a very connected form !! Hope this helps !!
I won't say that you “learn” math. I would rather say that you train math.
Imagine
you want to train boxing and your coach is teaching you directs, low
kicks and high kicks. No matter how many times he shows you how to kick,
you can't do it perfectly. You do know that it takes patience, hard work and effort
to finally learn how to punch and you need to keep trying and training.
After so many tries you can finally say that you can actually punch.
What's the point?
Math is the same. Consider direct punches your formulas, low kick your theories and high kicks your
solutions to problems. No matter how many formulas or theories you
know, no matter how many times you've seen solutions you just can't do
it perfectly. Why? Because you need to train those formulas, train those theories and knock out those problems with a damn good high kicks. And how do you do that?
- Do as many problems as you can on a daily basis. It is not going to happen overnight, it takes time to train those kicks.
Wanna learn it fast? Better start now!
Brush up on your statistics and probability. This is definitely critical particularly for supervised learning methods.
Some also require a good deal of number theory knowledge especially when discussing SVM, PCA and friends.
Since you are planning to do a Ph.D. and move the science further, you might
want to narrow your focus to a particular area for your research while
working with your candidate adviser.
This is not an exhaustive list of topics. Best read in this order:
- Linear Algebra
- Vector Calculus
- Statistics and information theory
- Discrete Math
- Convex Optimization
- Probabilistic Graphical models
I believe there is a book : http://www.amazon.com/All-Mathem... which can help you get a good head start.
I will try to keep this as concise as possible.
Edit: Somebody merged the original question to this question, so the premise becomes irrelevant.
To become a full stack AI/ML engineer, it is imperative that you have a complete grasp of the mathematical foundations
of ML so that you can build upon concepts easily. The basic
mathematical skills required are Linear Algebra, Matrix Algebra,
Probability and some basic Calculus.
Linear Algebra
The best source to study Linear Algebra is Prof. Gilbert Strang’s Linear Algebra book/course. Video Lectures | Linear Algebra | Mathematics | MIT OpenCourseWare
(MIT OCW). There are 34 lectures and believe me, they are completely
worth it as after completing this, linear algebra should not pose any
more problems for you. Solve some exercises/exams if you want to achieve
mastery (recommended).
Matrix Algebra
Matrix algebra is an essential component of deep learning. I personally recommend this (Matrix Cookbook by Kaare Brandt Petersen & Michael Syskind Pedersen): http://www2.imm.dtu.dk/pubdb/vie...
(PDF). There are 66 pages of pure matrix operations and this is the
absolute “go-to” in case you are stuck trying to understand certain
matrix manipulations that a researcher might have done.
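To give a flavour of the kind of result collected there, here are three standard matrix-calculus identities. The notation is my own for illustration, not quoted from the Cookbook: x is a column vector, a and b are constant vectors, A is a constant matrix, and gradients are written as column vectors.

```latex
\frac{\partial}{\partial \mathbf{x}} \left( \mathbf{a}^{\top} \mathbf{x} \right) = \mathbf{a}
\qquad
\frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^{\top} A \mathbf{x} \right) = \left( A + A^{\top} \right) \mathbf{x}
\qquad
\frac{\partial}{\partial \mathbf{x}} \left\| A \mathbf{x} - \mathbf{b} \right\|_2^2 = 2 A^{\top} \left( A \mathbf{x} - \mathbf{b} \right)
```

Setting the last gradient to zero gives the normal equations A^T A x = A^T b of least squares, which is a typical example of a manipulation you will see done in one line in a paper.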
Probability & Statistics
Understanding
probability is a very important aspect of understanding ML. Some of the
key probability concepts that you must be aware of include Bayes’
Theorem, distributions, MLE, regression, inference and so on. The best
resource for this is Think Stats (Exploratory Data Analysis in Python) by Allen Downey: http://greenteapress.com/thinkst...
(PDF). This absolute gem of a book is 264 pages long and covers all the
aspects of probability and statistics that you need to understand with
relevant Python code.
Optimization
The go-to book for Convex Optimization is Convex Optimization by Stephen Boyd and Lieven Vandenberghe: https://web.stanford.edu/~boyd/c...
(PDF). This is a 730 page book and you need not read it all in one go.
Choose the concept which you need to learn depending on your
requirements and interest and read that part. It is complete and
extremely well written. This book is free as part of the CVX 101 MOOC on
EdX.
This 263 page book on metaheuristics, Essentials of Metaheuristics by Sean Luke (http://cs.gmu.edu/~sean/book/met...
(PDF)) talks about gradient based optimization, policy optimization
etc. and it is well written. One can choose to go through this also if
interested.
Data science concepts are covered
in the above topics. Other topics can be learnt by googling for sources
easily as and when you encounter them. But complete understanding of the
above should suffice for 95% of all scenarios.
Achieving
mastery of the above topics will surely make you a mathematically
strong AI/ML engineer. Now that you have built the foundation, start
dipping your feet into research papers. They are absolutely
essential as these clearly show the standards of AI
researchers/engineers. Firstly, find out the famous papers of AI like
RNN, LSTM, SVM etc. and go through the technical content.
Can you understand the jargon?
Can you understand the mathematics?
Can you implement the mathematics in code now without the help of overly sufficient libraries?
These are the key questions to be answered. Once you can answer “Yes/Mostly Yes” to these 3 questions, you are good to go.
After trying to read these papers dealing with the most popular concepts, try to read the not-so-famous papers. arXiv
is a great site with hundreds of preprints being published everyday by
top researchers and reading the papers from here is like drinking
straight out of the fire-hose. Try to choose a paper which looks fairly
well written and the abstract seems interesting. Then, read that paper
and try to answer those 3 questions again. The same can be done with
papers of top AI conferences like NIPS, AAAI, AAMAS, IJCAI, ICML etc.
You may not be able to fully implement the papers due to data
constraints and other issues, but if you are able to understand even 60%
of the mathematical reasoning, then I can safely say you have completed
your training.
Do not concentrate on learning more and more “packages”.
Concentrate on the concept. While implementing, you will automatically
see that you require “this” package and then you will automatically
learn to use it. Learning the various commands of random packages won’t
help. If you start implementing and writing codes to solve problems or
simulate results from a paper, you will automatically learn about
packages and use them appropriately; they’ll be the least of your
concerns. This is the correct way to maintain “balance” between math and coding.
You can also participate in competitions (e.g. Kaggle or conference
competitions) to improve speed, development and processing skills if you
feel the need to do so.
Alternatively, you can choose to pursue a doctoral degree (like me :P ) in AI/ML to gain a complete in-depth understanding of everything discussed here and more.
(All the links in this answer are working as of 6th July 2017)
- Analysis http://www.amazon.com/Introducti...
- Algebra http://www.amazon.com/Introducti...
- Probability http://www.amazon.com/All-Statis...
They will make your later reading much more pleasant. You will be able to devise your own proofs.
Terence Tao has posted several pieces of advice on learning math on his blog:
- Solving mathematical problems
http://terrytao.wordpress.com/ca...
- There’s more to mathematics than grades and exams and methods
http://terrytao.wordpress.com/ca...
- There’s more to mathematics than rigour and proofs
http://terrytao.wordpress.com/ca...
I started writing a GitHub awesome page for this; it may help. It covers
topics from basic machine learning maths to advanced and quantum
machine learning:
krishnakumarsekar/awesome-machine-learning-deep-learning-mathematics
Thanks and Regards
Krishna
krishnakumarsekar/awesome-quantum-machine-learning
A2A.
To have a basic mathematical background, you need to have some knowledge of the following mathematical concepts:
- Probability and statistics
- Linear algebra
- Optimization
- Multivariable calculus
- Functional analysis (not essential)
- First-order logic (not essential)
You
can find some reasonable material on most of these by searching for
"<topic> lecture notes" on Google. Usually, you'll find good
lecture notes compiled by some professor teaching that course. The first
few results should give you a good set to choose from.
For instance, here’s a list of some lecture notes that I just found:
Probability & Statistics : http://www2.aueb.gr/users/demos/...
Linear algebra : https://www.math.ku.edu/~lerner/...
Optimization : http://www.ifp.illinois.edu/~ang...
Calculus: https://www.math.wisc.edu/~angen...
Matrix Calculus : http://www.atmos.washington.edu/...
You
should skim through these, without going into too much detail. You can
come back to studying the topics as and when required while learning ML.
If you want to be a real data scientist, not one of the fake ones with the
skills of an analyst and no mathematical intuition or point of view, you
need to have very strong mathematical grounding.
So to learn Mathematics for ML this should be the order :-
- Start with probability ( Conditional Basic Marginal etc …)
- Mathematical Series and Convergence , Numerical methods for Analysis
- Matrix and Linear Algebra
- Bayesian Statistics
- Vectors ( Most Important)
- Calculus
- Markov Process and Chains
- Basics of Optimization ( Linear/ Quadratic)
- Advanced Matrix Algebra and Vector Calculus (Gradient, Divergence, Curl etc.)
This much mathematics will let you understand the core ideas behind ML and probabilistic algorithms.
You should pause now and start implementing certain algorithms from scratch in Python:
1. K-NN is a great starting point: learn it, and code it from scratch.
2. Logistic Regression with Gradient Descent.
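For the first item, a from-scratch K-NN classifier can be as short as the sketch below. The points, labels and the function name `knn_predict` are made up for illustration; it uses Euclidean distance and a simple majority vote, which is one common choice rather than the only one.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (6.5, 7.0), (7.0, 6.0)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(points, labels, (1.8, 1.4), k=3)
print(prediction)
```

`math.dist` needs Python 3.8 or later. Writing the distance computation yourself instead is a good extra exercise.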
By now you can see the parameters and numbers moving in matrix form
and understand the mathematics of prediction. If you feel this is
enough, hold your breath: there is more exciting stuff to come. This
will make you a beginner “Real Data Scientist”.
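For logistic regression with gradient descent, a minimal one-feature sketch looks like this. The data, learning rate and epoch count are assumptions picked for the toy example, and the update is plain batch gradient descent on the average log-loss.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log-loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data: small x values belong to class 0, large ones to class 1.
xs = [0.5, 1.0, 1.5, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(xs, ys)
print(sigmoid(w * 0.5 + b), sigmoid(w * 4.5 + b))
```

Printing `w` and `b` every few hundred epochs is a simple way to literally watch the parameters move.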
Next Start with :-
- Stochastic Models and Time Series Analysis
- Differential Equations
- Dynamic Programming and Optimization Techniques
- Fourier and Wavelet Transforms
- Random Fields
- Basic Knowledge of PDEs
- Techniques to solve PDEs using Monte-Carlo , Polynomial Expansions.
These
mathematical techniques will help you visualize the model’s working and
how to model and process raw data to create unique models whose
functionality can be tuned. Parameters can be optimized for the problems
and fine tuned with these techniques.
For a Next Level Up:- ( Statistics of Higher Dimensions)
- PDEs numerical solution with numerical input/ random input. ( fascinating subject to work on )
- Stochastic Differential Equations and Solutions
- PCA etc
- Dirichlet Processes, Markov Decision Process.
- Uncertainty Quantification - Polynomial Chaos, Projections on vector space
I think these are the subjects one must learn to be a good machine
learning engineer in the 21st century. With a knowledge base like this, one
can connect the dots very rapidly and build systems and models of high
accuracy.
(I am not a big fan of neural nets, so I forgot to mention them here.)
Algebra is important in many ways, but you really need to learn some
logic. I don't mean the babiest of baby things that people say is easy
because they can understand and track and foretell the end of a mystery
novel in a tv series like Foyle's War or CSI or Sherlock Holmes. I
don't mean the intro to logic course in many philosophy departments. I
don't mean the Boolean circuits course you may have taken as a freshman
in computer engineering or the simple truth table arguments you did and
eventually turned in to graph theory problems in a second semester or
second year computer science course called ``Discrete Math''. I don't
mean the simple arguments you went through in your modern algebra course
as a senior in a mathematics department. But all of those can be
useful, and are pre-cursors or example generators for a beginning course
in Model Theory. Then you can begin to truly appreciate the NOTION of a
THINKING MACHINE, and what it means to model such a monstrosity. Then
you can begin to understand how to develop formal languages for solution
of specific problems. Then you can start understanding why it's really
strange to model THINKING as a neural network, although that is not a
completely useless way to do it. (Basically, neural networks seem to me
to be ``pattern recognizers'', roughly, basically using fixed-point
iteration in metric spaces to hone in on a pattern or set of patterns of
behaviors of inputting agents. Please note that I said ``roughly''.
This is not intended to be a tutorial on neural networks.)
Of
course, it helps to have a notion of what it means to define or model
the concept described by the verb ``to learn''. That, my friends, is
the realm of philosophy and pedagogy, but to apply it requires an
understanding of the notion of a model, and we are back to my main
point: Take some model theory. It's not likely to hurt you for more
than a semester, and well...
NO PAIN NO GAIN!!
In
addition to Martin Thoma's great answer, I'd study up on the "Theory of
Computation". Text books abound, but they are expensive. Search on the
web. Wiki has an overview, but it won't make sense until you've
studied a bit. Still, it may show you what you've missed.
Bayes
Theorem is a fundamental concept of probability that underpins many
extremely important algorithms, from the very basic (e.g., Naive Bayes)
to the quite complicated (e.g., Latent Dirichlet Allocation).
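A quick worked example of Bayes' Theorem, with made-up numbers: a condition affects 1% of people, a test detects it 95% of the time, and it gives a false positive 5% of the time. The function name and the figures are assumptions for illustration only.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(condition | positive test) via Bayes' Theorem."""
    # Total probability of a positive test (the evidence term).
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))  # 0.161
```

Even with a fairly accurate test, the posterior is only about 16% because the condition is rare. Naive Bayes applies the same rule with one likelihood term per feature.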
In
linear algebra, a solid understanding of eigenvalues and eigenvectors is
important for topics such as principal component analysis, factor
analysis and other dimensionality reduction tasks.
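To see how eigenvectors show up in PCA, here is a minimal sketch in plain Python: build the covariance matrix of some 2-D points and find its dominant eigenvector with power iteration. The data and function names are hypothetical, and power iteration is just one simple way to get the leading eigenvector; real code would normally call a linear algebra library.

```python
def covariance_matrix(points):
    """2x2 covariance matrix of a list of (x, y) points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / n
    syy = sum((p[1] - mean_y) ** 2 for p in points) / n
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n
    return [[sxx, sxy], [sxy, syy]]

def mat_vec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def power_iteration(m, steps=200):
    """Return (eigenvalue, unit eigenvector) for the dominant eigenpair of m."""
    v = [1.0, 0.0]
    for _ in range(steps):
        w = mat_vec(m, v)
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = [w[0] / norm, w[1] / norm]
    mv = mat_vec(m, v)
    eigenvalue = v[0] * mv[0] + v[1] * mv[1]  # Rayleigh quotient
    return eigenvalue, v

# Hypothetical points scattered around the line y = x.
points = [(1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2), (5.0, 4.8)]
cov = covariance_matrix(points)
eigenvalue, direction = power_iteration(cov)
print(eigenvalue, direction)
```

The resulting direction is the first principal component: it points roughly along the diagonal, which is where the data varies the most.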
I would suggest reading as many linear algebra books as possible, followed by some probability and statistics texts.
For
the first, I suggest Gilbert Strang's "Linear Algebra and Its
Applications", while for the second, "Probability, Random Variables And
Random Signal Principles" by Peebles is a good choice.
EDIT: A
previous answer suggests Convex Optimization text, which I also
recommend. A good text is "Convex Optimization" by Stephen Boyd, which
is also available for free in the author's website.
I took both Andrew Ng's Machine Learning class and Sebastian Thrun's AI
class. I liked the Machine Learning one more: the AI class
touches more topics, but it does so in a haphazard way. The ML class is
narrower, but more practical and focused. It helps to keep a link to
Khan's linear algebra videos handy.
- ML class is running right now - https://www.coursera.org/course/ml
- Linear algebra: http://www.khanacademy.org/math/...
- Place where to find more useful courses: http://www.topfreeclasses.com
For understanding Machine Learning you need the following mathematics prerequisites:
1. Probability and Statistics: Machine Learning has deep roots in statistics. In fact, modern machine learning is essentially statistical learning, i.e. using statistics to find patterns in data and making inferences from them. So statistics and probability are the bare minimum for ML.
2. Linear Algebra: This is required because data is represented as matrices in machine learning, and essentially all ML algorithms can be seen as matrix manipulation in the end, so a basic understanding of linear algebra is required.
3. Optimization: Many people argue that machine learning is a fancy name for optimization. While this is true to a certain extent, there is more to ML than optimization. But a large part of it is indeed optimization: in the end, almost all ML algorithms come down to some optimization task.
4. Calculus: This is a very useful tool for ML. Most ML algorithms rely on differential calculus to find solutions (gradient descent, Newton's method, quasi-Newton methods etc.).
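To make the calculus point concrete, here is a minimal sketch comparing gradient descent and Newton's method on a one-dimensional function. The function f(x) = e^x - 2x, the starting point and the step size are my own choices for illustration; its minimum is at x = ln 2.

```python
import math

def f_prime(x):
    """First derivative of f(x) = exp(x) - 2x."""
    return math.exp(x) - 2.0

def f_double_prime(x):
    """Second derivative of f(x) = exp(x) - 2x."""
    return math.exp(x)

def gradient_descent(x, lr=0.1, steps=200):
    for _ in range(steps):
        x -= lr * f_prime(x)  # step against the gradient
    return x

def newton(x, steps=10):
    for _ in range(steps):
        x -= f_prime(x) / f_double_prime(x)  # rescale the step by the curvature
    return x

x_gd = gradient_descent(0.0)
x_newton = newton(0.0)
print(x_gd, x_newton, math.log(2.0))
```

Newton's method gets there in far fewer iterations because it uses second-derivative information, but in many dimensions that means forming and inverting a Hessian, which is why quasi-Newton methods exist.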
IMO if you master these
topics then you can learn pretty much anything in ML, because all
algorithms are essentially applications of these tools in ML.
Conditional probability, random variables, pdfs etc. Whatever you'd
learn in your undergrad probability course and a bit more.
Some stochastic processes (Markov processes, etc.)
Linear Algebra: Data analysis and machine learning build a lot on these concepts.
Algorithms: not as important, but still important when it comes to optimizing your solution. Some graph theory.
Basic Linear Programming and recognizing convexity and relaxation.
I guess once you have a fair idea of most of these concepts, it's
pretty simple to pick up the intuition behind any algorithm and where it
would work, how to improve it, etc.
Goal 1: To understand what ML is, how to apply different algorithms to a task, how to interpret the output, common pitfalls, etc.: You
will need to have a grasp of linear and matrix algebra, probability
and optimization. You don't need to take a deep dive into each one, but
study basic things like eigenvectors, conditional
probability, distributions and Bayes' theorem. Additionally, learn the concepts
of overfitting and cross-validation.
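Cross-validation in particular is easy to demystify in code. Below is a minimal sketch of how k-fold cross-validation splits sample indices into training and validation sets. The function name is my own and no shuffling is done; in practice you would usually shuffle first or use a library routine.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k (train_indices, validation_indices) pairs."""
    indices = list(range(n_samples))
    # Spread any remainder over the first few folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds

folds = k_fold_indices(10, 3)
for train, validation in folds:
    print(train, validation)
```

Each sample lands in the validation set exactly once, so averaging the k validation scores gives an estimate of performance on unseen data, which is how overfitting gets noticed.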
Resources
Video lectures: Machine learning on Coursera (not only the one by Andrew Ng; there are a few others as well)
Books: Machine Learning by Tom Mitchell. It includes the necessary linear algebra and probability too.
Goal 2: Why the current algorithms are designed in a particular way, and how they fundamentally differ from each other:
If you are more interested in the theoretical aspects, like how the kernels
of a support vector machine are defined, how deep learning neural
networks are designed, or how to tweak the existing algorithms to make a
new one, then you might want to extend your mathematics to functional
analysis, topology and advanced optimization.
Resources: Video: Advanced Machine learning, Caltech (Prof. Abu-Mostafa's lectures)
Books: Mining of Massive Datasets by Ullman. Any good books on advanced
linear algebra and topology, but they won't connect it to machine learning.
Learning
mathematics is about doing. Remember the 80/20 Rule : You must study
theory 20% of the time and practice/implement what you learn 80% of the
time.
Here is a list of books you could use. You can find accompanying online courses for many of them.
1. Strang's Linear Algebra and its Applications
2. Apostol Calculus - Both the volumes
3. Golub's Matrix Computations
4. Sheldon Ross' Probability
5. Elements of Statistical Learning by Hastie et al
6. Bishop's Pattern Recognition and Machine Learning
7. David Barber's Bayesian Reasoning and Machine Learning
8. Kevin Murphy's Machine learning: a Probabilistic Perspective
9. Wasserman's All of Statistics and Non-parametric Statistics
From Hacker News :
1.) Casella, G. and Berger, R.L. (2001). "Statistical Inference" Duxbury Press.
2.) Ferguson, T. (1996). "A Course in Large Sample Theory" Chapman & Hall/CRC.
3.) Lehmann, E. (2004). "Elements of Large-Sample Theory" Springer.
4.) Gelman, A. et al. (2003). "Bayesian Data Analysis" Chapman & Hall/CRC.
5.) Robert, C. and Casella, G. (2005). "Monte Carlo Statistical Methods" Springer.
6.) Grimmett, G. and Stirzaker, D. (2001). "Probability and Random Processes" Oxford.
7.) Pollard, D. (2001). "A User's Guide to Measure Theoretic Probability" Cambridge.
8.) Bertsimas, D. and Tsitsiklis, J. (1997). "Introduction to Linear Optimization" Athena.
9.) Boyd, S. and Vandenberghe, L. (2004). "Convex Optimization" Cambridge.
10.) Golub, G., and Van Loan, C. (1996). "Matrix Computations" Johns Hopkins.
11.) Cover, T. and Thomas, J. "Elements of Information Theory" Wiley.
12.) Kreyszig, E. (1989). "Introductory Functional Analysis with Applications" Wiley.
Please
do try implementing as many things as you can. Pick up a project. Talk
to your peers and professors and people, see if you can help them with
what you've learned. Do.
Some algorithms are really sweet: they are available on Wikipedia with formulae, implementation and application.
Some dodge you till you watch two or three YouTube videos (Victor Lavrenko, Bert Huang, Udacity or MIT lectures).
Some
are really mischievous, you got to do a lot of research, they test your
patience and perseverance more than your mathematics!
And there are lots of books you can read.
How to learn a particular algorithm?
- First, from the business point of view: learn why to use an algorithm and not one of its counterparts, e.g. why Fuzzy K-Means instead of K-Means.
- Secondly, from an analyst's point of view: learn how to use the algorithm to solve some use cases. What is it meant to do?
- Last will be the mathematics: the how of the algorithm, and more research to enhance the algorithm and patent it.
P. S. It is normal to not understand in the first go.
P. P. S. And very normal to get totally confused in the second and third.
Hi,
I work for a Data Science and AI company called InData Labs and one of
our tech experts has recently prepared a short guide to learning neural
networks. I hope it is helpful for you:
A short guide to neural networks. Master them and become famous.
Mathematics is an important part of learning machine learning. What are the
necessary topics and useful resources of mathematics for machine learning?
Here I am sharing the weightage of the important mathematics topics for
machine learning, to clear up any confusion. See the list below and
prepare accordingly.
35% - Linear Algebra
25% - Probability Theory and Statistics
15% - Multivariate Calculus
15% - Algorithms and Complex Optimizations
10% - Others
Now I will go into more depth, so that you have complete clarity before
starting machine learning or artificial intelligence.
- Linear Algebra: Topics such as Principal Component Analysis (PCA), Singular Value Decomposition (SVD), Eigendecomposition of a matrix, LU Decomposition, QR Decomposition/Factorization, Symmetric Matrices, Orthogonalization & Orthonormalization, Matrix Operations, Projections, Eigenvalues & Eigenvectors, Vector Spaces and Norms are needed for understanding the optimization methods used for machine learning. The amazing thing about Linear Algebra is that there are so many online resources.
- Probability Theory and Statistics: Probability Rules & Axioms, Bayes' Theorem, Random Variables, Variance and Expectation, Conditional and Joint Distributions, Standard Distributions (Bernoulli, Binomial, Multinomial, Uniform and Gaussian), Moment Generating Functions, Maximum Likelihood Estimation (MLE), Prior and Posterior, Maximum a Posteriori Estimation (MAP) and Sampling Methods.
- Multivariate Calculus: Topics include Differential and Integral Calculus, Partial Derivatives, Vector-Valued Functions, Directional Gradient, Hessian, Jacobian, Laplacian and Lagrangian Distribution.
- Algorithms and Complex Optimizations: Knowledge of data structures (Binary Trees, Hashing, Heap, Stack etc.), Dynamic Programming, Randomized & Sublinear Algorithms, Graphs, Gradient/Stochastic Descents and Primal-Dual methods are needed.
- Others: This comprises other math topics not covered in the four major areas described above. They include Real and Complex Analysis (Sets and Sequences, Topology, Metric Spaces, Single-Valued and Continuous Functions, Limits, Cauchy Kernel, Fourier Transforms), Information Theory (Entropy, Information Gain), Function Spaces and Manifolds.
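As a small illustration of the MLE and MAP items in the probability topics above, here is a minimal sketch for estimating the bias of a coin. The counts and the Beta(2, 2) prior are assumptions chosen for the example; the MAP formula used is the mode of the Beta posterior, which is valid when both posterior parameters exceed 1.

```python
def bernoulli_mle(heads, tosses):
    """Maximum likelihood estimate of P(heads): the observed frequency."""
    return heads / tosses

def bernoulli_map(heads, tosses, alpha=2.0, beta=2.0):
    """MAP estimate of P(heads) under a Beta(alpha, beta) prior.

    The posterior is Beta(alpha + heads, beta + tosses - heads);
    its mode is returned.
    """
    return (heads + alpha - 1.0) / (tosses + alpha + beta - 2.0)

mle = bernoulli_mle(7, 10)   # 0.7
map_estimate = bernoulli_map(7, 10)  # 8 / 12, pulled towards the prior's 0.5
print(mle, map_estimate)
```

With more data the two estimates converge, which is the usual intuition for the prior acting like a few extra pseudo-observations.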
Now you are probably looking for the best learning and practice resources
for your weak points, right? Don’t worry, learners: I would also like to
suggest a few good resources.
- Books: Programming Collective Intelligence by Toby Segaran, Pattern Recognition and Machine Learning, Artificial Intelligence: A Modern Approach (3rd edition) by Russell, and other books.
- Best resources for online video tutorials: Coursera, Kachhua.com, Udemy, chalkstreet, etc.
Thank you. Keep Learning.
Optimization.
Especially convex optimization. E.g. gradient-based methods for
non-linear optimization (L-BFGS method and conjugate gradient), quadratic
programming, etc.
I made a podcast episode on the math you need for machine learning, and the resources for learning (if you like audio): Machine Learning Guide #8
It covers most of the math you need to get started with machine learning.
There are many reasons why the mathematics of Machine Learning is important and I will highlight some of them below:
- Selecting the right algorithm, which includes giving consideration to accuracy, training time, model complexity, number of parameters and number of features.
- Choosing parameter settings and validation strategies.
- Identifying underfitting and overfitting by understanding the Bias-Variance tradeoff.
- Estimating the right confidence interval and uncertainty.
- Linear algebra is a cornerstone because everything in machine learning is a vector or a matrix. Dot products, distance, matrix factorization, eigenvalues etc. come up all the time. I would recommend Gilbert Strang’s linear algebra course:
- a youtube playlist
- the book: Introduction to linear algebra
- course page at MIT OCW
- Multivariate Calculus: Some of the necessary topics include Differential and Integral Calculus, Partial Derivatives, Vector-Valued Functions, Directional Gradient, Distribution. Differentiation matters because of gradient descent. Again, gradient descent is almost everywhere. Some courses I recommend:
- Introduction to Mathematical Thinking - Stanford University | Coursera
- Convex Optimization
- Massively Multivariable Open Online Calculus Course from the Ohio State University - the course is a first taste of multivariable calculus, but viewed through the lens of linear algebra.
- Probability Theory and Statistics: Machine Learning and Statistics aren't very different fields. Actually, someone recently defined Machine Learning as 'doing statistics on a Mac'. Some of the fundamental Statistical and Probability Theory needed for ML are Combinatorics, Probability Rules & Axioms, Bayes' Theorem, Random Variables, Variance and Expectation, Conditional and Joint Distributions, Standard Distributions (Bernoulli, Binomial, Multinomial, Uniform and Gaussian), Moment Generating Functions, Maximum Likelihood Estimation (MLE), Prior and Posterior, Maximum a Posteriori Estimation (MAP) and Sampling Methods.
- Khan Academy's Linear Algebra, Probability & Statistics, Multivariable Calculus and Optimization.
- Larry Wasserman's book - All of statistics: A Concise Course in Statistical Inference.
- Udacity's Introduction to Statistics.
- Algorithms and Complex Optimizations: This is important for understanding the computational efficiency and scalability of our Machine Learning Algorithm and for exploiting sparsity in our datasets. Knowledge of data structures (Binary Trees, Hashing, Heap, Stack etc.), Dynamic Programming, Randomized & Sublinear Algorithms, Graphs, Gradient/Stochastic Descents and Primal-Dual methods are needed.
- Boyd and Vandenberghe's course on Convex optimization from Stanford.
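To illustrate the linear algebra point in the list above (dot products and distances coming up all the time), here is a minimal sketch in plain Python. The two vectors are made up and the function names are my own; in practice a library such as NumPy would do this for you.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

# Two hypothetical feature vectors.
u = [1.0, 2.0, 3.0]
v = [4.0, 5.0, 6.0]

print(dot(u, v))                 # 32.0
print(euclidean_distance(u, v))
print(cosine_similarity(u, v))
```

Nearest-neighbour search, linear models and similarity-based recommenders are all built out of these three operations.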
Given all that, ML is not all about maths and, to be frank, when starting out you will hardly spend 5% of your effort doing maths.
I wrote a detailed medium post on this. You can read it here Math for Deep Learning is not Merlin’s Enchantment – Vaibhav Aparimit – Medium
First off, I really like your question. You seem to implicitly understand that
math is an essential skill required to grasp the underpinnings of
machine learning.
If your question were about
deep learning, I would say linear algebra covers 95% of cases. For
machine learning you need to know probability (especially Bayes'
rule and conditional probability), differential calculus and linear
algebra (matrix multiplication, eigenvectors, determinants, Hessians).
Hope this helps.
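For a feel of what eigenvectors are computationally, here is a small sketch of power iteration, one simple way to approximate the dominant eigenvalue and eigenvector using nothing but repeated matrix multiplication. The 2x2 matrix and the helper names are assumptions chosen for illustration; in practice you would call a linear algebra library.

```python
import math


def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def power_iteration(A, steps=100):
    """Approximate the dominant eigenvalue/eigenvector of a square matrix."""
    v = [1.0] + [0.0] * (len(A) - 1)  # arbitrary starting vector
    for _ in range(steps):
        w = mat_vec(A, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]  # renormalize to unit length
    # Rayleigh quotient v^T A v gives the eigenvalue for a unit vector v.
    eigenvalue = sum(a * b for a, b in zip(mat_vec(A, v), v))
    return eigenvalue, v


# Symmetric toy matrix (hypothetical) with eigenvalues 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
eigenvalue, eigenvector = power_iteration(A)
print(eigenvalue, eigenvector)
```

Multiplying the matrix by the returned vector scales it by the returned eigenvalue without changing its direction, which is the defining property of an eigenvector.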
You may find Metacademy helpful when trying to understand the prereqs for various concepts in machine learning: Concepts - Metacademy
You must have a sound understanding of at least the following (there may be others that are not on this list):
- Linear algebra
- Calculus
- Matrix calculus
- Probability and statistics
- Optimization - linear programming, convex optimization, non-linear optimization
Some other topics that are useful in specific sub-areas of machine learning are:
- Basic graph theory
- Basic algorithms
- First-order logic
- Linear Algebra
- Probability theory and statistics
- Multivariate calculus
- Algorithms and Complex optimizations
- Others: Real
and Complex Analysis (Sets and Sequences, Topology, Metric Spaces,
Single-Valued and Continuous Functions, Limits, Cauchy Kernel, Fourier
Transforms), Information Theory (Entropy, Information Gain), Function
Spaces and Manifolds.
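Of the topics in that last item, entropy and information gain are easy to show in a few lines. This is a minimal sketch assuming Shannon entropy in bits and a hypothetical two-class label set; the function names are mine.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy reduction from splitting `labels` into the given groups."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder


# Toy labels (hypothetical) and a split that separates the classes perfectly.
labels = ["yes", "yes", "no", "no"]
split = [["yes", "yes"], ["no", "no"]]

print(entropy(labels), information_gain(labels, split))
```

A balanced two-class set has one bit of entropy, and a split that separates the classes perfectly recovers all of it; decision tree learners use this kind of quantity to choose splits.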
To learn them go through
Numerical Methods; Matrix and Tensor Algebra; Probability and Statistics; Operations Research; occasionally Calculus.
· 11 Upvotes · Answer requested by Shuvanon Razik and Francisco Sosa
As Alex mentions above, Andrew Ng's course on Machine Learning is the best
I have seen so far. He gives an intuitive feel for the concepts,
so it's easy to follow compared with looking at plain formulae in
mathematics.
If you are looking to refresh or clarify linear algebra concepts after going through the above course, Khan Academy
could be useful. It also has videos on other topics that might be of
interest for machine learning. If you are looking for concepts like PCA,
you might not find them there.
Another useful resource that focuses on concepts is video lectures: Machine Learning - videolectures.net. Search only for tutorials.
There
is no one-stop shop, as the concepts can go deeper and might require
special treatment. All of this is theory, which can clarify concepts.
However, if you are a novice and you want a deep and intuitive feel
for the concepts, then pick one simple problem and implement the solution.
· 1 Upvote
Linear Algebra, Statistics, Discrete Math, Set Theory, etc.
· 1 Upvote
Friends, I just came across a very interactive course on Understanding
Machine Learning. It is a completely free video course. You just need
to enroll using your ID and password. I am sharing the link with you. Do
enroll:
Understanding Machine Learning with R - uFaber.com
· 1 Upvote