
Machine learning has a plethora of concepts from mathematics and statistics at its core. A solid understanding of both subjects is therefore essential to analyse ML in depth, and knowledge of statistics in particular is crucial for developing ML algorithms. There is a separate branch called Statistical Learning Theory that draws insights from statistics as well as from Functional Analysis. This article explores a concept in statistical learning theory called the Representer Theorem, which finds application in areas such as pattern analysis and, specifically, Support Vector Machines (SVM).
Kernel Methods And RKHS
In the statistical context of ML, kernel methods are a group of algorithms used for pattern analysis, the task of identifying patterns in data. To detect patterns, most algorithms other than kernel methods require the data to be converted into explicit feature vectors. Kernel methods (or kernels, specifically), on the other hand, rely on similarity functions: they operate in a feature space by computing, for each pair of data points, the 'inner product' of their images in that space, without ever computing the coordinates of the points in the feature space itself.
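To make the idea concrete, here is a minimal Python sketch (the degree-2 polynomial kernel and the toy vectors are illustrative choices, not part of the original discussion): the kernel value for a pair of points equals the inner product of their explicitly mapped feature vectors, yet the kernel itself never constructs the feature space.

import numpy as np

def explicit_feature_map(x):
    # Explicit feature map for the degree-2 polynomial kernel on R^2:
    # phi(x) = (x1^2, x2^2, sqrt(2) * x1 * x2)
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

def polynomial_kernel(x, z):
    # Degree-2 homogeneous polynomial kernel: k(x, z) = (x . z)^2
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

# Both lines print 121.0: the kernel computes the feature-space inner
# product directly from the raw coordinates, without building phi(x).
print(polynomial_kernel(x, z))
print(np.dot(explicit_feature_map(x), explicit_feature_map(z)))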
Kernel methods employ memory-based (instance-based) learning: rather than building an explicit generalised model, they compare new, unseen instances to the training instances stored in memory.
Kernel methods rose to prominence as a result of advances in pattern recognition, specifically handwriting recognition. As work on kernel methods grew, several related concepts were formalised in mathematics. One concept of particular importance to ML is the Reproducing Kernel Hilbert Space (RKHS), first introduced by the Polish mathematician Stanislaw Zaremba in his work on harmonic functions.
As the name suggests, an RKHS derives its mathematical structure from Hilbert space, a vector space equipped with an inner product that generalises the familiar two- and three-dimensional Euclidean spaces to arbitrary (possibly infinite-dimensional) settings. In mathematical terms, a Hilbert space is defined as:
“A Hilbert space is a vector space H with an inner product ⟨f, g⟩ such that the norm defined by

‖f‖ = √⟨f, f⟩

turns H into a complete metric space.”
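As a small numerical illustration (ordinary Euclidean space with the standard dot product is the simplest Hilbert space; the vector below is arbitrary), the norm of an element is just the square root of its inner product with itself:

import numpy as np

# R^3 with the standard dot product is the simplest example of a Hilbert space.
f = np.array([3.0, 4.0, 12.0])

# Norm induced by the inner product: ||f|| = sqrt(<f, f>)
print(np.sqrt(np.dot(f, f)))   # 13.0
print(np.linalg.norm(f))       # 13.0, the same value computed by NumPy directly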
Now, an RKHS is a Hilbert space of functions in which evaluation at any single point is a continuous linear functional. Concretely, for every point x there is a function k(·, x) in the space such that ⟨f, k(·, x)⟩ = f(x) for every f in the space; this is the 'reproducing property' that gives the RKHS its name, and it ties the kernel k directly to the inner product and norm of the space. These two concepts, kernels and the RKHS, form the basis for the representer theorem.
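The reproducing property can be checked numerically for a function built as a finite combination of kernel sections. The sketch below is purely illustrative (the Gaussian kernel, the centres z_i and the coefficients β_i are arbitrary assumptions): evaluating f at a point x agrees with expanding the inner product ⟨f, k(·, x)⟩ term by term, and the squared RKHS norm of f is the double sum over β_i β_j k(z_i, z_j) that also appears in the theorem below.

import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    # Gaussian (RBF) kernel k(a, b) = exp(-gamma * (a - b)^2) on the real line.
    return np.exp(-gamma * (a - b) ** 2)

# A function in the RKHS spanned by kernel sections centred at points z_i:
#   f(.) = sum_i beta_i * k(., z_i)
z = np.array([-1.0, 0.0, 2.0])      # arbitrary centres z_i
beta = np.array([0.3, -0.7, 1.2])   # arbitrary coefficients beta_i

def f(x):
    return np.sum(beta * rbf_kernel(z, x))

# Reproducing property: <f, k(., x)> = sum_i beta_i * <k(., z_i), k(., x)>
#                                    = sum_i beta_i * k(z_i, x) = f(x)
x = 0.5
print(f(x), np.sum(beta * rbf_kernel(z, x)))   # identical values

# Squared RKHS norm: ||f||^2 = sum_i sum_j beta_i beta_j k(z_i, z_j) = beta' K beta
K = rbf_kernel(z[:, None], z[None, :])
print(beta @ K @ beta)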
Representer Theorem
With the help of the RKHS framework, a result known as the representer theorem was formulated. It addresses a practical difficulty: popular kernels correspond to feature spaces that may be infinite-dimensional, which is mathematically sound but not practically viable for training a learning machine, since training generally amounts to solving an optimisation problem. The representer theorem shows that this optimisation only ever needs to involve the finitely many training points.
There are two cases of the representer theorem: one with no prior assumptions about the form of the solution (nonparametric) and one with partial assumptions (semiparametric). The statement of each case is given below:
Nonparametric representer theorem (Theorem 1): Suppose we are given a nonempty set X, a positive definite real-valued kernel k on X × X, a training sample (x_1, y_1), …, (x_m, y_m) ∈ X × R, a strictly monotonically increasing real-valued function g on [0, ∞), an arbitrary cost function c : (X × R²)^m → R ∪ {∞}, and a class of functions

F = { f ∈ R^X | f(·) = Σ_{i=1}^{∞} β_i k(·, z_i), β_i ∈ R, z_i ∈ X, ‖f‖ < ∞ }.

Here, ‖·‖ is the norm in the RKHS associated with k, i.e. for any z_i ∈ X, β_i ∈ R (i ∈ N),

‖ Σ_{i=1}^{∞} β_i k(·, z_i) ‖² = Σ_{i=1}^{∞} Σ_{j=1}^{∞} β_i β_j k(z_i, z_j).

Then any f ∈ F minimising the regularised risk functional

c((x_1, y_1, f(x_1)), …, (x_m, y_m, f(x_m))) + g(‖f‖)

admits a representation of the form

f(·) = Σ_{i=1}^{m} α_i k(·, x_i).
Semiparametric representer theorem (Theorem 2): Suppose that, in addition to the assumptions of the previous theorem, we are given a set of M real-valued functions {ψ_p}_{p=1}^{M} on X, with the property that the m × M matrix (ψ_p(x_i))_{ip} has rank M. Then any f' := f + h, with f ∈ F and h ∈ span{ψ_p}, minimising the regularised risk

c((x_1, y_1, f'(x_1)), …, (x_m, y_m, f'(x_m))) + g(‖f‖)

admits a representation of the form

f'(·) = Σ_{i=1}^{m} α_i k(x_i, ·) + Σ_{p=1}^{M} β_p ψ_p(·),

with unique coefficients β_p ∈ R for all p = 1, …, M.
The above theorems concern minimising a regularised risk built from the cost function c and the regulariser g applied to the RKHS norm. In the ML context, they guarantee that the minimising function can be written entirely in terms of kernel evaluations at the training data, so the search over a possibly infinite-dimensional function space reduces to finding the finitely many coefficients α_i.
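To see how this plays out in practice, consider kernel ridge regression, a standard special case: with a squared-error cost and g(‖f‖) = λ‖f‖², the minimiser f(·) = Σ_{i=1}^{m} α_i k(·, x_i) promised by Theorem 1 has coefficients satisfying (K + λI)α = y, where K is the kernel (Gram) matrix on the training inputs. The following sketch is a minimal illustration; the Gaussian kernel, the regularisation strength and the toy data are arbitrary choices.

import numpy as np

def rbf_kernel_matrix(A, B, gamma=1.0):
    # Gram matrix K[i, j] = exp(-gamma * (A[i] - B[j])^2) for 1-D inputs.
    return np.exp(-gamma * (A[:, None] - B[None, :]) ** 2)

# Toy training sample (x_i, y_i); the values are purely illustrative.
rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=20)
y_train = np.sin(x_train) + 0.1 * rng.normal(size=20)

lam = 0.1                                # regularisation strength lambda
K = rbf_kernel_matrix(x_train, x_train)  # kernel evaluated on training points only

# Kernel ridge regression: minimise squared error + lam * ||f||^2 over the RKHS.
# By the representer theorem the minimiser is f(.) = sum_i alpha_i k(., x_i),
# and for this particular cost the coefficients solve (K + lam * I) alpha = y.
alpha = np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)

# Predictions at new points need only kernel evaluations against the training data.
x_test = np.linspace(-3, 3, 5)
y_pred = rbf_kernel_matrix(x_test, x_train) @ alpha
print(y_pred)

Note that both fitting and prediction touch the data only through kernel evaluations, which is exactly the finite representation the theorem guarantees.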
Conclusion
These mathematical statements may seem daunting to beginners, so it is advisable to revisit the basic concepts of kernel methods before working with the representer theorem. The theorem matters when training models: it guarantees that the function minimising the regularised risk functional can be represented with a finite set of kernel coefficients, which is what makes kernel-based learning computationally tractable.