Institute for Advanced Management Systems Research
Department of Information Technologies
Åbo Akademi University
Neuro-fuzzy classifiers - Tutorial
Robert Fullér
© 2010 [email protected]
November 29, 2010
Table of Contents
1. Fuzzy IF-THEN rules for classification
2. An adaptive-network-based fuzzy classifier
1. Fuzzy IF-THEN rules for classification
Conventional approaches to pattern classification involve
clustering training samples and associating the clusters with
given categories. The complexity and limitations of these
mechanisms are largely due to the lack of an effective way of
defining the boundaries among clusters. This problem becomes
more intractable as the number of features used for
classification increases.
In contrast, fuzzy classification treats the boundary between
two neighboring classes as a continuous, overlapping area
within which an object has partial membership in each class.
This viewpoint not only reflects the reality of many
applications in which categories have fuzzy boundaries, but
also provides a simple representation of the potentially
complex partition of the feature space. In brief, we use fuzzy
IF-THEN rules to describe a classifier.
Assume that K patterns xp = (xp1, ..., xpn), p = 1, ..., K,
are given from two classes, where xp is an n-dimensional crisp
vector. Typical fuzzy classification rules for n = 2 look like
If xp1 is small and xp2 is very large then
xp = (xp1 , xp2 ) belongs to C1
If xp1 is large and xp2 is very small then
xp = (xp1 , xp2 ) belongs to C2
where xp1 and xp2 are the features of pattern (or object) p,
and small and very large are linguistic terms characterized by
appropriate membership functions.
The firing level of a rule
ℜi : If xp1 is Ai and xp2 is Bi
then xp = (xp1, xp2) belongs to Ci
with respect to a given object xp is interpreted as the degree
to which xp belongs to Ci.
This firing level, denoted by αi, is usually determined as
αi = min{Ai(xp1), Bi(xp2)}.
As such, a fuzzy rule gives a meaningful expression of the qual-
itative aspects of human recognition.
Based on the result of pattern matching between rule antecedents
and input signals, a number of fuzzy rules are triggered in par-
allel with various values of firing strength.
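The firing-level formula above can be illustrated with a minimal Python sketch. The triangular membership functions, their breakpoints, and the names `triangular`, `small`, `large` and `firing_level` are assumptions made for this illustration only; the text does not prescribe particular membership functions.

```python
def triangular(a, b, c):
    """Return a triangular membership function with feet a, c and peak b."""
    def mu(u):
        if u <= a or u >= c:
            return 0.0
        if u <= b:
            return (u - a) / (b - a)
        return (c - u) / (c - b)
    return mu

# Hypothetical linguistic terms on [0, 1]; the breakpoints are illustrative.
small = triangular(-0.5, 0.0, 0.5)
large = triangular(0.5, 1.0, 1.5)

def firing_level(A, B, x):
    """Firing level min{A(x1), B(x2)} of the rule 'if x1 is A and x2 is B'."""
    x1, x2 = x
    return min(A(x1), B(x2))

x = (0.2, 0.9)
alpha_1 = firing_level(small, large, x)  # rule concluding C1
alpha_2 = firing_level(large, small, x)  # rule concluding C2
print(alpha_1, alpha_2)
```

With these assumed membership functions, the pattern (0.2, 0.9) fires the first rule much more strongly than the second, so it would be assigned mainly to C1.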
Individually invoked actions are then considered together by
means of a combination logic. Furthermore, we want the system
to be able to learn, that is, to update and fine-tune itself as
new information arrives. The task of fuzzy classification is to
generate an appropriate fuzzy partition of the feature space.
In this context the word appropriate means that the number of
misclassified patterns is very small or zero.
Then the rule base should be optimized by deleting rules which
are not used. Consider a two-class classification problem shown
in Figure 1. Suppose that the fuzzy partition for each input fea-
ture consists of three linguistic terms
{small, medium, big}
which are represented by triangular membership functions.
Both initial fuzzy partitions in Figure 1 satisfy
0.5-completeness for each input variable, and a pattern xp is
classified into Class j if there exists at least one rule for
Class j in the rule base whose firing strength with respect to
xp is greater than or equal to 0.5.
So a rule is created by finding for a given input pattern xp the
combination of fuzzy sets, where each yields the highest degree
of membership for the respective input feature.
If this combination is not identical to the antecedents of an al-
ready existing rule then a new rule is created.
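The rule-creation procedure just described can be sketched as follows. This is only an illustration under stated assumptions: the three terms are modeled by triangular functions with peaks at 0, 0.5 and 1 on the unit interval, the sample patterns are invented, and when two patterns with different labels map to the same antecedent the first label is kept (the text does not say how such conflicts are resolved).

```python
def tri(peak, width=0.5):
    """Triangular membership function centered at `peak` (assumed shape)."""
    return lambda u: max(0.0, 1.0 - abs(u - peak) / width)

# Assumed 0.5-complete partition of [0, 1] into three linguistic terms.
TERMS = {"small": tri(0.0), "medium": tri(0.5), "big": tri(1.0)}

def best_term(u):
    """Name of the fuzzy set with the highest membership degree for u."""
    return max(TERMS, key=lambda name: TERMS[name](u))

def generate_rules(patterns):
    """Create one rule per new antecedent combination seen in the data."""
    rules = {}
    for (x1, x2), label in patterns:
        antecedent = (best_term(x1), best_term(x2))
        if antecedent not in rules:  # identical antecedent: no new rule
            rules[antecedent] = label
    return rules

def classify(rules, x, threshold=0.5):
    """Classes having at least one rule that fires with strength >= threshold."""
    x1, x2 = x
    return {label for (t1, t2), label in rules.items()
            if min(TERMS[t1](x1), TERMS[t2](x2)) >= threshold}

# Invented training patterns: ((x1, x2), class label).
patterns = [((0.1, 0.9), 1), ((0.15, 0.8), 1), ((0.5, 0.2), 2), ((0.9, 0.1), 1)]
rules = generate_rules(patterns)
print(rules)
print(classify(rules, (0.2, 0.85)))
```

The first two patterns fall into the same fuzzy subspace (small, big), so only three rules are generated from the four patterns.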
However, it can occur that if the fuzzy partition is not set up cor-
rectly, or if the number of linguistic terms for the input features
is not large enough, then some patterns will be misclassified.
Figure 1: Initial fuzzy partition with 9 fuzzy subspaces and 2
misclassified patterns. Closed and open circles represent the
given patterns from Class 1 and Class 2, respectively. (Axes:
x1 with fuzzy sets A1, A2, A3; x2 with fuzzy sets B1, B2, B3.)
The following 9 rules can be generated from the initial fuzzy
partitions shown in Figure 1:
ℜ1 : If x1 is small and x2 is big then x = (x1, x2) belongs to C1
ℜ2 : If x1 is small and x2 is medium then x = (x1, x2) belongs to C1
ℜ3 : If x1 is small and x2 is small then x = (x1, x2) belongs to C1
ℜ4 : If x1 is big and x2 is small then x = (x1, x2) belongs to C1
ℜ5 : If x1 is big and x2 is big then x = (x1, x2) belongs to C1
ℜ6 : If x1 is medium and x2 is small then x = (x1, x2) belongs to C2
ℜ7 : If x1 is medium and x2 is medium then x = (x1, x2) belongs to C2
ℜ8 : If x1 is medium and x2 is big then x = (x1, x2) belongs to C2
ℜ9 : If x1 is big and x2 is medium then x = (x1, x2) belongs to C2
where we have used the linguistic terms small for A1 and B1 ,
medium for A2 and B2 , and big for A3 and B3 .
However, the same error rate can be reached by noticing that
if "x1 is medium" then the pattern (x1, x2) belongs to Class 2,
independently of the value of x2; that is, the following 7
rules provide the same classification result:
ℜ1 : If x1 is small and x2 is big then x = (x1, x2) belongs to C1
ℜ2 : If x1 is small and x2 is medium then x = (x1, x2) belongs to C1
ℜ3 : If x1 is small and x2 is small then x = (x1, x2) belongs to C1
ℜ4 : If x1 is big and x2 is small then x = (x1, x2) belongs to C1
ℜ5 : If x1 is big and x2 is big then x = (x1, x2) belongs to C1
ℜ6 : If x1 is medium then x = (x1, x2) belongs to C2
ℜ7 : If x1 is big and x2 is medium then x = (x1, x2) belongs to C2
As another example, let us consider a two-class classification
problem. In Figure 2 closed and open rectangles represent the
given patterns from Class 1 and Class 2, respectively. If one
tries to classify all the given patterns by fuzzy rules based
on a simple fuzzy grid, a fine fuzzy partition and 6 × 6 = 36
rules are required.
However, it is easy to see that the patterns from Figure 3 may be
correctly classified by the following five fuzzy IF-THEN rules:
ℜ1 : If x1 is very small then Class 1,
ℜ2 : If x1 is very large then Class 1,
ℜ3 : If x2 is very small then Class 1,
ℜ4 : If x2 is very large then Class 1,
ℜ5 : If x1 is not very small and x1 is not very large
and x2 is not very small and x2 is not very large
then Class 2
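As a rough illustration of how these five rules could be evaluated, the sketch below makes several assumptions that are not stated in the text: small and large are linear ramps on [0, 1], the hedge "very" is modeled by squaring the membership degree, "not" by the complement 1 - μ, "and" by min, and a pattern is assigned to the class whose rules fire most strongly.

```python
def small(u):
    """Assumed ramp: degree 1 at u = 0 falling linearly to 0 at u = 1."""
    return min(1.0, max(0.0, 1.0 - u))

def large(u):
    """Assumed ramp: degree 0 at u = 0 rising linearly to 1 at u = 1."""
    return min(1.0, max(0.0, u))

def very(mu):
    """Hedge 'very' modeled as concentration (squaring), an assumption."""
    return lambda u: mu(u) ** 2

def not_(mu):
    """Negation modeled as the standard complement 1 - mu."""
    return lambda u: 1.0 - mu(u)

very_small, very_large = very(small), very(large)

def classify(x1, x2):
    # Rules 1 to 4 conclude Class 1; take the strongest of them.
    class1 = max(very_small(x1), very_large(x1),
                 very_small(x2), very_large(x2))
    # Rule 5 concludes Class 2; 'and' is modeled by min.
    class2 = min(not_(very_small)(x1), not_(very_large)(x1),
                 not_(very_small)(x2), not_(very_large)(x2))
    return 1 if class1 >= class2 else 2

print(classify(0.5, 0.5), classify(0.05, 0.5), classify(0.5, 0.95))
```

Under these assumptions, patterns near the border of the unit square go to Class 1 and patterns in the interior go to Class 2, using five rules instead of the 36 required by the grid partition.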
Figure 2: A two-dimensional classification problem.
Figure 3: Fuzzy partition with 36 fuzzy subspaces. (Axes: X1
with fuzzy sets A1 to A6; X2 with fuzzy sets B1 to B6.)
2. An adaptive-network-based fuzzy classifier
Sun and Jang propose an adaptive-network-based fuzzy clas-
sifier to solve fuzzy classification problems. Figure 4 demon-
strates this classifier architecture with two input variables x1
and x2 . The training data are categorized by two classes C1 and
C2 . Each input is represented by two linguistic terms, thus we
have four rules.
Figure 4: An adaptive-network-based fuzzy classifier. (Layer 1:
membership nodes A1, A2, B1, B2; Layer 2: rule nodes T;
Layer 3: nodes S; Layer 4: outputs C1, C2.)
• Layer 1 The output of the node is the degree to which
the given input satisfies the linguistic label associated with
this node. Usually, we choose bell-shaped membership
functions
\[
A_i(u) = \exp\left[-\frac{1}{2}\left(\frac{u - a_{i1}}{b_{i1}}\right)^2\right],
\]
to represent the linguistic terms, where
{ai1, ai2, bi1, bi2},
is the parameter set.
As the values of these parameters change, the bell-shaped
functions vary accordingly, thus exhibiting various forms
of membership functions on linguistic labels Ai and Bi. In
fact, any continuous membership functions, such as
trapezoidal and triangular-shaped ones, are also qualified
candidates for node functions in this layer. The initial values
of the parameters are set in such a way that the membership
functions along each axis satisfy ε-completeness, normality
and convexity. The parameters are then tuned with a
descent-type method.
• Layer 2 Each node generates a signal corresponding to the
conjunctive combination of individual degrees of match.
All nodes in this layer are labeled by T, because we can
choose any t-norm for modeling the logical and operator.
The nodes of this layer are called rule nodes.
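Layers 1 and 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions: the functions Bi are taken to have the same bell shape as Ai with their own center and width, the numeric parameter values are invented, and the four rule nodes are taken to be all combinations of one A term with one B term.

```python
import math
from itertools import product

def bell(a, b):
    """Bell-shaped membership function exp(-1/2 * ((u - a) / b)^2)."""
    return lambda u: math.exp(-0.5 * ((u - a) / b) ** 2)

# Hypothetical initial parameters (center a, width b) for each linguistic term.
A = [bell(0.0, 0.4), bell(1.0, 0.4)]  # terms A1, A2 for x1
B = [bell(0.0, 0.4), bell(1.0, 0.4)]  # terms B1, B2 for x2 (assumed same form)

def rule_layer(x1, x2, t_norm=min):
    """Layer 1 membership degrees followed by Layer 2 rule nodes T."""
    degrees_a = [Ai(x1) for Ai in A]
    degrees_b = [Bj(x2) for Bj in B]
    # One rule node per pair (Ai, Bj), in the order
    # (A1,B1), (A1,B2), (A2,B1), (A2,B2).
    return [t_norm(a, b) for a, b in product(degrees_a, degrees_b)]

alphas_min = rule_layer(0.0, 1.0)                            # minimum t-norm
alphas_prod = rule_layer(0.0, 1.0, t_norm=lambda a, b: a * b)  # product t-norm
print(alphas_min)
print(alphas_prod)
```

Passing the t-norm as an argument mirrors the remark that any t-norm may be chosen for the T nodes; minimum and product are two common choices.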
We take the linear combination of the firing strengths of the
rules at Layer 3 and apply a sigmoidal function at Layer 4 to
calculate the degree of belonging to a certain class. If we are
given the training set $\{(x^k, y^k),\ k = 1, \dots, K\}$, where
$x^k$ refers to the k-th input pattern and
\[
y^k =
\begin{cases}
(1, 0)^T & \text{if } x^k \text{ belongs to Class 1} \\
(0, 1)^T & \text{if } x^k \text{ belongs to Class 2}
\end{cases}
\]
then the parameters of the hybrid neural net (which determine
the shape of the membership functions of the premises) can be
learned by descent-type methods.
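The last two layers can be illustrated by the following minimal sketch. The firing strengths and the weights of the linear combination are invented numbers, and the function names are hypothetical; in the classifier itself these quantities come from Layer 2 and from training.

```python
import math

def sigmoid(t):
    """Logistic sigmoid used here as the Layer 4 node function."""
    return 1.0 / (1.0 + math.exp(-t))

def class_outputs(alphas, weights):
    """Layer 3: linear combination of firing strengths; Layer 4: sigmoid."""
    return [sigmoid(sum(w * a for w, a in zip(row, alphas)))
            for row in weights]

# Hypothetical firing strengths of the four rules for one input pattern.
alphas = [0.05, 0.90, 0.02, 0.10]
# Hypothetical weights: one row per class output (C1, C2).
weights = [[-2.0, 4.0, -2.0, -2.0],
           [2.0, -4.0, 2.0, 2.0]]

o = class_outputs(alphas, weights)
print(o)  # degrees of belonging to Class 1 and Class 2
```

With these assumed values the first output is close to 1 and the second close to 0, which would be compared with the target vector (1, 0)^T for a Class 1 pattern.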
This architecture and learning procedure is called ANFIS (adaptive-
network-based fuzzy inference system) by Jang.
The error function for pattern k can be defined by
\[
E_k = \frac{1}{2}\left[(o_1^k - y_1^k)^2 + (o_2^k - y_2^k)^2\right]
\]
where $y^k$ is the desired output and $o^k$ is the output
computed by the hybrid neural net.
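As a small numerical illustration of this error function, the sketch below evaluates E_k for one invented output vector and also computes its partial derivatives with respect to the outputs, which are simply o_j - y_j and form the starting point of any descent-type update. The function names are hypothetical.

```python
def pattern_error(o, y):
    """E_k = 1/2 * sum_j (o_j - y_j)^2 for computed output o and target y."""
    return 0.5 * sum((oj - yj) ** 2 for oj, yj in zip(o, y))

def output_gradient(o, y):
    """Partial derivatives dE_k/do_j = o_j - y_j."""
    return [oj - yj for oj, yj in zip(o, y)]

o_k = (0.9, 0.2)  # invented network output for pattern k
y_k = (1.0, 0.0)  # target vector of a Class 1 pattern
E_k = pattern_error(o_k, y_k)
grad = output_gradient(o_k, y_k)
print(E_k, grad)
```

Propagating these derivatives back through the sigmoid, the linear combination and the t-norm nodes to the membership parameters is what a descent-type method would do; that part is not shown here.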