Digital Signal Processing 16 (2006) 922–930
www.elsevier.com/locate/dsp

The effect to diagnostic accuracy of decision tree classifier of fuzzy and k-NN based weighted pre-processing methods to diagnosis of erythemato-squamous diseases

Kemal Polat*, Salih Güneş
Electrical and Electronics Engineering Department, Selcuk University, 42035 Konya, Turkey
Available online 22 May 2006
Abstract

This paper presents a novel method for the differential diagnosis of erythemato-squamous diseases. The proposed method is based on fuzzy weighted pre-processing, k-NN (nearest neighbor) based weighted pre-processing, and a decision tree classifier, and it consists of three parts. In the first part, we used the decision tree classifier alone to diagnose erythemato-squamous diseases. In the second part, fuzzy weighted pre-processing, a new method developed by us, was applied to the inputs of the erythemato-squamous diseases dataset, and the obtained weighted inputs were classified using the decision tree classifier. In the third part, k-NN based weighted pre-processing, also a new method developed by us, was applied to the inputs of the dataset, and the obtained weighted inputs were classified via the decision tree classifier. The decision tree classifier alone, the fuzzy weighted pre-processing plus decision tree classifier, and the k-NN based weighted pre-processing plus decision tree classifier reached classification accuracies of 86.18, 97.57, and 99.00%, respectively, using 20-fold cross validation.
© 2006 Elsevier Inc. All rights reserved.
Keywords: Erythemato-squamous; Fuzzy weighted pre-processing; k-NN based weighting pre-processing; Decision tree classifier; k-Fold cross validation
1. Introduction
The differential diagnosis of erythemato-squamous diseases is a difficult and real problem in dermatology. Erythemato-squamous diseases all share the clinical features of erythema and scaling, with very little difference. The diseases in this group are psoriasis, seboreic dermatitis, lichen planus, pityriasis rosea, cronic dermatitis, and pityriasis rubra pilaris. Usually a biopsy is necessary for the diagnosis, but unfortunately these diseases share many histopathological features as well. Another difficulty for the differential diagnosis is that a disease may show the features of another disease at the beginning stage and may have its characteristic features only at the following stages. Patients were first evaluated clinically with 12 features. Afterwards, skin samples were taken for the evaluation of 22 histopathological features. The values of the histopathological features are determined by an analysis of the samples under a microscope [1].
* Corresponding author. Fax: +90 332 241 0635.
E-mail address: kpolat@selcuk.edu.tr (K. Polat).
1051-2004/$ – see front matter © 2006 Elsevier Inc. All rights reserved.
doi:10.1016/j.dsp.2006.04.007
The use of classifier systems in medical diagnosis is gradually increasing. There is no doubt that the evaluation of data taken from patients and the decisions of experts are the most important factors in diagnosis, but expert systems and other artificial intelligence techniques for classification also help experts a great deal. In this study, a diagnostic system that is more effective for the differential diagnosis of erythemato-squamous diseases is presented. Our primary research motivation was to advance the research on erythemato-squamous diseases. We employed fuzzy weighted pre-processing, k-NN based weighted pre-processing, and a decision tree classifier in order to distinguish erythemato-squamous diseases.

The rest of the paper is organized as follows. We present related work in Section 2. Section 3 gives a description of the erythemato-squamous diseases dataset. We present the proposed method in Section 4. In Section 5, we give the experimental results that show the effectiveness of our method. Finally, we conclude the paper in Section 6 with future directions.
2. Related work
Classification systems have been used for the erythemato-squamous disease diagnosis problem, as for other clinical diagnosis problems. Several studies focusing on erythemato-squamous disease diagnosis have been reported. These studies applied different methods to the problem and achieved high classification accuracies using the dataset taken from the UCI machine learning repository. Among these studies, the first work on the differential diagnosis of erythemato-squamous diseases was conducted by Govenir et al. [2]. In their study, they presented an expert system for differential diagnosis of erythemato-squamous diseases incorporating decisions made by three classification algorithms: nearest neighbor classifier, naive Bayesian classifier, and voting feature intervals-5. They obtained 99.2% classification accuracy on the differential diagnosis of erythemato-squamous diseases using voting feature intervals-5 and 10-fold cross validation [2,3]. Ubeyli and Guler [4] obtained 95.5% classification accuracy using ANFIS. Nanni [5] obtained 97.22% accuracy with each of the LSVM, RS, B1_5, B1_10, B1_15, B2_5, B2_10, and B2_15 algorithms. In this work, we obtained 99.00% classification accuracy.
3. Erythemato-squamous diseases dataset
The differential diagnosis of erythemato-squamous diseases is an important problem in dermatology. The diseases in this group are: psoriasis (C1), seboreic dermatitis (C2), lichen planus (C3), pityriasis rosea (C4), cronic dermatitis (C5), and pityriasis rubra pilaris (C6). They all share the clinical features of erythema and scaling, with very little difference. These diseases are frequently seen in the outpatient departments of dermatology. At first sight, all of the diseases look very much alike, with erythema and scaling. When inspected more carefully, some patients have the typical clinical features of the disease at the predilection sites (the localization of the skin that a disease prefers), while another group has an atypical localization. Patients are first evaluated clinically with 12 features. The degree of erythema and scaling, whether the borders of lesions are definite or not, the presence of itching and the koebner phenomenon, the form of the papules, whether the oral mucosa, elbows, knees, and scalp are involved or not, and whether there is a family history or not are important for the differential diagnosis. For example, the erythema and scaling of chronic dermatitis is less than that of psoriasis; the koebner phenomenon is present only in psoriasis, lichen planus, and pityriasis rosea. Itching and polygonal papules are characteristic of lichen planus, and follicular papules are characteristic of pityriasis rubra pilaris. The oral mucosa is a predilection site for lichen planus, while knee, elbow, and scalp involvement are characteristic of psoriasis. Family history is usually present for psoriasis, and pityriasis rubra pilaris usually starts during childhood. Some patients can be diagnosed with these clinical features only, but usually a biopsy is necessary for a correct and definite diagnosis. Skin samples were taken for the evaluation of 22 histopathological features.

Another difficulty for the differential diagnosis is that a disease may show the histopathological features of another disease at the beginning stage and may have its characteristic features only at the following stages. Some samples show the typical histopathological features of the disease while some do not. Melanin incontinence is a diagnostic feature for lichen planus; fibrosis of the papillary dermis is diagnostic for chronic dermatitis; exocytosis may be seen in lichen planus, pityriasis rosea, and seboreic dermatitis. Acanthosis and parakeratosis can be seen in all the diseases in different degrees. Clubbing of the rete ridges and thinning of the suprapapillary epidermis are diagnostic for psoriasis. Disappearance of the granular layer, vacuolization and damage of the basal layer,
Table 1

Class code | Class                    | Number of instances
1          | Psoriasis                | 112
2          | Seboreic dermatitis      | 61
3          | Lichen planus            | 72
4          | Pityriasis rosea         | 49
5          | Cronic dermatitis        | 52
6          | Pityriasis rubra pilaris | 20

Fig. 1. The procedure used in the development of the proposed method.
saw-tooth appearance of the retes, and a band-like infiltrate are diagnostic for lichen planus. Follicular horn plugs and perifollicular parakeratosis are hints for pityriasis rubra pilaris. The features of a patient are represented as a feature vector with 34 entries, one for each feature value. In the dataset, the family history feature has the value 1 if any of these diseases has been observed in the family and 0 otherwise. The age feature simply represents the age of the patient. Every other feature (clinical and histopathological) was given a degree in the range 0–3. Here 0 indicates that the feature was not present, 3 indicates the largest amount possible, and 1 and 2 indicate the relative intermediate values. Each feature has either a nominal (discrete) or a linear (continuous) value with a different weight showing its relevance to the diagnosis [2,3]. This erythemato-squamous diseases database comes from Gazi University and Bilkent University and was supplied by N. Ilter, M.D., Ph.D., and H.A. Govenir, Ph.D. This dataset contains 34 attributes, 33 of which are linear valued and one of which is nominal. The dataset originally contains 366 instances. The distribution of this dataset over the class variable is given in Table 1 [1].
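The record layout described above (34 comma-separated attribute values followed by the class label 1–6, with the age attribute possibly missing) can be sketched with a small parser. The sample rows below are fabricated for illustration only; they are not real dataset entries, and `parse_record` is our own helper, not part of any published code. We assume missing values are encoded as "?", as in the UCI copy of the dataset.

```python
def parse_record(line):
    """Split one record into (list of 34 feature values, class label).

    Missing values ('?') become None; everything else is parsed as int.
    """
    *raw, label = line.strip().split(",")
    features = [None if f == "?" else int(f) for f in raw]
    return features, int(label)

# Fabricated example row: 33 degree/history values, age 55, class 1 (psoriasis)
row = ",".join(["2", "2", "0", "3"] + ["0"] * 29 + ["55", "1"])
features, cls = parse_record(row)
```

A row with a missing age would carry `"?"` in the 34th field and parse to `None` there, which a downstream pipeline must impute or drop before weighting.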
4. Proposed method
4.1. Overview of the proposed system
Figure 1 shows the procedure used in the development of the proposed method. It consists of three parts: (a) fuzzy weighted pre-processing or k-NN based weighted pre-processing, (b) decision tree classifier, (c) classification results.
4.1.1. Fuzzy weighted pre-processing
In fuzzy weighted pre-processing, each feature takes a new value according to its old value. Two membership functions, known as the input and output membership functions, are used in this procedure. Both are selected as triangular membership functions, as shown in Figs. 2 and 3, respectively.
Fig. 2. Input membership functions.
Fig. 3. Output membership functions.
First, the formation of these membership functions is realized as follows. As a first step, the mean value of each feature is calculated using all of the samples' corresponding feature values:

$$m_i = \frac{1}{N}\sum_{k=1}^{N} x_{k,i}, \qquad (1)$$

where $x_{k,i}$ represents the $i$th feature value of sample $x_k$, $k = 1, 2, \ldots, N$. After these sample means are calculated for each feature, the input membership function is formed by triangles as in Fig. 2. The supports of these triangles are determined by $m_i/8$, $m_i/4$, $m_i/2$, $m_i$, $2m_i$, $4m_i$, and $8m_i$, as shown in Fig. 2. The lines of the input membership functions are named $m_{f1}, m_{f2}, \ldots, m_{f8}$. The output membership function is likewise formed from 8 parts (Fig. 3): the interval [0, 1] is divided into 8 equal parts, and the corresponding lines are named $m'_{f1}, m'_{f2}, \ldots, m'_{f8}$. It is worth noting that these input and output membership functions are formed separately for each feature, so a different input-output membership function configuration exists for each feature, since the sample means of the features differ.

After the input and output membership functions are determined for each feature, the weighting procedure comes into play. A feature value $x_{k,i}$, that is, the $i$th feature value of sample $x_k$, is taken as a point on the $x$-axis of the input membership function, and the $y$ values of the points at which it cuts the input membership functions are determined. For example, if this feature value is between 0 and $m_i/8$, then it cuts both line $m_{f1}$ and line $m_{f2}$. The $y$ values at these intersection points, say $y_1$ and $y_2$, are known as membership values ($\mu$) and are then used in a fuzzy rule base in the following manner. First, the input membership value $\mu(i)$ is determined from the intersection points:

$$\mu(i) = \mu_{A \cap B}(x_{k,i}) = \mathrm{MIN}\bigl[\mu_A(x_{k,i}),\ \mu_B(x_{k,i})\bigr], \quad x \in X, \qquad (2)$$

where the membership values $\mu_A(x_{k,i})$ and $\mu_B(x_{k,i})$ correspond to the intersection points mentioned above. The rule base used for our system is presented in Table 2. After the $\mu(i)$ value is determined through Eq. (2) for
Table 2
Fuzzy rule base for our system

For each k value, k = 1, ..., 7: if the input value cuts $m_{f(k)}$ and $m_{f(k+1)}$, then output value $= \bigl(m'_{f(k)}(y) + m'_{f(k+1)}(y)\bigr)/2$.
our $x_{k,i}$ feature value, the output weight value is then determined by using the output membership functions and the rules in Table 2. In this last step of determining the weight, the input membership value $\mu(i)$ is presented to the output membership function to determine the corresponding weighted value of our original feature value. This membership value is taken as a point on the $y$-axis of the output membership functions and, as in the case of the input membership functions, the intersection points cut by this membership value are determined. It is apparent from the output membership functions that there will be more than one intersection point; which of them is used is decided through the rules in Table 2. For example, if the input feature value cuts lines $m_{f1}$ and $m_{f2}$ in the input membership functions, then the output value for this feature is the mean of the two points at which $\mu(i)$ cuts $m'_{f1}$ and $m'_{f2}$ in the output membership functions [6].
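The exact triangle geometry and output-line orientation are left implicit in the description above, so the following is one possible reading, not the authors' implementation. It assumes (our assumptions) that consecutive breakpoints bound one descending and one ascending input line over each segment, that $\mu(i)$ is the minimum of the two cuts as in Eq. (2), and that the height $\mu(i)$ cuts the $k$th and $(k+1)$th output lines inside the $k$th of eight equal parts of [0, 1], which are then averaged as in Table 2. The names `breakpoints` and `fuzzy_weight` are ours, and the feature mean $m$ is assumed positive.

```python
def breakpoints(m):
    """Triangle supports m/8, m/4, m/2, m, 2m, 4m, 8m from the feature mean m (m > 0)."""
    return [m / 8, m / 4, m / 2, m, 2 * m, 4 * m, 8 * m]

def fuzzy_weight(x, m):
    """Map one feature value x to a weight in [0, 1] (simplified reading)."""
    b = [0.0] + breakpoints(m)          # segment edges
    # locate the segment [b[k], b[k+1]] containing x (clipped to the last segment)
    k = 0
    while k < len(b) - 2 and x > b[k + 1]:
        k += 1
    lo, hi = b[k], b[k + 1]
    mu_up = (x - lo) / (hi - lo)        # ascending line cut
    mu_down = 1.0 - mu_up               # descending line cut
    mu = min(mu_up, mu_down)            # Eq. (2): MIN of the two memberships
    # Table 2: average where height mu cuts the kth and (k+1)th output lines,
    # each ascending over its own eighth of [0, 1]
    cut_a = (k + mu) / 8.0
    cut_b = (k + 1 + mu) / 8.0
    return (cut_a + cut_b) / 2.0
```

Under these assumptions a feature value equal to its own mean falls on a triangle vertex ($\mu = 0$) and maps to 0.4375; values inside a segment move the weight within that segment's eighth of [0, 1].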
4.1.2. k-NN based weighting pre-processing
We know from fuzzy systems that attributing new values (membership values, in the fuzzy-logic case) to the data samples can sometimes give better results with respect to classification accuracy. That process gives each attribute of each data sample a new value according to the membership functions. In this study, however, we performed this redetermination of the attribute values of each data sample by using the k-NN algorithm in the following manner (Fig. 4).

The weighting process is conducted for each attribute separately. For example, all of the values of the first attribute across the whole dataset are reconsidered and changed according to the k-NN scheme, then the same process is conducted for the second attribute values of the data samples, and so on. The process of determining one attribute value of one data sample can be explained simply as follows. Let $i$ be the data sample label and $j$ be the attribute index; that is, we are searching for a new value for the $j$th attribute of the $i$th data sample. As can be seen from Fig. 4, the first step is to calculate the distances from this attribute value to all other attribute values of the same data sample (the $i$th data sample in this case). Here a simple absolute difference is utilized as the distance measure:

$$d\bigl(x_i(j), x_i(m)\bigr) = \bigl|x_i(j) - x_i(m)\bigr|, \qquad (3)$$

where $x_i(j)$ is the $j$th attribute of the $i$th data sample and $x_i(m)$ is the $m$th attribute of the $i$th data sample. After the calculation of the distances, the nearest $k$ attribute values are taken and their mean value is calculated:

$$\text{mean\_value}(j) = \frac{1}{k}\sum_{n=1}^{k} \text{attr}_n, \qquad \text{attr}_n \in \text{attr}_k, \qquad (4)$$

where $\text{attr}_n$ represents the value of the $n$th attribute among the $k$-nearest attribute values and $k$ is the number of nearest points to the related attribute value $j$. This calculated mean value is taken as the new value of the attribute for which these calculations are done; that is, the $j$th value of the $i$th data sample is replaced with this mean value. The process is conducted for each attribute value of the same data sample in the same manner.
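The steps of Eqs. (3) and (4) can be sketched as follows. Note that the prose (values of one attribute reconsidered across the whole dataset) and the equations (distances among attributes of the same sample) admit different readings; this sketch follows the equations literally, and the function names are ours.

```python
def knn_weight_sample(sample, k):
    """Replace each attribute value of one sample by the mean of the k
    attribute values of the SAME sample nearest to it (Eqs. (3)-(4))."""
    weighted = []
    for j, v in enumerate(sample):
        # Eq. (3): absolute-difference distances to every other attribute value
        others = [sample[m] for m in range(len(sample)) if m != j]
        others.sort(key=lambda w: abs(w - v))
        nearest = others[:k]
        # Eq. (4): mean of the k nearest attribute values
        weighted.append(sum(nearest) / k)
    return weighted

def knn_weight_dataset(data, k):
    """Apply the per-sample weighting to every sample in the dataset."""
    return [knn_weight_sample(s, k) for s in data]
```

For instance, with `k = 1` each value is simply replaced by the single closest other attribute value of the same sample; larger `k` smooths each value toward the sample's overall level.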
4.1.3. Decision tree classifier
Decision tree learning is one of the most widely used and practical methods for inductive inference. It is a methodfor approximating discrete-valued functions that is robust to noisy data and capable of learning disjunctive expressions[7,8].
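The classifiers above are evaluated with 20-fold cross validation. Since the paper does not give implementation details for the decision tree, the stdlib-only sketch below shows only the evaluation scaffold: `train_fn` is a placeholder for any classifier (a decision tree learner in the paper's case), and all function names are ours.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices once and split them into k (nearly) equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]

def cross_validate(X, y, train_fn, k=20):
    """Mean accuracy over k folds; train_fn(X_tr, y_tr) returns predict(x)."""
    folds = k_fold_indices(len(X), k)
    accs = []
    for f, test_idx in enumerate(folds):
        # train on every fold except the held-out one
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        predict = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(X[i]) == y[i] for i in test_idx)
        accs.append(correct / len(test_idx))
    return sum(accs) / k
```

With the 366-instance dataset and k = 20, each fold holds out 18 or 19 samples; the reported 86.18, 97.57, and 99.00% figures correspond to this kind of fold-averaged accuracy.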
