A COMPARATIVE ASSESSMENT OF NEURAL NETWORK, FUZZY AND NEURO-FUZZY APPROACHES FOR LANDSLIDE SUSCEPTIBILITY ZONATION IN GARHWAL HIMALAYAS

Availability of accurate and objective landslide susceptibility maps, depicting zones defined on the basis of the probability of occurrence of landslides, is one of the critical inputs in assessing risk to property and lives in any mountainous region, particularly in the Himalayas. The aim of this study is to assess the utility of soft computing tools, namely neural network, fuzzy and neuro-fuzzy approaches, for landslide susceptibility zonation and risk assessment in a rugged mountainous terrain in India.


INTRODUCTION
Landslides are one of the major and most widespread natural hazards in the Himalaya; they frequently damage life and property and are a major concern. One of the requirements for an effective landslide mitigation and management programme is the availability of an accurate Landslide Susceptibility Zonation (LSZ) map. LSZ maps categorize a region according to its potential stability or instability, based on geological, geomorphological and topographical factors. The need for LSZ maps at different scales has increased in the recent past to support decision makers at various levels of territorial planning and management.
Preparation of LSZ maps requires evaluation of the relationships between various terrain conditions and instances of landslide occurrence. A skilled earth scientist, drawing on experience in assessing the overall terrain conditions, usually identifies the causative factors affecting the occurrence of landslides in a region. Based on this assessment, the factors and their categories are assigned weights and ratings, respectively, according to their importance in landslide occurrence. The knowledge in the form of weights and ratings can be input to an LSZ process in several different ways.
In this study, landslide susceptibility zonation has been carried out using neural, fuzzy and neuro-fuzzy approaches in the remote sensing and GIS domain. A set of raster thematic data layers, each pertaining to a selected causative factor, has been created through digitization of past maps or derived from remote sensing images using image processing and GIS software. These data have been input to an in-house developed software, namely LaSaRiZ (Landslide Susceptibility and Risk Zonation). The outputs from this software are the landslide susceptibility zonation maps produced from the various approaches. The landslide susceptibility maps derived from the implemented approaches have been evaluated through landslide density analysis and interpretation of Receiver Operating Characteristic (ROC) curves.

Literature Review
The data integration for various causative factors and their categories to carry out LSZ mapping can either be manual or in a Geographic Information System (GIS) environment, as is evident from a number of in-house studies (e.g., Gupta et al. 1993, Saha et al. 2002, Saha et al. 2005, Pareek et al., 2010, Chauhan et al., 2010a). In these studies, the weights have usually been assigned on the basis of expert knowledge of the subject and the area. This weight assignment strategy may, at times, be highly subjective and may therefore contain some implicit bias. Therefore, to reduce the subjectivity in the weight assignment procedure, a number of alternative strategies (e.g., Saha et al. 2005, Mathew et al. 2007, Champati Ray et al., 2007, Pradhan et al. 2009) have been attempted in recent years for LSZ mapping in the Himalaya and other parts of the world. Most of these studies are based on establishing the relationships between categories of the causative factors and incidences of existing landslides in a given region through spatial data analyses. Thus, a range of data driven approaches have been proposed, which include logistic regression and multivariate statistical methods (Dai et al. 2001; Sezer et al., 2011). Each approach is based on a different mathematical concept but has been used with the ultimate aim of producing an LSZ map in an objective manner, thereby reducing the subjectivity in the weight assignment procedure.
Over the last decade, several ANN based LSZ studies have been conducted (e.g., Lee et al., 2003; Arora et al., 2004; Kanungo et al., 2006; Nefeslioglu et al., 2008; Pradhan and Lee, 2010; Pradhan et al., 2010a, b; Sezer et al., 2011). In these studies, the ability of an ANN to learn non-linear functions has been exploited for landslide susceptibility mapping in regions where the data pertaining to causative factors may not be approximated by a normal distribution and are non-linearly related.
In addition, fuzzy set theory has also been found useful for landslide mapping (e.g., Chi et al. 2002). Here, a fuzzy set can be utilized to assign varying degrees of membership to the categories of causative factors according to their importance in landslide occurrence. Recently, attempts have also been made to combine the ANN and fuzzy set approaches, wherein weights to be assigned to causative factors are determined objectively through ANN and ratings are assigned to categories of factors using fuzzy set theory (e.g., Kanungo et al., 2006; Pradhan et al., 2010b; Sezer et al., 2011).
In this study, the traditional backpropagation ANN has been implemented for the determination of weights of causative factors via a connectionist weighting process (Olden et al. 2004). The backpropagation ANN, as implemented in the software, can also be used as a black box to independently produce an LSZ map. A fuzzy relation concept based on the cosine amplitude method has been implemented to determine ratings (equivalent to fuzzy membership values) of categories of causative factors. The fuzzy relation concept can be used to independently produce a fuzzy-set based LSZ map and can be combined with the neural network to produce a neuro-fuzzy based LSZ map.

Back-propagation neural network (BPNN) approach
A typical neural network architecture for LSZ mapping may be a three layer design with seven input neurons corresponding to seven thematic data layers (one for each causative factor affecting landslide) in the input layer and one output neuron corresponding to presence or absence of landslide in the output layer. The number of neurons in the hidden layer is determined by trial and error.
The back propagation neural network (BPNN) is based on the mathematical background given in Arora et al. (2004) and Kanungo et al. (2006). The flowchart for implementing the BPNN as a black box for LSZ in LaSaRiZ is given in Figure 1.

[Figure 1: Flowchart of the BPNN black box approach for LSZ as implemented in LaSaRiZ. Recoverable steps: compute the output at hidden units and output units using Equation (1) and then apply the sigmoid function; compute the network error using Equation (2).]
The training process is initiated by assigning arbitrary initial connection weights, which are constantly updated until an acceptable training accuracy is attained. While developing an ANN, the data are commonly partitioned into two subsets: training, for the development of the model, and testing, for the validation of the model. It is expected that the training data will represent all the characteristics belonging to the problem domain (e.g., LSZ). The training data via input neurons are processed through hidden neurons to generate an output at the output neuron.
In the ANN black box approach, the adjusted weights obtained from the trained network are subsequently used to process the testing data to assess the accuracy and generalization capability of the network. Once the network is trained and tested to the desired accuracy, the adjusted weights are used to simulate the complete dataset. In the present context, the network output values for the whole dataset have been categorized into one of five landslide susceptibility zones to produce the LSZ map from ANN. This approach is referred to as the ANN black box approach since, in this case, the weights remain hidden (Arora et al. 2004).
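As an illustration of the forward pass described above, the following minimal sketch propagates the seven factor values of one pixel through a three layer network with sigmoid activations. The hidden layer size, the random weights, the bias terms and the function names are assumptions made for illustration; they are not taken from LaSaRiZ or from the equations of the cited papers.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w_ih, b_h, w_ho, b_o):
    """Forward pass of a 7 x h x 1 network; the output lies in (0, 1)."""
    hidden = sigmoid(x @ w_ih + b_h)       # hidden-layer activations
    return sigmoid(hidden @ w_ho + b_o)    # single output neuron

rng = np.random.default_rng(0)
n_in, n_hidden = 7, 5                       # hidden size is a hypothetical choice
w_ih = rng.normal(scale=0.5, size=(n_in, n_hidden))
b_h = np.zeros(n_hidden)
w_ho = rng.normal(scale=0.5, size=(n_hidden, 1))
b_o = np.zeros(1)

pixel = rng.random(n_in)                    # hypothetical normalised factor values
output = forward(pixel, w_ih, b_h, w_ho, b_o)
```

In a black box workflow of this kind, the output for every pixel would be stored and later binned into susceptibility zones, while the weights themselves are never inspected.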

Fuzzy set based cosine amplitude approach
The cosine amplitude approach is a fuzzy relation concept to compute ratings, which define the degree of relationship between a category of a causative factor and landslide occurrence. The flowchart of this approach, as implemented in the software, is shown in Figure 2. The approach evaluates the relation between the existing landslide occurrence and the categories of each causative factor considered. The categories of the existing landslide distribution layer and the categories from each thematic data layer (corresponding to each causative factor), taken one at a time, have been considered as two binary datasets for the computation of ratings or strength of relationship. The ratings thus obtained have been integrated to estimate the Landslide Susceptibility Index (LSI) values. The LSI values have been categorised in an ordinal manner to produce an LSZ map.
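The rating computation can be sketched as follows, assuming the commonly used form of the cosine amplitude formula, in which the rating is the sum of products of the two layers divided by the square root of the product of their sums of squares. The small rasters and the function name are hypothetical and serve only as an illustration.

```python
import numpy as np

def cosine_amplitude(category_layer, landslide_layer):
    """Strength of relationship between two binary rasters, in [0, 1]."""
    a = np.asarray(category_layer, dtype=float).ravel()
    b = np.asarray(landslide_layer, dtype=float).ravel()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    if denom == 0.0:
        return 0.0  # an empty layer carries no relationship
    return float(abs(np.sum(a * b)) / denom)

# Hypothetical 3 x 3 rasters: 1 marks category pixels / landslide pixels.
category = np.array([[1, 1, 0],
                     [1, 1, 0],
                     [0, 0, 0]])
landslide = np.array([[1, 0, 0],
                      [0, 0, 0],
                      [0, 0, 0]])
rating = cosine_amplitude(category, landslide)  # 1 / sqrt(4 * 1) = 0.5
```

For binary layers the numerator is simply the number of pixels where the category and the landslides overlap, so a category that coincides exactly with the landslide pixels receives a rating of 1.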

Combined neural network and fuzzy approach
A combined neural network and fuzzy approach takes advantage of both by integrating ANN derived weights and fuzzy set derived ratings to produce an LSZ map. This map is expected to be more accurate than the LSZ maps produced from either the ANN or the fuzzy relation concept alone. The implementation of this approach can be understood with the help of the flowchart given in Figure 3. The combined neural and fuzzy approach involves three steps: (i) determination of weights of causative factors through the ANN connection-weight approach; (ii) determination of ratings for categories of causative factors using the cosine amplitude method; and (iii) integration of weights and ratings to produce the LSZ map. The processing steps in defining the architecture of the ANN, its training and testing are similar to those of the ANN black box approach. The connection weight matrices from the trained network with two hidden layers are extracted for the input-hidden, hidden-hidden and hidden-output layers. Simple matrix multiplications of these weight matrices are performed to obtain a weight matrix representing the weights corresponding to the causative factors (Olden et al. 2004). The ratings of the categories of causative factors are determined from the cosine amplitude method. The weights derived from the ANN and the ratings computed from the cosine amplitude method are integrated to compute LSI values, which have been further categorized to produce the LSZ map.
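The matrix multiplication step can be sketched as below. This is a minimal illustration that assumes bias terms are ignored and that no normalisation or absolute value is applied to the product; the text does not state either way. The tiny network and the function name are hypothetical.

```python
import numpy as np

def connection_weights(w_ih1, w_h1h2, w_h2o):
    """Multiply the layer weight matrices to get one weight per input factor."""
    return (w_ih1 @ w_h1h2 @ w_h2o).ravel()

# Hypothetical tiny network: 3 factors, two hidden layers of 2 neurons, 1 output.
w_ih1 = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])      # input -> first hidden layer
w_h1h2 = np.array([[1.0, 2.0],
                   [0.0, 1.0]])     # first hidden -> second hidden layer
w_h2o = np.array([[1.0],
                  [1.0]])           # second hidden layer -> output
factor_weights = connection_weights(w_ih1, w_h1h2, w_h2o)
```

Each entry of the result sums the products of weights along every path from one input neuron to the output neuron, which is why a single number per causative factor is obtained.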

Study Area and Data
The study area covers parts of the Chamoli and Rudraprayag districts of the State of Uttarakhand, India, in the Himalayan region, and spans about 600 sq km. A number of thrusts and faults pass through the area, which have rendered the rock mass weak. The area is also characterized by fragile geology and complex tectonics. The region has also witnessed two major earthquakes in the recent past, Uttarkashi in 1991 and Chamoli in 1999, which caused extensive damage to life and property. These seismic ground movements make the lithology fragile and cause landslides.
All these factors, along with torrential rainfall, make the slopes inherently unstable, leading to the occurrence of landslides in the region. Based on past studies on LSZ in the region, seven key causative factors, namely slope, aspect, relative relief, lithology, structural features, land use land cover and drainage density, were considered.
Spatial data pertaining to these causative factors were collected from satellite remote sensing images (IRS-1C and P6 satellite sensors: PAN, LISS-III and LISS IV), Survey of India (SOI) topographic maps, Valdiya's geological map (Valdiya, 1980) and field campaigns. Table 1 provides the details of these data along with their usage in the study. These spatial data were appropriately processed and analyzed in image processing and GIS software to prepare seven thematic data layers. Further details on the study area, the causative factors and their justification, and the process of preparing the thematic data layers can be found in Chauhan et al. (2010a, b). The seven thematic data layers have been named slope, aspect, relative relief, structural buffer, lithology, drainage density and landuse landcover. These layers are stacked as a single database to be input to LaSaRiZ for various operations.
The identification and mapping of existing landslides is a prerequisite for developing any data-driven model for LSZ. Therefore, existing landslide locations have been interpreted visually and mapped from a high-resolution LISS IV (MX) image and a PAN sharpened multispectral image. A total of 154 landslides of varying dimensions have been mapped, which were subsequently digitized and rasterized to create a landslide distribution data layer. From this landslide distribution data layer, the pixels with landslide and no-landslide attributes are extracted and then used for training and testing of the three approaches for LSZ mapping implemented in the LaSaRiZ software.

LSZ using back-propagation neural network approach
The database of landslide and non-landslide pixels consists of a total of 2621 pixels denoting the presence of landslide and an equal number of pixels denoting the absence of landslide. This dataset is divided into two mutually exclusive datasets: 80% for training and 20% for testing. A number of ANN architectures have been designed and trained with the back-propagation learning algorithm, with a learning rate of 0.01 and a momentum factor of 0.2. The training parameters have been kept fixed across the various neural networks for their effective comparison. The training process is initiated by assigning arbitrary initial connection weights, which are constantly updated until an acceptable training accuracy is achieved. The adjusted weights obtained from the trained network are subsequently used to process the testing data to assess the accuracy and the generalization capability of each network. The training and testing data accuracies of each network are listed in Table 2. From this table, it can be observed that as the neural network architecture changes, the training and testing data accuracies increase up to a certain neural network design, after which the accuracy decreases. This shows that there is an optimum ANN architecture for this dataset. Moreover, the training data accuracies differ from the testing data accuracies for the various architectures. The larger the difference between training and testing data accuracies, the lower the generalization capability of an ANN may be. Keeping this in view, an ANN with 7×9×5×1 architecture, producing a training accuracy of 75.2% and a testing accuracy of 71.7%, has been considered the most appropriate one for the current dataset, since it shows a small difference between the training and testing data accuracies while attaining high absolute values. Thus, the adjusted weights from this ANN have been used to determine the network output for all the pixels of the image. The neural network outputs for the pixels have been found to vary from 0.01 to 0.998, and these reflect the LSI values of the pixels. The higher the LSI value, the more susceptible that pixel is to the occurrence of landslide. These LSI values have been categorized arbitrarily into five landslide susceptibility zones in an ordinal fashion (Table 3) to produce an LSZ map.
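To make the role of the stated learning rate and momentum factor concrete, the sketch below applies two weight updates in the common form of gradient descent with momentum. The exact update variant used in LaSaRiZ is not given in the text, so this form, the gradient values and the function name are assumptions.

```python
import numpy as np

LEARNING_RATE = 0.01   # value stated in the text
MOMENTUM = 0.2         # value stated in the text

def momentum_step(weights, gradient, previous_delta):
    """One gradient-descent update with a momentum term."""
    delta = -LEARNING_RATE * gradient + MOMENTUM * previous_delta
    return weights + delta, delta

w = np.array([0.5, -0.3])                # hypothetical connection weights
grad = np.array([1.0, -2.0])             # hypothetical error gradient
w, delta = momentum_step(w, grad, np.zeros_like(w))
w, delta = momentum_step(w, grad, delta)
```

With a constant gradient, the second step is 20% larger than the first because a fraction of the previous change is carried forward, which is the smoothing effect the momentum factor is meant to provide.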

LSZ using cosine amplitude based fuzzy relation approach
The categories of the landslide distribution layer and the categories of a thematic data layer, taken one at a time, have been considered as two datasets for the computation of ratings or strength of relationship. The pixels in the landslide areas have been assigned a value of 1, whereas the remaining pixels are assigned a value of 0 in the landslide distribution layer. Similarly, a value of 1 has been assigned to a particular category of a thematic layer and a value of 0 to the remaining pixels. Hence, in total, there are 43 data layers in binary form (i.e., 42 layers of categories of causative factors and 1 layer of the landslide distribution category). With the help of these data layers, the fuzzy memberships or ratings of all 42 categories have been determined using the cosine amplitude method and are given in Table 4. By assigning the ratings of the 42 categories (t) to the corresponding binary layers of categories, 42 images of r_ij have been generated. These 42 rated images (R_l) have been integrated to obtain LSI values, which are found to range between 0.045 and 0.274. A probability distribution curve of these LSI values, with a mean (µ_0) of 0.15 and a standard deviation (σ_0) of 0.026, has been produced. A success rate curve approach, as described in Saha et al. (2005), has been adopted to categorize the LSI values into five ordinal landslide susceptibility zones. Accordingly, the boundaries of the landslide susceptibility zones have been fixed at LSI values of 0.107, 0.136, 0.164 and 0.193 to produce an LSZ map.
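The integration of rated layers into LSI values can be sketched as follows, assuming that integration means a per-pixel sum of the rating of the category present in each factor layer; the text does not spell out the operator. The two factor layers, their category codes and the ratings are hypothetical.

```python
import numpy as np

def lsi_from_ratings(factor_layers, rating_tables):
    """Sum, per pixel, the rating of the category present in each factor layer."""
    lsi = np.zeros(factor_layers[0].shape, dtype=float)
    for layer, table in zip(factor_layers, rating_tables):
        lsi += np.asarray(table)[layer]   # category code indexes its rating
    return lsi

# Hypothetical example: two factors on a 2 x 2 grid, category codes start at 0.
slope = np.array([[0, 1],
                  [1, 2]])
lithology = np.array([[1, 1],
                      [0, 0]])
slope_ratings = [0.02, 0.05, 0.09]        # hypothetical rating per slope category
lithology_ratings = [0.01, 0.04]          # hypothetical rating per lithology category
lsi = lsi_from_ratings([slope, lithology], [slope_ratings, lithology_ratings])
```

Indexing a rating table with the category raster is equivalent to multiplying each binary category layer by its rating and adding the results, but avoids building all the binary layers explicitly.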

LSZ using combined neural network and fuzzy approach
An ANN architecture with one input layer, two hidden layers and one output layer has been adopted. As in the earlier ANN approach, the number of neurons in the hidden layers has been varied by running the networks several times to achieve the desired training and testing data accuracies. The training and testing accuracies of 10 networks are shown in Table 5.
From this table, a variation in both training and testing accuracies can be noticed as the neural network architecture changes. This suggests that there exists an optimal neural network architecture for a given dataset. An ANN with 7×12×10×1 architecture, with a training accuracy of 79.3% and a testing accuracy of 74.8%, has been found to be the most appropriate one, as it provides the least difference between training and testing accuracies. The updated weights of the input-hidden, hidden-hidden and hidden-output connections for this network have been captured for further analysis. Simple matrix multiplications of these weight matrices are performed to obtain the final weight matrix corresponding to the factors. The weights of the causative factors thus obtained from the connection weight analysis are given in Table 6. It can be observed from this table that the land use land cover factor is the most influential, with the highest weight of 3.14. This is followed by the structural features factor with a weight of 2.07. Thus, unlike the ANN black box approach, the importance of each causative factor can be ascertained in this approach on the basis of the ANN derived weights. Further, the ratings of each category of the factors have been determined from the fuzzy relation based cosine amplitude method. The weights are integrated with the ratings to compute LSI values, which range from 0.0582 to 0.3849 with a mean of 0.22 and a standard deviation of 0.048. These LSI values have been categorized using the success rate curve method. Accordingly, the boundaries of the landslide susceptibility zones were fixed at LSI values of 0.148, 0.196, 0.244 and 0.292 to produce the LSZ map.
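The categorization of LSI values into five ordinal zones with the stated boundaries can be sketched as below. The zone abbreviations other than VHS and VLS, and the treatment of values that fall exactly on a boundary, are assumptions. The sketch also checks an arithmetic observation about the reported numbers: the four boundaries equal the mean minus and plus 0.5 and 1.5 standard deviations. This is an observation only, not a description of how the success rate curve method selects them.

```python
import numpy as np

ZONES = ["VLS", "LS", "MS", "HS", "VHS"]              # LS, MS, HS are assumed labels
BOUNDARIES = np.array([0.148, 0.196, 0.244, 0.292])   # values stated in the text

def classify_lsi(lsi):
    """Map LSI values to ordinal zone indices 0 (VLS) to 4 (VHS)."""
    return np.digitize(lsi, BOUNDARIES)

# Reported mean and standard deviation of the LSI values.
mean, sd = 0.22, 0.048
derived = np.array([mean - 1.5 * sd, mean - 0.5 * sd,
                    mean + 0.5 * sd, mean + 1.5 * sd])

sample = np.array([0.0582, 0.15, 0.22, 0.25, 0.3849])  # includes the reported extremes
labels = [ZONES[i] for i in classify_lsi(sample)]
```

Here the reported minimum falls in the lowest zone and the reported maximum in the highest zone, with the intermediate sample values spread over the three middle zones.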

RESULTS AND DISCUSSION
The LSZ maps produced from the three approaches through the LaSaRiZ software have been compared with each other with respect to the distribution of existing landslides in the area, i.e. on the basis of landslide density, and then through ROC curves. Landslide density has been defined as the ratio of the existing landslide area (in percent) obtained from the landslide distribution layer to the area of each landslide susceptibility zone (in percent) obtained from an LSZ map. The distribution of landslide susceptibility zones and the landslide densities for all three approaches are given in Table 7. It can be seen that in the case of the back-propagation neural network approach, 72% of the observed landslides fall in the 39% of the total area categorized as very high and high susceptibility zones. Also, a very large area of about 29% is obtained as very high susceptibility zone in the neural network approach; it does not show any defined pattern and is found to be distributed all over the map. In the case of the cosine amplitude based fuzzy relation approach, 64.65% of the observed landslides fall in the 24.9% of the area identified as very high and high susceptibility zones. However, in the case of the combined neural network and fuzzy approach, 74.5% of the observed landslides fall in the 29.0% of the area identified as very high and high susceptibility zones, which is the desired behaviour (i.e., the areas belonging to very high and high susceptibility zones have been further narrowed down). This outcome can also be corroborated from the study of landslide density values. An ideal LSZ map should have the highest landslide density for the very high susceptibility (VHS) zone as compared to other zones, and there ought to be a decreasing trend of landslide density values successively from the VHS zone to the very low susceptibility (VLS) zone. It has been ascertained that the landslide density values for the VHS zone of the LSZ maps are higher than those obtained for the other susceptibility zones. There is also a decreasing trend of landslide density values from the VHS zone to the VLS zone. Thus, based on the landslide density values of the different landslide susceptibility zones and their trend from VHS to VLS zones for all the LSZ maps, it can be inferred that the combined neural network and fuzzy approach performs significantly better than the other two approaches for LSZ mapping.
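The landslide density definition and the expected decreasing trend can be illustrated with a short sketch. The percentages below are hypothetical and are not the values of Table 7; the function names are likewise illustrative.

```python
def landslide_density(landslide_pct, zone_area_pct):
    """Ratio of landslide area share to zone area share, one value per zone."""
    return [ls / area for ls, area in zip(landslide_pct, zone_area_pct)]

def is_strictly_decreasing(values):
    """True if every value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))

# Hypothetical percentages, ordered from VHS to VLS; each list sums to 100.
landslide_pct = [40.0, 30.0, 15.0, 10.0, 5.0]
zone_area_pct = [10.0, 15.0, 25.0, 25.0, 25.0]
density = landslide_density(landslide_pct, zone_area_pct)
trend_ok = is_strictly_decreasing(density)
```

A density above 1 means a zone contains a larger share of the landslides than of the area, so a well behaved map shows values well above 1 in the VHS zone that fall steadily towards the VLS zone.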
The acceptability of the LSZ map produced from the combined neural network and fuzzy approach has been further supported through ROC curves (Swets, 1988; Mathew et al. 2009). In the present context, the true positive rate has been defined as the number of correctly classified predicted landslide pixels over the total predicted landslides and is represented on the Y-axis of the ROC curve. The false positive rate has been defined as the number of incorrectly classified landslide pixels over the total predicted no-landslide pixels and is represented on the X-axis of the ROC curve. The area under the curve (AUC) constitutes one of the most commonly used accuracy statistics for prediction models in natural hazard assessments (Begueria 2006). A minimum AUC value of 0.5 signifies that the model does not accurately predict the occurrence of landslides, while a maximum AUC value of 1 denotes perfect prediction.
To assess the performance of all three LSZ approaches, ROC curves have been generated using the SPSS 16.0 software. For this purpose, a test data set consisting of randomly selected pixels from the landslide and no-landslide pixels has been considered. ROC curves have been obtained for the LSZ maps from the ANN black box (AUC = 0.84), fuzzy relation based (AUC = 0.86) and combined neural network and fuzzy (AUC = 0.92) approaches. These curves clearly show that the combined neural network and fuzzy approach is the most successful one in predicting landslide susceptibility for the study area, since the AUC value for the LSZ map derived from this approach is higher than that obtained from the other two approaches.
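For readers who wish to reproduce an AUC value outside SPSS, the following minimal sketch computes the area under the ROC curve with the trapezoidal rule. It uses the conventional definitions of true positive rate (true positives over all actual landslide pixels) and false positive rate (false positives over all actual no-landslide pixels), and it assumes no tied scores. The labels and scores are hypothetical.

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via the trapezoidal rule over all thresholds."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")   # highest score first
    labels = labels[order]
    tps = np.cumsum(labels == 1)                 # true positives per threshold
    fps = np.cumsum(labels == 0)                 # false positives per threshold
    tpr = np.concatenate(([0.0], tps / tps[-1]))
    fpr = np.concatenate(([0.0], fps / fps[-1]))
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Hypothetical test pixels: 1 = landslide, 0 = no landslide, with LSI as score.
y_true = [1, 1, 0, 1, 0, 0]
lsi_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
auc = roc_auc(y_true, lsi_scores)
```

In this example eight of the nine landslide and no-landslide pixel pairs are ranked correctly, so the AUC is 8/9, which matches the pairwise ranking interpretation of the statistic.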

CONCLUSION
In this paper, a comparative study on landslide susceptibility zonation using three soft computing approaches, namely the ANN black box approach, the fuzzy relation based approach and the combined neural network and fuzzy approach, was presented. The efficacy of the approaches was examined through a case study in the Himalayan region. The LSZ map produced by the combined neural network and fuzzy approach showed a systematic, decreasing trend in landslide density values from the VHS to the VLS zones in the region. Thus, for the

Figure 2: Flow chart to perform LSZ using fuzzy set based cosine amplitude approach

Table 1: Data sources and specific use

Table 2: Training and testing data accuracies (I: input layer, H1: first hidden layer, H2: second hidden layer and O: output layer). Bold values indicate the highest accuracy

Table 3: Classification of LSI values obtained from ANN black box approach into landslide susceptible zones

Table 4: Fuzzy ratings for different categories of causative factors as obtained from cosine amplitude approach

Table 5: Training and testing accuracies in combined neural network and fuzzy approach

Table 6: Weights of causative factors derived through ANN in combined neuro-fuzzy approach

Table 7: Landslide distribution in various landslide susceptible zones