Abstract

Evaluation of customer satisfaction is an important area of marketing research in which products are described by attributes that can be grouped into different categories depending on their contribution to customer satisfaction. It is important to identify the category of an attribute so that a manager can prioritize it. The Kano model is a well-known method for performing this task for an individual customer. However, it requires filling in a questionnaire, which is a difficult and time-consuming exercise. Many existing methods require less effort from the customer for data collection and can be used for a group of customers; however, they are not applicable to individuals. In the present study, we develop a data-analytic method that uses the same kind of dataset but can identify the attribute category for an individual customer. The proposed method is based on a probabilistic approach that analyzes changes in customer satisfaction corresponding to variations in attribute values. We employ this information to reveal the relationship between an attribute and the level of customer satisfaction, which, in turn, allows the attribute category to be identified. We consider a synthetic dataset and the real housing dataset to test the efficiency of the proposed approach. The method correctly categorizes the attributes for both datasets. We also compare the results with an existing method to show the advantages of the proposed method. The results suggest that the proposed method can accurately capture the behavior of individual customers.

1. Introduction

Measuring customer satisfaction plays an important role in understanding customer behavior [1]. Keeping customers satisfied is one of the main goals of any company. Each product can be described by attributes, which contribute differently to the level of customer satisfaction. The relationship between an attribute and customer satisfaction is asymmetric and nonlinear [2, 3]. It is important to identify the relationship between total customer satisfaction and the satisfaction corresponding to particular attributes (denoted as attribute-level performance) so that managers can focus their limited resources on critical attributes [2, 4, 5]. Managers also want to know the behavior of individual customers for personalized marketing.

The Kano model [2] has been applied successfully to categorize attributes with regard to customer satisfaction in various domains. In some cases [6, 7], the Kano model has also been used to determine the importance of individual attributes to customer satisfaction. The Kano model divides product attributes into the following six categories: must-be, one-dimensional, attractive, indifferent, reverse, and questionable attributes. The definitions of these categories are as follows:
(1) Must-be (M) attributes: these are the attributes that customers expect to be present by default. High values of these attributes contribute little to total customer satisfaction; however, low values lead to a high extent of dissatisfaction.
(2) One-dimensional (O) attributes: low values of these attributes lead to customer dissatisfaction, whereas high values contribute to a higher level of customer satisfaction.
(3) Attractive (A) attributes: the absence of these attributes does not contribute to customer dissatisfaction. However, their presence increases the level of customer satisfaction.
(4) Indifferent (I) attributes: these attributes do not contribute notably to customer satisfaction or dissatisfaction.
(5) Reverse (R) attributes: high values of these attributes lead to customer dissatisfaction, whereas low values increase the level of customer satisfaction.
(6) Questionable (Q) attributes: the customer gives conflicting responses to these attributes; in general, this category is not considered valid.

The Kano model is one of the most important methods proposed to identify the categories of attributes [4, 5, 8–11]. The Kano model has been successfully applied to various domains, such as e-learning [12], project management [13], airline quality [14], the food and beverage industry [15], smartphones [16], tourism [17, 18], websites [19], and nursing homes [20].

The Kano model requires filling in a data collection form to identify the category of an attribute for a given customer (Table 1). The form includes two questions: the first question corresponds to the presence of the attribute, and the second question relates to its absence. Each question has five response options (like, expect, neutral, live with, and dislike). The like option corresponds to the highest level of satisfaction, whereas the dislike option represents the highest extent of dissatisfaction. The combination of the two answers defines the category of the attribute for a customer. The category of an attribute may differ across customers; for example, an attribute may be must-be for one customer, whereas the same attribute may fall into the attractive category for another customer. For a group of customers, first, the category of the attribute is defined for each customer. Then, generally, the category of the attribute for the group is derived by computing the frequency of each category [14, 21]; the most frequent response determines the Kano category. For example, for an attribute X, suppose f1 customers categorize X as M, f2 customers categorize X as A, and f3 customers categorize X as O. The largest of f1, f2, and f3 decides the category of attribute X.
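As an illustration of this frequency-based aggregation, the following minimal Python sketch (with hypothetical per-customer responses) counts the per-customer Kano categories of an attribute and returns the most frequent one:

from collections import Counter

def kano_category_by_majority(per_customer_categories):
    """Return the most frequent Kano category for one attribute.

    per_customer_categories: list of labels ('M', 'O', 'A', 'I', 'R', 'Q'),
    one per customer, obtained from the two-question Kano form.
    """
    counts = Counter(per_customer_categories)
    return counts.most_common(1)[0][0]

# Hypothetical responses of seven customers for attribute X
responses_for_x = ['M', 'M', 'A', 'O', 'M', 'A', 'M']
print(kano_category_by_majority(responses_for_x))  # -> 'M' (4 of 7 customers)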

The Kano model can be used to identify the attribute category for individual customers as well as for a group of customers; however, filling in the data collection form is rather challenging and time-consuming for a customer. Customers have to give their opinion about the present and absent conditions of an attribute, while in fact they may never have experienced these particular conditions, and therefore their viewpoint may not reflect reality precisely.

Many existing approaches [3, 15, 21–23] are based on datasets in which customers rate their level of satisfaction with each attribute on a given scale and also evaluate their total satisfaction with the product. Accordingly, several methods [3, 15, 21–23] have been proposed to employ this type of data to predict the category of attributes. Such datasets are easy to collect because customers describe their experience with the product attributes, and the values are likely to be more accurate as they are based on real experience. However, these methods can only define the categories of attributes for all customers present in a dataset and are not applicable to identifying the attribute category for individual customers. An example of this type of dataset is presented in Table 2. Four customers (N1, N2, N3, and N4) rate their level of satisfaction on a scale of 1–5 for three attributes (X, Y, and Z). They also evaluate their total satisfaction with the product on a scale of 1–5.

In the present paper, we propose a novel method that employs customer satisfaction data of the kind presented in Table 2 and that can be applied to identify the category of an attribute for an individual customer or for a set of customers.

The proposed method is based on a probabilistic approach used to identify the relationship between attribute-level performance and total customer satisfaction. Then, the rules defined by Ahmad et al. [22] are applied to specify the category of an attribute. The proposed method does not make any assumption about the underlying statistical distribution; therefore, it avoids misspecification of a model.

The Kano model suggests five categories, whereas later studies [3, 15, 22–24] focus on the three-factor theory (basic (must-be), performance (one-dimensional), and excitement (attractive)). Generally, existing methods classify an attribute into one of these three categories or into a random (indifferent) category. Following these studies, our proposed method classifies an attribute into one of these four categories.

The remainder of the paper is structured in the following way. Section 2 provides the literature survey. Section 3 describes the proposed method. The research results and discussion are presented in Section 4. The paper concludes with a discussion of future research directions.

2. Literature Survey

Previous studies have observed that the relationships between attribute-level performance and total customer satisfaction are nonlinear and asymmetric. On the basis of these relationships, attributes can be classified into different categories. Various approaches have been proposed to classify product attributes; in this section, we discuss these approaches.

The Kano questionnaire [2] is one of the most widely used approaches to classify attributes. With this method, each customer has to fill in a questionnaire. On the basis of the obtained responses, the category of an attribute is defined for each customer. Thereafter, generally, majority voting is employed to identify the category for a group of customers. Some studies [25, 26] have demonstrated that the most-frequent-response approach may not lead to precise categorization of attributes. Other studies [4, 5] have concluded that the Kano questionnaire is the most efficient approach to classify attributes into the corresponding categories; however, they have also highlighted the difficulty of collecting the data required to apply this method.

Emery and Tian [27] proposed a simple direct-classification approach. In this method, customers are first provided with information about the Kano categories and are then asked to classify the attributes into one of the categories. This method is conceptually simple; however, explaining the Kano categories to customers clearly is a challenging and time-consuming task.

Various methods have been proposed that use datasets similar to the one presented in Table 2. Some of these methods are based on regression. For example, Brandt [28] used dummy variable regression to classify attributes. In this approach, the values of the coefficients represent the impact of an attribute on total customer satisfaction, and these coefficients are then used to identify the categories of attributes [3, 23, 29, 30]. Lin et al. [23] suggested that applying dummy variable regression to classify attributes into the Kano categories may provide inaccurate results when customer responses are skewed; they proposed applying moderated regression to handle such cases. Chen [15] argued that moderated regression can result in misleading classification due to the confounding effect between attributes and total customer satisfaction. To overcome this deficiency and identify the relationship between these elements correctly, Chen [15] proposed employing ridge regression. The aforementioned methods are mainly based on linear regression functions. Finn [31] suggested applying polynomial regression to detect nonlinear effects. Lin et al. [21] later used logistic regression to capture nonlinear relationships between attribute-level performance and the level of customer satisfaction; in this method, the odds of customer satisfaction are considered to identify attribute categories.
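Returning to the dummy-variable idea mentioned above, the following Python sketch shows one common formulation (often called penalty-reward contrast analysis); the thresholds, function names, and use of scikit-learn are illustrative assumptions rather than the exact setup of the cited studies:

import numpy as np
from sklearn.linear_model import LinearRegression

def penalty_reward_coefficients(attribute_ratings, overall_satisfaction,
                                low_threshold=2, high_threshold=4):
    """Regress overall satisfaction on low/high dummies for one attribute.

    attribute_ratings, overall_satisfaction: 1-D arrays of ratings (e.g., 1-5).
    Returns (penalty, reward): the coefficients of the low- and high-performance
    dummies. A large |penalty| with a small reward suggests a basic attribute,
    a large reward with a small |penalty| suggests an excitement attribute, and
    both being large suggests a performance attribute.
    """
    ratings = np.asarray(attribute_ratings)
    low_dummy = (ratings <= low_threshold).astype(float)    # penalty indicator
    high_dummy = (ratings >= high_threshold).astype(float)  # reward indicator
    X = np.column_stack([low_dummy, high_dummy])
    model = LinearRegression().fit(X, np.asarray(overall_satisfaction))
    penalty, reward = model.coef_
    return penalty, reward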

Vavra [32] proposed jointly considering the explicit importance (based on direct ratings or customer statements) and the implicit importance (derived through regression analysis) of an attribute to identify its category. However, several studies [3, 29, 30] have demonstrated that regression analysis alone performs better than this approach.

Data mining techniques have also been applied to categorize attributes. Robnik-Šikonja and Vanhoof [24] employed the RELIEF [33] attribute selection technique to estimate the effect (positive or negative) of each attribute-level performance value on the total level of customer satisfaction. According to how these effects change with a change in the attribute-level performance values, the category of the attribute can be identified. This method is computationally expensive as it considers the k-nearest neighbors of each member of a set of training points. Ahmad et al. [22] proposed a rule-based method to identify the category of attributes. First, the support set and significance of each attribute-level performance value are obtained by using the mutual associations between the attribute-level performance values and the customer satisfaction level. These quantities are used as an input to the proposed rules, which, in turn, are employed to define the category of the attribute. Ahmad et al. [22] presented their results on the housing data to demonstrate the effectiveness of their approach. Deng et al. [34] used neural networks to identify the relationships between attributes and customer satisfaction. Füller and Matzler [18] argued that attributes play different roles for different customer segments. They used k-means clustering [35] to create clusters and then applied the three-factor theory, using dummy variable regression analysis within the obtained clusters. The results on different clusters indicated clear differences between the customer groups.

The literature survey suggests that the methods based on datasets similar to the one presented in Table 2 cannot be applied to predict attribute categories for individual customers, whereas methods such as the Kano questionnaire and the Emery and Tian [27] approach are difficult and time-consuming from the viewpoint of a customer. Therefore, there is a need for a method that can use data of the kind presented in Table 2 and, at the same time, predict the category of an attribute for an individual customer. As data mining techniques can handle large amounts of data efficiently, such a method should employ data mining techniques so that it can scale to large data.

In Section 3, we describe the proposed method, which employs datasets similar to the one presented in Table 2 and can predict the category of an attribute for an individual customer as well as for a group of customers. The method employs data mining techniques; therefore, it can handle large data.

3. The Proposed Method

The motivation of the proposed approach is that, as the attribute-level performance value of an attribute changes for a customer while all other attributes are held constant, the customer satisfaction will change, and the relationship between these two variables is indicative of the category of this attribute. In this section, we first discuss the method proposed by Robnik-Šikonja and Kononenko [36] for identifying the importance of each attribute value with regard to classification. Then, we describe how this approach can be combined with the rules suggested by Ahmad et al. [22] to classify an attribute into the categories of the three-factor theory.

Robnik-Šikonja and Kononenko [36] proposed a method to explain the class prediction of individual data points. In this method, the contribution of each attribute value to the class of a data point is evaluated. To estimate the contribution of an attribute (A) to the prediction for a data point (N), the following steps are performed:
(i) Train a classifier on the complete dataset
(ii) Predict the class probabilities of the data point (N)
(iii) Predict the class probabilities of the data point (N) without the attribute (A)

If the differences between the class probabilities obtained in steps (ii) and (iii) are large, attribute A plays an important role in the prediction. Because most classifiers cannot predict a class without all attributes being present, the authors replaced the actual value of attribute A for the data point N with every possible value of A in turn and took the average of the resulting predictions, weighting each prediction by the prior probability of the corresponding value. These averaged class probabilities are treated as the probabilities of the data point without attribute A. Motivated by this approach, we observe that the importance of an attribute value with regard to prediction can be evaluated for a data point; therefore, we compute the importance of each attribute value individually. Next, we discuss the rules proposed by Ahmad et al. [22] to classify an attribute into different categories.
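Before turning to those rules, the following minimal Python sketch illustrates the marginalization step described above under simple assumptions (a classifier exposing a scikit-learn-style predict_proba method and integer-coded attribute values); it is an illustrative approximation of the cited technique, not the authors' original implementation:

import numpy as np

def class_probs_without_attribute(model, row, attr_index, attr_values, priors):
    """Approximate p(class | row without attribute attr_index).

    model: trained classifier exposing predict_proba (e.g., a random forest).
    row: 1-D array with the customer's attribute values.
    attr_index: position of the attribute A to be "removed".
    attr_values: all possible values of A (e.g., [1, 2, 3, 4, 5]).
    priors: prior probability of each value of A in the training data.
    """
    averaged = None
    for value, prior in zip(attr_values, priors):
        modified = np.array(row, dtype=float)
        modified[attr_index] = value                       # substitute one value of A
        probs = model.predict_proba(modified.reshape(1, -1))[0]
        averaged = prior * probs if averaged is None else averaged + prior * probs
    return averaged  # weighted average over all possible values of A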

Ahmad et al. [22] proposed a probabilistic method to identify the type of an attribute based on given customer satisfaction data. In this method, they used the concepts of the support set and the discriminating power of an attribute value [37] to predict the attribute category. A support set is defined as the subset of classes that has the strongest relationship with the attribute value. The discriminating power of an attribute value represents the extent to which the attribute value is related to its support set. In addition, they proposed rules to identify the category of an attribute. To ensure completeness, we first discuss the algorithm [37] used to compute the support set and the discriminating power of an attribute value. Ahmad and Dey [37] proposed that if an attribute value x_ir (the rth value of the ith attribute A_i) is significant, both P(ω | x_ir) and P(¬ω | ¬x_ir) will be large, where ω is a proper subset of the m classes and ¬x_ir denotes the other values of A_i. This behavior implies that the data points with value x_ir for the ith attribute A_i, as well as the data points with the values denoted by ¬x_ir, would be categorized into complementary subsets of classes. There can be 2^m − 1 proper subsets. The quantity P(ω | x_ir) + P(¬ω | ¬x_ir) − 1 is defined as the discriminating power of the attribute value x_ir. The subset ω that maximizes the quantity P(ω | x_ir) + P(¬ω | ¬x_ir), denoted by ω(x_ir), is called the support set of x_ir. Ahmad and Dey [37] presented an algorithm to identify the support set and the discriminating power of an attribute value in linear time with respect to the number of data points. The algorithm is presented as Algorithm 1.

Input: dataset having m classes.
Output: the support set ω(x_ir) and the discriminating power d(x_ir) of attribute value x_ir.
Begin
 d(x_ir) = 0; /∗discriminating power initialized to 0∗/
 ω(x_ir) = φ; /∗support set initialized to NULL∗/
 for t = 1 to m do /∗t is a class, m is the number of classes∗/
 {
  if P(t | x_ir) > P(t | ¬x_ir) /∗t occurs more frequently with x_ir compared with ¬x_ir∗/
  {(1) Add t to ω(x_ir); /∗t is added to the support set∗/
   (2) d(x_ir) = d(x_ir) + P(t | x_ir); }
  else
  { d(x_ir) = d(x_ir) + P(t | ¬x_ir); }
 }
end for
d(x_ir) = d(x_ir) − 1;
End
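As a concrete illustration of Algorithm 1, the following Python sketch computes the same two quantities from a set of conditional class probabilities; the dictionary-based interface is an assumption made for readability rather than part of the original algorithm:

def support_set_and_discriminating_power(p_class_given_value, p_class_given_rest):
    """Compute the support set and discriminating power of one attribute value.

    p_class_given_value[c]: P(class c | attribute value x_ir)
    p_class_given_rest[c]:  P(class c | any other value of the same attribute)
    Both are dictionaries keyed by the m class labels.
    """
    support_set = set()
    power = 0.0
    for c in p_class_given_value:
        if p_class_given_value[c] > p_class_given_rest[c]:
            support_set.add(c)                    # class c supports this value
            power += p_class_given_value[c]
        else:
            power += p_class_given_rest[c]
    return support_set, power - 1.0               # discriminating power

# Hypothetical probabilities for satisfaction classes 1-5
p_given_value = {1: 0.05, 2: 0.05, 3: 0.10, 4: 0.35, 5: 0.45}
p_given_rest = {1: 0.25, 2: 0.25, 3: 0.20, 4: 0.15, 5: 0.15}
print(support_set_and_discriminating_power(p_given_value, p_given_rest))
# -> ({4, 5}, 0.5): the value is strongly associated with high satisfaction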

Based on the concept of the support sets and discriminating powers of attribute values, Ahmad et al. [22] proposed the following rules to identify the category of an attribute:
(a) Basic attributes. Rule: there are two types of support sets for the attribute values; one of them contains only customer dissatisfaction values, whereas the other contains only customer satisfaction values.
(b) Performance attributes. Rule: there are different support sets for different attribute values; these support sets change from strong dissatisfaction values to strong satisfaction values as the attribute values change.
(c) Excitement attributes. Rule: most of the attribute values have similar support sets with low discriminating powers, containing both dissatisfaction and satisfaction values; the remaining attribute values have support sets with large customer satisfaction values.
(d) Random attributes. Rule: all attribute values have similar support sets containing both customer satisfaction and dissatisfaction values, and the discriminating powers of all attribute values are very low.
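The rules above are qualitative; the following Python sketch shows one possible heuristic encoding of them, where the numeric threshold and the assumption that classes such as 1–2 denote dissatisfaction and 4–5 denote satisfaction are our own illustrative choices, not part of the cited rules:

def categorize_attribute(support_sets, powers, sat_classes, dissat_classes,
                         low_power=0.05):
    """Heuristically map support sets and discriminating powers to a category.

    support_sets: one set of class labels per attribute value (ordered low to high).
    powers: discriminating power per attribute value.
    sat_classes / dissat_classes: class labels regarded as satisfaction /
    dissatisfaction (e.g., {4, 5} and {1, 2} on a 1-5 satisfaction scale).
    """
    # Rule (d): all values have weak support sets -> random attribute.
    if all(p < low_power for p in powers):
        return "random"

    pure_dissat = [bool(s) and s.issubset(dissat_classes) for s in support_sets]
    pure_sat = [bool(s) and s.issubset(sat_classes) for s in support_sets]

    # Rule (a): only two kinds of support sets, pure dissatisfaction vs pure satisfaction.
    if len({frozenset(s) for s in support_sets}) == 2 and all(
            d or s for d, s in zip(pure_dissat, pure_sat)):
        return "basic"

    # Rule (c): most values are weak/mixed, while the highest value points to satisfaction.
    if sum(p < low_power for p in powers) >= len(powers) - 2 and pure_sat[-1]:
        return "excitement"

    # Rule (b): otherwise, support sets shift from dissatisfaction to satisfaction.
    return "performance"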

In the proposed method, we compute the class probabilities for each attribute value for an individual customer and then for the complete dataset. These probabilities are used to identify the type of an attribute by applying Ahmad et al.'s rules [22]. First, we present the method for computing the categories of attributes for individual customers.

(i) For an individual customer: the method starts with training a selected classifier on the given customer satisfaction dataset (an example of which is presented in Table 2); the attribute values of the data are integers. The data row of the selected customer, for whom the category of an attribute is to be determined, is used as an input to the trained classifier. The values of this attribute are varied, keeping all other attribute values fixed, to create various data points that are then fed to the trained classifier, and the resulting class probabilities are stored. For example, if the ith attribute (A_i) has s attribute values, s data points are created by changing the attribute value in the customer's row; feeding these s data points to the trained classifier yields s sets of class probabilities. We relate each set of probabilities to the corresponding attribute value. Strictly speaking, such a probability is related to the whole set of attribute values and not to a single attribute value; however, to compute the support set and the discriminating power of an attribute value, only the differences between class probabilities are used, and these differences are due to the different values of the given attribute. Therefore, we treat these class probabilities as conditioned on the given attribute value. The obtained s sets of class probabilities are then employed to calculate the support sets and discriminating powers of the attribute values. Let C be a set of m classes with C_j as the jth class. To compute the support set of the rth value x_ir of the ith attribute, we require P(C_j | x_ir) and P(C_j | ¬x_ir). The classifier provides the values P(C_j | x_ir) (the class probabilities for a class given an attribute value) for all attribute values, which are then used to compute P(C_j | ¬x_ir), the average of the class probabilities over all other attribute values, defined as follows:

P(C_j | ¬x_ir) = (1/(s − 1)) Σ_{k=1, k≠r}^{s} P(C_j | x_ik).     (1)

These probabilities are then used to compute the support sets and discriminating powers of all values of the given attribute by using the algorithm proposed by Ahmad and Dey [37]. These quantities are used to determine the category of the attribute by applying the rules proposed by Ahmad et al. [22]. The process is presented in Algorithm 2. We illustrate the process using the example dataset given in Table 2. To determine the type of attribute Z for customer N2, five different input rows are created by changing the value of attribute Z from 1 to 5: (2, 3, 1), (2, 3, 2), (2, 3, 3), (2, 3, 4), and (2, 3, 5). These rows are then input one by one into a classifier trained on the complete data, and class probabilities are computed for each input row. The support sets and discriminating powers of the different values of attribute Z (1–5) are then derived from these class probabilities. Finally, the obtained support sets and discriminating powers are used to determine the category of attribute Z for customer N2.

Next, we present the process for a set of customers.

(ii) For a set of customers: to compute the type of an attribute for a set of customers, first, the set of class probabilities for each attribute value is obtained for each customer, as discussed above. Then, the average of these probabilities for each attribute value is computed over all the customers. These averaged probabilities are employed to define the support sets and discriminating powers of the different attribute values, as discussed above. These values are then used to identify the category of the attribute for all customers by using the rules proposed by Ahmad et al. [22]. The steps are presented in Algorithm 3.

Input: customer satisfaction dataset D; a classifier algorithm (CA); attribute A_i; data row of customer N.
Output: the type of attribute A_i for customer N.
Begin
(1) Train the classifier CA on dataset D.
(2) for r = 1 to s do (the attribute can take s values)
 (a) Take the data row of customer N and replace the ith attribute value with the rth value x_ir of attribute A_i. All other attribute values remain as given in the dataset.
 (b) Input the newly generated row into the classifier and obtain the set of class probabilities
  (P(C_j | x_ir): probability of class C_j (j = 1 to m) given that attribute A_i takes the value x_ir).
end for
(3) for j = 1 to m do (the class can take m values)
 for r = 1 to s do (the attribute can take s values)
  Compute the probabilities P(C_j | ¬x_ir) by taking the average of the probabilities of the other attribute values:
  P(C_j | ¬x_ir) = (1/(s − 1)) Σ_{k=1, k≠r}^{s} P(C_j | x_ik)
 end for
end for
(4) Use the sets of class probabilities to compute the support set and discriminating power of each attribute value by using the method described in Algorithm 1.
(5) Use these support sets and discriminating powers to compute the type of the attribute by using the rules [22].
End
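A compact Python sketch of Algorithm 2 is given below; it assumes a trained scikit-learn-style classifier whose predict_proba output follows the order of class_labels, reuses the two helper functions sketched earlier (support_set_and_discriminating_power and categorize_attribute), and is intended as an illustration of the flow rather than a reference implementation:

import numpy as np

def attribute_type_for_customer(model, customer_row, attr_index, attr_values,
                                class_labels, sat_classes, dissat_classes):
    """Predict the three-factor category of one attribute for one customer."""
    # Step 2: class probabilities for every possible value of the attribute.
    probs_per_value = []
    for value in attr_values:
        row = np.array(customer_row, dtype=float)
        row[attr_index] = value
        probs_per_value.append(model.predict_proba(row.reshape(1, -1))[0])
    probs_per_value = np.array(probs_per_value)          # shape: (s, m)

    support_sets, powers = [], []
    for r in range(len(attr_values)):
        # Step 3: average the probabilities of all other attribute values.
        others = np.delete(probs_per_value, r, axis=0).mean(axis=0)
        p_value = dict(zip(class_labels, probs_per_value[r]))
        p_rest = dict(zip(class_labels, others))
        # Steps 4-5: support set, discriminating power, and category rules.
        s_set, power = support_set_and_discriminating_power(p_value, p_rest)
        support_sets.append(s_set)
        powers.append(power)
    return categorize_attribute(support_sets, powers, sat_classes, dissat_classes)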

4. Results and Discussion

In this section, we present the results of experiments conducted on two datasets: a synthetic dataset and the real housing dataset. We follow the steps presented in Figure 1. We select random forests [38] as the classifier. Random forests consist of many decision trees and can perform accurate classification with default parameters; in other words, their performance is robust with respect to parameter selection. They can also produce class probabilities for a given data point. The Weka [39] implementation of random forests is used to conduct the experiments. The number of decision trees is set to 100, and the other parameters are left at their default values. We use the model created by random forests to compute the probabilities (Algorithms 2 and 3). These probabilities are then used to determine the categories of attributes for individual customers and for a group of customers (Algorithm 1 and the rules in Section 3).

Input: customer satisfaction dataset D; a classifier algorithm (CA); attribute A_i.
Output: the type of attribute A_i for the group of customers present in dataset D.
Begin
(1) Train the classifier CA on dataset D.
(2) for k = 1 to s do (the attribute A_i can take s values)
 for j = 1 to n do (there are n customers)
  (i) Create the data row of the jth customer with the kth value of attribute A_i. All other attribute values remain as given in the dataset.
  (ii) Input the newly generated row into the classifier and obtain the set of class probabilities.
 end for
 Compute the average of the class probabilities over all customers for the kth attribute value. This average is treated as P(C_j | x_ik) for the group of customers.
end for
(3) Use these probabilities to compute the values P(C_j | ¬x_ik) as described in Algorithm 2.
(4) Use these class probabilities to compute the support set and the discriminating power of each attribute value by using the method given in Algorithm 1.
(5) Use these support sets and discriminating powers to compute the type of the attribute by using the rules [22].
End
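For completeness, a minimal Python sketch of the group-level procedure is shown below; it uses scikit-learn's RandomForestClassifier as a stand-in for the Weka implementation used in the experiments (100 trees, otherwise default parameters) and reuses the helper functions sketched in Section 3:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def attribute_type_for_group(X, y, attr_index, attr_values,
                             sat_classes, dissat_classes):
    """Predict the three-factor category of one attribute for all customers in X."""
    model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    class_labels = list(model.classes_)

    # Average the class probabilities over all customers for each attribute value.
    avg_probs = []
    for value in attr_values:
        X_mod = np.array(X, dtype=float)
        X_mod[:, attr_index] = value                  # set the kth value for everyone
        avg_probs.append(model.predict_proba(X_mod).mean(axis=0))
    avg_probs = np.array(avg_probs)                    # shape: (s, m)

    support_sets, powers = [], []
    for r in range(len(attr_values)):
        others = np.delete(avg_probs, r, axis=0).mean(axis=0)
        s_set, power = support_set_and_discriminating_power(
            dict(zip(class_labels, avg_probs[r])), dict(zip(class_labels, others)))
        support_sets.append(s_set)
        powers.append(power)
    return categorize_attribute(support_sets, powers, sat_classes, dissat_classes)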
4.1. Synthetic Dataset Preparation

Robnik-Šikonja and Vanhoof [24] suggested a method to generate a synthetic dataset with the properties of a customer satisfaction dataset. The synthetic dataset has four attributes: basic (B), performance (P), excitement (E), and random (R). Each attribute contributes to overall customer satisfaction according to its type. It is assumed that each attribute can take values from 1 to 5, representing the attribute-level performance of that attribute.

For the basic attribute B, the total customer satisfaction C (B) is represented according to equation (2) as follows:

With regard to the performance attribute P, the total customer satisfaction C (P) is estimated according to the following equation:

For the excitement attribute E, the total customer satisfaction C (E) is defined by

With regard to the random attribute R, the total customer satisfaction C (R) is represented by

Total customer satisfaction is obtained as the sum of all customer satisfaction values generated by different attributes, as represented by

As a result, 625 data points were generated from all combinations of values (1 to 5) of the four attributes; the contribution of each attribute to overall customer satisfaction was computed (equations (2)–(5)), and the total customer satisfaction value C was then computed for each point (equation (6)). Within the dataset, C varied from −2 to 6.
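The following Python sketch illustrates how such a dataset can be generated; the particular contribution functions used here are illustrative guesses chosen only to reproduce the qualitative behavior of the four attribute types and the stated range of C (−2 to 6), not the exact equations (2)–(5):

import itertools

def c_basic(b):        # dissatisfies when low, adds little when high (assumed form)
    return -2 if b <= 2 else 0

def c_performance(p):  # roughly linear contribution (assumed form)
    return p - 1       # 0 .. 4

def c_excitement(e):   # rewards only the highest level (assumed form)
    return 2 if e == 5 else 0

def c_random(r):       # no systematic contribution (assumed form)
    return 0

rows = []
for b, p, e, r in itertools.product(range(1, 6), repeat=4):   # 5^4 = 625 points
    total = c_basic(b) + c_performance(p) + c_excitement(e) + c_random(r)
    rows.append((b, p, e, r, total))                           # total lies in [-2, 6]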

Random forests are used to compute the class probabilities for each attribute value. Algorithm 1 is then applied to obtain the support sets and discriminating powers of the attribute values of each attribute from the estimated class probabilities. Finally, these values are used to identify the categories of the attributes.

The support sets of attribute B are presented in Table 3. For the lower attribute values (<4), the support sets contain small values of customer satisfaction, whereas for the larger attribute values, the support sets contain large values of customer satisfaction. Applying the rules given in Section 3, this behavior of the support sets suggests that the attribute is a basic one. The support sets of the values of attribute P are also provided in Table 3; the values in the support sets increase as the values of attribute P increase, which, according to the rules presented in Section 3, indicates the behavior of a performance attribute. With regard to attribute E, the support sets of all its attribute values are presented in Table 3. The support sets of the low values of E contain small and medium values of customer satisfaction, and the discriminating powers of these values are small (approximately 0.04), whereas for the attribute value of five, the support set consists of large customer satisfaction values and has a discriminating power of 0.2. This behavior is in line with the rule for the excitement attribute; therefore, attribute E is concluded to be an excitement attribute. The support sets of attribute R are provided in Table 3. These support sets contain different types of customer satisfaction values and do not show any pattern with an increase in the attribute values, and all attribute values have very small discriminating powers (<0.01). According to the rules presented in Section 3, this indicates the behavior of a random attribute; therefore, attribute R is considered to be a random attribute. The obtained results confirm that the proposed method correctly predicts the category of each attribute.

4.2. Boston Housing Dataset

There are no publicly available benchmark datasets in this area. Researchers generally use their own datasets to study their methods, and these datasets are not publicly available because of confidentiality issues. The Boston housing dataset is publicly available and has been used to test a similar method [22]. Ahmad et al. [22] explained their results on this dataset using domain knowledge; therefore, the results on this dataset are easy to analyze.

Here, we discuss the results of testing the proposed method on the Boston housing dataset. The dataset is obtained from the UCI machine learning repository [40]. It consists of one binary attribute and 13 continuous attributes along with the prices of houses; Table 4 presents information about these attributes. The price of a house is the target variable. Considering the fact that housing prices are positively correlated with customer satisfaction [41], the price of a house is treated as a proxy for customer satisfaction. In these experiments, it is thus assumed that housing prices mirror customer satisfaction, which may not be entirely true because, in many cases, customer satisfaction also depends on the price. Equal-frequency discretization is applied to convert all continuous attributes and the house prices into integer-valued attributes (attribute values 1 to 5).
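As an illustration of this preprocessing step, the following Python sketch applies equal-frequency discretization with pandas; the column names and the use of pandas.qcut are assumptions about one convenient way to implement the step, not a description of the original experimental code:

import pandas as pd

def discretize_equal_frequency(df, columns, n_bins=5):
    """Map each listed continuous column to integer levels starting at 1 so that
    each level contains approximately the same number of rows."""
    out = df.copy()
    for col in columns:
        # labels=False returns 0-based bin codes; duplicates="drop" guards against ties.
        out[col] = pd.qcut(df[col], q=n_bins, labels=False, duplicates="drop") + 1
    return out

# Hypothetical usage with assumed file and column names:
# housing = pd.read_csv("boston_housing.csv")
# housing = discretize_equal_frequency(housing, columns=["CRIM", "AGE", "MEDV"])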

As one of the attributes is binary, we obtained the attribute type for the other 12 attributes. We computed the support sets of all attribute values by using the method presented in Section 3. Table 5 presents the support sets of all attribute values of the different attributes. On the basis of the obtained support set and discriminating power of each attribute value, the type of each attribute was determined, as presented in Table 6. Two attributes were classified as basic, six as performance, and three as excitement attributes. The remaining attribute was concluded to be a random one (the discriminating powers of all its attribute values were <0.005).

In addition, we compare the results with those obtained using the method proposed by Ahmad et al. [22]. Table 7 presents the attributes for which the two methods provide the same categories. Out of the 12 attributes, the two methods agree on 10 and differ on two attributes: 7 and 12 (Table 8). According to the method of Ahmad et al. [22], attribute 7 is a basic attribute, whereas the method proposed in the present paper suggests that it is a performance attribute. Attribute 7 represents the proportion of owner-occupied units built prior to 1940; it is likely that with an increase in the number of old buildings, the prices of houses will decrease, which indicates a performance attribute. Attribute 12 is an excitement attribute according to the method of Ahmad et al. [22]; however, the proposed method concludes that it is a performance attribute. This attribute is related to the proportion of blacks by town. An attribute defined as a proportion is more likely to have a gradually increasing negative or positive effect as the proportion changes, which is the behavior of a performance attribute; therefore, the result obtained in the present paper is more likely to be correct. It should be noted that several attributes demonstrate different properties in different ranges, and therefore different methods may capture these effects differently, which may result in different categories being assigned by the methods.

An important aspect of the proposed method is that it can also predict the type of attributes for an individual customer. To confirm this, we selected three customers at random: one highly unsatisfied customer (satisfaction 1), one highly satisfied customer (satisfaction 5), and one averagely satisfied customer (satisfaction 3). As each customer is represented by a row, the row data are used to predict the types of attributes for that customer. For the highly unsatisfied customer, the results are presented in Tables 9 and 10. We observe that six attributes are basic, four attributes are performance attributes, no attribute is classified as an excitement one, and two attributes are random. A customer who has rather high expectations of a product (a large number of basic attributes) is more likely to be unsatisfied: it is difficult for the product attributes to meet such high expectations, which may lead to customer dissatisfaction. Moreover, a customer who does not consider many attributes as excitement ones is unlikely to be highly satisfied. The obtained results show that this customer has six basic attributes and zero excitement attributes, which matches the expected behavior of a highly unsatisfied customer. Therefore, it can be concluded that the proposed method classifies the attributes correctly for a highly unsatisfied customer.

For the highly satisfied customer, the results are presented in Tables 11 and 12. We observe that seven attributes are excitement ones, none of the attributes is classified as basic, two attributes are classified as performance ones, and three attributes are classified as random. A customer who does not have excessively high expectations of the product (a small number of basic attributes) is likely to be more satisfied, and a customer who considers more attributes as excitement ones is likely to be more satisfied because excitement attributes contribute only to customer satisfaction. A similar behavior is observed in the obtained prediction (zero basic attributes and a large number of excitement ones); therefore, we can conclude that the proposed method classifies the attributes correctly.

For the averagely satisfied customer, the results are presented in Tables 13 and 14. A customer who forms an opinion mainly by observing the performance of the product is likely to be averagely satisfied; such a customer does not expect much from the product (a low number of basic attributes) but puts considerable emphasis on its performance, and excitement attributes also add to his/her satisfaction. A similar behavior is observed in the results obtained using the proposed method: six attributes are classified as performance ones, zero attributes are classified as basic, and three attributes are classified as excitement attributes. The large number of performance attributes indicates that this customer's satisfaction depends mainly on the performance of the product.

Therefore, we conclude that the results of the conducted experiments indicate that the proposed method is capable of predicting the categories of different attributes for a group of customers correctly and, moreover, that it can be used to predict attribute categories for individual customers.

5. Conclusions

The three-factor theory is an important tool to evaluate customer satisfaction. Two main approaches have been developed to identify the categories of product attributes on the basis of their contribution to the level of customer satisfaction. The first approach involves a difficult data collection task, whereas the second approach cannot be used to identify attribute categories for individual customers. In the present paper, we propose a novel method that can be applied to datasets collected in the manner of the second approach but that can predict attribute categories for both individual customers and groups of customers. The experiments conducted on the synthetic customer satisfaction dataset showed that the proposed method identified the structure of the dataset correctly. The Boston housing dataset was also used in the experiments, and the obtained results were compared with the results presented by Ahmad et al. [22]. Generally, the results were similar; analyzing the observed discrepancies, we argued that the classifications provided by the proposed method are more likely to be correct. The results for different types of individual customers were also presented; they demonstrated the capability of the proposed method to identify the categories of attributes for individual customers as well as for a group of customers. The proposed method uses a dataset based on the experience of customers with the given attributes; therefore, it cannot be used to identify the categories of new attributes that are not present in the dataset.

As discussed in Section 4.2, the Boston housing dataset is not an ideal choice for the experiments (it does not contain a customer satisfaction attribute). However, we used it because it is publicly available and has been used in similar experiments [22]. Data collected specifically for a comparative study (a Kano questionnaire together with a dataset such as the one in Table 2) would be a better choice for testing similar methods, as the Kano categories obtained by those methods could then be compared directly with the categories obtained by the Kano questionnaire. Researchers in this area should develop publicly available benchmark datasets so that different methods can be compared on the same data.

In the future, we plan to test the proposed method on additional datasets. Obtaining a dataset for testing is a challenging task, as the datasets presented in different research papers are not publicly available. We will also investigate how a combination of attributes may contribute to the total level of customer satisfaction. Furthermore, the proposed method does not quantify the strength of the association of an attribute with customer satisfaction; we will propose modifications to the method so that the strengths of the categories can also be computed.

Data Availability

The data used to support the findings of this study are publicly available at http://archive.ics.uci.edu/ml.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This project was funded by the Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, under grant no. DF-767-830-1441. The authors, therefore, gratefully acknowledge DSR for the technical and financial support.