USIMSCAR: a retrieval strategy for case-based reasoning using a combination of similarity and association knowledge
Thesis posted on 2017-01-31, 04:18, authored by Kang, Yong-Bin
Case-Based Reasoning (CBR) is a widely researched technology for developing knowledge-based systems in a range of real-world application domains such as medical diagnosis, help-desk service, product recommendation, classification, and configuration or planning. The fundamental premise of CBR is that experience in the form of past cases can be leveraged to solve new problems. This paradigm arises from the fact that in many application domains, similar problems usually have similar solutions. Retrieval is often considered an important phase in CBR, since it lays the foundation for the overall effectiveness of CBR systems. The aim of retrieval in CBR is to retrieve useful cases that can be successfully used to solve the target problem. If the retrieved cases are not useful or relevant, CBR systems will ultimately not produce good solutions to the problem. Therefore, the success of any CBR system relies strongly on the performance of retrieval. The retrieval strategy in CBR systems typically relies on exploiting similarity knowledge and is referred to as similarity-based retrieval (SBR). Similarity measures are used in SBR to approximate the usefulness of cases with respect to the target problem. However, a limitation of SBR is that it tends to rely primarily on similarity knowledge, ignoring other forms of knowledge that can be further leveraged to improve retrieval performance. While many kinds of learnt and induced knowledge have been used to enhance traditional SBR, this thesis demonstrates that association analysis of stored cases can also improve it. In this thesis, we propose and develop a novel retrieval strategy, USIMSCAR, that combines association knowledge with similarity knowledge. The aim of association knowledge is to represent a set of strongly evident, interesting relationships between known problem features and known solutions which are shared by a large number of cases.
We formulate and represent association knowledge for CBR systems using a special type of association rule that we term soft-matching class association rules (scars). Through extensive experiments, we demonstrate the improvement of USIMSCAR over SBR using both benchmark and real datasets in three CBR application domains: medical diagnosis, help-desk service support, and product recommendation. Our experimental evaluation validates that USIMSCAR is an effective retrieval strategy for CBR that improves on SBR. We also evaluate the statistical significance of the accuracy and effectiveness obtained by USIMSCAR when compared with traditional SBR. The contributions of this thesis are as follows: (1) we propose and develop an approach for formalizing association knowledge for CBR systems, acquired from association analysis of stored cases using association rule mining; (2) we propose and develop innovative strategies for measuring the usefulness of cases as well as directly leveraging interesting rules encoding association knowledge, with respect to the target problem; and (3) we propose and develop a novel retrieval strategy that substantially improves SBR by using both similarity and association knowledge. In summary, this thesis addresses the limitation of SBR and proposes and develops an effective retrieval strategy that uses association analysis techniques to enhance traditional SBR. The research done over the course of this thesis has been published in five conference papers, with one journal paper submitted for review.
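To make the baseline concrete, the following is a minimal sketch of plain similarity-based retrieval (SBR) as characterised in the abstract: a similarity measure approximates how useful each stored case is for the target problem, and the most similar cases are retrieved. It is not USIMSCAR and does not use association knowledge. The case representation, the feature names, the weights, the value ranges, the toy case base, and the majority-vote step are all assumptions made for illustration and are not taken from the thesis.

```python
from collections import Counter

# Hypothetical toy case base: each case pairs a problem description
# (feature -> value) with a known solution. Illustrative data only.
CASE_BASE = [
    {"problem": {"temp": 39.0, "cough": "dry"}, "solution": "flu"},
    {"problem": {"temp": 38.5, "cough": "dry"}, "solution": "flu"},
    {"problem": {"temp": 36.8, "cough": "none"}, "solution": "healthy"},
    {"problem": {"temp": 37.2, "cough": "wet"}, "solution": "cold"},
]
WEIGHTS = {"temp": 1.0, "cough": 1.0}  # assumed feature weights
RANGES = {"temp": 4.0}                 # assumed value range for numeric features


def local_similarity(a, b, value_range=None):
    """Similarity of two feature values in [0, 1]."""
    numeric = isinstance(a, (int, float)) and isinstance(b, (int, float))
    if numeric and value_range:
        # Numeric features: linearly decreasing with the distance.
        return max(0.0, 1.0 - abs(a - b) / value_range)
    # Categorical features (or no range given): exact match only.
    return 1.0 if a == b else 0.0


def global_similarity(query, problem, weights, ranges):
    """Weighted average of the local similarities over all weighted features."""
    total = sum(weights.values())
    score = sum(
        w * local_similarity(query[f], problem[f], ranges.get(f))
        for f, w in weights.items()
    )
    return score / total


def retrieve(query, case_base, weights, ranges, k=3):
    """Return the k most similar cases as (similarity, case) pairs."""
    scored = [
        (global_similarity(query, c["problem"], weights, ranges), c)
        for c in case_base
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def propose_solution(retrieved):
    """Majority vote over the solutions of the retrieved cases."""
    votes = Counter(case["solution"] for _, case in retrieved)
    return votes.most_common(1)[0][0]


QUERY = {"temp": 38.8, "cough": "dry"}
TOP = retrieve(QUERY, CASE_BASE, WEIGHTS, RANGES, k=3)
PROPOSED = propose_solution(TOP)  # "flu" for this toy data
```

In this sketch the ranking depends only on feature-level similarity between the query and each stored case. This is the kind of reliance on similarity knowledge alone that the thesis identifies as a limitation and that USIMSCAR addresses by additionally using association knowledge.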