Hidden Markov Models Based Credit Card Fraud Detection.
1. INTRODUCTION
Online activity has become familiar to almost everyone with the rapid growth of e-commerce. It mainly involves the regular purchase of goods, electronic devices and similar items. The online transactions made for these purchases rely on secure payment methods that authorize the transfer of funds. These transactions are supported by various bank cards, which make the operation easy. A large share of the population uses credit cards because they are so convenient. Banks have consequently accumulated a vast number of credit card transactions.
Despite these advantages, credit cards have security pitfalls. Their illicit use is a major concern. Credit card fraud is committed for various reasons, mainly to obtain unauthorized funds from an account. It is therefore the bank's responsibility to safeguard the card holder's money when it is transferred online. To reduce financial loss, a bank can adopt existing fraud detection methodologies such as case-based reasoning, decision trees and neural networks.
Among the various detection techniques, our approach focuses on the Hidden Markov Model (HMM), which detects a fraudulent transaction and at the same time reports the timestamp and IP address of the intruder's machine. As preprocessing, the HMM maintains a record of the card holder's transactions in order to learn the holder's normal behavior. Every new transaction is recorded in the system. The HMM then automatically generates the spending profile of the user. If an intruder tries to make a transaction with a registered credit card, the spending pattern will differ from that of the authentic user and can be detected. The system aims to ensure that no genuine transaction is rejected. It also records the timestamp and IP address of the attacking machine so that the geographic location of the intruder can be traced.
2. LITERATURE SURVEY
Credit card fraud detection is a topical issue of major concern. In response, various detection techniques such as genetic algorithms, data mining, neural networks, clustering techniques and decision trees have been used.
Ghosh and Reilly implemented a neural network system that handled cases involving lost cards, stolen cards, stolen card details, application fraud and so on. Aleskerov, Freisleben and Rao also developed a neural network based system, called CARDWATCH, which is aimed at commercial implementation.
Dorronsoro and others developed a neural network based detection system called Minerva. This system is designed to be embedded deep in credit card transaction servers so that fraud can be detected in real time. Kokkinaki suggested creating a user profile for each credit card account and testing incoming transactions against the corresponding user's profile. Chan and Stolfo studied the class distribution of a training set and its effect on the performance of multi-classifiers in the credit card fraud domain. Brause and others looked specifically at credit card payment fraud and identified fraud cases by combining a rule-based classification approach with a neural network algorithm. Kim proposed a fraud density map technique to improve the learning efficiency of a neural network. Chiu and Tsai identified the problem that credit card transaction data is naturally skewed towards legitimate transactions. Foster and Stine attempted to predict personal bankruptcy using a fully automated stepwise regression model.
3. PROPOSED MODEL
The Hidden Markov Model is a simple and easily manageable sequential model, which is used here to model the spending behavior of the card holder (user). It is a doubly embedded random process with two distinct levels: one remains hidden and the other is visible to the observer. The HMM has greater potential for handling complex processes than the traditional Markov model. A considerable advantage of the model is the reduction in the number of false positives (FP). An FP is a transaction that the fraud detection system identifies as fraudulent although it is genuine.
The model consists of a finite set of states, each associated with a probability distribution. Transitions among the states are governed by a set of probabilities called transition probabilities. Every state generates an outcome, called an observation, according to its corresponding probability distribution. HMMs have been applied successfully in temporal pattern recognition, for example speech, handwriting and gesture recognition, part-of-speech tagging and bioinformatics.
The HMM is defined by the following elements:
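In the standard HMM notation (assumed here; these symbols are the conventional ones and are not taken from the text of this paper), a model with N hidden states and M observation symbols can be written as follows. For the clusters used in this paper, M = 3 and the observation symbols are l, m and h.

```latex
\begin{align*}
S &= \{S_1, \dots, S_N\} && \text{hidden states, with } q_t \text{ the state at time } t \\
V &= \{V_1, \dots, V_M\} && \text{observation symbols} \\
A &= [a_{ij}], \quad a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i) && \text{state transition probabilities} \\
B &= [b_j(k)], \quad b_j(k) = P(O_t = V_k \mid q_t = S_j) && \text{observation probabilities} \\
\pi &= [\pi_i], \quad \pi_i = P(q_1 = S_i) && \text{initial state probabilities} \\
\lambda &= (A, B, \pi) && \text{the complete model}
\end{align*}
```

Each row of A and of B sums to 1, as does the vector \pi.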
Once all these elements are known, the HMM is ready to work. We consider the initial sequence of transactions of the card holder. The transactions are categorized into three clusters, l, m and h, for low, medium and high amounts respectively. The range of each cluster is determined by the limit of the credit card: an amount up to 35% of the card limit belongs to cluster l, an amount up to 65% belongs to m, and anything above that falls in h. Let O1, O2, …, OR be a sequence of R observation symbols. The HMM now works in the following manner.
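As an illustration of the clustering step only, the mapping from transaction amounts to the symbols l, m and h might be sketched as below. The function names, the sample amounts and the choice to treat the 35% and 65% cut-offs as inclusive upper bounds are assumptions made for this sketch, not details given in the paper.

```python
def cluster_symbol(amount, card_limit, low_frac=0.35, mid_frac=0.65):
    """Map one transaction amount to 'l', 'm' or 'h' by its share of the card limit."""
    if amount < 0 or card_limit <= 0:
        raise ValueError("amount must be non-negative and card_limit positive")
    share = amount / card_limit
    if share <= low_frac:   # up to 35% of the limit (assumed inclusive)
        return "l"
    if share <= mid_frac:   # up to 65% of the limit (assumed inclusive)
        return "m"
    return "h"              # everything above that


def to_observation_sequence(amounts, card_limit):
    """Turn a list of amounts into the observation sequence O1 ... OR."""
    return [cluster_symbol(a, card_limit) for a in amounts]


# Hypothetical amounts for a card with a limit of 10000.
sample_amounts = [1200, 4800, 9000, 2500, 7000]
print(to_observation_sequence(sample_amounts, card_limit=10000))
```

The cut-off fractions are parameters so that a different split of the card limit can be tried without changing the function.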
4. RESULTS
For obvious security reasons it is very difficult to obtain a dataset from any bank. To obtain our results, a simulated analysis is therefore performed on a random dataset of transactions for a credit card holder. First, the transactions in the sequence are categorized into three clusters, low, medium and high, according to the user's credit card limit. Assuming a credit card limit of ₹10000 in our case, the cluster ranges are low {₹0, ₹3000}, medium {₹3000, ₹6000} and high {₹6000, ₹10000}. Once the categories are fixed, each incoming transaction is checked for fraud against the last 10 transactions.
The table above lists the transaction sequence with the corresponding amounts. From this data we calculate the first acceptance probability, which captures the spending habit of the user, taking the 7th to 17th transactions.
Since the 18th entry is the current transaction, the second acceptance probability is calculated with this transaction taken into consideration.
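The probability of an observation sequence under an HMM is commonly computed with the forward algorithm. The sketch below assumes that the acceptance probability is this quantity and that the second sequence is formed by dropping the oldest symbol and appending the new one. The two hidden states, all probability values and the symbol sequence are hypothetical placeholders, not the parameters or data used in this paper.

```python
def acceptance_probability(obs, states, start_p, trans_p, emit_p):
    """Forward algorithm: P(obs | model), summed over all hidden-state paths."""
    if not obs:
        raise ValueError("observation sequence must not be empty")
    # Initialisation: start in each state and emit the first symbol.
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    # Induction: extend every partial path by one more transaction.
    for symbol in obs[1:]:
        alpha = {
            s: sum(alpha[r] * trans_p[r][s] for r in states) * emit_p[s][symbol]
            for s in states
        }
    # Termination: add up the probabilities of ending in any state.
    return sum(alpha.values())


# Hypothetical model with two hidden states and the symbols l, m, h.
STATES = ("s1", "s2")
START_P = {"s1": 0.6, "s2": 0.4}
TRANS_P = {"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}}
EMIT_P = {
    "s1": {"l": 0.7, "m": 0.2, "h": 0.1},
    "s2": {"l": 0.2, "m": 0.5, "h": 0.3},
}

history = list("llmllmlllm")   # hypothetical last 10 symbols
new_symbol = "h"               # hypothetical incoming transaction

alpha1 = acceptance_probability(history, STATES, START_P, TRANS_P, EMIT_P)
# Slide the window: drop the oldest symbol, append the new one.
alpha2 = acceptance_probability(history[1:] + [new_symbol], STATES, START_P, TRANS_P, EMIT_P)
print(alpha1, alpha2)
```

Because both sequences have the same length, the two probabilities are directly comparable.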
From the values of the two acceptance probabilities we determine the standard deviation.
The percentage change of this standard deviation is compared with the threshold value ϴ. Ideally ϴ is taken as 0.1, but it is recalculated every time the algorithm runs, so the threshold is determined empirically for every transaction. Following this criterion, the current threshold value turns out to be 0.0842.
The above transaction is therefore detected as fraudulent, and the user has to answer the security questions. Simultaneously, the IP address and timestamp of the user are recorded.
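The comparison against the threshold can be sketched as follows. This assumes that the quantity compared is the relative drop from the first to the second acceptance probability; that reading, the function names and the probability values in the example are assumptions, and only the threshold 0.0842 comes from the text.

```python
def relative_change(alpha1, alpha2):
    """Relative drop in acceptance probability when the new transaction is included."""
    if alpha1 <= 0:
        raise ValueError("alpha1 must be positive")
    return (alpha1 - alpha2) / alpha1


def is_suspicious(alpha1, alpha2, threshold):
    """Flag the transaction when the probability drops by at least the threshold."""
    change = relative_change(alpha1, alpha2)
    # A rise in probability (negative change) is never treated as suspicious.
    return change > 0 and change >= threshold


# Hypothetical acceptance probabilities; 0.0842 is the threshold quoted above.
print(is_suspicious(2.0e-4, 1.5e-4, threshold=0.0842))
```

A flagged transaction would then trigger the security questions and the logging of the IP address and timestamp described above.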
5. CONCLUSION
The technology supporting online transactions has encouraged the use of payment cards. Although online transactions have made payments smooth for customers, they have also given rise to various security threats. In our system we have proposed applying an HMM to detect credit card fraud while recording the IP address of the fraudulent system along with the timestamp at which the malicious user attempted the attack. This helps in tracing the geographic location of the attacker.
This paper has shown that HMM processing starts by grouping the sequence of transaction amounts into three categories, which form the different hidden states of the model. The range of each cluster depends on the limit of the credit card, which varies with the user. With the help of these clusters, the model builds the spending profile of the user for a given sequence. The percentage change between the probabilities of the previous and the new transaction sequence is compared with the threshold value, which decides whether the incoming transaction is fraudulent or not.
Comparative studies showed the accuracy of the system to be about 80% over a wide range of input data. The resulting system is thus reliable to a great extent. It has also reduced complexity compared with the existing system. In our simulation analysis we considered a small set of data, but the proposed system is capable of handling a larger range of transactions, as is to be expected in real-life scenarios.
6. FUTURE WORK
The system is flexible enough for future enhancement and benefits from its dynamic nature. The probability used for fraud detection can always be refined on the basis of practical datasets and values. The algorithm is currently applied at a single layer; for stronger protection, a multi-layer algorithm can be implemented. In future, an application could also be designed to which sophisticated modules can be added, such as capturing a photo of the attacker.