Apriori Algorithm - codeding.com

This article explains the basics of association rules and how to find them using the Apriori algorithm. The algorithm is implemented in JavaScript and C# / Silverlight, and a live demonstration is available below with full source code. The JavaScript code uses ES2015, which is a big step forward for JavaScript with significant features such as classes, arrow functions, and modules.

Learn Apriori Algorithm by Example

This demo illustrates how the Apriori algorithm finds large itemsets and generates association rules from them. Enter a set of items separated by commas and the number of transactions you wish to have in the input database. Then press the Generate DB button to generate a random database from the items you entered. Then press the Apriori button to see the algorithm in action; a set of large itemsets and association rules will be generated based on the given support and confidence thresholds. The source code of the demo is available for download below in JavaScript and C#.

The same widget / demo is available in C# and Silverlight here. Source code is available for download below.

What are Data Mining and Association Rules

Data mining is a technique for discovering useful information in large databases. Analyzing the data and extracting useful information can be very profitable for a business. For example, if a seller can find an association between two products, critical decisions about pricing or product placement can be made to promote the business. This way, the seller can concentrate marketing efforts on the subset of customers who are most likely to buy the associated products.

Association rules are used for discovering regularities between products in big transactional databases. A transaction is an event involving one or more of the products (items) in the business or domain; for example, the purchase of goods by a consumer in a supermarket is a transaction. A set of items is usually referred to as an "itemset", and an itemset with "k" items is called a "k-itemset".

The general form of an association rule is X => Y, where X and Y are two disjoint itemsets. The "support" of an itemset is the number of transactions that contain all the items of that itemset; whereas the support of an association rule is the number of transactions that contain all items of both X and Y. The "confidence" of an association rule is the ratio between its support and the support of X.
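
The two definitions above can be tried on a toy database. The following is a minimal sketch, not the article's implementation: the function names `supportCount` and `confidence` and the sample transactions are made up for illustration, and support is kept as a raw transaction count, whereas the code later in the article appears to work with percentages.

```javascript
// Toy transactional database: each transaction is an array of item names.
const transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
];

// Support count: number of transactions that contain every item of the itemset.
function supportCount(db, itemset) {
    return db.filter(t => itemset.every(item => t.includes(item))).length;
}

// Confidence of X => Y: support of (X and Y together) divided by support of X.
function confidence(db, x, y) {
    return supportCount(db, [...x, ...y]) / supportCount(db, x);
}

// Rule {bread} => {milk}: 2 transactions contain both, 3 contain bread.
const ruleSupport = supportCount(transactions, ["bread", "milk"]);
const ruleConfidence = confidence(transactions, ["bread"], ["milk"]);
console.log(ruleSupport, ruleConfidence.toFixed(2)); // 2 0.67
```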

A given association rule X => Y is considered significant and useful if it has high support and confidence values. The user specifies threshold values for support and confidence, so that different degrees of significance can be observed based on these thresholds.

For more information on association rules, see http://en.wikipedia.org/wiki/Association_rule.

Data Structures for this Article

We create the following data-structure classes. An Itemset is just a list of strings; it can be used to represent a transaction or any set of items. An ItemsetCollection is a list of itemsets; it can be used to represent a transactional database or any group of itemsets. The AssociationRule class represents a single generated association rule.

class Itemset extends Array {
    constructor() {
        super();
        this.Support = 0.0;
    }
}

class ItemsetCollection extends Array {
    constructor() {
        super();
    }
}

class AssociationRule {
    constructor() {
        this.X = new Itemset();
        this.Y = new Itemset();
        this.Support = 0.0;
        this.Confidence = 0.0;
    }
}
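
The algorithm code later in this article calls two helpers on the database, `getUniqueItems()` and `findSupport()`, whose bodies are not shown here. The sketch below is one guess at what such helpers might look like. The class name `TransactionDb` is hypothetical, and returning support as a percentage is an assumption based on the `* 100.0` used for confidence in the rule-mining code; the downloadable source may differ.

```javascript
// Hypothetical stand-in for a transactional database (not the article's class).
class TransactionDb extends Array {
    // Sorted list of the distinct items that occur in any transaction.
    getUniqueItems() {
        return [...new Set(this.flat())].sort();
    }

    // Assumed convention: percentage of transactions containing every item of `itemset`.
    findSupport(itemset) {
        if (this.length === 0) return 0;
        const matches = this.filter(t => itemset.every(item => t.includes(item)));
        return (matches.length / this.length) * 100.0;
    }
}

const sampleDb = TransactionDb.from([
    ["a", "b", "c"],
    ["a", "b"],
    ["b", "c"],
    ["a", "c"],
]);

console.log(sampleDb.getUniqueItems());
console.log(sampleDb.findSupport(["a", "b"])); // 50
```

Note that `TransactionDb.from(...)` is used instead of `new TransactionDb(...)`, because the Array constructor treats a single numeric argument as a length rather than as an element.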

Finding Large Itemsets using Apriori Algorithm

The first step in the generation of association rules is the identification of large itemsets. An itemset is "large" if its support is at least a threshold specified by the user. A commonly used algorithm for this purpose is the Apriori algorithm.

The Apriori algorithm relies on the principle "Every non-empty subset of a large itemset must itself be a large itemset". The algorithm applies this principle in a bottom-up manner. Let Li denote the collection of large itemsets with "i" items. The algorithm begins by identifying all the sets in L1. Each item that has the necessary support forms a large 1-itemset and is included in L1; other itemsets are dropped from consideration. This process of retaining only the necessary itemsets is called "pruning". The set of itemsets used to find Li is called the candidate itemsets (Ci).

The collection L2 can be constructed by considering each pair of sets in L1 and retaining only those pairs that have enough support. In general, having constructed Li, the collection Li+1 is constructed by considering pairs of sets, one from Li and another from L1, and eliminating those whose support is below the threshold. This procedure is continued until all large itemsets up to the desired maximum size have been obtained or no further pruning is possible.
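
The level-wise construction described above (extend each set in Li with one item from L1, then prune by support) can be sketched as follows. This is an illustrative, self-contained version with made-up names (`support`, `extend`, `largeItemsets`) and support expressed as a fraction; it is not the implementation shown next, which generates its candidates differently.

```javascript
// Support as the fraction of transactions that contain all items of the itemset.
function support(db, itemset) {
    return db.filter(t => itemset.every(i => t.includes(i))).length / db.length;
}

// Build candidate (k+1)-itemsets by adding one large item to each large k-itemset.
function extend(Lk, largeItems) {
    const seen = new Set();
    const candidates = [];
    for (const itemset of Lk) {
        for (const item of largeItems) {
            if (itemset.includes(item)) continue;
            const candidate = [...itemset, item].sort();
            const key = candidate.join("|"); // assumes item names do not contain "|"
            if (!seen.has(key)) {
                seen.add(key);
                candidates.push(candidate);
            }
        }
    }
    return candidates;
}

// Collect L1, L2, ... until a level comes out empty.
function largeItemsets(db, minSupport) {
    const items = [...new Set(db.flat())].sort();
    const largeItems = items.filter(i => support(db, [i]) >= minSupport);
    let Lk = largeItems.map(i => [i]);
    const all = [];
    while (Lk.length > 0) {
        all.push(...Lk);
        Lk = extend(Lk, largeItems).filter(c => support(db, c) >= minSupport);
    }
    return all;
}

const toyDb = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b"],
];
const found = largeItemsets(toyDb, 0.5);
console.log(found);
```

With a 50% threshold, {b, c} appears in only one of four transactions and is pruned, so {a, b, c} cannot be large either; the result is the three single items plus {a, b} and {a, c}.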

The following code implements the Apriori algorithm.

static doApriori(db, supportThreshold) {
    const I = db.getUniqueItems();
    const L = new ItemsetCollection(); // Resultant large itemsets
    const Li = new ItemsetCollection(); // Large itemsets in each iteration
    const Ci = new ItemsetCollection(); // Candidate itemsets in each iteration

    // First iteration (1-item itemsets)
    for (const item of I) {
        Ci.push(Itemset.from([item]));
    }

    // Next iterations
    let k = 2;
    while (Ci.length !== 0) {
        // Set Li from Ci (pruning)
        Li.length = 0; // Empty the collection
        for (const itemset of Ci) {
            itemset.Support = db.findSupport(itemset);
            if (itemset.Support >= supportThreshold) {
                Li.push(itemset);
                L.push(itemset);
            }
        }

        // Set Ci for next iteration (find supersets of Li)
        Ci.length = 0; // Empty the collection
        const subsets = Bit.findSubsets(Li.getUniqueItems(), k); // Get k-item subsets
        subsets.forEach(set => Ci.push(set));
        k += 1;
    }

    return L;
}

The findSubsets() function defined in the Bit class is used to find all the subsets of a given set of items. This is explained in more detail in this article.
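
The body of the Bit class is not reproduced here. As a rough idea of how subsets can be enumerated with bit masks, here is a hypothetical standalone function. Its behavior is inferred only from the two calls in this article (a size of 0 meaning "all subsets", any other size meaning "subsets with exactly that many items"), and skipping the empty subset is a choice made for this sketch, so the real function may differ.

```javascript
// Hypothetical stand-in for Bit.findSubsets, not the article's source.
// Each mask from 1 to 2^n - 1 selects the items whose bit is set.
// Assumes fewer than 31 items, since 32-bit shifts overflow beyond that.
function findSubsets(items, size) {
    const subsets = [];
    const total = 1 << items.length; // number of bit patterns
    for (let mask = 1; mask < total; mask += 1) {
        const subset = items.filter((_, bit) => (mask >> bit) & 1);
        if (size === 0 || subset.length === size) {
            subsets.push(subset);
        }
    }
    return subsets;
}

const pairs = findSubsets(["a", "b", "c"], 2);
console.log(pairs); // the three 2-item subsets of {a, b, c}
console.log(findSubsets(["a", "b", "c"], 0).length); // 7
```

The number of masks doubles with every additional item, which is why this style of enumeration becomes slow, and eventually overflows, for large item sets.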

Finding Association Rules

Having found the set of all large itemsets from the input database, the next task is to find the required set of strong association rules. An association rule is "strong" if its confidence value is at least a user-defined threshold. The association rules are created by combining each large itemset with each of its subsets: the subset becomes X and the remaining items become Y. The strong rules are reported as the result and the others are dropped.

static mine(db, L, confidenceThreshold) {
    const allRules = [];

    for (const itemset of L) {
        const itemsetSupport = db.findSupport(itemset);
        const subsets = Bit.findSubsets(itemset, 0); // Get all subsets

        for (const subset of subsets) {
            // Confidence of (subset => rest of itemset), as a percentage
            const confidence = (itemsetSupport / db.findSupport(subset)) * 100.0;

            if (confidence >= confidenceThreshold) {
                const rule = new AssociationRule();
                subset.forEach(item => rule.X.push(item));
                itemset.removeItemset(subset).forEach(item => rule.Y.push(item));
                rule.Support = itemsetSupport;
                rule.Confidence = confidence;

                // Keep only rules where both sides are non-empty
                if (rule.X.length > 0 && rule.Y.length > 0) {
                    allRules.push(rule);
                }
            }
        }
    }

    return allRules;
}
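
To see the rule-generation step on concrete numbers, here is a small self-contained sketch. It does not use the article's classes; `supportPercent`, `properSubsets`, and `rulesFrom` are illustrative names, and the sample transactions are invented for the example.

```javascript
const shopDb = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
    ["bread"],
];

// Support as a percentage of transactions containing all items of the itemset.
function supportPercent(itemset) {
    const count = shopDb.filter(t => itemset.every(i => t.includes(i))).length;
    return (count * 100) / shopDb.length;
}

// All non-empty proper subsets of an itemset (assumes fewer than 31 items).
function properSubsets(itemset) {
    const result = [];
    for (let mask = 1; mask < (1 << itemset.length) - 1; mask += 1) {
        result.push(itemset.filter((_, bit) => (mask >> bit) & 1));
    }
    return result;
}

// Turn one large itemset into the rules X => Y that meet the confidence threshold.
function rulesFrom(itemset, minConfidence) {
    const rules = [];
    for (const x of properSubsets(itemset)) {
        const y = itemset.filter(i => !x.includes(i));
        const confidence = (supportPercent(itemset) / supportPercent(x)) * 100;
        if (confidence >= minConfidence) {
            rules.push({ x, y, support: supportPercent(itemset), confidence });
        }
    }
    return rules;
}

const strongRules = rulesFrom(["bread", "milk"], 60);
strongRules.forEach(r => console.log(`${r.x} => ${r.y} (${r.confidence.toFixed(1)}%)`));
// prints "milk => bread (66.7%)"
```

Here {bread, milk} has 40% support, bread alone has 80% and milk alone has 60%. So bread => milk has 50% confidence and is dropped, while milk => bread has about 66.7% confidence and is kept.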

Downloads for this article

File	Language	Tools
Apriori Demo Source in C# 365.57 kb (3707 downloads)	C#, Silverlight 3	Visual Studio 2010
Apriori Demo Source in JavaScript / ES2015 42.96 kb (650 downloads)	JavaScript / ES2015	Visual Studio Code

Comments on this Article

Comment by sathya sankar on Apr-27-2017
Its very useful, I have to generate association rule for web log data. i am using R tool , already i done with apriori, now i wish to try reverse apriori, can i get code for reverse apriori,,,,,please...........................

Comment by mustafa93 on Oct-11-2016
good

Comment by sanju on Jun-15-2016
may i hab a vb.net cod for this?

Comment by mera on Dec-05-2015
any one have the code of algorithm in vb.net

Comment by cincoutprabu on Oct-12-2015
Hi keretina, the data structure classes like Itemset, ItemsetCollection were defined in a class library which is included in the download. Namespace of the class library is "codeding.Apriori.DataStructures".

Comment by keretina on Oct-06-2015
I have a doubt in which part of the code is the body of those 3 classes ( class Itemset,class ItemsetCollection and class AssociationRule) and the interfaces is defined , I downloaded the source code but couldn't find it , any one who have full source code please help me out . Thanks for sharing.

Comment by Fiszer on Oct-02-2015
Hello. Your implementation is awsome but i have problem cause algorithm stops working when iam using too many transactions.

Comment by poonam on Jul-20-2015
thanks.

Comment by maheshwar on Jan-16-2015
I have the adult database with 14 attribute i use the same code for making the association rule but it takes the 15 hours to generate the Frequent rule and association rule. How i optimise it execute the code fastly

Comment by ushanandhini on Aug-12-2014
i need csharph implementation code aprioi algorthim

Comment by saria on Jul-20-2014
I need fuzzy for this algorithm anyone can help me please :(

Comment by sara on Jun-03-2014
its very nice code. thanks for sharing.

Comment by sindiya on Apr-14-2014
need full explanation for this (with the db).. thank you. nice coding.

Comment by vuductoan123 on Mar-10-2014
thanks for sharing... its very useful. i need link download sourcecode

Comment by tuananhk43 on Feb-26-2014
thanks you so much

Comment by pusia1809 on Dec-04-2013
Thank Prabu !

Comment by rykardu on Oct-24-2013
Hi, do you have any code for use with Fuzzy Association Rules?

Comment by nhimmai on Oct-19-2013
thanks

Comment by nofearmd5 on Oct-07-2013
ok

Comment by faisalqau on May-24-2013
Nice coding...

Comment by has on May-13-2013
thanks

Comment by gachecha on Apr-23-2013
Nice code, but doesn't work for lark item sets - in the Bit class when you're calculating the size of the item sets (2^(size of unique items)), the number is getting greater than what "int" can hold. Also, the run time is huge for a big database!

Comment by Saira on Mar-19-2013
Kindly suggest me some few variations of Apriori

Comment by cincoutprabu on Mar-13-2013
Hi ulveera, what do you mean by complete working code. The Silverlight widget available in this article is the complete working demo, and its source code is available for download.

Comment by ulveera on Mar-13-2013
hey can I have the complete working code of it along with association rules code?? please

Comment by jh8666 on Jan-29-2013
thank you very much! it's good for me

Comment by Nguyễn Tuấn Anh on Dec-31-2012
Thanhs you very mucg

Comment by ehsan on Dec-21-2012
i get error ...why??how can i solve it? error: Error 1 The type or namespace name 'NumericUpDown' does not exist in the namespace 'System.Windows.Controls' (are you missing an assembly reference?) D:\AprioriDemo\codeding.AprioriDemo\obj\Debug\MainPage.g.cs 42 42 codeding.AprioriDemo

Comment by Hiba on Dec-21-2012
hey please i couldnt implement it can any help me this is my email soft_hiba91@hotmail.com thanks in advance

Comment by duyngukho on Dec-15-2012
good for me

Comment by pria on Nov-30-2012
Very nicely explained.. Thank you.. :) :) can i please have the source code.. mail id : proudindian.priya@gmail.com Thank you in advance..

Comment by gzd on Oct-31-2012
Yes , i tried different values for two thresholds and inputs. i found missing thing : At AprioriMining.cs , i added following part : i divided 100 : if (itemset.Support >= (supportThreshold/100)) if (confidence >= (confidenceThreshold/100)) After that , code executed succesfully. Thank you so much! :)

Comment by cincoutprabu on Oct-30-2012
Hi gzd, the number of large itemsets and number of association rules generated will change based on the support threshold & confidence threshold, you specify. Have you tried different values for these two thresholds?

Comment by gzd on Oct-30-2012
Firstly , thanks for important code i installed and executed it. but always results are same: My results : 0 Large Itemsets (by Apriori) 0 Association Rules. Your example's result is same. Please help me. Can you bring different example , different result ???

Comment by gzd on Oct-29-2012
After downloading, I opened it with Microsoft visual studio 2010. but i got an error: it's project type (.csproj) is not supported by this version of the application. Please help me

Comment by saras on Oct-20-2012
i need source code for apriori algorithm using c language

Comment by dungvh on Oct-11-2012
thanks

Comment by anuj on Oct-03-2012
can anyone send me this source code?

Comment by thang on Sep-19-2012
Thank you so much!

Comment by Desai on Sep-10-2012
Thankyou Mr. Prabhu Arumugam. The problem was with my visual studio installation as the system components are not properly installed. Thank you so much. The program executed successfully in another machine and verified.

Comment by cincoutprabu on Sep-09-2012
Hi desai, try one of these: 1) Clean the solution and rebuild again before running. 2) If you are running in Windows7 and you copied the code to desktop, try moving the directory to another drive like (D:), and then try running the code from there. 3) You have to configure the directory in which you have placed the code as virtual directory in IIS. Plz refer this link for more information: http://stackoverflow.com/questions/2355947/error-allowdefinition-machinetoapplication-beyond-application-level

Comment by desai on Sep-08-2012
After downloading, I opened it with Microsoft visual web developer 2008. While i want to run I am getting,"It is an error to use a section registered as allowDefinition='MachineToApplication' beyond application level. This error can be caused by a virtual directory not being configured as an application in IIS" as error. Please help me in solving this.

Comment by thirupathi on Aug-23-2012
can any one send this source code plz...

Comment by jeo on Aug-11-2012
My email is Saleeem_77@hotmail.com

Comment by Jeo on Aug-11-2012
I am not able to download the Apriori-Demo-Source.zip!! Can any one send me the software to my email please? I will be very thankfull and it is very important for me. Thank you

Comment by jeo on Jul-11-2012
Thank you so much

Comment by Shrada Pradhan on May-06-2012
nice code thannx

Comment by thuanvo on May-06-2012
thank so much!^^

Comment by andy on Apr-30-2012
thanks for sharing... its very useful

Comment by hoangnguyenlien on Apr-18-2012
thanks!

Comment by umesh on Apr-15-2012
its very useful code for us

Comment by anuj on Apr-15-2012
thanks

Comment by Fatema on Apr-03-2012
here is given a link to download the source code. but i can't download this. why!!??!!??!!

Comment by Nagavallisammeta on Mar-12-2012
your coding are good understanding to readers you will have a nice performance to implementing the apriori algorithm.

Comment by manoj kumar on Feb-19-2012
thanks

Comment by jerada on Dec-08-2011
Please, how can I download this code???

Comment by latelyhappy on Sep-30-2011
thank you !

Comment by win on Aug-22-2011
I can't open with VS 2010, please help me! Thanks!

Comment by taher on Jun-11-2011
thanks

Comment by anilwurity on May-24-2011
good work

Comment by arun_s24 on May-18-2011
its very good, I have to generate association rule for retail shop, i am using WEKA , is it good or not..plzz any body rply or call me on 09549833706

Comment by Priya on Apr-26-2011
Really awesome creation. It is really ideal coding. It is the way we learn this algo ... totally loved it.. Keep it up

Comment by soukaena on Apr-24-2011
thank you

Comment by gowang on Mar-15-2011
thanks for sharing... its very useful

Comment by Jayantkumar on Dec-03-2010
Its really good.

Comment by karami on Aug-05-2010
thanks
