Apriori Algorithm

Posted by Prabu Arumugam on Jul-21-2010
Languages: C#, Silverlight

This article explains the basics of association rules and how to find them using the Apriori algorithm. The algorithm is implemented in C# and Silverlight, and a live demonstration with full source code is available below.

Apriori Algorithm Demo in Silverlight

This demo illustrates the Apriori algorithm for finding large itemsets and for generating association rules from those large itemsets. The source code of the demo is available for download below.

What are Data Mining and Association Rules?

Data Mining is a technique for discovering useful information in large databases. Analyzing the data and extracting useful information can be very profitable for a business. For example, if a seller can find an association between two products, critical pricing or product-placement decisions can be made to promote the business. In this way, the seller can concentrate marketing efforts on the subsets of customers who are most likely to buy the associated products.

Association Rules are used for discovering regularities between products in large transactional databases. A transaction is an event involving one or more of the products (items) in the business or domain; for example, a customer's purchase of goods in a supermarket is a transaction. A set of items is usually referred to as an "itemset", and an itemset with "k" items is called a "k-itemset".

The general form of an association rule is X => Y, where X and Y are two disjoint itemsets. The "support" of an itemset is the number of transactions that contain all the items of that itemset; whereas the support of an association rule is the number of transactions that contain all items of both X and Y. The "confidence" of an association rule is the ratio between its support and the support of X.

A given association rule X => Y is considered significant and useful if it has high support and confidence values. The user specifies threshold values for support and confidence, so that different degrees of significance can be observed by varying these thresholds.
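As a quick worked example, consider a toy database of four market-basket transactions. The sketch below is for illustration only and is not part of the article's code; the Support helper is written just for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SupportDemo
{
    // Counts the transactions that contain every item of the given itemset.
    public static int Support(List<List<string>> db, params string[] items)
        => db.Count(t => items.All(t.Contains));

    public static void Main()
    {
        // A toy database of four market-basket transactions.
        var db = new List<List<string>>
        {
            new List<string> { "milk", "bread" },
            new List<string> { "milk", "bread", "butter" },
            new List<string> { "bread" },
            new List<string> { "milk", "butter" }
        };

        int supX  = Support(db, "milk");           // support of X = {milk} -> 3
        int supXY = Support(db, "milk", "bread");  // support of {milk, bread} -> 2

        // Confidence of the rule {milk} => {bread} is support(X and Y) / support(X),
        // here 2/3, or about 67%.
        Console.WriteLine((double)supXY / supX);
    }
}
```

So {milk} => {bread} has support 2 and confidence about 67%; whether that rule is reported depends on the thresholds the user chooses.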

For more information on association rules, see http://en.wikipedia.org/wiki/Association_rule.

Data Structures for this Article

We create the following data-structure classes. An Itemset is simply a list of strings; it can represent a transaction or any set of items. An ItemsetCollection is a list of itemsets; it can represent a transactional database or any group of itemsets. The AssociationRule class represents a generated association rule.

public class Itemset : List<string>
{
    public double Support { get; set; }
}

public class ItemsetCollection : List<Itemset>
{
}

public class AssociationRule
{
    public Itemset X { get; set; }
    public Itemset Y { get; set; }
    public double Support { get; set; }
    public double Confidence { get; set; }
}
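The algorithm code below calls db.GetUniqueItems() and db.FindSupport(), whose implementations ship with the downloadable source but are not listed in this article. A minimal sketch of what they might look like, assuming support is measured as a raw transaction count (the two classes are repeated from above so the sketch compiles on its own):

```csharp
using System.Collections.Generic;

public class Itemset : List<string> { public double Support { get; set; } }  // as defined above
public class ItemsetCollection : List<Itemset> { }                           // as defined above

public static class ItemsetExtensions
{
    // Collects every distinct item that occurs in any transaction of the database.
    public static Itemset GetUniqueItems(this ItemsetCollection db)
    {
        Itemset unique = new Itemset();
        foreach (Itemset transaction in db)
            foreach (string item in transaction)
                if (!unique.Contains(item))
                    unique.Add(item);
        return unique;
    }

    // Counts the transactions that contain every item of the given itemset.
    public static double FindSupport(this ItemsetCollection db, Itemset itemset)
    {
        int count = 0;
        foreach (Itemset transaction in db)
            if (itemset.TrueForAll(item => transaction.Contains(item)))
                count++;
        return count;
    }
}
```

Writing them as extension methods lets them be called as db.GetUniqueItems() and db.FindSupport(itemset), matching the calls in the code that follows.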

Finding Large Itemsets using the Apriori Algorithm

The first step in the generation of association rules is the identification of large itemsets. An itemset is "large" if its support is greater than a threshold, specified by the user. A commonly used algorithm for this purpose is the Apriori algorithm.

The Apriori algorithm relies on the principle that every non-empty subset of a large itemset must itself be a large itemset. The algorithm applies this principle in a bottom-up manner. Let Li denote the collection of large itemsets with "i" items. The algorithm begins by identifying all the sets in L1: each item that has the necessary support forms a large 1-itemset and is included in L1; the other itemsets are dropped from consideration. This process of retaining only the qualifying itemsets is called "pruning". The set of itemsets examined to find Li is called the candidate itemsets (Ci).

The collection L2 is constructed by considering each pair of sets in L1 and retaining only those pairs that have enough support. In general, having constructed Li, the collection Li+1 is constructed by considering pairs of sets, one from Li and another from L1, and eliminating those whose support falls below the threshold. This procedure continues until all large itemsets up to the desired maximum size have been obtained, or no further candidates remain.

The following code implements the Apriori algorithm.

public static ItemsetCollection DoApriori(ItemsetCollection db, double supportThreshold)
{
    Itemset I = db.GetUniqueItems();
    ItemsetCollection L = new ItemsetCollection(); //resultant large itemsets
    ItemsetCollection Li = new ItemsetCollection(); //large itemset in each iteration
    ItemsetCollection Ci = new ItemsetCollection(); //candidate itemset in each iteration

    //first iteration (1-item itemsets)
    foreach (string item in I)
    {
        Ci.Add(new Itemset() { item });
    }

    //next iterations
    int k = 2;
    while (Ci.Count != 0)
    {
        //set Li from Ci (pruning)
        Li.Clear();
        foreach (Itemset itemset in Ci)
        {
            itemset.Support = db.FindSupport(itemset);
            if (itemset.Support >= supportThreshold)
            {
                Li.Add(itemset);
                L.Add(itemset);
            }
        }

        //set Ci for next iteration (find supersets of Li)
        Ci.Clear();
        Ci.AddRange(Bit.FindSubsets(Li.GetUniqueItems(), k)); //get k-item subsets
        k += 1;
    }

    return (L);
}

The FindSubsets() function defined in the Bit class is used to find all the subsets of a given set of items; it is explained in more detail in a separate article.
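The Bit class itself is not listed here, but its behavior can be inferred from how it is called: FindSubsets(items, k) returns the k-item subsets of items, and a size of 0 (as used in the rule-mining code below) returns all non-empty subsets. A minimal sketch under those assumptions, enumerating subsets via bit masks (a long mask avoids the int overflow that a commenter below reports for large item sets):

```csharp
using System.Collections.Generic;

public class Itemset : List<string> { public double Support { get; set; } }  // as defined above
public class ItemsetCollection : List<Itemset> { }                           // as defined above

// Illustrative sketch of Bit.FindSubsets; the real implementation ships with the download.
public static class Bit
{
    public static ItemsetCollection FindSubsets(Itemset items, int size)
    {
        ItemsetCollection subsets = new ItemsetCollection();
        int n = items.Count;

        // Each integer from 1 to 2^n - 1 is a bit mask selecting one non-empty subset.
        for (long mask = 1; mask < (1L << n); mask++)
        {
            Itemset subset = new Itemset();
            for (int i = 0; i < n; i++)
                if ((mask & (1L << i)) != 0)
                    subset.Add(items[i]);

            // size == 0 is taken to mean "all subsets"; otherwise keep only size-item subsets.
            if (size == 0 || subset.Count == size)
                subsets.Add(subset);
        }
        return subsets;
    }
}
```

Note that this enumeration is exponential in the number of unique items, which matches the long run times readers observe on big databases.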

Finding Association Rules

Having found the set of all large itemsets in the input database, the next task is to find the required set of strong association rules. An association rule is "strong" if its confidence is greater than a user-defined threshold. Association rules are created by combining each large itemset with each of its subsets; the strong rules are reported as results and the others are dropped.

public static List<AssociationRule> Mine(ItemsetCollection db, ItemsetCollection L, double confidenceThreshold)
{
    List<AssociationRule> allRules = new List<AssociationRule>();

    foreach (Itemset itemset in L)
    {
        ItemsetCollection subsets = Bit.FindSubsets(itemset, 0); //get all subsets
        foreach (Itemset subset in subsets)
        {
            double confidence = (db.FindSupport(itemset) / db.FindSupport(subset)) * 100.0;
            if (confidence >= confidenceThreshold)
            {
                AssociationRule rule = new AssociationRule();
                rule.X = new Itemset(); //initialize X and Y; otherwise AddRange below throws a NullReferenceException
                rule.Y = new Itemset();
                rule.X.AddRange(subset);
                rule.Y.AddRange(itemset.Remove(subset)); //Remove() here is a helper from the download that returns the set difference
                rule.Support = db.FindSupport(itemset);
                rule.Confidence = confidence;
                if (rule.X.Count > 0 && rule.Y.Count > 0)
                {
                    allRules.Add(rule);
                }
            }
        }
    }

    return (allRules);
}

 Downloads for this article
File: Apriori-Demo-Source (365.57 kb, 2120 downloads)
Languages: C#, Silverlight 3
Tools: Visual Studio 2010


 Comments on this Article
Comment by ushanandhini on Aug-12-2014
i need the C# implementation code of the Apriori algorithm
Comment by saria on Jul-20-2014
I need fuzzy for this algorithm anyone can help me please :(
Comment by sara on Jun-03-2014
its very nice code. thanks for sharing.
Comment by sindiya on Apr-14-2014
need full explanation for this (with the db).. thank you. nice coding.
Comment by vuductoan123 on Mar-10-2014
thanks for sharing... its very useful. i need link download sourcecode
Comment by tuananhk43 on Feb-26-2014
thanks you so much
Comment by pusia1809 on Dec-04-2013
Thank Prabu !
Comment by rykardu on Oct-24-2013
Hi, do you have any code for use with Fuzzy Association Rules?
Comment by nhimmai on Oct-19-2013
thanks
Comment by nofearmd5 on Oct-07-2013
ok
Comment by faisalqau on May-24-2013
Nice coding...
Comment by has on May-13-2013
thanks
Comment by gachecha on Apr-23-2013
Nice code, but doesn't work for large item sets - in the Bit class when you're calculating the number of subsets (2^(size of unique items)), the number is getting greater than what "int" can hold. Also, the run time is huge for a big database!
Comment by Saira on Mar-19-2013
Kindly suggest me some few variations of Apriori
Comment by cincoutprabu on Mar-13-2013
Hi ulveera, what do you mean by complete working code. The Silverlight widget available in this article is the complete working demo, and its source code is available for download.
Comment by ulveera on Mar-13-2013
hey can I have the complete working code of it along with association rules code?? please
Comment by jh8666 on Jan-29-2013
thank you very much! it's good for me
Comment by Nguyễn Tuấn Anh on Dec-31-2012
Thank you very much
Comment by ehsan on Dec-21-2012
i get error ...why??how can i solve it? error: Error 1 The type or namespace name 'NumericUpDown' does not exist in the namespace 'System.Windows.Controls' (are you missing an assembly reference?) D:\AprioriDemo\codeding.AprioriDemo\obj\Debug\MainPage.g.cs 42 42 codeding.AprioriDemo
Comment by Hiba on Dec-21-2012
hey please i couldnt implement it can any help me this is my email soft_hiba91@hotmail.com thanks in advance
Comment by duyngukho on Dec-15-2012
good for me
Comment by pria on Nov-30-2012
Very nicely explained.. Thank you.. :) :) can i please have the source code.. mail id : proudindian.priya@gmail.com Thank you in advance..
Comment by gzd on Oct-31-2012
Yes , i tried different values for two thresholds and inputs. i found missing thing : At AprioriMining.cs , i added following part : i divided 100 : if (itemset.Support >= (supportThreshold/100)) if (confidence >= (confidenceThreshold/100)) After that , code executed succesfully. Thank you so much! :)
Comment by cincoutprabu on Oct-30-2012
Hi gzd, the number of large itemsets and number of association rules generated will change based on the support threshold & confidence threshold, you specify. Have you tried different values for these two thresholds?
Comment by gzd on Oct-30-2012
Firstly , thanks for important code i installed and executed it. but always results are same: My results : 0 Large Itemsets (by Apriori) 0 Association Rules. Your example's result is same. Please help me. Can you bring different example , different result ???
Comment by gzd on Oct-29-2012
After downloading, I opened it with Microsoft visual studio 2010. but i got an error: it's project type (.csproj) is not supported by this version of the application. Please help me
Comment by saras on Oct-20-2012
i need source code for apriori algorithm using c language
Comment by dungvh on Oct-11-2012
thanks
Comment by anuj on Oct-03-2012
can anyone send me this source code?
Comment by thang on Sep-19-2012
Thank you so much!
Comment by Desai on Sep-10-2012
Thankyou Mr. Prabhu Arumugam. The problem was with my visual studio installation as the system components are not properly installed. Thank you so much. The program executed successfully in another machine and verified.
Comment by cincoutprabu on Sep-09-2012
Hi desai, try one of these: 1) Clean the solution and rebuild again before running. 2) If you are running in Windows7 and you copied the code to desktop, try moving the directory to another drive like (D:), and then try running the code from there. 3) You have to configure the directory in which you have placed the code as virtual directory in IIS. Plz refer this link for more information: http://stackoverflow.com/questions/2355947/error-allowdefinition-machinetoapplication-beyond-application-level
Comment by desai on Sep-08-2012
After downloading, I opened it with Microsoft visual web developer 2008. While i want to run I am getting,"It is an error to use a section registered as allowDefinition='MachineToApplication' beyond application level. This error can be caused by a virtual directory not being configured as an application in IIS" as error. Please help me in solving this.
Comment by thirupathi on Aug-23-2012
can any one send this source code plz...
Comment by jeo on Aug-11-2012
My email is Saleeem_77@hotmail.com
Comment by Jeo on Aug-11-2012
I am not able to download the Apriori-Demo-Source.zip!! Can any one send me the software to my email please? I will be very thankfull and it is very important for me. Thank you
Comment by jeo on Jul-11-2012
Thank you so much
Comment by Shrada Pradhan on May-06-2012
nice code thannx
Comment by thuanvo on May-06-2012
thank so much!^^
Comment by andy on Apr-30-2012
thanks for sharing... its very useful
Comment by hoangnguyenlien on Apr-18-2012
thanks!
Comment by umesh on Apr-15-2012
its very useful code for us
Comment by anuj on Apr-15-2012
thanks
Comment by Fatema on Apr-03-2012
here is given a link to download the source code. but i can't download this. why!!??!!??!!
Comment by Nagavallisammeta on Mar-12-2012
your coding are good understanding to readers you will have a nice performance to implementing the apriori algorithm.
Comment by manoj kumar on Feb-19-2012
thanks
Comment by jerada on Dec-08-2011
Please, how can I download this code???
Comment by latelyhappy on Sep-30-2011
thank you !
Comment by win on Aug-22-2011
I can't open with VS 2010, please help me! Thanks!
Comment by taher on Jun-11-2011
thanks
Comment by anilwurity on May-24-2011
good work
Comment by arun_s24 on May-18-2011
its very good, I have to generate association rule for retail shop, i am using WEKA , is it good or not..plzz any body rply or call me on 09549833706
Comment by Priya on Apr-26-2011
Really awesome creation. It is really ideal coding. It is the way we learn this algo ... totally loved it.. Keep it up
Comment by soukaena on Apr-24-2011
thank you
Comment by gowang on Mar-15-2011
thanks for sharing... its very useful
Comment by Jayantkumar on Dec-03-2010
Its really good.
Comment by karami on Aug-05-2010
thanks
