Discovering Monthly Fuzzy Patterns

Discovering patterns that are fuzzy in nature from temporal datasets is an interesting data mining problem. One such pattern is the monthly fuzzy pattern, where the pattern exists in a certain fuzzy time interval of every month. It involves finding frequent sets and then association rules that hold in certain fuzzy time intervals, viz. the beginning of every month or the middle of every month, etc. In most of the earlier works, the fuzziness was user-specified. However, in some applications, users may not have enough prior knowledge about the datasets under consideration and may miss some fuzziness associated with the problem. It may also be the case that the user is unable to specify the fuzziness due to the limitations of natural language. In this article, we propose a method of finding patterns that hold in certain fuzzy time intervals of every month, where the fuzziness is generated by the method itself. The efficacy of the method is demonstrated with experimental results.


Introduction
Analysis of transactional data has been considered an important data mining problem. Market basket data is an example of such transactional data. In a market-basket data set, each transaction is a collection of items bought by a customer at one time. The concept proposed in [1] is to find the co-occurrence of items in transactions, given minimum support and minimum confidence thresholds. Temporal association rule mining is an important extension of the above-mentioned problem. When a customer buys items from a supermarket, the purchase is called a transaction and its time is automatically recorded. Ale et al. [2] have proposed a method of extracting association rules that hold within the life-span of the corresponding item set.
Mahanta et al. [3] have introduced the concept of locally frequent item sets as item sets that are frequent in certain time intervals and may or may not be frequent throughout the life-span of the item set. They developed an efficient algorithm to find such item sets along with a list of sequences of time intervals. Considering the time-stamps as calendar dates, a method is discussed in [4] which can extract yearly, monthly and daily periodic or partially periodic patterns. If the periods are kept in a compact manner using the method discussed in [4], each of them turns out to be a fuzzy time interval. In this paper, we discuss such patterns and devise algorithms for extracting them. Although our algorithm works for extracting monthly fuzzy patterns, it can be modified for daily fuzzy periodic patterns. The paper is organized as follows. In Section 2, we discuss related works. In Section 3, we discuss terms, definitions and notations used in the algorithm. In Section 4, the proposed algorithm is discussed. In Section 5, we discuss results and analysis. Finally, a summary and lines for future work are discussed in Section 6.

Related Works
Agrawal et al. [1] first formulated the association rule mining problem. One important extension of this problem is temporal data mining [5]: by taking into account the time aspect, more interesting patterns that are time dependent can be extracted. The associated problems are to find valid time periods during which association rules hold and to discover possible periodicities that association rules have. In [2], an algorithm for finding temporal rules is described, where each rule has a time frame associated with it. In [3], the work done in [2] has been extended by taking into account the time gap between two consecutive transactions containing an item set.
Considering the periodic nature of patterns, Ozden et al. [6] proposed a method that is able to find patterns having a periodic nature, where the period has to be specified by the user. In [7], Li et al. discuss a method of extracting temporal association rules with respect to fuzzy match, i.e. association rules holding during "enough" number of intervals given by the corresponding calendar pattern. Similar work was done in [8], incorporating multiple granularities of time intervals (e.g. first working day of every month), from which both cyclic and user-defined calendar patterns can be obtained.
Mining fuzzy patterns from datasets has been studied by different authors. In [9], the authors present an algorithm for mining fuzzy temporal patterns from a given process instance. Similar work is done in [10]. In [11], a method of extracting fuzzy periodic association rules is discussed.

Terms, Definitions and Notations Used
Let us review some definitions and notations used in this paper.
A fuzzy number is a convex normalized fuzzy set $A$ defined on the real line $R$ such that 1) there exists an $x_0 \in R$ such that $A(x_0) = 1$, and 2) $A(x)$ is piecewise continuous. Thus a fuzzy number can be thought of as containing the real numbers within some interval to varying degrees. Fuzzy intervals are special fuzzy numbers satisfying the following: 1) there exists an interval $[a, b] \subset R$ such that $A(x_0) = 1$ for all $x_0 \in [a, b]$, and 2) $A(x)$ is piecewise continuous. A fuzzy interval can be thought of as a fuzzy number with a flat region. A fuzzy interval $A$ is denoted by $A = [a, b, c, d]$ with $a < b < c < d$, where $A(a) = A(d) = 0$ and $A(x) = 1$ for all $x \in [b, c]$. $A(x)$ for $x \in [a, b]$ is known as the left reference function and $A(x)$ for $x \in [c, d]$ is known as the right reference function. The left reference function is non-decreasing and the right reference function is non-increasing [12].
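As a small illustration of a fuzzy interval with a flat region, the sketch below evaluates a membership grade for four break points a < b <= c < d. It assumes linear left and right reference functions, which is only one admissible choice (the definition requires no more than a non-decreasing left side and a non-increasing right side). The function name and the sample break points are hypothetical.

```python
def fuzzy_interval_membership(x, a, b, c, d):
    """Membership grade of x in a fuzzy interval with break points a < b <= c < d.

    Assumption (not from the paper): both reference functions are linear.
    """
    if not a < b <= c < d:
        raise ValueError("expected a < b <= c < d")
    if x <= a or x >= d:
        return 0.0                      # outside the support
    if x < b:
        return (x - a) / (b - a)        # left reference function, non-decreasing
    if x <= c:
        return 1.0                      # flat region
    return (d - x) / (d - c)            # right reference function, non-increasing


# Hypothetical "beginning of the month" interval over days of the month.
grades = [fuzzy_interval_membership(day, 1, 3, 6, 10) for day in (1, 2, 4, 8, 10)]
print(grades)  # [0.0, 0.5, 1.0, 0.5, 0.0]
```

Any other monotone pair of reference functions could be substituted for the two linear branches without changing the rest of the sketch.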
The support of a fuzzy set $A$ within a universal set $E$ is the crisp set that contains all the elements of $E$ that have non-zero membership grades in $A$, that is, $\{x \in E : A(x) > 0\}$. The core of a fuzzy set $A$ within a universal set $E$ is the crisp set that contains all the elements of $E$ having membership grade 1 in $A$, that is, $\{x \in E : A(x) = 1\}$.
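The two definitions can be illustrated on a finite universal set. The sketch below represents a fuzzy set as a dictionary from elements to membership grades. This representation, the function names and the sample grades are assumptions made for illustration only.

```python
def support(fuzzy_set):
    """Crisp set of elements with non-zero membership grade."""
    return {x for x, grade in fuzzy_set.items() if grade > 0}


def core(fuzzy_set):
    """Crisp set of elements with membership grade exactly 1."""
    return {x for x, grade in fuzzy_set.items() if grade == 1}


# Hypothetical fuzzy set over days 1..6 of a month.
days = {1: 0.0, 2: 0.5, 3: 1.0, 4: 1.0, 5: 0.25, 6: 0.0}
print(sorted(support(days)))  # [2, 3, 4, 5]
print(sorted(core(days)))     # [3, 4]
```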
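Set Superimposition
When we overwrite, the overwritten portion looks darker for obvious reasons. The set operation union does not explain this phenomenon. After all,

$$A \cup B = (A - B) \cup (A \cap B) \cup (B - A)$$

and in $(A \cap B)$ the elements are represented once only.
In [13] an operation called superimposition, denoted by $(S)$, was introduced. If $A$ is superimposed over $B$ or $B$ is superimposed over $A$, we have

$$A \,(S)\, B = (A - B) \,(+)\, (A \cap B)^{(2)} \,(+)\, (B - A) \qquad (1)$$

where $(A \cap B)^{(2)}$ are the elements of $(A \cap B)$ represented twice, and $(+)$ represents union of disjoint sets.
To explain this, an example has been taken. If $A = [a_1, b_1]$ and $B = [a_2, b_2]$ are two real intervals such that $A \cap B \neq \emptyset$, we would get a superimposed portion. It can be seen from (1) that

$$[a_1, b_1] \,(S)\, [a_2, b_2] = [a_{(1)}, a_{(2)}) \,(+)\, [a_{(2)}, b_{(1)}]^{(2)} \,(+)\, (b_{(1)}, b_{(2)}] \qquad (2)$$

where $a_{(1)} = \min(a_1, a_2)$, $a_{(2)} = \max(a_1, a_2)$, $b_{(1)} = \min(b_1, b_2)$ and $b_{(2)} = \max(b_1, b_2)$. The identity (2) explains why, if two line segments are superimposed, the common portion looks doubly dark [5]. The identity (2) is called the fundamental identity of superimposition of intervals.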
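The superimposition of two overlapping real intervals can be sketched as follows: the portion covered by both intervals is counted twice and the remaining portions once. The function name, the return format (end points plus a multiplicity) and the sample end points are hypothetical. Open and closed ends are indicated only in the comments.

```python
def superimpose_two(a1, b1, a2, b2):
    """Superimpose [a1, b1] and [a2, b2], which must share at least one point.

    Returns three (left, right, multiplicity) pieces in ascending order.
    """
    left_lo, left_hi = sorted((a1, a2))     # smaller and larger left end points
    right_lo, right_hi = sorted((b1, b2))   # smaller and larger right end points
    if left_hi > right_lo:
        raise ValueError("intervals do not intersect")
    return [
        (left_lo, left_hi, 1),    # half-open piece on the left, represented once
        (left_hi, right_lo, 2),   # closed common portion, represented twice
        (right_lo, right_hi, 1),  # half-open piece on the right, represented once
    ]


print(superimpose_two(1, 5, 3, 8))  # [(1, 3, 1), (3, 5, 2), (5, 8, 1)]
```

When one interval is contained in the other, the same sorting of end points still yields the correct three pieces, which is why the sketch sorts left and right end points separately instead of comparing whole intervals.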
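To explain this for fuzzy intervals, we take the fuzzy interval $[1, 5]^{(1/2)}$ and a second fuzzy interval that begins at 3 and ends beyond 5, each with constant membership value $1/2$, as given in Figure 1 and Figure 2. If we apply superimposition on these intervals, then the superimposed interval will consist of $[1, 3)^{(1/2)}$, $[3, 5]^{(1)}$ and a remaining half-open piece to the right of 5 with membership $1/2$. Here the membership of $[3, 5]$ is $(1)$ due to double representation, and it is shown in Figure 3. In general, for two overlapping intervals,

$$[a_1, b_1]^{(1/2)} \,(S)\, [a_2, b_2]^{(1/2)} = [a_{(1)}, a_{(2)})^{(1/2)} \,(+)\, [a_{(2)}, b_{(1)}]^{(1)} \,(+)\, (b_{(1)}, b_{(2)}]^{(1/2)} \qquad (3)$$

where $a_{(1)} \le a_{(2)}$ are the left end points and $b_{(1)} \le b_{(2)}$ are the right end points arranged in ascending order.
Let $[x_i, y_i]$, $i = 1, 2, \ldots, n$, be $n$ real intervals such that $\bigcap_{i=1}^{n} [x_i, y_i] \neq \emptyset$. Generalizing (3), we get

$$[x_1, y_1]^{(1/n)} \,(S)\, [x_2, y_2]^{(1/n)} \,(S)\, \cdots \,(S)\, [x_n, y_n]^{(1/n)} = [x_{(1)}, x_{(2)})^{(1/n)} \,(+)\, [x_{(2)}, x_{(3)})^{(2/n)} \,(+)\, \cdots \,(+)\, [x_{(r)}, x_{(r+1)})^{(r/n)} \,(+)\, \cdots \,(+)\, [x_{(n)}, y_{(1)}]^{(1)} \,(+)\, (y_{(1)}, y_{(2)}]^{((n-1)/n)} \,(+)\, \cdots \,(+)\, (y_{(n-r)}, y_{(n-r+1)}]^{(r/n)} \,(+)\, \cdots \,(+)\, (y_{(n-1)}, y_{(n)}]^{(1/n)} \qquad (4)$$

In (4), the sequence $\{x_{(i)}\}$ is formed by sorting the sequence $\{x_i\}$ in ascending order of magnitude for $i = 1, 2, \ldots, n$, and similarly $\{y_{(i)}\}$ is formed by sorting the sequence $\{y_i\}$ in ascending order. Although the set superimposition is operated on the closed intervals, it can be extended to operate on the open and the half-open intervals in the trivial way.
Let $X = (X_1, X_2, \ldots, X_n)$ and $Y = (Y_1, Y_2, \ldots, Y_n)$ be two random vectors, and let $(x_1, x_2, \ldots, x_n)$ and $(y_1, y_2, \ldots, y_n)$ be particular realizations of $X$ and $Y$ respectively. Assume that $X_1, X_2, \ldots, X_n$ are identical and independent, and that $Y_1, Y_2, \ldots, Y_n$ are also identical and independent. Let $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$ be the values of $x_1, x_2, \ldots, x_n$, and $y_{(1)}, y_{(2)}, \ldots, y_{(n)}$ be the values of $y_1, y_2, \ldots, y_n$, arranged in ascending order.
For $X$ and $Y$, let the empirical probability distribution functions $\varphi_1(x)$ and $\varphi_2(y)$ be defined as in (5) and (6) respectively. Then, the Glivenko-Cantelli Lemma of order statistics states that the mathematical expectation of the empirical probability distributions would be given by the respective theoretical probability distributions.

$$\varphi_1(x) = \begin{cases} 0, & x < x_{(1)} \\ r/n, & x_{(r)} \le x < x_{(r+1)},\ r = 1, 2, \ldots, n-1 \\ 1, & x \ge x_{(n)} \end{cases} \qquad (5)$$

$$\varphi_2(y) = \begin{cases} 0, & y < y_{(1)} \\ r/n, & y_{(r)} \le y < y_{(r+1)},\ r = 1, 2, \ldots, n-1 \\ 1, & y \ge y_{(n)} \end{cases} \qquad (6)$$

Now, let $X_k$ be random in the interval $[a, b]$ and $Y_k$ be random in the interval $[b, c]$, so that $P_1(a, b, x)$ and $P_2(b, c, y)$ are the probability distribution functions followed by $X_k$ and $Y_k$ respectively. Then in this case the Glivenko-Cantelli Lemma gives

$$E[\varphi_1(x)] = P_1(a, b, x),\ a \le x \le b, \qquad E[\varphi_2(y)] = P_2(b, c, y),\ b \le y \le c \qquad (7)$$

It can be observed that in (4) the membership values of $[x_{(r)}, x_{(r+1)})^{(r/n)}$, $r = 1, 2, \ldots, n-1$, look like the values of the empirical probability distribution function $\varphi_1(x)$, and the membership values of $(y_{(n-r)}, y_{(n-r+1)}]^{(r/n)}$, $r = 1, 2, \ldots, n-1$, look like the values of the empirical complementary probability distribution function or empirical survival function $1 - \varphi_2(y)$.
Clustering of patterns can be done based on their fuzzy time interval associated with yearly patterns using some statistical measure.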
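As an illustration of superimposing several overlapping intervals, the sketch below sorts the left and the right end points separately and assigns membership r/n to the r-th step, assuming each of the n intervals carries an equal weight of 1/n. The resulting step heights rise like an empirical distribution function on the left and fall like an empirical survival function on the right. The function name and the sample day ranges are hypothetical and are not taken from the paper's experiments.

```python
def superimpose_many(intervals):
    """Superimpose n closed intervals (x_i, y_i) that share a common point.

    Returns (left, right, membership) pieces in ascending order, where each
    input interval is assumed to carry membership 1/n.
    """
    n = len(intervals)
    xs = sorted(x for x, _ in intervals)   # left end points in ascending order
    ys = sorted(y for _, y in intervals)   # right end points in ascending order
    if xs[-1] > ys[0]:
        raise ValueError("intervals have no common point")

    pieces = []
    for r in range(1, n):                  # rising steps: r/n between r-th and (r+1)-th left end
        pieces.append((xs[r - 1], xs[r], r / n))
    pieces.append((xs[-1], ys[0], 1.0))    # core: covered by all n intervals
    for r in range(n - 1, 0, -1):          # falling steps: (n-1)/n down to 1/n
        pieces.append((ys[n - r - 1], ys[n - r], r / n))
    return pieces


# Hypothetical day ranges in four different months where an item set is frequent.
monthly_ranges = [(2, 8), (1, 6), (3, 9), (4, 7)]
for left, right, grade in superimpose_many(monthly_ranges):
    print(left, right, grade)
```

For the sample ranges, the pieces step up from 0.25 on days 1 to 2, reach 1.0 on days 4 to 6, and step back down to 0.25 on days 8 to 9.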