Vector-based approaches for computing approximations in multigranulation rough set

Approximation computation is a significant issue when the rough set model is applied. However, few authors have focused on how to calculate the approximations of a multigranulation rough set (MGRS). Herein, the authors clarify the fact that only some of the elements in the universe need to be judged as to whether they belong to the approximations of MGRS. If X is a target concept approximated in MGRS, then an element whose equivalence class does not intersect X need not be judged. Based on this fact, the authors propose a vector-based algorithm to compute approximations in MGRS. The time complexity of the proposed algorithm is O(|X||U|).


Introduction
Pawlak originally proposed rough set theory in the 1980s; it is a powerful mathematical tool for characterising uncertainty by the difference between lower and upper approximations. Rough set theory has been widely used in image processing [1,2], machine learning [3][4][5][6][7][8][9][10], pattern recognition [11][12][13][14][15][16][17][18], data mining, and other relevant areas. However, in many real-world situations, information systems contain several different types of attribute values, e.g. missing, numerical, set-valued, and interval-valued ones. Classical rough set theory cannot be applied directly to such data, which means that it has some theoretical limitations. To overcome these limitations, many extensions have been proposed, such as the covering-based rough set [19], which generalises the rough set from an equivalence relation to a general binary relation, the fuzzy rough set [20], which generalises the rough set from an equivalence relation to a fuzzy relation, and the multigranulation rough set (MGRS) [21].
Pawlak's rough set is constructed from a single equivalence relation, which is too restrictive in many real-life applications where multiple viewpoints are used. In order to extend the application areas of rough set theory, Qian et al. [21] proposed the theory of MGRS, which includes optimistic and pessimistic lower and upper approximations. An MGRS is constructed from a family of attribute sets, which characterise different viewpoints. Approximation computation plays a significant role in applications of MGRS. However, since MGRS was proposed, few authors have focused on designing a fast algorithm to compute its approximations. Hu et al. [22] proposed a matrix-based algorithm for computing the approximations of MGRS that is much more efficient than the naive algorithm. However, their algorithm has a defect that slows it down: all the elements in the universe must participate in the computation process.
In this paper, we clarify the fact that in MGRS, only some of the elements need to be judged as to whether they belong to the approximations. The remaining elements, whose equivalence classes do not intersect the target concept X (∀X ⊆ U), need not be judged. Inspired by this fact, we propose a vector representation of the approximations in MGRS and devise a vector-based fast algorithm for computing them. The time complexity of the algorithm is O(|X||U|). Since the complexity of the matrix-based algorithm is O(|U|²), the time complexity of the proposed algorithm is theoretically lower than that of Hu's algorithm. Experimental evaluation shows that the computation time of our algorithm is less than that of Hu's algorithm both when the size of X increases gradually and when the size of the universe increases gradually.
The rest of this paper is organised as follows. In Section 2, we review several concepts in MGRS. In Section 3, we prove that only the elements related to the target concept X need to be judged and propose a fast algorithm for computing approximations in MGRS, whose time complexity is lower than that of Hu's algorithm. Section 4 reports the experimental evaluation, which compares the efficiency of the two algorithms. The paper ends with conclusions and an outlook for further research in Section 5.

Preliminaries
In this section, we mainly review the concepts of MGRS.

Multigranulation rough sets
In the past decade, many extensions of MGRS have been proposed. Since MGRS is the basic model used here, we review its main results in this section.
Definition 1: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $U = \{x_1, x_2, \ldots, x_n\}$ is a non-empty finite set of objects, called the universe [1]. $A = \{a_1, a_2, \ldots, a_r\}$ is a non-empty finite set of attributes, and each element $A \in AT$ is called an attribute set.
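Definition 2: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$ [21]. The optimistic multigranulation lower and upper approximations of X are denoted by $\underline{\sum_{k=1}^{m} A_k}^{O}(X)$ and $\overline{\sum_{k=1}^{m} A_k}^{O}(X)$, respectively:

$$\underline{\sum_{k=1}^{m} A_k}^{O}(X) = \{x \in U : [x]_{A_1} \subseteq X \lor [x]_{A_2} \subseteq X \lor \cdots \lor [x]_{A_m} \subseteq X\},$$

$$\overline{\sum_{k=1}^{m} A_k}^{O}(X) = {\sim} \underline{\sum_{k=1}^{m} A_k}^{O}({\sim} X),$$

where $[x]_{A_k}$ is the equivalence class of x in terms of the attribute set $A_k$, and $\sim X$ is the complement of the set X.

Theorem 1: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$. […] Since […], where $[x]_{A_k}$ is the equivalence class of x in terms of the attribute set […], this completes the proof. □

Theorem 2: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$ [21]. For the optimistic multigranulation upper approximation of X, we have

$$\overline{\sum_{k=1}^{m} A_k}^{O}(X) = \{x \in U : [x]_{A_1} \cap X \neq \emptyset \land [x]_{A_2} \cap X \neq \emptyset \land \cdots \land [x]_{A_m} \cap X \neq \emptyset\}. \quad (4)$$

Theorem 3: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$. For the optimistic multigranulation upper approximation of X, we have […]

Proof. […]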
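Definition 3: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$ [21]. The pessimistic multigranulation lower and upper approximations of X are denoted by $\underline{\sum_{k=1}^{m} A_k}^{P}(X)$ and $\overline{\sum_{k=1}^{m} A_k}^{P}(X)$, respectively:

$$\underline{\sum_{k=1}^{m} A_k}^{P}(X) = \{x \in U : [x]_{A_1} \subseteq X \land [x]_{A_2} \subseteq X \land \cdots \land [x]_{A_m} \subseteq X\},$$

$$\overline{\sum_{k=1}^{m} A_k}^{P}(X) = {\sim} \underline{\sum_{k=1}^{m} A_k}^{P}({\sim} X),$$

where $[x]_{A_k}$ is the equivalence class of x in terms of the attribute set $A_k$, and $\sim X$ is the complement of the set X.

Theorem 4: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$. […], where $[x]_{A_k}$ is the equivalence class of x in terms of the attribute set […]

Proof. The proof is similar to that of Theorem 1. □

Theorem 5: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$ [21]. For the pessimistic multigranulation upper approximation of X, we have

$$\overline{\sum_{k=1}^{m} A_k}^{P}(X) = \{x \in U : [x]_{A_1} \cap X \neq \emptyset \lor [x]_{A_2} \cap X \neq \emptyset \lor \cdots \lor [x]_{A_m} \cap X \neq \emptyset\}. \quad (9)$$

Theorem 6: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$. For the pessimistic multigranulation upper approximation of X, we have […]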
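The four approximations can be illustrated with a naive, definition-level sketch. It assumes the standard definitions from [21]: x belongs to the optimistic (pessimistic) lower approximation when its equivalence class is contained in X under at least one (every) attribute set, and to the optimistic (pessimistic) upper approximation when its class intersects X under every (at least one) attribute set. The function names, the toy table, and the dictionary layout are hypothetical and are not taken from the paper; this is the baseline that visits every element of U, not the vector-based method.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, Mapping, Sequence, Set, Tuple

Obj = Hashable


def equivalence_classes(
    universe: Iterable[Obj],
    table: Mapping[Obj, Mapping[str, Hashable]],
    attrs: Sequence[str],
) -> Dict[Obj, FrozenSet[Obj]]:
    """Map each object x to [x]_A: the objects that agree with x on all of attrs."""
    blocks: Dict[Tuple, Set[Obj]] = {}
    for x in universe:
        key = tuple(table[x][a] for a in attrs)
        blocks.setdefault(key, set()).add(x)
    return {x: frozenset(block) for block in blocks.values() for x in block}


def naive_mgrs(universe, table, attr_sets, target):
    """Return (optimistic lower, optimistic upper, pessimistic lower, pessimistic upper)."""
    universe = list(universe)
    target = set(target)
    classes = [equivalence_classes(universe, table, attrs) for attrs in attr_sets]
    opt_lower = {x for x in universe if any(c[x] <= target for c in classes)}
    pes_lower = {x for x in universe if all(c[x] <= target for c in classes)}
    opt_upper = {x for x in universe if all(c[x] & target for c in classes)}
    pes_upper = {x for x in universe if any(c[x] & target for c in classes)}
    return opt_lower, opt_upper, pes_lower, pes_upper


# Hypothetical toy information system: six objects, two single-attribute granulations.
UNIVERSE = [1, 2, 3, 4, 5, 6]
TABLE = {
    1: {"a": 0, "b": 0},
    2: {"a": 0, "b": 1},
    3: {"a": 1, "b": 1},
    4: {"a": 1, "b": 0},
    5: {"a": 2, "b": 2},
    6: {"a": 2, "b": 2},
}
GRANULATIONS = [("a",), ("b",)]

if __name__ == "__main__":
    print(naive_mgrs(UNIVERSE, TABLE, GRANULATIONS, {1, 2, 3}))
```

Every object in U is tested against every attribute set here, which is the cost that the rest of the paper aims to avoid.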

Vector-based algorithm for computing approximations in MGRS
According to Hu's approach, all samples must participate in the computation process. However, Theorems 3 and 6 demonstrate that only some of the elements in the universe need to be checked for membership in the approximations. This inspired us to design a more efficient algorithm.
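One way to see why elements unrelated to X can be skipped is sketched below. It is a short derivation under two assumptions: each $A_k$ induces an equivalence relation on U, and upper approximations are characterised as in [21] (the class of x meets X for every k in the optimistic case and for some k in the pessimistic case). It is a reasoning aid and is not necessarily the form in which Theorems 3 and 6 are stated.

```latex
% Assumption: [x]_{A_k} denotes the equivalence class of x under A_k.
\begin{align*}
[x]_{A_k} \cap X \neq \emptyset
  &\iff \exists\, y \in X : y \in [x]_{A_k}
   \iff \exists\, y \in X : x \in [y]_{A_k}, \\
\{x \in U : [x]_{A_k} \cap X \neq \emptyset\}
  &= \bigcup_{y \in X} [y]_{A_k}, \\
\overline{\sum_{k=1}^{m} A_k}^{O}(X)
  &= \bigcap_{k=1}^{m} \, \bigcup_{y \in X} [y]_{A_k}, \qquad
\overline{\sum_{k=1}^{m} A_k}^{P}(X)
   = \bigcup_{k=1}^{m} \, \bigcup_{y \in X} [y]_{A_k}, \\
[x]_{A_k} \subseteq X
  &\implies x \in X \quad \text{(because } x \in [x]_{A_k}\text{)}.
\end{align*}
```

Both upper approximations are therefore built only from the classes of elements of X, and both lower approximations are subsets of X, so an element whose classes never meet X cannot appear in any of the four sets.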
The essential step in computing the approximations of MGRS is to judge whether an equivalence class, say $[x]_{A_k}$, is contained in the target concept X. In addition to set operations, there is a more efficient way, which is introduced in [23].
Proof. This corollary can be easily obtained from Lemma 1. □

Example 1: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, as shown in Table 1. […]

Theorem 7: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$. For the optimistic multigranulation upper approximation of X, we have […]

Proof. […] and then $\forall x, y \in X$ […]. From the way we choose x and y, we can easily infer that […] □

Lemma 2: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$. For the pessimistic multigranulation upper approximation of X, we have […]

Proof. This lemma can be easily obtained from Theorem 6. □

By Theorem 7 and Lemma 2, we can propose a new approach to compute the upper approximations of pessimistic MGRS (PMGRS) and optimistic MGRS (OMGRS).

Definition 5: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$. The upper approximation character set of X can be calculated as […]

Corollary 2: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$. The optimistic and pessimistic upper approximations can be calculated by […]

Proof. This corollary can be easily obtained from Theorem 7 and Lemma 2. □

Example 2 (Continuation of Example 1): From Table 1, we have that […]

By Definition 5, […]

Proof. This lemma can be easily obtained from Theorem 1. □

Lemma 4: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$. $\forall X \subseteq U$, we have […]

Proof. This lemma can be easily obtained from Theorem 4. □

Definition 6: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$. The lower approximation character set of X can be calculated as […], $x \in X$, $\forall k = 1, 2, \ldots, m$.

Corollary 3: Let $IS = \langle U, AT, V_{AT}, f \rangle$ be an information system, where $A_k \in AT$ for any $k \in \{1, 2, \ldots, m\}$, and $\forall X \subseteq U$. The optimistic and pessimistic lower approximations can be calculated by […]

Proof. This corollary can be easily obtained from Lemma 3 and Lemma 4. □

Example 3 (Continuation of Example 2): By Definition 6, […] By Corollary 3, […]

Algorithm 1 (see Fig. 1) is a vector-based algorithm for computing the lower and upper approximations of optimistic and pessimistic MGRS, based on the discussion above. The total time complexity of the algorithm is O(|X||U|). Steps 3-6 calculate $I_{A_k}^{L}$ and $I_{A_k}^{U}$ ($k \in \{1, 2, \ldots, m\}$), with time complexity O(|X||U|); steps 17-22 compute the approximations of MGRS, with time complexity O(|U|). Since the time complexity of the matrix-based algorithm is O(|U|²) and, in general, |X| ≪ |U|, Algorithm 1 (Fig. 1) is more efficient than the matrix-based algorithm.
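The O(|X||U|) bound can be made concrete with a minimal sketch that loops only over the elements of X. It is an illustration under stated assumptions, not a transcription of Algorithm 1 (Fig. 1): the Boolean vectors `lower` and `upper` are hypothetical stand-ins for per-granulation characteristic vectors, rows are tuples of attribute values, attribute sets are tuples of column indices, and comparing two rows on one attribute set is treated as a constant-time operation.

```python
from typing import Hashable, Iterable, List, Sequence, Tuple

Row = Sequence[Hashable]


def characteristic_vectors(
    rows: Sequence[Row], cols: Sequence[int], in_x: Sequence[bool]
) -> Tuple[List[bool], List[bool]]:
    """For one attribute set, mark lower and upper membership using only x in X."""
    n = len(rows)
    keys = [tuple(row[c] for c in cols) for row in rows]
    lower = list(in_x)        # only elements of X can be in a lower approximation
    upper = [False] * n       # set to True when some class [x], x in X, reaches j
    for i in (i for i in range(n) if in_x[i]):    # |X| iterations
        for j in range(n):                        # |U| iterations
            if keys[j] == keys[i]:
                upper[j] = True                   # [x_j] intersects X
                if not in_x[j]:
                    lower[i] = False              # [x_i] is not contained in X
    return lower, upper


def vector_mgrs(
    rows: Sequence[Row], attr_sets: Sequence[Sequence[int]], target: Iterable[int]
):
    """Return Boolean vectors (opt_lower, opt_upper, pes_lower, pes_upper); needs m >= 1."""
    n = len(rows)
    in_x = [False] * n
    for i in target:
        in_x[i] = True
    vectors = [characteristic_vectors(rows, cols, in_x) for cols in attr_sets]
    opt_lower = [any(low[i] for low, _ in vectors) for i in range(n)]
    pes_lower = [all(low[i] for low, _ in vectors) for i in range(n)]
    opt_upper = [all(up[i] for _, up in vectors) for i in range(n)]
    pes_upper = [any(up[i] for _, up in vectors) for i in range(n)]
    return opt_lower, opt_upper, pes_lower, pes_upper


# Hypothetical toy data: six objects, two attributes, one granulation per attribute.
ROWS = [(0, 0), (0, 1), (1, 1), (1, 0), (2, 2), (2, 2)]
ATTR_SETS = [(0,), (1,)]

if __name__ == "__main__":
    for name, vec in zip(
        ("opt_lower", "opt_upper", "pes_lower", "pes_upper"),
        vector_mgrs(ROWS, ATTR_SETS, [0, 1, 2]),
    ):
        print(name, vec)
```

The double loop runs |X|·|U| times per attribute set, and the final combination is linear in |U|, which matches the shape of the complexity argument above when the number of attribute sets m is treated as a constant.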

Experimental evaluations
In this section, several experiments are conducted to verify the validity of the proposed vector-based algorithm. We selected six data sets, which are described in Table 2. All the experiments were carried out on a personal computer with Windows 10, an Intel(R) Core(TM) i7-6700HQ @ 2.6 GHz, and 8 GB of memory. The algorithms were implemented in Matlab R2015b. First, since the time complexity of Algorithm 1 (Fig. 1) is O(|X||U|) and that of the matrix-based algorithm is O(|U|²), six groups of experiments were conducted to compare Algorithm 1 (Fig. 1) with the matrix-based algorithm while the size of the target concept X was gradually increased in steps of 10% of the size of U; the elements of X were selected completely at random. Fig. 2 shows that the computation time of Algorithm 1 (Fig. 1) is less than that of the matrix-based algorithm for the same cardinality of the target concept X. As the size of X increases, the computation time of Algorithm 1 (Fig. 1) grows, while that of Hu's algorithm remains almost unchanged. According to Fig. 2, Algorithm 1 (Fig. 1) is more efficient than the matrix-based algorithm in computing the approximations of MGRS as the cardinality of the target concept X gradually increases.
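The first protocol (growing X in steps of 10% of |U| with random selection) can be mimicked by a small timing harness. This is a hedged sketch rather than the authors' Matlab scripts: the function names, the fixed seed, and the `compute` callback (any function that takes a list of target indices) are assumptions introduced for illustration.

```python
import random
import time
from typing import Callable, List, Sequence, Tuple


def target_sizes(universe_size: int, steps: int = 10) -> List[int]:
    """Sizes of X at 10%, 20%, ..., 100% of the universe (for steps=10)."""
    return [universe_size * s // steps for s in range(1, steps + 1)]


def time_over_targets(
    universe_size: int,
    compute: Callable[[Sequence[int]], object],
    seed: int = 0,
) -> List[Tuple[int, float]]:
    """Time compute(target) for randomly drawn targets of increasing size."""
    rng = random.Random(seed)           # fixed seed so runs are repeatable
    indices = list(range(universe_size))
    timings = []
    for size in target_sizes(universe_size):
        target = rng.sample(indices, size)   # random selection without replacement
        start = time.perf_counter()
        compute(target)
        timings.append((size, time.perf_counter() - start))
    return timings


if __name__ == "__main__":
    # Placeholder workload; replace with a call to an approximation routine.
    for size, seconds in time_over_targets(1000, lambda target: sorted(target)):
        print(size, f"{seconds:.6f}")
```

Timing a single call per size is noisy; repeating each measurement and reporting the mean or minimum would give steadier curves.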
Second, six groups of experiments were conducted to compare Algorithm 1 (Fig. 1) with the matrix-based algorithm while the size of the universe was gradually increased in steps of 10% of the size of U; the samples of U and X were selected completely at random. The size of X is one third of the current universe. Fig. 3 shows that, as the size of U increases, the computation times of both Algorithm 1 (Fig. 1) and the matrix-based algorithm grow. According to Fig. 3, Algorithm 1 (Fig. 1) is more efficient than the matrix-based algorithm in computing the approximations of MGRS as the cardinality of the universe gradually increases.

Conclusions
In this paper, we have clarified the fact that only some of the elements need to be judged as to whether they belong to the approximations in MGRS, and have proposed a vector-based algorithm for computing the approximations of MGRS. In the future, we will focus on updating the approximations of MGRS when a granular structure or a sample is added or deleted, using the approaches verified here.