Pattern Analysis and Intelligent Systems
Somayeh Lotfi; Mohammad Ghasemzadeh; Mehran Mohsenzadeh; Mitra Mirzarezaee
Volume 7, Issue 1, February 2021, Pages 55-66
Abstract
The decision tree is a popular method for learning and reasoning through recursive partitioning of the data space. To choose the best attribute in the case of numerical features, the partitioning criteria must either be calculated for individual values, or the value range of each attribute must be divided into two or more intervals using a set of cut points. When partitioning the range of an attribute, fuzzy partitioning can be used to reduce sensitivity to noise in the data and to increase the stability of decision trees. Because tree-building algorithms need to keep the whole training dataset in main memory, they are subject to memory restrictions. In this paper, we present an algorithm that builds a fuzzy decision tree on a large dataset. To avoid storing the entire training dataset in main memory and to overcome the memory limitation, the algorithm builds decision trees (DTs) incrementally. In the discretization stage, a fuzzy partition is generated on each continuous attribute based on fuzzy entropy. Then, to select the best feature for each branch, two criteria are used: fuzzy information gain and the occurrence matrix. Real datasets are used to evaluate the behavior of the algorithm in terms of classification accuracy, decision tree complexity, and execution time. The results show that the proposed algorithm, without needing to store the entire dataset in memory and while reducing the complexity of the tree, is able to overcome the memory limitation and strike a balance between accuracy and complexity.
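As an illustration of the fuzzy information gain criterion mentioned in the abstract, the following is a minimal sketch and not the authors' algorithm. It assumes a strong triangular fuzzy partition defined by a hypothetical list of peak points (`peaks`), and it accumulates per-class membership mass in a single pass over the records, so the dataset need not be held in memory. The function names and the choice of triangular membership functions are assumptions made for this example.

```python
import math
from collections import defaultdict


def memberships(x, peaks):
    """Membership degrees of x in a strong triangular fuzzy partition.

    `peaks` is an increasing list of points; fuzzy set i peaks at peaks[i].
    The degrees always sum to 1 (assumption: strong partition).
    """
    mu = [0.0] * len(peaks)
    if x <= peaks[0]:
        mu[0] = 1.0
    elif x >= peaks[-1]:
        mu[-1] = 1.0
    else:
        for i in range(len(peaks) - 1):
            lo, hi = peaks[i], peaks[i + 1]
            if lo <= x <= hi:
                t = (x - lo) / (hi - lo)
                mu[i], mu[i + 1] = 1.0 - t, t
                break
    return mu


def entropy(class_mass):
    """Shannon entropy (base 2) of a mapping from class label to mass."""
    total = sum(class_mass.values())
    if total == 0:
        return 0.0
    h = 0.0
    for mass in class_mass.values():
        if mass > 0:
            p = mass / total
            h -= p * math.log2(p)
    return h


def fuzzy_information_gain(records, peaks):
    """Fuzzy information gain of one numeric attribute.

    `records` is any iterable of (value, label) pairs. It is consumed once,
    and only the accumulated class masses are kept.
    """
    overall = defaultdict(float)
    per_set = [defaultdict(float) for _ in peaks]
    for value, label in records:
        overall[label] += 1.0
        for j, mu in enumerate(memberships(value, peaks)):
            if mu > 0:
                per_set[j][label] += mu
    n = sum(overall.values())
    if n == 0:
        return 0.0
    weighted = sum(sum(m.values()) / n * entropy(m) for m in per_set)
    return entropy(overall) - weighted
```

Because only the class-mass accumulators are updated per record, `records` can be a generator that reads the data in chunks from disk; an attribute whose fuzzy sets separate the classes well yields a higher gain.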
Computer Networks and Distributed Systems
Kobra Bagheri; Mehran Mohsenzadeh
Volume 2, Issue 3, August 2016, Pages 27-34
Abstract
Data grids are an important branch of grid computing that provide mechanisms for managing large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems has traditionally focused on performance improvements driven by the demands of client applications in the scientific and business domains. High energy consumption in computer systems limits their performance because of increased carbon dioxide emissions and higher electricity bills. Thus, the design goal of computer systems has shifted to power and energy efficiency. Data grids can serve large-scale applications that require a large amount of data. Data replication is a common solution for improving availability and file access time in such environments; it replicates a data file at many different sites. In this paper, a new data replication method is proposed that is not only data-aware but also energy-efficient. Simulation results with CLOUDSIM show that the proposed method achieves better energy consumption, average response time, and network usage than other algorithms and prevents the unnecessary creation of replicas, which leads to efficient storage usage.
Zahra Sheikhnajdy; Mehran Mohsenzadeh; Mashalah Abbasi Dezfuli
Volume 1, Issue 1, February 2015, Pages 29-36
Abstract
Schema matching is a critical step in many applications, such as data warehouse loading, Online Analytical Processing (OLAP), data mining, the semantic web [2], and schema integration. The task is to find the semantic correspondences between elements of two schemas. Recently, schema matching has attracted considerable interest in both research and practice. In this paper, we present a new, improved solution to the schema matching problem. We introduce an improved hybrid semantic schema matching algorithm that semi-automatically finds matches between two data representation schemas. The algorithm finds mappings based on the hierarchical organization of the elements of the WordNet term dictionary.
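To make the idea of matching through a term hierarchy concrete, here is a small sketch under stated assumptions; it is not the algorithm from the paper. It uses a tiny hand-written taxonomy (`PARENT`) as a stand-in for WordNet's hypernym hierarchy and the well-known Wu-Palmer measure as one possible hierarchy-based similarity. The function names and the threshold are hypothetical, and the output is a list of candidate matches that a user would still confirm, in keeping with a semi-automatic approach.

```python
# Toy taxonomy standing in for WordNet's hypernym hierarchy (assumption):
# each term maps to its parent; "entity" is the root.
PARENT = {
    "person": "entity",
    "customer": "person",
    "client": "person",
    "location": "entity",
    "address": "location",
    "city": "location",
}


def path_to_root(term):
    """Return the list [term, parent, ..., root]."""
    path = [term]
    while term in PARENT:
        term = PARENT[term]
        path.append(term)
    return path


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    # The first ancestor of a that is also an ancestor of b is the
    # least common subsumer (lcs).
    lcs = next((t for t in path_a if t in ancestors_b), None)
    if lcs is None:
        return 0.0
    depth_lcs = len(path_to_root(lcs))
    return 2.0 * depth_lcs / (len(path_a) + len(path_b))


def propose_matches(schema_a, schema_b, threshold=0.6):
    """Suggest, for each element of schema_a, its most similar element of
    schema_b, keeping only pairs whose score reaches the threshold."""
    proposals = []
    for a in schema_a:
        best = max(schema_b, key=lambda b: wu_palmer(a, b), default=None)
        if best is not None and wu_palmer(a, best) >= threshold:
            proposals.append((a, best, wu_palmer(a, best)))
    return proposals
```

With the toy taxonomy, `propose_matches(["customer", "address"], ["client", "city"])` pairs `customer` with `client` and `address` with `city`, since each pair shares a parent below the root. Real schema element names would first need tokenizing and mapping to dictionary terms, which this sketch omits.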