Document Type: Original Research Paper

Authors

Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran

Abstract

High-performance computing and vast storage are two key requirements for executing data-intensive applications. Compared with traditional distributed systems such as data grids, cloud computing provides both on a more affordable, scalable, and elastic platform. Access to data files is critical when running such applications; at times it becomes a bottleneck for the whole cloud workflow system and dramatically degrades performance. Job scheduling and data replication are two important techniques that can enhance the performance of data-intensive applications, and it is sensible to integrate them into one framework that pursues a single objective. In this paper, we integrate data replication and job scheduling with the aim of reducing response time by reducing data access time in a cloud computing environment. We call this approach data replication-based scheduling (DRBS). Simulation results show the effectiveness of our algorithm in comparison with well-known algorithms such as random and round-robin.
