Review

AN OVERVIEW OF HADOOP JOB SCHEDULING ALGORITHMS FOR BIG DATA

Year 2022, Volume: 8 Issue: 2, 38 - 48, 30.12.2022
https://doi.org/10.22531/muglajsci.1124422

Abstract

Big data systems have developed rapidly over the last decade. The most important factor in achieving high performance in big data systems is job scheduling. Some scheduling challenges need more attention before they can be resolved. Appropriate scheduling is required to achieve higher performance when processing big data. Apache Hadoop is most widely used to manage very large data volumes efficiently, and it is also competent at addressing problems related to job scheduling. To improve the performance of big data systems, we analyzed various Hadoop job scheduling algorithms in depth. This paper offers a good background for gaining a general idea of scheduling algorithms. It presents a general perspective on the basic architecture of the Hadoop big data framework, on job scheduling, and on its problems. It then examines and compares the most important and fundamental Hadoop job scheduling algorithms. In addition, the paper reviews other improved algorithms. Its primary aim is to present an overview of various scheduling algorithms for increasing performance when analyzing big data. This study also gives researchers appropriate guidance on job scheduling algorithms according to their needs.

AN OVERVIEW OF HADOOP JOB SCHEDULING ALGORITHMS FOR BIG DATA

Abstract

Big data systems have advanced rapidly over the last several decades. Job scheduling is the most significant element in attaining high performance in big data systems, and several of its challenges still demand close attention. Proper scheduling is required to obtain higher performance when processing big data. Apache Hadoop is the framework most commonly used to manage immense data volumes efficiently, and it is also proficient in handling the issues associated with job scheduling. To improve the performance of big data systems, we analyzed various Hadoop job scheduling algorithms in depth. This paper provides a thorough background for readers who want an overall picture of scheduling algorithms. It gives an overview of the fundamental architecture of the Hadoop big data framework, of job scheduling, and of its issues, then reviews and compares the most important and fundamental Hadoop job scheduling algorithms. In addition, it reviews other improved algorithms. The primary objective is to present an overview of various scheduling algorithms for improving performance when analyzing big data. This study also guides researchers toward a suitable job scheduling algorithm according to the characteristics that matter most to them.

Details

Primary Language English
Subjects Engineering
Journal Section Journals
Authors

Akhtari Zameel 0000-0001-7215-0559

Ahmet Zengin 0000-0003-0384-4148

Early Pub Date November 2, 2022
Publication Date December 30, 2022
Published in Issue Year 2022 Volume: 8 Issue: 2

Cite

APA Zameel, A., & Zengin, A. (2022). AN OVERVIEW OF HADOOP JOB SCHEDULING ALGORITHMS FOR BIG DATA. Mugla Journal of Science and Technology, 8(2), 38-48. https://doi.org/10.22531/muglajsci.1124422
AMA Zameel A, Zengin A. AN OVERVIEW OF HADOOP JOB SCHEDULING ALGORITHMS FOR BIG DATA. MJST. December 2022;8(2):38-48. doi:10.22531/muglajsci.1124422
Chicago Zameel, Akhtari, and Ahmet Zengin. “AN OVERVIEW OF HADOOP JOB SCHEDULING ALGORITHMS FOR BIG DATA”. Mugla Journal of Science and Technology 8, no. 2 (December 2022): 38-48. https://doi.org/10.22531/muglajsci.1124422.
EndNote Zameel A, Zengin A (December 1, 2022) AN OVERVIEW OF HADOOP JOB SCHEDULING ALGORITHMS FOR BIG DATA. Mugla Journal of Science and Technology 8 2 38–48.
IEEE A. Zameel and A. Zengin, “AN OVERVIEW OF HADOOP JOB SCHEDULING ALGORITHMS FOR BIG DATA”, MJST, vol. 8, no. 2, pp. 38–48, 2022, doi: 10.22531/muglajsci.1124422.
ISNAD Zameel, Akhtari - Zengin, Ahmet. “AN OVERVIEW OF HADOOP JOB SCHEDULING ALGORITHMS FOR BIG DATA”. Mugla Journal of Science and Technology 8/2 (December 2022), 38-48. https://doi.org/10.22531/muglajsci.1124422.
JAMA Zameel A, Zengin A. AN OVERVIEW OF HADOOP JOB SCHEDULING ALGORITHMS FOR BIG DATA. MJST. 2022;8:38–48.
MLA Zameel, Akhtari and Ahmet Zengin. “AN OVERVIEW OF HADOOP JOB SCHEDULING ALGORITHMS FOR BIG DATA”. Mugla Journal of Science and Technology, vol. 8, no. 2, 2022, pp. 38-48, doi:10.22531/muglajsci.1124422.
Vancouver Zameel A, Zengin A. AN OVERVIEW OF HADOOP JOB SCHEDULING ALGORITHMS FOR BIG DATA. MJST. 2022;8(2):38-4.

Mugla Journal of Science and Technology (MJST) is licensed under the Creative Commons Attribution-Noncommercial-Pseudonymity License 4.0 International.