Abstract
Distributed computing has achieved broad adoption because it automatically parallelizes tasks and executes them transparently in distributed environments. Straggling tasks are a key challenge faced by all big data processing frameworks, for example MapReduce [3], Dryad [4], and Spark [5]. Stragglers are tasks that run much slower than other tasks, and since a job completes only when its last task completes, stragglers delay job completion. The literature survey reviews the straggler detection and rescheduling techniques proposed so far and points out their strengths and shortcomings. This thesis presents a systematic and organized study of these techniques, together with the notable characteristics and limitations of the existing state-of-the-art algorithms for straggler mitigation. The survey shows that most of the algorithms fail to reschedule stragglers efficiently. Naively, one might expect straggler handling to be a simple matter of duplicating tasks that are sufficiently slow. In reality it is a complex problem for several reasons. First, speculative tasks are not free: they compete with other running tasks for certain resources, such as the network. Second, choosing the node on which to run a speculative task is as important as choosing the task. Third, in a heterogeneous environment it may be difficult to distinguish nodes that are only marginally slower than the mean from stragglers. Finally, stragglers should be identified as early as possible. The proposed framework uses a mobile agent approach for rescheduling, because an agent can resume execution on another machine from exactly the point at which it left off on the earlier machine. The implementation and results show that the proposed work is efficient and improves the overall performance of a big data processing framework.
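The definition of a straggler given above can be illustrated with a minimal Python sketch. This is not the mobile agent framework proposed in this work. It only shows why one slow task delays a whole job and how a naive detector might flag tasks whose progress rate falls well below the mean. The function names, the progress-rate metric, and the 0.5 threshold are all assumptions made for illustration.

```python
from statistics import mean


def job_completion_time(task_times):
    """A job finishes only when its slowest task finishes."""
    return max(task_times)


def find_stragglers(progress_rates, slowdown_threshold=0.5):
    """Return the IDs of tasks whose progress rate is below a fraction of the mean.

    progress_rates maps a task ID to its progress per second (hypothetical metric).
    slowdown_threshold is an illustrative cutoff, not a value taken from the thesis.
    """
    average_rate = mean(progress_rates.values())
    return sorted(
        task_id
        for task_id, rate in progress_rates.items()
        if rate < slowdown_threshold * average_rate
    )


# Hypothetical task durations in seconds; t4 is far slower than its peers.
task_times = {"t1": 10.0, "t2": 11.0, "t3": 12.0, "t4": 40.0}

# Approximate each task's progress rate as the inverse of its duration.
rates = {task_id: 1.0 / seconds for task_id, seconds in task_times.items()}

print(job_completion_time(task_times.values()))  # the job takes as long as t4
print(find_stragglers(rates))                    # only t4 is flagged
```

A fixed fraction of the mean is the simplest possible rule. As the abstract notes, in a heterogeneous environment such a rule can confuse nodes that are only marginally slow with true stragglers, and it ignores both the cost of speculative copies and the choice of node to run them on.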
References
2. Chris Eaton, Dirk deRoos, Tom Deutsch, George Lapis, and Paul Zikopoulos. "Understanding Big Data", (2011).
3. J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters", Commun. ACM, 51:107-113, (2008).
4. Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, Dennis Fetterly, "Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks", EuroSys’07, 424, 175-308, (2007).
5. Spark homepage: http://www.spark-project.org
6. M. Zaharia, A. Konwinski, A. D. Joseph, R. Katz, and I. Stoica. "Improving MapReduce Performance in Heterogeneous Environments". In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, OSDI’08, pages 29–42, 2008.
7. Ganesh Ananthanarayanan, Ali Ghodsi, Scott Shenker, Ion Stoica, "Effective Straggler Mitigation: Attack of the Clones", In USENIX NSDI, 2012.
8. G. Ananthanarayanan, S. Kandula, A. Greenberg, I. Stoica, E. Harris, and B. Saha. "Reining in the Outliers in Map-Reduce Clusters using Mantri". In USENIX OSDI, 2010.
9. Rohan Gandhi, Amit Sabne, "Finding Stragglers in Hadoop", In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, OSDI’, 32, 425-443, (2008).
10. Freeman, L. C., "Finding Stragglers in Parallel Computation", ACM, 1, 215-239, (2009).
11. Carl W. Olofson, Randy Perry, "IDC Analyze the future", White Paper, 104, 36-41, (2011).
12. Guimerà, R., Sales-Pardo, M., Amaral, L. A. N., "Search Engine Architectures from Conventional to P2P", Physical Review E, 70, 025101, (2012).
13. Hadoop Native Scheduler, Hadoop http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/NativeLibraries.html
14. Neil Raden, Hired Brains, "Big Data Analytics Architecture", Physical Review E, 70, 056131, 20 July (2012).