Abstract
This paper investigates the task of scheduling jobs across several servers in a software system similar to the Enterprise Desktop Grid. A distinguishing feature of this system is its specific area of application: it collects outbound hyperlinks for a given set of websites. The target set is scanned continuously and regularly at certain time intervals. Data obtained from previous scans are used to construct the next scanning task, with the aim of improving efficiency (in this case, shortening the scanning time). A mathematical model for minimizing the scanning time in batch mode is constructed, and an approximate algorithm for solving the model is proposed. A series of experiments is carried out in a real software system. The experimental results make it possible to compare the proposed batch mode with the well-known round-robin mode, revealing the advantages and disadvantages of the batch mode.
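The abstract does not specify the approximate algorithm itself, so the following is only a hedged sketch of the general idea: assigning scan jobs round-robin ignores job durations, whereas a batch assignment can use durations estimated from the previous scan to balance server loads. The sketch uses the classic LPT (longest processing time first) heuristic for makespan minimization as a stand-in for the paper's algorithm; the per-site durations, server count, and function names are hypothetical.

```python
import heapq

def round_robin(durations, n_servers):
    """Assign jobs to servers cyclically, ignoring job durations."""
    loads = [0.0] * n_servers
    for i, d in enumerate(durations):
        loads[i % n_servers] += d
    return max(loads)  # makespan: time until the slowest server finishes

def batch_lpt(durations, n_servers):
    """LPT heuristic: sort jobs by estimated duration (longest first)
    and always give the next job to the currently least loaded server."""
    loads = [0.0] * n_servers
    heapq.heapify(loads)
    for d in sorted(durations, reverse=True):
        least = heapq.heappop(loads)
        heapq.heappush(loads, least + d)
    return max(loads)

# Hypothetical per-site scan times, e.g. measured during the previous scan
durations = [9.0, 7.5, 7.0, 3.0, 2.5, 2.0, 1.0]
print(round_robin(durations, 3))  # 13.0
print(batch_lpt(durations, 3))    # 11.0
```

On this toy input the duration-aware batch assignment finishes sooner than round-robin; LPT is a well-known approximation for makespan minimization with a worst-case ratio of 4/3 − 1/(3m) for m servers, though the paper's own algorithm and model may differ.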