Web crawling

Author(s)

    • Olston, Christopher
    • Najork, Marc

Bibliographic Information

Web crawling

Christopher Olston, Marc Najork

(Foundations and trends in information retrieval, 4:3)

Now Publishers, c2010

Note

"This book is originally published as Foundations and trends in information retrieval, volume 4, issue 3 (2010), ISSN: 1554-0669"--Back cover

Includes bibliographical references (p. 67-74)

Description and Table of Contents

Description

This is a survey of the science and practice of web crawling. While at first glance web crawling may appear to be merely an application of breadth-first search, it in fact poses many challenges, ranging from systems concerns, such as managing very large data structures, to theoretical questions, such as how often to revisit evolving content sources. This survey outlines the fundamental challenges and describes the state-of-the-art models and solutions. It also highlights avenues for future work.

Table of Contents

1: Introduction
2: Crawler Architecture
3: Crawl Ordering Problem
4: Batch Crawl Ordering
5: Incremental Crawl Ordering
6: Avoiding Problematic and Undesirable Content
7: Deep Web Crawling
8: Future Directions
References

by "Nielsen BookData"

Details

  • NCID
    BB17240167
  • ISBN
    9781601983220
  • Country Code
    us
  • Title Language Code
    eng
  • Text Language Code
    eng
  • Place of Publication
    Hanover, Mass.
  • Pages/Volumes
    ix, 74 p.
  • Size
    24 cm
  • Parent Bibliography ID