S. Abiteboul, G. Cobena, J. Masanès, and G. Sedrati, A First Experience in Archiving the French Web, Proceedings of Research and Advanced Technology for Digital Libraries: 6th European Conference, Italy, 2002.
DOI : 10.1007/3-540-45747-X_1

B. Andersen, The DK-domain: in words and figures, 2005.

M. Ashenfelder, Web Harvesting and Streaming Media, Proceedings of the 6th International Web Archiving Workshop, 2006.

R. Baeza-Yates, C. Castillo, and V. Lopez, Characteristics of the Web of Spain, Cybermetrics, vol.9, 2005.

R. Baeza-Yates, C. Castillo, M. Marin, and A. Rodriguez, Crawling a country, Special interest tracks and posters of the 14th international conference on World Wide Web, WWW '05, 2005.
DOI : 10.1145/1062745.1062768

N. Baly and F. Sauvin, Archiving Streaming Media on the Web, Proof of Concept and First Results, Proceedings of the 6th International Web Archiving Workshop, 2006.

S. Brin and L. Page, The anatomy of a large-scale hypertextual Web search engine, Computer Networks and ISDN Systems, pp.107-117, 1998.
DOI : 10.1016/S0169-7552(98)00110-X

Dailymotion, Dailymotion - Partagez vos vidéos, 2008.

D. Gomes and M. Silva, Characterizing a national community web, ACM Transactions on Internet Technology, vol.5, issue.3, pp.508-531, 2005.
DOI : 10.1145/1084772.1084775

Heritrix, Heritrix Home Page, 2008.

G. Illien, S. Aubry, Y. Hafri, and F. Lasfargues, Sketching and checking quality for web archives: a first stage report from BnF. Bibliothèque nationale de France, 2006.

G. Illien, Web archiving at BnF, International Preservation News, pp.27-34, 2006.

M. Kimpton, M. Braggs, and J. Ubois, Year-by-Year: From an Archive of the Internet to an Archive on the Internet, 2006.
DOI : 10.1007/978-3-540-46332-0_9

P. Koerbin, Report on the crawl and Harvest of the Whole Australian Web Domain Undertaken during, 2005.

P. Koerbin, The Australian Web domain harvests: a preliminary quantitative analysis of the archive data. National Library of Australia, Canberra, 2008.

J. Masanès, Towards Continuous Web Archiving, D-Lib Magazine, vol.8, issue.12, 2002.
DOI : 10.1045/december2002-masanes

J. Masanès, Selection for Web Archives, 2006.
DOI : 10.1007/978-3-540-46332-0_3

G. Mohr, M. Kimpton, M. Stack, and I. Ranitovic, Introduction to Heritrix, an archival quality Web crawler, Paper presented at the 4th International Web Archiving Workshop, 2004.

M. Najork and J. L. Wiener, Breadth-First Search Crawling Yields High-Quality Pages, Proceedings of the 10th international conference on World Wide Web, pp.114-118, 2001.

Y. Sun, Z. Zhuang, I. Councill, and C. L. Giles, Determining Bias to Search Engines from Robots.txt, IEEE/WIC/ACM International Conference on Web Intelligence (WI'07), pp.149-155, 2007.
DOI : 10.1109/WI.2007.98

Y. Sun, Z. Zhuang, and C. L. Giles, A large-scale study of robots.txt, Proceedings of the 16th international conference on World Wide Web, WWW '07, pp.1123-1124, 2007.
DOI : 10.1145/1242572.1242726