Top Domains by Extracted Triples for Extractor html-mf-species



This page lists the top domains that use the Microformats species format, based on the November 2013 extraction of the Web Data Commons project. Domains are ordered by the number of triples extracted from their pages in the crawl corpus, from most to fewest.


  1. westmidlandbirdclub.com (18,674 triples)
  2. preen.com (8,684 triples)
  3. thefullwiki.org (5,657 triples)
  4. wikipedia.org (5,575 triples)
  5. blogspot.com (3,017 triples)
  6. territorioscuola.com (2,957 triples)
  7. wiktionary.org (1,693 triples)
  8. bbc.co.uk (1,020 triples)
  9. wn.com (940 triples)
  10. wikia.com (907 triples)
  11. wikidoc.org (843 triples)
  12. eol.org (617 triples)
  13. citizendium.org (407 triples)
  14. mashpedia.com (364 triples)
  15. wordpress.com (243 triples)
  16. webs.com (234 triples)
  17. theplantencyclopedia.org (214 triples)
  18. snaturou2000.sk (100 triples)
  19. tanijaya.com (98 triples)
  20. wikimedia.org (87 triples)
  21. 7seas.ca (75 triples)
  22. xingyimax.com (73 triples)
  23. esacademic.com (63 triples)
  24. eoearth.org (48 triples)
  25. pictures-of-cats.org (47 triples)
  26. pigsonthewing.org.uk (35 triples)
  27. banjaristi.web.id (33 triples)
  28. readtiger.com (26 triples)
  29. indahcraft.net (26 triples)
  30. findaplant.co.nz (26 triples)
  31. goo.ne.jp (21 triples)
  32. answers.com (21 triples)
  33. qesign.com (19 triples)
  34. yolasite.com (17 triples)
  35. yahoo.com (17 triples)
  36. sina.com.tw (17 triples)
  37. drchanshealinginstitute.com (17 triples)
  38. orange.es (16 triples)
  39. index.hr (16 triples)
  40. clubpenguinwiki.info (16 triples)
  41. science20.com (16 triples)
  42. bettavillage.com (16 triples)
  43. misterbulldog.net (14 triples)
  44. academic.ru (13 triples)
  45. fanbox.com (11 triples)
  46. debate.org (11 triples)
  47. heroku.com (10 triples)
  48. ebay.com (10 triples)
  49. tumblr.com (8 triples)
  50. kodoom.com (8 triples)
  51. blogspot.com.ar (7 triples)
  52. obathepatitis.info (7 triples)
  53. encydia.com (6 triples)
  54. veterinariosvs.org (5 triples)
  55. mex.tl (5 triples)
  56. zikkir.net (4 triples)
  57. etceter.com (4 triples)
  58. blogspot.mx (2 triples)
  59. blogfa.com (2 triples)
  60. cafemom.com (2 triples)
  61. sensagent.com (2 triples)
  62. wikibooks.org (2 triples)
  63. mysite.com (2 triples)
  64. blogspot.hu (2 triples)
  65. elpais.com (1 triple)
  66. blogspot.com.es (1 triple)
  67. blogspot.co.uk (1 triple)
  68. lonestarball.com (1 triple)
  69. obolog.com (1 triple)
  70. blogcindario.com (1 triple)
  71. hotdog.hu (1 triple)
  72. scmp.com (1 triple)
  73. hayatnotu.com (1 triple)
  74. nhinsider.com (1 triple)
  75. germanshepherdkingdom.com (1 triple)
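
For readers who want to work with this ranking programmatically, the following is a minimal sketch of how the list could be parsed and its ordering checked. It assumes the "rank. domain (N triples)" line layout shown above. The names `parse_ranking` and `is_sorted_by_triples` are hypothetical helpers written for this illustration and are not part of any Web Data Commons tooling.

```python
import re

# Assumed line layout: "  12. eol.org (617 triples)".
# Counts may contain thousands separators, and the unit may be "triple" or "triples".
LINE_RE = re.compile(r"^\s*(\d+)\.\s+(\S+)\s+\(([\d,]+)\s+triples?\)\s*$")


def parse_ranking(text):
    """Return a list of (rank, domain, triple_count) tuples; skip non-matching lines."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            rank, domain, count = match.groups()
            rows.append((int(rank), domain, int(count.replace(",", ""))))
    return rows


def is_sorted_by_triples(rows):
    """Check that triple counts never increase from one row to the next."""
    counts = [count for _, _, count in rows]
    return all(a >= b for a, b in zip(counts, counts[1:]))


# A three-line excerpt of the list above, used as sample input.
sample = """\
  1. westmidlandbirdclub.com (18,674 triples)
  2. preen.com (8,684 triples)
  12. eol.org (617 triples)
"""

rows = parse_ranking(sample)
total = sum(count for _, _, count in rows)
```

Here `rows` holds the parsed entries and `total` sums their triple counts. The same functions could be applied to the full list to confirm that it is in descending order.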