Top Domains by Extracted Triples for Extractor html-mf-species


This page lists the top domains using Microformats species in the November 2017 extraction of the Web Data Commons project. Domains are ordered by the number of triples found in the crawl corpus.


  1. wikipedia.org (653,315 triples)
  2. preen.com (28,100 triples)
  3. blogspot.com (13,704 triples)
  4. antwiki.org (12,407 triples)
  5. hitchhikersgui.de (8,557 triples)
  6. wikimedia.org (6,034 triples)
  7. thefullwiki.org (5,905 triples)
  8. wiktionary.org (5,830 triples)
  9. wikidoc.org (2,511 triples)
  10. wikivisually.com (1,915 triples)
  11. wordpress.com (1,604 triples)
  12. wikien4.appspot.com (1,231 triples)
  13. mashpedia.com (1,085 triples)
  14. insect-collection.com (805 triples)
  15. marefa.org (554 triples)
  16. zipcodezoo.com (525 triples)
  17. like2do.com (438 triples)
  18. everipedia.org (427 triples)
  19. portadelaidewiki.org.au (391 triples)
  20. omicsgroup.org (372 triples)
  21. 7seas.ca (367 triples)
  22. yooooo.us (366 triples)
  23. blogspot.sg (326 triples)
  24. misterbulldog.net (297 triples)
  25. tfode.com (291 triples)
  26. copro.com.ar (239 triples)
  27. dezinsekcija.net (234 triples)
  28. blogspot.ie (222 triples)
  29. enacademic.com (219 triples)
  30. blogspot.rs (214 triples)
  31. yourna.com (212 triples)
  32. 720p.fr (210 triples)
  33. blogspot.com.br (201 triples)
  34. blogspot.gr (196 triples)
  35. blogspot.ca (189 triples)
  36. yolasite.com (169 triples)
  37. wiki2.org (168 triples)
  38. blogspot.co.uk (164 triples)
  39. academic.ru (156 triples)
  40. biotadofuturo.com.br (154 triples)
  41. kidzsearch.com (150 triples)
  42. alohafarms.net (140 triples)
  43. evergreen.edu (138 triples)
  44. nosterprobiotics.com (126 triples)
  45. wikipediaaudio.com (122 triples)
  46. sinoxnursery.com (119 triples)
  47. doodlemeister.com (115 triples)
  48. lycaeum.org (111 triples)
  49. snaturou2000.sk (100 triples)
  50. altervista.org (100 triples)
  51. deepseafishingsandiego.net (95 triples)
  52. wikien3.appspot.com (90 triples)
  53. newdrugapprovals.org (85 triples)
  54. kiddle.co (85 triples)
  55. nurseryseedlings.co (80 triples)
  56. shahroodoffroad.ir (80 triples)
  57. wikitrans.net (79 triples)
  58. wikiomni.com (76 triples)
  59. blogspot.in (71 triples)
  60. rocklandsbirdsanctuary.info (69 triples)
  61. pesonasumba.com (69 triples)
  62. partcommunity.com (68 triples)
  63. racerocks.ca (64 triples)
  64. beingsearch.com (63 triples)
  65. ipfs.io (59 triples)
  66. pixnet.net (59 triples)
  67. infogalactic.com (59 triples)
  68. wikipedia.gr (56 triples)
  69. kaktus.id (54 triples)
  70. blogspot.my (53 triples)
  71. oiseaux.net (52 triples)
  72. th.ai (52 triples)
  73. yovla.com (51 triples)
  74. plantascarnivoras.com.br (51 triples)
  75. know.cf (49 triples)
  76. biyologlar.com (47 triples)
  77. westwoodpavillion.com (46 triples)
  78. wmflabs.org (45 triples)
  79. cafemom.com (44 triples)
  80. 100ke.info (43 triples)
  81. century-hvac.com (34 triples)
  82. bareinfo.com (33 triples)
  83. blogspot.co.id (33 triples)
  84. lineblog.me (32 triples)
  85. bioxsine.com.pl (30 triples)
  86. blogspot.com.es (30 triples)
  87. demoi.info (29 triples)
  88. kfd.me (27 triples)
  89. explore-science-beyond-the-classroom.com (24 triples)
  90. meddic.jp (23 triples)
  91. mannaismayaadventure.com (23 triples)
  92. animalha.com (23 triples)
  93. webs.com (23 triples)
  94. sciencealcove.com (23 triples)
  95. admicos.cf (23 triples)
  96. newsnfo.co.uk (23 triples)
  97. dailygaggle.com (23 triples)
  98. weblaboratorium.hu (22 triples)
  99. e-monsite.com (22 triples)
  100. blogspot.fr (21 triples)
  101. simplebooklet.com (20 triples)
  102. jsppharma.com (20 triples)
  103. 0wikipedia.org (20 triples)
  104. wikimatn.ir (20 triples)
  105. muahaaa.com (20 triples)
  106. isgezond.nl (20 triples)
  107. mihanblog.com (20 triples)
  108. pise.cz (19 triples)
  109. huatuo.org (18 triples)
  110. itmt.ir (17 triples)
  111. blogspot.ru (17 triples)
  112. distributorpupuktanaman.com (17 triples)
  113. pihattcoffee.com (17 triples)
  114. iftfishing.com (17 triples)
  115. babou-plongee.com (17 triples)
  116. hp-lexicon.org (17 triples)
  117. wikia.com (17 triples)
  118. ctesthetic.com (17 triples)
  119. sapo.pt (17 triples)
  120. jhoona.com (16 triples)
  121. your-dreams-coming-true.com (16 triples)
  122. vkioupi.com (16 triples)
  123. pantanodeelche.es (16 triples)
  124. metchosinmarine.ca (16 triples)
  125. floridaisnature.com (16 triples)
  126. animalstime.com (16 triples)
  127. abundanceofgood.com (16 triples)
  128. travelmerida.com (15 triples)
  129. gatsgarden.com (14 triples)
  130. seashellshop.com (14 triples)
  131. cumbresblogs.com (13 triples)
  132. gen22.net (13 triples)
  133. healthvalleysupplements.com (13 triples)
  134. uncyclopedia.co (13 triples)
  135. blogspot.com.au (13 triples)
  136. ning.com (13 triples)
  137. 21quotes.com (13 triples)
  138. debate.org (11 triples)
  139. bloggen.be (10 triples)
  140. bbc.co.uk (10 triples)
  141. jeniuscaraalkitab.com (10 triples)
  142. popflock.com (10 triples)
  143. pamisnice.com (10 triples)
  144. gpedia.com (9 triples)
  145. mekonginfo.org (9 triples)
  146. hupont.hu (9 triples)
  147. jakearchibald.com (8 triples)
  148. hidupsimpel.com (7 triples)
  149. isnare.com (6 triples)
  150. srokkhmer.org (5 triples)
  151. blogspot.be (5 triples)
  152. happyvideonetwork.com (5 triples)
  153. maniakucing.com (5 triples)
  154. blogspot.no (5 triples)
  155. blogia.com (5 triples)
  156. silichip.org (5 triples)
  157. grabduck.com (5 triples)
  158. montagneaperte.it (5 triples)
  159. judydykstrabrown.com (5 triples)
  160. akairan.com (5 triples)
  161. tr.gg (4 triples)
  162. onlypet.ir (4 triples)
  163. blogg.org (4 triples)
  164. rochakfacts.com (4 triples)
  165. puppyfinder.com (4 triples)
  166. eol.org (4 triples)
  167. blogspot.mx (3 triples)
  168. petsmag.ro (3 triples)
  169. nguontinviet.com (3 triples)
  170. superyachtcuisine.com (2 triples)
  171. concepts.org (2 triples)
  172. blogspot.cl (2 triples)
  173. esacademic.com (2 triples)
  174. learn-barmaga.com (2 triples)
  175. thenarrowgateweb.com (2 triples)
  176. blogspot.it (2 triples)
  177. tatoott1009.com (2 triples)
  178. blogspot.hu (2 triples)
  179. sunitjotravel.com (2 triples)
  180. pictures-of-cats.org (2 triples)
  181. ahlamountada.com (2 triples)
  182. havanesebreeders.org (2 triples)
  183. gotoknow.org (2 triples)
  184. sibelatasoy.com (1 triple)
  185. naturelium.com (1 triple)
  186. forumsline.com (1 triple)
  187. sussle.org (1 triple)
  188. mex.tl (1 triple)
  189. webgarden.es (1 triple)
  190. aded.co.za (1 triple)
  191. worldbirds.eu (1 triple)
  192. defaultlogic.com (1 triple)
  193. forodominicana.com (1 triple)
  194. ardiyansyah.com (1 triple)
  195. newnesh.com (1 triple)
  196. forumotion.com (1 triple)
  197. answersbd.com (1 triple)
  198. typepad.com (1 triple)
  199. veterinaryknowledge.com (1 triple)
  200. rashal.com (1 triple)
  201. yoo7.com (1 triple)
  202. duchessnduke.com (1 triple)
  203. crear-foros.com (1 triple)
  204. ahlamontada.com (1 triple)
  205. ciclidos-mexico.com (1 triple)
  206. wowcity.com (1 triple)
  207. speakingtree.in (1 triple)
  208. adailydiary.com (1 triple)
  209. turkaramamotoru.com (1 triple)
  210. blogspot.jp (1 triple)
  211. kidzfeed.com (1 triple)
  212. tarantulaspiders.com (1 triple)
  213. blogcu.com (1 triple)
  214. supervivencia-y-naturaleza.com (1 triple)
  215. toppost.co (1 triple)
  216. pillarsbooks.com (1 triple)
  217. blogspot.tw (1 triple)
  218. eoldal.hu (1 triple)
  219. designreplace.com (1 triple)
  220. obolog.es (1 triple)
  221. goodforum.net (1 triple)
  222. fullblog.com.ar (1 triple)
  223. robotpirateninja.com (1 triple)
  224. inside-peru.com (1 triple)