Talk:Data dump torrents: Difference between revisions

From Meta, a Wikimedia project coordination wiki

Revision as of 20:08, 23 November 2016

I think much of all the wikimedia effort should be toward uploading torrent xml

I tried downloading it without a torrent, and the download broke so many times that I had to start over again and again. Just a piece of advice! MahdiTheGuidedOne (talk) 22:57, 11 July 2012 (UTC)

Different tracker for enwiki-20120902-pages-articles.xml.bz2

Burnbit's tracker has been erratic, and right now their entire website is down. I've been thinking of setting up an additional tracker (I had some spare resources available to run one) and that downtime pushed me to give it a go. It's at tracker.lbft.net:6969 and it's restricted to authorised torrents (currently only enwiki-20120902-pages-articles.xml.bz2).

If anyone has any objection, or if I've stepped on anybody's toes by doing this, feel free to replace the torrent I created and linked with a different one (and I'll stop seeding mine and start seeding yours). If I've done anything wrong with the torrent or tracker, please let me know.

In case anything happens to my tracker (which I don't anticipate), the torrent will continue working: it's web seeded from dumps.wikimedia.org like the Burnbit torrents, it has OpenBitTorrent's tracker as a backup, and modern BitTorrent clients do reasonably well on DHT alone.

-- lbft (talk) 10:00, 3 September 2012 (UTC)
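
For readers curious how a torrent can carry a web seed and a backup tracker at the same time, here is a minimal sketch in Python. It is not how the torrent above was actually built. It assumes the usual metainfo keys (`announce`, `announce-list` for tracker tiers, `url-list` for web seeds); the function names, the piece length, the stand-in payload and the web seed URL path are all illustrative assumptions. A real tool would hash the dump from disk in pieces rather than hold it in memory.

```python
import hashlib


def bencode(value):
    """Encode ints, strings/bytes, lists and dicts in BitTorrent's bencoding."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def build_metainfo(name, data, piece_length, trackers, web_seeds):
    """Return the bytes of a single-file .torrent for `data` (illustrative helper)."""
    # One 20-byte SHA-1 digest per piece, concatenated.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    info = {
        "name": name,
        "length": len(data),
        "piece length": piece_length,
        "pieces": pieces,
    }
    meta = {
        "announce": trackers[0],                   # primary tracker
        "announce-list": [[t] for t in trackers],  # one tier per tracker
        "url-list": web_seeds,                     # HTTP web seeds
        "info": info,
    }
    return bencode(meta)


if __name__ == "__main__":
    payload = b"stand-in bytes, not a real dump" * 1000
    torrent = build_metainfo(
        name="enwiki-20120902-pages-articles.xml.bz2",
        data=payload,
        piece_length=16384,  # assumed; real dumps would use larger pieces
        trackers=[
            "http://tracker.lbft.net:6969/announce",
            "udp://tracker.openbittorrent.com:80",
        ],
        # Assumed path layout on the dumps server.
        web_seeds=["http://dumps.wikimedia.org/enwiki/20120902/"
                   "enwiki-20120902-pages-articles.xml.bz2"],
    )
    print(len(torrent), "bytes of metainfo")
```

If every tracker is unreachable, a client can still fetch pieces from the `url-list` entry or find peers over DHT, which is the fallback described above.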

Updated dewiki, frwiki, itwiki and nlwiki torrents

Since people seem to be using that enwiki torrent and burnbit's still down, I've also added torrents for the latest dumps of the other wikis on the page (dewiki-20120829-pages-articles.xml.bz2, frwiki-20120827-pages-articles.xml.bz2, itwiki-20120831-pages-articles.xml.bz2.torrent and nlwiki-20120824-pages-articles.xml.bz2.torrent). I was able to find the original burnbit torrent files for these, so I just added my tracker and OpenBitTorrent's to them. That means that people who already had those torrents should at least be able to see peers in the new torrents over DHT.

If you were downloading one of those torrents and you don't see any seeds/peers, make sure you enable DHT in your client and, if your client supports it, add http://tracker.lbft.net:6969/announce and udp://tracker.openbittorrent.com:80 to the list of trackers - or just remove the torrent from your client, download and add the torrent file again and tell it to verify the existing file.

--lbft (talk) 09:34, 4 September 2012 (UTC)
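
The reason peers from the old and new torrents can find each other is that adding trackers only touches the outer metainfo dictionary, while the info-hash is computed from the `info` dictionary alone. The following Python sketch illustrates that under stated assumptions: the helper names are made up for this example, the sample torrent is a tiny stand-in, and re-encoding `info` reproduces the original hash only when the source file was canonically bencoded (sorted keys), which is the normal case.

```python
import hashlib


def bdecode(data, pos=0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            result[key], pos = bdecode(data, pos)
        return result, pos + 1
    # Otherwise it is a byte string: <length>:<bytes>
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length


def bencode(value):
    """Encode ints, bytes, lists and dicts (bytes keys) back to bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        return b"d" + b"".join(
            bencode(k) + bencode(value[k]) for k in sorted(value)) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def info_hash(torrent_bytes):
    """SHA-1 of the bencoded info dictionary, as a hex string."""
    meta, _ = bdecode(torrent_bytes)
    return hashlib.sha1(bencode(meta[b"info"])).hexdigest()


def add_trackers(torrent_bytes, extra_trackers):
    """Return new torrent bytes with extra trackers appended as new tiers."""
    meta, _ = bdecode(torrent_bytes)
    tiers = meta.get(b"announce-list")
    if not tiers:
        tiers = [[meta[b"announce"]]] if b"announce" in meta else []
    known = {tracker for tier in tiers for tracker in tier}
    for tracker in extra_trackers:
        if tracker not in known:
            tiers.append([tracker])
            known.add(tracker)
    meta[b"announce-list"] = tiers
    if b"announce" not in meta and tiers:
        meta[b"announce"] = tiers[0][0]
    return bencode(meta)


if __name__ == "__main__":
    # Tiny stand-in torrent with a placeholder tracker, not a real Burnbit file.
    original = bencode({
        b"announce": b"http://tracker.example.invalid/announce",
        b"info": {b"length": 3, b"name": b"demo.bz2", b"piece length": 16384,
                  b"pieces": hashlib.sha1(b"abc").digest()},
    })
    updated = add_trackers(original, [
        b"http://tracker.lbft.net:6969/announce",
        b"udp://tracker.openbittorrent.com:80",
    ])
    print(info_hash(original) == info_hash(updated))
```

Because the info-hash is unchanged, a client that re-adds the updated torrent file and verifies the existing data should join the same swarm.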

Thanks. Will try that! :) Jesse V. (talk) 03:38, 16 September 2012 (UTC)

Burnbit's back up again, both their website and their tracker, but I'm not entirely comfortable relying solely on them - so with the latest dump, nlwiki-20120913-pages-articles.xml.bz2, I've added my tracker and OpenBitTorrent to their torrent. If you prefer the original Burnbit torrent, it's linked too. --lbft (talk) 00:53, 18 September 2012 (UTC)

frwiki-20130104-pages-articles.xml.bz2

This torrent is web seeded from dumps.wikimedia.your.org (an official dumps mirror) only, since BurnBit must've got a corrupted download from dumps.wikimedia.org for this torrent. The md5sums file shows that the file should have an MD5 hash of 7602a371059be5dfb5e10e05d9211736, but the file in the torrent BurnBit built from dumps.wikimedia.org has a hash of 75b2fe4dfd0146c2b7c38f3c2cf491a6 instead.

Basically, the frwiki-20130104-pages-articles.xml.bz2 torrent linked on Data dump torrents is fine, but if you go stick the dumps.wikimedia.org URL into BurnBit's website the torrent it sends you to is broken. -- lbft (talk) 08:07, 6 January 2013 (UTC)
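
Anyone who wants to run the same check on a downloaded copy can compare the file's MD5 against the value from the md5sums file. A minimal Python sketch follows; the function names and the local file path are assumptions for illustration, and only the expected hash comes from the comment above.

```python
import hashlib

# Expected value quoted above from the md5sums file.
EXPECTED_MD5 = "7602a371059be5dfb5e10e05d9211736"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a multi-gigabyte dump never sits in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True if the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Hypothetical local path to the downloaded dump.
    path = "frwiki-20130104-pages-articles.xml.bz2"
    print("OK" if matches_expected(path, EXPECTED_MD5) else "MISMATCH")
```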

Alternatives to burnbit.com?

Is there a safe alternative to burnbit? I ask because burnbit is (a) failing to process the latest enwiki dump, (b) getting multiple spam windows past my McAfee Antivirus protection and causing my Firefox to crash. Could the WMF host the torrent files that go with these dumps? -- John of Reading (talk) 12:31, 8 February 2015 (UTC)

Kiwix uses MirrorBrain. You could send a patch to enable such a system on dumps.wikimedia.org. If you are interested, you can probably copy something from Kiwix and ask Kelson if something is missing.[1] --Nemo 08:10, 9 February 2015 (UTC)
Well, I followed your links but I'm not much wiser! Are you saying that MirrorBrain is already used by one WMF project, so could reasonably be extended/adapted/redeployed (??) to do another task? I'm not competent to send a patch myself. Where could I propose that WMF take this on? -- John of Reading (talk) 16:27, 9 February 2015 (UTC)
I'm saying that MirrorBrain is used by some Wikimedia folks, but not by WMF. This was already proposed to WMF, at bugzilla:27653. If someone offered to do the work, I expect WMF would accept.
Otherwise, maybe there are other such services, but I don't know. Sorry, this is all I know. --Nemo 17:00, 9 February 2015 (UTC)
OK, thank you. I'll keep watching this page in case someone has better luck with the latest enwiki dump. -- John of Reading (talk) 18:08, 9 February 2015 (UTC)

Request for enwiki pages-meta-current

Would it be possible to add enwiki-20160305-pages-meta-current.xml.bz2 (and future files) to this page as well? Thanks! GoingBatty (talk) 12:52, 8 March 2016 (UTC)

@GoingBatty: Apparently not. When I tried it, burnbit.com responded immediately with "Sorry! but that file is too big". Perhaps there's another site out there that accepts bigger torrents? -- John of Reading (talk) 18:00, 8 March 2016 (UTC)

Burnbit doesn't support https

All the URLs at //dumps.wikimedia.org/enwiki/20160407/ are HTTPS, which burnbit.com refuses to accept: "The URL given is invalid! Only HTTP urls are supported by most of the torrent clients." Is this a recent change by dumps.wikimedia.org or by burnbit? Is there a workaround? -- John of Reading (talk) 13:03, 16 April 2016 (UTC)

(More) Apparently the workaround is to add ".your" in the middle of the domain name, http://dumps.wikimedia.your.org - though burnbit took over a week to process the file. -- John of Reading (talk) 05:42, 28 April 2016 (UTC)
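
As a small illustration of that workaround, the rewrite from an HTTPS dumps URL to the plain-HTTP mirror can be sketched in Python. The function name is made up for this example, and it assumes the mirror uses the same path layout as dumps.wikimedia.org, which the comment above does not confirm for every file.

```python
from urllib.parse import urlsplit, urlunsplit

MAIN_HOST = "dumps.wikimedia.org"
MIRROR_HOST = "dumps.wikimedia.your.org"  # mirror named in the comment above


def to_http_mirror(url):
    """Rewrite a dumps.wikimedia.org URL to the HTTP mirror; leave others alone."""
    parts = urlsplit(url)
    if parts.hostname != MAIN_HOST:
        return url
    # Assumption: the mirror keeps the same path as the main site.
    return urlunsplit(("http", MIRROR_HOST, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(to_http_mirror("https://dumps.wikimedia.org/enwiki/20160407/"))
```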

2016-04-07 data dump

I am using the data dumps to measure the size of Wikipedia over time. Because of a compilation error, this data dump is approximately 10% smaller than it should be, so I will ignore it for my purposes. Johnny Au (talk) 12:44, 5 May 2016 (UTC)

EnWiki

I can't seem to access the most recent dump. Can anyone assist?

400 Lux (talk) 15:49, 23 July 2016 (UTC)

Someone has created a torrent for the 20160720 dump, and I've added the link here. The original torrent for the 20160701 dump was corrupt and has been deleted. I've asked Burnbit to create a new one and will add it here if it works properly this time. -- John of Reading (talk) 16:48, 23 July 2016 (UTC)
The most recent torrents can take a few days to complete. Get the 2nd most recent torrent. Chuckr30 (talk) 14:37, 21 October 2016 (UTC)

Burnbit is apparently dead.

According to this, burnbit.com has been down for more than a month. Almost all of the links on this page are dead. Brightgalrs (talk) 20:41, 3 October 2016 (UTC)

Yes, burnbit is dead, and so is Mononova.org. I just tried both. TBP sucks as it does a popup every 1-2 minutes. Where else can I upload Wikipedia torrents? https://newtorrentzeu.com/ doesn't seem to support uploading torrents. Chuckr30 (talk) 14:33, 21 October 2016 (UTC)

ruwiki-20150806-pages-articles-multistream.xml.bz2

Does anyone have this dump (https://web.archive.org/web/20160312080904/http://burnbit.com/torrent/397549/ruwiki_20150806_pages_articles_multistream_xml_bz2)? Ivan386 (talk) 01:43, 19 November 2016 (UTC)

I have ruwiki-20150806-pages-articles4.xml.bz2, but the latest dumps are https://dumps.wikimedia.org/ruwiki/20161120/ --AVRS (talk) 20:08, 23 November 2016 (UTC)