MediaWiki talk:Spam-blacklist/archives/November 2013


Proposed additions

epnrstatus.co.in


  • Regularly added to articles relating to Indian Railways, often by replacing legitimate links in the articles with the spamlink (both external links and reflinks, with official Indian Railways links being the prime target; sample diffs: [1],[2], [3], [4], [5]). The spamlinks are added by IP-users who all geolocate to India, using a new IP each time. Thomas.W talk to me 10:57, 22 August 2013 (UTC)
 Done OhNoitsJamie Talk 22:17, 6 November 2013 (UTC)

www.historyofnations.net

historyofnations.net: This site uses Wikipedia content without attribution and is placed into the External links section of about 100 'History of ...' articles. Abductive (reasoning) 20:40, 27 August 2013 (UTC)

 Done OhNoitsJamie Talk 16:12, 15 November 2013 (UTC)

thewebminer.com

 Done OhNoitsJamie Talk 19:50, 5 November 2013 (UTC)

scrape4me.com

 Done OhNoitsJamie Talk 19:54, 5 November 2013 (UTC)

500music.com

 Done OhNoitsJamie Talk 16:11, 15 November 2013 (UTC)

Kaun Banega Crorepati spam links

[I am relisting the following entry which was archived without discussion or action. Scammers are continuing to add these links so please consider adding them to the blacklist. —Psychonaut (talk) 16:18, 10 September 2013 (UTC)]

The above websites have been repeatedly spamlinked from Kaun Banega Crorepati (an Indian version of Who Wants To Be A Millionaire?). They all falsely claim to be official KBC websites, or else the users inserting links to them falsely claim or imply that they are official KBC websites. As far as I can tell the websites are operated by scammers trying to trick members of the public into paying fees (via PayPal) to register as a contestant. See MediaWiki talk:Spam-blacklist/archives/November 2012#AdSense pub-6522157377920590 for a previous report. Some or all of these sites are already on XLinkBot's revert list, but some particularly persistent spammers are using autoconfirmed accounts (e.g., Neel12mani (talk · contribs)) to insert the links. Note that the blogspot domains exist on many TLDs (.in, .de, .com, etc.). —Psychonaut (talk) 17:52, 26 June 2013 (UTC)

 Done OhNoitsJamie Talk 16:10, 15 November 2013 (UTC)

lespaulstore.com

lespaulstore.com: Just simple spam, repeatedly being added by IP editors. Wait--it geolocates to Hungary and that rings a bell, like I've been here before. Drmies (talk) 03:04, 9 October 2013 (UTC)

 Done, I haven't really looked into the problematic edits myself and defer to your judgement, but lespaulstore.com certainly is spammed persistently enough to warrant the blacklist entry. Amalthea 08:39, 4 November 2013 (UTC)

adf.ly

Resolved

adf.ly: Personally I'm surprised this isn't here already. There may be some reason I'm not aware of why it's not blacklisted, but just in case it should be, I'm bringing it up here. For those unfamiliar, it's a URL shortener similar to bit.ly or TinyURL, only it displays an interstitial advertisement before linking to the site, and gives the person who shortened the link a cut of the proceeds. It serves a good purpose, but there'd be no legitimate reason to link to it here, other than maybe if it ever got its own article. flarn2006 [u t c] time: 04:52, 9 October 2013 (UTC)

It is already blocked globally at meta:Spam blacklist. Amalthea 09:19, 9 October 2013 (UTC)

tulpa.info


Diffs:

And so on and so forth. IsaacAA (talk) 11:57, 18 October 2013 (UTC)

 Done OhNoitsJamie Talk 16:13, 15 November 2013 (UTC)

recordninja.com


Repeatedly added by an IP user, who has been repeatedly warned and then blocked. Still being added as of today. A new user account has also begun adding the link, and it is clearly a sock of the IP. Link is constantly added to the Background check article, and is a link to a commercial background check website.--Dmol (talk) 09:42, 4 November 2013 (UTC)

 Done. OhNoitsJamie Talk 20:55, 6 November 2013 (UTC)

adfoc.us

A link-shortening site that is being used by spammers, such as here. Not only that, it's one that generates revenue for every click of the link. As a result, this site has two reasons to be blacklisted, as it hides the true link (allowing circumvention of the blacklist) and it also encourages the less salubrious of people to go and spam their link everywhere. Lukeno94 (tell Luke off here) 16:08, 6 November 2013 (UTC)

 Done Thanks for catching that; I saw that it was being used quite obnoxiously on India cinema articles. OhNoitsJamie Talk 17:55, 6 November 2013 (UTC)

hardeshsharma.blogspot.in

A site that reposts software testing articles. This is quite common with bloggers. The problem is that no attribution is given to the original blogger. This is a copyright violation. The IP who has been adding links to articles on the site doesn't seem to want to discuss. They have been removed for now, but I seem to recall that the IP has changed on occasion. The most recent IP's additions can be found at Special:Contributions/115.248.233.202 Walter Görlitz (talk) 00:50, 7 November 2013 (UTC)

 Done Reaper Eternal (talk) 11:34, 11 November 2013 (UTC)
Please adjust to hardeshsharma.blogspot.* since apparently hardeshsharma.blogspot.com is the same location. Walter Görlitz (talk) 17:11, 11 November 2013 (UTC)
Please make that *hardeshsharma.blogspot.* as the user is adding www.hardeshsharma.blogspot.com now. Walter Görlitz (talk) 15:11, 15 November 2013 (UTC)
 Done - Added .com. OhNoitsJamie Talk 15:47, 15 November 2013 (UTC)
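
The two adjustment requests above can be checked with a small sketch. The pattern below is hypothetical (it is not the entry that was actually added) and uses Python's `re` module as a stand-in for the blacklist's regex engine. It suggests that a leading `\b` already covers the `www.` prefix, because the dot before the blog name is a non-word character, so no leading wildcard should be needed.

```python
import re

# Hypothetical entry: the blog name on any blogspot TLD.
# The leading \b requires a word boundary before "hardeshsharma".
entry = re.compile(r"\bhardeshsharma\.blogspot\.", re.IGNORECASE)

def is_blocked(url):
    """Return True if the hypothetical entry matches the URL."""
    return entry.search(url) is not None

samples = [
    "http://hardeshsharma.blogspot.in/2013/11/post.html",
    "http://www.hardeshsharma.blogspot.com/",
    "http://hardeshsharma.blogspot.de/",
    "http://otherhardeshsharma.blogspot.com/",  # no boundary before the name
    "http://example.blogspot.com/",
]

for url in samples:
    print(url, is_blocked(url))
```

The sample URLs are made up for illustration; only the host names mentioned in the thread come from the discussion.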

short-biography.com

Not sure what happened to this blacklisting request. There's been continued spamming since, so requesting again. --Ronz (talk) 19:31, 14 November 2013 (UTC)
 Done OhNoitsJamie Talk 15:52, 15 November 2013 (UTC)

Malformed entries

I believe a few of these entries are missing the "b" part of the leading "\b":

  • \freegovernmentcellphones4u\.com\b
  • \securityguardtraining-hq\.com\b
  • \thekoreanroyal\.(?:com|org)\b

RobinHood70 talk 06:07, 26 August 2013 (UTC)

You're correct. As of right now, they don't work at all. Jackmcbarn (talk) 21:39, 4 September 2013 (UTC)
Note: I would do this (the actual edit itself seems simple) but I can't work out how to log it - failure to do this seems to attract penalties. Just as a matter of interest: I assume that as regular expressions, \f \s and \t are taken to be some character other than f s and t. I think that \t represents the tab character, - what might the other two be? --Redrose64 (talk) 21:15, 10 September 2013 (UTC)
\s matches a whitespace character, and \f matches a form-feed character. — Mr. Stradivarius ♪ talk ♪ 22:12, 10 September 2013 (UTC)

 Done, hopefully I haven't messed anything up — Martin (MSGJ · talk) 09:59, 16 September 2013 (UTC)
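
A short sketch of why the malformed entries never matched, using Python's `re` module as a stand-in (the blacklist itself runs on a different regex engine, but `\f`, `\s`, `\t` and `\b` mean the same thing in both). The "fixed" patterns assume the repair is simply restoring the missing `b`, and the sample URLs are made up.

```python
import re

# Entries as listed: without the "b", the engine reads "\f", "\s" and "\t"
# as form feed, whitespace and tab, followed by the rest of the name.
malformed = [
    r"\freegovernmentcellphones4u\.com\b",
    r"\securityguardtraining-hq\.com\b",
    r"\thekoreanroyal\.(?:com|org)\b",
]

# Assumed repair: restore the "b" so each entry starts with a word boundary.
fixed = [
    r"\bfreegovernmentcellphones4u\.com\b",
    r"\bsecurityguardtraining-hq\.com\b",
    r"\bthekoreanroyal\.(?:com|org)\b",
]

# Hypothetical sample URLs for the three domains.
urls = [
    "http://freegovernmentcellphones4u.com/",
    "http://www.securityguardtraining-hq.com/courses",
    "http://thekoreanroyal.org/",
]

def hits(patterns, url):
    """Return the patterns that match somewhere in the URL."""
    return [p for p in patterns if re.search(p, url)]

for url in urls:
    print(url, "malformed:", len(hits(malformed, url)), "fixed:", len(hits(fixed, url)))
```

The first malformed entry would only match a form-feed character followed by "reegovernmentcellphones4u.com", which does not occur in a real URL.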

thebestknifesharpenerguide.com

The site and a few other knife sharpening sites have been spamming http://en.wikipedia.org/wiki/Knife_sharpening. I undid the edits in this diff: http://en.wikipedia.org/w/index.php?title=Knife_sharpening&diff=581155583&oldid=577001352

 Not done Thanks for removing the spam, but since it has only happened once so far, I'm not yet ready to blacklist those sites. If editors re-insert the spam, then feel free to re-request blacklisting. Cheers! Reaper Eternal (talk) 13:49, 11 November 2013 (UTC)

hipromtech.com


 Done OhNoitsJamie Talk 20:45, 16 November 2013 (UTC)

SGcafe.com

sgcafe.com: An unreliable source that is growing in popularity (a little above 50 articles used it in July, and now over 100 I believe). Discussed briefly at the Reliable sources board and was considered unreliable. It is a blog with four writers and no mention of an editorial department. Translations of interviews cannot be confirmed for accuracy or source [sgcafe.com/2013/06/creator-metal-gear-series-hideo-kojima-metal-gears-solid-v-global-phenomenon/ 1], uses other blogs as references [sgcafe.com/2013/09/male-anime-fans-characters-look-like-engage-compensated-dating/ 2], and violates copyright restrictions [sgcafe.com/2013/09/kokonoe-will-playable-blazblue-chrono-phantasma-ps3-dlc-character/ 3]. DragonZero (Talk · Contribs) 20:52, 14 September 2013 (UTC)

Additionally it is being used for references and sourcing for completely unrelated items seemingly solely to drive up numbers to the site for certain editors. A popular site but not being used constructively by multiple editors. Canterbury Tail talk 11:39, 15 September 2013 (UTC)
 Done OhNoitsJamie Talk 01:05, 16 November 2013 (UTC)

glamchika.com


A series of dynamic IPs has been posting in a number of articles about the Indian film industry. Rarely has a single IP done more than a handful of articles. (at least in the instances that I have seen). -- TRPoD aka The Red Pen of Doom 19:14, 15 November 2013 (UTC)

 Done OhNoitsJamie Talk 21:34, 16 November 2013 (UTC)

firstmovie.in

firstmovie.in: [11] [12] [13] rybec 06:10, 23 September 2013 (UTC)

From what you say this was only added by one IP, this appears to fail WP:BLACK/WP:BLACKLIST. Amalthea 11:26, 7 October 2013 (UTC)
 Not done per Amalthea. OhNoitsJamie Talk 21:41, 16 November 2013 (UTC)

brooklynrail.org

RB231 (talk · contribs): User reported at Wikipedia talk:WikiProject Spam in 2012; blocked for spamming 7 August 2013; persists in spamming. Blacklist seems to be the only sure remedy here. Justlettersandnumbers (talk) 00:29, 21 October 2013 (UTC)

The Brooklyn Rail is a legitimate art world resource. The newspaper produces worthy and legitimate material for referencing...Modernist (talk) 11:26, 21 October 2013 (UTC)
That does not mean, Modernist, that it can not be spammed by people with a vested interest. I think however, since it is only one user, that blocking the user is for now a better solution (and possibly for others who take up the same task of pushing the link too far). I see the account is in fact blocked indef now, so maybe we should leave it for now. --Dirk Beetstra T C 12:45, 10 November 2013 (UTC)
I agree with Beetstra. I haven't seen any other SPAs pop up after I blocked the one, and as such don't see a compelling reason to blacklist at this time. OhNoitsJamie Talk 16:57, 10 November 2013 (UTC)
Hey, I agree too - all I'm saying is that the publication is a legitimate art world resource. The account in question was spamming and needed to be blocked, no disagreement there...Modernist (talk) 17:02, 10 November 2013 (UTC)
Yes, it looks as if Ohnoitsjamie found the right solution there, and I did not. I'd be happy for this to be closed or archived or whatever if this board does that. Justlettersandnumbers (talk) 17:40, 10 November 2013 (UTC)
 Not done for now; will revisit if spam resumes. OhNoitsJamie Talk 21:41, 16 November 2013 (UTC)

musiki.org

Examples:

Users have been creating articles for dead and living Turkish composers. The only reference they have used is musiki.org. Musiki points to a site where you can buy the software program Mus2okur. "Mus2okur is a software program that teaches the basics of Turkish music... The software comes with an expansive database of information on prominent composers, lyricists, compilers and other notable people including photos, biographies and lists of works." Program is €40. No mention of what editorial mechanisms are in place. Some of the people added appear notable, others do not. Bülent Türkeli appears to be an example of a non-notable person. Bgwhite (talk) 07:45, 14 November 2013 (UTC)

We are a group of researchers trying to make our data available on the web. Please see: http://akademik.bahcesehir.edu.tr/~bbozkurt/ and http://compmusic.upf.edu/node/8. We are not trying to promote any software; it is only that we don't have other resources in English that could be used as a reference. As a research group leader, I pay students (there are 5 of them) to put our data on wikipedia, which I hope may be extended by interested contributors. We were planning to put whatever information we have about 500 Turkish music composers on wikipedia, some short, some longer, but none of them I would consider useless (I think just the birth and death dates are important if you are interested in that information for whatever reason). If these risk getting deleted, I'd rather not waste my research money on that. Please inform me if this will be the case and we will stop contributing to wikipedia. Thanks. — Preceding unsigned comment added by 193.255.77.106 (talk) 08:56, 14 November 2013 (UTC)

Dear IP, your work is highly appreciated, and the pages you have created are valuable to Wikipedia. The problem is only that the link provided is not a suitable reference in itself. The data should point directly to the source of the information, or at the very least unambiguously tell where the information was found (which may not even be on the internet, but on a page of a book or another document). As they stand now, a user who wants to read more, or who wants to verify the data, should install the database (and hence pay for the data), and search themselves? (Bgwhite is touching on the other question - is the information in the database actually a reliable source - that is, is it a source with editorial oversight etc.). --Dirk Beetstra T C 11:00, 14 November 2013 (UTC)
Forgot to mention, you are right that sources in English are preferred, but if non-English sources are needed to establish that a statement or person is noteworthy then that is absolutely not forbidden. At best, a reader can always use something like a translator (there are free online ones which are reasonable) to translate the document, at worst they can ask another editor to verify the English part for them. In any case, references should indicate whether the subject/statement is notable enough, in whatever language, and with or without direct link to an online source. --Dirk Beetstra T C 11:15, 14 November 2013 (UTC)

The reference (musiki.org) is a digital encyclopaedia, hence I don't see why any printed encyclopaedia would be preferred as a better reference. A printed book also costs some money and is in fact most often more difficult to access. For example, the biggest printed encyclopaedia on Turkish makam music is the following: http://www.kitapyurdu.com/kitap/default.asp?id=106810 which is: i) out of print, ii) more costly, iii) less reliable from my point of view since I have it and have seen many inconsistencies, even racist argumentations inside. The reason we refer to the software is that we have the information easily accessible to us and I can ask students who have no knowledge on the subject to put this information in wikipedia by just copy-paste and some editing (but I cannot ask them to do an investigation using multiple resources). Let me reduce it to this question: we have limited resources to put this data on wikipedia, should we do it or not? Thanks. — Preceding unsigned comment added by BarisBozkurtBahcesehir (talkcontribs) 15:59, 14 November 2013 (UTC)

Simple answer; if a user has to pay $40 to see the actual content that a link is referencing to comply with Wikipedia's verifiability policy, the answer is no. I've removed all of the links to www.musiki.org as they don't refer to any specific information about the topics as has already been covered in the above discussion. OhNoitsJamie Talk 16:46, 14 November 2013 (UTC)

The reason we started putting references to musiki.org was that our entries got deleted due to unavailability of references. Will the entries stay there without the reference? if yes, we can put our data without the reference. If no, we stop contributing. Thanks for your explanations. — Preceding unsigned comment added by BarisBozkurtBahcesehir (talkcontribs) 15:14, 15 November 2013 (UTC)

The entries need some sort of references for verifiability and to establish the notability of the subjects. If you have no references other than a general link to your pay site, then I'm afraid most of them will probably be deleted via speedy or WP:AFD deletion. OhNoitsJamie Talk 15:43, 15 November 2013 (UTC)

Thanks. It is NOT MY pay site. I bought the digital book and we are putting some information from there. I am trying to contribute to freely available material and somehow some people already made up their minds (without asking, questioning) we are spamming wikipedia. Great! I will ask the students to stop the process. — Preceding unsigned comment added by BarisBozkurtBahcesehir (talkcontribs) 13:14, 16 November 2013 (UTC)

Thank you. As such, unnecessary to add to spam blacklist at this time.  Not done

idolfeatures.com


A site owned by new user Ccharles32 (talk · contribs) (who also edited as 173.17.118.135 (talk · contribs)), who has been spamming links across WP for about a week. All of his contribs to articles are link additions, so I will give a few examples with the explanation that all of his mainspace edits are spam, and were all reverted by Nymf: [14], [15]. User also displays a complete lack of understanding of Wikipedia, having left this message after blanking an MfD. MSJapan (talk) 23:01, 15 November 2013 (UTC)

 Not done Blacklisting is premature in this case; I gave the user a final warning. If we see new accounts spring up or other persistent attempts, contact me directly and I will blacklist. OhNoitsJamie Talk 00:57, 16 November 2013 (UTC)

Proposed removals

Koolmuzone

This site was blacklisted after spamming by multiple IP editors. I have found that it is a notable music blog and I have started an article about it (Koolmuzone), but the spam filter stopped me when I entered this link in the infobox. Besides, most of the IPs that did the spamming were on a single IP range, so a range block could have helped here. Also, this spamming seems to have stopped now. So I request it to be removed from the blacklist. Thanks --SMS Talk 17:22, 5 August 2013 (UTC)

"This spamming seems to have stopped" because the site is blacklisted, obviously. It was blacklisted just a couple months ago. It is highly likely that the spamming will resume if it is de-listed.
I don't see a reason to de-list the whole site, since it's highly unlikely that a blog would ever be used as a reliable source for articles unrelated to the blog. Instead, you may request white-listing of a specific page (such as www.koolmuzone.pk/about/).  Defer to Whitelist. ~Amatulić (talk) 20:31, 6 August 2013 (UTC)
My bad, I thought XLinkBot kept an eye on these blacklisted link additions and that every addition is logged here. Perhaps you can also shed some light on this site's reliability as a source. I find that many news agencies of Pakistan (Daily Times, The Express Tribune, The News International) cite this blog for news related to the entertainment industry of the country. So can we also cite it here? --SMS Talk 17:02, 7 August 2013 (UTC)
No, that log just records the successful additions. Once it's blacklisted, there's nothing to log. I could be mistaken, but I think the blacklist hits don't get logged at all. I'd be surprised if they weren't but I haven't seen such a log.
As to your question about reliability, well, generally we don't cite blogs; see WP:ELNO. There are exceptions, such as if the blog author is writing on a topic for which he's a notable expert, or it's a news blog authored by a bona-fide journalist. Blogs are often WP:TERTIARY sources, meaning the information you find in them can usually be found elsewhere in a secondary source. If you just need one or two pages, the whitelist is the place to request that. ~Amatulić (talk) 04:24, 10 August 2013 (UTC)
Thanks that helps. I will proceed with the request at Whitelist. --SMS Talk 08:14, 10 August 2013 (UTC)
@Amatulic:: I'm confused. As far as I can tell koolmuzone.com was added in 2009, not a few months ago, and koolmuzone.pk has never been blacklisted at all? There are quite a few links to it. .com is at the moment redirecting to .pk, the oldest .pk addition I can find is from February 2012. If we assume the page moved domains at that time then it appears it wasn't a significant spam problem those last 20 months. Amalthea 15:36, 5 October 2013 (UTC)

old-games.com

I cannot seem to find anything related on the local blacklist and I'm hitting the blacklist trying to use it as a source. No obvious reason to blacklist jumps out. I'm assuming some regex is involved with the -games.com suffix. :) ·Salvidrim!·  23:44, 11 September 2013 (UTC)

dyingscene.com

Why is this site blacklisted? I don't see any reason in the log. I wanted to add some information to an article about Greg Hetson, but I couldn't add references, as dyingscene.com is blacklisted. Nazgul02 (talk)

It appears this site was blacklisted back in 2010 for sourcing its own articles on Wikipedia. I have personally contacted the owner of the site and was assured they no longer contribute to wikipedia from their own site. Since it's been 3 years, my recommendation would be to unblock them in order to allow our contributors to reference them for articles related to punk music. - Dr.Music —Preceding undated comment added 18:39, 28 August 2013 (UTC)

"They no longer contribute to Wikipedia from their own site." What does that have to do with anything? They're not blocked from editing (from their site or elsewhere). They're blacklisted. Two totally different things. They can edit, but they just can't add their site to Wikipedia. Naturally, if they're blacklisted, they have no motivation to contribute. That's hardly surprising.
 Defer to Whitelist to request white-listing of individual pages. ~Amatulić (talk) 06:09, 30 August 2013 (UTC)
The request of a longtime contributor (Nazgul) is non-trivial, and three years is a long time to exclude a site for what may have been a one-time indiscretion. An assertion by the site's owner is also significant and it's worthwhile to assume good faith on the owner's part. I support removing this from the blacklist. -Pete (talk) 17:27, 30 August 2013 (UTC)
It is indeed logged here. As the original blacklister, I don't support removing it from the list; COI editors were warned multiple times to stop, and it took a blacklist entry to stop them. I don't have a problem with selective whitelisting. OhNoitsJamie Talk 15:44, 23 September 2013 (UTC)

petition

I understand the desire to avoid links to online petition-gathering sites, as a likely spam source. However, the blanket banning of any URL with the word "petition" in it (as best as I can understand the Regex, it ain't my thing) has been causing false spam flags in multiple articles that I deal with, because they link to legitimate news articles dealing with someone petitioning the court (example) or to copies of such court petitions (example). -Nat Gertler (talk) 02:56, 26 August 2013 (UTC)

Support. Way too many FPs to be useful (regex in question is \bpetition(?:online|s)?\b). @NatGertler: You're correct in your understanding. Also, JzG (talk · contribs) added what evolved into this filter here, but I can't find a log entry for it. Jackmcbarn (talk) 03:09, 26 August 2013 (UTC)
Support due to false-positives for URLs of news-stories about petitions. For example, this flagging is what first caught my attention. How about tightening the filter to match only the hostname (before first "/", or also other specific known sites by hostname) rather than "anywhere in URL". DMacks (talk) 07:56, 26 August 2013 (UTC)
Support - Another false positive here. The global filter already has a less restrictive entry that deals with petition sites (\bpetition(?:online|s24|site|spot|-?them)\.com\b) so I'm not sure the local one is even necessary. If it is, I agree with DMacks that something like \bpetition(?:online|s)?[A-Za-z0-9]*\.(com|org|net)\b would prevent a lot of the false positives. TDL (talk) 22:31, 26 August 2013 (UTC)
hmm, this rule needs to be adapted. Petition sites should all be blacklisted per WP:SOAPBOX (they are at best a primary source, but that petition will only be notable enough to be mentioned in any article when there are secondary sources, making the primary source superfluous), not any link that contains the word petition in it (note that there are many domains without the word petition that are plain petition sites ...). --Dirk Beetstra T C 15:24, 28 August 2013 (UTC)
Actually .... Maybe it is the bot misinterpreting and tagging wrongly .. There is no catch on 'petition' itself ... http://www.wired.com/wiredscience/2012/05/a-petition-for-free-online-access-of-taxpayer-funded-research/ ... <--- see! --Dirk Beetstra T C 18:31, 28 August 2013 (UTC)
this link is reported as blacklisted on Access2Research, as mentioned on the meta blacklist talkpage. --Dirk Beetstra T C 18:34, 28 August 2013 (UTC)
Support. Also in the meantime, I removed the banner from Access2Research and also (I hope correctly) added these URLs to the whitelist here: User:Cyberpower678/spam-exception.js -Pete (talk) 19:58, 28 August 2013 (UTC)
Support I experienced problems as well. Blue Rasberry (talk) 20:09, 28 August 2013 (UTC)
note these links are not blacklisted! --Dirk Beetstra T C 21:16, 28 August 2013 (UTC)
Dirk, can you help me understand what caused the bot to make this edit? I must confess that I am not terribly well versed in Wikipedia's various anti-spam tools. Whatever caught that bot's attention, I think, is the thing that should be changed. -Pete (talk) 22:02, 28 August 2013 (UTC)
The bot gathers and compiles the regexes exactly as the blacklist extension for Wikipedia does. It then validates the links against the regex. If it finds a positive match, it validates the regex against the whitelist, if it doesn't find a positive match, it then checks the exceptions list. If it finds a positive match, it ignores it and if it doesn't, it stores the link in the blacklist database and proceeds to flag it.—cyberpower ChatOnline 22:33, 28 August 2013 (UTC)
So it's working off the same Wikipedia-specific blacklist, and also the global blacklist? If that's the case, why does Dirk say these links are not blacklisted? Surely the bot caught them somehow -- I guess that's the part I'm not getting. I thought they were caught because of a regex line based on the word "petition" -- am I wrong? -Pete (talk) 22:39, 28 August 2013 (UTC)
Yes. It is working off of both blacklists, regexes are compiled exactly as the wikipedia software compiles it and uses them in the scan. Dirk is pointing out that the blacklist doesn't seem to be blocking the addition of links with petition in it for some reason which would indicate a bug in the filter itself. I myself added links to pages, and the filter seems to only intermittently stop the edit. Pete, your understanding of what flagged this bot is correct. — Preceding unsigned comment added by Cyberpower678 (talkcontribs)
I've figured out what the issue is. The MediaWiki spam-blacklist only matches within the domain name, while User:Cyberbot II matches anywhere in the url. I get the filter notice if I try to put in a url with petition before the .com. TDL (talk) 23:08, 28 August 2013 (UTC)
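
The difference TDL describes can be illustrated with a rough Python sketch. The host-anchored wrapper below is an assumption made for illustration: it only lets the entry begin inside the run of hostname characters after the scheme, and it is not presented as the extension's actual code. The second URL is hypothetical.

```python
import re

ENTRY = r"\bpetition(?:online|s)?\b"

# Scan the whole URL, which is how the bot is described as matching.
anywhere = re.compile(ENTRY, re.IGNORECASE)

# Assumed host-anchored form: the entry must start within the hostname
# characters that directly follow "http://" or "https://".
host_only = re.compile(r"https?://+[a-z0-9_\-.]*(?:" + ENTRY + ")", re.IGNORECASE)

news = ("http://www.wired.com/wiredscience/2012/05/"
        "a-petition-for-free-online-access-of-taxpayer-funded-research/")
site = "http://www.petitiononline.com/example"  # hypothetical petition URL

for url in (news, site):
    print(url)
    print("  anywhere: ", bool(anywhere.search(url)))
    print("  host only:", bool(host_only.search(url)))
```

Under these assumptions the news article is caught only by the whole-URL scan, while the petition site is caught by both, which is consistent with the false positives reported in this thread.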
It helps everyone, if you post in one spot and not post everywhere for me to have to follow you. You are tripping the global regex rule, not the local one.—cyberpower ChatOnline 00:22, 29 August 2013 (UTC)
Thanks to a few editors, the problem has been traced to the bot using an old outdated regex generator from the blacklist extension.
  • I have updated the regex generator. It should now mirror Wikipedia's blacklist filter when scanning regexes. The change will go into effect on the next run.—cyberpower ChatOnline 14:26, 29 August 2013 (UTC)
Do you have any way of going through all the articles that the bot flagged this run, rechecking them, and removing any false warnings that it generated? --Nat Gertler (talk) 15:09, 29 August 2013 (UTC)
The bot untags any misplaced tags. The bot determines tags that are misplaced when the links on it don't register in the active buffer of the bot. The active buffer is a collection of links stored in array elements identified by their page. Inclusion to this buffer is when there is a positive match to the blacklist, and a negative match to the whitelist and exceptions list. Currently, the bot's buffer is still loaded with the old regex generator so it won't untag the false ones this round.—cyberpower ChatOnline 15:36, 29 August 2013 (UTC)
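
A minimal sketch of the buffer logic as described in the comment above, not the bot's real code. The function names, the sample domain, and the treatment of the exceptions list as a set of exact URLs are all assumptions for illustration.

```python
import re

def build_buffer(pages, blacklist, whitelist, exceptions):
    """Map each page to its links that match the blacklist but neither
    the whitelist nor the exceptions list."""
    buffer = {}
    for page, links in pages.items():
        flagged = [
            link for link in links
            if any(rx.search(link) for rx in blacklist)
            and not any(rx.search(link) for rx in whitelist)
            and link not in exceptions
        ]
        if flagged:
            buffer[page] = flagged
    return buffer

def pages_to_untag(tagged_pages, buffer):
    """Tagged pages with no links left in the buffer are treated as misplaced."""
    return sorted(p for p in tagged_pages if p not in buffer)

# Hypothetical rules and pages.
blacklist = [re.compile(r"\bspam-example\.com\b")]
whitelist = [re.compile(r"\bspam-example\.com/about\b")]
exceptions = {"http://spam-example.com/press"}
pages = {
    "Page A": ["http://spam-example.com/buy", "http://example.org/"],
    "Page B": ["http://spam-example.com/about"],
    "Page C": ["http://spam-example.com/press"],
}

buffer = build_buffer(pages, blacklist, whitelist, exceptions)
print(buffer)
print(pages_to_untag({"Page A", "Page B"}, buffer))
```

In this sketch a page tagged on an earlier run ("Page B") is untagged once its only link is whitelisted, which mirrors the statement that a tag is removed when its links no longer register in the buffer.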
Cyberpower, I think it would help a good deal if the tags the bot leaves explicitly invite editors to remove the tag (instead of or in addition to leaving a comment here) if they believe it has been left in error. Even as an experienced user, I was reluctant to do so at Access2Research because the tag explicitly directs editors to the blacklist/whitelist process. However, many editors are not technically inclined and have no idea what a blacklist is. Would you consider tweaking the text at the top of the tag to include this suggestion? -Pete (talk) 17:30, 30 August 2013 (UTC)

prodirectsoccer.com

This is a website that I have used extensively as a source for the Nike Total 90 and Nike Mercurial Vapor articles. I have no idea whether anyone has used this site to spam Wikipedia in the past, but as you can tell, my intentions with it are purely encyclopaedic. I would appreciate this site being unblocked, since it is one of the leading resources on soccer equipment. – PeeJay 09:25, 5 September 2013 (UTC)

tellynagari.com

I am astonished that this site is blocked as spam. This is the first time I am trying to link an interesting, relevant article as a reference on an existing page on the wiki. — Preceding unsigned comment added by 110.44.113.253 (talk · contribs) 06:14, 20 September 2013 (UTC)

It was quite relentlessly spammed .. if you want to use one link as a reference, I would suggest to whitelist:  Defer to Whitelist. --Dirk Beetstra T C 06:07, 24 September 2013 (UTC)

academicroom.com

Greetings. I was tasked with editing an article that has been rejected twice. I need to use academicroom.com as a source but it is blacklisted. I went to the "Spam Blacklist" page (https://meta.wikimedia.org/wiki/Spam_blacklist) and it is not listed there. I can't speak for others, but for my own field, which is nanotechnology, Academic Room is a very credible resource, run out of Harvard University. I don't think it serves anyone well to put a blanket ban on it. I therefore request this site be removed from the blacklist. The resources that I need to use can be found in the following hierarchy of the blacklisted site: /physical-sciences/nanotechnology Nanotech-editor (talk) 21:26, 23 September 2013 (UTC)

 Defer to Whitelist Given the heavy spamming attempts, it's not appropriate to remove from blacklist, regardless of credentials. OhNoitsJamie Talk 21:36, 23 September 2013 (UTC)
Thanks for your response--I agree with you but the spamming was a year ago. With Harvard in the picture, several thousand academics have joined the portal, increasing the amount of content substantially. But at the same time your concern is understood. Is there some kind of a conditional removal of blacklist or some kind of monitoring that can be done? I think this would be a useful feature to have, if it does not already exist. I however strongly recommend removing the blacklist given the period since the spamming. Nanotech-editor (talk) 03:57, 25 September 2013 (UTC)

historyandpolicy.org

I believe www.historyandpolicy.org should be universally whitelisted. It is an academic research site on UK business, history, and government policy. See, for example, the citation flagged by the spambot on w:Supermarket. I assume there are papers that discuss individual businesses, and this is what fooled the bot. Choor monster (talk) 17:44, 24 September 2013 (UTC)

I agree. A link to this site was tagged today. The link (which astonishingly, I can't even reproduce here as part of this discussion!) appears to be to a legitimate paper, written by a lecturer in twentieth-century British history at the University of York, who is also the author of a book on Black Market Morality, and whose doctoral thesis won the University of Cambridge's Ellen McArthur Prize in Economic History for 2003. We may argue whether his work is reliable, or biassed, or if it is academically sound, but calling it spam seems to be very strange. --Nigelj (talk) 22:51, 24 September 2013 (UTC)
This request makes it clear why the domain was originally blacklisted.
I don't know whether or not this is still valid; but in any case the reference to a paper hosted there is valuable for Dunton Plotlands, so either de-blacklisting or adding exceptions would seem sensible. --David Edgar (talk) 00:50, 25 September 2013 (UTC)
I've just been directed here from University where a link added in 2010, long before the cause of the original action, has just been picked up for blacklisting. The link in question discusses the history of UK universities, and is apparently written by a respectable historian Robert G. W. Anderson MA DPhil FRSE FRSA [16]. So while I'm new to this game the proposal from David Edgar makes sense to me. Regards, Jonathan A Jones (talk) 06:32, 25 September 2013 (UTC)

I am tempted here to suggest to request whitelisting. It was clearly spammed (and they noted the effects of their spamming - their ranking went up massively, their SEO worked).  Defer to Whitelist. --Dirk Beetstra T C 08:08, 25 September 2013 (UTC)

But is it really spam when a reputable academic edits to link to something relevant even if they originally wrote it themselves? Suggest the signal to noise ratio with this is quite high, as opposed to proper spammers who're trying to sell something and insert crap links who generate a lot of noise but no signal. Barney the barney barney (talk) 16:04, 26 September 2013 (UTC)
This is pretty obviously a case of internal Wikipedia politics trumping results - you've gotten so defensive about spammers that you're blacklisting a high-quality academic reference source with no evidence of misuse or even overuse. It's a clear error. 99.249.254.228 (talk) 21:22, 14 October 2013 (UTC)
Removed from the blacklist. An isolated incident from 2011 isn't by itself good reason to blacklist the site in 2013, especially when it's being used legitimately in many articles. Nyttend (talk) 12:13, 15 October 2013 (UTC)

beacon.org

Beacon Press is a Boston-based independent publisher. I don't know a lot about them, but I understand them to be a legit outfit. A blacklisted-links template popped up in Nancy Gertner today; the article cites the Beacon Press website for Gertner's book in the EL section, which I think is fair. Reviewing [the history] it appears it was blacklisted in July of 2012 because of 3 abusers with "systematic efforts to promote this publisher and their books." I'm inclined to think that blocking the entire publisher's site isn't the proper scale, and this should be unblocked. But maybe I'm missing the magnitude of the problem, and only the individual books that are not a problem should be whitelisted? Thanks. jhawkinson (talk) 06:56, 25 September 2013 (UTC)

I see mainly two types of links, one for products and one for authors. The former link to the publisher's page for selling the book. We have ISBNs to replace that, and we do not need to link to where a book is sold (we might want to link to a preview for references, although that is also under discussion, and I don't know if they provide previews; none of the links seem to be preview links). For authors, there are independent websites that list all books written by a certain author, and I don't think that listings on the publisher's page are useful .. nor neutral. In particular, the fact that there were editors pushing this link (promoting their site, likely to sell more books) suggests to me that these links should be removed, since they are not really needed and there are more neutral alternatives, or, where there is unique information, that some specific links should be whitelisted. So,  Defer to Whitelist. --Dirk Beetstra T C 08:33, 25 September 2013 (UTC)

Infibeam.com

Due to unknown reasons this link is blocked and shows in the Spam Blacklist. Infibeam.com is an authorized Indian e-commerce company. We provide our customers with the best deals and make their hectic lives easy with the online shopping facility from their homes or offices. Any customer that visits Wikipedia will be able to go to our home site link directly and would not have to face any problem searching for the link through Wikipedia. We assure you that we will abide by the content policies of your sites in the future to avoid any such issues. :) ·Rachnarawat·  05:23, 24 September 2013‎ GMT + 5:30

Well, the reason it is here is because editors, likely with a conflict of interest (read: people who were interested in being able to link to the site, just as you here remark) were blatantly spamming the site on wikipedia (handful of IPs, some named accounts which strongly suggested a vested interest, creation of spammy Wikipedia pages regarding this organisation). If Wikipedia editors think that they add to a certain page, then that specific link can be whitelisted, but I don't think that we should remove it here now.  Defer to Whitelist for specific links either directly linked to the topic, or to be used as a reference. --Dirk Beetstra T C 12:51, 25 September 2013 (UTC)

historyandpolicy.org

Appears to be a legitimate website with links to KCL. Barney the barney barney (talk) 20:22, 25 September 2013 (UTC)

We are not questioning the legitimacy of a site, but whether it was spammed. I above suggested to whitelist specific links ( Defer to Whitelist). --Dirk Beetstra T C 08:54, 26 September 2013 (UTC)

countrycode.org

This was the incident that triggered the addition of this domain to the spam blacklist. The incident lists a number of examples of link spam - but most are examples for a similar domain, "areacode.org". I can't find any examples of spam for countrycode.org. The two domains are registered to different people, but the websites look very similar, and both link to the same parent organisation "numberingplans.com".

There are 18 pages that reference this website, and it looks like the references were all added before the blacklist (i.e. before June 2010). These pages have all recently been tagged with the "Blacklisted-links" template (which led me here). Can this domain be unlisted, or should I remove the links from the pages instead? Thanks in advance! Dracunculus (talk) 19:17, 30 September 2013 (UTC)

 Not done [17] and [18] are ad-free alternatives (and more authoritative). OhNoitsJamie Talk 14:57, 3 October 2013 (UTC)
The links you've provided don't have the right kind of information... both pages have "country codes" which are standard abbreviations of the country name. These are different from the "country code" that you would use to make a long-distance telephone call to that country, or the "area code" that you would use to dial to a different region within your own country. For example, the unstats.un.org page has the numerical code "004" for Afghanistan, but the code you would use to phone long-distance to Afghanistan is "93". Dracunculus (talk) 18:21, 8 October 2013 (UTC)
So... should I give up on this and just remove the links and related information from the 18 pages? Thanks, Dracunculus (talk) 19:26, 28 October 2013 (UTC)

bvinewbie.com

Good day. I looked through the information, but I can't see why this guide is on the blacklist. I believe that it contains unique information about the British Virgin Islands that can't be found anywhere else, for example information on immigration and work permits. — Preceding unsigned comment added by John McConvill (talk · contribs)

 Not done Spammed by multiple accounts [19],[20],[21]. Wikipedia isn't a how-to guide. OhNoitsJamie Talk 22:50, 4 November 2013 (UTC)

prouty.org

I am following up to VRTS ticket # 2013081210002123 sent by a representative of prouty.org, which was blacklisted in August 2011.

The site wasn't blacklisted on the basis of any discussion on this page, but rather based on an ANI discussion (archived here). The blacklisting was apparently done in response to disruption caused by linking to an attack page at www.prouty.org/mcadams/

I know that we don't de-list sites based on requests from people with a conflict of interest, and anyone who monitors this page will know I have declined such requests on many occasions.

However, based on that ANI conversation, it seems that the blacklisting may have been done hastily. The final comment from bureaucrat Infrogmation (talk · contribs) suggests that this listing should be revisited.

I suggest not de-listing, but modifying the entry to blacklist only that attack page. I would do it myself, but I prefer the transparency of discussing this on this public page first, rather than back-room OTRS communications. ~Amatulić (talk) 22:17, 13 August 2013 (UTC)

It's been a month. Shall I assume non-response equals concurrence? ~Amatulić (talk) 23:03, 11 September 2013 (UTC)
Concur -- Infrogmation (talk) 23:15, 13 September 2013 (UTC)
Why not just exceptions for specific sources, such as those to be restored to Fletcher_Prouty? Links to diffs showing where else prouty.org links were appropriate would be informative. --Elvey (talk) 22:13, 18 September 2013 (UTC)

 Not done - I ended up whitelisting the 'about' page instead, which seemed to satisfy the prouty.org representative. ~Amatulić (talk) 19:42, 3 October 2013 (UTC)

NORM.org

Resolved

NORM supplies information that upsets some people. It is a site with foreskin restoration information and is anti-routine male infant circumcision. This means that people have split into entrenched warring camps. This site is likely to have offended someone rather than, itself, being spam. It does have items for sale, yes, but this is not its primary function. Please look at the original reason for listing with care and determine if the site is genuinely spammy, with a view to delisting it. Fiddle Faddle 16:59, 24 September 2013 (UTC)

They don't sell anything; here is a citation from the website's "devices" section: "NORM does not endorse any device below and all information is provided as a service to help you find the device that works the best for you. Please compare prices, features and talk with other men who are restoring before making a purchase" -- This is a non-commercial, informative organization (E-Kartoffel (talk) 17:23, 24 September 2013 (UTC))
As an advocacy site, I don't see how it would typically meet WP:RS guidelines.  Defer to Whitelist for individual links. OhNoitsJamie Talk 18:03, 24 September 2013 (UTC)
Jamie, a site's reliability or even neutrality is utterly irrelevant in this context. This sort of inappropriate judgement will blur the lines of the blacklist, and create a general-purpose way to introduce a chilling effect on contentious subjects. The spam blacklist's criteria MUST be limited to spamming, and not related to RS. IMHO. Eaglizard (talk) 22:25, 30 September 2013 (UTC)
While I have followed your suggestion, the site is not spam. Advocacy, yes. information, yes, spam no. Surely this is a spam blacklist? RS is a further step. Fiddle Faddle 21:39, 25 September 2013 (UTC)
The misnomer of this page has been noted a long, long time ago, and requests to change that have been made as well - but note that it is not about whether a site is spam, it is about whether a site got spammed, which is a subtle difference. We have had spam from very notable, very important organisations, and if that gets to a level that it is uncontrollable, then the spam-blacklist is the way to stop the spamming (or bad abuse of a similar kind). When that happens, the bar for removal is not whether a site is spam, but whether it is of general use, which I do not believe, and whether it could be pushed/abused further, which I don't think is completely unlikely. --Dirk Beetstra T C 07:12, 26 September 2013 (UTC)
Very few sites are of general use, and almost all, nay all, are capable of being introduced as inappropriate links. One needs to pay regard to context. Sites such as this one have genuine limited use here. This one probably only has a valid use in articles concerned with the human male penis, either as an anatomical organ in its own right or on pages detailing penis modification. It is, in lower case, a reliable source, though probably does not pass muster as a Reliable Source. Its deployment as a genuine Primary Source in most cases and as a Secondary Source in some cases is most assuredly useful, within the limitations of usage of an authoritative Primary Source. My suspicion is that, in the past, it was disliked by someone, and that it was then added, either by them, or by someone acting in the honest but mistaken belief that it was inappropriate. This is likely because the concept of routine male infant circumcision creates camps and warring factions. So please give this one still further consideration. Fiddle Faddle 12:03, 26 September 2013 (UTC)
Hmm .. there is the problem .. you suspect 'that, in the past, it was disliked by someone, and that it was then added, either by them, ..' .. that assumes either incompetence or bad faith on the editor(s) that blacklisted the link. Note that e.g. examiner.com was plainly spammed, and is (though bit less than in the past) a spam-magnet, yet it contains good stuff; it was not blacklisted because it is not a WP:RS, as is sometimes assumed. Owners of respectable organisations (or e.g. their SEOs or webmasters) do sometimes get into the habit of spamming their good sites (people involved with examiner.com were involved in the spamming, case of WP:COI). My suspicion would be that that type of action was also the case here, or it was indeed plainly spammed. There are very, very few cases of WP:IDONTLIKEIT that result in blacklisting, it would be an abuse of process. For this type of site, just as for many porn sites, I see a lot of potential of abuse, and just limited appropriate use - sometimes the blacklist is just the best option to keep abuse at bay. But I'll try to have a look here. --Dirk Beetstra T C 12:31, 26 September 2013 (UTC)
Sometimes things slip past. This site has, in the past, been part of campaigns of "I don't like it" in venues not limited to Wikipedia. A persuasive argument can be made to encourage others to act in good faith if the persuader is skilled. It does not show incompetence, just bad luck to have been the person seeing the issue or having it described to them. You are about to give this all and only what I ask for: an impartial review. I understand well the points you make about imperfect usage of links. Interestingly, if I took the time, I can see sufficient WP:RS about NORM itself to create a valid article on them here. NORM-UK, an allied but separate organisation, has one, and norm.org, naturally, is one of the links from that article. Fiddle Faddle 12:51, 26 September 2013 (UTC)
I know things can slip past .. but that does not happen too often (and I have been here long).
Anyway, found the addition by MuZemike (talk · contribs). Looking a bit further, the site was indeed 'spammed' - editors running around adding (and re-adding) the links to pages where these links do not belong (like Talk:Barack Obama and National Liberation Army (Libya)) with as reason: "The National Organization of Restoring Men has information about Circumcision and Foreskin Restoration. Sign the White House circumcision petition needs signers up to and on OCT 23" and "The Occupy Wikipedia Movement will not stop. The Occupy Wikipedia movement is prepared for censorship by the man. Read this message".. just the type of promotional edits the spam blacklist is supposed to stop. Maybe it is indeed not liked enough .. at least that was found by some editors who found it necessary to promote it further ...  ;-) --Dirk Beetstra T C 13:05, 26 September 2013 (UTC)
Good lord. What total stupidity from those who added it! Was this one imbecilic editor or a team of them? If one editor, then I suspect it has been nipped in the bud. If team handed I suppose that is a different matter. I'd still like to see it off the blacklist, but I can see persuasive arguments for retention here in the latter case. Fiddle Faddle 15:09, 26 September 2013 (UTC)
There were at least two, and the threat that they would not stop suggests that blocking one would quickly result in yet another IP. I would not have had hope, seeing the statements, that blocking would help (I also saw edit warring by one IP). Please, just whitelist the links that are really needed. --Dirk Beetstra T C 06:00, 27 September 2013 (UTC)
I suppose so :) I hate idiots like those, who spoil things for the rest of us. It is quite the reverse of what we are about. Fiddle Faddle 06:38, 27 September 2013 (UTC)
It does not even have to be the people themselves, it could be SEO (though this looks not like that) or an employee who is over eager. Unfortunately, we see this every now and then with respectable organisations, although the nofollow makes the sites not increase on the google ranking, there will still be people following your links, arriving on your site where you can 'sell' your 'product'. It pays to have the links and that remains the problem. Can I safely  Clerk declined this and  Defer to Whitelist? --Dirk Beetstra T C 07:13, 27 September 2013 (UTC)
With regret, yes, I am afraid you can and should. Fiddle Faddle 22:32, 27 September 2013 (UTC)
← I'm new here, but from all I understand I'd just remove the blacklist entry. This may have warranted blacklisting at the time (multiple IPs (at least 1, 2, 3, 4), range of articles, seemingly distinctive domain), but the event was two years ago (advertised a US petition on 23 October 2011) and the blacklist entry did not stop the disruption, they merely changed the URLs until I assume they either ran out of proxy IPs or saw that they were reverted too quickly to be effective.
While I don't see that this is a useful source (despite it being used), it doesn't appear to fit the criteria from WP:BLACK/WP:BLACKLIST any longer.
Amalthea 13:29, 5 October 2013 (UTC)
Adding additional support for  Clerk declined this and  Defer to Whitelist. OhNoitsJamie Talk 15:13, 5 October 2013 (UTC)
Yes, I saw your comment above, and there's already an entry at the whitelist -- it was actually while considering that request that I came here. I'm trying to understand the reasoning for not de-blacklisting, since I can't align the responses above with the written guideline as I understand it. I was under the assumption that blacklist entries are handled similarly to blocks: once one is no longer needed to prevent damage, it is to be removed. Is that wrong? Is it common and accepted practice that a domain that is not a reliable secondary source will remain on the blacklist once it has been added (no matter if it was added appropriately or not)? Amalthea 15:53, 5 October 2013 (UTC)
No, blacklistings are not handled the same way blocks are. If a blacklisted site is not an appropriate WP:RS for Wikipedia, there is no compelling reason to remove it from the blacklist. OhNoitsJamie Talk 16:34, 5 October 2013 (UTC)
@Amalthea: I would always consider two things: what is the overall, general expected use of the whole site vs. what were the situations when it was blacklisted (and would we expect that the situation with that site has changed). E.g. for a propaganda site that was pushed like norm.org, the abuse was quite heavy, and the abuser promised to continue (edit warring, changing IP - do I think that blocking the IPs would have stopped it - no), whereas I do not think that a lot of the more than 4 million Wikipedia pages will use a reference to norm.org (the page itself, and possibly some pages on the subject of circumcision). In such a case, I would rely on the whitelist (what I suggested here). If a really good site (say, CNN.com) would get spammed, we would not even consider blacklisting, or, if a quite reasonable site got uncontrollably spammed and got blacklisted (maybe it was not that useful at that time), but it is widely useful in references (expecting to come up on hundreds of pages), we would consider de-blacklisting (I've alluded to that in the past regarding examiner.com ..).
I've been around here for years now, and I have seen the cases - sites get de-blacklisted and the socks appear soon after (for one case, less than two weeks!) .. 'nofollow' may have some effect - still, it already pays to have people follow your link and come to your site (and maybe buy something), and it also helps already to have people look at your cause. Since it pays to have your links here on Wikipedia, 'spammers' (or their SEOs or others who benefit from your cause) will return (you can almost compare it with school-IPs - you can block 'm for a year, but a year later a new schoolkid will add their 'poop'-vandalism and you block again for another year (and so on)). I'm all for giving sites a 'second chance' .. but since we are already heavily understaffed, I remain skeptical, and tend to err on the safe side. --Dirk Beetstra T C 16:21, 9 October 2013 (UTC)
Thank you both for your replies, I think I understand better now how you handle the blacklist (although I still don't agree with this particular case). Amalthea 10:18, 11 October 2013 (UTC)

Required links have been whitelisted since. Amalthea 10:18, 11 October 2013 (UTC)

NOTE: I've no problem with the way this request has been handled. However, for the record, there is absolutely nothing that makes advocacy sites, per se, ineligible for use as a reliable source. Perhaps Jamie (Ohnoitsjamie (talk · contribs)) needs to reread the policy he refers us to, in particular, the WP:BIASED section of WP:RS!! It reads in part, "reliable sources are not required to be neutral, unbiased, or objective"! It looks to me like Jamie twice misrepresented policy by indicating that advocacy sites, per se, do not meet WP:RS guidelines - once after having the issue pointed out! (by Eaglizard (talk · contribs)) I do not like to see administrators, in particular, misrepresent policy like that. --Elvey (talk) 19:54, 22 October 2013 (UTC)

reverbnation.com

I believe that ReverbNation (Alexa ranked 1526 Globally) should be removed from the blacklist.

  1. ReverbNation has an article and is a well financed and accepted operation that provides local and national rankings for musical artists.
  2. Toolserver shows 193 incoming links to ReverbNation http://toolserver.org/~dispenser/cgi-bin/backlinkscount.py?title=ReverbNation
  3. ReverbNation has a template for use in external links sections Template:Reverbnation
  4. I believe that whitelisting individual pages will be too much work for well over 100 pages that were affected by a bot last night.
  5. The site is professionally run, I don't know how to check to see if the spam still persists or if the site had a virus that has been repaired, I'm confident that this is not a website that is interested in spamming anyone.009o9 (talk) 08:43, 26 September 2013 (UTC)
  6. Appspot indicates 2179 links to reverbnation.com in the English Wikipedia http://wikipediatools.appspot.com/linksearch.jsp?set=major&link=reverbnation.com&https=1 009o9 (talk) 08:56, 26 September 2013 (UTC)

Also, the editor who blacklisted the site did not log any reason for the entry and has since stated s/he would not oppose de-blacklisting. Jaguar766 (talk) 22:42, 27 September 2013 (UTC)

I've actually  Done this earlier today after researching the situation (for reference, it was blacklisted following a sockpuppet investigation where someone apparently tried to advertise his blog and songs via many external links).
I must however point out that I do not agree with any of the reasons given above. I removed the blacklist entry because I believe it is no longer necessary, as per WP:BLACKLIST. After looking into the site, I generally consider any Reverbnation pages unsuitable for use in a Wikipedia article, both as an external link (there is almost always a better official representation of the topic to list, per WP:ELMINOFFICIAL a Reverbnation/MySpace/Twitter/... profile should then not be listed) and as a reference (possibly as a primary source, if you can show that the topic even has control over their Reverbnation representation). In fact I'm currently going through all instances where they are used, and thus far have removed/replaced every one.
Amalthea 20:46, 2 October 2013 (UTC)

voobly.com

I think this site does not deserve to be blacklisted. On the site you can find replacement service for the old MSN Gaming Zone. This should be noted in the MSN Gaming Zone article, but it can't due to the blacklist.

  1. The site is run by a community that only cares about playing their games. There is no reason for them to spam anywhere.
  2. It is important that the page can be linked to from old MSN Gaming Zone articles, because it provides (even improved) replacement for the CD-ROM games that the MSN Gaming Zone used to support, before it went offline.

— Preceding unsigned comment added by 188.192.102.20 (talk · contribs) 09:11, 28 September 2013‎ (UTC)

 Not done Wikipedia isn't an advertisement venue for a non-notable website. OhNoitsJamie Talk 18:02, 8 November 2013 (UTC)

abseits-soccer.com

abseits-soccer is a website that contains info on German football and is widely used by WP:FOOTY. The trigger is apparently \bsoccer\.com\b. Cannot see a reason why abseits-soccer would be blocked for spam, so I believe this is a case of a false positive. Additionally, this entry is from April 2012, but has no reasoning given in the log. --Madcynic (talk) 14:21, 4 October 2013 (UTC)

That entry should be changed to (?<=//|\.)soccer\.com\b to fix the same problem as happened at #Hyphenated domain being misinterpreted?. Jackmcbarn (talk) 14:55, 4 October 2013 (UTC)
Could someone do that, then? I don't trust myself with this... Madcynic (talk) 16:41, 5 October 2013 (UTC)
Amalthea removed the entry completely, so this should be fixed now. Jackmcbarn (talk) 17:53, 5 October 2013 (UTC)
Ah yes,  Done -- didn't realize there was another false positive, I removed it after seeing a report of a false positive with canadian-soccer.com. Amalthea 18:11, 5 October 2013 (UTC)
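For readers wondering why the original entry caught hyphenated domains, the following Python sketch compares the old rule with the replacement suggested in this thread. One caveat, stated as an assumption about dialects rather than about the on-wiki filter: Python's `re` module only accepts fixed-width lookbehind, so the suggested `(?<=//|\.)` is written here as two separate lookbehinds that are intended to be equivalent. The helper name `is_blocked` is hypothetical.

```python
import re

# The entry as logged: \b treats the hyphen as a word boundary,
# so "abseits-soccer.com" contains a match for "soccer.com".
OLD_RULE = re.compile(r"\bsoccer\.com\b")

# The suggested fix: only match when "soccer.com" directly follows
# "//" (bare domain) or "." (subdomain). Split into two lookbehinds
# because Python requires each lookbehind to have a fixed width.
NEW_RULE = re.compile(r"(?:(?<=//)|(?<=\.))soccer\.com\b")


def is_blocked(rule, url):
    """True if the given rule matches somewhere in the URL."""
    return rule.search(url) is not None
```

With these definitions, `abseits-soccer.com` and `canadian-soccer.com` match only the old rule, while `soccer.com` and `www.soccer.com` match both.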

www.otrcat.com

You did not have to pull them, just requesting whitelisting or de-blacklisting. It was blacklisted because it was spammed (with a lot of other links of the same owner) by a large sockfarm using throw-away accounts (with creative names like 6958g3gqws, Aqsdfwaefsd, easily 50 or 60 of them), I guess they need the incoming links. I would suggest asking for whitelisting for every single link that is properly useful and not replaceable as a reference.  Defer to Whitelist. --Dirk Beetstra T C 06:20, 7 October 2013 (UTC)

www.examiner.com.au/

This looks to be a false positive (matching examiner.com).. the regex for examiner.com is already pretty complex, someone who is more proficient in complex regex than I am will need to tweak it so examiner.com.au can be used. --Versageek 23:10, 8 October 2013 (UTC)
See also MediaWiki talk:Spam-whitelist/Archives/2023/05#examiner.com.au, this was incorrectly tagged by the bot. The bot operator seems to suggest it should be corrected soon. Amalthea 23:46, 8 October 2013 (UTC)
This is best covered on the whitelist (to avoid overly complex blacklist rules; editors often already don't understand what rule is actually blocking their link).  Defer to Whitelist. --Dirk Beetstra T C 16:26, 9 October 2013 (UTC)
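A minimal Python sketch of this kind of false positive, and of how a whitelist entry resolves it without touching the blacklist rule. Both patterns are hypothetical, simplified stand-ins: the real examiner.com rule is described above as considerably more complex, and the whitelist entry shown is only an assumed form.

```python
import re

# Hypothetical, simplified stand-in for the examiner.com blacklist rule.
# \b also holds between "com" and the following ".", so the rule
# matches inside "examiner.com.au".
BLACKLIST = [re.compile(r"\bexaminer\.com\b")]

# Hypothetical whitelist entry covering the Australian domain.
WHITELIST = [re.compile(r"\bexaminer\.com\.au\b")]


def is_blocked(url):
    """Blocked only if a blacklist rule matches and no whitelist rule does."""
    hit = any(rule.search(url) for rule in BLACKLIST)
    allowed = any(rule.search(url) for rule in WHITELIST)
    return hit and not allowed
```

The design choice matches the comment above: the blacklist rule stays simple, and the exception lives in one readable whitelist line.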

wondershare.com

It does offer some great tips and solutions for guys like me, so I think this site does not deserve to be blacklisted.

 Not done If it's on the meta blacklist, you'll have to take your request there. OhNoitsJamie Talk 18:04, 8 November 2013 (UTC)

thewebminer.com

  • thewebminer.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:frSpamcheckMER-C X-wikigs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: searchmeta • Domain: domaintoolsAboutUs.com
  • I want to add this link at http://en.wikipedia.org/wiki/K_means as a free tool for cluster discovery with the k-means method.
  • Blacklisting is not necessary because TheWebMiner can be used either as a service or as a free tool. TheWebMiner supports education and is used in universities as a case study for data scraping tools. Many people have shown interest in TheWebMiner, as this video shows (many other articles regarding TheWebMiner are available on the internet on request). Although it was mentioned that TheWebMiner was deleted because of its lack of indicators of relevance under section A7 of the criteria for speedy deletion, that is not accurate, because the named section has a specific exception for educational tools.

In conclusion we request immediate removal of TheWebMiner from the wikipedia blacklist and its return to the web scraping search page on the site. — Preceding unsigned comment added by 141.85.0.97 (talk) 15:34, 13 October 2013 (UTC)

  • looks like a request by someone connected with the site and does not address reasons for blacklisting. Stifle (talk) 11:53, 19 October 2013 (UTC)
  • Hello, I am writing to you because you recently declined the request to remove the site www.thewebminer.com from the blacklist.

I want to state first that I am not connected to the website administrators, although I use it for personal extractions that are required in my field of activity, which is marketing. I also want to be clear that my extractions are relatively small, so I don't pay any money for that service, and I was not asked to make this request. I simply think that thewebminer is a very efficient tool for extractions. I first came to it by browsing through Wikipedia, and this is how I usually got to it until the site was removed. I was curious about the removal, so this is how I found out about the Wikipedia blacklisting. I am only writing to you because I want you to reconsider the denial of the delisting and to take into consideration the fact that the web miner is a very popular tool within marketing companies like mine, and it should have a place in Wikipedia as it is extremely relevant to the web scraping topic. Laura Stein — Preceding unsigned comment added by Ady1689 (talkcontribs) 01:55, 5 November 2013 (UTC)

 Not done I fail to see how this link would be useful to Wikipedia, and as such there is no compelling reason to de-blacklist it. OhNoitsJamie Talk 19:17, 5 November 2013 (UTC)

thewebminer.com (redux)

I want to state first that I am not connected to the website administrators, although I use it for personal extractions that are required in my field of activity, which is marketing. I also want to be clear that my extractions are relatively small, so I don't pay any money for that service, and I was not asked to make this request. I simply think that thewebminer is a very efficient tool for extractions. I first came to it by browsing through Wikipedia, and this is how I usually got to it until the site was removed. I was curious about the removal, so this is how I found out about the Wikipedia blacklisting. I am only writing to you because I want you to reconsider the denial of the delisting and to take into consideration the fact that the web miner is a very popular tool within marketing companies like mine, and it should have a place in Wikipedia as it is extremely relevant to the web scraping topic. Laura Stein — Preceding unsigned comment added by Ady1689 (talkcontribs) 01:55, 5 November 2013 (UTC)

 Not done Interesting coincidence that your IP originates in Romania, as does the domain name registration of the site in question. Give it up. OhNoitsJamie Talk 21:07, 18 November 2013 (UTC)

onion-router.net

Hello? --Rezonansowy (talk • contribs) 13:01, 9 November 2013 (UTC)
It doesn't appear to be on the blacklist. My suspicion is that User:cyberbot II got a false positive from this rule: \b[_\-0-9a-z]+\.onion\b # was \bsilkroad.*\.onion\b. The bot, which threw a lot of false positives, appears to have been (thankfully) disabled. I'm removing the warning from that page. OhNoitsJamie Talk 17:53, 9 November 2013 (UTC)
OK, thanks! --Rezonansowy (talk • contribs) 23:02, 9 November 2013 (UTC)
It is not a false positive; the bot appropriately tagged that page. That rule is catching 'www.onion-router.net' (the www prefix gets matched by the \b[_\-0-9a-z]+ part). This specific link should be whitelisted.  Defer to Whitelist. --Dirk Beetstra T C 12:49, 10 November 2013 (UTC)
If it's not a false positive, why have I been able to edit the Tor article without getting a blacklist message? OhNoitsJamie Talk 20:30, 10 November 2013 (UTC)
Likely because you did not add the link. Try to add the link here and save (I just tried, ran into the blacklist message - "The following link has triggered a protection filter: www.onion .."). Sigh, and that is just exactly why someone should finally write a bot to tag pages and responsible editors should solve the problem instead of shooting the messenger. The bot is working properly, it is just that you don't like it.  Defer to Whitelist. --Dirk Beetstra T C 07:34, 12 November 2013 (UTC)
I will again stress - having blacklisted links on a page is a problem that has to be solved, as combinations of vandalism and/or GF edits can damage (generally accidentally) blacklisted links and will result in damage to Wikipedia - it really is best that those cases are whitelisted (for accidental cases like this or good documentation on sites which were heavily abused), or plainly removed (for the spam-cruft that was not removed on blacklisting), or removed from the blacklist (for the few cases where the situation has dramatically changed). Cheers! --Dirk Beetstra T C 07:38, 12 November 2013 (UTC)
I apologize, as I was unaware that was how the blacklist trigger worked. OhNoitsJamie Talk 15:58, 12 November 2013 (UTC)
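
For readers following the thread, here is a minimal sketch of why the quoted rule catches the www form of the link. It assumes the rule can be tested on its own with Python's `re` module; the real blacklist is evaluated by MediaWiki with PCRE and wraps each entry in a URL prefix, so this is a simplification, and the helper name `matched_text` and the example URLs are made up for illustration.

```python
import re

# Rule quoted in the thread, tested in isolation (a simplification:
# the real blacklist wraps entries in additional URL-matching syntax).
ONION_RULE = re.compile(r"\b[_\-0-9a-z]+\.onion\b")


def matched_text(url):
    """Return the part of the URL the rule matches, or None (hypothetical helper)."""
    match = ONION_RULE.search(url)
    return match.group(0) if match else None


# "www" satisfies [_\-0-9a-z]+, and the hyphen after "onion" counts
# as a word boundary, so the rule fires on the www form.
print(matched_text("http://www.onion-router.net"))
# Without a label in front of "onion" there is nothing for the rule to match.
print(matched_text("http://onion-router.net"))
```

The matched text in the first case lines up with the "www.onion .." fragment quoted from the blacklist message above.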

Troubleshooting and problems

CSS overflow

style="overflow:auto... is recognized as a SPAM site/link; style="width:...;overflow:auto... isn't. –pjoef (talkcontribs) 08:19, 16 April 2013 (UTC)

False positive?

I'm trying to troubleshoot an issue reported via OTRS where a person tried to add a link to one of the Requested Articles sections (their personal website). The domain has a pattern like so: www.<name>-actor.com. I tried several combinations (e.g., myteethhurt-actor.com and foobar-actor.com) and it seems the issue is the -actor bit. The domain in question is not on either the local or global blacklists, and I can't find a pattern in either that would match "-actor" exactly. Should I request an exception or just ask the person to omit the link or is this something that we should fix? §FreeRangeFrogcroak 21:36, 19 June 2013 (UTC)

Sorry for not getting back to you faster. The item in question is \bactor(?:suriya|arya)?\.com\b from meta, added at 23:39, 28 November 2009; the user was optimizing several regexes and goofed. The correct regex should be \bactor(suriya|arya)\.com\b Werieth (talk) 22:52, 28 June 2013 (UTC)
No problem - sorry I also missed your reply. Glad you found it! §FreeRangeFrogcroak 18:55, 21 July 2013 (UTC)
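
A minimal sketch of the difference between the two patterns quoted above, assuming they can be tested in isolation with Python's `re` module (the blacklist itself runs under PCRE inside MediaWiki). The helper name `is_blocked` and the sample URLs other than the ones mentioned in the thread are illustrative only.

```python
import re

# Entry as found on meta: the trailing "?" makes the group optional,
# so plain "actor.com" (and anything ending in "-actor.com") matches.
OPTIONAL_GROUP = re.compile(r"\bactor(?:suriya|arya)?\.com\b")
# Corrected entry from the thread: one of the two suffixes is required.
REQUIRED_GROUP = re.compile(r"\bactor(suriya|arya)\.com\b")


def is_blocked(pattern, url):
    """True if the pattern matches anywhere in the URL (hypothetical helper)."""
    return pattern.search(url) is not None


for url in (
    "http://www.foobar-actor.com",
    "http://www.actorsuriya.com",
    "http://www.actorarya.com",
):
    print(url, is_blocked(OPTIONAL_GROUP, url), is_blocked(REQUIRED_GROUP, url))
```

With the optional group, the hyphen before "actor" acts as a word boundary and the rest of the pattern matches "actor.com" directly, which is consistent with the behaviour reported for the -actor domains.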


"\bstay[\w-]*\.co\.uk\b" false positive

The entry for "\bstay[\w-]*\.co\.uk\b" was added on 12 April 2011 in response to this request. However, this rule is overly broad (it seems to catch any domain starting "stay" and ending ".co.uk"). This is causing problems with the link to www.staysure.co.uk on Sunday Times Fast Track 100. Could this rule be removed or rewritten to be more specific to the domains mentioned in the original request? Thank you. – PartTimeGnome (talk | contribs) 17:16, 28 September 2013 (UTC)

@Amatulic: @Hu12: ping. Jackmcbarn (talk) 17:54, 28 September 2013 (UTC)
Looking at that site, it doesn't look like something Wikipedia should link to, because the site's only purpose appears to be promotional. I'd say  Defer to Whitelist to request white-listing of any specific page, although in browsing the site, I can't see any that could legitimately be used as a reference or even an external link. The tables shown in Sunday Times Fast Track 100 are linkfarms violating WP:NOTDIR, the external link columns really need to go. ~Amatulić (talk) 03:57, 29 September 2013 (UTC)
Fair point. I've removed the external links. – PartTimeGnome (talk | contribs) 22:12, 1 October 2013 (UTC)

Hyphenated domain being misinterpreted?

At Meta, another user reported concern about blacklisting of jesus-passion.com. Existing links to this domain cannot be edited. The user was told that this domain is not blacklisted at meta, so it must be a local listing here. I don't see jesus-passion on the EN blacklist, but passion.com is there. Has the blacklisting of passion.com been extended to include the hyphenated form? If so, how can this be fixed? --Orlady (talk) 17:31, 29 September 2013 (UTC)

The pattern \bpassion\.com\b will match the link in question, since \b is a word boundary. Werieth (talk) 18:23, 29 September 2013 (UTC)
Changing it to \b(?<!jesus-)passion\.com\b will fix this. Jackmcbarn (talk) 18:42, 29 September 2013 (UTC)
That change would fix things for that one domain, but it won't fix it for any other domain that innocently uses the form "-passion.com". I haven't investigated the existence of such domains, but it's easy to imagine names like chocolate-passion.com, football-passion.com, democracy-passion.com, and garden-passion.com. Is there a way to edit the pattern so that it would apply only to passion.com, and not hyphenated forms? --Orlady (talk) 18:56, 29 September 2013 (UTC)
Yes, changing it to (?<=//|\.)passion\.com\b should do that. Jackmcbarn (talk) 19:03, 29 September 2013 (UTC)

The problem is that we have to take into account the possibility that the rule may have been meant to catch all possible instances of '<blah>-passion.com'. There are those typical domain endings (recently it was 'blahblahfacts.com', a large set of links all ending in 'facts.com'). I'd go for the whitelist here for jesus-passion.com; that keeps the already difficult regexes just a bit easier to read for those whose links are blocked.

I will have a look at what the passion rule was supposed to catch. --Dirk Beetstra T C 03:47, 30 September 2013 (UTC)

The spammed link was indeed passion.com itself (see COIBot reports linked from the tracking, and follow 'tracked' in there to see the spamming cases). I have adapted the rule according to the suggestion by Jackmcbarn, http://www.jesus-passion.com should now be linkable. --Dirk Beetstra T C 07:25, 30 September 2013 (UTC)
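
The three patterns discussed in this thread can be compared with a short sketch. It assumes Python's `re` module as a stand-in for the PCRE engine the blacklist uses. One difference matters here: Python requires fixed-width lookbehinds, so the alternation (?<=//|\.) is written as two separate lookbehinds. The helper name `is_blocked` is hypothetical, and garden-passion.com is one of the imagined domains from the discussion above.

```python
import re

# Original entry: "\b" treats the hyphen as a word boundary.
ORIGINAL = re.compile(r"\bpassion\.com\b")
# First suggestion: exempt a single hyphenated domain.
EXEMPT_ONE = re.compile(r"\b(?<!jesus-)passion\.com\b")
# Second suggestion: require "//" or "." directly before the name.
# Split into two lookbehinds because Python's re rejects the
# variable-width form (?<=//|\.) that PCRE accepts.
ANCHORED = re.compile(r"(?:(?<=//)|(?<=\.))passion\.com\b")


def is_blocked(pattern, url):
    """True if the pattern matches anywhere in the URL (hypothetical helper)."""
    return pattern.search(url) is not None


URLS = [
    "http://passion.com",
    "http://www.passion.com",
    "http://www.jesus-passion.com",
    "http://www.garden-passion.com",
]

for url in URLS:
    results = [is_blocked(p, url) for p in (ORIGINAL, EXEMPT_ONE, ANCHORED)]
    print(url, results)
```

Only the anchored form leaves every hyphenated domain alone while still catching passion.com with or without a subdomain.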

Discussion

Erwin's tool on meta

On meta, we use the gadget User:Erwin/SBHandler (m:MediaWiki:Gadget-SBHandler.js) to add items to the blacklist on meta. It works from the Spam-blacklist talkpage (m:Talk:Spam blacklist), and from the cross-wiki reports generated by COIBot ('m:User:COIBot/XWiki/example.org'). I think that this tool could also be handy here on en.wikipedia, knowing that we have here also the talkpage of the blacklist ('here'), and the local reports ('Wikipedia:WikiProject Spam/Local/example.org') where this could be enabled.

Would there be interest to have this tool here, and people who are capable/interested to move the gadget here (I tried to hack and activate it through my local .js, but I could not get it to work)?

(Not willingly wanting to complicate things .. but one could consider to expand the tool to also work on XLinkBot's revertlist - being capable to blacklist from there, or to revertlist from here). --Dirk Beetstra T C 12:32, 4 July 2013 (UTC)

Proposed addition backlog

Returning to check something that I'd reported at the start of July (because the spammer is still going), it looks like none of the proposed additions have been processed this month - the only additions to the blacklist have been from admins adding URLs directly. Is there a reason why these aren't being processed? --McGeddon (talk) 09:48, 29 July 2013 (UTC)

The problem is generally a significant lack of manpower. There are only a very few admins active here; all the others (including editors 'selecting' new admins) find XfDs more important. See also Category:Open Local COIBot Reports (and those are just bot-flagged cases of what may be suspicious behaviour - the professional spam generally is less easy to detect - note that the bot closes/marks stale requests after some time of 'inactivity' of the link or when it got cleared up). --Dirk Beetstra T C 10:02, 29 July 2013 (UTC)