Wikipedia talk:Arbitration Committee/Requests for comment/Article creation at scale/Archive 1

From WikiProjectMed

Deletion vs. creation

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.



Is this discussion meant to cover mass-creations, mass-deletions, both or both but in separate pages? Jo-Jo Eumerus (talk) 16:42, 21 August 2022 (UTC)

Hey, @Jo-Jo Eumerus! The first RfC will cover mass creations, with a second to cover mass deletions. There are more links in the status box to help people follow. Valereee (talk) 16:49, 21 August 2022 (UTC)
I'm not convinced it's going to be helpful to separate the two. While they aren't identical (mass creation doesn't always lead to mass deletion, and mass deletion isn't always linked to mass creation) there is such a large overlap - and overlap with other mass actions (e.g. redirecting), that it seems like it would be more productive to discuss them together. Thryduulf (talk) 17:13, 21 August 2022 (UTC)
The overlap was seen as part of the problem -- that without getting creation, which is a major cause of deletion, out of the way, it would become too muddled. Valereee (talk) 17:21, 21 August 2022 (UTC)
I agree that you can't deal with deletion without dealing with creation (which was the point of the initial comments), but I don't agree that you can deal with creation without dealing with deletion. We need to deal with all mass actions that disrupt existing processes together - attempting to deal with them individually has demonstrably not worked until now. Thryduulf (talk) 18:17, 21 August 2022 (UTC)
When did we attempt to deal with them individually? Levivich 17:33, 24 August 2022 (UTC)
Every time someone has tried previously to deal with issues around mass creation or mass deletion. Thryduulf (talk) 22:52, 24 August 2022 (UTC)
The last time I can think of when we tried to deal with issues around mass creation or mass deletion was WP:NSPORTS2022, which dealt with both creation and deletion together in the same RfC. Another time that comes to my mind is the "Portals War of 2019" (WP:ENDPORTALS2, WP:POG), which also dealt with both the mass creation and the mass deletion of portals. I can't think of a time when the community looked at the two separately: mass creation of pages in one RfC, followed by mass deletion of pages in another RfC, as we are doing here. I believe this may be a first? Levivich 23:15, 24 August 2022 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

"AfD" or "Article creation"?

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.



(copied from other talk)

The project page is at Wikipedia:Arbitration Committee/Requests for comment/AfD at scale, but the text says "The initial community consultation will begin soon at Wikipedia talk:Arbitration Committee/Requests for comment/Article creation at scale." Which is it to be? Scolaire (talk) 16:11, 21 August 2022 (UTC)

Both were described as issues in the case, so my guess is that one will redirect to the other. —VersaceSpace 🌃 16:19, 21 August 2022 (UTC)
Hey, @Scolaire, there are more links in the status box now to help follow the process! Valereee (talk) 16:48, 21 August 2022 (UTC)
@VersaceSpace and Valereee: Redirects might solve the problem that users clicking the "Talk" tab on this project page will be brought to a different talk page, but it won't solve the problem that the designated talk page doesn't have an associated project page. Scolaire (talk) 17:01, 21 August 2022 (UTC)
This here is already a discussion that doesn't appear on the "talk page". You urgently need to move one or the other of the two pages. Scolaire (talk) 17:04, 21 August 2022 (UTC)
Or at least create a separate Wikipedia:Arbitration Committee/Requests for comment/Article creation at scale page. Scolaire (talk) 17:07, 21 August 2022 (UTC)
With the current set-up it's only a matter of time before we're going to have overlapping and duplicate discussions at the different talk pages. For example the question asked at Wikipedia talk:Arbitration Committee/Requests for comment/Article creation at scale could easily have been asked here and would have been equally on topic. This is not going to be conducive to a smooth process, you need to start actively managing things before this gets out of hand. Thryduulf (talk) 17:21, 21 August 2022 (UTC)
Yes, discussion should happen here. When this RfC is over, we'll start discussion at the RfC on AfD. Valereee (talk) 17:25, 21 August 2022 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Definitions

For purposes of starting this discussion:

  • "At scale" refers to rapid and/or mass creations/deletions of stub articles that do not clearly meet WP:GNG (stubs) at a rate higher than can be handled by English Wikipedia processes.
  • "Rapid and/or mass creation" refers to multiple stubs created by the same editor in succession using identical sources.
  • "Rapid and/or mass deletion" refers to nomination for deletion together or in succession of stubs created by the same editor.
  • "At a rate higher than can be handled by English Wikipedia processes" refers to (what? Does this even need to be defined?).

Refining these definitions for the RfC may be a necessary part of this workshopping. Please see solution 17 below.

Proposed issues to be addressed

Proposed issue 1: Mass creations from databases

Mass creation of articles from entries in a database. - Donald Albury 14:35, 24 August 2022 (UTC)

Proposed issue 2: Stealth WP:FAITACCOMPLI

The mass addition of articles to the encyclopedia can be done as a WP:FAITACCOMPLI by stealth because the addition of each article is not subject to review, while the mass deletion of articles cannot be done as a stealth process because the deletion of each article is subject to review. The deletion of one hundred articles must be discussed because the deletion of one article must be discussed. The addition of one hundred articles is not discussed because the addition of articles is not discussed because it is encouraged as normal expansion of the encyclopedia.

The ArbCom WP:FAITACCOMPLI principle disapproves of mass edits after the editor or editors have been advised that their actions are controversial. Sometimes the controversy surrounding mass edits only arises after they have already been made. Robert McClenon (talk) 08:26, 25 August 2022 (UTC)

Proposed issue 2a: Stealth and fait accompli

The mass addition of articles to the encyclopedia can be done as a WP:FAITACCOMPLI by stealth because the addition of each article is not subject to community review. Individual articles are patrolled by individual new page reviewers. Because they are individuals, they have a harder time picking up on a pattern than the community does at the centralized venue of AfD. On the other hand, the mass deletion of articles at AfD cannot be done as a stealth process because the deletion of each article is subject to extended community discussion. The deletion of one hundred articles must be discussed because the deletion of one article must be discussed. The addition of one hundred articles is not discussed because article creation is encouraged as normal expansion of the encyclopedia.

The ArbCom WP:FAITACCOMPLI principle disapproves of mass edits after the editor or editors have been advised that their actions are controversial. Sometimes the controversy surrounding mass edits only arises after they have already been made. HouseBlastertalk 11:17 pm, Yesterday (UTC−4)

Proposed issue 3: Trainwrecks on Mass Deletion

Deletion nominations of multiple pages sometimes become train wrecks. (That's what a Wikipedia train wreck is.) This happens because one or more editors split their support or opposition to deletion between different pages. Then multiple editors take multiple positions that are not just forms of Keep or Delete or overall alternatives to deletion.

This is even more likely with mass nominations for deletion than with bundled nominations of up to ten or twelve articles. Robert McClenon (talk) 12:08 am, Today (UTC−4)

Proposed issue 4: articles on notable subjects aren't a problem

If the subject of an article is verifiable and met an SNG[clarification needed] when it was created, no matter how it was created, there is no immediate problem with it existing. Such articles are effective seeds for development. In other words, there is no problem with mass creation.

If articles, or groups of articles, are considered not to be notable then they can be dealt with through standard deletion routes. Of course, with some geographical articles verifiability might be a key issue.

I feel some balance is necessary here. This is a valid perspective on the issue and thinking outside whatever box this discussion seems to have gotten into. Ideally there might be some middle ground that could be reached. Blue Square Thing (talk) 16:25, 29 August 2022 (UTC)

Proposed issue 5: changing status of SNGs vs. GNG

Some SNGs, notably WP:NSPORTS, have been used to support mass-creation, based on thresholds of notability besides GNG. These articles are subsequently taken to AfD on the grounds that while their subjects meet participation criteria listed in NSPORTS, for example, they do not meet GNG. Without commenting on the merits of these articles, if the threshold for notability is functionally different when articles are being created versus when they are discussed for deletion, we will necessarily have a stream of repetitive AfDs and considerable associated conflict. Vanamonde (Talk) 11:12, 31 August 2022 (UTC)

Proposed issue 6: Mass creations without consensus

Mass creation of articles without consensus is a significant issue as it presents opponents with a fait accompli that exhausts their ability to contest the creations, while attempts to do so through existing processes result in those processes, like AfD, being overwhelmed. BilledMammal (talk) 22:40, 30 August 2022 (UTC)

Proposed issue 7: Presumed notability

Wikipedia editors differ considerably in their understanding of what it means when a subject is presumed to be notable. Common interpretations include that the subject must have an article; that it must have coverage on Wikipedia, but not necessarily an article; that it guarantees nothing at all. This is mirrored by confusing wording in our notability policy itself. This confusion is made worse when the threshold of notability being applied is something besides GNG: in such cases, it has been argued that the language about presumption means that a subject must also meet GNG when challenged at AfD. Vanamonde (Talk) 06:28, 4 September 2022 (UTC)

Proposed issue 8: Competence is required:

Some editors may produce large numbers of sub-standard articles, which are a burden on NPP[clarification needed] and AfD. This may be seen as an issue of competence in interpreting and applying the requirements for notability and verifiability. While we can, and probably should, be tolerant of occasional marginal articles created by new contributors and existing contributors who seldom create articles, mass creation of sub-standard articles should not be allowed. · · · Peter Southwood (talk): 15:28, 1 September 2022 (UTC)

Proposed issue 9: Failure to expand

Articles created as stubs – as mass-created articles tend to be – all too frequently aren't expanded, even when an AfD leads to the production of high-quality sources. Such stubs would very likely be rejected at AfC. The bar for established users should not be lower than for newbies. Scolaire (talk) 16:30, 1 September 2022 (UTC)

Proposed issue 10: wildly varying expectations and definitions

I know this is already on the agenda to sort out, in some ways, but the current #Definitions section itself is an example of the problem(s). It defines "at scale" not in terms of a particular rate, number, or quality, but relative to whatever community capacity happens to be at that point in time, with no explanation of how to measure that (see also: proposed issue 11). It limits "Rapid and/or mass creation" to when the same sources are used, but there's a thread at VPR right now about "mass creation" where people are taking issue with fish species being created using varied sources. Are we disqualifying that from "mass creation" at the outset? If so, what do we make of that thread? What is the role of semi-automated tools in all this (that fish thread was started by someone not using tools, but about someone accused of "bot-like editing" for creating many articles)? The definitions, the point at which permission must be sought, how to request that permission, and under what conditions the community should allow that permission are all ill defined. — Rhododendrites talk \\ 20:06, 1 September 2022 (UTC)

Proposed issue 11: the role of "community capacity"

Several of the arguments about mass creation, as well as a wide variety of other issues, are related to an idea of "community capacity". That is, the extent to which the community can screen, maintain, improve, and, if necessary, merge/delete articles should be a consideration when deciding whether to "allow" users to create articles. I can see the validity of this argument when there's a high likelihood the articles/edits will be bad (failing notability, BLP violations, etc.), but for those that are otherwise policy compliant these arguments on the basis of capacity (or of "my watchlist" or "my recent changes patrolling") ring hollow. The ever-expanding "sum of human knowledge" will always outpace our relatively small (and by some measures shrinking) editor base's capacity. We cannot maintain/improve the articles we already have, but that's not a good reason to turn off article creation. — Rhododendrites talk \\ 20:06, 1 September 2022 (UTC)

Proposed issue 12: stubs (or the relationship between article quality at time of creation and mass creation)

In recent discussions about mass creation, it seems like "mass creation" is a proxy argument for a general dislike of stubs, or people who create lots of stubs. If someone were going to create 500 B-class articles in the same span of time, even though it would mean more total edits, I suspect we'd see far fewer objections than to proposals to create 500 articles. — Rhododendrites talk \\ 20:06, 1 September 2022 (UTC)

Proposed issue 13: AfD !voting behavior and guideline enforcement

Background: For NSPORT, there is a difference between meeting a presumption of notability (or of SIGCOV) and actually meeting the notability requirements of the guideline. The first presumption (referenced by sentence 2 of NSPORT), which only requires verifying with a citation to RS that the subject meets a sport-specific subguideline (SSG) and (post-RfC) a single source of SIGCOV in SIRS, allows an article to be created in mainspace without the threat of speedy deletion, less chance of scrutiny by patrolling editors, and somewhat more leeway at AfD in finding sources that may be offline or in other languages. However, per the first sentence; third sentence; FAQs 1, 2, and 5; and various other parts of the guideline, this presumption is rebuttable, the subjects of these articles must actually meet GNG whether the in-article sourcing reflects this or not, and the in-article sourcing also must eventually demonstrate GNG. This dichotomy was/is true for both pre- and post-RfC NSPORT.

The SSGs--which are generally created by the relevant sports wikiproject and then refined on the NSPORT talk page--are supposed to predict GNG for 90-95% of [a minimally-qualifying representative sample of] all their subjects, although before the RfC AFAIK no project actually proved this for any criterion. Because the SSGs were crafted by a small group of fans of the respective sports, rather than through global community consensus, the standards for what constituted "SIGCOV" and a criterion's predictive power for meeting GNG varied wildly between sports. Consequently, an enormous number of articles sourced to databases were created under the presumption of SIGCOV/notability afforded by meeting a SSG, but the actual notability status of each subject was not verified before its article was put into mainspace. Pre-RfC, the NSPORT presumptions forced deletion nominators of SSG-meeting subjects to show an extremely exhaustive search for sources (way, way beyond that asked by BEFORE--see this AfD where directly linking the search results of 30 top sports news sites that extensively cover the relevant leagues, plus all regional newspapers in one country, across four different languages and two alphabets, was still not convincing enough to some !voters and discussion was dragged on for more than a month after) just to avoid a SNOW keep, and even then most articles would be kept due to numerous sports editors !voting "keep, meets [SSG]" and closers being unaware of/refusing to acknowledge the overarching NSPORT requirement that GNG be met. At the same time, many of these sports editors would nominate and !vote "delete, doesn't meet [SSG]" without enforcing any kind of BEFORE.

WP:NSPORTS2022 removed or amended the criteria for meeting SSGs, and added the requirement of a single SIGCOV source cited in the article for any presumptions of further SIGCOV/GNG from a SSG to apply. This resulted in tens of thousands of articles of unconfirmed notability no longer being protected by a presumption of meeting GNG, and therefore becoming susceptible to AfD nominations that didn't have to show a super-ultra-BEFORE was performed.

The problems:

Proposed issue 13.a: BEFORE expectations

Overzealous AfD nominations based on no longer/not meeting SSG criteria, with minimal or no apparent BEFORE. Because the new NSPORT criteria were instituted through a large global RfC on all sports rather than through sport-specific crafting, some sports project editors (who opposed the successful RfC proposals) consider these noms to be disruptive and tendentious. At the same time, even when a standard BEFORE is shown, they also demand the nominators do super-ultra-BEFORE (e.g. waiting 15 years for digitized sources to be made public, or traveling to specific countries to access offline sources) even when the SSG presumptions cannot be applied (due to lacking the requisite cited piece of SIGCOV).

Proposed issue cluster 13.b: Repeated anti-consensus !votes at AfD

There are a number of extremely active sports bio AfD participants who deliberately reject global consensuses and guidelines in their !votes.

b1. While such anti-consensus arguments should be disregarded, many closers continue to give them weight, especially in low-participation AfDs. This results in closes where a bloc of keep !votes with deprecated arguments are sufficient to turn a delete into NC or even keep, even though suspension of guidelines through LOCALCON should not be commonplace and should be based on strong IAR assertions rather than ILIKEIT.

b2. Athlete biographies must contain a SIRS with SIGCOV cited in the article to enjoy any NSPORT presumptions of meeting GNG. This requirement received enormous support in the global RfC and should not be controversial to implement. However, already there have been dozens of sports bio AfDs closed as keep or NC where no such source was ever identified. These AfDs occur like this: keep !voters assert the subject "is certain to meet GNG" by virtue of having "international caps" -- a criterion in SPORTSBASIC -- and therefore we can assume coverage exists offline somewhere. Delete !voters point out that without the required source, any reasoning following from the subject meeting some sport-specific/SPORTSBASIC criterion is rejected: the subject must be treated as if it doesn't have any notability at all. Despite this, closers routinely refuse to close against numerical majority or even significant minority.

b3. There is no mechanism empowering sanctions for repeated instances of this behavior, and therefore no way to discourage these tendentious !votes.

Proposed issue 13.c: Repeatedly bringing unusable sources to AfD/refbombing

Some editors who previously defaulted to "keep meets [SSG]" have pivoted to claiming a subject meets GNG, either baselessly or by citing obviously noncompliant sources (like non-independent coverage from a governing sports org, an awarding org, a subject's team, a subject's school, PR, interviews, etc.; or trivial coverage from passing mentions, routine transactional reports, routine match coverage, non-substantial interview commentary, etc.). It is a gigantic timesink for other editors to read and rebut each of these sources, especially when the editors providing them do the same thing at every sports AfD regardless of prior outcomes. We also see some editors adding in unencyclopedic trivia from these sources to boost page bytes and ref count.

Proposed issue 13.d: LOCALCON problems extend to DRV

DRV should be used to evaluate how consistent a close is with an AfD's ROUGHCONSENSUS, particularly as it intersects with PaGs; it should not act as a venue for perpetuating the same anti-consensus LOCALCON behaviors of an AfD. This does not work if a large number of regular DRV participants are also in the minority of users who opposed many of the recent successful guideline restrictions and !vote accordingly. JoelleJay (talk) 04:22, 5 September 2022 (UTC)

Proposed issue 14: an absolutist approach is unhelpful

In the past, some editors have created large numbers of pages dealing very briefly with individual subjects (the mass creation of stub articles) with the apparent aim of creating a complete set of articles dealing with every example of a subject. Although there is an argument that stub articles act as seeds for article development, there is no need for anyone to create articles about, for example, every species of moth, settlement in Algeria, season played by a college baseball team, bus route etc... WP:PAGEDECIDE encourages alternative ways of dealing with content, including adding details to other articles, creating articles which encompass a group of subjects, using list articles to summarise content, redirecting and disambiguating effectively.

There is no need to take an "absolutist" view about article creation. Articles do not have to be created because the subject exists. There are sometimes better ways of dealing with content.

This is particularly true of very short articles. WP:IDEALSTUB suggests that stubs must contain sufficient context and enough information for other editors to expand upon them. Some editors have been restricted by a minimum word limit on the length of articles they can create, thus limiting their ability to create very short articles and ensuring that stub articles are more likely to meet this guideline. Other sanctions have included limiting the number of articles an editor can create in a period of time.

In the same way, there is no need to take an "absolutist" view about article deletion. There are alternatives to deletion which work alongside the alternatives to creation noted above. Redirection, merging, disambiguation etc... can, and should, be used as a way of resolving issues caused by the mass-creation of stub articles. Blue Square Thing (talk) 08:23, 2 September 2022 (UTC)

Proposed issue 15: Clear definitions

The #Definitions section says:

  • "Rapid and/or mass creation" refers to multiple stubs created by the same editor in succession using identical sources.

This is poorly scoped. Imagine that an editor creates one short new article per month, using the same two sources for each article. These are the only edits this editor makes in the year. Twelve edits per year is not "rapid", right? And with an output of one article per month, it's not "mass", either. But this is "multiple stubs", "same editor", "in succession", and (possibly) even "identical sources". I think this will have to be solved by adding numbers to the definition, e.g., "more than 25 articles per day" (a number I put here because of previous comments like "25-50 articles per day is not a problem", which is the basis for the current WP:MASSCREATION requirement for bot approval).

Separately, I think we would also need an agreement about what counts as "identical sources". I'm sure that most editors would agree that citing a different fact from the same webpage (e.g., a webpage listing football statistics for every athlete who ever played for a specific team) would count as "identical". In that case, the sources are identical right down to the exact URL, even if one article uses a fact from the top and another pulls a fact from the middle. But imagine that one of the sources is a new book, Profiles of Ruritania's Indigenous Women Artists. January's article is about the artist profiled in the second chapter. February's article is about the artist profiled in the sixth chapter. Do editors agree or disagree that this constitutes "identical sources"? What if it's a website rather than a book, and the articles cite https://www.example.com/January_feature_artist and https://www.example.com/February_feature_artist? Are those "identical"?

Proposed issue 16: Excessive focus on GNG

Some proposed solutions (I particularly note Proposed solution 1.2: All new articles are required to include at least two sources that plausibly contribute towards WP:GNG) are based on the false premise that an article can only be notable by passing GNG. Although most SNGs defer to GNG, not all of them do (I note in particular WP:PROF). Additionally, there is far from universal agreement among editors on whether sources that are widely used to count towards notability for some other SNGs would also count for GNG (for instance WP:AUTHOR is commonly passed through published book reviews, but some deletion-minded editors have expressed the opinion that only sources about the personal life of the author rather than sources about their works can count towards GNG). Any proposal that would force all articles to go through a strict interpretation of GNG amounts to a stealth rewrite of Wikipedia's notability policy, far beyond the remit of this RfC to address only issues of mass creation. —David Eppstein (talk) 07:48, 3 September 2022 (UTC)

Proposed issue 17: Definitions

I'm adding this as another issue, as I think it probably needs to be finalized here before we get into the RfC(s) and there seems to be a lot of discussion of it. Valereee (talk) 14:00, 3 September 2022 (UTC)

Proposed issue 18: One or two RfCs?

Is it feasible to try to address both creation and deletion in a single RfC? Valereee (talk) 14:43, 3 September 2022 (UTC)

Proposed issue 19: WP:BEFORE, WP:BURDEN and mass-article-creation vs. deletion

Some editors have taken the position (controversial, and to my knowledge with no clear consensus backing it) that WP:BEFORE strictly requires a search for sourcing before nominating an article for deletion or in some extreme cases even WP:PRODing it, to the point of suggesting that sanctions could be leveled against editors based solely on the failure to satisfy their requests for such searches. This is a particular problem when it intersects with mass-article creation, because the same editors generally take the perspective that no prior effort to add or even search for reliable sources is necessary to create or even mass-create articles. The result of these two intersecting interpretations of policy is that one editor can create a thousand articles using an automated script with no prior discussion and without making any effort to search for sources for them, then demand that anyone who wants to reverse their action put in the work that they declined to do - a clear inversion of WP:BURDEN. There needs to be a clearly-established principle that anyone who creates an article is at the very least presumed to have done a source search for it (which means none can ever be demanded from anyone else, since editors can simply assume that the sources currently in the article represent the best efforts of the article's creators), but I would suggest flatly reversing the interpretation some have been pushing for WP:BEFORE and making a search for sources the clear, sole responsibility of an article creator. This would greatly limit mass-article creations, since anyone who created large numbers of articles would be expected to demonstrate that they had performed the necessary source-search, and would ensure that totally-unsourced stubs don't linger. --Aquillion (talk) 04:18, 4 September 2022 (UTC)

Period for proposing issues and solutions is now ended

If anyone has a brainstorm, please post it in your own section and ping me; I'll be working on combining what's been suggested so far. Valereee (talk) 11:48, 7 September 2022 (UTC)

Proposed solutions

Proposed solution 1.1 (to issue 1: Mass creations):

Require new articles to be supported by at least one citation to a reliable source that is not a database. - Donald Albury 14:36, 24 August 2022 (UTC) (Edited 14:46, 24 August 2022 (UTC))

Proposed solution 1.2 (to issue 1: Mass creations and issue 5):

All new articles are required to include at least two sources that plausibly contribute towards WP:GNG. BilledMammal (talk) 11:08, 1 September 2022 (UTC)

Proposed solution 1.3 (to issue 1, issue 5, and issue 19)

For the proposal in 1.2, if there is disagreement over whether the sources are sufficient, that is handled on talk and ultimately WP:AFD as currently; in other words, the simple assertion that a source satisfies the GNG is sufficient, though repeated assertions that strain the presumption of good faith might be a conduct issue. --Aquillion (talk) 04:43, 4 September 2022 (UTC)

Proposed solution 1.4 (to issue 1, issue 5, and issue 19)

As an addendum to 1.2 (if implemented), a WP:CSD will be created for articles that unequivocally fail this requirement, ie. there is no plausible good-faith interpretation whatsoever under which the existing sourcing (for the current or any previous versions of the article) could credibly be said to satisfy 1.2's requirements.

(The idea here is to give the requirements in 1.2 "teeth" in a way that doesn't require constantly resorting to WP:ANI. The "unequivocally" is extremely important.) --Aquillion (talk) 04:43, 4 September 2022 (UTC)

Proposed partial solution 2.1 (to issue 2: Stealth/Fait accompli):

Reports should be developed that can be either run by a bot and posted to a project folder for review or run by a human and posted to a project folder for review. The reports can show what editors have produced the most articles by categories in a day or month. A high rate of production may be either cause for recognition as a contributor or cause for discussion. Robert McClenon (talk) 08:26, 25 August 2022 (UTC)
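
(Illustrative note, not part of the original comment.) The kind of report described above could look roughly like the following minimal sketch. It assumes a page-creation log is already available as (editor, timestamp) pairs; the function names, the sample editors, and the log format are all hypothetical, and nothing here reflects an existing bot or an actual MediaWiki API call. Grouping by month or by category instead of by day would be a small variation on the same counting step.

```python
from collections import Counter
from datetime import datetime


def creations_per_editor_per_day(creation_log):
    """Count page creations for each (editor, calendar date) pair.

    creation_log is an iterable of (editor_name, datetime) tuples.
    This input format is an assumption made for illustration only.
    """
    counts = Counter()
    for editor, timestamp in creation_log:
        counts[(editor, timestamp.date())] += 1
    return counts


def top_creators(creation_log, limit=10):
    """Return the busiest (editor, date) pairs, highest count first."""
    return creations_per_editor_per_day(creation_log).most_common(limit)


# Hypothetical sample data; real input would come from the creation log.
sample_log = [
    ("ExampleEditorA", datetime(2022, 8, 24, 9, 0)),
    ("ExampleEditorA", datetime(2022, 8, 24, 9, 5)),
    ("ExampleEditorA", datetime(2022, 8, 25, 10, 0)),
    ("ExampleEditorB", datetime(2022, 8, 24, 12, 0)),
]

for (editor, day), count in top_creators(sample_log):
    print(f"{day} {editor}: {count} new pages")
```

The output is only a ranked list. Whether a high count is cause for recognition or for discussion is left to human review, as the proposal suggests.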

Proposed partial solution 3.1 (to issue 3: Trainwrecks):

First, define a size threshold where any nomination to delete more than N items will be considered a bulk nomination, and subject to special restrictions. One of those restrictions should be that the nominator must specify the logic, e.g., by defining a query that populates a category. This may focus discussion on the merits of deleting articles that belong to the category rather than deleting the individual articles.

Second, for bulk nominations, disallow any !vote to exclude certain articles from the nomination. If the editor disagrees with some of the articles, they are stating that the bulk nomination is not appropriate. A decision to Keep a bulk nomination will NOT prevent a bulk nomination to delete most of the previously nominated articles. Improvement of the scope of the nomination will be by repeated nominations, not by a split close. Robert McClenon (talk) 12:08 am, Today (UTC−4)

Proposed partial solution 3.2 (to issue 3: Trainwrecks)

Establish a venue where editors can collaboratively discuss proposed bulk nominations of pages for deletion with the sole aim of determining whether one or more groups share sufficient commonality such that if nominated for deletion (or merger, etc) together the nomination would be unlikely to end in a trainwreck. This venue would explicitly not determine whether or not the pages should be kept, deleted, merged, etc. It is recommended that relevant wikiprojects and/or editors active on the articles/in the topic area are alerted to these discussions.

Use of the venue prior to a bulk nomination would be optional but recommended. There would be no obligation to proceed with a (bulk) nomination after discussion. Bulk nominations made contrary to consensus at this venue should have a lower threshold for being speedily closed as a trainwreck, bulk nominations made in accordance with consensus should have a higher threshold for such closures. Thryduulf (talk) 6:44 am, Today (UTC−4)

Proposed partial solution 3.3 (to issue 3: Trainwrecks)

To avoid WP:FAITACCOMPLI, indiscriminate mass creations should be handled by mass deletion. Create a process whereby hundreds or thousands of similarly mass-created articles can be deleted at once without the requirement for individual WP:BEFORE searches. This would be appropriate for cases where, for example, an editor duplicates the contents of a sports or geography database that contains a mix of notable and non-notable entries. This would have a higher standard of community participation than AfD, for example it could take place or be advertised at Village Pump. The idea is that although some notable topics would be deleted, their re-creation would place a smaller burden on the community than evaluating each one individually for notability. A mass nomination could use criteria such as "Articles created by (X editor) sourced only to (Y database)". –dlthewave 11:04 am, Today (UTC−4)

Proposed solution 4.1 (to issue 4)

If an article is verifiable, do nothing. Blue Square Thing (talk) 16:26, 29 August 2022 (UTC)

Proposed solution 5.1 (to issue 5)

The threshold to be met when an article is created needs to be the same threshold that is applied at AfD. SNGs that do not confer notability independent of GNG therefore should not be used to justify mass creation. Vanamonde (Talk) 11:12, 31 August 2022 (UTC)

Proposed solution 6.1 (to issue 6: Mass creations without consensus)

Editors are permitted to nominate groups of related mass-created articles for draftification. Any group nominated through this process must meet the following criteria:

  1. They were all created by the same editor
  2. That editor engaged in the creation of large numbers of similar articles - i.e. mass creation
  3. A significant majority of the articles in the group are not suitable for article space - i.e. they fail WP:N or violate WP:NOT

Opposition to these nominations may only be on the basis that one or more of these requirements are not met; closers would be required to discount opposition on other grounds, as well as opposition that is on these grounds but is not supported by evidence. BilledMammal (talk) 22:40, 30 August 2022 (UTC)

Proposed solution 6.2 (to issue 6: Mass creations without consensus)

Mass-creation is only permitted for groups of subjects that meet a guideline that confers notability independent of GNG (such as GEOLAND or NPOL), or that meet GNG by definition (such as described biological taxa). Mass creation that does not conform to this guideline may be sanctioned as disruptive.

(This is related to 5.1 above). Vanamonde (Talk) 06:32, 4 September 2022 (UTC)

Proposed solution 7.1 (to issue 7: Presumed notability):

The various notability guidelines need to make it explicit when an SNG may be used to supplement GNG; when an SNG must be used instead of GNG; and when an SNG cannot be used at all. The language at WP:N needs to be clarified to eliminate contradiction. The circumstances that can be used to challenge a presumption of notability need to be described more explicitly. Vanamonde (Talk) 17:49, 31 August 2022 (UTC)

Proposed solution 7.2 (to issue 7: Presumed notability):

When creating a series of stubs on similar topics the requirements for proof of notability could be tightened up to require the creator to actually prove notability for each stub, by way of providing at least one reliable source indicating general notability. In some cases, like recognised biological taxa, this should be trivial, as the published description is sufficient evidence; in most other cases it will be more complicated, but at least this reduces the BEFORE load on other people. This could be managed by putting a restriction on all further article creation by the editor until the problem is resolved. This solution also ties in with proposed solution 8.1 and its requirement for competence in bulk article creation. Presumed notability may be acceptable for the occasional article creator, but is often misused by bulk creators, who should know better. Expecting and demanding that other people do the work you should have done is disruptive. · · · Peter Southwood (talk): 04:33, 3 September 2022 (UTC)

Proposed solution 7.3 (to issue 7: Presumed notability):

Because individual editors disagree about what constitutes notability, and because humans "see through their own eyes", so that our response to a subject is always colored by all of our personal knowledge and experiences, even when we try to be as unbiased as possible, individual editors should not unilaterally reject or move an article out of the mainspace when the main concern is that it's a non-notable subject.

Instead, if any editor's main concern is notability, the article should be sent to WP:AFD for community-wide consideration. (If you are concerned that this will feel threatening to page creators, we could rename AFD to something like "Articles for notability determination".) WhatamIdoing (talk) 01:28, 7 September 2022 (UTC)

Proposed solution 8.1 (to issue 8: Competence is required):

Editors could be required to display competence in creating articles before getting a permission to create large numbers of articles. Until this permission is earned, their backlog of unreviewed pages could be automatically limited so that it becomes impossible for them to create more mainspace articles while an excessive review backlog exists. Permission could be based on average quality of the last N articles created. If the permission is misused and a high percentage are deleted they lose the permission, as deletion would indicate unacceptable quality. With such a system, someone who creates consistently acceptable articles can do so at whatever rate they can manage, creators of problematic articles must learn to get it right at a rate that allows the reviewers to manage the input. Editors who only occasionally create articles should be unaffected. This should encourage a better quality of stub at the very least, as a mass creator of low quality stubs would risk losing the right. Each user would have a public automated recent article creation quality rating. Some software would be needed to track the system and to prevent gaming the system. This could be integrated with new page review. A system like this should also reduce the load on new page review and AfD. A similar system could possibly be used to monitor quality of deletion nominations. Too many failed nominations and you temporarily lose the right to nominate. · · · Peter Southwood (talk): 15:28, 1 September 2022 (UTC)

Proposed solution 9.1 (to issue 9: Failure to expand)

Newly added stubs not containing at least one (or two) non-database ref(s), that have not been expanded within five days, should be automatically userfied, thus putting the burden of expansion back onto the creator. There should be specified sanctions for editors moving articles back into mainspace without expanding them. Scolaire (talk) 16:15, 4 September 2022 (UTC)

Proposed solution 9.2 (to issue 9: Failure to expand)

A long, long time ago, before VFD was split into subpages and so each article was discussed directly on the Wikipedia:Votes for deletion page itself, the deletion discussion for a kept article was pasted onto that article's talk page. We can likewise transclude the afd subpage onto the article's talk page now instead of just linking to it in the easily-ignored {{oldafd}} or {{article history}} template at the top, which - especially for a stub article which won't see its talk archived for a long time - should raise the prominence of any sources identified at AFD. —Cryptic 17:00, 5 September 2022 (UTC)

Proposed solution 10 (to issue 10)

First, this RfC should probably plan to use "at scale" in relative terms, since this RfC will also be defining what that means. So, for example, a question might ask "should articles created at scale (in terms defined by the outcome of this RfC) be subject to...".

A plan for clarity:

  • Separate issues of content policies/guidelines (notability, sourcing) from the rate. They shouldn't be combined. Creating articles that don't comply with wikipolicy already lacks the support of the community, so obviously creating lots of articles that aren't policy compliant should not be allowed. It does seem like some people want there to be higher standards for mass created articles, so it may be useful for this RfC to ask whether that position has consensus (e.g. regarding higher sourcing standards than just showing notability, or improved to a level higher than stub class).
    There are gray areas like the status of some SNGs, and some guardrails that should be established like not supporting articles based just on database citations, but those rules should be true regardless of the number of articles.
  • Set a concrete rate above which permission must be requested. This rate will take some discussion/negotiation, but for the sake of argument I'll throw out "a limit of 20 mainspace non-redirect articles per day and 50 per week". I think this could even be enforced through an edit filter, if need be.
    Side note: I'd be curious to see some statistics, if anyone is technically proficient enough to pull these, about how often anyone exceeds these figures. I suspect it's not all that often anymore.
  • Set a default rate extension that people can request. This should be codified according to what the community feels its capacity is at that point in time. That is, any editing exceeding this "default rate extension" would need to be truly exceptional, and is all but guaranteed to be declined. For the sake of throwing out numbers, perhaps it's 5x the normal limit: 100/day, 250/week. I cannot fathom there being consensus for an article creation project exceeding these numbers these days.
  • Create a dedicated forum, not confused with bot editing, where these requests can be reviewed. Sometimes, these requests may be a simple formality and should be granted quickly. Other times, it will be useful to have a period of comment/questions/feedback. The criteria for declining should be clearly articulated. Much of the rest of this RfC should be dedicated to determining these reasons and framing them as clearly as possible. For example: concerns about content policies being followed, negotiations over the rate, disputed notability criteria, etc.

This is very much spitballing a starting point. Take my numbers with a grain of salt. My main concern is clarity. — Rhododendrites talk \\ 20:51, 1 September 2022 (UTC)
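
The rate check described above could be sketched roughly as follows. This is only an illustration: it assumes rolling 24-hour and 7-day windows (the comment does not say whether windows are rolling or calendar-based), it uses the placeholder figures from the comment (20/day, 50/week, and a 5x extension), and the function name and inputs are hypothetical rather than an existing edit-filter feature.

```python
from datetime import datetime, timedelta

# Placeholder figures from the comment above; not agreed limits.
DAILY_LIMIT = 20
WEEKLY_LIMIT = 50
EXTENSION_FACTOR = 5  # "5x the normal limit" for an approved extension


def exceeds_limit(creation_times, now, extended=False):
    """Return True if creations in the trailing day or week exceed the limit.

    creation_times: datetimes of one editor's non-redirect mainspace creations.
    extended: whether the editor holds the (hypothetical) rate extension.
    """
    factor = EXTENSION_FACTOR if extended else 1
    last_day = sum(1 for t in creation_times
                   if now - timedelta(days=1) < t <= now)
    last_week = sum(1 for t in creation_times
                    if now - timedelta(days=7) < t <= now)
    return (last_day > DAILY_LIMIT * factor
            or last_week > WEEKLY_LIMIT * factor)
```

Under these assumptions, an editor with 21 creations in the past day would trip the default limit but not the extended one.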

Proposed solution 14.1 (to issue 14: an absolutist approach is unhelpful)

  • Editors must actively consider alternatives to article creation:
    • Editors should be discouraged from creating large numbers of very short articles, especially when these are sourced only to database style sources. Alternatives to article creation should be put to editors who do so and they can be expected to respond to these;
    • Where required, sanctions should be applied to editors who fail to consider alternatives to article creation. These could include limiting article creation to a minimum word length, removal of autopatrolled status, limiting the number of articles an editor can create, and more extreme sanctions;
  • Editors must also actively consider alternatives to deletion:
    • Where reasonable alternatives to deletion exist, they must be used, preferably before articles come to the deletion process. Alternatives to article deletion should be put to editors and they can be expected to respond to them;
    • Where required, sanctions should be applied to editors who fail to consider reasonable alternatives to deletion. These could include removing the ability to nominate pages for deletion, topic-banning from deletion discussions, and more extreme sanctions;
  • Alternatives to deletion should be explored as ways to deal with the past mass creation of short, minimally sourced stub articles. Working with the editors who created these articles and interested wikiprojects is an essential part of this.

This promotes an alternative view which acknowledges that stub articles can be a problem but can also play a useful role. Blue Square Thing (talk) 08:25, 2 September 2022 (UTC)

Proposed solution 15: Clear definitions

Write a clearer definition, such as:

"Rapid and/or mass creation" is the creation of more than about 25 articles in the same 24-hour period, or more than about 150 articles in the same week, each of which meet all of the following requirements:

  • created by the same editor,
  • are very similar to each other in subject matter (e.g., all about fish species),
  • are very similar to each other in contents (e.g., "X is a red fish, Y is a yellow fish, Z is a blue fish..."),
  • cite completely identical sources in all stubs, with no additional or unique sources, and
  • are still Wikipedia:Stubs at the time a discussion of possible mass-creation begins (or after a reasonable time period, if the discussion begins on the same day as the first page creations).

A source is completely identical if the citations are (or should be) exactly the same, point for point (e.g., the same chapter of a book, the same page on a website). Citing different pages or chapters in the same book or citing different pages on the same website does not constitute a completely identical source.

Mass creation does not depend upon the notability of the subjects, whether enough sources are cited, whether the articles claim the subject is important or significant, whether editors are satisfied with the quality of the first revision, or whether the mass creation was pre-approved. Pre-approved mass-creation of well-sourced perfect stub articles about interesting and obviously notable subjects is still mass creation.

It is possible for an editor to interleave mass-creation edits with edits to existing articles and the creation of non-qualifying articles (e.g., articles on other subjects, articles using additional sources, articles that have been improved beyond the stub level, etc.). In that case, only the creation of qualifying stubs counts towards the daily and weekly limits.
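
As a rough illustration of how the definition above might be applied, the sketch below groups stubs by editor, subject area, and exact set of citations, then counts how many fall within any rolling 24-hour or 7-day window. Everything here is an assumption made for illustration: the `Stub` record, the `topic` label standing in for "very similar subject matter and contents", the treatment of "about 25" and "about 150" as hard thresholds, and the presumption that every input is still a stub.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Stub:
    editor: str
    topic: str            # hypothetical label, e.g. "fish species"
    sources: frozenset    # exact citations, point for point
    created: datetime


def max_in_window(times, window):
    """Largest number of timestamps that fit inside any rolling window."""
    times = sorted(times)
    best = start = 0
    for end, t in enumerate(times):
        # Slide the left edge until the span is shorter than the window.
        while t - times[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best


def mass_creation_batches(stubs, per_day=25, per_week=150):
    """Return (editor, topic, sources) groups that exceed either threshold."""
    groups = defaultdict(list)
    for s in stubs:
        # Only stubs with completely identical sources share a group.
        groups[(s.editor, s.topic, s.sources)].append(s.created)
    return [key for key, times in groups.items()
            if max_in_window(times, timedelta(days=1)) > per_day
            or max_in_window(times, timedelta(days=7)) > per_week]
```

Because stubs are grouped on the exact citation set, an editor who adds a unique source to each article never accumulates a qualifying batch, which matches the intent that only identically sourced stubs count.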


A few comments:

  • I wrote "about 25 articles in the same 24-hour period" because I didn't want to see someone creating 24 boilerplate stubs on "Monday night" and 24 more on "Tuesday morning" and claiming that this was obviously permitted.
  • It's possible to game this in ways that we don't really want to see, but it would require something like wanting to mass-create articles on geographical locations, music albums, animal species, and food all at once. Then you could create, say, a dozen stubs for each subject each day and never officially trigger "mass creation" levels despite creating ~50 articles in a day, whereas if you did 50 articles for just geographical locations in the same day, and then 50 articles for just music albums the next day, etc., that would be considered mass creation.
    • On the other hand, this is perhaps not a significant problem, because reviewers tend to focus on their preferred subject areas. Reviewing a dozen stubs about food each day is probably easier than reviewing no articles today, and 50 tomorrow. Also, who's going to do that?
  • Most of the ways to "game this" mean "writing the kind of articles that we want to see". For example, if you write 25 articles that contain the same two sources, but each one additionally contains a unique source (so across 25 articles, you are citing 27 different sources), then that means that it's not "mass creation" – but that's what we want people to do. Similarly, this definition would exclude the same-day creation of high-quality articles from the definition of "mass creation".
  • I did not mention database entries as a source type in the identical/non-identical explanation, because I thought it better to let other discussions settle first. Given the assumptions by some editors that databases are inferior (presumably based on their experiences within certain subject areas, as there is quite some variation between sports statistics, movie databases, census databases, etc.), I think it would be ideal for those discussions to settle first, before figuring out how to mention that here. I do think it is important to explicitly state whether an entry, e.g., for a place in census.gov is "identical" to an entry for a different place in census.gov.
  • I wonder how often anyone creates more than 25 non-redirect articles in a day.

WhatamIdoing (talk) 17:32, 5 September 2022 (UTC)

Proposed solution 16:

Proposed solution 17.1 (to issue 17:Definitions)

"At scale" refers to rapid and/or mass creations/deletions of stub articles that do not clearly meet WP:GNG (stubs) at a rate higher than can be handled by English Wikipedia processes.

Proposed solution 17.2

"Rapid and/or mass creation" refers to multiple stubs created by the same editor in succession using identical sources.

Proposed solution 17.3

"Rapid and/or mass deletion" refers to nomination for deletion together or in succession of stubs created by the same editor.

Proposed solution 17.4

"At a rate higher than can be handled by English Wikipedia processes" refers to (what? Does this even need to be defined?).

Proposed solution 17.5

Mass creation refers to the creation of multiple articles on related topics without checking whether each individual topic meets WP:GNG. Typically, this refers to the creation of articles on a set of topics drawn from a database or list, all of which are known or assumed to meet an WP:SNG.

(Please note: this isn't intended to define all such creation as a problem. This may be an entirely valid approach in some cases; it remains to us to determine which.)

Proposed solution 18.1 (to issue 18:One or two RfCs?)

The issues raised here can adequately be handled in a single RfC.

Proposed solution 18.2

The issues raised here and proposed solutions are too complex for a single RfC, and addressing the issues surrounding creation is a necessary precursor to addressing those surrounding deletion.

Proposed solution 19 (to issue 13.a and issue 19)

Modify WP:BEFORE to unambiguously make the source-search before sending an article to WP:AFD a suggestion, unambiguously state that it is not a requirement, and have it reference WP:BURDEN to remind editors that in case of a conflict over content, the burden to produce or search for sourcing for contested text is on people who added or wish to retain it, never on people who wish to remove or delete it.

(See issue 19 for why this is related to mass-article creation: the perception that people can create uncited articles and then demand that other people perform a source-search before nominating them for deletion is one of the things that encourages mass-article creation in general. If people understand more clearly that the WP:BEFORE source-search is and has always been optional, the mass-creation of uncited or poorly-cited articles will be less incentivized.) --Aquillion (talk) 04:48, 4 September 2022 (UTC)

Period for proposing issues and solutions is now ended

If anyone has a brainstorm, please post it in your own section and ping me; I'll be working on combining what's been suggested so far. Valereee (talk) 11:49, 7 September 2022 (UTC)

Comments

Comments in this discussion are unthreaded. Please respond to other editors, ask questions, and make comments/suggestions within your own section. Per the rules above, all sections are limited to 800 words.

Comments by Donald Albury

Proposal 1.1 would still allow the creation of new articles using a database, but a requirement for at least one citation to a non-database reliable source would slow down the rate of creation, and make it more likely that new articles will meet notability requirements. This proposal avoids any attempt to meter the rate of creation of articles. It is a rough proposal, and if anyone finds merit in it, please refine it. - Donald Albury 14:46, 24 August 2022 (UTC)

  • @GoldenRing: I think some sort of streamlined process of deletion would be appropriate, something between PROD and SPEEDY, i.e., provide an opportunity for a reliable source that is not a database to be added, but under a time limit. I think consensus is needed that stopping the creation of undersourced stubs is appropriate before we debate the details of what to do with newly created articles that do not meet that criterion. 16:05, 1 September 2022 (UTC)
  • @BilledMammal: I also think something like a long-life prod would be useful (see above response to GoldenRing). — Preceding unsigned comment added by Donald Albury (talk • contribs) 16:10, 5 September 2022 (UTC)

Comments by Plantdrew

"At scale" needs to be defined. My assumption is that the "anything more than 25 or 50" figure given at WP:MASSCREATE is per day. But that isn't specified, and it can be interpreted as EVER [1] [2]. I don't see any problem with editors creating more than 25-50 stubs over the course of their Wikipedia career. Plantdrew (talk) 16:29, 31 August 2022 (UTC)

Both, I guess. But I was only thinking about here when I made the comment. Plantdrew (talk) 18:55, 31 August 2022 (UTC)

Comments by isaacl

To briefly repeat comments I made on the draft talk page: the usual connotation of "at scale" is regular production rate. I think it is confusing to use the phrase in this context to implicitly mean "at a higher rate than can be handled by English Wikipedia processes". I would prefer skipping the first level of indirection in the "Definitions" section and just saying "rapid creation/deletion of stub articles". isaacl (talk) 20:01, 31 August 2022 (UTC)

Over time, I agree there can be a cumulative backlog such that even a slow rate can lead to problems. However, I feel that the root problem is the input rate exceeds the available processing capacity, and thus the rate of creation/deletion is more rapid than can be handled. "At scale" has the connotation of a sustainable creation rate. isaacl (talk) 20:18, 31 August 2022 (UTC)

I disagree that the number of articles created is an issue in itself, without considering the rate of creation. If an editor creates one new article a month for 25 months, it shouldn't be an issue. I do think that problems can arise at a threshold lower than 25 articles a day. A per week rate might be a better threshold to use.

The tricky part is that in a volunteer environment, there's a lot of uncertainty in the capacity of any process. The number of involved editors is small enough that who's choosing to get involved in new page patrol on any given day can affect capacity significantly. The output of the deletion process automatically scales up or down based on participation levels. If the input queue gets longer than desired, this can initiate a feedback loop where editors slow down the incoming rate of deletion requests.

To me there are two separate aspects. The first is the minimum amount of content in a stub article. There is a tension between those who dislike a proliferation of stub articles and those who feel stubs encourage article development. I think community sentiment might have changed over the years, and so think it would be good to revisit this. The second is how can there be a feedback loop to help adjust the article creation rate to meet the new page patrol processing rate, as volunteer resources rise and fall. Rate is what needs to be managed, whether or not the articles are similar, created by one editor, or created by many editors. If there is consensus to raise the level of minimum content in stubs, that should also help slow down the article creation rate.

@Valereee: I disagree with limiting the definition of mass creation/deletion to the creation/deletion of similar articles, and by the same editor. As evidenced by BilledMammal's recent comment on the creation of fish articles, it's the content in the articles that is key. isaacl (talk) 15:17, 1 September 2022 (UTC)

@Valereee: To take a step back from "term X should mean Y" debates, I think what needs to be worked out is what behaviours are considered problematic and why. Any desired short-hand terms to refer to these behaviours can be determined later. Not every proposal needs to deal with all of the scenarios across all of the criteria considered to mean "at scale". For example, say there is agreement that creating articles at a problematic scale includes creating stub articles that do not have adequate sourcing to illustrate that the standards for having an article have been met, at a certain rate. One proposal might propose a policy to delete articles created in bulk by a single editor. Another proposal might propose a procedure to draftify articles that were created rapidly by multiple editors. I would prefer not defining "mass creation" to just mean "created by a single editor", as this would let one proposal use the term without caveats but require others to qualify the term. isaacl (talk) 20:21, 3 September 2022 (UTC)

Comments by Barkeep49

For background, deciding on a wording to describe what happened in the case which spurred this RfC was tricky. It was not always rapid. Instead there was a cumulative effect that had the end point of being, to use Isaac's language, "at a higher rate than can be handled by English Wikipedia processes". Whatever wording is used that is the key idea. Barkeep49 (talk) 20:08, 31 August 2022 (UTC)

Comments by Blue Square Thing

The definitions are really helpful - it really helps to know what we're actually talking about. I wonder if we need to say that "at scale" needs to be over a time period. I suppose what I'm getting at is someone creating 30 articles in one day isn't out of the question; repeating this 30 times over a three month period might well be.

And are we trying to limit this to stub-type article creation or not? Or is that implied? Blue Square Thing (talk) 20:39, 31 August 2022 (UTC)

@GoldenRing: Some form of special area may be required. Note that I've added Prop 14 which promotes alternatives to both creation and deletion as a way forward. Blue Square Thing (talk) 08:38, 2 September 2022 (UTC)
@Pbsouthwood: "do not clearly meet..." or "clearly do not meet"? Think I prefer the latter. See also Prop 14. Blue Square Thing (talk) 08:38, 2 September 2022 (UTC)

Comments by Bluerasberry

Show examples. At this point I am not seeing them. I want to see examples so that I can determine the extent to which this is a real existing problem or a potential future problem, and my judgement may change on that basis. Bluerasberry (talk) 22:41, 31 August 2022 (UTC)

Thanks BilledMammal, we have example data at Wikipedia_talk:Arbitration_Committee/Requests_for_comment/Article_creation_at_scale#Statistics_for_mass_creation
I want to share an interpretation of the data and request that anyone confirm what I am observing.
What we are looking for is "article creation at scale", defined here in this RfC either in the definition section or a later working definition.
When I look at that example data, the likely most problematic behavior would be in "Editors who have created more than seven articles in the past week, including lists and disambiguation pages". In this query for last week (it will change with time) I presently see 13 users in all of English Wikipedia who are averaging more than 2 Wikipedia articles per day. The top 4 users made 32-54 articles last week, while the other 9 users made 14-24 articles. I have not checked what kinds of articles they are making, but to the extent that the problematic behavior of "Article creation at scale" exists, are we talking about ~10 people making ~3 articles a day? Is that an accurate description of the size of this problem? If so, it is a smaller problem than what I imagined. I think I imagined someone using automation to put out 10s of articles a day.
I am not dismissing the potential problem here, but when this RfC goes forward, here is some information that I think the intro should convey:
  1. About how many editors this RfC would curb (looks like 10-20?)
  2. About how many articles this would scrutinize (looks like 400/week?)
  3. In the current way of doing things without regulation, approximately what percentage of mass-created articles are below the WP:GNG standard? Estimate to 33%, 66%, or 100% - guesses are fine and precision not necessary.

Bluerasberry (talk) 00:05, 8 September 2022 (UTC)

Comments by BilledMammal

@GoldenRing: My proposal would only affect new articles; old articles are a separate problem. For new articles, that would be managed through NPP and AfC; any articles with fewer than two sources that plausibly contribute to GNG would be rejected. I believe this would also have the side benefit of reducing the amount of work NPP is required to do; considering notability would simplify down to checking whether those sources have been provided. BilledMammal (talk) 03:34, 2 September 2022 (UTC)


@Ovinus and AKAF: Statistics can be found on Valereee's talk page.

There are a few figures worth noting from the five year query; very few editors engage in mass creation, but those that do have a huge impact. Of the editors who created articles in the past five years, 0.5% created 50% of all articles, and only 0.1% created more than 1000 articles. This means that implementing any controls on mass creation won't be an expansive task as there are very few editors that these controls will affect.

It is also worth noting that these articles are lower quality, when quality is assessed by length; the average article created by editors who made between 10 and 999 is 2.3 times longer than the average article created by editors who made more than 1000. BilledMammal (talk) 08:27, 6 September 2022 (UTC)
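
For anyone wanting to reproduce a concentration figure like the "0.5% created 50%" one above, a minimal sketch follows. It assumes only a list of per-editor creation counts; the function name is hypothetical, and this is not the query actually used to produce the statistics.

```python
def share_of_editors_for_half(counts):
    """Smallest fraction of editors whose creations total at least half of all articles.

    counts: one article-creation count per editor (hypothetical input).
    """
    ordered = sorted(counts, reverse=True)  # most prolific editors first
    target = sum(ordered) / 2
    running = 0
    for n, c in enumerate(ordered, start=1):
        running += c
        if running >= target:
            return n / len(ordered)
    return 0.0  # no editors supplied
```

With made-up data such as one editor who created 100 articles and 99 editors who created one each, the function returns 0.01, i.e. 1% of editors account for at least half of the articles.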


@Enos733: The narrow remedy would be one along the lines of "Editors wanting to create more than ten highly-similar articles are required to get consensus to do so. Such articles created without consensus should be moved to draft space and must not be returned without consensus."

For reference, these articles are ones that I would consider "highly-similar", although the similarity would not need to be as great as it is with these examples for the restriction to apply:

  1. Marios Orfanidis
  2. Jean-Louis Bretteville
  3. Kostadin Blagoev
  4. Atanas Tsanov

This policy would have virtually no impact on anyone who is not engaged in mass creation; I don't believe it is possible to find any editor that this would restrain who was not engaged in mass creation.

However, I believe your interpretation of proposals like 1.2 is incorrect. Currently, a topic needs to meet three criteria to have an article; it must be notable (WP:GNG and some WP:SNGs), it must be encyclopedic (WP:NOT), and it must be suitable for a standalone page (WP:PAGEDECIDE). 1.2 doesn't suggest altering any of these; it suggests adding a fourth, quality.

Specifically, it requires that editors wishing to create an article provide enough sources to make a basic article. Yes, this does mean that it's possible for a topic to be notable and not suitable for an article, but that is already possible, under WP:NOT and WP:PAGEDECIDE, and as such I don't think this is a huge shift. In addition, a proposal like 1.2 has several advantages over the narrow remedy above; it naturally constrains mass creation, it simplifies the work of new page patrol, and it addresses the issue of Wikipedia being mostly empty around the edges. BilledMammal (talk) 08:27, 6 September 2022 (UTC)


@Bluerasberry: I think this query, which shows the article creations by month for 2021, is more informative. The number of editors that are problematic is still small, but because they are undertaking these actions at scale the problems they create are significant.

In regards to your questions, I would estimate it would only affect a few dozen editors each year, but would affect tens of thousands of articles. I would also estimate that at creation, approximately 100% of mass created articles don't demonstrate compliance with GNG - there may be a few exceptions, but very few. BilledMammal (talk) 02:28, 8 September 2022 (UTC)

Comments by 127(point)0(point)0(point)1

I made a similar comment on the talk page of the ArbCom case, but I think one of the things that most needs to change is this feeling among some editors that deletion is inherently bad, and that the 'struggle' against it is existential - once it's gone it's over. Some way to both reinforce the notion that deletion is and should always be a core part of building an encyclopedia, and reassure that if we ever get it wrong or new things come to light that undeleting is as simple as hitting a button. When I was most active here many moons ago, it was relatively common for editors to request deleted pages that they thought could be salvaged be userfied - any admin could restore it to userspace to allow that editor to work on it and if nothing ever came of it the userfied pages would be redeleted at some point a few months in the future. I'd like to see that return as a common request. Might bring down the temperature of those who feel deletion is a Great Wrong™. --WhoIs 127.0.0.1 ping/loopback 11:52, 1 September 2022 (UTC)

Comments by GoldenRing

I have a question for those proposing requirements for new articles, User:Donald Albury, User:BilledMammal and to some degree User:Blue Square Thing and User:Vanamonde93: Where an article fails whatever requirement or threshold you are proposing, what is the result? If it is that the article should be deleted at AfD then how will that result in any change to the current situation where these articles are ending up at AfD anyway? Or is the answer that we will have a new CSD, allowing any uninvolved admin to delete any article that doesn't have N (where N here seems to be 1 or 2) sources which that admin judges to be reliable? Bear in mind, when you answer, that Category:Articles lacking sources currently has nearly 140k articles in it. GoldenRing (talk) 14:48, 1 September 2022 (UTC)

Edited to get User:Vanamonde93's name right. GoldenRing (talk) 14:49, 1 September 2022 (UTC)
I'm sure we'll get whacked for a threaded discussion shortly. I don't see the point of all the subject-specific notability guidelines, since they almost all say "this is just a shorthand way of judging whether something is likely to meet GNG" or words to that general effect. Why not just look for sources? I can see that it does create a real problem when people are mass- or auto-creating articles, since the creator can say the specific guideline is a reason to assume that the GNG is met even if it isn't. My question was more aimed at those proposing specific numbers of sources required before an article is created. GoldenRing (talk) 15:48, 1 September 2022 (UTC)

Comments by Vanamonde93

  • @GoldenRing: My proposals are intended to refer to mass-creation, which can be regulated independently from creation in general. I'm aware we don't have a precise definition for mass-creation at the moment, but I expect that's something that will emerge during this RfC. My complaint is simply that if the justification for an article's creation is utterly inadmissible as a "keep" rationale at AfD, we have a problem. Vanamonde (Talk) 15:04, 1 September 2022 (UTC)
    @GoldenRing: So, I'm with you where most SNGs are concerned; the cleanest solution to the problem I've outlined is to get rid of them altogether. There's a few SNGs that go beyond that, though; WP:GEOLAND, WP:PROF, and WP:NPOL come to mind; and there's at least one (WP:NCORP) that's more restrictive than GNG. After NSPORTS, GEOLAND is the one that causes most conflict at AfD, and I actually think it's a reasonable criterion; we just need to iron out the wording. Vanamonde (Talk) 16:41, 1 September 2022 (UTC)
  • @Atsme: I'm not advocating for getting rid of all SNGs, but for those SNGs which are not currently described as an alternative to GNG. If meeting an SNG is insufficient to keep an article at AfD, then it serves no purpose besides causing confusion. Vanamonde (Talk) 10:26, 2 September 2022 (UTC)
  • I just want to note for the record I support handling deletion and creation in the same RfC. I see we're still not agreed on this point, perhaps understandably. Vanamonde (Talk) 14:42, 3 September 2022 (UTC)
  • @Rhododendrites: I believe we've had several instances of bot creation, and in at least a few (see Polbot) they were helpful. Vanamonde (Talk) 14:43, 3 September 2022 (UTC)
  • @Valereee and Xeno: I presume you do not wish to put every one of these proposals to a !vote, and therefore may wish to condense them. If so, it may be helpful to provide a mechanism for editors to agree/endorse/point out similarities between their proposals and those of others. (I'm sure you're receiving many many pings here, I'll try not to do this often). Vanamonde (Talk) 06:34, 4 September 2022 (UTC)

Comments by Atsme

Are we actually pitting humans against bots by allowing mass stub creation and/or mass deletion? Will Botipedia replace WP? Think about it. Is our community being replaced by automation, or is it enhancing our community? Food for thought: you drive into a filling station in the wee hours of the morning, and there are no attendants, but you are still able to refuel using your debit/credit card. Do you see how this relates? If we, as humans, cannot keep up with the demands placed on us by algorithms, then we simply get replaced. In some instances, that is a good thing, but what does automation do to human perception...such as events relative to history, critical thinking, common sense, inspiration, and the overall power of reasoning? Atsme 💬 📧 19:04, 1 September 2022 (UTC)

  • GoldenRing, Vanamonde93 – I have agreed with both of you 98.5% of the time re: your decisions, but in this case, I am at a loss over what you are basing your opinions on for dismissing SNG. Quite frankly, it conflicts with the "sum of all knowledge", but worse yet, it is based on apples to oranges comparisons relative to notability. N is far too nuanced to base it on nothing more than "coverage", especially in light of today's PRs, the echo chambers, clickbait, etc. What I am seeing in your argument could, in the long term, inadvertently support the eradication or censorship of both history and reality as an influencing factor relative to coverage by RS...which is why we have IAR...i.e.; sometimes historic events are not reported for various reasons. When one considers the relatively young age of the internet, what it has become, the availability of RS, and modern notability from a common sense perspective, I cannot help but wonder if you are giving far too much credence to "coverage" relative to what constitutes notability or what is worthy of being noted, when the most important aspect is WP:V, a core content policy not a guideline. Atsme 💬 📧 22:06, 1 September 2022 (UTC)
Hi, Rhodo - I agree for the most part because the evidence tells us so...but what about the hidden evidence? We just concluded a mass deletion that some editors believed were bot creations when they were human creations. How do you know when there are mass bot creations, unless you are an NPP reviewer and can see those newly created articles in the NPP queue? We have experienced multiple situations of bot creation, and they are not easy to detect. Backlogs are not easy to reduce. Atsme 💬 📧 22:11, 1 September 2022 (UTC)

Comments by Rhododendrites

[placeholder]

@Atsme: No. Most of the "mass creation" or "botlike" creation are just humans creating more articles than some people would like. I don't think we allow any bot-created articles. — Rhododendrites talk \\ 20:08, 1 September 2022 (UTC)

Comments by Pbsouthwood

Valereee, Please do, that looks about right.· · · Peter Southwood (talk): 03:03, 2 September 2022 (UTC)
Valereee, I have split the entry into an issue and a solution, which may mess less with the numbering and be clearer about what the proposed solution is for. I hope this helps, and will leave it to your discretion how you handle it further. · · · Peter Southwood (talk): 03:36, 2 September 2022 (UTC)

Barkeep49 For clarification, is this an issue with a cumulative effect of the specific editor's problematic contributions over time occurring unnoticed in the background, and suddenly being recognised as a problem when they had already gotten out of hand?· · · Peter Southwood (talk): 04:18, 2 September 2022 (UTC)

Blue Square Thing, I think we are referring to articles which do not clearly meet standards for inclusion. I do not think anyone would object to mass creation of articles which are clearly suitable, which would exclude all start class and better, and valid redirects, disambiguations etc. So mostly stubs, and probably mostly without easily verifiable references indicating notability. · · · Peter Southwood (talk): 04:18, 2 September 2022 (UTC)

Comments by CT55555

I don't know how to turn this into a rule, but I think a lot of the deletion drama could be avoided with a bit of dialogue. If anyone who wants to delete 100+ "stamps in X country" or 1,000 football players could just start off by starting a discussion and including a range of perspectives, we'd all feel less pressure and less rush and probably less stress.

Some people got recently topic banned because they took extreme views and did not appear to value calm discussion with people they disagree with. Can we encourage more of that? Can we somehow devalue the views of people who 99%+ want to delete everything or 99%+ want to keep everything and let the slightly more moderate views have a bit more influence? 16:47, 2 September 2022 (UTC)— Preceding unsigned comment added by CT55555 (talkcontribs)

Comments by JoelleJay

I don't know if this is out of scope since it's less pressing, but one limitation of focusing on stubs as the problem is that there are a good number of articles on non-notable subjects whose bios have been fluffed up with every passing mention, press release, hyperlocal profile, high school state result, etc. This generally coincides with an AfD refbomb and article "rescue". JoelleJay (talk) 00:37, 3 September 2022 (UTC)

Would it be possible to limit users to some number of poorly-sourced (micro?)stubs in their creation history (since X date) at any one time? If they exceed that number, they can't create any further articles of any type until they either expand one of their stubs with DUE, referenced material, or add a strong SIGCOV SIRS (for subjects governed by GNG). JoelleJay (talk) 23:31, 5 September 2022 (UTC)

Comments by Indy beetle

  • With regards to issue 8: I can think of at least one instance where one autoconfirmed user who was constantly creating poorly-sourced, poorly-written stubs (although they varied in their subject matter) was topic banned from publishing direct to main space and restricted to using AfC and having them reviewed by another editor before being published. If a draft is rejected this user often does not follow up on suggested improvements, and consequently their talk page is full of "your abandoned draft is going to be deleted" notices, and these drafts are then indeed deleted, which proves that this system can work. I don't know if that should really inform a change in policy per se, but it remains an option for dealing with problematic article creators. AfC has a terrible backlog so having to go through the process is something of a punishment, especially if you write hundreds of articles, but the best way to go speedily through the process (and eventually have the restriction lifted) is to actually write unambiguously policy-compliant material. If you can't do that, you deserve to live with the consequences.
  • With regards to issues 5 & 7: I generally agree with Vanamonde's proposed solutions. Re Peter: "When creating a series of stubs on similar topics the requirements for proof of notability could be tightened up to require the creator to actually prove notability for each stub, by way of providing at least one reliable source indicating general notability." The "General Notability Guideline" currently requires "multiple" (at least two) instances of SIGCOV. I think what you're suggesting would essentially be a loosening of the GNG. I don't think it should be that hard for a user to provide the bare minimum of two sources. This brings me to BilledMammal's 1.2 solution, which would essentially make GNG the only binding standard for article notability. I do think more work needs to be done to harmonize the SNGs with the GNG. Particularly, in cases where the SNG itself says it is subordinate to the GNG, in those instances we might as well eliminate the SNG, and at the very least we need to have AfD closers who are willing to actually enforce these standards in their closes. I would fully support removing most SNGs, with a few exceptions, like David Eppstein's WP:AUTHOR example regarding book reviews, though I fear consensus for that does not yet exist. Eppstein's example is actually really good because the debate over that doesn't have to do with lack of sources per se, and is essentially a WP:PAGEDECIDE matter on what the subject should be (book reviews are usually SIGCOV of the books, not their authors). -Indy beetle (talk) 11:33, 3 September 2022 (UTC)
  • @JoelleJay: With regards to your comment on fluffed material, another occasional problem is the addition of a large "Further reading" section with no proof that the sources listed there actually have any direct relation to the subject of the article at hand. -Indy beetle (talk) 11:36, 3 September 2022 (UTC)

Comments by Robert McClenon

User:Valereee appears first to say that this discussion is about mass creation, and then asks whether it should also be about mass deletion. I will note that Problem 4, Trainwrecks, is a problem with mass deletion. So is this discussion both about mass creation and mass deletion, or should Problem 4 be removed, or has Problem 4 sneaked in? Robert McClenon (talk) 20:10, 3 September 2022 (UTC)

User:MJL - It appears that you changed the Table of Contents options so that individual Problems and Solutions are no longer displayed. Were you advised by the moderators to do that, or was there a consensus to do that, or did you do that because you thought it was a good idea? I disagree, but would like to know what your reason is. Robert McClenon (talk) 05:13, 6 September 2022 (UTC)

I disagree with excluding problems with mass deletion, but I am willing to accept the judgment of the moderators that mass deletion is a different set of issues. Robert McClenon (talk) 05:22, 6 September 2022 (UTC)

Comments by NotReallySoroka

I have just created WP:ACAS and WT:ACAS as shortcuts. Please update Wikipedia:Arbitration Committee/Requests for comment/Article creation at scale with a link to WP:ACAS, and this page with a link to WT:ACAS. Thank you. NotReallySoroka (talk) 00:28, 4 September 2022 (UTC)

I concur with McClenon's point about trainwrecks being a problem with mass deletion. Problem 4 should be removed. NotReallySoroka (talk) 00:28, 4 September 2022 (UTC)

@Valereee: I performed my desired edits about the shortcuts. Thanks. NotReallySoroka (talk) 05:25, 5 September 2022 (UTC)

Comments by David Eppstein

@BilledMammal: Re your comment However, I also disagree; it doesn't override it: You may disagree with the current consensus all you want, but the wording of WP:PROF that "This guideline is ... explicitly listed as an alternative to the general notability guideline" is unambiguous, and multiple past discussions on this exact issue have not shifted that consensus. Any proposal that imposes GNG-requirements on new articles that would not normally be subject to GNG is a seriously problematic overreach, would make it much more difficult to create (non-mass-stub) articles on academics, would go well beyond the intended focus of this discussion on mass creation of stubs and would automatically have my strong opposition. That goes for the current proposed solutions 1.2, 6.1, 17.1, and 17.5. Proposal 7.1 is also problematic because it imposes a requirement of expansion to the notability guidelines rather than actually addressing the topic at hand, mass creation of dubiously-notable stubs. Proposed solution 5.1 has much more acceptable wording (with respect to this issue, at least), because it explicitly recognizes that there are SNGs that do confer notability independent of GNG. —David Eppstein (talk) 06:15, 4 September 2022 (UTC)

Proposed solution 1.3 also implicitly assumes GNG is king and so has the same issue. —David Eppstein (talk) 15:51, 4 September 2022 (UTC)

While I'm discussing this issue maybe it would be appropriate for me to briefly explain why I am so adamant that PROF must not be subordinated to GNG. Frankly, I'm not happy with GNG as a notability guideline at all. GNG means, in a nutshell, that we cover topics that have been successful in hyping themselves to the point of getting media coverage for their hype. That may be the best we can do as an inclusion criterion for some types of topic, and may even be the right thing to do for our coverage of topics that are themselves centered on hype (celebrities, say). But hype is distortion and causes us to perform all sorts of mental distortions in justifying our hype-based criteria. A frequent example is the repeated belief that non-trivial in-depth coverage of election candidates in major newspapers is somehow trivial or not in-depth because the candidate has not yet won the election; no, that's a distortion, caused by the cognitive dissonance of wanting to exclude unelected candidates while also wanting to rely on a notability criterion that if read literally would not allow us to exclude them. Whenever it is possible to set an inclusion threshold based on a specific level of accomplishment or significance, it produces a much more WP:NPOV coverage of that overall type of topic, while maintaining a comparable level of selectivity. Both WP:NPOV and some level of selectivity are crucial for the integrity of the project; basing our selectivity on hype is not. So, for instance, I am much happier with NPOL (you must be at least at the level of a cabinet minister or provincial legislator) than I am with counting current newspaper articles about unelected candidates to determine which ones we include. I can accept the point that NOLY wasn't working well, but it did set a clear threshold; the reason it wasn't working well wasn't because it was based on accomplishment rather than hype, but rather because it set the bar too low.
Being able to write an article at all (WP:V) is also crucial, but GNG goes far beyond that in what it requires. —David Eppstein (talk) 06:28, 4 September 2022 (UTC)

Comments by Ovinus

I agree with Bluerasberry that examples (and some statistics) would be invaluable to this discussion. What are the specific problematic articles, who created them, at what rate, and by what means? Are the problems endemic only to species, sports, geography, and settlement stubs?

Although any higher number is still better than none, I think the threshold for creation that should be under some form of community review should be set deliberately low—as low as 3 articles per day, averaged over a week. Why? Because the minor bureaucracy of approving prolific writers "hey, keep up your excellent work" is far less frustrating than the inherent controversy (and subsequent fallout) of most large-scale creations, whether or not they are desired by the community.

Perhaps such a threshold is too low to be an effective definition w.r.t. this RfC on "mass creation". But that's why we need statistics; which creators produced more than 3, but less than 20, stubs per day, in the past week? Hopefully someone adept with the quarry could find out. (BilledMammal?) Ovinus (talk) 16:51, 4 September 2022 (UTC)

Valereee Indeed that would be helpful.... Unfortunately my question isn't at all rhetorical; I don't know enough history to write a summary. But I think statistics, or links to statistics, can be plopped somewhere in the RfC (or this pre-RfC), like a list of the top 100 article creators in the past month. Ovinus (talk) 19:32, 4 September 2022 (UTC)

Comments by S Marshall

Way too many issues, way too many solutions. These need distilling into a concise and orderly disquisition before the discussion goes live.—S Marshall T/C 20:59, 4 September 2022 (UTC)

Comments by El Dubs

Proposed issues are not issues. This thread is too solutions focused because this thread has assumed that "proposed" issues are issues that need solutions. It's especially a problem if people are posting a proposed issue, then posting a solution to that issue. That can very quickly become just using the issue/solution process to push an agenda.

I recommend:

  1. The proposed issues all be filled out with why they are actual issues.
  2. Discussion needs to occur on whether these issues are accepted by the community as actual issues.
  3. Only after these two steps should we be moving into solutions territory.

This will enable a much more structured discussion process that addresses actual issues. El Dubs (talk) 21:34, 4 September 2022 (UTC)

Comments by Editor AKAF

@Blue Square Thing: If X articles in a day is OK, but not every day, then maybe a tiered approach would be better. For example: 25 articles/day; 50 articles/week; 100 articles/month; 200 articles/year. This could be a hard technical limit, with a bot flag required to exceed. On a related note: It is not clear to me that 15 stubs for members of the 1896 Lancashire cricket team is inherently preferable to a single rather stubby article listing the same information. This technical limit would encourage the second variant. AKAF (talk) 07:43, 5 September 2022 (UTC)
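As an illustration only, the tiered limit described above could be checked roughly as follows. This is a minimal sketch, not an existing MediaWiki feature: the function name `may_create`, the rolling-window interpretation, and the 30-day and 365-day approximations for "month" and "year" are all assumptions; only the four numbers come from the comment.

```python
from datetime import datetime, timedelta

# Tiers from the comment above as (rolling window, maximum creations).
# Treating a month as 30 days and a year as 365 days is an assumption.
TIERS = [
    (timedelta(days=1), 25),
    (timedelta(days=7), 50),
    (timedelta(days=30), 100),
    (timedelta(days=365), 200),
]


def may_create(previous, now, tiers=TIERS):
    """Return True if one more creation at `now` stays within every tier.

    `previous` is a list of datetimes of the account's earlier creations.
    """
    for window, limit in tiers:
        # Count earlier creations that fall inside this rolling window.
        recent = sum(1 for t in previous if now - window < t <= now)
        if recent + 1 > limit:
            return False
    return True
```

Under this sketch the tightest applicable tier always wins: an account that created 50 articles three days ago is blocked by the weekly tier even though its daily count is zero. Swapping in other numbers, such as a stricter set of tiers, only requires passing a different `tiers` list.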

@BilledMammal: A spot-check of the editors creating 100 articles (Editors who have created more than 100 articles in the past year, by month) shows that most of these are also probably never going to be more than stubs. Maybe 10/day; 20/week; 40/month; 80/year would be more realistic. AKAF (talk) 10:42, 5 September 2022 (UTC)

@BilledMammal: I do see your point, but the problem is mass editing without consensus. Over on WP:BRFA, many typical requests are: "Perform *TASK* on *LIST* with *BOT*", and then you have an approved task with consensus, or at least a centralized point of discussion for that task. We want to make it more inconvenient to perform bot-like edits with a non-bot account. For instance (IMHO, and apropos of nothing) hard limits of XX edits/day; YY/week; ZZ/year on user accounts would be useful as a brake on thoughtless editing and addiction-spectrum problems. AKAF (talk) 14:27, 5 September 2022 (UTC)

Comments by Andrew D

My first impression is that this is a parody as so far we have 23 issues and 28 proposed solutions. The scale of this exercise has therefore passed the point that a sensible result can be expected.

To test how the size of a group affects its ability to make decisions, they created a model based on information flow networks and found that a significant change occurred when groups hit 20. “We found a realistic linking pattern of people and gave artificial committees random initial opinions on subjects,” he says. “At 20 you see a strong difference in coalition building. Smaller groups form and they block each other, which explains why it is exceedingly hard to come up with unanimous decisions when cabinets are large.”

Next, I note that Ovinus sensibly asks for some examples. This indicates that the 28 solutions are not evidence-based – tsk. For an actual fresh example, please see the Village Pump where there's a fuss about some catfish articles created by Lumpsucker. My view is that Lumpsucker is doing good work building the encyclopedia and we should applaud and encourage such diligence. Is that already one of the solutions? I may have to read them now...

Comments by Cryptic

@WhatamIdoing: I wonder how often anyone creates more than 25 non-redirect articles in a day.: With some limitations, in particular limiting it to a calendar day instead of a 24-hour period and only counting still-existing pages that are currently in mainspace and not redirects, 2098 times since late June 2018. More than four times every three days on average. —Cryptic 18:14, 5 September 2022 (UTC)
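For readers who want to reproduce a count of this kind from their own data, the tally can be sketched as below. This is not the query that produced the figure above; the input format (pairs of creator name and creation timestamp) and the function name `busy_creator_days` are assumptions made for illustration. It follows the same stated limitation of grouping by calendar day rather than by rolling 24-hour period.

```python
from collections import Counter
from datetime import datetime


def busy_creator_days(creations, threshold=25):
    """Return {(creator, date): count} for every creator and calendar day
    on which that creator made more than `threshold` creations.

    `creations` is an iterable of (creator, datetime) pairs; filtering out
    redirects and non-mainspace pages is assumed to have happened already.
    """
    per_day = Counter((creator, ts.date()) for creator, ts in creations)
    return {key: n for key, n in per_day.items() if n > threshold}
```

The length of the returned dictionary is the number of creator-days over the threshold, which is the kind of total quoted in the comment.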

Comments by Jogurney

@JoelleJay: I don't think Issue 13 is relevant to article creation at scale. While I agree that there are AfD matters to be discussed, it seems like those are more appropriate for the second RfC that will deal with article deletion at scale (or perhaps somewhere else). It is my understanding that many of the editors who have been involved in article creation at scale are no longer editing, and I don't see any evidence that their article creation behaviors were encouraged by recent AfD voting patterns or outcomes. For the editors that are still active, I don't think we should speculate that they are motivated or encouraged by AfD outcomes without strong evidence. Jogurney (talk) 04:53, 6 September 2022 (UTC)

Comments by MJL

@Robert McClenon: It was just a decision by me, so I've changed it to {{TOC limit|3}} since you've objected to {{TOC limit|2}}. –MJLTalk 06:01, 6 September 2022 (UTC)

Comments by Enos733

I do recognize there is a problem with mass creation (of both pages and redirects). But, I wonder if some of these potential solutions are worse than the identified problem. In general, I am in agreement with David Eppstein and others who are concerned that this set of discussions would change the nature of WP:V and WP:GNG or even the SNGs. I appreciate the work to identify who are the mass creators, but to me the question is why they are creating those articles and whether there is a narrow remedy - either technological fixes or process adjustments - such as editors can only place into mainspace so many articles per day/week or requiring that all proposed articles from a mass creator are reviewed by (an administrator?) before being placed into mainspace. --Enos733 (talk) 04:53, 6 September 2022 (UTC)

Comments by Shooterwalker

I agree with several editors that this discussion is not ready to go live. Speaking personally, it's a lot to parse. Thinking as an organizer, it's near impossible that this will lead to a clear next step, let alone a consensus. This exercise is still useful to get a rough brainstorm of solutions, but I think it will need a facilitator/moderator/volunteer team who can parse this into a few broad solutions. (Personally, I agree with Enos733 that this can probably be solved with a narrow technical remedy.) Shooterwalker (talk) 15:47, 6 September 2022 (UTC)

Comments by FOARP

This discussion is not ready for prime-time. Too many solutions are presented, some of which seem to go along directions that do not seem likely to receive acceptance but are simply being proposed for "balance" or because they may have been discussed in the past. Particularly ideas for pre-approval of articles are not going to fly (and we already have policies on Mass Creation that are not enforced! What is the point of adding to them?).

There is also too much focus on Sports Bio issues, which frankly were just a temporary battle-ground for the people involved in the ARBCOM case, when Geography is every bit as bad (there are 98,128 articles citing sports-reference.com, but 97,181 articles citing GNIS and/or GEOnet Names Server ID numbers - all of which are problematic sources when - as is typical - they are the only sourcing used).

Really, for mass-creation the focus here is a single issue: WP:NOTDIRECTORY and WP:NOTDICTIONARY and whether they allow making Wikipedia a sports almanac/gazetteer/dictionary of species. This is in reality the core of the dispute and where fluffing of the issue for a decade+ has caused vast problems. FOARP (talk) 13:18, 8 September 2022 (UTC)

Comments by moderators

Plantdrew, are you suggesting a revision to the Definitions section here, or to the stated policy, or to both? Valereee (talk) 18:13, 31 August 2022 (UTC)

I've added a starter definition. Comments welcome. Valereee (talk) 19:22, 31 August 2022 (UTC)

Isaacl, I've refined the definition to reflect the input. Valereee (talk) 20:22, 31 August 2022 (UTC)

Blue Square Thing, I've added a starter definition. 20:47, 31 August 2022 (UTC)

IMO stub is implied. In most cases we're talking about permanent microstubs. Anyone who can create non-stubs multiple times a day for days on end...Bully. Open to discussion, though. Valereee (talk) 20:49, 31 August 2022 (UTC)

BilledMammal, could you suggest language? Valereee (talk) 23:15, 31 August 2022 (UTC)

Do we need to define "highly similar articles"? Valereee (talk) 23:32, 31 August 2022 (UTC)

Isaacl, feel free to unhat once you've shortened. BilledMammal, I'll note you are very close to your limit. Valereee (talk) 09:55, 1 September 2022 (UTC)

BilledMammal, thanks for shortening! Please feel free to add a new proposed solution 1.2, if you think that's the most appropriate spot. Or add a completely new issue/solution proposal set, if that seems more useful. Anyone can add new issues/proposals, and it would actually be helpful for people to see that happen. I didn't want it happening at the draft simply because I didn't want the workshopping to start in a bit of obscure userspace, but in this workshop we actively do want people adding issues/solutions so we can workshop them for the RfC. Valereee (talk) 11:02, 1 September 2022 (UTC)

Pbsouthwood, that looks like a solution, perhaps to issue 6? Any objection to moving it to Proposed solution 6.2? Valereee (talk) 16:17, 1 September 2022 (UTC)

WhatamIdoing, can you propose wording for the definitions in the proposed solutions section? Valereee (talk) 21:38, 2 September 2022 (UTC)

Isaacl, we don't have to develop definitions before discussing the other issues/solutions, but we are going to have to deal with them in this workshop eventually, as we'll need those to refine the questions for the RfC. Valereee (talk) 16:04, 3 September 2022 (UTC)

@Isaacl, how do we get anywhere on any question without at some point finding some agreement on what we even mean by creation/deletion at scale? I mean, what is the scenario? Valereee (talk) 16:31, 3 September 2022 (UTC)
@Isaacl, I apologize if I'm being dense. The RfC(s) are about creation/deletion at scale. To me, "we don't have to lock in every proposal to only deal with a specific definition of 'mass creation', for example" means we're opening this RfC up to issues that have nothing to do with mass creation/deletion. That seems like it's likely to expand an RfC that is already so big, we're already asking whether we should cut it in half and run two RfCs. Again, apologies if I'm misunderstanding your point. Valereee (talk) 18:06, 3 September 2022 (UTC)

Vanamonde93, no worries about pings, I'd rather be pinged ten times than not be pinged and miss something. I was just thinking about starting to combine stuff and how best to manage that...any suggestions welcome. Valereee (talk) 13:34, 4 September 2022 (UTC)

Robert McClenon, suggestions re: mass deletions are fine here, too. We haven't finalized whether this is going to definitely require two RfCs. Valereee (talk) 13:34, 4 September 2022 (UTC)

ETA: Perhaps you'd like to change the name of that issue/proposal to clarify that it's about deletion? I'm not sure it's really a name that conveys much information. Valereee (talk) 13:40, 4 September 2022 (UTC)

NotReallySoroka, I'm not sure where you're asking for these to be included. Why don't you go ahead and make the edits you think are necessary, and I'll adjust if I think it's needed. Valereee (talk) 13:37, 4 September 2022 (UTC)

Ovinus and Bluerasberry, no objection to someone creating a page providing examples from the long history of this issue if that helps editors who are unfamiliar with it, we could provide a link in the background section, but someone else will have to take that on. The status box at WP:ACAS has links to the ArbCom case that ordered the RfC, that's probably a good place to start, hundreds of diffs there. Valereee (talk) 17:13, 4 September 2022 (UTC)

@Ovinus and @Bluerasberry, there is now a section below with statistics, thanks to BilledMammal. Valereee (talk) 13:28, 7 September 2022 (UTC)

S Marshall, that is happening now. Once we've got a distilled set we'll post those (I'm hoping tomorrow, Wednesday) and start getting comments to refine and then endorsements for using them in the RfC. Valereee (talk) 23:10, 4 September 2022 (UTC)

Robert McClenon, after going through the various proposed solutions, it looked like trying to address everything in a single RfC was going to be so much more complicated (requiring multipart questions, for example) than doing it in two that it was worth the hassle of the extra RfC. Totally get that there are good arguments on both sides. Valereee (talk) 16:35, 7 September 2022 (UTC)

Threaded discussion re: 2 RfCs

In this section we'll use threads, but please limit yourself to as few and as brief comments as possible. Anything that starts to feel like bludgeoning is going to make me cranky. Any comment longer than a couple hundred words I'll probably just skim, so it's worth taking the extra time to write short.

As I'm starting to work on combining/refining proposals, it's looking like we may indeed need to do two consecutive RfCs that probably can't overlap. To illustrate, here's what I've drafted (and it's just a draft, no comments here on the draft itself, but if you have a concern you can go to my talk) for one proposal:

Creation at scale under SNGs

1a. Require all creations at scale to have at least one source which would plausibly contribute to GNG.
1b. Require all creations at scale to have at least two sources which would plausibly contribute to GNG.
1d. Create a speedy deletion criterion for articles created at scale that clearly fail to have source(s) sufficient to meet this requirement. An assertion that the source(s) supplied meet GNG is sufficient response to PROD or CSD; repeated such assertions that subsequently fail at AfD are a conduct issue.
1e. Allow articles created at scale that clearly fail to have source(s) that meet this requirement to be nominated for deletion at any scale.

Creation at scale under SNGs (one RfC)

The one-RfC option might be done this way. !Voters who supported requiring one source supporting GNG could !vote "Oppose 1, support 2a, 2c, 2d, oppose 2b" or whatever. Valereee (talk) 20:18, 4 September 2022 (UTC)

Creation at scale under SNGs that do not confer notability (option 1)

1a. Clarify at WP:N to make explicit when an SNG confers notability and to eliminate contradictions. Require all creations under SNGs that do not confer notability to have at least one source which would plausibly contribute to GNG.
1b. Create a speedy deletion criterion for articles created at scale that clearly fail to have source(s) sufficient to meet this requirement. An assertion that the source(s) supplied meet GNG is sufficient response at talk; repeated such assertions that subsequently fail at AfD are a conduct issue.
1c. Articles created at scale that clearly fail to have source(s) that meet this requirement may be nominated for deletion at any scale.
1d. Articles created at scale that clearly fail to have source(s) sufficient to meet this requirement may be nominated for deletion without an expectation of WP:BEFORE.

Creation at scale under SNGs that do not confer notability (option 2)

2a. Clarify at WP:N to make explicit when an SNG confers notability and to eliminate contradictions. Require all creations under SNGs that do not confer notability to have at least two sources which would plausibly contribute to GNG.
2b. Create a speedy deletion criterion for articles created at scale that clearly fail to have source(s) sufficient to meet this requirement. An assertion that the source(s) supplied meet GNG is sufficient response at talk; repeated such assertions that subsequently fail at AfD are a conduct issue.
2c. Articles created at scale that clearly fail to have source(s) that meet this requirement may be nominated for deletion at any scale.
2d. Articles created at scale that clearly fail to have source(s) sufficient to meet this requirement may be nominated for deletion without an expectation of WP:BEFORE.
(Above combine proposals 1, 1.2, 1.3, 1.4, 3.3, 5.1, 6.1, 6.2, 7.1, 7.2, 10, issue 16, various comments)

Discussion

  • I don't see how we can possibly ask people to !vote on 1d&e until we've got the results of 1a&b. Someone might support 1a but not 1b; how can they answer 1d or 1e until they know whether 1b, which they didn't support, is going to carry the day? Valereee (talk) 16:12, 4 September 2022 (UTC)
    The alternative if we want a single RfC would be to separate these. That is, 1a+1d&e would have to be one, 1b+1d&e would have to be a second. Is that (and other similar) better than two RfCs? Valereee (talk) 17:03, 4 September 2022 (UTC)
    Worth remembering some SNGs stand independent of GNG (ie NPOL, GEOLAND); perhaps crafting a proposal that treats those differently would be useful? Vanamonde (Talk) 17:20, 4 September 2022 (UTC)
    Answering at my talk. Valereee (talk) 17:31, 4 September 2022 (UTC)
    The danger with combining these into one RfC is that we might get a massive wall of text that will put people off even trying to follow what's going on, which perhaps means fewer editors contribute. Given that we're talking about a very large number of articles when it comes to deletion, I suspect that's not a good thing. But it might not happen. Also, if we have some kind of response to creation at scale, won't that inform deletion at scale? Honestly: I can see both sides, but I'm leaning towards two RfCs. Blue Square Thing (talk) 19:43, 4 September 2022 (UTC)
    Right now I've drafted the proposals into 9 proposed questions for the RfC (will probably try to post those tomorrow). Ten if we split the one above like I have for the one-RfC option. I'm actually not sure many of the others will need such splitting, still assessing. Those will be refined and possibly some won't be endorsed for inclusion. Although of course there will certainly be other proposals during the RfC. Valereee (talk) 20:31, 4 September 2022 (UTC)
  • I'm not sure where to put this, but in regard to the various options that cover addressing existing articles, I would suggest something softer that gives editors more time to improve the articles before they are deleted, along the lines of:
    Create a new form of prod. This prod will last for one year, and can only be applied to articles that clearly fail to have source(s) sufficient to meet this requirement, and can only be removed when source(s) sufficient to meet this requirement have been added.
BilledMammal (talk) 02:34, 5 September 2022 (UTC)
Put it at my talk. Valereee (talk) 19:36, 5 September 2022 (UTC)
I didn't look at the location you made this comment and I interpreted it as being in response to my question about my statistics section, hence why that is now on your talk page. BilledMammal (talk) 16:33, 6 September 2022 (UTC)
@BilledMammal, sorry, just seeing this. Yes, I think that statistics section you'd placed in your section is useful. Let's just place it in its own section on this page so others can see it. If you get around to that before I finish catching up here (been basically out for two days due to a surprise IRL), great! Valereee (talk) 11:33, 7 September 2022 (UTC)
I've added it in its own section below, feel free to insert links to it wherever you think is appropriate. Valereee (talk) 12:04, 7 September 2022 (UTC)
  • I find this section confusing. Valereee could you clarify exactly what you intend this section to be, what's open to editing, what sort of feedback you're looking for, what goes on your talk page vs. here, etc.? — Rhododendrites talk \\ 16:26, 6 September 2022 (UTC)
    @Rhododendrites, apologies for the lack of clarity. In this section I'm just looking for input on the feasibility of a single RfC to address both creation at scale and deletion at scale and was allowing threaded discussion of that here in this section. I was sending to my talk any commentary on the examples I provided as they are only drafts and I didn't want to start discussion of the wording here. Valereee (talk) 11:35, 7 September 2022 (UTC)
  • (Assuming we're talking about Proposal 18): I don't think we can have creation & deletion in one RfC. The mass-creation RfC must come first because the consensus about mass creation is necessary in order to even begin evaluating questions of mass deletion.

    An example based on my personal opinions/thought process: if the community decides that every article must have 2 plausible GNG sources at creation, then I don't care about rate limiting creations at all; editors can make as many as they want as fast as they want if they have to have two GNG sources because it can't be automated anyway. But if that's the case then I would support "unlimited PRODs" for articles lacking two sources. On the other hand, if the community did not require any GNG sources at creation, then I'd want to rate limit creation, and my opinion about rate limiting PRODs would depend upon what exact rate limit the community decided for creation. In either case, I'd want to know what the requirements for creation were before deciding on requirements for AFDs: for example, if GNG sources were required at creation, I'd support stricter BEFORE requirements for AFD noms, but probably not otherwise (AFD nominators shouldn't be under more of a burden than creators).

    That's just a few examples but I think most if not all of the deletion-related issues similarly depend upon questions such as expected-quality-at-creation and expected-creation-rate. The problem with one RfC is we'll have a lot of "if A then X, but if B then Y" votes, making an already-complex RfC even more complex (imagine: "support as first choice if 1.3 passes but if 2.8 passes then second choice behind 3.2", etc.). Levivich 04:20, 7 September 2022 (UTC)

    @Levivich, yes, that's my initial concern, too -- that to deal with complex questions handling both creation and deletion, we'd end up with multiple similar multipart questions like option 1 and option 2 in the example above. I'm working on the rest of those today and hope to have a draft of proposed questions for the RfC later today. Valereee (talk) 11:39, 7 September 2022 (UTC)
    @Levivich and Valereee: It occurs to me that the division you are discussing, and perhaps the most logical division for a two-part RfC, is not between creation and deletion, but between content and behavior. Notability thresholds, and evidentiary standards, are a content matter; rate limitations are a behavioral matter. The latter set of expectations should be crafted based on the former. Vanamonde (Talk) 17:57, 7 September 2022 (UTC)
    You're making my head hurt. :D Valereee (talk) 18:24, 7 September 2022 (UTC)
    Agree with this, per what I wrote here (and elsewhere). — Rhododendrites talk \\ 13:42, 7 September 2022 (UTC)
    Also agree with a divide between content and behaviour issues, but both are important. Some of the behavioural problems are due to disagreement on content requirements and interpretations thereof, some others are competence issues. · · · Peter Southwood (talk): 06:19, 8 September 2022 (UTC)