Wikipedia:Bots/Requests for approval


New to bots on Wikipedia? Read these primers!

If you want to run a bot on the English Wikipedia, you must first get it approved. To do so, follow the instructions below to add a request. If you are not familiar with programming it may be a good idea to ask someone else to run a bot for you, rather than running your own.

 Instructions for bot operators

Current requests for approval

Operator: Hawkeye7 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 01:57, Wednesday, March 22, 2023 (UTC)

Function overview: Mark unassessed stub articles as stubs

Automatic, Supervised, or Manual: Automatic

Programming language(s): C#

Source code available: Not yet

Links to relevant discussions (where appropriate): Wikipedia:Bot requests#Stub assessments with ORES

Edit period(s): daily

Estimated number of pages affected: < 100 per day

Namespace(s): Talk

Exclusion compliant (Yes/No): Yes

Function details: Go through Category:Unassessed articles (only deals with articles already tagged as belonging to a project). If an unassessed article is rated as a stub by ORES, tag the article as a stub. Example
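The bot itself is written in C#, but the decision step can be illustrated with a short Python sketch. The nested dict imitates the shape of an ORES v3 articlequality response (hand-written here, not live API output), so treat the exact shape and the helper name as assumptions:

```python
# Hedged sketch only -- the real bot is C#. `sample` imitates an ORES v3
# articlequality response; only the "prediction" field is consulted.
def is_predicted_stub(ores_response: dict, revid: str) -> bool:
    """True when the articlequality model predicts 'Stub' for this revision."""
    score = ores_response["enwiki"]["scores"][revid]["articlequality"]["score"]
    return score["prediction"] == "Stub"

sample = {
    "enwiki": {
        "scores": {
            "1234567": {"articlequality": {"score": {"prediction": "Stub"}}}
        }
    }
}
```

If the prediction is "Stub", the bot would then edit the talk page to set the class parameter in the project banner.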


Operator: GoingBatty (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 05:04, Friday, March 17, 2023 (UTC)

Function overview: Fix state parameter in {{WikiProject U.S. Roads}}

Automatic, Supervised, or Manual: Automatic

Programming language(s): AutoWikiBrowser

Source code available: AWB

Links to relevant discussions (where appropriate): Wikipedia:Bot_requests#Template:USRD topic adjustment

Edit period(s): One time run

Estimated number of pages affected: 3,451

Namespace(s): Talk pages

Exclusion compliant (Yes/No): Yes

Function details: Fix the |state= (and |type=) parameters in the {{WikiProject U.S. Roads}} banner as follows:

  • Find: \| *type *= *CR *\| *state *= *(NY|FL)-CRTF → Replace: |type=CR|state=$1
  • Find: \| *state *= *(NY|FL)-CRTF(?: *\| *type *= *CR)? → Replace: |state=$1|type=CR
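The task runs through AWB, but the two find-and-replace rules can be demonstrated in Python (helper name made up; note that Python writes the backreference as \1 where AWB uses $1):

```python
import re

# Made-up helper demonstrating the two AWB find-and-replace rules above.
RULES = [
    # |type=CR|state=NY-CRTF  ->  |type=CR|state=NY
    (re.compile(r"\| *type *= *CR *\| *state *= *(NY|FL)-CRTF"), r"|type=CR|state=\1"),
    # |state=FL-CRTF (optionally followed by |type=CR)  ->  |state=FL|type=CR
    (re.compile(r"\| *state *= *(NY|FL)-CRTF(?: *\| *type *= *CR)?"), r"|state=\1|type=CR"),
]

def fix_usrd_params(wikitext: str) -> str:
    for pattern, repl in RULES:
        wikitext = pattern.sub(repl, wikitext)
    return wikitext
```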

This bot will also use the custom module User:Magioladitis/WikiProjects to standardize the WikiProject names and use AWB's general fixes. Thank you for your consideration. GoingBatty (talk) 05:04, 17 March 2023 (UTC)


Operator: Eejit43 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 00:23, Tuesday, March 14, 2023 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): AutoWikiBrowser

Source code available: N/A (AWB)

Function overview: Replaces singular section headers with their plural forms.

Links to relevant discussions (where appropriate):

Edit period(s): One time run

Estimated number of pages affected: ~2,300+

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): No

Function details: This task will use AWB to change article headers of "External link" to "External links", per MOS:LAYOUTEL, and "Reference" to "References". It will also run AWB's general fixes.
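The task runs through AWB rather than a script, but the substitution is equivalent to a pair of multiline regex replacements. A Python sketch (function name hypothetical; heading spacing is preserved by the capture groups):

```python
import re

# Hypothetical sketch of the AWB header fix described above.
def pluralize_headers(wikitext: str) -> str:
    wikitext = re.sub(r"^(==+ *)External link( *==+)$", r"\1External links\2",
                      wikitext, flags=re.MULTILINE)
    wikitext = re.sub(r"^(==+ *)Reference( *==+)$", r"\1References\2",
                      wikitext, flags=re.MULTILINE)
    return wikitext
```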


I've updated this request to include "Reference" --> "References". While I doubt it does, please let me know if this needs to be a separate request. ~ Eejit43 (talk) 00:50, 14 March 2023 (UTC)

Operator: Qwerfjkl (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 11:38, Saturday, March 11, 2023 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): AWB or Bandersnatch

Source code available: Regex

Function overview: Change colour schemes for weather timelines.

Links to relevant discussions (where appropriate): Wikipedia:Bot requests#WP WikiProject Weather timelines,

Edit period(s): once

Estimated number of pages affected: Hard to estimate, but only around 10,000 articles use the timeline tag.

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): Yes

Function details: The bot would use the regex

Colors *= *(\n +.+)+

to match the colour parameter, and replace it with the new colours scheme as described at Wikipedia:Bot requests#WP WikiProject Weather timelines.
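For illustration, the substitution could look like the following in Python (the colour values shown are placeholders, not the actual scheme from the bot request):

```python
import re

# Placeholder colour scheme -- the real values are in the bot request.
NEW_COLORS = (
    "Colors =\n"
    "  id:TS value:rgb(0,0.98,0.96) legend:Tropical_Storm\n"
    "  id:C1 value:rgb(1,1,0.85) legend:Category_1"
)

def replace_colors(timeline_wikitext: str) -> str:
    # "Colors =" plus every following indented line is swapped wholesale.
    return re.sub(r"Colors *= *(\n +.+)+", NEW_COLORS, timeline_wikitext)
```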


What's Bandersnatch? – SD0001 (talk) 14:02, 11 March 2023 (UTC)
@SD0001, w:de:Benutzer:Schnark/js/bandersnatch. — Qwerfjkltalk 14:19, 11 March 2023 (UTC)

Operator: RoySmith (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 15:34, Wednesday, March 1, 2023 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python

Source code available:

Function overview: Applies move protection to DYK articles which are on the main page or in the queue to be placed on the main page soon.

Links to relevant discussions (where appropriate): Wikipedia talk:Did you know/Archive 188#Move protection

Edit period(s): Continuous

Estimated number of pages affected: 10 per day

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): No

Adminbot (Yes/No): Yes

Function details:

First, some definitions:

Nomination: A nomination template, i.e. a subpage of {{Did you know nominations}}.

Hook: A string starting with "..." and ending with "?". Optionally includes a tag such as "ALT1".

Target: An article referenced from a hook using a bolded wikilink. All hooks have one or more targets.

Hookset: A template containing a collection of hooks along with other metadata. One of {{Did you know}} (i.e. the current hookset), the 7 numerically named subpages of {{Did you know/Queue}}, or the 7 numerically named preparation areas ({{Did you know/Preparation area 1}}, etc.).

DYKToolsBot is already approved for a different task, but does not have admin rights. This new account (DYKToolsAdminBot) will handle tasks that require admin rights. They share the same code.

There are two distinct tasks proposed here, protect and unprotect. Both tasks are run as scheduled toolforge jobs. Currently both tasks run every 10 minutes, offset by a few minutes. The exact timing is not critical.

The protect task does:

Parse the main page and queue hooksets, extracting all the hooks. From the hooks, extract the targets which need protecting ("protectable targets"). These titles are indicated by wikilinks set in bold. There is typically one target per hook, but there can be more than one. For each protectable target, indefinite move=sysop protection will be applied. The page protection log messages will include a link to a page in the bot's userspace explaining the process.
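The "bolded wikilink" extraction can be sketched in Python (pattern and helper name are illustrative, not the bot's actual code):

```python
import re

# Illustrative only: find targets written as '''[[Title]]''' or
# '''[[Title|display text]]''' inside a hook's wikitext.
BOLD_LINK = re.compile(r"'''\[\[([^|\]#]+)(?:\|[^\]]*)?\]\]'''")

def hook_targets(hook_wikitext: str) -> list:
    return [m.group(1).strip() for m in BOLD_LINK.finditer(hook_wikitext)]
```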

The unprotect task does:

Queries the bot's user log with type=protect for the previous N days, where N is long enough to account for any hooks which have progressed through the normal promotion process plus extra time to account for intra-queue hook swapping. It's currently set to 9, but might need to be increased. The exact value is not critical. These are the "unprotectable targets". The current list of protectable targets is acquired as in the protect task. Any targets in the unprotectable set which are not also in the protectable set are unprotected.

I considered computing an expiration date and only protecting until then. The problem is that the expiration date is a moving target. Hooks often get shuffled around when problems are discovered. Sometimes hooks get unpromoted entirely after hitting a queue (or even when they're on the main page). Sometimes the queue processing schedule is disrupted by failure of the bot which manages that process (this has happened a couple of times in the past few weeks). A few times a year, queue processing toggles between 1 and 2 sets per day. Keeping track of all these possibilities and updating the expiration time would add significant complexity for no benefit. It's far simpler to use a declarative approach, in the style of puppet: periodically figure out what state each target should be in right now and make it so, regardless of history.
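The declarative reconciliation amounts to set arithmetic. A minimal sketch (names hypothetical; the real bot derives both sets from the live hooksets and its own protection log):

```python
# Hypothetical sketch of the reconciliation step. Protection is idempotent,
# so "to_protect" can simply be every current protectable target.
def reconcile(protectable, recently_protected):
    to_protect = set(protectable)
    to_unprotect = set(recently_protected) - to_protect
    return to_protect, to_unprotect
```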

This is currently running on testwiki. Reviewers should feel free to exercise the bot by editing the DYK queues on testwiki.

Known problems

On rare occasions, hook targets are written as templates such as one of the (many) {{Ship}} variants. The current code does not recognize these properly (github bug). This happens infrequently enough, it's difficult enough to do correctly (it requires a call out to Parsoid), and the consequences are mild enough (a page doesn't get the move protection it should) that I'm not going to make it a blocker for an initial deployment.

If a target was already move protected before entering the DYK pipeline, it will have that protection removed when it transitions out of DYK. The probability of this happening is so low, I'm going to ignore it. The alternative would be to maintain a database of pre-existing protections so they could be restored properly, which seems like more trouble than it's worth.

If enough protection log history isn't examined, it's possible to miss unprotecting a target which spent an abnormally long time in the DYK queues. If it happens, the target can be manually unprotected and the history window size increased.


  • "If a target was already move protected before entering the DYK pipeline, it will have that protection removed when it transitions out of DYK." Why would this be the case, as you say it "[q]ueries the bot's user log with type=protect for the previous N days"? Would it not unprotect only the pages that were protected by it? – SD0001 (talk) 03:05, 2 March 2023 (UTC)
    You set move=autoconfirmed, then the bot changes that to move=sysop. It'll lose your original protection when it migrates off the main page and the bot unprotects. But this is enough of a corner case, I'm not going to worry about it. -- RoySmith (talk) 14:12, 2 March 2023 (UTC)
    Every article going through DYK losing its move protection seems like a problem to be worried about. While I understand new articles are rarely protected, recently promoted GAs could be. Can this be fixed? If a database is too much trouble, you can use redis since the data here is easily represented as key-value pairs. – SD0001 (talk) 17:13, 7 March 2023 (UTC)
    I need to think a bit on this. I had previously assumed existing move protection was such a rare thing, it wasn't worth worrying about much. But I just did a quick scan of WP:Recent additions and found:
    • Recent additions/2022/January, protect_count=28
    • Recent additions/2022/February, protect_count=14
    • Recent additions/2022/March, protect_count=6
    • Recent additions/2022/April, protect_count=12
    • Recent additions/2022/May, protect_count=12
    • Recent additions/2022/June, protect_count=9
    • Recent additions/2022/July, protect_count=20
    • Recent additions/2022/August, protect_count=15
    • Recent additions/2022/September, protect_count=17
    • Recent additions/2022/October, protect_count=13
    • Recent additions/2022/November, protect_count=21
    • Recent additions/2022/December, protect_count=9
    The protect_counts are how many targets had any move protection in their page protection log at all. The ones I spot-checked either had that protection already expired by the time they got to DYK, or applied after DYK was over, but it's still more than I had expected to see. I'm working on some ideas of how to deal with this. -- RoySmith (talk) 13:57, 8 March 2023 (UTC)
  • "On rare occasions, hook targets are written as templates such as one of the (many) {{Ship}} variants." You could parse the HTML instead of wikitext. Scanning the HTML for <a> tags leading to article namespace can be easier than parsing wikitext and doesn't require parsoid. – SD0001 (talk) 03:07, 2 March 2023 (UTC)
    I'm not totally following you. The output of parsoid is HTML (sort of) so calling out to parsoid is indeed parsing HTML. But it would add complexity which isn't justified for an initial rollout. The logic for this is contained in Hook.targets(), so at least plugging it in later wouldn't be too disruptive. -- RoySmith (talk) 14:40, 2 March 2023 (UTC)
    Well, it turns out this was easier to do than I thought it would be. Pywikibot's Site.expand_text() handles everything. I suspect it's calling Parsoid under the covers, but haven't gone digging to verify that. In any case, it works just fine. -- RoySmith (talk) 04:26, 7 March 2023 (UTC)
    mw:API:Parse is what I meant, it gets you the HTML without going through parsoid. There's also mw:API:Expandtemplates which might be what Site.expand_text() uses under the hood. – SD0001 (talk) 15:03, 7 March 2023 (UTC)
    Well, in any case, it's working. Can this be approved for a trial? -- RoySmith (talk) 15:09, 7 March 2023 (UTC)
  • "If enough protection log history isn't examined, it's possible to miss unprotecting a target which spent an abnormally long time in the DYK queues." Why not set a liberal value for N, say 25 - since anyway at the processing step it will skip the pages that don't have protection any longer? I'm assuming the unprotect task only needs to be run at quite a lower frequency than every 10 minutes. – SD0001 (talk) 03:12, 2 March 2023 (UTC)
    Yeah, there's very little downside to making the history window longer. Setting it to 25 wouldn't be a problem. You're also correct that the unprotect task could run at a lower frequency, and that's easy to change. -- RoySmith (talk) 14:45, 2 March 2023 (UTC)
  • Has WP:AN been notified of this bot task per WP:ADMINBOT? Primefac (talk) 10:30, 8 March 2023 (UTC)
    Ooops, I didn't realize that was required. I just dropped a notification on WP:AN. -- RoySmith (talk) 13:48, 8 March 2023 (UTC)
  • The discussion linked looks like a local consensus to me. Policy is against pre-emptive protection and move-protecting DYKs looks like a solution in search of a problem. As far as I can tell, only one example was given of an article being moved while on DYK and that was generally considered a good move. HJ Mitchell | Penny for your thoughts? 14:00, 8 March 2023 (UTC)
    We already do protect some portions of the main page. For example {{DYK}} is protected, as are all images while they're on the main page. And Wikipedia:Bots/Requests for approval/TFA Protector Bot 3 appears to have established that TFA should be protected as well. -- RoySmith (talk) 14:18, 8 March 2023 (UTC)
  • Is page-move vandalism of DYK articles on the main page an issue? Our protection policy dictates that we do not protect pages preemptively. It also seems like a significant net negative for all DYK articles that have been protected in good faith to become unprotected, particularly GAs and BLPs. If the bot could instead restore any pre-existing protection rather than letting it expire, that would generally solve my concerns. Ivanvector (Talk/Edits) 16:38, 8 March 2023 (UTC)
  • Needs wider discussion. There's at least some level of pushback against this. It doesn't help either that the linked discussion was held at the local DYK talk page and closed by an involved editor. I suggest opening a discussion on WP:VPR to ensure this has consensus, as BRFA is not the right place for having such a discussion. – SD0001 (talk) 16:45, 8 March 2023 (UTC)
    @SD0001 OK, I'll start a discussion on VPR. I'm not sure what the process is here, but let's put this BRFA on hold until the VPR discussion concludes. In the meantime, I'm going to continue to play around on testwiki to explore some possible solutions to the technical issues which have come up here. -- RoySmith (talk) 18:06, 8 March 2023 (UTC)
    Posted at WP:VPR#Move protection for WP:DYK articles? -- RoySmith (talk) 18:13, 8 March 2023 (UTC)
    On hold till that discussion concludes. – SD0001 (talk) 16:38, 9 March 2023 (UTC)
  • Question: If the bot removes move protection and another admin applies it before N days have elapsed since the bot's protection, will the bot re-unprotect it? Animal lover |666| 17:08, 8 March 2023 (UTC)

Operator: Theleekycauldron (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 05:56, Monday, February 13, 2023 (UTC)

Function overview: Updates WP:DYKPC

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: User:GalliumBot/proctor/ (watch this space)

Links to relevant discussions (where appropriate):

Edit period(s): every 10 minutes

Estimated number of pages affected: 1

Namespace(s): Wikipedia

Exclusion compliant (Yes/No): n/a

Function details: Pretty simple; this bot iterates over all subcategories of Category:Passed DYK nominations, figures out who did the promotion, tallies it all up, and updates WP:DYKPC. It also maintains a couple userspace lists for convenience. theleekycauldron (talkcontribs) (she/her) 05:56, 13 February 2023 (UTC)


  • Does it correctly handle the counts for users who were renamed? – SD0001 (talk) 17:51, 16 February 2023 (UTC)
  • Also, is the page's lede text editable on-wiki? I suggest putting that text into a /header subpage and transcluding, so that changes won't be overwritten by the bot. – SD0001 (talk) 17:53, 16 February 2023 (UTC)
    @SD0001: Header has been implemented! Yeah, promotion counts are tallied based off of version history, so it accounts for renames. theleekycauldron (talkcontribs) (she/her) 22:31, 16 February 2023 (UTC)
    It also, on a case-by-case basis, accounts for sockpuppetry! theleekycauldron (talkcontribs) (she/her) 22:32, 16 February 2023 (UTC)
  • Just to double-check, this only edits a single page, yes? Primefac (talk) 10:36, 8 March 2023 (UTC)

Operator: Magnus Manske (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 11:17, Wednesday, November 30, 2022 (UTC)

Function overview: The bot finds pages with links to a redirect page that links back to the original page:

[[Page A]] links to [[Page B]] which redirects to [[Page A]]

The bot will try and replace the link in question with plain text.

Automatic, Supervised, or Manual: Automatic

Programming language(s): PHP

Source code available:

Links to relevant discussions (where appropriate): Diff from a recent circular redirect discussion

Edit period(s): Daily or weekly

Estimated number of pages affected: There are ~300K pages that have circular redirect links, but only ~10% (rough estimate) have a "simple" case that can be addressed by the bot as it is now. Capabilities to solve more complex cases might be added in the future.

Namespace(s): Main

Exclusion compliant Yes

Adminbot No

Function details: Example edit, all test edits.


  • Note: This bot appears to have edited since this BRFA was filed. Bots may not edit outside their own or their operator's userspace unless approved or approved for trial. AnomieBOT 11:23, 30 November 2022 (UTC)
    Magnus, please do not run the bot again until it has approval to edit. Primefac (talk) 11:45, 30 November 2022 (UTC)
  • Could you please point to a discussion where this is seen as a Good Thing? I seem to recall discussions in the past where circular redirects were usually acceptable as they indicated an {{r with potential}} type situation. Primefac (talk) 11:45, 30 November 2022 (UTC)
  • I think that would depend on who you are discussing the matter with. (I'm actually responsible for prompting Magnus about this problem.) I think that circular redirects are worse than useless. For a reader who clicks on one, there is frustration, just as bad as a page self-link. They probably click again, using the servers uselessly. Where the circular redirect is created from a redlink, rather than a stub being created, WP loses a growth point. I do not buy the argument that {{r with potential}} is any sort of substitute for a redlink, in terms of getting articles created.
Talking to people who've considered the issue solely from a technical point of view, it seems this is an "old chestnut" - no obvious fix. Looking at it socially, there is indeed no fix that does not undo some good-faith edits. But there is a large backlog, now affecting 4% of all articles, I believe.
If the backlog can be cleared, I hope we can move onto a more sensible approach. By that I mean this issue is too large to be referred to Redirects for Discussion in each case. There should be some triage, because some of the redirects created are not that useful, as some of the (red)links introduced are unhelpful. But there has to be an initial clearance. Charles Matthews (talk) 15:57, 30 November 2022 (UTC)Reply[reply]
  • As a small data point, I'll add that WP:XFDC unlinks circular redirects when you close a RfD as retarget. Legoktm (talk) 22:11, 30 November 2022 (UTC)
  • Why isn't it better to leave a redlink than to remove the link completely? Mike Christie (talk - contribs - library) 12:16, 1 December 2022 (UTC)
    A redlink to what? A => B => A, removing link A => B, leaving plain text behind. Magnus Manske (talk) 16:10, 1 December 2022 (UTC)
    I was thinking that since a circular redirect isn't red and hence appears to not require an article to be created, it would be better to make it into a red link. Of course that's nothing to do with the wikitext in the article with a redirect, it's a function of whether there's a page (redirect or not) at the target of the link. The bot would have to delete redirect pages, not edit links, to make this happen, and I understand that is not what this bot is designed to do. Mike Christie (talk - contribs - library) 22:03, 2 December 2022 (UTC)
For the avoidance of doubt, this bot is not for removing redirects. Charles Matthews (talk) 21:29, 2 December 2022 (UTC)Reply[reply]
  • What about pages that link to a page which itself links to a sub-section on the original page? ProcrastinatingReader (talk) 21:34, 13 December 2022 (UTC)
  • Noting that I've mass rollback'd the test edits, as several of them contained errors (where links contained a pipe, the replacement did not remove the pipe) ProcrastinatingReader (talk) 16:58, 16 December 2022 (UTC)
  • @Magnus Manske: I've got a couple of random comments:
    • I'm generally opposed to using regex to parse wikitext. It's always tempting, but it's usually more complicated than it appears at first, and I strongly suspect wikitext is theoretically impossible to parse correctly in the general case with a regex. The kinds of errors spotted by ProcrastinatingReader will keep cropping up. This kind of thing should be done by a real wiki parser. I don't know what parsing tools are available in PHP, but Parsoid is always an option.
    • I'm not familiar with the history, but it sounds like this is something which has been considered before and rejected. Perhaps a slightly different take would be useful, however. Use the same code to detect when this happens, only on recent edits. Then have the bot drop a note on the talk page of the person who created the cycle: "This recent edit of yours <include link to diff> created a circular redirect. That's not always a problem, but it can be. Please take a look and see if the link you added is correct". Adjust the wording as appropriate. Keep track of how many of those alerts result in the link being removed, and come back with statistics which will tell us if this is actually useful or not. Or perhaps expose some deeper pattern which can be used to filter which cycles are OK and which are not. -- RoySmith (talk) 13:59, 28 January 2023 (UTC)
    This might be helpful as a multi-fold approach, à la the status quo for disambiguation pages:
    Best, EpicPupper (talk) 04:59, 5 February 2023 (UTC)
    Most cases on enwiki are ordinary wikilinks. These seem straightforward to handle with regex, since page titles cannot contain pipes, square brackets, or number signs. For example, the redirect page title Jupiter (planet) can be linked from any name matching \s*[Jj]upiter \(planet\)\s*; the variations contain leading or trailing spaces and/or a lowercase initial letter. Hypothetically, we would remove links from the target (Jupiter) using the following regex:
    • \[\[(\s*[Jj]upiter \(planet\)\s*)\]\]$1
    • \[\[\s*[Jj]upiter \(planet\)\s*\|(.*?)\]\]$1
    The tester's mistake was in forgetting to remove the pipe when delinking a piped link.
    I agree that a wikitext parser will be less error-prone, but all of the examples in the test edits could be handled using the above regex. If a circular link remover for such cases is working correctly, then the page User:LaundryPizza03/CircularRedirectTest should become identical to User:LaundryPizza03/CircularRedirectTest/expected result. –LaundryPizza03 (d) 06:56, 2 March 2023 (UTC)
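A Python sketch of these delinking rules (the bot itself is PHP; the helper name is hypothetical). It builds the title pattern the way described above (escaped title, optional surrounding spaces, case-insensitive first letter) and keeps the display text when a piped link is removed:

```python
import re

# Illustrative stand-in for the PHP bot: unlink direct wikilinks to one
# redirect title. Assumes the title starts with a letter (simplification).
def unlink_circular(wikitext: str, redirect_title: str) -> str:
    first, rest = redirect_title[0], redirect_title[1:]
    title = "[%s%s]%s" % (first.upper(), first.lower(), re.escape(rest))
    # [[Title]] -> Title (display text preserved as written)
    wikitext = re.sub(r"\[\[\s*(%s)\s*\]\]" % title, r"\1", wikitext)
    # [[Title|text]] -> text (the pipe is removed along with the brackets)
    wikitext = re.sub(r"\[\[\s*%s\s*\|(.*?)\]\]" % title, r"\1", wikitext)
    return wikitext
```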

Operator: William Avery (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 18:08, Friday, September 2, 2022 (UTC)

Function overview: A template, {{plain row headers}}, will be placed immediately before every table that currently uses "plain row headers" styling. The name of the CSS class used to achieve the plain row headers styling will be changed from "plainrowheaders" to "plain-row-headers". If a table has the "plainrowheaders" CSS class, but contains no row headers to be thus styled, the "plainrowheaders" CSS class will be removed from the table.

For background to this, and the motivation, see Wikipedia:TemplateStyles.

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: pywikibot script

Links to relevant discussions (where appropriate):

For background see:

Edit period(s): Total edit time will be in the region of 9 days (~125,000 articles ÷ 600 edits per hour ÷ 24 hours a day), but the edits will occur over a longer period than that. My plan is to concentrate on individual, heavily affected subject areas in turn. Music, films, TV, theatre, sports, lists of historic buildings and species are areas where much data is tabulated in articles. I intend to do trials in each area before running batches of ~10,000 articles. This should also help shorten any period of watchlist disruption for individual editors down to a day or two.

After the initial processing, there will need to be further runs on a smaller scale, as editors will still be using the current construct, and pages may have been deep reverted for reasons unrelated to this processing.

Estimated number of pages affected: 125,000 per this search. Obviously there are very few pages with the {{plain row headers}} template in place as yet.

Namespace(s): Mainspace/Articles

Exclusion compliant: Yes, per pywikibot

Function details: Each table in the page is processed. This only applies to tables started with {|, not templates that output tables.

If the class attribute of the table contains the class name "plainrowheaders", that classname is replaced with "plain-row-headers".

If the table's class attribute now contains "plain-row-headers", several successive steps are taken to discover whether the table in fact makes use of the class, and therefore requires the {{plain row headers}} template.

  1. Each table header in the table, as parsed by mwparserfromhell, is examined for table headers with "scope=row".
  2. Table headers that start e.g. !scope=row{{Some template| may also be present. mwparserfromhell doesn't see the attribute because there is no following pipe. A regular expression can detect these, and output a warning with the name of the template. (Usually the template should be invoked with a parameter such as "rowheader=true", rather than used in such a fashion.)
  3. The table body markup may contain a template that is known to emit table headers with "scope=row", such as {{Single chart}}. These can be tested for with a regular expression. Some of these templates, such as {{Episode list}}, are intended for use within templates that emit a whole table, but they turn up in plain tables.
  4. If the markup of the table body looks like it contains templates (i.e. includes "{{"), the templates can be subst'ed and the resultant markup reparsed, as at step one. In practice this is only necessary for relatively few of the tables.
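A greatly simplified stand-in for steps 1 and 2 above (the bot uses mwparserfromhell; this regex-only sketch just shows the check being made):

```python
import re

# Simplified: a row header is a "!" cell whose attributes include scope=row.
# Also catches the no-pipe form !scope=row{{Some template|...}} from step 2.
HAS_ROW_HEADER = re.compile(r'^!\s*scope\s*=\s*"?row"?\s*[|{]', re.MULTILINE)

def uses_plain_row_headers(table_wikitext: str) -> bool:
    return bool(HAS_ROW_HEADER.search(table_wikitext))
```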

Each table using plain row header styling should be preceded by its own {{plain row headers}}, so the processing keeps track of whether such a template has been encountered since the last table. It assumes that any such template is the one that belongs to the current table.

If no table header cells with scope=row were found or the "wikitable" class name is not among the table's classes, the "plain-row-headers" class is not producing any effect, and is removed from the table. Otherwise, if none is present, the styling template {{plain row headers}} is inserted before the table.

Care has been taken to ensure that if the processing is run for a second time on a page that has already been processed, it makes no changes.

To simplify checking, no standard fixes or cosmetic code changes are included.

Test edits have been carried out in the bot's userspace. e.g. here and here.

Division of work into batches

I use my own extension of the pywikibot.pagegenerators module, which can generate lists of pages to be processed from a custom database table, in this case prh_batch. I populate this table from the search results; the search needs to be divided into slices because there is a limit of 10,000 on results. Once the list of pages is in the database table I can run queries against the enwiki_p database replica to assign them to WikiProjects.


Given the documented caveat that the ability to affect page content outside the template should not be relied on,[1] I do not think this task should proceed. — JJMC89(T·C) 01:19, 3 September 2022 (UTC)

Right, I thought that would be discussed, hence why I added "The wider community was notified and nothing of interest was discussed. In other discussion from the content transform team members, the HTML emitted does not make this an issue." to the documentation of the template. Absolute crickets on the task of interest (phab:T176272) where I made clear what I intended. When I reviewed the background of why that caveat is there, it was more of a "wait and see" than the wording implies on that template. See the revision after the one you linked to (why did you add it as a permalink excluding that revision?) which points to phab:T155813#3037085 and related discussion.
Separately, the discussion there about scoped CSS is more or less irrelevant today and barely relevant when it was first discussed as some sort of thing that would be beneficial for VE. Though it seems to be making a comeback on the design side finally ([1]), it's been practically dead since around the time TemplateStyles was first discussed. Even then, it doesn't seem like a valuable restriction for end-users like us -- it was proposed entirely as a convenience for VE, and TBH looking at what it was suggested for in that context I don't think it's all that pertinent there either.
To go a step further, there is a template (er, templates) that does similar today though at much smaller scale for this exact use case, which is Template:Import-blanktable/Template:Row hover highlight (~600 uses). Which, coincidentally, was recommended for use by one of the engineers who participated in the RFC linked above for how TemplateStyles would work.
At worst, we have to go through N number of articles to remove this in the future if it completely breaks VE or some other system at some arbitrary time in the future, or WMF can't somehow work around it. Izno (talk) 02:41, 3 September 2022 (UTC)

Against. You can realize this idea simply and quickly with a CSS rule (as global CSS for this local Wikipedia, or in the MediaWiki software):

th[scope="row"] {
	font-weight: normal;
}
✍️ Dušan Kreheľ (talk) 10:47, 28 September 2022 (UTC)

Not relevant. This is a case where we are trying to move what is in MediaWiki:Common.css to WP:TemplateStyles. Izno (talk) 22:50, 28 September 2022 (UTC)

{{BAG assistance needed}} User:Izno, who requested this task, has expressed a concern on the noticeboard at the length of time that this request has gone without attention from a BAG member, and a willingness to provide any further input required here. I am therefore requesting BAG assistance. William Avery (talk) 18:17, 24 October 2022 (UTC)


  1. ^ mw:Extension:TemplateStyles#Caveats: Styles included by a template can currently affect content on the page outside of the content generated by that template, but this ability may be removed in the future and should not be relied upon.

Needs wider discussion. Per WP:BOTREQUIRE#4, please show broader consensus to perform this task, and to perform it by bot at large scale. I do not see any immediate discussion that involves uninvolved editors expressing support or opposition to this task. I see a lot of technical details, work steps, todo lists, and work progress notifications, etc. concentrated on technical pages. Being (as far as I know) the first time a bot is "implementing" TemplateStyles this way places this BRFA as a precedent and puts an even larger onus on BAG to establish a clear consensus for the task. I see general support for enabling and careful use of TemplateStyles as a whole. I see general support for making a guideline. Since then it has been expanded to discuss a case like this with tables, although I don't see any direct discussion. It has also been expanded to include even more workflow for conversion, which is again a "how to" rather than "should". So, as far as I can locate previous discussions I can link to and understand the intent here, this task takes it several steps further from previous explicit consensus - it (1) styles outside its template (i.e. not "only style the associated template's output"), (2) styles tables (i.e. not "a specific template or group of templates"), (3) does this on a case-by-case basis (i.e. only tables that are manually and specifically classed "plainrowheaders") and (4) automates the process (i.e. currently, only this BRFA itself, which besides the proposer and implementer has 2 editors opposing based on arguments with sufficient merit to consider). I'm sure I'm grossly oversimplifying, but that's kind of the point - consensus should be clear and I shouldn't need to dig this deep to understand if the task is appropriate during WP:BOTAPPROVAL. —  HELLKNOWZ  TALK 19:38, 24 October 2022 (UTC)

@Hellknowz Ok. Do you have a recommendation on where? I am inclined to WP:VPT or WT:TemplateStyles but if you think this should be done at WP:VPPRO, I can take it there.
You really did go down a rabbit hole there though. Anyway, the below is for your edification:
Regarding item 3 in that list; if it were all <table>s then MediaWiki:Common.css would be the correct home for it (which is where the relevant CSS lives today, and in MediaWiki:Mobile.css, but I think that is an artifact of when plainrowheaders was added vice when TemplateStyles was added and not any other reason). Regarding item 4 in that list, it is infeasible to do the change relevant to this BRFA any other way (well, I could do it in AWB but it would take a while and be more likely to cause errors). Regarding 2 editors opposing based on arguments with sufficient merit to consider, the latter editor's comment has 0 relevance in that it's basically like "you can put it in Common.css"... which is where it is today and which is sufficiently answered by MediaWiki talk:Common.css/to do#description.
I think item 2 of your list also isn't interesting as this is not a new addition of course, it is moving CSS from place X to place Y. And has existing precedent already in the form of a much lesser-used template.
Again, strictly for edification. I await your recommendation. :) Izno (talk) 02:54, 14 November 2022 (UTC)
@Izno: I guess any VP discussion would probably be sufficient. If you want to future-proof it with an RfC or something, that's cool. It really is up to you guys - I (or, technically, you) just need a discussion or guideline/policy we can point to and say - here is consensus for this task. Also, can I strongly suggest a couple examples of exactly what the changes will look like so that no one has to guess what all these technical things mean.
Thanks for clarifying on the various points. As you can probably tell, I didn't try to conclude whether any of these points are actually fulfilled. To be clear, these are not necessarily problems; these are just questions about the scope of the task where I cannot find clear consensus (or at least an obvious answer). It's more to give you an indication of what I saw as an outside observer and what someone else may or may not support or oppose or disregard. —  HELLKNOWZ  TALK 11:01, 14 November 2022 (UTC)

Bots in a trial period

Operator: EpicPupper (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 02:55, Thursday, March 2, 2023 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s):

Source code available:

Function overview: Replace AMP links in citations

Links to relevant discussions (where appropriate): BOTREQ, Village Pump

Edit period(s): Weekly

Estimated number of pages affected: Unknown, estimated to be in the range of hundreds of thousands

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Using the AmputatorBot API, replaces AMP links with canonical equivalents. This task runs on all pages with citation templates which have URL parameters (e.g. {{cite news}}, {{cite web}}, etc).
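For illustration only, the substitution step might look something like the sketch below. The regex, the citation-template handling, and the helper name are simplified assumptions, not the bot's actual code, and the AmputatorBot API lookup itself (which supplies the AMP-to-canonical mapping) is elided.

```python
import re

# Matches |url= parameters in citation templates (simplified assumption;
# real wikitext parsing would need to handle more edge cases).
URL_PARAM = re.compile(r"(\|\s*url\s*=\s*)(\S+)")

def replace_amp_links(wikitext: str, canonical: dict[str, str]) -> str:
    """Swap AMP URLs in |url= parameters for their canonical equivalents.

    `canonical` maps AMP URL -> canonical URL, e.g. as resolved via the
    AmputatorBot API (that lookup is omitted here). Unknown URLs are
    left unchanged.
    """
    def swap(m: re.Match) -> str:
        return m.group(1) + canonical.get(m.group(2), m.group(2))
    return URL_PARAM.sub(swap, wikitext)
```

Citations whose URLs have no known canonical form pass through untouched, which keeps the edit a no-op in the worst case.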


Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 10:27, 8 March 2023 (UTC)

Operator: JPxG (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 01:22, Sunday, January 8, 2023 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python

Source code available:

Function overview: This bot's purpose is to carry out some basic tasks for the Signpost, of which I am the editor-in-chief and publication manager. Right now, I am engaged in updating the historical indices at Module:Signpost: this includes things like adding articles that were previously not indexed, and adding tags for untagged articles. I'm doing this by using my Wegweiser scripts to compose data and manually pasting the output into the module's index pages, which is extremely tedious (i.e. I have to navigate to the 2005 index to paste in output for tagging the arbitration reports, then the 2006, then the 2007, then the 2008, etc up to 2015, then navigate back to the 2005 index to paste in output for tagging the discussion reports, etc, etc). Currently, my edits to the modules look like this. I also intend to update the indices with pageview counts for intervals like "30 days after publication", etc, which can be used to tabulate information that Template:Graph:Pageviews cannot handle (it can only generate graphs, it can't output single numbers!).

Another issue is that, currently, the indices are built using User:Mr. Stradivarius's SignpostTagger script: this means that every time we publish a new issue, the articles don't show up in the index until someone individually goes to each of them and runs the SignpostTagger userscript. This is extremely sub-optimal (single-issue comment pages, for example, won't render properly until each article is in the index) and error-prone (some inevitably get missed when individually clicking on 24 articles, opening the tagger for each of them, running it and checking for completion).

Links to relevant discussions (where appropriate): Module_talk:Signpost#Adding_authorship

Edit period(s): Manually, whenever large maintenance tasks are carried out, and at time of publication, to update indices.

Estimated number of pages affected: Unknown (there are only 19 module indices, but some work may need to be done on Signpost articles as well)

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: As described above: one function of the bot would be to integrate information into the Module:Signpost indices from the obsolete Signpost article category system (e.g. Category:Wikipedia Signpost Special report archives 2005, Category:Wikipedia Signpost Special report archives 2006, Category:Wikipedia Signpost Special report archives 2007, Category:Wikipedia Signpost Special report archives 2008, Category:Wikipedia Signpost Special report archives 2009, Category:Wikipedia Signpost Special report archives 2010, Category:Wikipedia Signpost Special report archives 2011, Category:Wikipedia Signpost Special report archives 2012, Category:Wikipedia Signpost Special report archives 2013, Category:Wikipedia Signpost Special report archives 2014, Category:Wikipedia Signpost Special report archives 2015, Category:Wikipedia Signpost Technology reports archives, Category:Wikipedia Signpost Technology reports archives 2005, Category:Wikipedia Signpost Technology reports archives 2006, Category:Wikipedia Signpost Technology reports archives 2007, Category:Wikipedia Signpost Technology reports archives 2008, Category:Wikipedia Signpost Technology reports archives 2009, Category:Wikipedia Signpost Technology reports archives 2010, Category:Wikipedia Signpost Technology reports archives 2011, Category:Wikipedia Signpost Technology reports archives 2012, Category:Wikipedia Signpost Technology reports archives 2013, Category:Wikipedia Signpost Technology reports archives 2014, Category:Wikipedia Signpost Technology reports archives 2015, ad nauseam).

The way it does this is by hitting the API to retrieve the categories' contents, formatting it into an array, then hitting the API to retrieve the contents of the Lua tables, and formatting that into JSON. Next, it compares the category contents to the Lua table's contents, checks to see if the Lua module has the appropriate tag for that category (e.g. Category:Wikipedia Signpost Arbitration report archives 2005 would be "arbitrationreport"), and if not, adds them. Then it converts this JSON back to the proper format for a Lua table, and submits this to the server as an edit.

For pageview tracking, it does the same as the above, except instead of getting page names from categories, it gets them from the action=query&list=allpages API endpoint, and instead of adding tags, it adds "views" key-value pairs to the dict.

For fleshing out indices, it does much the same, with respect to the Lua tables; the difference is that, rather than comparing those entries' tags to membership in a category, it compares them to a list of all pages from that year obtained through the action=query&list=allpages API endpoint (and adds new entries to the Lua table if they are not already present).
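The tag-merge step described above can be sketched as follows. The index layout here (title → entry dict with a "tags" list) is an illustrative assumption; the real data lives in Lua tables at Module:Signpost and is round-tripped through JSON as described.

```python
def merge_tags(index: dict[str, dict], category_members: list[str], tag: str) -> dict[str, dict]:
    """Add `tag` to every index entry whose page appears in the category.

    Pages found in the category but missing from the index are added as
    new entries, mirroring the index-fleshing-out step. Idempotent: a tag
    already present is not duplicated.
    """
    for title in category_members:
        entry = index.setdefault(title, {"tags": []})
        if tag not in entry["tags"]:
            entry["tags"].append(tag)
    return index
```

Running this once per (category, tag) pair, then serializing the result back to a Lua table literal, matches the workflow described in the paragraphs above.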

I have tested this set of scripts at the following diff: [2] (this output is valid and works fine with the module).


Trusted user. Approved for trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. Trial length is whatever you feel is appropriate. Enterprisey (talk!) 04:34, 8 January 2023 (UTC)
@Enterprisey: Despite WegweiserBot's valiant effort, he remains unable to edit the module pages due to not being autoconfirmed. Can he get +AC (or +bot) flags? jp×g 05:13, 8 January 2023 (UTC)
This appears to have been sorted. Primefac (talk) 07:24, 8 January 2023 (UTC)

Operator: Wbm1058 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 02:36, Saturday, June 25, 2022 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): PHP

Source code available: refreshlinks.php, refreshmainlinks.php

Function overview: Purge pages with recursive link update in order to refresh links which are old

Links to relevant discussions (where appropriate): User talk:wbm1058#Continuing null editing, Wikipedia talk:Bot policy#Regarding WP:BOTPERF, phab:T157670, phab:T135964, phab:T159512

Edit period(s): Continuous

Estimated number of pages affected: ALL

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): Yes

Function details: This task runs two scripts to refresh English Wikipedia page links. refreshmainlinks.php null-edits mainspace pages whose page_links_updated database field is older than 32 days, and refreshlinks.php null-edits all other namespaces whose page_links_updated database field is older than 80 days. The 32- and 80-day figures may be tweaked as needed to ensure more timely refreshing of links or reduce load on the servers. Each script is configured to edit a maximum of 150,000 pages on a single run, and restart every three hours if not currently running (thus each script may run up to 8 times per day).
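A sketch of the selection step, assuming the standard MediaWiki page table schema (page_namespace, page_title, page_links_updated in YYYYMMDDHHMMSS format). The actual scripts are PHP; this is illustrative only.

```python
from datetime import datetime, timedelta

# Cutoffs and batch size from the task description; tweakable as noted above.
MAINSPACE_DAYS = 32
OTHER_DAYS = 80
BATCH_LIMIT = 150_000

def stale_pages_query(days: int, mainspace: bool) -> tuple[str, str]:
    """Build the replica-database query for pages whose links are stale.

    Returns (sql, cutoff) where cutoff is a MediaWiki-style timestamp to
    bind to the %s placeholder.
    """
    cutoff = (datetime.utcnow() - timedelta(days=days)).strftime("%Y%m%d%H%M%S")
    ns_clause = "page_namespace = 0" if mainspace else "page_namespace != 0"
    sql = (
        "SELECT page_namespace, page_title FROM page "
        f"WHERE {ns_clause} AND page_links_updated < %s "
        f"ORDER BY page_links_updated LIMIT {BATCH_LIMIT}"
    )
    return sql, cutoff
```

Ordering by page_links_updated means each run works on the stalest pages first, so a run cut short still makes progress on the backlog.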

Status may be monitored by these Quarry queries:


I expect speedy approval as a technical request, since this task only makes null edits. The task has been running for over a month. My main reason for filing this is to post my source code and document the process, including links to the various discussions about it. – wbm1058 (talk) 03:02, 25 June 2022 (UTC)

  • Comment: This is a very useful bot that works around long-standing feature requests that should have been straightforward for the MW developers to implement. It makes sure that things like tracking categories and transclusion counts are up to date, which helps gnomes fix errors. – Jonesey95 (talk) 13:30, 25 June 2022 (UTC)
  • Comment: My main concerns are related to the edit filter; I'm not sure whether that looks at null edits or not. If it does, it's theoretically possible that we might suddenly be spammed by a very large number of filter log entries, if and when a filter gets added that widely matches null edits (and if null edits do get checked by the edit filter, we would want the account making them to have a high edit count and to be autoconfirmed, because for performance reasons, many filters skip users with high edit counts).

    To get some idea of the rate of null edits: the robot's maximum editing speed is 14 edits per second (150000 × 8 in a day). There are 6,633,771 articles, 57,812,376 pages total (how did we end up with almost ten times as many pages as articles?); this means that the average number of edits that need making per day is around 825000 per day, or around 9.5 per second. Wikipedia currently gets around 160000 edits per day (defined as "things that have an oldid number", so including moves, page creations, etc.), or around 2 per second. So this bot could be editing four times as fast as everyone else on Wikipedia put together (including all the other bots), which would likely be breaking new ground from the point of view of server load (although the servers might well be able to handle it anyway, and if not I guess the developers would just block its IP from making requests) – maybe a bit less, but surely a large proportion of pages rarely get edited.
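The back-of-the-envelope numbers in the paragraph above can be reproduced directly:

```python
SECONDS_PER_DAY = 24 * 60 * 60

max_per_day = 150_000 * 8                        # per-script cap x up to 8 runs/day
max_rate = max_per_day / SECONDS_PER_DAY         # ~13.9, i.e. "14 edits per second"

total_pages = 57_812_376
needed_per_day = total_pages / 70                # ~70-day refresh cycle -> ~825,000/day
needed_rate = needed_per_day / SECONDS_PER_DAY   # ~9.5 per second

human_rate = 160_000 / SECONDS_PER_DAY           # ~2/s: all of enwiki's ordinary edits
```

The 70-day cycle is an approximation of the combined 32/80-day cutoffs; the conclusion (the bot's ceiling is several times the site-wide edit rate) holds either way.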

    As a precaution, the bot should also avoid null-editing pages that contain {{subst: (possibly with added whitespace or comments), because null edits can change the page content sometimes in this case (feel free to null-edit User:ais523/Sandbox to see for yourself – just clicking "edit" and "save" is enough); it's very hard to get the wikitext to subst a template into a page in the first place (because it has a tendency to replace itself with the template's contents), but once you manage it, it can lie there ready to trigger and mess up null edits, and this seems like the sort of thing that might potentially happen by mistake (e.g. Module:Unsubst is playing around in similar space; although that one won't have a bad interaction with the bot, it's quite possible we'll end up creating a similar template in future and that one will cause problems). --ais523 23:06, 6 July 2022 (UTC)

    • While this task does not increase the bot's edit count, it has performed 7 other tasks and has an edit count of over 180,000 pages which should qualify as "high". wbm1058 (talk) 03:38, 8 July 2022 (UTC)
    • There are far more users than articles; I believe User talk: is the largest namespace and thus the most resource-intensive to purge (albeit perhaps with a smaller average page size). wbm1058 (talk) 03:38, 8 July 2022 (UTC)
    • The term "null edit" is used here for convenience and simplification; technically the bot purges the page cache and forces a recursive link update. This is about equivalent to a null edit, but I'm not sure that it's functionally exactly the same. – wbm1058 (talk) 03:38, 8 July 2022 (UTC)
      • Ah; this seems to be a significant difference. A "purge with recursive link update" on my sandbox page doesn't add a new revision, even though a null edit does. Based on this, I suspect that purging pages is lighter on the server load than an actual null edit would be, and also recommend that you use "purge with recursive link update" rather than "null edit" terminology when describing the bot. --ais523 08:32, 8 July 2022 (UTC)
        • Yes, and just doing a recursive link update would be even lighter on the server load. The only reason my bot forces a purge is that there is currently no option in the API for only updating links. See this Phabricator discussion. – wbm1058 (talk) 12:42, 8 July 2022 (UTC)
    • As I started work on this project March 13, 2022 and the oldest page_links_updated date (except for the Super Six) is April 28, 2022, I believe that every page in the database older than 72 days has now been null-edited at least once, and I've yet to see any reports of problems with unintended substitution. wbm1058 (talk) 03:38, 8 July 2022 (UTC)
      • This is probably a consequence of the difference between purges and null edits; as long as you stick to purges it should be safe from the point of view of unintended substitution. --ais523 08:32, 8 July 2022 (UTC)
    • To make this process more efficient the bot bundles requests into groups of 20; each request sent to the server is for 20 pages to be purged at once. wbm1058 (talk) 03:38, 8 July 2022 (UTC)
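The batching just described can be sketched as follows. action=purge with forcerecursivelinkupdate is the real MediaWiki API call; the helper itself is illustrative, not the bot's PHP code.

```python
def purge_batches(titles: list[str], batch_size: int = 20):
    """Yield API request bodies that purge pages `batch_size` at a time,
    each forcing a recursive link update (action=purge)."""
    for i in range(0, len(titles), batch_size):
        yield {
            "action": "purge",
            "forcerecursivelinkupdate": "1",
            # Page titles cannot contain "|", so it is safe as a separator.
            "titles": "|".join(titles[i:i + batch_size]),
            "format": "json",
        }
```

Each yielded dict would be POSTed to the API endpoint; batching 20 titles per request cuts the number of HTTP round-trips by a factor of 20.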
  • Comment: I've worked the refreshlinks.php cutoff from 80 down to 70 days; the process may be able to hold it there. I've been trying to smooth out the load so that roughly the same number of pages are purged and link-refreshed each day. – wbm1058 (talk) 11:49, 8 July 2022 (UTC)
  • Note. This process is dependent on my computer maintaining a connection with a Toolforge bastion. Occasionally my computer becomes disconnected for unknown reasons, and when I notice this I must manually log back in to the bastion. If my computer becomes disconnected from the bastion for an extended time, this process may fall behind the expected page_links_updated dates. – wbm1058 (talk) 11:55, 12 July 2022 (UTC)
  • Another note. The purpose/objective of this task is to keep the pagelinks, categorylinks, and imagelinks tables reasonably updated. Regenerating these tables using the rebuildall.php maintenance script is not practical for English Wikipedia due to its huge size. Even just running the RefreshLinks.php component of rebuildall is not practical due to the database size (it may be practical for smaller wikis). The goal of phab:T159512 (Add option to refreshLinks.php to only update pages that haven't been updated since a timestamp) is to make it practical to run RefreshLinks.php on English Wikipedia. My two scripts find the pages that haven't been updated since a timestamp, and then purge these pages with recursive link updates. A recursive link update is what refreshLinks.php does. – wbm1058 (talk) 14:42, 16 July 2022 (UTC)
  • Approved for trial (30 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Let's see if anything breaks. Primefac (talk) 16:24, 6 August 2022 (UTC)
    @Primefac: This task just purges the page cache and forces recursive link updates, so there are no relevant contributions and/or diffs for me to provide a link to. But I see that text is coming from the {{BotTrial}} template, so you probably didn't intend to make that request. As to "anything breaking", the bot went down sometime after I left on wikibreak, and now that I'm back it's catching up. In other words, the task as currently configured "breaks" easily and requires a lot of minding to keep it running. Perhaps it would be more reliable if I figured out how to set it up as a tool running from my Toolforge admin console. – wbm1058 (talk) 15:11, 25 August 2022 (UTC)
    To improve reliability, I suggest running the task on the toolforge grid. When running on the grid, the server running your code and the database are on the same hi-speed network. You appear to have tunnelled the toolforge database to local port 4711. This setup is only intended for development-time debugging and will be unreliable for long-running tasks, as you have discovered.
    Also, I suggest using a significantly smaller limit than 150000 – that is a very large number of titles to expect from a single database call, and could cause timeouts and/or put too much pressure on the database. Instead, process just 5-10k titles at a time, and run the script more frequently. – SD0001 (talk) 19:18, 29 August 2022 (UTC)
    @SD0001 and Primefac: I set that up; now I'm trying to figure out what to do with it. Apparently the grid is legacy and deprecated, and the Jobs framework and Kubernetes are preferred for new bot setups. But before I automate this task on Toolforge I need to set it up there so I can manually run it. Per the Toolforge quickstart guide (which is anything but quick for helping me get started) I created my tool's code/html root directory: mkdir public_html – but I don't need to create my bot's code, I just need to copy it to that directory. One of the files needed to run my bot is the file containing login passwords, and I'm leery of copying that to a directory with "public" in its name! Some guidance on how to do this would be appreciated, since the quickstart authors apparently felt that wasn't necessary. Microsoft Notepad probably isn't installed on Toolforge and I probably need Linux rather than Microsoft commands. Can I import the files from wikipages (i.e. User:Bot1058/refreshlinks.php)? wbm1058 (talk) 19:09, 31 August 2022 (UTC)
    @Wbm1058. All files in the tool directory (not just public_html) are public by default. Passwords, OAuth secrets and the like can be made private using chmod, e.g. chmod 600 file-with-password.txt.
    Since you're creating a bot and not a webservice, the files shouldn't go into public_html. They can be in any directory. See wikitech:Help:Toolforge/Grid for submitting jobs to the grid. (The grid is legacy, yes, but the newer k8s-based Jobs framework is not that mature and can be harder to work with, especially for people not familiar with containers.)
    To copy over files from a Windows system, IMO the best tool is WinSCP (see wikitech:Help:Access to Toolforge instances with PuTTY and WinSCP). It's also possible to edit files directly on toolforge, such as by using nano. – SD0001 (talk) 20:39, 31 August 2022 (UTC)
    I finally got around to installing WinSCP. That was easy since it uses PuTTY and I just told it to use my configuration that I previously installed for PuTTY. I couldn't find any of the three "Advanced Site Settings" screens; it appears those were in a previous version of WinSCP but are not in the current version 5.21.3. Not sure I really need them since the setup seems to all have been automatically imported from PuTTY. I think "Advanced Site Settings" was renamed to "Preferences". Under "Preferences"→"Environment" I see "Interface, Window, Commander, Explorer, Languages" rather than "Directories, Recycle bin, Encryption, SFTP, Shell".

    Now I see I created the directory /mnt/nfs/labstore-secondary-tools-project/refreshlinks for my first "tool",
    and the sub-directory /mnt/nfs/labstore-secondary-tools-project/refreshlinks/public_html (my tool's code/html root directory)
    I also have a personal directory /mnt/nfs/labstore-secondary-tools-home/wbm1058 which has just one file: (my database access credentials)
    and when I try to look at other user's personal directories I get "Permission denied" errors so I assume that any PHP code I put in my personal directory would be private so only I could read it. My tool also has a file which I can't read with WinSCP when logged into my personal account. But if in PuTTY I "become refreshlinks" then I can read my tool's file and see that it's different credentials than my personal file.

    All my bots use the botclasses framework (User:RMCD bot/botclasses.php). Should I create another tool named "botclasses" for my framework, to avoid the need to make separate copies for each individual tool that uses it? I see at wikitech:Portal:Toolforge/Tool Accounts#Manage files in Toolforge that I may need to "take ownership" of files or "mount" them. §Sharing files via NFS (what is NFS?) says "Shared config or other files may be placed in the /data/project/shared directory, which is readable (and potentially writeable) by all Toolforge tools and users." Still trying to digest this information. – wbm1058 (talk) 17:41, 15 September 2022 (UTC)
    answering my own question: NFS = Network File System, a distributed file system protocol originally developed by Sun Microsystems in 1984. – wbm1058 (talk) 19:10, 6 October 2022 (UTC)
    Yes, personal user directories are private. Files are different for each user and tool and have the mode -r-------- which means only the owner can read and no one can modify.
    The recommendation to use different tool accounts per "tool" is for webservices (since each tool account can have only one web domain). For bots, just use a single tool account for multiple bots – that's easier to maintain and manage. – SD0001 (talk) 05:53, 18 September 2022 (UTC)
    Thanks. Then I'd like to rename refreshlinks to a more generic name that covers all my bots, but tools can't be renamed, nor can maintainers delete Tool Accounts. I will follow the steps described at Toolforge (Tools to be deleted). It should be obvious from my experience trying to get a "quick start" on Toolforge why you have such a growing list of tools that have been volunteered for deleting by their maintainers. – wbm1058 (talk) 18:11, 22 September 2022 (UTC)
    @SD0001: I set up the new billsbots tool, and then in PuTTY I "become billsbots" and mkdir php, creating a PHP directory where I can upload needed files from the PHP directory on my Windows PC. Then I go over to WinSCP to try to upload the files. There I can upload botclasses.php into the /billsbots/ root directory but I don't have permission to upload to the /billsbots/php/ sub-directory I just created. I see "tools.billsbots" is the owner of the /billsbots/php/ sub-directory but wbm1058 is the owner of botclasses.php. I logged into WinSCP the same way I log into PuTTY, as wbm1058. Is there a way inside WinSCP to "become billsbots" analogous to the way I do that in PuTTY? I assume "tools.billsbots" should be the owner of its public PHP files and not "wbm1058"? Also unsure of what rights settings the php directory, and the files in that directory that don't house passwords, should have. Right now they are just the defaults from mkdir php and the upload. – wbm1058 (talk) 18:52, 24 September 2022 (UTC)
    There's no need to become the tool in WinSCP – group permissions can be used instead of owner permissions. The group tools.billsbots includes the user wbm1058. The problem in this case is that the group doesn't have write permission. See wikitech:Help:Access_to_Toolforge_instances_with_PuTTY_and_WinSCP#Troubleshooting_permissions_errors. Files which don't have passwords typically should have 774 (owner+group can do everything, public can read) perms. – SD0001 (talk) 05:38, 25 September 2022 (UTC)

@SD0001: Thank you so much for your help. I've now successfully run refreshlinks.php manually from the command prompt in PuTTY. I need to be logged in as myself for it to work, and not as my tool, because I own and have read permission for my password file, and my tool does not. Per wikitech:Help:Toolforge/Grid#Submitting simple one-off jobs using 'jsub', when I become my tool and run

jsub -N refreshlinks php /mnt/nfs/labstore-secondary-tools-project/billsbots/php/refreshlinks.php

I get this in my refreshlinks.out file:
Warning: include(/mnt/nfs/labstore-secondary-tools-project/billsbots/php/logininfo.php): failed to open stream: Permission denied in /mnt/nfs/labstore-secondary-tools-project/billsbots/php/refreshlinks.php on line 28

wbm1058 (talk) 15:32, 1 October 2022 (UTC)

@Wbm1058 become the tool, take the file (transfers ownership to tool) and then do chmod 660 – that would give access to both yourself and the tool. – SD0001 (talk) 18:20, 1 October 2022 (UTC)
  • @SD0001 and Primefac: I just got an email notice for Phabricator T319590: Migrate billsbots from Toolforge GridEngine to Toolforge Kubernetes. Damn, I haven't even gotten anything running on an automated basis yet, just a few one-time runs as I try to familiarize myself with how the GridEngine works, and already I have a bureaucratic nag! I knew going into this that establishing my bots on Toolforge would not be easy, and my expectations have been exceeded! Maybe I just need to bite the bullet and learn how to use the "not that mature" and possibly "harder to work with" Jobs framework, and familiarize myself with containers. – wbm1058 (talk) 16:35, 6 October 2022 (UTC)
    @Wbm1058 Looks like that was part of mass-creation of tickets so nothing to urgently worry about (they've covered A to D only so my tool hasn't come up yet!). If they're becoming pushy about this, I suppose the Jobs framework is mature now, though there are quite a few things it doesn't support.
    It should be easy enough to migrate - instead of putting a jsub command in crontab for scheduling, use the toolforge-jobs command, passing --image as tf-php74. – SD0001 (talk) 17:53, 6 October 2022 (UTC)
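Under the Jobs framework, registering the scheduled run might look something like the following. The job name, path, and cron schedule are illustrative assumptions; tf-php74 is the image suggested above, and the three-hour schedule mirrors the task's current restart interval.

```shell
# Register a recurring job from the tool account (illustrative values)
toolforge-jobs run refreshlinks \
    --command "php $HOME/php/refreshlinks.php" \
    --image tf-php74 \
    --schedule "0 */3 * * *"   # every three hours, matching the current setup
```

Unlike the crontab+jsub approach, the Jobs framework tracks the job itself, so `toolforge-jobs list` can be used to confirm it is scheduled.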
  • Just noticed now that I got an email on October 9 which I overlooked at first because I didn't recognize the sender.
    sftp-server killed by Wheel of Misfortune on tools bastion
    From Root <>

Your process `sftp-server` has been killed on tools-sgebastion-10 by the Wheel of Misfortune script.

You are receiving this email because you are listed as the shell user running the killed process or as a maintainer of the tool that was.

Long-running processes and services are intended to be run in either the Kubernetes environment or the job grid, not on the bastion servers themselves. In order to ensure that login servers don't get heavily burdened by such processes, this script selects long-running processes at random for destruction.

See <> for more information on this initiative. You are invited to provide constructive feedback about the importance of particular types of long-running processes to your work in support of the Wikimedia movement.

For further support, visit #wikimedia-cloud on or <>

I guess that explains why the task as currently configured "breaks" easily and requires a lot of minding to keep it running. Thanks, I guess, for this belated message that came only 3½ months after I got my automated process running this way. So I suppose speedy approval isn't merited and won't be forthcoming. I did not know that I was running a process named sftp-server. What is that, and what is it doing? Most of this bot's process is still running on my own PC. Every few hours when a new script-run starts, it logs into the replica database and does a query which, even when it returns 150K results, takes only a couple of minutes. Then it logs out. It's not like this is constantly hitting on bastion resources. The only reason I need to be logged into the bastion 24×7 (via PuTTY) is that, if I'm not, then my bot, when it starts, will not be able to "tunnel" and thus will fail. The vast majority of the time I'm logged into the bastion, I'm just sitting there idle, doing nothing. Not "heavily burdening" the login server. I need to "tunnel" because there is no MediaWiki API for the database query I need to make. Otherwise I don't need the Toolforge because there is an API for making the "null edit" purges. – wbm1058 (talk) 15:53, 14 October 2022 (UTC)
I think the Wheel of Misfortune sftp-server kills are from my open WinSCP session. I didn't get WinSCP installed and running until September 15, and the first email I saw from the Wheel of Misfortune was sent on October 9 (and I've received several since then). I keep WinSCP open on my desktop for my convenience. I just saw there is a "Disconnect Session" option on the "Session" tab in WinSCP and I just clicked on it. Hopefully that will stop the Wheel of Misfortune's anger. Now I can just click "Reconnect Session" when I go back to use WinSCP again – which saves me the trouble of needing to close and reopen the entire app. As far as I know the Wheel of Misfortune has never actually shut down my bot itself, perhaps because individual bot runs are not sufficiently long-running processes to draw the attention of the "Wheel". Even runs that purge 150,000 pages run in a matter of hours, not days. – wbm1058 (talk) 17:21, 19 December 2022 (UTC)
  • Perhaps helpful to see how other bots running on Toolforge are configured to find a template for how to set mine up. – wbm1058 (talk) 22:45, 14 October 2022 (UTC)
    Here's how I set my PHP bots up: User:Novem Linguae/Essays/Toolforge bot tutorial#Running at regular intervals (cronjob, kubernetes, grid). I found kubernetes to have a heavy learning curve, but I suppose getting the code off your local computer and onto Toolforge is the "proper" way to do things. Another method might be setting up a webserver on Toolforge/kubernetes that is an API for the query you need to make. Hope this helps. –Novem Linguae (talk) 08:35, 15 October 2022 (UTC)
    Being connected to the bastion 24x7 is a no-no. Ideally, the bot process should run on Toolforge itself so that no connection is needed at all between your local system and Toolforge. If you really want to run the bot on your local system, the tunnel connection to the database should be made only when required, and closed immediately after. Creating temporary new connections is cheap; leaving them open indefinitely is not. – SD0001 (talk) 16:51, 16 October 2022 (UTC)
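The open-then-close pattern SD0001 describes might look roughly like the sketch below. The local port, key path, and script name are placeholders, and the bastion/replica hostnames follow the documented replica-access pattern; this is not a tested recipe.

```shell
# Sketch only: open an SSH tunnel to the enwiki replica, run one bot
# cycle through it, then tear the tunnel down immediately afterwards.
# Port 4711 and the script name are illustrative placeholders.
ssh -f -N -M -S /tmp/replica-tunnel \
    -L 4711:enwiki.analytics.db.svc.wikimedia.cloud:3306 \

php refreshlinks.php   # bot connects to port 4711 on

# Close the tunnel as soon as the query work is done.
ssh -S /tmp/replica-tunnel -O exit
```

The `-M`/`-S` (control master/socket) pair is what lets the second `ssh -O exit` cleanly kill the backgrounded tunnel rather than hunting for its PID.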
  • I've got my first Kubernetes one-off job running now, to refresh 40,000 pages. Commands I used to get it started:
wbm1058@tools-sgebastion-10:~$ become billsbots
tools.billsbots@tools-sgebastion-10:~$ toolforge-jobs run refreshlinks-k8s --command "php ./php/refreshlinks.php" --image tf-php74 --wait
ERROR: timed out 300 seconds waiting for job 'refreshlinks-k8s' to complete:
| Job name:  | refreshlinks-k8s                                                |
| Command:   | php ./php/refreshlinks.php                                      |
| Job type:  | normal                                                          |
| Image:     | tf-php74                                                        |
| File log:  | yes                                                             |
| Emails:    | none                                                            |
| Resources: | default                                                         |
| Status:    | Running                                                         |
| Hints:     | Last run at 2022-11-03T16:53:38Z. Pod in 'Running' phase. State |
|            | 'running'. Started at '2022-11-03T16:53:40Z'.                   |
tools.billsbots@tools-sgebastion-10:~$ toolforge-jobs list
Job name:         Job type:    Status:
----------------  -----------  ---------
refreshlinks-k8s  normal       Running

Will wait a bit for new emails or Phabricators to come in telling me what I'm still doing wrong, before proceeding to the next step, creating scheduled jobs (cron jobs). – wbm1058 (talk) 19:12, 3 November 2022 (UTC)
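For reference, the jobs framework accepts a cron expression directly, so the one-off run above should translate into a scheduled job along these lines (the job name and hourly interval here are illustrative, not tested):

```shell
# Same image and command as the one-off job, but on a schedule.
toolforge-jobs run refreshlinks-hourly \
    --command "php ./php/refreshlinks.php" \
    --image tf-php74 \
    --schedule "0 * * * *"   # cron syntax: top of every hour

toolforge-jobs list   # the job should now appear with type 'schedule'
```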

One thing I'm apparently still doing wrong is Login to Wikipedia as Bot1058 from a device you have not recently used. That's the title of an email I get every time I run a one-off job on Toolforge. The message says "Someone (probably you) recently logged in to your account from a new device. If this was you, then you can disregard this message. If it wasn't you, then it's recommended that you change your password, and check your account activity." The Help button at the bottom of the email message links to mw:Help:Login notifications, which says "this feature relies on cookies to keep track of the devices you have used to log in". I'm guessing that cookies are not working in my Toolforge account.
The code I use to log in is:
$objwiki = new wikipedia();
$objwiki->login($user, $pass);
    /**
     * This function takes a username and password and logs you into wikipedia.
     * @param $user Username to login as.
     * @param $pass Password that corresponds to the username.
     * @return array
     */
    function login ($user, $pass) {
        $post = array('lgname' => $user, 'lgpassword' => $pass);
        $ret = $this->query('?action=query&meta=tokens&type=login&format=json');
        /* This is now required - see */
        $post['lgtoken'] = $ret['query']['tokens']['logintoken'];
        $ret = $this->query( '?action=login&format=json', $post );

        if ($ret['login']['result'] != 'Success') {
            echo "Login error: \n";
        } else {
            return $ret;
        }
    }
These emails will get very annoying pretty fast if I get this task set up to run frequent, small jobs rather than infrequent, large jobs – as @SD0001: suggests. Help please! wbm1058 (talk) 13:52, 4 November 2022 (UTC)
The login code looks ok to me. Not sure why the emails didn't stop coming after the first few times, but if necessary you can disable them from the Special:Preferences notifications tab. My general tip for botops is to use OAuth, which avoids this and several other problems. – SD0001 (talk) 19:11, 4 November 2022 (UTC)
I found a relevant Phabricator task and added my issue there. – wbm1058 (talk) 13:08, 6 November 2022 (UTC)
I think I solved this. Per comments in the Phab, as my bot only logged in and didn't make any edits, the IP(s) weren't recorded in the CheckUser table and every log in was treated as being from a "new" IP. To work around this, I did some one-off runs of another task this bot has which does actually make edits. After running that bot task a few times on the Toolforge, the emails stopped coming, even for the task that just refreshes links and doesn't make any edits.
But in the meantime before I figured that out, I searched for OAuth "quick start" links, and am posting my finds here:
At some point while navigating this forest of links, my mind exploded. I'm putting OAuth on my back burner now, to focus on creating scheduled jobs. Meanwhile I have these links saved here so I may come back to this at some point. – wbm1058 (talk) 15:46, 10 November 2022 (UTC)

Job logs

On my way to creating scheduled jobs, I ran into another issue. Per wikitech:Help:Toolforge/Jobs framework#Job logs, Subsequent same-name job runs will append to the same files... there is no automatic way to prune log files, so tool users must take care of such files growing too large. What?! How hard can it be to offer a "supersede" option to override the default "append"? – wbm1058 (talk) 22:07, 12 November 2022 (UTC)

I've raised this issue in T301901. – wbm1058 (talk) 09:59, 13 November 2022 (UTC)
A "supersede" option sounds like a bad idea as that would mean you can only ever see the logs of the latest job run. – SD0001 (talk) 14:00, 25 January 2023 (UTC)
@SD0001: I get your point, but wikitech:Help:Toolforge/Jobs framework#Job logs says Log generation can be disabled with the --no-filelog parameter when creating a new job. If it makes sense to sometimes disable logs entirely, why wouldn't it also make sense to sometimes supersede them? All logs for bots running on my desktop PC are always superseded. That's usually not a problem, but sometimes it would be nice to be able to go back and look at a previous log to see what happened on the run where a bug first surfaced. The logs for this task are quite long though.
I've successfully started running this bot's tasks 3, 4, and 5 as a scheduled hourly task on the jobs framework, see User:Bot1058#Tasks. The logs for those tasks are usually pretty short though, so it does make sense to append there. – wbm1058 (talk) 16:10, 27 January 2023 (UTC)
Abandoned complicated workaround after T301901 closed

@SD0001: I'm trying to implement the somewhat complicated workaround given at wikitech:Help:Toolforge/Jobs framework#Custom log files. I've added some explanations to this section (see the edit history) so let me know if I added anything that's not correct. I took this as an instruction to type the following directly at my PuTTY keyboard.

If you save this file as, give it execution permissions:

tools.mytool@tools-sgebastion-11:~$ cat > <<EOF
> #!/bin/sh
> jobname=$1
> command=$2
> mkdir -p logs
> sh -c $command 1>>logs/${jobname}.log 2>>logs/${jobname}.log
> EOF
tools.mytool@tools-sgebastion-11:~$ chmod a+x

After doing that I notice that the $1 and $2, and $command and ${jobname}, were eaten somehow. The contents of my file are:

mkdir -p logs
sh -c  1>>logs/.log 2>>logs/.log

which doesn't seem right to me. Of course I can just copy-paste the contents of the file from the Help: page directly with WinSCP, rather than type them in with PuTTY (which I did). If this Help: page isn't giving instructions that work, it should be corrected. I've made a couple of unsuccessful attempts, and something was obviously wrong with my syntax. – wbm1058 (talk) 19:06, 17 November 2022 (UTC)

./php/refreshlinks.php: 1: cannot open ?php: No such file
./php/refreshlinks.php: 2: /bin: Permission denied
./php/refreshlinks.php: 3: not found
./php/refreshlinks.php: 4: not found
./php/refreshlinks.php: 5: Syntax error: word unexpected (expecting ")")
Kubernetes' beta phase has been declared done and the new phab:T327254 "next steps in grid engine deprecation" has opened. But Job logs still says there is no automatic way to prune log files, so tool users must take care of such files growing too large. Huh? I guess I made the mistake of trying to piggyback on an existing Phab rather than opening a new one. – wbm1058 (talk) 13:29, 25 January 2023 (UTC)
@Wbm1058 For now, I would suggest not worrying about pruning log files. It would take a long time before the logs grow big enough to be of any concern, at which time you could just delete or truncate it manually. – SD0001 (talk) 14:01, 25 January 2023 (UTC)
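The manual truncation suggested above is a one-liner; a sketch (the log name matches the job name used earlier, the 1000-line retention count is arbitrary, and `seq` just stands in for a log that has grown):

```shell
mkdir -p logs
seq 1 5000 > logs/refreshlinks.log        # stand-in for a grown job log

# Keep only the newest 1000 lines: write aside, then swap into place.
tail -n 1000 logs/refreshlinks.log > logs/refreshlinks.log.tmp
mv logs/refreshlinks.log.tmp logs/refreshlinks.log

wc -l < logs/refreshlinks.log             # prints 1000
tail -n 1 logs/refreshlinks.log           # prints 5000 (newest entry kept)
```

To discard everything instead, `truncate -s 0 logs/refreshlinks.log` empties the file without deleting it, so a job appending to it keeps working.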

OK, I think I'm close to having this wired. Per the advice above to process just 5-10k titles at a time, and run the script more frequently, I've set the LIMIT for database lookups to 10000 and am now running the refreshlinks script as a continuous job. If the number of pages processed in the previous run is less than 250, then it sleeps 20 minutes before hitting the database again; otherwise it just sleeps for two minutes. Command I used to get it started:

toolforge-jobs run refreshlinks --command "php ./php/refreshlinks.php" --image php7.4 -o ./logs/refreshlinks.log -e ./logs/refreshlinks.log --continuous

I don't get the impression it runs any faster on Toolforge than it runs on my 12-year-old desktop; if anything, it seems to be running a little slower on Toolforge.

If this setup is OK then I'll set up my other script refreshmainlinks to run this way too. At the moment that one is still running on my desktop as it has been since I filed this BRFA. – wbm1058 (talk) 21:49, 9 February 2023 (UTC)
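The pacing rule described above, restated in shell for clarity (the real logic lives inside refreshlinks.php; the 250-page threshold and the 2-minute/20-minute sleeps are taken from the text):

```shell
# Decide how long to sleep before the next database hit, based on how
# many pages the previous run processed.
pace() {
    if [ "$1" -lt 250 ]; then
        echo "sleep 1200"   # under 250 pages: backlog drained, wait 20 min
    else
        echo "sleep 120"    # still chewing through the backlog: wait 2 min
    fi
}

pace 10000   # prints: sleep 120
pace 120     # prints: sleep 1200
```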

Tracking the run times for processing 10,000 pages (extracted from the job logs):

  • 1:22:00
  • 1:44:50
  • 1:05:38
  • 1:31:03
  • 1:09:35
  • 1:31:15
  • 1:38:56
  • 1:41:52
  • 1:40:20
  • 1:38:02
  • 1:36:09
  • 1:24:08
  • 1:23:27
  • 1:24:41
  • 1:23:19
  • 1:26:14
  • 1:12:31
  • 1:17:15
  • 1:18:33
  • 1:08:25
  • 1:09:03
  • 1:07:40
  • 1:11:57
  • 1:09:44
  • 1:07:25
  • 1:20:23
  • 1:13:55
  • 1:12:13
  • 1:11:36
  • 1:08:31
  • 1:02:30
  • 1:12:53

It doesn't seem like I've gained anything from having the server running my code and the database on the same high-speed network. I haven't looked into how my process on Toolforge may be resource-limited or how to request more resources. I haven't really noticed any reliability issues running on my desktop, at least not as much as I had last August. – wbm1058 (talk) 13:50, 10 February 2023 (UTC)

Bots that have completed the trial period

Operator: GoingBatty (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 19:18, Wednesday, February 15, 2023 (UTC)

Function overview: Remove WikiProject |needs-infobox= parameters from talk pages where the article has an infobox

Automatic, Supervised, or Manual: Automatic

Programming language(s): AutoWikiBrowser

Source code available: AWB

Links to relevant discussions (where appropriate): Wikipedia:Bot_requests#Category:Wikipedia infobox backlog - articles needing infoboxes

Edit period(s): Monthly

Estimated number of pages affected: Hundreds

Namespace(s): Article talk pages

Exclusion compliant (Yes/No): Yes

Function details: The goal of this bot task is to reduce the number of talk pages in Category:Wikipedia infobox backlog by removing the WikiProject |needs-infobox= parameters if the associated article contains an infobox (e.g. this edit). The bot will achieve this in the following manner:

  1. Load a list of talk pages from a subset of subcategories of Category:Wikipedia infobox backlog
  2. Convert the list of talk pages to articles
  3. Preparse the list to skip those articles that do not contain an infobox
  4. Convert the list of articles back to talk pages
  5. Load the module User:Magioladitis/WikiProjects to convert redirects to WikiProject templates
  6. Run the job to remove WikiProject |needs-infobox= parameters and also general fixes

Thank you for your consideration. GoingBatty (talk) 19:18, 15 February 2023 (UTC)


There is a possibility that some WikiProjects would like to see a specific infobox. For example, the bio articles of WP:Football would like "infobox footballer"; here is a bad example of incorrect usage of infobox person. Not a major thing to consider though. Pelmeen10 (talk) 20:28, 15 February 2023 (UTC)

@Pelmeen10: One could use their watchlist to monitor a category (e.g. Category:Football articles needing infoboxes) to see which pages were removed from the category by the bot. GoingBatty (talk) 03:57, 17 February 2023 (UTC)
Sorry, I don't know how to use a watchlist to monitor a category unless I have the pages added to my watchlist. Pelmeen10 (talk) 04:07, 17 February 2023 (UTC)
@Pelmeen10: You would add the category to your watchlist, and then change the filters to "Changes by others", "Bot", and "Category changes". GoingBatty (talk) 04:47, 17 February 2023 (UTC)
Okay, thanks! Pelmeen10 (talk) 12:37, 17 February 2023 (UTC)

Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 10:34, 8 March 2023 (UTC)

@Primefac: Trial complete. See these 50 edits. Thanks! GoingBatty (talk) 04:06, 13 March 2023 (UTC)

Operator: GoingBatty (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 05:14, Friday, March 3, 2023 (UTC)

Function overview: Replace {{Empty section}} with {{No plot}}

Automatic, Supervised, or Manual: Automatic

Programming language(s): AutoWikiBrowser

Source code available: AWB

Links to relevant discussions (where appropriate): Wikipedia:Templates for discussion/Log/2023 March 2#Template:Empty section

Edit period(s): Monthly

Estimated number of pages affected: 500

Namespace(s): Mainspace

Exclusion compliant (Yes/No): Yes

Function details: Using the database dump, find sections titled "Plot" with {{Empty section}}, and replace the template with {{No plot}}, along with AWB's general fixes. This will change the categorization to Category:Wikipedia articles without plot summaries, and hopefully will result in another editor adding a plot. Thank you for your consideration. GoingBatty (talk) 05:14, 3 March 2023 (UTC)


Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Just to clarify, this is not a replacement of the template because of the TFD, but because it is replacing a less-specific template with a more-specific template. Please correct me if I am wrong. Primefac (talk) 10:24, 8 March 2023 (UTC)

@Primefac: Trial complete. See these edits. You are correct that, regardless of the TfD (in which I did not express an opinion for or against), this is replacing a less-specific template with a more-specific template, with the goal of encouraging another editor to add a plot. This is consistent with the goal of many of my other bot tasks. Thanks! GoingBatty (talk) 02:31, 13 March 2023 (UTC)

Operator: Qwerfjkl (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 16:54, Thursday, March 2, 2023 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python

Source code available: Pywikibot

Function overview: Automatically notify editors when they add CS1 errors to a page.

Links to relevant discussions (where appropriate): Help talk:Citation Style 1#Bot to notify users when they add CS1 errors, Wikipedia:Bot requests#Bot to notify users when they add CS1 errors

Edit period(s): continuous

Estimated number of pages affected: ~100/day

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: The bot will detect when pages are added to Category:CS1 errors: missing periodical, Category:CS1 errors: generic title, Category:CS1 errors: missing title, and Category:CS1 errors: bare URL (though more categories may be added in the future). It will then check if the categories are still present after 15 minutes (to allow the user time to fix mistakes), and then post a notice on their talk page, notifying them of any error categories added, as well as how to fix the errors. See User:Qwerfjkl (bot)/CS1 errors for some basic examples of how this would work.


Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 10:26, 8 March 2023 (UTC)

@Primefac, Trial complete. See these 50 contributions. — Qwerfjkltalk 16:35, 9 March 2023 (UTC)
@Qwerfjkl: On future bot edits, could you please change the typo in the edit summary from "CS!" to "CS1"? Thanks for all your hard work on this! GoingBatty (talk) 22:05, 9 March 2023 (UTC)
@GoingBatty, yes, I noticed that and fixed it. — Qwerfjkltalk 07:03, 10 March 2023 (UTC)
I have a concern about warning users about a hidden warning (missing periodical) because they could not have known about it otherwise (whether that specific message should be hidden is connected to the big hullabaloo a few years ago regarding CS1 things). The existence of the {{cite document}} redirect to {{cite journal}} may cue you in to the separate problem that not all documents have a periodical to go with them; there's probably a potential problem on that front as well with a task like this.
I also have an itch about missing title, but I'm not sure exactly what. Only one or two templates have a workaround for the case where the work itself doesn't have a title. I know we have gone around a few times on Help talk:CS1 about some parameter to support some descriptive title, and we have the |title=none keyword, particularly in {{cite journal}}, but the other template {{cite magazine}} has no such support. (Whether it should or not.)
I otherwise endorse this activity and would like to see it expanded to other categories in the batch if possible/feasible/desired by others.
Last note: Please have the bot link to the BRFA in your edit summaries for new tasks, and add a short "task N" for the summaries for old tasks. I had to chase down this activity because it was not listed at the bot's user page. Izno (talk) 08:27, 10 March 2023 (UTC)
@Izno, I've made those modifications. — Qwerfjkltalk 16:37, 10 March 2023 (UTC)
The bot messages can be changed at User:Qwerfjkl (bot)/inform/middle. — Qwerfjkltalk 16:17, 10 March 2023 (UTC)

Operator: Edibobb (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 01:03, Wednesday, March 8, 2023 (UTC)

Function overview: appears to have long been used as the authoritative reference for Carabidae taxa, but the site has disabled the direct links used in references. Qbugbot 6 would replace references containing these dead links with corresponding references to Catalogue of Life.

Automatic, Supervised, or Manual: Automatic

Programming language(s): VB

Source code available: Yes

Links to relevant discussions (where appropriate): Wikipedia_talk:WikiProject_Tree_of_Life#Carabidae.org_has_left_the_building

Edit period(s): One time run, possibly in segments, with a few small additional runs to pick up missed pages.

Estimated number of pages affected: 6,800 - 7,500

Namespace(s): Mainspace/Articles

Exclusion compliant (Yes/No): Yes

Function details:

1. Qbugbot 6 will process Wikipedia articles for Carabidae taxa that are also found in Catalogue of life.
2. If a page has a "{{Cite web}}" reference that includes a link to (now a dead link), it will be replaced with a reference to that taxon in Catalogue of Life. There are 6,837 articles that meet these criteria. Most are stub-class articles.
3. Some articles with disambiguation titles (such as Dolichus (beetle)) may be missed, and some of those will be collected for processing.
4. Articles not found in Catalogue of Life may be for invalid taxa, or may be valid but missing from Catalogue of Life. These will be handled manually when possible.
5. 288 articles have valid reference links to that go to a subscription page ( for example). Tentatively, these will be left alone, but it's a simple matter to replace them with a Catalogue of Life reference if that's what people prefer.


Excellent. Thanks for taking this on. I would suggest that there is no benefit in keeping any links to at all, such as the one to Anthia, as the paywall prevents display of all relevant information. Just switch them all to CoL? --Elmidae (talk · contribs) 09:33, 8 March 2023 (UTC)

Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 10:20, 8 March 2023 (UTC)
Trial complete. Diffs: User_talk:Qbugbot#Qbugbot_6. No problems. There are a few cases (1 of 100 in this sample, Carenum obsoletum) where duplicate Carabidae of the World references or existing Catalogue of Life references may cause a conflict with reference names. The potential conflicts are flagged to be manually checked (by me). Bob Webster (talk) 08:01, 9 March 2023 (UTC)

A user has requested the attention of a member of the Bot Approvals Group. Once assistance has been rendered, please deactivate this tag by replacing it with {{t|BAG assistance needed}}.

Operator: Sheep8144402 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 15:29, Saturday, February 25, 2023 (UTC)

Function overview: Fix Linter errors in TWA pages

Automatic, Supervised, or Manual: supervised

Programming language(s): AWB

Source code available: User:SheepLinterBot/2

Links to relevant discussions (where appropriate): 1

Edit period(s): one-time run

Estimated number of pages affected: 33,641: 21,942 User and 11,699 User talk (note: some false positives) (edited 20:49, 8 March 2023 (UTC) to remove false positives for user talk)

Namespace(s): User/User talk

Exclusion compliant (Yes/No): no

Function details: (Originally planned for task 1 but was postponed to task 2.) Fixes linter errors (missing end tags/stripped tags) by removing the set of three div tags on the top of the page, then adding the set to the bottom of the page. This is moving the set. However, a large portion of the pages has more than one set of three div tags, which will create false positives. This is best done as supervised so that FPs can be discarded and no extra lint can be added.

Assuming that User pages have 6 Linter errors and User talk pages have 7, this task will fix approximately 21,942 * 6 + 11,699 * 7 = 213,545 Linter errors.

Edit on February 28, 2023: The search mentioned above has false positives, so the bot may edit ~32-33k pages or fix ~190-205k errors. (edited 20:49, 8 March 2023 (UTC))


Can we please not approve more linter bots until Wikipedia:Village pump (policy)#RFC: Clarifications to WP:COSMETICBOT for fixing deprecated HTML tags is closed? * Pppery * it has begun... 16:36, 25 February 2023 (UTC)

Given the long timeline in bot approvals these days, I think it would be reasonable to begin a trial so that when the RFC is closed in a week or two, this BRFA has made some progress. – Jonesey95 (talk) 17:05, 25 February 2023 (UTC)
This task mainly deals with missing end tag and stripped tag errors, something which isn't covered by the RFC. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 17:40, 25 February 2023 (UTC)
Agree with the above. The RfC specifically asked two questions: are deprecated HTML tags considered egregiously invalid? and should WP:COSMETICBOT be updated? Both are unrelated to missing end tags or stripped tags. Gonnym (talk) 11:10, 26 February 2023 (UTC)
Agree. Not related to deprecated HTML tags. Afernand74 (talk) 11:31, 28 February 2023 (UTC)

Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Please run 50 in each affected namespace. Primefac (talk) 10:33, 8 March 2023 (UTC)

Trial complete. 50 edits for User and 50 for User talk. In the case that there is more than one set of three div tags, I remove the div tag changes other than the first. Sheep (talkhe/him) 21:11, 8 March 2023 (UTC)

Operator: Legoktm (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 01:55, Thursday, September 8, 2022 (UTC)

Function overview: Semi-protect TFAs

Automatic, Supervised, or Manual: Automatic

Programming language(s): Rust

Source code available: [3]

Links to relevant discussions (where appropriate):

Edit period(s): Daily

Estimated number of pages affected: 1 per day

Namespace(s): mainspace

Exclusion compliant (Yes/No): No

Adminbot (Yes/No): Yes

Function details: Note: This has only been approved for a 30-day trial, at which point it would need further community consensus to keep running AIUI.

  • This is fully independent of the move protection the bot already applies
  • At 23:00 UTC, get the next day's TFA (following any redirect)
  • Get edit protection status:
    • If protection is indefinite, do nothing
    • If protection expires after the article is off TFA, do nothing
    • If protection expires before the article is off TFA, extend the current protection until it is off TFA (keeps existing protection level)
    • If there is no protection, apply semi-protection until it is off TFA
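The decision list above can be restated as pure logic. This is a sketch for illustration only, not the bot's actual Rust code; timestamps are reduced to plain epoch-style numbers.

```shell
# decide EXPIRY OFF_TFA
#   EXPIRY:  current edit-protection expiry ("" = unprotected,
#            "inf" = indefinite, otherwise epoch seconds)
#   OFF_TFA: when the article leaves the Main Page, epoch seconds
decide() {
    expiry=$1; off_tfa=$2
    if [ "$expiry" = "inf" ]; then
        echo "do nothing (indefinite protection)"
    elif [ -z "$expiry" ]; then
        echo "semi-protect until $off_tfa"
    elif [ "$expiry" -ge "$off_tfa" ]; then
        echo "do nothing (protection outlasts the TFA slot)"
    else
        echo "extend existing level until $off_tfa"
    fi
}

decide inf 200   # prints: do nothing (indefinite protection)
decide ""  200   # prints: semi-protect until 200
decide 300 200   # prints: do nothing (protection outlasts the TFA slot)
decide 100 200   # prints: extend existing level until 200
```

Note the third branch never lowers or shortens protection: an expiry at or beyond the end of the TFA slot is simply left alone.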

I ran a simulation of the next 30ish TFAs (full logs): here's an example of a page that has no edit protection:

INFO tfa_semi_prot: 55 Wall Street needs protection!
INFO tfa_semi_prot: Protection options: [["action","protect"],["title","55 Wall Street"],["protections","edit=autoconfirmed|move=sysop"],["expiry","2022-09-12T00:00:00Z|2022-09-12T00:00:00Z"],["reason","Upcoming TFA ([[WP:BOT|bot protection]])"]]

And here's an example of a page that has semi-protection, but it needs to be extended:

INFO tfa_semi_prot: A.C. Monza needs protection to be extended!
INFO tfa_semi_prot: Protection options: [["action","protect"],["title","A.C. Monza"],["protections","edit=autoconfirmed|move=sysop"],["expiry","2022-09-21T00:00:00Z|2022-09-21T00:00:00Z"],["reason","Upcoming TFA ([[WP:BOT|bot protection]])"]]


Notifications: Wikipedia_talk:Today's_featured_article#TFA_bot_semi-protection, @Hog Farm, Sdkb, ProcrastinatingReader, SD0001, and Peacemaker67:. Legoktm (talk) 02:09, 8 September 2022 (UTC)

Thanks very much Legoktm. I can't speak for the code, but your efforts to operationalise this for the trial period are greatly appreciated. Regards, Peacemaker67 (click to talk to me) 03:46, 8 September 2022 (UTC)
Thank you as well. This was very much needed. Hog Farm Talk 13:47, 8 September 2022 (UTC)

The RfC closer approved a 30-day trial, after which we would evaluate how well it went, presumably culminating in another RfC. To do so, we need a mostly equivalent 30-day period we can compare against. I'm not sure we can look to the previous month, since it could be impacted by seasonal events (e.g. vandalism goes down when school starts), nor the same time in the previous year (COVID, etc.). One idea I had last night was to run the trial over the next 60 days, only semi-protecting every other day. I think that would give us a reasonable sample of data to compare and evaluate the effectiveness of the protection. Legoktm (talk) 16:35, 8 September 2022 (UTC)

That sounds reasonable. Hog Farm Talk 20:58, 8 September 2022 (UTC)
Every other day over 60 days sounds reasonable to me. Maybe drop a note saying this at Wikipedia_talk:Today's_featured_article so interested parties are aware? If no objections are forthcoming then I think it's good to proceed with that plan. ProcrastinatingReader (talk) 13:22, 10 September 2022 (UTC)
Done. Legoktm (talk) 06:02, 11 September 2022 (UTC)

{{BotTrial}} Trial for 30 days of bot protection, done every other day, as discussed above. Thanks Legoktm, let me know how it goes. ProcrastinatingReader (talk) 18:47, 16 September 2022 (UTC)

Great! Set the cron for 0 23 */2 * * and a calendar reminder to turn it off in mid-November. Legoktm (talk) 06:36, 18 September 2022 (UTC)
Comment: It seems that this trial is largely working and I support the idea. But it seems that lately there has been an LTA who is vandalizing TFAs with autoconfirmed accounts, which results in ECP. wizzito | say hello! 23:49, 14 October 2022 (UTC)

{{bot trial complete}} I've turned off the semi-protecting job. For next steps, we need to finish the data collection/analysis that was started at User:TFA Protector Bot/Semi-protection trial (I will aim to make some time in the next few days to start updating that again). Then hold an RfC for discussion on the long-term future of this task. Legoktm (talk) 20:06, 17 November 2022 (UTC)

Sounds good; thanks! {{u|Sdkb}}talk 22:13, 17 November 2022 (UTC)

On hold. Marking this as On Hold for the duration of the RFC. Feel free to disable the template once the RFC has happened. Headbomb {t · c · p · b} 04:00, 18 November 2022 (UTC)

@Headbomb: Apologies if I shouldn't comment here, however is there any update on this? ― Blaze WolfTalkBlaze Wolf#6545 19:26, 21 February 2023 (UTC)
@Blaze Wolf, unless I'm forgetting something (this has been a many-stage saga...), it looks like this got stuck. If you have the inclination to do so, feel free to open the RfC, linking to the last one and the trial results data as background (maybe with pings to past participants and definitely with {{Please see}} notices to relevant places), and asking if the task should be kept going. Cheers, {{u|Sdkb}}talk 00:05, 1 March 2023 (UTC)
Sounds good. I might get some help with opening up the RFC since I've never done so before. ― Blaze WolfTalkBlaze Wolf#6545 00:18, 1 March 2023 (UTC)

Approved requests

Bots that have been approved for operations after a successful BRFA will be listed here for informational purposes. No other approval action is required for these bots. Recently approved requests can be found here (edit), while old requests can be found in the archives.

Denied requests

Bots that have been denied for operations will be listed here for informational purposes for at least 7 days before being archived. No other action is required for these bots. Older requests can be found in the Archive.

Expired/withdrawn requests

These requests have either expired, as information required by the operator was not provided, or been withdrawn. These tasks are not authorized to run, but such lack of authorization does not necessarily follow from a finding as to merit. A bot that, having been approved for testing, was not tested by an editor, or one for which the results of testing were not posted, for example, would appear here. Bot requests should not be placed here if there is an active discussion ongoing above. Operators whose requests have expired may reactivate their requests at any time. The following list shows recent requests (if any) that have expired, listed here for informational purposes for at least 7 days before being archived. Older requests can be found in the respective archives: Expired, Withdrawn.