Wikipedia:Miscellany for deletion/Second batch of mass-created portals based on a single navbox/Selection process
Selection process for this nomination
This requires a programmer's text editor and WP:AWB. Some modest proficiency is needed with AWB's list-making features, but no great wizardry.
This list aims to include pages which meet all of the following criteria:
- single-navbox portal
- created by @The Transhumanist between 13 August 2018 and 22:41, 12 September 2018. That is the period from the start of TTH's mass creation until the oldest of the portals included in the first nomination
- Not a redirect or duplicate of another page on the list
- Not already tagged for MFD
I took 10 steps. Several of the steps could be merged, but I prefer to do tasks like this one step at a time.
- View contribs. Display the user contributions by @The Transhumanist, selecting portal namespace, Only show edits that are page creations, 500 pages at a time — start here
Note that this will no longer show the same results as when I did it, because some pages have since been deleted. However, I think that if you use the same cutoff dates, the end results should be identical, because the pages which have since been deleted were excluded in later steps.
- Screenscrape. Copy several screenfuls to a text editor (I was using Notepad++).
- Strip irrelevancies. On each line strip off everything except the page name. This can be done accurately thanks to hidden characters which were included in the screengrabs. I can't display the hidden chars (value U+200E: LEFT-TO-RIGHT MARK [LRM]) so I have replaced them with X, but with that modification the two regexes are:
s/^.+\+[\d,]+X N //
and
s/X .+$//
- Linkify. Use a regex to convert each line to a link:
s/^(.+)$/# [[:\1]]/
Save list as a file (total: 2,053 pages)
- Remove duplicates. Load list into WP:AWB. Use List->Remove duplicates. Save list (total: 1,901 pages)
- Remove redirects and non-existent pages. Use the skip tab to skip "page is redirect" and "doesn't exist". Save list (total: 1,737 pages)
- Remove pages already tagged for MFD. Use AWB in list-making ("pre-parse") mode to skip any page which matches
/{{mfd/i
Save list (total: 1,182 pages)
- Remove non-automated. Use AWB's "List comparer" tool to keep only pages which transclude Template:Transclude list item excerpts as random slideshow or Template:Transclude linked excerpts as random slideshow. Save list (total: 1,175 pages)
- Keep only single-navbox selections. I used an AWB custom module (see my module) to identify pages where the only unnamed parameter for {{Transclude list item excerpts as random slideshow}} or {{Transclude linked excerpts as random slideshow}} is a single template; other pages were skipped. Save list (total: 1,144 pages)
- Error-checking. Use AWB's list comparer to find any of the 1,144 pages which are not in the tracking Category:Automated portals with article list built solely from one template. That category is populated by Module:Excerpt slideshow, so every page in this list should be in the category.
Result: found 5 pages which were not in the category: Portal:Ezra Pound, Portal:Howard University, Portal:Ghosts, Portal:Belgrade, Portal:Boston Red Sox, Portal:Exoplanets. Manually checked each one, and found that in every case my AWB run was correct and the problem was that overloaded Lua modules had caused a Lua failure.
So the final list is the 1,144 pages identified in step 9.
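For anyone who wants to cross-check the text-processing steps without Notepad++ or AWB, here is a minimal Python sketch of steps 3 to 5, the MFD-tag test from step 7, and the set comparison from step 10. It is not what was actually used for this nomination. The sample contribution lines, the function names and the idea of passing in wikitext and category members as plain strings and lists are all assumptions made for illustration; fetching the pages is left out entirely.

```python
import re

# U+200E LEFT-TO-RIGHT MARK, written as "X" in the regexes quoted in step 3.
LRM = "\u200e"

# Step 3: the two "strip irrelevancies" regexes, with X replaced by the real LRM.
STRIP_PREFIX = re.compile(r"^.+\+[\d,]+" + LRM + r" N ")
STRIP_SUFFIX = re.compile(LRM + r" .+$")

# Step 7: equivalent of /{{mfd/i (braces escaped for Python's regex syntax).
MFD_TAG = re.compile(r"\{\{mfd", re.IGNORECASE)


def page_name(line):
    """Reduce one scraped contributions line to the bare page name."""
    return STRIP_SUFFIX.sub("", STRIP_PREFIX.sub("", line))


def linkify(name):
    """Step 4: turn a page name into a numbered-list wikilink."""
    return re.sub(r"^(.+)$", r"# [[:\1]]", name)


def unique(items):
    """Step 5: drop duplicates while keeping the original order."""
    return list(dict.fromkeys(items))


def is_tagged_for_mfd(wikitext):
    """Step 7: True if the supplied page text contains an MFD tag."""
    return MFD_TAG.search(wikitext) is not None


def missing_from_category(pages, category_members):
    """Step 10: pages in the list that are absent from the tracking category."""
    members = set(category_members)
    return [p for p in pages if p not in members]


# Hypothetical scraped lines; the real screen-scrape layout may differ in detail.
sample_lines = [
    "22:41, 12 September 2018 diff hist +1,234" + LRM + " N Portal:Example" + LRM + " Created page",
    "22:40, 12 September 2018 diff hist +987" + LRM + " N Portal:Sample" + LRM + " Created page",
    "22:41, 12 September 2018 diff hist +1,234" + LRM + " N Portal:Example" + LRM + " Created page",
]

names = unique(page_name(line) for line in sample_lines)
links = [linkify(name) for name in names]
print("\n".join(links))
```

The order-preserving de-duplication is a design choice for readability of the saved list; AWB's own "Remove duplicates" may order its output differently, so only the totals and the set of page names should be compared.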
I would welcome any checks by others on this process. --BrownHairedGirl (talk) • (contribs) 19:23, 14 April 2019 (UTC)