
MediaWiki-Recent-changes (Component)
Active · Public

Members (3)

Watchers (1)

Details

Description

Related to the recent changes feature in MediaWiki core, including Special:RecentChanges, Special:NewPages, the recentchanges API module, RCFeed, and the recentchanges database table.

For the edit patrolling feature, which often appears within parts of the Recent changes pages, use MediaWiki-Patrolling instead.

Maintained by the Moderator-Tools-Team. Please see this page for guidance on requesting code review.

Recent Activity

Yesterday

Etonkovidova added a comment to T164550: [1.29.0-wmf.21] RC filters - Monobook UI issues with highlights.

The current check in beta labs shows some improvements:

  • the x icons in the filter bubbles are aligned properly
  • the checkboxes are present
  • the look and scale of the pencil icon are improved

Screen Shot 2025-05-23 at 3.54.38 PM.png (1×1 px, 270 KB)
Fri, May 23, 10:56 PM · MonoBook, Moderator-Tools-Team, MediaWiki-Recent-changes, Growth-Team-Filtering, Growth-Team, Edit-Review-Improvements-RC-Page
fnegri changed the status of T395122: Run maintain-views to create new ORES tables, a subtask of T391103: DBA Review of Tables that ORES Extension will create, from Open to In Progress.
Fri, May 23, 12:53 PM · Data-Persistence, Moderator-Tools-Team (Kanban), MediaWiki-Recent-changes, MediaWiki-extensions-ORES, Machine-Learning-Team
isarantopoulos added a comment to T391103: DBA Review of Tables that ORES Extension will create.

Yes that makes sense. Once again thank you both!

Fri, May 23, 12:10 PM · Data-Persistence, Moderator-Tools-Team (Kanban), MediaWiki-Recent-changes, MediaWiki-extensions-ORES, Machine-Learning-Team
taavi removed a project from T391103: DBA Review of Tables that ORES Extension will create: cloud-services-team.

I split that to T395122 since this task is going to get very confusing otherwise.

Fri, May 23, 12:10 PM · Data-Persistence, Moderator-Tools-Team (Kanban), MediaWiki-Recent-changes, MediaWiki-extensions-ORES, Machine-Learning-Team
Marostegui added a project to T391103: DBA Review of Tables that ORES Extension will create: cloud-services-team.
Fri, May 23, 12:01 PM · Data-Persistence, Moderator-Tools-Team (Kanban), MediaWiki-Recent-changes, MediaWiki-extensions-ORES, Machine-Learning-Team
Ladsgroup added a comment to T391103: DBA Review of Tables that ORES Extension will create.

I ran it with multiple db options, but only lawiki was processed. So the cookbook must either be run one wiki at a time (or via xargs) or the old-fashioned way. I'll leave that to WMCS to handle.
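A minimal sketch of the one-wiki-at-a-time approach mentioned above. It only builds and prints one command line per database and runs nothing. The flags are copied from the cookbook log in this feed (`maintain-views --replace-all --auto-depool --databases lawiki`); whether that command can be invoked directly in this form is an assumption, and the names `REMAINING_WIKIS` and `per_wiki_commands` are hypothetical.

```python
import shlex

# Wikis listed in the task comment as still needing views (copied from this feed).
REMAINING_WIKIS = [
    "cywiki", "bewiki", "kkwiki", "nnwiki", "mkwiki", "lawiki", "afwiki",
    "tewiki", "mrwiki", "swwiki", "mlwiki", "iswiki", "pawiki", "hawiki",
    "tlwiki", "bnwiki", "azwiki",
]

# Flags as they appear in the cookbook log; direct invocation is an assumption.
BASE_COMMAND = ["maintain-views", "--replace-all", "--auto-depool", "--databases"]


def per_wiki_commands(wikis, already_done=frozenset()):
    """Return one shell-quoted command string per wiki, skipping finished ones."""
    return [
        shlex.join(BASE_COMMAND + [wiki])
        for wiki in wikis
        if wiki not in already_done
    ]


if __name__ == "__main__":
    # lawiki was the one database the earlier run did process.
    for command in per_wiki_commands(REMAINING_WIKIS, already_done={"lawiki"}):
        print(command)
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; the output could be reviewed and then fed to whatever runner the operators prefer.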

Fri, May 23, 11:56 AM · Data-Persistence, Moderator-Tools-Team (Kanban), MediaWiki-Recent-changes, MediaWiki-extensions-ORES, Machine-Learning-Team
ops-monitoring-bot added a comment to T391103: DBA Review of Tables that ORES Extension will create.

Cookbook cookbooks.sre.wikireplicas.update-views started by ladsgroup completed:

  • an-redacteddb1001.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --databases lawiki'
Fri, May 23, 11:51 AM · Data-Persistence, Moderator-Tools-Team (Kanban), MediaWiki-Recent-changes, MediaWiki-extensions-ORES, Machine-Learning-Team
ops-monitoring-bot added a comment to T391103: DBA Review of Tables that ORES Extension will create.

Cookbook cookbooks.sre.wikireplicas.update-views run by ladsgroup: Started updating wiki replica views

Fri, May 23, 11:47 AM · Data-Persistence, Moderator-Tools-Team (Kanban), MediaWiki-Recent-changes, MediaWiki-extensions-ORES, Machine-Learning-Team
Ladsgroup added a comment to T391103: DBA Review of Tables that ORES Extension will create.

Could someone run the maintain-views script for the above? Thank youu 🙏

First the tables must be created in production; if they do not exist, maintain-views will be a no-op. Let me know once you have created the tables and I can run it. (I also think you can create the tables directly without enabling the extension or deploying anything; they will sit there empty, but that's fine.)

Fri, May 23, 11:41 AM · Data-Persistence, Moderator-Tools-Team (Kanban), MediaWiki-Recent-changes, MediaWiki-extensions-ORES, Machine-Learning-Team
kostajh added a comment to T178902: Filter on age in new recent changes.

The idea is especially interesting while patrolling new articles made by new editors, to give them time to finish the article before it is tagged with maintenance templates.

Fri, May 23, 9:36 AM · Moderator-Tools-Team, MediaWiki-Recent-changes, Growth-Team-Filtering, Growth-Team, Edit-Review-Improvements-RC-Page
isarantopoulos updated subscribers of T391103: DBA Review of Tables that ORES Extension will create.

From the above initial list the extension is already enabled in the following wikis:

simplewiki
trwiki

And we have also enabled it in idwiki.
I have created the tables for the remaining list of wikis (ores_classification and ores_model tables).
This is the list of the remaining wikis, as this step precedes the extension installation:

cywiki
bewiki
kkwiki
nnwiki
mkwiki
lawiki
afwiki
tewiki
mrwiki
swwiki
mlwiki
iswiki
pawiki
hawiki
tlwiki
bnwiki
azwiki

Could someone run the maintain-views script for the above? Thank youu 🙏
cc: @Ladsgroup @taavi

Fri, May 23, 8:35 AM · Data-Persistence, Moderator-Tools-Team (Kanban), MediaWiki-Recent-changes, MediaWiki-extensions-ORES, Machine-Learning-Team
Marostegui edited projects for T391103: DBA Review of Tables that ORES Extension will create, added: Data-Persistence; removed DBA.
Fri, May 23, 8:05 AM · Data-Persistence, Moderator-Tools-Team (Kanban), MediaWiki-Recent-changes, MediaWiki-extensions-ORES, Machine-Learning-Team

Thu, May 22

A_smart_kitten added a comment to T389976: Recent Changes Tagged Edit Filters help icon causes the search input to clear and opens the default menu.

Turns out that the patch here seems to have also fixed T210816: [1.33.0-wmf.6 - regression] RC/Watchlist - red outline for "oo-ui-inputWidget-input" at the same time :D

Thu, May 22, 9:05 PM · MW-1.45-notes (1.45.0-wmf.1; 2025-05-13), Moderator-Tools-Team (Kanban), MediaWiki-Recent-changes
bd808 added a comment to T304085: PHP Deprecated: Caller ignored an error raised from SpecialRecentChanges::doMainQuery (max_statement_time exceeded).

Still seeing a steady drip of these.

Thu, May 22, 8:56 PM · Moderator-Tools-Team, MediaWiki-Recent-changes, Wikimedia-production-error
A_smart_kitten reassigned T210816: [1.33.0-wmf.6 - regression] RC/Watchlist - red outline for "oo-ui-inputWidget-input" from Etonkovidova to Kgraessle.
Thu, May 22, 8:43 PM · good first task, MediaWiki-Recent-changes, Moderator-Tools-Team, Growth-Team-Filtering, Edit-Review-Improvements-RC-Page, Regression, Growth-Team
Etonkovidova closed T210816: [1.33.0-wmf.6 - regression] RC/Watchlist - red outline for "oo-ui-inputWidget-input" as Resolved.

@Etonkovidova Huh, seems like it might have been resolved since my previous comment! I can reproduce the issue on a newly-created patch demo wiki running REL1_44, but - following the steps in the task description - I am also unable to reproduce the bug on testwiki running 1.45.0-wmf.2.

From using git bisect, it seems like it may have been fixed by https://gerrit.wikimedia.org/r/1138844 (cc @Kgraessle), which was merged as the fix for T389976: Recent Changes Tagged Edit Filters help icon causes the search input to clear and opens the default menu.

Thu, May 22, 8:33 PM · good first task, MediaWiki-Recent-changes, Moderator-Tools-Team, Growth-Team-Filtering, Edit-Review-Improvements-RC-Page, Regression, Growth-Team
A_smart_kitten updated subscribers of T210816: [1.33.0-wmf.6 - regression] RC/Watchlist - red outline for "oo-ui-inputWidget-input".

@Etonkovidova Huh, seems like it might have been resolved since my previous comment! I can reproduce the issue on a newly-created patch demo wiki running REL1_44, but - following the steps in the task description - I am also unable to reproduce the bug on testwiki running 1.45.0-wmf.2.

Thu, May 22, 8:25 PM · good first task, MediaWiki-Recent-changes, Moderator-Tools-Team, Growth-Team-Filtering, Edit-Review-Improvements-RC-Page, Regression, Growth-Team
Etonkovidova added a comment to T210816: [1.33.0-wmf.6 - regression] RC/Watchlist - red outline for "oo-ui-inputWidget-input".

I can still reproduce (on 1.44.0-wmf.28) following the steps in the task description.
(Side note, I'm not sure when this red border is intended to appear - figuring that out might require an archaeological dive into some historical Phabricator tickets!)

Thu, May 22, 6:43 PM · good first task, MediaWiki-Recent-changes, Moderator-Tools-Team, Growth-Team-Filtering, Edit-Review-Improvements-RC-Page, Regression, Growth-Team
Kgraessle added a project to T200972: Modal for result and period control on new Watchlist/RCFilters missing bottom padding : good first task.

Thank you for tagging this task with good first task for Wikimedia newcomers!

Thu, May 22, 5:40 PM · good first task, Moderator-Tools-Team, Design, MediaWiki-Recent-changes, Edit-Review-Improvements-RC-Page, MediaWiki-Watchlist
Kgraessle added a project to T210816: [1.33.0-wmf.6 - regression] RC/Watchlist - red outline for "oo-ui-inputWidget-input": good first task.

Thank you for tagging this task with good first task for Wikimedia newcomers!

Thu, May 22, 5:39 PM · good first task, MediaWiki-Recent-changes, Moderator-Tools-Team, Growth-Team-Filtering, Edit-Review-Improvements-RC-Page, Regression, Growth-Team
Kgraessle closed T132568: [Task] Refactor Special:Recentchanges to use WatchedItemStore as Resolved.
Thu, May 22, 5:37 PM · Technical-Debt, MediaWiki-Recent-changes, Moderator-Tools-Team, User-Addshore, MW-1.27-release (WMF-deploy-2016-04-26_(1.27.0-wmf.22)), MW-1.27-release-notes, TCB-Team-Sprint-2016-04-13, MediaWiki-Watchlist, German-Community-Wishlist
Kgraessle closed T158956: The filter panel for Recent changes does not adapt well to narrow windows as Resolved.

This looks to be resolved already.

Thu, May 22, 5:36 PM · Moderator-Tools-Team, MediaWiki-Recent-changes, Edit-Review-Improvements-RC-Page
Kgraessle moved T160682: It is unclear what the 'Hide patrolled pages from new page list' setting applies to from Inbox to Triaged on the Moderator-Tools-Team board.
Thu, May 22, 5:34 PM · Beta-Cluster-reproducible, MediaWiki-Core-Preferences, MediaWiki-Recent-changes, Moderator-Tools-Team, Growth-Team-Filtering, Edit-Review-Improvements-RC-Page, Growth-Team
Kgraessle moved T161176: [minor] All rclinks have the same tooltip 'Special:RecentChanges' from Inbox to Maintenance priorities on the Moderator-Tools-Team board.
Thu, May 22, 5:32 PM · Moderator-Tools-Team, MediaWiki-Recent-changes
Kgraessle moved T168615: Surface sub-categories when filtering Recent Changes from Inbox to Triaged on the Moderator-Tools-Team board.
Thu, May 22, 5:31 PM · Moderator-Tools-Team, MediaWiki-Recent-changes, Edit-Review-Improvements-Integrated-Filters
Kgraessle added a comment to T168615: Surface sub-categories when filtering Recent Changes.

This is stalled on T307328: Scalability issues of recentchanges table.

Thu, May 22, 5:31 PM · Moderator-Tools-Team, MediaWiki-Recent-changes, Edit-Review-Improvements-Integrated-Filters
Kgraessle moved T178902: Filter on age in new recent changes from Inbox to Triaged on the Moderator-Tools-Team board.
Thu, May 22, 5:29 PM · Moderator-Tools-Team, MediaWiki-Recent-changes, Growth-Team-Filtering, Growth-Team, Edit-Review-Improvements-RC-Page
Kgraessle added a comment to T178902: Filter on age in new recent changes.

We may be stalled working on this due to T307328: Scalability issues of recentchanges table.

Thu, May 22, 5:29 PM · Moderator-Tools-Team, MediaWiki-Recent-changes, Growth-Team-Filtering, Growth-Team, Edit-Review-Improvements-RC-Page
Kgraessle moved T183729: Allow the 'Live updates' feature to be turned on by default from Inbox to Triaged on the Moderator-Tools-Team board.
Thu, May 22, 5:24 PM · MediaWiki-Watchlist, MediaWiki-Recent-changes, Moderator-Tools-Team, Growth-Team-Filtering, Growth-Team, Edit-Review-Improvements-RC-Page
Kgraessle moved T200972: Modal for result and period control on new Watchlist/RCFilters missing bottom padding from Inbox to To be estimated on the Moderator-Tools-Team board.
Thu, May 22, 5:24 PM · good first task, Moderator-Tools-Team, Design, MediaWiki-Recent-changes, Edit-Review-Improvements-RC-Page, MediaWiki-Watchlist
Kgraessle moved T210816: [1.33.0-wmf.6 - regression] RC/Watchlist - red outline for "oo-ui-inputWidget-input" from Inbox to To be estimated on the Moderator-Tools-Team board.
Thu, May 22, 5:22 PM · good first task, MediaWiki-Recent-changes, Moderator-Tools-Team, Growth-Team-Filtering, Edit-Review-Improvements-RC-Page, Regression, Growth-Team
Kgraessle moved T394379: Mediawiki 1.43: PageUpdater::pageIdentity inconsistent with PageUpdater::wikiPage in PageUpdater::doCreate leading to failed assertion in RecentChange::notifyNew from Inbox to Triaged on the Moderator-Tools-Team board.
Thu, May 22, 5:18 PM · Moderator-Tools-Team, MediaWiki-Recent-changes, MediaWiki-Page-derived-data
Kgraessle moved T164550: [1.29.0-wmf.21] RC filters - Monobook UI issues with highlights from Design backlog to To be estimated on the Moderator-Tools-Team board.
Thu, May 22, 5:16 PM · MonoBook, Moderator-Tools-Team, MediaWiki-Recent-changes, Growth-Team-Filtering, Growth-Team, Edit-Review-Improvements-RC-Page
Kgraessle moved T164550: [1.29.0-wmf.21] RC filters - Monobook UI issues with highlights from Inbox to Design backlog on the Moderator-Tools-Team board.
Thu, May 22, 5:16 PM · MonoBook, Moderator-Tools-Team, MediaWiki-Recent-changes, Growth-Team-Filtering, Growth-Team, Edit-Review-Improvements-RC-Page
Kgraessle moved T173259: Clarify terminology around "tag", "filter" with old page and new page from Inbox to Triaged on the Moderator-Tools-Team board.
Thu, May 22, 5:13 PM · MediaWiki-Recent-changes, Moderator-Tools-Team, Growth-Team-Filtering, Growth-Team, Edit-Review-Improvements-RC-Page
Kgraessle moved T394840: Edit not saved with the "mw-changed-redirect-target" tag or in RecentChanges from Inbox to Triaged on the Moderator-Tools-Team board.
Thu, May 22, 5:10 PM · MediaWiki-Recent-changes, MediaWiki-Change-tagging, Moderator-Tools-Team
Kgraessle moved T394939: Decommission RecentChanges experiment platform instrument and analyze the results from Inbox to To be estimated on the Moderator-Tools-Team board.
Thu, May 22, 5:09 PM · MediaWiki-Recent-changes, Moderator-Tools-Team
Kgraessle moved T394937: Create and implement an instrument for running an experiment on RecentChanges from Inbox to To be estimated on the Moderator-Tools-Team board.
Thu, May 22, 5:09 PM · MediaWiki-Watchlist, MediaWiki-Recent-changes, Moderator-Tools-Team
Kgraessle moved T394935: Configure the metrics platform stream for an experiment on RecentChanges from Code review requests to To be estimated on the Moderator-Tools-Team board.
Thu, May 22, 5:09 PM · MediaWiki-Watchlist, MediaWiki-Recent-changes, Moderator-Tools-Team
Kgraessle moved T394935: Configure the metrics platform stream for an experiment on RecentChanges from Inbox to Code review requests on the Moderator-Tools-Team board.
Thu, May 22, 5:09 PM · MediaWiki-Watchlist, MediaWiki-Recent-changes, Moderator-Tools-Team
Kgraessle closed T375280: PopulateDatabase errors out and stops processing revisions when any revertRiskLiftWingRequest request fails, a subtask of T391964: [Epic] Recent Changes ORES Enabled Revert Risk Powered Filters Rollout Plan, as Resolved.
Thu, May 22, 4:06 PM · Epic, MediaWiki-extensions-ORES, MediaWiki-Recent-changes, DBA, Moderator-Tools-Team, Machine-Learning-Team
Kgraessle closed T394455: Ensure all ORES i18n messages are available for idwiki, a subtask of T391964: [Epic] Recent Changes ORES Enabled Revert Risk Powered Filters Rollout Plan, as Resolved.
Thu, May 22, 4:04 PM · Epic, MediaWiki-extensions-ORES, MediaWiki-Recent-changes, DBA, Moderator-Tools-Team, Machine-Learning-Team
Ladsgroup placed T304085: PHP Deprecated: Caller ignored an error raised from SpecialRecentChanges::doMainQuery (max_statement_time exceeded) up for grabs.

This is not really a database issue. It's an rdbms library issue.

Thu, May 22, 11:52 AM · Moderator-Tools-Team, MediaWiki-Recent-changes, Wikimedia-production-error
gkyziridis added a comment to T392148: Run analysis to retrieve thresholds for high impact wikis to deploy recent changes revert risk language agnostic filters to.

RevertRisk Thresholds Analysis for all wikis

Using this notebook or this python_script, I generated revert risk thresholds for all wikis in one go, loading the data for each wiki into memory iteratively. You can check the plots under the main section at the bottom of the notebook. The script can run the analysis either on a single wiki given as user input or on all wikis.
The script ran at eqiad8 on jupyterlab; you can find the results in this paste:

============ - cywiki - ============
 - Raw data shape: (200511, 17)
 - Duplicate rows found and removed: 1265
 - Clean data shape: (199246, 17)
 - Unique revision_ids: 199246 | Data Shape: 199246 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (190747, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_cywiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.11076630651950836
 - confusion_matrix_cywiki.png saved!
 - False Positive Rate is: 0.14999340149691756
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted        135259     23868
reverted             15169     16451


============ - simplewiki - ============
 - Raw data shape: (312209, 17)
 - Duplicate rows found and removed: 41914
 - Clean data shape: (270295, 17)
 - Number of duplicated revision_ids found: 9
 - Unique revision_ids: 270289 | Data Shape: 270289 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (246893, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_simplewiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9065559506416321
 - confusion_matrix_simplewiki.png saved!
 - False Positive Rate is: 0.1500198513981056
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted        179832     31740
reverted             13622     21699


============ - bewiki - ============
 - Raw data shape: (80609, 17)
 - Duplicate rows found and removed: 6221
 - Clean data shape: (74388, 17)
 - Unique revision_ids: 74388 | Data Shape: 74388 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (73969, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_bewiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.572495698928833
 - confusion_matrix_bewiki.png saved!
 - False Positive Rate is: 0.15014515086058638
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         61770     10913
reverted               183      1103


============ - kkwiki - ============
 - Raw data shape: (82268, 17)
 - Duplicate rows found and removed: 16708
 - Clean data shape: (65560, 17)
 - Unique revision_ids: 65560 | Data Shape: 65560 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (64276, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_kkwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6048707962036133
 - confusion_matrix_kkwiki.png saved!
 - False Positive Rate is: 0.14999318290272
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         49875      8801
reverted              1475      4125


============ - nnwiki - ============
 - Raw data shape: (25248, 17)
 - Duplicate rows found and removed: 4213
 - Clean data shape: (21035, 17)
 - Unique revision_ids: 21035 | Data Shape: 21035 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (20392, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_nnwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4162459373474121
 - confusion_matrix_nnwiki.png saved!
 - False Positive Rate is: 0.15001312680493567
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         16188      2857
reverted                48      1299


============ - mkwiki - ============
 - Raw data shape: (54028, 17)
 - Duplicate rows found and removed: 8215
 - Clean data shape: (45813, 17)
 - Unique revision_ids: 45813 | Data Shape: 45813 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (44585, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mkwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.505830705165863
 - confusion_matrix_mkwiki.png saved!
 - False Positive Rate is: 0.15001292809627906
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         36161      6382
reverted               507      1535


============ - lawiki - ============
 - Raw data shape: (27151, 17)
 - Duplicate rows found and removed: 3948
 - Clean data shape: (23203, 17)
 - Unique revision_ids: 23203 | Data Shape: 23203 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (22893, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_lawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6340628266334534
 - confusion_matrix_lawiki.png saved!
 - False Positive Rate is: 0.14979973297730306
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         19104      3366
reverted                80       343


============ - afwiki - ============
 - Raw data shape: (21996, 17)
 - Duplicate rows found and removed: 3614
 - Clean data shape: (18382, 17)
 - Unique revision_ids: 18382 | Data Shape: 18382 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (17768, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_afwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7369382977485657
 - confusion_matrix_afwiki.png saved!
 - False Positive Rate is: 0.1498459410174181
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         13520      2383
reverted               214      1651

============ - tewiki - ============
 - Raw data shape: (97488, 17)
 - Duplicate rows found and removed: 5150
 - Clean data shape: (92338, 17)
 - Unique revision_ids: 92338 | Data Shape: 92338 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (91883, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tewiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.36725547909736633
 - confusion_matrix_tewiki.png saved!
 - False Positive Rate is: 0.1500330323717243
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         77194     13626
reverted               242       821


============ - mrwiki - ============
 - Raw data shape: (42535, 17)
 - Duplicate rows found and removed: 3930
 - Clean data shape: (38605, 17)
 - Number of duplicated revision_ids found: 2
 - Unique revision_ids: 38604 | Data Shape: 38604 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (37677, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mrwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8673532605171204
 - confusion_matrix_mrwiki.png saved!
 - False Positive Rate is: 0.15002842524161455
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         29902      5278
reverted               575      1922


============ - swwiki - ============
 - Raw data shape: (10831, 17)
 - Duplicate rows found and removed: 682
 - Clean data shape: (10149, 17)
 - Unique revision_ids: 10149 | Data Shape: 10149 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (9971, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_swwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7482092380523682
 - confusion_matrix_swwiki.png saved!
 - False Positive Rate is: 0.1493978517955951
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted          7840      1377
reverted               322       432


============ - mlwiki - ============
 - Raw data shape: (32931, 17)
 - Duplicate rows found and removed: 5245
 - Clean data shape: (27686, 17)
 - Unique revision_ids: 27686 | Data Shape: 27686 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (27032, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mlwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9362650513648987
 - confusion_matrix_mlwiki.png saved!
 - False Positive Rate is: 0.15019467495182287
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         21608      3819
reverted               810       795


============ - iswiki - ============
 - Raw data shape: (17947, 17)
 - Duplicate rows found and removed: 4129
 - Clean data shape: (13818, 17)
 - Unique revision_ids: 13818 | Data Shape: 13818 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (13452, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_iswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8892603516578674
 - confusion_matrix_iswiki.png saved!
 - False Positive Rate is: 0.15050732807215333
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         10549      1869
reverted               430       604


============ - pawiki - ============
 - Raw data shape: (20662, 17)
 - Duplicate rows found and removed: 2817
 - Clean data shape: (17845, 17)
 - Unique revision_ids: 17845 | Data Shape: 17845 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (17030, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_pawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5458896160125732
 - confusion_matrix_pawiki.png saved!
 - False Positive Rate is: 0.14929753356228537
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         13624      2391
reverted                88       927


============ - hawiki - ============
 - Raw data shape: (142582, 17)
 - Duplicate rows found and removed: 11926
 - Clean data shape: (130656, 17)
 - Unique revision_ids: 130656 | Data Shape: 130656 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (130286, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_hawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4823181927204132
 - confusion_matrix_hawiki.png saved!
 - False Positive Rate is: 0.15009778484218073
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted        109079     19264
reverted              1329       614


============ - tlwiki - ============
 - Raw data shape: (29823, 17)
 - Duplicate rows found and removed: 2465
 - Clean data shape: (27358, 17)
 - Unique revision_ids: 27358 | Data Shape: 27358 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (26356, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tlwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.607416570186615
 - confusion_matrix_tlwiki.png saved!
 - False Positive Rate is: 0.15018641595072135
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         20970      3706
reverted               176      1504


============ - bnwiki - ============
 - Raw data shape: (330764, 17)
 - Duplicate rows found and removed: 29591
 - Clean data shape: (301173, 17)
 - Number of duplicated revision_ids found: 10
 - Unique revision_ids: 301166 | Data Shape: 301166 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (292405, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_bnwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6465859413146973
 - confusion_matrix_bnwiki.png saved!
 - False Positive Rate is: 0.15002477700693756
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted        233274     41174
reverted              4019     13938


============ - trwiki - ============
 - Raw data shape: (749190, 17)
 - Duplicate rows found and removed: 116314
 - Clean data shape: (632876, 17)
 - Number of duplicated revision_ids found: 4
 - Unique revision_ids: 632874 | Data Shape: 632874 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (581675, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_trwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6082413196563721
 - confusion_matrix_trwiki.png saved!
 - False Positive Rate is: 0.14998876387080776
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted        438769     77423
reverted             10108     55375


============ - azwiki - ============
 - Raw data shape: (224309, 17)
 - Duplicate rows found and removed: 23643
 - Clean data shape: (200666, 17)
 - Number of duplicated revision_ids found: 4
 - Unique revision_ids: 200664 | Data Shape: 200664 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (194127, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_azwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5366107821464539
 - confusion_matrix_azwiki.png saved!
 - False Positive Rate is: 0.14990706525346845
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted        153673     27099
reverted              2012     11343


Optimal Thresholds calculated at 16-05-2025T19:24:01
{'cywiki': 0.11076631, 'simplewiki': 0.90655595, 'bewiki': 0.5724957, 'kkwiki': 0.6048708, 'nnwiki': 0.41624594, 'mkwiki': 0.5058307, 'lawiki': 0.6340628, 'afwiki': 0.7369383, 'tewiki': 0.36725548, 'mrwiki': 0.86735326, 'swwiki': 0.74820924, 'mlwiki': 0.93626505, 'iswiki': 0.88926035, 'pawiki': 0.5458896, 'hawiki': 0.4823182, 'tlwiki': 0.60741657, 'bnwiki': 0.64658594, 'trwiki': 0.6082413, 'azwiki': 0.5366108}

Time taken: 4278.109 secs

The execution takes around 70 minutes.
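The linked notebook and script are not reproduced on this page, so the following is only a sketch of how a threshold for a target false positive rate could be picked from scores and labels. It is not the author's method. The function names and the convention that an edit is flagged when its score is greater than or equal to the threshold are assumptions.

```python
from bisect import bisect_left


def false_positive_rate(negatives_sorted, threshold):
    """Share of non-reverted edits whose score is >= threshold."""
    flagged = len(negatives_sorted) - bisect_left(negatives_sorted, threshold)
    return flagged / len(negatives_sorted)


def threshold_for_fpr(scores, labels, target_fpr=0.15):
    """Return the lowest observed score usable as a threshold within target_fpr.

    scores: revert-risk scores, one per edit
    labels: True where the edit was actually reverted
    Returns None when no observed score keeps the FPR within the target.
    """
    negatives_sorted = sorted(s for s, y in zip(scores, labels) if not y)
    if not negatives_sorted:
        raise ValueError("need at least one non-reverted edit")
    # FPR can only fall as the threshold rises, so the first hit is the lowest.
    for candidate in sorted(set(scores)):
        if false_positive_rate(negatives_sorted, candidate) <= target_fpr:
            return candidate
    return None


if __name__ == "__main__":
    # Toy data: 20 non-reverted edits with evenly spaced scores, 3 reverted.
    toy_scores = [i / 20 for i in range(20)] + [0.82, 0.9, 0.3]
    toy_labels = [False] * 20 + [True] * 3
    print(threshold_for_fpr(toy_scores, toy_labels, target_fpr=0.15))
```

Choosing the lowest qualifying threshold keeps as many truly reverted edits flagged as possible while staying within the false positive budget.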

Optimal Thresholds

Wiki        Optimal threshold
cywiki      0.11076631
simplewiki  0.90655595
bewiki      0.5724957
kkwiki      0.6048708
nnwiki      0.41624594
mkwiki      0.5058307
lawiki      0.6340628
afwiki      0.7369383
tewiki      0.36725548
mrwiki      0.86735326
swwiki      0.74820924
mlwiki      0.93626505
iswiki      0.88926035
pawiki      0.5458896
hawiki      0.4823182
tlwiki      0.60741657
bnwiki      0.64658594
trwiki      0.6082413
azwiki      0.5366108

@Kgraessle would it be possible to review the results above? Do they seem correct based on your experience? Do they match the outcome we would intuitively expect for each wiki?
For a more in-depth evaluation, you can look at the ROC curve and confusion matrix plots at the bottom of this notebook.

Yeah, I can take a look. Is it OK if I prioritize this for next week?

Thu, May 22, 10:01 AM · Moderator-Tools-Team, Machine-Learning-Team, MediaWiki-Recent-changes

Wed, May 21

Kgraessle added a comment to T392148: Run analysis to retrieve thresholds for high impact wikis to deploy recent changes revert risk language agnostic filters to.

RevertRisk Thresholds Analysis for all wikis

Using this notebook or this python_script, I generated revert risk thresholds for all wikis in one run, loading the data for each wiki into memory one at a time. You can check the plots under the main section at the bottom of the notebook. The script can run the analysis either on a single wiki given as user input or on all wikis.
The script ran on eqiad8 in JupyterLab; you can find the results in this paste:

============ - cywiki - ============
 - Raw data shape: (200511, 17)
 - Duplicate rows found and removed: 1265
 - Clean data shape: (199246, 17)
 - Unique revision_ids: 199246 | Data Shape: 199246 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (190747, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_cywiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.11076630651950836
 - confusion_matrix_cywiki.png saved!
 - False Positive Rate is: 0.14999340149691756
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted        135259     23868
reverted             15169     16451

============ - simplewiki - ============
 - Raw data shape: (312209, 17)
 - Duplicate rows found and removed: 41914
 - Clean data shape: (270295, 17)
 - Number of duplicated revision_ids found: 9
 - Unique revision_ids: 270289 | Data Shape: 270289 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (246893, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_simplewiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9065559506416321
 - confusion_matrix_simplewiki.png saved!
 - False Positive Rate is: 0.1500198513981056
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted        179832     31740
reverted             13622     21699

============ - bewiki - ============
 - Raw data shape: (80609, 17)
 - Duplicate rows found and removed: 6221
 - Clean data shape: (74388, 17)
 - Unique revision_ids: 74388 | Data Shape: 74388 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (73969, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_bewiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.572495698928833
 - confusion_matrix_bewiki.png saved!
 - False Positive Rate is: 0.15014515086058638
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         61770     10913
reverted               183      1103

============ - kkwiki - ============
 - Raw data shape: (82268, 17)
 - Duplicate rows found and removed: 16708
 - Clean data shape: (65560, 17)
 - Unique revision_ids: 65560 | Data Shape: 65560 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (64276, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_kkwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6048707962036133
 - confusion_matrix_kkwiki.png saved!
 - False Positive Rate is: 0.14999318290272
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         49875      8801
reverted              1475      4125

============ - nnwiki - ============
 - Raw data shape: (25248, 17)
 - Duplicate rows found and removed: 4213
 - Clean data shape: (21035, 17)
 - Unique revision_ids: 21035 | Data Shape: 21035 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (20392, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_nnwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4162459373474121
 - confusion_matrix_nnwiki.png saved!
 - False Positive Rate is: 0.15001312680493567
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         16188      2857
reverted                48      1299

============ - mkwiki - ============
 - Raw data shape: (54028, 17)
 - Duplicate rows found and removed: 8215
 - Clean data shape: (45813, 17)
 - Unique revision_ids: 45813 | Data Shape: 45813 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (44585, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mkwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.505830705165863
 - confusion_matrix_mkwiki.png saved!
 - False Positive Rate is: 0.15001292809627906
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         36161      6382
reverted               507      1535

============ - lawiki - ============
 - Raw data shape: (27151, 17)
 - Duplicate rows found and removed: 3948
 - Clean data shape: (23203, 17)
 - Unique revision_ids: 23203 | Data Shape: 23203 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (22893, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_lawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6340628266334534
 - confusion_matrix_lawiki.png saved!
 - False Positive Rate is: 0.14979973297730306
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         19104      3366
reverted                80       343

============ - afwiki - ============
 - Raw data shape: (21996, 17)
 - Duplicate rows found and removed: 3614
 - Clean data shape: (18382, 17)
 - Unique revision_ids: 18382 | Data Shape: 18382 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (17768, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_afwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7369382977485657
 - confusion_matrix_afwiki.png saved!
 - False Positive Rate is: 0.1498459410174181
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         13520      2383
reverted               214      1651

============ - tewiki - ============
 - Raw data shape: (97488, 17)
 - Duplicate rows found and removed: 5150
 - Clean data shape: (92338, 17)
 - Unique revision_ids: 92338 | Data Shape: 92338 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (91883, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tewiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.36725547909736633
 - confusion_matrix_tewiki.png saved!
 - False Positive Rate is: 0.1500330323717243
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         77194     13626
reverted               242       821

============ - mrwiki - ============
 - Raw data shape: (42535, 17)
 - Duplicate rows found and removed: 3930
 - Clean data shape: (38605, 17)
 - Number of duplicated revision_ids found: 2
 - Unique revision_ids: 38604 | Data Shape: 38604 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (37677, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mrwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8673532605171204
 - confusion_matrix_mrwiki.png saved!
 - False Positive Rate is: 0.15002842524161455
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         29902      5278
reverted               575      1922

============ - swwiki - ============
 - Raw data shape: (10831, 17)
 - Duplicate rows found and removed: 682
 - Clean data shape: (10149, 17)
 - Unique revision_ids: 10149 | Data Shape: 10149 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (9971, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_swwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7482092380523682
 - confusion_matrix_swwiki.png saved!
 - False Positive Rate is: 0.1493978517955951
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted          7840      1377
reverted               322       432

============ - mlwiki - ============
 - Raw data shape: (32931, 17)
 - Duplicate rows found and removed: 5245
 - Clean data shape: (27686, 17)
 - Unique revision_ids: 27686 | Data Shape: 27686 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (27032, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mlwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9362650513648987
 - confusion_matrix_mlwiki.png saved!
 - False Positive Rate is: 0.15019467495182287
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         21608      3819
reverted               810       795

============ - iswiki - ============
 - Raw data shape: (17947, 17)
 - Duplicate rows found and removed: 4129
 - Clean data shape: (13818, 17)
 - Unique revision_ids: 13818 | Data Shape: 13818 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (13452, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_iswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8892603516578674
 - confusion_matrix_iswiki.png saved!
 - False Positive Rate is: 0.15050732807215333
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         10549      1869
reverted               430       604

============ - pawiki - ============
 - Raw data shape: (20662, 17)
 - Duplicate rows found and removed: 2817
 - Clean data shape: (17845, 17)
 - Unique revision_ids: 17845 | Data Shape: 17845 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (17030, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_pawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5458896160125732
 - confusion_matrix_pawiki.png saved!
 - False Positive Rate is: 0.14929753356228537
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         13624      2391
reverted                88       927

============ - hawiki - ============
 - Raw data shape: (142582, 17)
 - Duplicate rows found and removed: 11926
 - Clean data shape: (130656, 17)
 - Unique revision_ids: 130656 | Data Shape: 130656 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (130286, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_hawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4823181927204132
 - confusion_matrix_hawiki.png saved!
 - False Positive Rate is: 0.15009778484218073
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted        109079     19264
reverted              1329       614

============ - tlwiki - ============
 - Raw data shape: (29823, 17)
 - Duplicate rows found and removed: 2465
 - Clean data shape: (27358, 17)
 - Unique revision_ids: 27358 | Data Shape: 27358 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (26356, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tlwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.607416570186615
 - confusion_matrix_tlwiki.png saved!
 - False Positive Rate is: 0.15018641595072135
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted         20970      3706
reverted               176      1504

============ - bnwiki - ============
 - Raw data shape: (330764, 17)
 - Duplicate rows found and removed: 29591
 - Clean data shape: (301173, 17)
 - Number of duplicated revision_ids found: 10
 - Unique revision_ids: 301166 | Data Shape: 301166 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (292405, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_bnwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6465859413146973
 - confusion_matrix_bnwiki.png saved!
 - False Positive Rate is: 0.15002477700693756
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted        233274     41174
reverted              4019     13938

============ - trwiki - ============
 - Raw data shape: (749190, 17)
 - Duplicate rows found and removed: 116314
 - Clean data shape: (632876, 17)
 - Number of duplicated revision_ids found: 4
 - Unique revision_ids: 632874 | Data Shape: 632874 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (581675, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_trwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6082413196563721
 - confusion_matrix_trwiki.png saved!
 - False Positive Rate is: 0.14998876387080776
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted        438769     77423
reverted             10108     55375

============ - azwiki - ============
 - Raw data shape: (224309, 17)
 - Duplicate rows found and removed: 23643
 - Clean data shape: (200666, 17)
 - Number of duplicated revision_ids found: 4
 - Unique revision_ids: 200664 | Data Shape: 200664 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (194127, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_azwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5366107821464539
 - confusion_matrix_azwiki.png saved!
 - False Positive Rate is: 0.14990706525346845
 - CONFUSION MATRIX -

Predicted     not reverted  reverted
Actual
not reverted        153673     27099
reverted              2012     11343

Optimal Thresholds calculated at 16-05-2025T19:24:01
{'cywiki': 0.11076631, 'simplewiki': 0.90655595, 'bewiki': 0.5724957, 'kkwiki': 0.6048708, 'nnwiki': 0.41624594, 'mkwiki': 0.5058307, 'lawiki': 0.6340628, 'afwiki': 0.7369383, 'tewiki': 0.36725548, 'mrwiki': 0.86735326, 'swwiki': 0.74820924, 'mlwiki': 0.93626505, 'iswiki': 0.88926035, 'pawiki': 0.5458896, 'hawiki': 0.4823182, 'tlwiki': 0.60741657, 'bnwiki': 0.64658594, 'trwiki': 0.6082413, 'azwiki': 0.5366108}

Time taken: 4278.109 secs

The execution takes around 70 minutes.
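
The cleaning steps reported at the top of each wiki block in the log (dropping duplicate rows, keeping one row per revision_id, and removing edits that are themselves reverts) can be sketched in plain Python as follows. This is a minimal illustration, not the notebook's code: the function name `clean_rows` and the `is_revert` field are hypothetical, keeping the first row per revision_id is an assumption, and the real script presumably works on a DataFrame. Only `revision_id` and `revert_risk_score` are names taken from the log.

```python
def clean_rows(rows):
    """Apply the three cleaning steps to a list of dict rows (hypothetical helper)."""
    # Step 1: drop exact duplicate rows, keeping the first occurrence.
    seen_rows, deduped = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen_rows:
            seen_rows.add(key)
            deduped.append(row)

    # Step 2: keep one row per revision_id (assumption: the first one wins).
    seen_ids, unique = set(), []
    for row in deduped:
        if row["revision_id"] not in seen_ids:
            seen_ids.add(row["revision_id"])
            unique.append(row)

    # Step 3: drop edits that are reverts; "is_revert" is a made-up field name.
    return [row for row in unique if not row["is_revert"]]


sample = [
    {"revision_id": 1, "revert_risk_score": 0.2, "is_revert": False},
    {"revision_id": 1, "revert_risk_score": 0.2, "is_revert": False},  # exact duplicate
    {"revision_id": 2, "revert_risk_score": 0.9, "is_revert": False},
    {"revision_id": 2, "revert_risk_score": 0.8, "is_revert": False},  # same id, other row
    {"revision_id": 3, "revert_risk_score": 0.5, "is_revert": True},   # a revert edit
]
cleaned = clean_rows(sample)
print([row["revision_id"] for row in cleaned])
```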

Optimal Thresholds

Wiki        Optimal Threshold
cywiki      0.11076631
simplewiki  0.90655595
bewiki      0.5724957
kkwiki      0.6048708
nnwiki      0.41624594
mkwiki      0.5058307
lawiki      0.6340628
afwiki      0.7369383
tewiki      0.36725548
mrwiki      0.86735326
swwiki      0.74820924
mlwiki      0.93626505
iswiki      0.88926035
pawiki      0.5458896
hawiki      0.4823182
tlwiki      0.60741657
bnwiki      0.64658594
trwiki      0.6082413
azwiki      0.5366108
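
One way to read the "Optimal threshold for 15.0% FPR" lines in the log: among edits that were not reverted, find the lowest score cutoff at which no more than 15% are still flagged. The sketch below does this with the standard library only. `threshold_for_fpr` is a hypothetical name, labels use 1 for reverted and 0 for not reverted, the scores are toy data, and the notebook may well compute the value differently (for example from an ROC curve).

```python
from bisect import bisect_left
from typing import Sequence


def threshold_for_fpr(scores: Sequence[float], labels: Sequence[int],
                      target_fpr: float) -> float:
    """Lowest cutoff such that flagging score >= cutoff keeps FPR <= target_fpr."""
    # Scores of edits that were NOT reverted, sorted ascending for bisect.
    negatives = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not negatives:
        raise ValueError("need at least one non-reverted edit to measure FPR")
    for cutoff in sorted(set(scores)):
        # Non-reverted edits that would still be flagged at this cutoff.
        false_positives = len(negatives) - bisect_left(negatives, cutoff)
        if false_positives / len(negatives) <= target_fpr:
            return cutoff
    # Unreachable for target_fpr >= 0: the highest cutoff always qualifies.
    return float("inf")


# Toy data, not taken from the analysis.
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90]
toy_labels = [0, 0, 0, 0, 0, 1, 0, 1, 0, 1]
print(threshold_for_fpr(toy_scores, toy_labels, target_fpr=0.15))
```

Because the cutoff is chosen from observed scores, the achieved FPR lands just at or below the target, which is consistent with the values near 0.15 reported per wiki.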

@Kgraessle would it be possible to review the results above? Do they seem correct based on your experience? Do they match the outcome we would intuitively expect for each wiki?
For a more in-depth evaluation, you can look at the ROC curve and confusion matrix plots at the bottom of this notebook.
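
As a quick aid for the review, the counts in each confusion matrix can be turned into rates. The sketch below is only a helper under stated assumptions: the name `rates` is hypothetical, and it reads the rows as actual and the columns as predicted, as printed in the log. The counts are copied from three of the wiki blocks. The false positive rate should reproduce the value in the log, while recall and precision show how much each wiki gets out of the same 15% FPR budget.

```python
def rates(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Basic rates from confusion-matrix counts (hypothetical helper)."""
    return {
        "fpr": fp / (fp + tn),        # non-reverted edits that get flagged
        "recall": tp / (tp + fn),     # reverted edits that get flagged
        "precision": tp / (tp + fp),  # flagged edits that were reverted
    }


# (TN, FP, FN, TP) copied from the log; rows = actual, columns = predicted.
matrices = {
    "cywiki": (135259, 23868, 15169, 16451),
    "nnwiki": (16188, 2857, 48, 1299),
    "hawiki": (109079, 19264, 1329, 614),
}

for wiki, counts in matrices.items():
    r = rates(*counts)
    print(f"{wiki}: FPR={r['fpr']:.4f} "
          f"recall={r['recall']:.4f} precision={r['precision']:.4f}")
```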

Wed, May 21, 4:25 PM · Moderator-Tools-Team, Machine-Learning-Team, MediaWiki-Recent-changes
Kgraessle added projects to T394937: Create and implement an instrument for running an experiment on RecentChanges: MediaWiki-Recent-changes, MediaWiki-Watchlist.
Wed, May 21, 4:23 PM · MediaWiki-Watchlist, MediaWiki-Recent-changes, Moderator-Tools-Team
Kgraessle created T394939: Decommission RecentChanges experiment platform instrument and analyze the results .
Wed, May 21, 4:05 PM · MediaWiki-Recent-changes, Moderator-Tools-Team
Kgraessle created T394935: Configure the metrics platform stream for an experiment on RecentChanges.
Wed, May 21, 3:59 PM · MediaWiki-Watchlist, MediaWiki-Recent-changes, Moderator-Tools-Team
HCoplin-WMF removed a project from T159773: Support all parameters in ApiFeedRecentChanges: MW-Interfaces-Team.

Looks like Moderator-Tools-Team is on top of this. We are removing the MW-Interfaces-Team tag for now.

Wed, May 21, 1:15 PM · MediaWiki-Action-API, MediaWiki-Recent-changes, Moderator-Tools-Team, Edit-Review-Improvements