User Details
- User Since
- Oct 1 2018, 2:19 PM (346 w, 4 d)
- Availability
- Available
- IRC Nick
- isaacj
- LDAP User
- Isaac Johnson
- MediaWiki User
- Isaac (WMF) [ Global Accounts ]
Mon, May 19
Final report drafted. We'll update if a decision is made to publish on Meta as well.
- Quality metrics were covered under T386448
- Computational framework covered under: https://gitlab.wikimedia.org/repos/research/llm_evaluation/-/tree/main/llmperf?ref_type=heads
Fri, May 16
Updates:
- Sent out review requests
- Registration has opened, so we will be looking into that
- ACL reviews are out, so we can look for other Wikimedia papers to invite as well
Just adding some quick thoughts of nice-to-haves:
- Regarding "Being able to access model outputs at scale would likely unlock additional use cases for LiftWing for the mobile apps and other teams in the long term":
- This would be a more sustainable fix for the article-country model, where we need country predictions for all of an article's links. We are currently using a static database hosted on LiftWing, but that means it's constantly out-of-date, the image is bulkier, and the pipelines for "re-training" the model are more complicated: T385970.
- Over the years, I've built little prototype user-scripts (details) that can query model outputs for all of the links in a given article in order to visualize them as you browse. For example, highlighting links based on whether they were biographies of men, women, or non-binary folks and displaying statistics about the distribution. This allows for easily visualizing gender bias in links. I also had one for showing article quality predictions for links so editors could see which articles to prioritize for improvement that are relevant to a given topic. I was just running offline bulk predictions and caching them in a database hosted on Cloud VPS but this would be a cool use-case to support officially.
- I personally would love a solution based on Search: we already use their index for hosting predictions for many recommendation use-cases because it's highly accessible and they already handle the messiness of updating indexes to keep up with edits. In theory, if they make the cirrusdoc (or similar) endpoint efficient for this sort of use-case, you also get some nice behavior for free, such as the ability to use generator queries -- e.g., a single query to get topic predictions for links from the River Nile page: https://en.wikipedia.org/w/api.php?action=query&generator=links&titles=River_Nile&prop=cirrusdoc&format=json&formatversion=2&cdincludes=weighted_tags&gplnamespace=0&gpllimit=100&redirects (sketched in code after this list). This is currently an expensive query for them and not meant for production use-cases, but the simplicity of it is quite beautiful. If the solution is a LiftWing cache, that's okay too, but we might consider whether there are ways to make it accessible in a similar way.
- There's a new topic model under discussion (report) with meetings hopefully kicking off soon about next steps for bringing it to production. I personally think it'd be great to have this new model for this use-case too, as it would, e.g., give reliable data on the gender of biographies that folks are reading. We are also in early discussions about what it would mean to incorporate a "time" element into the model so that, e.g., you could see whether folks were reading predominantly about current or historical topics. The main blocker to this time element is also the question of how to efficiently serve the data from the Search index, so it's worth considering alongside this ask. There's no phab ticket yet but I've talked with the Search Platform about it.
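To make the generator-query idea concrete, here's a minimal sketch in Python (assumes the public enwiki API and the usual cirrusdoc response shape; as noted above, this is an expensive query and not meant for production):
```python
import requests

# Sketch only: one generator query to pull weighted_tags (topic predictions)
# for all links from the River Nile page. Assumes the standard cirrusdoc
# response shape; expensive server-side and not for production use.
API = "https://en.wikipedia.org/w/api.php"
params = {
    "action": "query",
    "generator": "links",
    "titles": "River_Nile",
    "gplnamespace": 0,
    "gpllimit": 100,
    "redirects": 1,
    "prop": "cirrusdoc",
    "cdincludes": "weighted_tags",
    "format": "json",
    "formatversion": 2,
}
resp = requests.get(API, params=params, headers={"User-Agent": "topic-links-sketch"})
for page in resp.json().get("query", {}).get("pages", []):
    # each linked page carries its search-index document(s);
    # weighted_tags holds the topic predictions
    doc = (page.get("cirrusdoc") or [{}])[0].get("source", {})
    print(page["title"], doc.get("weighted_tags", [])[:5])
```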
Thu, May 15
Updating the deadline to June 30th -- this ended up being submitted to ACL's Industry Track but had a single bad review. We're assessing options for whether to incorporate it into T382727 or keep it as a standalone paper. We should make a decision by the end of the quarter.
Moving the deadline back one week -- I'm currently reviewing the paper and we'll make some decisions shortly on next steps.
Wed, May 14
Tue, May 13
Fri, May 9
Prioritization update:
- The RecentChanges/Watchlist survey has been deprioritized as it's unlikely that the technical interventions will generate a change in the satisfaction metrics, given the focus on smaller wikis. Since the survey has already been deployed, we are going to leave it open for the standard window. We won't be prioritizing the analysis right now but believe the data can still be valuable outside of measuring satisfaction changes (e.g., the % of highly-active editors who say they do patrolling/moderation tasks). We will pick those basic analyses up perhaps in Q1 if we have more time. We won't run any follow-up surveys though.
- This does not affect the Nuke surveys, which will still proceed as expected.
Wed, May 7
Tue, May 6
Still a worthwhile long-term task but moving to the freezer until we have a more urgent need for it.
Yu-Ming and I discussed -- leaving in the backlog for another month because we might want to pick it up if there's an opportunity, and ultimately the decision brief should tell us whether we'll be picking up reader survey work soon. I'll be mindful though and move it to declined if that becomes the direction we're going.
Wed, Apr 30
I'm late on my acknowledgement but thanks both for engaging here and being open to the feedback!
Fri, Apr 25
A few notes from a discussion yesterday with myself, @Pablo, @MNeisler, and @ppelberg:
- Context: research is doing some work adjacent to this space in T392210: [WE1.5.3] Wikipedia Patrolling Measurement and we wanted to be aware of opportunities for overlap/collaboration.
- The two areas where that work might feed in nicely here:
- @Pablo has already done some work around detecting policy mentions in edit summaries and we will try to incorporate that into our patrolling dataset work. It is useful not only for understanding what policies are being broken but also for understanding how often editors receive that direct feedback via edit summaries when reverted.
- @Pablo developed code for tracking changes to issue templates (citation needed etc.) in T384600. If there are good opportunities to expand the edit types we track beyond this, that could also be helpful per Peter's comment above about the role edit-types could play.
- Scope of interest to Editing:
- Top-20 largest languages (where the largest moderator burden is likely to exist)
- Newcomer essentially means <100 edits here.
- Focus on the main namespace; in this case, also focus on VisualEditor.
- Focus on edits that elicit a "negative" response -- reverts are the most obvious of these but messages to user/talk pages could be another indication (though harder to measure).
- This data would likely be useful in Q1 as decisions start to be made about edit checks to consider in Q2. The Editing team is pretty open at the moment, though ideally the Edit Checks they work on relate to core Wikimedia policies that have salience across many wikis (even if their implementation/interpretation varies).
- Outcome: at this point, @MNeisler isn't planning on picking up this ticket and Research isn't intending to answer it directly but we're hopeful that the dataset that comes out of T392210 can be used for these purposes or easily extended to answer these questions. We'll keep each other in the loop where relevant.
A thought: we might also consider adding editing interface as a facet of the edit in case certain decisions around gaps need to take that into consideration -- e.g., how Editing works with VE and so has that as their focus (e.g., T354303). This should be pretty easy to add later, so it's not something we need to decide now.
Thu, Apr 24
Adding some context from an internal discussion to hopefully help whoever picks this up:
Apr 23 2025
@OKarakaya-WMF a few questions and thoughts below to help us figure out how to proceed. Apologies for the mountain of text, but I'm trying to write down my thinking. A general theme is that I didn't design this model to be optimal in performance (because quality labels are inherently messy to begin with) but rather optimal in interpretability/usability. I think Sucheta is helping us get a lot better at being explicit about some of these design constraints with the newer models, but this one originally came from Research's needs followed by an internship, and so was a bit more haphazard in its development/hand-off. I don't want to turn this project into something bigger than you all want it to be, but if we want to make broader changes, we might want to consider getting some input from Enterprise (so it's not just me saying this and that) and try to write some of our guardrails/needs down more clearly.
Apr 21 2025
Stumbled across this and boldly resolving -- we have the https://wikitech.wikimedia.org/wiki/Data_Platform/Data_Lake/Content/Wikidata_item_page_link table now and it looks like folks decided that we didn't need historical mapping information. I'd still love one day for us to publish a regular dump of this table for external users but that's a different task: T258514
Apr 18 2025
@MoritzMuehlenhoff I think we can decline this task now as it's being tracked under T389809?
Thanks @diego for putting this together! I'll work on prioritizing. A few thoughts / questions in the meantime to consider:
- What factors should we hold steady? Presumably a uniform number of training examples? Are they randomly sampled from all data before the cut-off though or is there some sort of stratification by time or other approach that should be used?
- Some stretch ideas once the basic system is working:
- Do you want to explore strategies for reducing the impact of model drift? For instance, testing whether training the model on only the most recent data before a particular cut-off works better than a more uniform distribution over time? Other data sampling strategies? I wonder if it would make sense to try different types of pre-trained models (for the multilingual revert risk) as well to see if larger/newer base models are perhaps less sensitive to drift? I guess we probably can't find good one-to-one comparisons in that regard though, so it's hard to know how much can be learned from switching out the base models.
- I wonder if there's any reason to also explore knowledge cut-offs in the sense of when the article was created? Maybe new articles introduce new vocabulary that throws off the multilingual revert risk model and degrades performance? I wrote up some ideas around how to approach this idea in T383090 and you could just use article creation date.
- I'd also be curious to test whether there are stable patterns in which revisions the models get wrong -- e.g., the newest 3 models correctly predicting a particular test row but the oldest 7 models not? (Rough sketch of what I mean below.)
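A sketch of that last point, assuming a hypothetical data layout with one boolean correctness column per model over a shared test set (column and file names are placeholders):
```python
import pandas as pd

# Hypothetical layout: one row per test revision, one boolean column per
# model ("model_2016", ..., "model_2025") indicating whether that model
# predicted the revision correctly on a shared test set.
preds = pd.read_parquet("per_model_correctness.parquet")  # placeholder file

model_cols = sorted(c for c in preds.columns if c.startswith("model_"))
newest, oldest = model_cols[-3:], model_cols[:7]

# Rows the newest 3 models all get right but the oldest 7 all get wrong
newer_only = preds[preds[newest].all(axis=1) & ~preds[oldest].any(axis=1)]
print(f"{len(newer_only)} revisions solved only by the newest models")

# Per-row agreement: how many of the models get each revision right
agreement = preds[model_cols].sum(axis=1)
print(agreement.value_counts().sort_index())
```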
Updates:
- We moved back the deadline by a week (now April 30th)
- Offering office hours explicitly to folks intending to submit has been helpful I think. I've had two folks use them now and I was able to give some ideas of work to be aware of and important framing considerations for the workshop.
- About to do another round of reminders about the workshop
The other idea is to train a regression model by converting labels into numbers.
I think I'm in favor of the first approach, but getting some evaluation scores from both approaches would help to decide.
Yeah, the first model that used wikitext was a regression model where I converted the labels into floats essentially based on their frequency (see this notebook). I found that it didn't work as well (comparison), so I switched to the OrderedModel, which felt a bit more direct/appropriate to me, and I was still able to come up with a reasonable way to convert the label probabilities back into a single [0-1] score (rough sketch below). But if you can find a way to make the regression one work better, go for it!
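For reference, a rough sketch of the ordinal approach -- the file name, features, and label-to-score weighting are illustrative, not the production code:
```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

# Illustrative sketch, not the production model: fit an ordered logit on the
# quality classes and collapse the predicted class probabilities into one
# [0-1] score. File name, feature names, and class weighting are placeholders.
df = pd.read_csv("article_features.csv")
classes = ["Stub", "Start", "C", "B", "GA", "FA"]
df["quality"] = pd.Categorical(df["quality"], categories=classes, ordered=True)

features = ["page_length", "refs", "sections", "images", "categories"]
model = OrderedModel(df["quality"], df[features], distr="logit")
res = model.fit(method="bfgs", disp=False)

# one probability column per ordered class; weight classes evenly on [0, 1]
probs = np.asarray(res.predict(df[features]))
weights = np.linspace(0, 1, len(classes))
df["quality_score"] = probs @ weights
```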
When we hear back about whether the ideas for the quant exploration are on the right track, would you be able to provide a bit of further scoping for this task (updating the description) and confirm who will be working on it? (Tentatively next week.) Thanks for your help!
Acknowledging and yes, I can take that on once we get the okay to proceed with what's been proposed.
Apr 17 2025
FYI T348863: Baseline: Size of content moderation backlog - FlaggedRevs by KC also has some notebooks for patrolling data that might be of use.
Unassigning you Pablo and moving this back to research-ideas in the meantime as you'll be working on the related task: T392210
Hey -- glad to see you working on this @OKarakaya-WMF ! A few pointers in case they're helpful and don't hesitate to ask other questions:
- For the model in production, the code for training can be found here: https://public-paws.wmcloud.org/User:Isaac%20(WMF)/Quality/html-qual-exploration.ipynb. I think you're using an old script from an earlier version of the model. If the documentation is out of date somewhere, let me know and I'll fix (apologies)!
- We've long avoided features related to edits/editors when it comes to quality based on this work: https://grouplens.org/site-content/uploads/2013/09/wikisym2013_warnckewang-cosley-riedl.pdf. Essentially, those features might improve performance but they're fragile because they aren't really tied to the content itself, and they also aren't helpful feedback to an editor -- e.g., telling an editor that the way to improve a given article is to edit it more. It could be useful to compare performance with and without editor-related features, but I'd be hesitant to deploy to production unless we could make a strong case.
- I considered template counts in the past but dropped them because they weren't adding much signal and I also struggled to justify why more templates should be associated with higher quality -- i.e., similar to above, what would it mean to tell an editor that to improve the quality of the article, they should add more templates? I added the infobox feature as one common template that felt reasonable to include as a feature and a valid recommendation to editors. The messagebox feature is also a template but a negative one (because it indicates some sort of quality issue with the article); there's an illustrative snippet of both after this list. Curious to see what you come up with there.
- I could easily be wrong but I'm pretty sure we're already using the new API endpoints for fetching HTML instead of the old RESTBase ones. See: https://www.mediawiki.org/wiki/RESTBase/service_migration#Parsoid_endpoints but the Content Transform folks would know better.
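To illustrate the infobox/messagebox point, a toy snippet -- this is not the production feature extraction, and the CSS selectors are enwiki-specific assumptions (rough proxies only):
```python
import requests
from bs4 import BeautifulSoup

# Toy illustration only: count infobox and message-box templates in an
# article's HTML via the CSS classes English Wikipedia applies ("infobox",
# "ambox"); other wikis and other features will differ.
html = requests.get(
    "https://en.wikipedia.org/w/rest.php/v1/page/Nile/html",
    headers={"User-Agent": "quality-feature-demo"},
).text
soup = BeautifulSoup(html, "html.parser")

features = {
    "has_infobox": int(bool(soup.select(".infobox"))),
    "num_messageboxes": len(soup.select(".ambox")),        # maintenance/issue templates
    "num_references": len(soup.select("ol.references li")),  # rough proxy
    "num_sections": len(soup.select("h2")),                  # rough proxy
}
print(features)
```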
Apr 14 2025
Glad to see the latency dropping! One thought: I suspect that if we further instrumented the preprocess step, we'd find much of the latency comes from how long it takes to get the HTML for the revision. @cscott gave that great DPE Deep Dive talk a month or so ago about the Parser cache and how it works, so tagging him to hopefully help clarify my guesses. The LiftWing model calls the "https://{lang}.wikipedia.org/w/rest.php/v1/revision/{revid}/html" endpoint (code) when assessing the quality of a given revision. I think that means:
- Presuming that these are old revids being used in the load testing, my understanding is that they probably weren't cached for the initial load test. They may or may not have been cached for the follow-up requests, which would greatly speed up responses.
- My guess is that Enterprise is intending to hit the LiftWing API for new revisions, so presumably the HTML will already be cached. An accurate assessment might therefore be achieved by first running a "warm-up" pass on the revision IDs used in testing to force them into the Parser cache and then running the actual load tests (rough timing sketch below). But if you want to know the worst case, you might want to switch to choosing random (old) revision IDs or something like that instead.
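Rough sketch of that warm-up idea -- placeholder revision IDs, just timing two back-to-back requests to the same endpoint the model calls:
```python
import time
import requests

# Sketch: fetch each revision's HTML once to push it into the Parser cache,
# then time a second fetch. Revision IDs are placeholders.
LANG = "en"
REV_IDS = [1234567890, 1234567891]  # placeholder revision IDs

def fetch_html(revid: int) -> float:
    """Fetch the revision's HTML and return the elapsed time in seconds."""
    url = f"https://{LANG}.wikipedia.org/w/rest.php/v1/revision/{revid}/html"
    start = time.monotonic()
    requests.get(url, headers={"User-Agent": "latency-probe"}).raise_for_status()
    return time.monotonic() - start

for revid in REV_IDS:
    cold = fetch_html(revid)  # likely uncached for old revisions
    warm = fetch_html(revid)  # should now be served from the Parser cache
    print(f"rev {revid}: cold={cold:.2f}s warm={warm:.2f}s")
```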
Apr 11 2025
Apr 10 2025
Apr 9 2025
Mar 28 2025
@Isaac please review for sign-off
Thank you - done!
If you choose to create a separate task for the paper writing, feel free to resolve this one (thanks!). Otherwise, let's keep it open for the next week and then resolve it when the paper is submitted. I did a quick read-through of the paper as of Friday morning my time. A few thoughts:
- I know you plan to do some clean-up of the paper so I didn't pay too much attention to particulars. One thing: Footnote 8 about the HTML dataset. It was actually collected from the APIs by Fabian and not the dumps (because we needed individual revisions, not a snapshot in time). The dumps endpoint for the Enterprise Dumps is also being deprecated in favor of their Snapshot APIs, so I'd either link there (https://enterprise.wikimedia.com/api/) or to documentation about Parsoid's APIs (https://www.mediawiki.org/wiki/RESTBase/service_migration#Parsoid_endpoints).
- My major point would be to motivate why maintenance tagging is important to understand upfront in the Introduction (beyond it not receiving much study). A few potential ideas:
- You highlight usage of the templates within ML research and it might be worth raising that up -- i.e. these templates are a source of labels for training classification models and so it's important to understand how they're used in practice and whether this tagging extends across many language communities. We also provide a more scalable data collection approach that could be used for those ends.
- These templates are also used as a source of tasks for editors themselves via recommender systems. You could cite SuggestBot as well as Newcomer Tasks.
- Highlight the importance of having these templates as a pathway separate from reverting content. This offers Wikipedians the ability to flag issues without reverting edits. This is an important alternative remedy given that reverting edits can have a negative impact on newcomer retention (Rise and decline). You might note too though that other work has not shown that tagging necessarily leads to change in the editors who are being flagged: https://dl.acm.org/doi/abs/10.1145/3274406.
- I think it's worthwhile to mention the similarity between this particular system and the turn to crowd-sourced moderation on X/Facebook. Just something like: "Given the growing interest in crowd-sourced moderation of this style in social media platforms like X (maybe cite something like https://dl.acm.org/doi/abs/10.1145/3686967) or Facebook (not sure if there's a research paper yet but citing a news article could work), it's important to understand how it works on platforms with a long history of community moderation like Wikipedia."
Switching the expectation of reporting from Meta -> Officewiki so we have more time to share with relevant parties and incorporate any of their feedback before publishing. Rather than resolve this and open a new task for the remaining work, I'm going to extend the deadline to the Youth Conference (when we definitely want a public report) and add a new Meta milestone.
Thanks @YLiou_WMF !
Thanks @Pablo ! I'd talk with Diego but for the meta page, it might be worthwhile to make it a sub-page of https://meta.wikimedia.org/wiki/Research:Develop_a_working_definition_for_moderation_activity_and_moderators to preserve the continuity. In theory, we'll continue to do deeper dives into other aspects of moderation that were defined in that original report and it might be nice to keep them together.
Thanks @MGerlach !
Doing a little phab clean-up and resolving this. Focus groups wrapped up and resulted in the following artifacts:
- Report (led by Alex)
- Summary of taxonomy changes
- Updated prototype for testing
Follow-up work is being discussed to establish a hypothesis for making the updates to the production model and incorporating it into our recommender systems.
I added a final status to each of the projects in the task description. A few notable things:
- @CMyrick-WMF will put together a final status report for the Language Gaps metrics work by April 7th -- that will be added to T348246 which can then be resolved.
- We wrapped up the focus group phase of the topic model V2 project. This led to a project report, an updated prototype, and a description of the taxonomy changes.
- @TAndic 's Editor Metric Consultation work has been slightly extended into April by Movement Insights but should wrap up then.
- We will reopen consultation support for Small Language Projects and Identify Web Scraping as needed in Q4, though nothing specific is expected at this time.
- We should receive initial paper notification for the Epistemic Injustice Paper on April 8th but no further work should be required there.
Mar 26 2025
Thanks @DDeSouza! I noticed a few small things that got missed on my original copyedit pass so I went ahead and fixed them. If that merge request looks good to you, then I think we're ready to publish.