
Scott_French (Scott French)
User

User Details

User Since
Jan 18 2024, 5:33 PM (70 w, 2 d)
Availability
Available
LDAP User
Scott French
MediaWiki User
SFrench-WMF [ Global Accounts ]

Recent Activity

Thu, May 22

Scott_French edited projects for T395052: Stale labels applied when the pod IP of a terminated k8s job is reused, added: SRE Observability; removed observability.

Alright, I think https://gerrit.wikimedia.org/r/1149505 should do what's needed, in the three "sufficiently broad" scrape configs where this issue is likely to be relevant.

Thu, May 22, 10:47 PM · SRE Observability, Patch-For-Review, serviceops
Scott_French updated the task description for T395052: Stale labels applied when the pod IP of a terminated k8s job is reused.
Thu, May 22, 10:39 PM · SRE Observability, Patch-For-Review, serviceops
Scott_French updated the task description for T394556: Clean up UcfirstOverrides.php following PHP 7.4 -> 8.1 transition.
Thu, May 22, 9:34 PM · MediaWiki-Engineering, serviceops
Scott_French closed T391057: Turn down MediaWiki image builds for PHP 7.4, a subtask of T319432: Migrate WMF production from PHP 7.4 to PHP 8.1, as Resolved.
Thu, May 22, 9:34 PM · Data-Engineering-Radar, Data-Engineering, Dumps-Generation, MediaWiki-Platform-Team, serviceops
Scott_French closed T391057: Turn down MediaWiki image builds for PHP 7.4 as Resolved.

Since a bit before 17:30 UTC today, we are no longer building PHP 7.4 ("publish" flavour) MediaWiki images during scap deployments. No further work is tracked here, so I'm marking this resolved.

Thu, May 22, 9:33 PM · serviceops
Scott_French added a comment to T395052: Stale labels applied when the pod IP of a terminated k8s job is reused.

Just came across https://github.com/prometheus/prometheus/issues/10755, which indeed recommends using __meta_kubernetes_pod_phase to skip terminated pods returned by service discovery.
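
As a rough illustration of the approach recommended in that upstream issue, the sketch below simulates a single "drop" relabel rule keyed on __meta_kubernetes_pod_phase. It is not the content of the Gerrit patch: the rule dictionary, the apply_drop_rule helper, and the sample targets are all hypothetical, and the only assumption taken from Prometheus itself is that relabel regexes are fully anchored and that terminated pods report a phase of Succeeded or Failed.

```python
import re

# Hypothetical rule, shaped like a Prometheus relabel_config entry.
DROP_TERMINATED = {
    "source_label": "__meta_kubernetes_pod_phase",
    "regex": r"(Failed|Succeeded)",
    "action": "drop",
}


def apply_drop_rule(targets, rule):
    """Return the targets that survive one 'drop' relabel rule."""
    pattern = re.compile(rule["regex"])
    kept = []
    for labels in targets:
        value = labels.get(rule["source_label"], "")
        # Prometheus anchors relabel regexes, hence fullmatch.
        if pattern.fullmatch(value):
            continue  # terminated pod: do not scrape it
        kept.append(labels)
    return kept


# Made-up discovery results: a finished job pod and a running pod
# that has since been assigned the same IP.
discovered = [
    {
        "__address__": "10.0.0.5:9100",
        "__meta_kubernetes_pod_name": "job-a-xyz",
        "__meta_kubernetes_pod_phase": "Succeeded",
    },
    {
        "__address__": "10.0.0.5:9100",
        "__meta_kubernetes_pod_name": "web-1",
        "__meta_kubernetes_pod_phase": "Running",
    },
]

kept = apply_drop_rule(discovered, DROP_TERMINATED)
```

With the terminated pod dropped, only one target remains for the reused address, so labels from the finished job can no longer be attached to scrapes of the new pod.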

Thu, May 22, 9:22 PM · SRE Observability, Patch-For-Review, serviceops
Scott_French updated the task description for T395052: Stale labels applied when the pod IP of a terminated k8s job is reused.
Thu, May 22, 9:12 PM · SRE Observability, Patch-For-Review, serviceops
Scott_French renamed T395052: Stale labels applied when the pod IP of a terminated k8s job is reused from Stale labels applies upon terminated job pod IP reuse to Stale labels applied when the pod IP of a terminated k8s job is reused.
Thu, May 22, 7:14 PM · SRE Observability, Patch-For-Review, serviceops
Scott_French created T395052: Stale labels applied when the pod IP of a terminated k8s job is reused.
Thu, May 22, 4:06 PM · SRE Observability, Patch-For-Review, serviceops
Scott_French updated subscribers of T393803: Create redirect from tj.*.org to tg.*.org.

I just chatted with @jasmine_, who is interested in helping to deploy this change. Many thanks for preparing a patch, @Dzahn!

Thu, May 22, 12:04 AM · Patch-For-Review, serviceops, Traffic, SRE, DNS

Wed, May 21

Scott_French added a comment to T352245: Migrate etcd::tlsproxy Nginx certs and etcd itself to PKI.

Thank you very much, @Vgutierrez - this is great, and thank you for your offer to assist with validation.

Wed, May 21, 11:21 PM · Patch-For-Review, serviceops
Scott_French updated subscribers of T394556: Clean up UcfirstOverrides.php following PHP 7.4 -> 8.1 transition.

After spending some time reading through T292552 and the code in uppercaseTitlesForUnicodeTransition.php [0], I think I better understand the process here.

Wed, May 21, 10:58 PM · MediaWiki-Engineering, serviceops
Scott_French added a comment to T391057: Turn down MediaWiki image builds for PHP 7.4.

Last step here is to merge https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/172 and scap sync-world to integrate it. I'll aim to do that either today or (more likely) tomorrow (May 22nd).

Wed, May 21, 7:47 PM · serviceops
Scott_French updated the task description for T391057: Turn down MediaWiki image builds for PHP 7.4.
Wed, May 21, 7:46 PM · serviceops
Scott_French added a comment to T388534: Migrate updatequerypages/update_special_pages/initsitestats jobs to mw-cron.

Remaining update-special-pages jobs have been migrated, following a successful manual test of the s6 shard earlier today. First scheduled run for all will happen at 5:00 UTC on the 22nd (i.e., soon).

Wed, May 21, 5:39 PM · serviceops, MediaWiki-Special-pages
Scott_French updated the task description for T388534: Migrate updatequerypages/update_special_pages/initsitestats jobs to mw-cron.
Wed, May 21, 5:38 PM · serviceops, MediaWiki-Special-pages

Mon, May 19

Scott_French updated the task description for T388534: Migrate updatequerypages/update_special_pages/initsitestats jobs to mw-cron.
Mon, May 19, 8:30 PM · serviceops, MediaWiki-Special-pages

Sun, May 18

Scott_French updated the task description for T394609: Silence RESTGatewayBackendErrorsHigh for envoy_cluster_name: mobileapps_cluster.
Sun, May 18, 5:38 PM · serviceops, SRE
Scott_French updated the task description for T394609: Silence RESTGatewayBackendErrorsHigh for envoy_cluster_name: mobileapps_cluster.
Sun, May 18, 5:36 PM · serviceops, SRE
Scott_French triaged T394610: mobileapps returns 500 error in response to malformed /v1/page/summary path as High priority.
Sun, May 18, 5:34 PM · serviceops, SRE, Page Content Service
Scott_French created T394610: mobileapps returns 500 error in response to malformed /v1/page/summary path.
Sun, May 18, 5:34 PM · serviceops, SRE, Page Content Service
Scott_French created T394609: Silence RESTGatewayBackendErrorsHigh for envoy_cluster_name: mobileapps_cluster.
Sun, May 18, 5:23 PM · serviceops, SRE

Fri, May 16

Scott_French added a comment to T394556: Clean up UcfirstOverrides.php following PHP 7.4 -> 8.1 transition.

Going by the git history on UcfirstOverrides.php, T292552: Rename articles and users to update our case mapping to PHP 7.4 and Unicode 11 appears to contain the most recent prior art on this process.

Fri, May 16, 9:46 PM · MediaWiki-Engineering, serviceops
Scott_French added a subtask for T319432: Migrate WMF production from PHP 7.4 to PHP 8.1: T394556: Clean up UcfirstOverrides.php following PHP 7.4 -> 8.1 transition.
Fri, May 16, 9:30 PM · Data-Engineering-Radar, Data-Engineering, Dumps-Generation, MediaWiki-Platform-Team, serviceops
Scott_French added a parent task for T394556: Clean up UcfirstOverrides.php following PHP 7.4 -> 8.1 transition: T319432: Migrate WMF production from PHP 7.4 to PHP 8.1.
Fri, May 16, 9:30 PM · MediaWiki-Engineering, serviceops
Scott_French updated the task description for T394556: Clean up UcfirstOverrides.php following PHP 7.4 -> 8.1 transition.
Fri, May 16, 9:29 PM · MediaWiki-Engineering, serviceops
Scott_French created T394556: Clean up UcfirstOverrides.php following PHP 7.4 -> 8.1 transition.
Fri, May 16, 9:28 PM · MediaWiki-Engineering, serviceops
Scott_French added a comment to T391057: Turn down MediaWiki image builds for PHP 7.4.

Given that we're well on our way to completing the periodic jobs migration and have not run into showstopper 8.1-compatibility issues, and we've removed the 7.4 fallback functionality from mwscript-k8s, I think it makes sense to move forward with this.

Fri, May 16, 5:48 PM · serviceops

Wed, May 14

Scott_French closed T385866: Migrate CentralAuth maintenance jobs to mw-cron, a subtask of T341555: Implement periodic maintenance scripts for mw-on-k8s, as Resolved.
Wed, May 14, 6:10 PM · Patch-For-Review, serviceops, MW-on-K8s
Scott_French closed T385866: Migrate CentralAuth maintenance jobs to mw-cron as Resolved.

The first post-migration hourly runs of centralauth-backfilllocalaccounts.php-loginwiki and centralauth-backfilllocalaccounts.php-metawiki have completed successfully. The logs contain roughly the same content as those from the last pre-migration timer runs (i.e., reporting autoCreateUser failures due to IP blocks).

Wed, May 14, 6:10 PM · MediaWiki-Platform-Team (Radar), MediaWiki-extensions-CentralAuth, serviceops, MW-on-K8s
Scott_French updated the task description for T385866: Migrate CentralAuth maintenance jobs to mw-cron.
Wed, May 14, 6:09 PM · MediaWiki-Platform-Team (Radar), MediaWiki-extensions-CentralAuth, serviceops, MW-on-K8s
Scott_French added a comment to T385866: Migrate CentralAuth maintenance jobs to mw-cron.

The first post-migration run of purge-expired-userrights succeeded earlier today. I also triggered a manual run of purge-expired-global-rights, which also succeeded.

Wed, May 14, 4:08 PM · MediaWiki-Platform-Team (Radar), MediaWiki-extensions-CentralAuth, serviceops, MW-on-K8s
Scott_French closed T388535: Migrate flaggedrevs jobs to mw-cron, a subtask of T341555: Implement periodic maintenance scripts for mw-on-k8s, as Resolved.
Wed, May 14, 12:28 AM · Patch-For-Review, serviceops, MW-on-K8s
Scott_French closed T388535: Migrate flaggedrevs jobs to mw-cron as Resolved.

The first run at 00:08 today (May 14th) completed without issue, and with very similar total elapsed time to the bare-metal case (a bit more than 6m). I've also spot-checked Special:ValidationStatistics on a handful of wikis, and the new stats values look reasonable relative to the prior ones (generated by the update-special-pages runs earlier today).

Wed, May 14, 12:28 AM · serviceops, FlaggedRevs

Tue, May 13

Scott_French added a comment to T385866: Migrate CentralAuth maintenance jobs to mw-cron.

Updates:

  • The first run of purge-temporary-accounts appears to have completed successfully earlier today.
  • The purge-expired-userrights and purge-expired-global-rights jobs have now been migrated as well.
    • Their next scheduled executions are on the 14th and 17th, respectively.
    • I'll validate the former tomorrow once it has run. For the latter, we may want to trigger a manual run so as to avoid having the first run fall on a weekend.
  • Next-up: the hourly loginwiki and metawiki account backfill jobs in profile::mediawiki::maintenance::backfill_localaccounts.
Tue, May 13, 6:03 PM · MediaWiki-Platform-Team (Radar), MediaWiki-extensions-CentralAuth, serviceops, MW-on-K8s
Scott_French updated the task description for T385866: Migrate CentralAuth maintenance jobs to mw-cron.
Tue, May 13, 5:55 PM · MediaWiki-Platform-Team (Radar), MediaWiki-extensions-CentralAuth, serviceops, MW-on-K8s
Scott_French claimed T388535: Migrate flaggedrevs jobs to mw-cron.

The update-flaggedrev-stats job has been migrated. I'll hold onto this until I'm able to confirm the first run is successful later today.

Tue, May 13, 5:55 PM · serviceops, FlaggedRevs
Scott_French updated the task description for T388535: Migrate flaggedrevs jobs to mw-cron.
Tue, May 13, 5:53 PM · serviceops, FlaggedRevs
Scott_French closed T388530: Migrate MediaWiki-Page-derived-data jobs to mw-cron, a subtask of T341555: Implement periodic maintenance scripts for mw-on-k8s, as Resolved.
Tue, May 13, 5:48 PM · Patch-For-Review, serviceops, MW-on-K8s
Scott_French closed T388530: Migrate MediaWiki-Page-derived-data jobs to mw-cron as Resolved.

The remaining shards of the (renamed) job have been migrated. Given what we saw with the pilot on s6, I'm optimistic that these will all work without issue. However, I'll still plan to keep an eye on things as we enter June (and will set some calendar reminders for this).

Tue, May 13, 5:48 PM · serviceops, MediaWiki-Page-derived-data
Scott_French updated the task description for T388530: Migrate MediaWiki-Page-derived-data jobs to mw-cron.
Tue, May 13, 5:45 PM · serviceops, MediaWiki-Page-derived-data
Scott_French added a comment to T352245: Migrate etcd::tlsproxy Nginx certs and etcd itself to PKI.

Thank you both! In short, confd uses go.etcd.io/etcd just like Liberica, and thus will pick up the WMF root PKI CA cert from /etc/ssl/certs without intervention - and, importantly for all these cases, we're going to be presenting the certificate bundle that contains the PKI intermediate. Apologies for not writing this all down already and saving you all the digging.

Tue, May 13, 3:05 PM · Patch-For-Review, serviceops

Mon, May 12

Scott_French added a comment to T388535: Migrate flaggedrevs jobs to mw-cron.

From a quick read of [0] and (more importantly) [1], this looks relatively low-risk to migrate.

Mon, May 12, 9:59 PM · serviceops, FlaggedRevs
Scott_French added a comment to T373752: Build php-uuid package, and add to WMF production and CI.

Given the progress made so far in the periodic maintenance jobs migration to k8s (and thus 8.1), it probably does not make sense to support this on 7.4, unless of course there is some urgent need of which I'm not aware (my read of T373752#10756865 is that there is not, but I might be misunderstanding).

Mon, May 12, 8:43 PM · Patch-For-Review, serviceops
Scott_French updated the task description for T385866: Migrate CentralAuth maintenance jobs to mw-cron.
Mon, May 12, 8:01 PM · MediaWiki-Platform-Team (Radar), MediaWiki-extensions-CentralAuth, serviceops, MW-on-K8s
Scott_French updated the task description for T388530: Migrate MediaWiki-Page-derived-data jobs to mw-cron.
Mon, May 12, 6:49 PM · serviceops, MediaWiki-Page-derived-data
Scott_French added a comment to T388530: Migrate MediaWiki-Page-derived-data jobs to mw-cron.

After the s6 shard was migrated to k8s, I started a manual pilot run of the job using this procedure. Thanks to @Krinkle for the discussion on safety concerns around this.

Mon, May 12, 6:20 PM · serviceops, MediaWiki-Page-derived-data
Scott_French claimed T385866: Migrate CentralAuth maintenance jobs to mw-cron.

purge_temporary_accounts is now migrated, which I'll verify tomorrow after the first scheduled k8s-based run (14:27).

Mon, May 12, 5:57 PM · MediaWiki-Platform-Team (Radar), MediaWiki-extensions-CentralAuth, serviceops, MW-on-K8s

Fri, May 9

Scott_French added a comment to T388761: scap needs to be k8s-cluster aware.

Finally had a chance to polish my MR a bit and post it today (draft). I'll take one more look on Monday before sending it for review for real-real, but folks are welcome to take a look before that if so inclined.

Fri, May 9, 9:45 PM · Patch-For-Review, Release-Engineering-Team, Dumps-Generation, Scap
Scott_French updated subscribers of T352245: Migrate etcd::tlsproxy Nginx certs and etcd itself to PKI.

Alright, after a quick review, the main thing that's changed since November is the migration of all (high-traffic) LVS hosts outside core DCs to Liberica. This has a few implications:

  1. We no longer need to deal with pybal in those sites. This also means that conf1009 is no longer an unusually risky host to update (other than being the etcd-mirror source host).
  2. We need to figure out how best to validate that the Liberica control-plane continues to operate as expected after the update.
  3. We need to be aware of the fact that Liberica operates like confd, in that all etcd nodes in the associated core DC (profile::liberica::etcd_config > conftool_domain) are considered, rather than one.
    • As a corollary, if we think it's still valuable for only one of ulsfo and eqsin to be exposed to the initial update, the only way to achieve that would be to temporarily shift one to eqiad.
    • IMO, precisely because Liberica considers all nodes, I don't think that's worth it anymore (i.e., errors communicating with one node no longer means "cannot talk to etcd at all" as it did with pybal).
Fri, May 9, 6:47 PM · Patch-For-Review, serviceops
Scott_French added a comment to T352245: Migrate etcd::tlsproxy Nginx certs and etcd itself to PKI.

Now that the PHP 8.1 migration is winding down, this is near the top of my list of items to pick back up. If you have the cycles for it, I'd definitely be interested in having a second pair of eyes / hands on this.

Fri, May 9, 4:26 PM · Patch-For-Review, serviceops
Scott_French added a comment to T393727: Prometheus black-box probes for all puppetmaster hosts are failing.

Thank you both for the follow-up! If acks were the solution before and they're now back in place, then by all means let's resolve this :)

Fri, May 9, 2:41 PM · observability, Infrastructure-Foundations

Thu, May 8

Scott_French updated the task description for T391057: Turn down MediaWiki image builds for PHP 7.4.
Thu, May 8, 11:07 PM · serviceops
Scott_French added a comment to T385866: Migrate CentralAuth maintenance jobs to mw-cron.

Great, thanks for confirming, @Tgr - I'll get started migrating these first thing next week, and I'll keep that in mind as an option.

Thu, May 8, 9:59 PM · MediaWiki-Platform-Team (Radar), MediaWiki-extensions-CentralAuth, serviceops, MW-on-K8s
Scott_French added a comment to T393612: db1247 crash or restart - 15:29 on 2025-05-07.

I've extended the downtime to 7 days (from now), as it's unlikely this host will be returned to service before the original one would have expired tomorrow.

Thu, May 8, 9:36 PM · DBA
Scott_French updated subscribers of T393727: Prometheus black-box probes for all puppetmaster hosts are failing.

@fgiunchedi - I'm having a hard time sorting out what the outcome w.r.t. these probe failures was from T373369 and / or T326657. Was there a long-term silence that might have recently expired?

Thu, May 8, 6:22 PM · observability, Infrastructure-Foundations
Scott_French updated the task description for T393727: Prometheus black-box probes for all puppetmaster hosts are failing.
Thu, May 8, 6:20 PM · observability, Infrastructure-Foundations
Scott_French created T393727: Prometheus black-box probes for all puppetmaster hosts are failing.
Thu, May 8, 6:12 PM · observability, Infrastructure-Foundations
Scott_French added a comment to T393296: db1246 crashed yet again.

I've re-added a 1w downtime, as the earlier one was removed as a side-effect of the reimage. If we expect the host to be powered on for ongoing work, and also expect that work to extend past 1w from now, it may be a better option to set profile::monitoring::notifications_enabled: false in hiera.

Thu, May 8, 4:36 PM · SRE, DC-Ops, ops-eqiad, DBA
Scott_French added a comment to T385866: Migrate CentralAuth maintenance jobs to mw-cron.

I've posted a handful of patches to migrate the periodic jobs tracked here.

Thu, May 8, 1:33 AM · MediaWiki-Platform-Team (Radar), MediaWiki-extensions-CentralAuth, serviceops, MW-on-K8s

Wed, May 7

Scott_French added a comment to T388530: Migrate MediaWiki-Page-derived-data jobs to mw-cron.

I'll be driving the migration of these jobs.

Wed, May 7, 8:38 PM · serviceops, MediaWiki-Page-derived-data
Scott_French closed T392938: Remove PHP 7.4 from deployment hosts, a subtask of T319432: Migrate WMF production from PHP 7.4 to PHP 8.1, as Resolved.
Wed, May 7, 7:26 PM · Data-Engineering-Radar, Data-Engineering, Dumps-Generation, MediaWiki-Platform-Team, serviceops
Scott_French closed T392938: Remove PHP 7.4 from deployment hosts as Resolved.

This was completed by ~ 17:30 UTC today. No issues encountered on either host during the update, and PHP appears to work via extremely basic tests:

Wed, May 7, 7:26 PM · serviceops
Scott_French added a comment to T393612: db1247 crash or restart - 15:29 on 2025-05-07.

FYI, the downtime I've applied is only 2 days, on the suspicion that the host is fine (e.g., just needs a clean bill of health before being returned to service). This may need to be extended if that's not the case.

Wed, May 7, 3:49 PM · DBA
Scott_French renamed T393612: db1247 crash or restart - 15:29 on 2025-05-07 from db1247 crash - 15:29 on 2025-05-07 to db1247 crash or restart - 15:29 on 2025-05-07.
Wed, May 7, 3:44 PM · DBA
Scott_French created T393612: db1247 crash or restart - 15:29 on 2025-05-07.
Wed, May 7, 3:38 PM · DBA

Tue, May 6

Scott_French closed T393395: Migrate moderator-tools jobs to mw-cron, a subtask of T341555: Implement periodic maintenance scripts for mw-on-k8s, as Resolved.
Tue, May 6, 3:49 PM · Patch-For-Review, serviceops, MW-on-K8s
Scott_French closed T393395: Migrate moderator-tools jobs to mw-cron as Resolved.

The changes to notification settings are now live.

Tue, May 6, 3:49 PM · serviceops, Moderator-Tools-Team
Scott_French added a comment to T393296: db1246 crashed yet again.

FYI, I've silenced notifications from this host for the next week, to avoid repeated pages while work is ongoing. These will need to be cleared if the host is returned to service earlier than that.

Tue, May 6, 3:47 PM · SRE, DC-Ops, ops-eqiad, DBA

Mon, May 5

Scott_French closed T388536: Migrate community-tech jobs to mw-cron as Resolved.

Both jobs have now had a successful first run:

Mon, May 5, 11:11 PM · serviceops, Community-Tech
Scott_French closed T388536: Migrate community-tech jobs to mw-cron, a subtask of T341555: Implement periodic maintenance scripts for mw-on-k8s, as Resolved.
Mon, May 5, 11:11 PM · Patch-For-Review, serviceops, MW-on-K8s
Scott_French updated the task description for T388536: Migrate community-tech jobs to mw-cron.
Mon, May 5, 11:11 PM · serviceops, Community-Tech
Scott_French updated the task description for T388536: Migrate community-tech jobs to mw-cron.
Mon, May 5, 8:50 PM · serviceops, Community-Tech
Scott_French added a comment to T390630: Alert when disk space utilization on sessionstore nodes is trending high.

After a bit of thought and some back-testing over the last 2 months of data, https://gerrit.wikimedia.org/r/1141959 sketches out what T390630#10787491 could look like.

Mon, May 5, 7:26 PM · Patch-For-Review, Cassandra, SRE-OnFire, Sustainability (Incident Followup)
Scott_French added a comment to T393395: Migrate moderator-tools jobs to mw-cron.

Per discussion in #talk-to-moderator-tools, the desired phabricator tag for notifications is Moderator-Tools-Team. The pending patches will update those settings accordingly.

Mon, May 5, 6:07 PM · serviceops, Moderator-Tools-Team
Scott_French added a comment to T388536: Migrate community-tech jobs to mw-cron.

The LoginNotify and PageAssessments jobs have both been migrated. I'll follow up later today to confirm their first scheduled runs succeed (23:00 and 20:42 UTC respectively) before closing this out.

Mon, May 5, 5:36 PM · serviceops, Community-Tech
Scott_French updated the task description for T388536: Migrate community-tech jobs to mw-cron.
Mon, May 5, 3:52 PM · serviceops, Community-Tech
Scott_French added a subtask for T341555: Implement periodic maintenance scripts for mw-on-k8s: T393395: Migrate moderator-tools jobs to mw-cron.
Mon, May 5, 3:49 PM · Patch-For-Review, serviceops, MW-on-K8s
Scott_French added a parent task for T393395: Migrate moderator-tools jobs to mw-cron: T341555: Implement periodic maintenance scripts for mw-on-k8s.
Mon, May 5, 3:49 PM · serviceops, Moderator-Tools-Team
Scott_French created T393395: Migrate moderator-tools jobs to mw-cron.
Mon, May 5, 3:49 PM · serviceops, Moderator-Tools-Team

Fri, May 2

Scott_French added a comment to T390630: Alert when disk space utilization on sessionstore nodes is trending high.

Following up on how this alerting might evolve, there was some discussion in T392989 about how to make the alert insensitive to transient excursions during large compactions.

Fri, May 2, 9:40 PM · Patch-For-Review, Cassandra, SRE-OnFire, Sustainability (Incident Followup)
Scott_French added a comment to T392989: SessionStoreDiskSpaceUtilizationTooHigh brief spike.

Possible alternative: Rather than using an aggregation over hosts, we could instead use the minimum utilization over a rolling window, calibrated to the typical duration of a large compaction. IMO, that's an easier-to-reason-about mechanism for "ignoring" the compactions vs. aggregating over hosts (e.g., we frequently see correlated behavior across hosts). As @Eevans notes, we would likely want to combine that with dropping the alert threshold.
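
To make the rolling-minimum idea concrete, here is a minimal sketch under stated assumptions: the sample values, the window of 3 samples, and the 60% threshold are invented for illustration and are not taken from the actual alert or from sessionstore data. In PromQL the analogous building block would be min_over_time, but the logic is shown here in plain Python.

```python
def rolling_min(samples, window):
    """Minimum over the trailing `window` samples at each position.

    Early positions use however many samples are available so far.
    """
    return [
        min(samples[max(0, i - window + 1): i + 1])
        for i in range(len(samples))
    ]


def alert_fires(samples, window, threshold):
    """True at each position where the rolling minimum exceeds the threshold."""
    return [m > threshold for m in rolling_min(samples, window)]


# Hypothetical disk utilization (%), one sample per evaluation interval.
compaction_spike = [40, 42, 75, 78, 43, 44]  # brief excursion, then recovery
sustained_growth = [40, 65, 66, 67, 68]      # stays high for a full window

WINDOW = 3      # assumed: roughly the duration of a large compaction
THRESHOLD = 60  # assumed: lowered threshold, per the comment above

spike_result = alert_fires(compaction_spike, WINDOW, THRESHOLD)
sustained_result = alert_fires(sustained_growth, WINDOW, THRESHOLD)
```

Because the minimum only rises once every sample in the window is high, a spike shorter than the window never fires, while sustained growth fires as soon as a full window sits above the threshold. That is also why the threshold can be lowered without reintroducing noise from compactions.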

Fri, May 2, 5:09 PM · Cassandra
Scott_French updated subscribers of T388536: Migrate community-tech jobs to mw-cron.

I chatted with @MusikAnimal from Community-Tech earlier today, who confirmed there are no concerns with migrating the PageAssessments and LoginNotify jobs. However, it sounds like the already-migrated PageTriage jobs are actually owned by Moderator-Tools-Team. I'll fork those off to a separate task and follow up to update the alert routing.

Fri, May 2, 4:46 PM · serviceops, Community-Tech
Scott_French added a comment to T388761: scap needs to be k8s-cluster aware.

Following up, I did get a chance to sketch this out, and indeed (1) it's not all that complicated in practice but (2) it does run head-first into the naming consistency question I mentioned.

Fri, May 2, 1:28 AM · Patch-For-Review, Release-Engineering-Team, Dumps-Generation, Scap

Thu, May 1

Scott_French added a comment to T388536: Migrate community-tech jobs to mw-cron.

For the record, https://gerrit.wikimedia.org/r/1139923 was reverted due to an issue with the rendered yaml, rather than an issue with the job itself. The yaml rendering issue should be fixed by https://gerrit.wikimedia.org/r/1140548.

Thu, May 1, 11:45 PM · serviceops, Community-Tech

Tue, Apr 29

Scott_French added a comment to T373752: Build php-uuid package, and add to WMF production and CI.

Thanks, @Reedy - It's really not all that much effort to make this happen if it would help unblock you all.

Tue, Apr 29, 11:00 PM · Patch-For-Review, serviceops
Scott_French added a subtask for T319432: Migrate WMF production from PHP 7.4 to PHP 8.1: T392938: Remove PHP 7.4 from deployment hosts.
Tue, Apr 29, 5:39 PM · Data-Engineering-Radar, Data-Engineering, Dumps-Generation, MediaWiki-Platform-Team, serviceops
Scott_French added a parent task for T392938: Remove PHP 7.4 from deployment hosts: T319432: Migrate WMF production from PHP 7.4 to PHP 8.1.
Tue, Apr 29, 5:39 PM · serviceops
Scott_French triaged T392938: Remove PHP 7.4 from deployment hosts as Medium priority.
Tue, Apr 29, 5:39 PM · serviceops
Scott_French created T392938: Remove PHP 7.4 from deployment hosts.
Tue, Apr 29, 5:38 PM · serviceops
Scott_French added a comment to T388761: scap needs to be k8s-cluster aware.

@brouberol - This would require changes to scap, specifically the ability to override the set of environments relevant to a particular deployment (rather than using the "defaults" provided by the k8s_clusters config key).

Tue, Apr 29, 3:17 PM · Patch-For-Review, Release-Engineering-Team, Dumps-Generation, Scap

Apr 19 2025

Scott_French added a comment to T373752: Build php-uuid package, and add to WMF production and CI.

My apologies, @Reedy - I once again lost track of this one.

Apr 19 2025, 1:22 AM · Patch-For-Review, serviceops
Scott_French closed T380485: Transition parsoidtest1001 to PHP 8.1, a subtask of T319432: Migrate WMF production from PHP 7.4 to PHP 8.1, as Resolved.
Apr 19 2025, 1:01 AM · Data-Engineering-Radar, Data-Engineering, Dumps-Generation, MediaWiki-Platform-Team, serviceops
Scott_French closed T380485: Transition parsoidtest1001 to PHP 8.1 as Resolved.

Since the logging issue appears to be resolved (e.g., the ongoing RT testing run is producing logs in logstash as expected), and nothing else of note appears to have arisen during yesterday's run that also surfaced the logging issue, I'm going to optimistically mark this as resolved.

Apr 19 2025, 1:01 AM · Web Team Essential Work 2025, Content-Transform-Team, OKR-Work, MediaWiki-Engineering, serviceops
Scott_French added a comment to T391057: Turn down MediaWiki image builds for PHP 7.4.

Remaining patches to draft:

Apr 19 2025, 12:50 AM · serviceops

Apr 18 2025

Scott_French updated subscribers of T388761: scap needs to be k8s-cluster aware.

Connecting some dots here that I forgot to add yesterday:

Apr 18 2025, 11:51 PM · Patch-For-Review, Release-Engineering-Team, Dumps-Generation, Scap
Scott_French added a comment to T386246: Migrate parsoidtest functionality to kubernetes.

We decided to update parsoidtest1001 to PHP 8.1 "in place" for T380485: Transition parsoidtest1001 to PHP 8.1, so this work no longer transitively blocks 8.1 migration. This will give us more time to develop a solution for this use case on k8s that maximizes overlap / reuse with T276994: Provide an mwdebug functionality on kubernetes (mw-experimental).

Apr 18 2025, 9:53 PM · Content-Transform-Team, OKR-Work, MediaWiki-Engineering, serviceops
Scott_French added a comment to T388536: Migrate community-tech jobs to mw-cron.

Alright, the first run appears to have succeeded:

Apr 18 2025, 6:13 PM · serviceops, Community-Tech
Scott_French updated the task description for T392309: Costly IP block queries triggered by long x-forwarded-for.
Apr 18 2025, 6:04 PM · SecTeam-Processed, Trust and Safety Product Team, MediaWiki-Blocks, Security-Team, DBA, WMF-NDA
Scott_French created T392309: Costly IP block queries triggered by long x-forwarded-for.
Apr 18 2025, 6:00 PM · SecTeam-Processed, Trust and Safety Product Team, MediaWiki-Blocks, Security-Team, DBA, WMF-NDA