Some time around 2010 I raised the idea of splitting the core translation file (then MessagesEn.php) into several files to make it easier for translators. The basic idea is that it's easier to approach the translations as several smaller groups rather than one large group.
Back then it had about 2700 messages. @siebrand and @Nikerabbit were not enthusiastic about it, and said that it's not worth the effort. (We discussed it in person at the 2011 Berlin Hackathon, and possibly in writing on some mailing lists or Bugzilla tasks, but I cannot find it now.)
A few things have changed since then:
- The number of messages went up from about 2700 to 3800. In fact, it's over 4000 if you count the optional and ignored messages.
- We transitioned from PHP to JSON.
- In practice we already have several separate en.json files: the core itself is split into Core, API, and Installer, and there are also separate repos for skins.
- translatewiki.net configuration files are not that hard to edit. (I don't quite know what they looked like in 2010, to be honest, but I do know them now, and they aren't terrible.)
As far as I know, splitting a group is a matter of:
- Finding a group of closely related messages, and making sure that no information is lost compared to the current subgroups of messages that en.json contains (T162172#3280030).
- In the core repository (example):
  - Moving the relevant messages to a new en.json and qqq.json while keeping all the message keys identical. Unless there's a reason to do it differently, the new files should be under languages/i18n/new-group-name/en.json.
  - Adding an entry for the new file to function getMessagesDirs() in includes/cache/localisation/LocalisationCache.php.
  - Adding an entry for the new file to the banana section in Gruntfile.js.
- In the translatewiki repository:
  - Adding a new group in groups/MediaWiki/MediaWiki.yaml and moving the ignored and optional messages into it (example).
  - Adding the new group to the appropriate aggregate "used by Wikimedia" group, such as groups/MediaWiki/WikimediaMainAgg.yaml or WikimediaTechnicalAgg.yaml (example).
  - Adding the new group to the mediawiki:/group: section in repoconfig.yaml (example).
  - Doing a new export so that the translations are moved as well.
- (Did I miss anything? Does anything need to be updated also in the scripts for synching translatewiki with Gerrit?)
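The file-moving part of the core-repository step could be scripted roughly as follows. This is only a sketch under assumptions: it assumes each i18n file is a flat JSON object with an "@metadata" entry, it selects messages by key prefix (a real group may need an explicit key list instead), and the helper names `split_messages` and `split_files` are hypothetical. It does not touch LocalisationCache.php, Gruntfile.js, or any translatewiki configuration.

```python
import json
from pathlib import Path


def split_messages(messages, prefixes):
    """Split a message dict into (remaining, moved) by key prefix.

    Message keys are kept identical. The "@metadata" entry, if present,
    is copied into both halves (an assumption about the desired result).
    """
    prefixes = tuple(prefixes)  # str.startswith accepts a tuple, not a list
    remaining, moved = {}, {}
    for key, value in messages.items():
        if key == "@metadata":
            remaining[key] = value
            moved[key] = value
        elif key.startswith(prefixes):
            moved[key] = value
        else:
            remaining[key] = value
    return remaining, moved


def split_files(i18n_dir, group_name, prefixes, codes=("en", "qqq")):
    """Move matching messages from <i18n_dir>/<code>.json
    to <i18n_dir>/<group_name>/<code>.json for each language code."""
    i18n_dir = Path(i18n_dir)
    target_dir = i18n_dir / group_name
    target_dir.mkdir(parents=True, exist_ok=True)
    for code in codes:
        source = i18n_dir / f"{code}.json"
        messages = json.loads(source.read_text(encoding="utf-8"))
        remaining, moved = split_messages(messages, prefixes)
        # Sanity check: no key may be lost or invented by the split.
        assert set(remaining) | set(moved) == set(messages)
        for path, data in ((source, remaining), (target_dir / f"{code}.json", moved)):
            path.write_text(
                json.dumps(data, ensure_ascii=False, indent="\t") + "\n",
                encoding="utf-8",
            )
```

For example, `split_files("languages/i18n", "exif", ["exif-"])` would move every key starting with `exif-` out of en.json and qqq.json into languages/i18n/exif/. The remaining language files would be handled by the translatewiki export mentioned above, not by this script.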
I'm not talking about splitting it into 50 groups, but some initial groups I can think of are:
- exif/ - definitely the Exif tags (about 380 messages)
- datetime/ - maybe calendars (not only Gregorian, but also Hebrew, Persian, days of week, etc.)
- maybe log messages
- language converter
- namespace messages (nstab, etc.)
- skin messages (user menu, sidebar, tool box, etc.)
- special pages
- preferences/ - Special:Preferences
- user emails (enotif-*, etc.)
- user groups (group-*, grouppage-*, *.css, etc.)
- user rights
- I haven't given this much thought yet, but perhaps the ignored messages could be moved to a separate file. That file would simply not be loaded into translatewiki, and then we could remove the long "ignored" list from the translatewiki configuration (269 items at the moment). But that's really a separate issue to discuss.
- Possibly some more.
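To get a rough idea of how large each candidate group would be, the message keys can be counted by prefix. The sketch below is illustrative only: the helper name `count_by_prefix` is hypothetical, and it assumes that a group can be approximated by one or more key prefixes, which will not hold for every group in the list above.

```python
from collections import Counter


def count_by_prefix(keys, prefixes):
    """Count how many message keys start with each candidate prefix.

    Each key is counted at most once, under the first prefix it matches.
    Prefixes that match nothing are reported with a count of 0.
    """
    counts = Counter({prefix: 0 for prefix in prefixes})
    for key in keys:
        for prefix in prefixes:
            if key.startswith(prefix):
                counts[prefix] += 1
                break  # stop at the first matching prefix
    return counts
```

Passing the keys of the loaded en.json and a list such as `["exif-", "group-", "grouppage-", "enotif-"]` gives a per-prefix count that can be compared with the estimates above.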
I can do it myself some time as a pet project. This task is a kind of RFC: Are there caveats that I am missing? Is it harder than I imagine? Is anybody opposed to it for any reason?