Summary: If you don't have time to read all this, the bottom line is that I have decided to run the system off one battery until the issue is fixed. It looks like you were right, Sunshine Eggo, but see below for further detail.
(comment is about 900 words, 2 minutes to read)
The fault is still occurring regularly. If I turn the battery off and back on again only minutes later, and do nothing else, the fault never reappears immediately, but it reliably recurs minutes or hours later, or the next day at the latest.
The batteries tend to be out of balance after the fault, especially if I detect it at 6am or 7am, which means the battery could have been in the fault condition for hours, as the alarm is not quite loud enough to wake me at the other end of the house. This imbalance (which has happened a few times in recent weeks) suggests that when the battery is in the fault condition it is probably not discharging. In some cases when this happened I disconnected the batteries and did a partial or full rebalancing before reconnecting. When I did this, the fault did not recur as soon, but it still recurred after perhaps a few days or a week.
There is insufficient data to draw a conclusion with high probability, but my gut feeling is that the benefit of battery balancing comes from giving the battery time to rest and recover from heavy demand, rather than from the balancing itself. To test this theory, I recently just turned the batteries off for over a day, and the fault did not occur until 2 days later, similar to what happens after battery balancing.
Last night I tried turning the faulty battery off and back on again after only 30 minutes, but the error repeated within a few hours. I then left the faulty battery off overnight. The remaining single battery was still able to handle a 2000W load. However, after I turned off the faulty battery, a red fault light (no flashing, nothing audible) came on on the master battery, and the 6 LEDs on the master battery that show SOC did not light up. It still worked, though, during steady discharge at low load for maybe 10 hours. I turned the bottom battery back on this morning, so I didn't get to test charging with one battery turned off. When I turned the bottom, faulty battery back on, the red fault light disappeared immediately and the SOC lights returned immediately as well.
Interestingly, when I turned the batteries back on they were heavily unbalanced, with one showing one light and the other four. Impressively, though, the two batteries then balanced each other out over a few hours. At one point one of the batteries was actually charging even though the whole system was discharging (i.e. loads > PV input). So all the times I disconnected the batteries to balance them in isolation from the inverter and the rest of the system may not even have been necessary.
I am able to turn the faulty battery off and on again without causing a power cut as long as I don't turn off the other battery. This is the only way I've found to reliably apply a short-term fix for the error without a power cut. However, given the fault light and the missing SOC display, this makes me uncomfortable, as there may be a possibility, however slight, that I could damage the top battery in this arrangement, with the circuit passing through the turned-off battery.
I have therefore completely disconnected (and turned off) the faulty battery for now and connected both long cables to the top battery. I have not changed the DIP switch positions. I have reduced the charging current from 30A to 20A but kept the other inverter settings the same. I attach a photo of the current cable connections if anyone wants to check it. I currently plan to reconnect the faulty battery only if it is needed as part of an attempt to solve the problem, or if we have a grid power cut and it's the only power we have left. I am storing the battery at 50% charge, by the way, and if it gets cold out there (nearer to zero degrees than ten) I will move it to another storage location in the house. Given this, it seems it should be OK to store it turned off for months if necessary (according to a few articles I just read, including some graphs).
I'm too busy to solve this properly this week. It looks like the issue now needs a software diagnostic with Pylontech and, failing that, an attempt to get the battery replaced or repaired. I will try to get to that in May and then provide an update.
At the moment it looks less likely that the error is caused by communication issues or by current flow from the faulty battery to the inverter or the other battery, and perhaps more likely that there is an internal issue with the faulty battery itself.
I am concerned about running the system on one battery, as it is quite undersized for the evening demand. A 2000W evening demand will work but is above the recommendation and could affect battery lifetime. I don't know what happens with a demand of 4000W or so in the evening, but hopefully the system would fall back to the grid. However, this seems to be the best solution for now. I may also try this as a permanent solution if I can't get the battery repaired or replaced under warranty.
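As a rough sanity check on those figures, here is a small Python sketch that estimates the DC current a single battery would have to supply for a given load. The 48 V nominal pack voltage is an assumption (it is not stated anywhere above), the function name is made up for illustration, and inverter losses are ignored, so real currents would be somewhat higher.

```python
def battery_current_amps(load_watts: float, pack_volts: float = 48.0) -> float:
    """Approximate DC current drawn from the battery for a given AC load.

    ASSUMPTIONS: pack_volts defaults to a 48 V nominal pack (not confirmed
    in the post), and inverter conversion losses are ignored.
    """
    if pack_volts <= 0:
        raise ValueError("pack_volts must be positive")
    return load_watts / pack_volts  # I = P / V


# The two evening loads mentioned above.
for load in (2000, 4000):
    print(f"{load} W -> roughly {battery_current_amps(load):.0f} A from one battery")
```

The results are only meaningful when compared against the recommended and maximum discharge currents on the battery's own datasheet.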