I have been trying to create a post in the Canada community. Scuttlebutt is that the post limit was set to 10,000 characters, but has since been set to 50,000 characters. My post has 9961 UTF-8 characters (9969 characters overall, 8396 characters excluding spaces) and when I hit submit the submission never completes.
looks like it’s still 10k characters, but some others have been running into issues at around 6k-7k chars, for whatever reason. it’s probably storing stuff in markdown format but i can’t see the overhead being that large?
Strangely enough, my character count is including all markup.
it’s still 10k characters
That very post you link to also points to another Git commit that raises it to 50k characters. It was introduced in 0.18.0-rc.3, and backported all the way down to 0.17.0-rc1. Check out commit ee0cdde.
hmmm. and it looks like the commit to add the 10k limit in the backend was only added in v0.18. will investigate this some more.
i was able to successfully make a post with 9415 characters: https://lemmy.ca/post/784193 (edit also one with 9999 chars: https://lemmy.ca/post/784481)
they are all just ascii characters, so maybe that’s why it worked for me? i’ve also seen issues when commenting that if the Language selected doesn’t match what’s been selected in the Community, it will also just spin. i believe they’re going to be returning proper error messages in the next version?
but when i went over 10,000 it didn’t work (the spinner just spun).
they are all just ascii characters
Wait… what??
If this is only ASCII characters, then the issues make a lot of sense… but then this is also one of the more brain-dead bugs from a programmer’s standpoint. Everything is in UTF-8 these days, especially if you want i18n, as Lemmy seems to do with almost any post/comment submission or community creation. Counting as if every character were one ASCII byte, and crapping out at around 5k UTF-8 characters because non-ASCII characters take two or more bytes each, is just… really, really bad.
i did another test post (https://lemmy.ca/post/795726) with some emoji, and it appears that it’s counting each one as two “characters”. so yes, it looks like it’s just counting bytes.
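For anyone who wants to check their own post before submitting, here is a rough Python sketch of the three ways a "character" limit might be counted. Nothing in this thread confirms which one Lemmy actually uses, and `count_lengths` is just a name made up for illustration. One thing it does show: an emoji like 😀 is four bytes in UTF-8 but two UTF-16 code units, so "each emoji counts as two" would also fit a limit measured in UTF-16 code units rather than raw bytes.

```python
def count_lengths(text: str) -> dict[str, int]:
    """Return the length of `text` under three common counting schemes.

    Hypothetical helper for illustration; not taken from Lemmy's code.
    """
    return {
        # Unicode code points: what Python's len() reports for a str.
        "code_points": len(text),
        # Bytes once the text is encoded as UTF-8.
        "utf8_bytes": len(text.encode("utf-8")),
        # UTF-16 code units (2 bytes each); this is how JavaScript's
        # String.length counts.
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }


if __name__ == "__main__":
    samples = {
        "ascii": "a" * 10,           # plain ASCII letters
        "accented": "\u00e9" * 10,   # e with acute accent, precomposed
        "emoji": "\U0001F600" * 10,  # grinning face, outside the BMP
    }
    for name, text in samples.items():
        print(name, count_lengths(text))
```

Pasting a draft post into `count_lengths` and comparing the three numbers against 10,000 should at least show which count is pushing it over the limit.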