File (hide): 1621313198729.jpg ( 274.61 KB , 2048x1152 , 1621284985965.jpg )

▶Anonymous 2021-05-18 (Tue) 04:46:38 No.6927>>6928

There isn't a bug report thread so

>Incorrect string value: '\xEF\xBDuygh' for column `lainchan`.`posts_b`.`body_nomarkup` at row 1

I tried posting this picture on /b/ with full-width text as the body.

▶Anonymous 2021-05-18 (Tue) 06:42:05 No.6928

>>6927 (OP)
It would have been the full-width text ("body" refers to this main text part of a post). As you can tell, your image worked fine.
We've seen a few cases where unusual characters (including some foreign-language copypastas) have caused similar-looking errors.
I'll ask the devs but I think it's an issue that stalled due to priorities and devs doing IRL stuff.

▶Anonymous 2021-05-18 (Tue) 06:42:44 No.6929>>6930

> lainchan
You have to go back.

▶Anonymous 2021-05-18 (Tue) 06:43:08 No.6930

>>6929
Where do you think you are?

▶Anonymous 2021-05-22 (Sat) 05:39:14 No.7094

I got a similar error message when making a post that contained Japanese characters.

▶Anonymous 2021-05-27 (Thu) 07:46:11 No.7252

Looking at the error, I think I know what it is and it should now be fixed.
<these next two points are technical, interesting but TL;DRable
>in the beginning there was ASCII character encoding, where the values 0 - 127 (that's 2^7 = 128 values, 7 bits; the top value is 127 rather than 128 because programmers start counting at 0, not 1) represent each character ('A' = 65, 'B' = 66, … ; that's 0x41 and 0x42 in hex)
>naturally 128 values isn't enough even for Europe so everyone told America to fuck off and now we use UTF-8 encoding (Unicode), where a single character can be multiple bytes long (reading the bytes as one decimal number: 'A' = 65, '☭' = 14850221, '日' = 15112101, 'ａ' = 15711617, '💩' = 4036989609), and in order to remain compatible with ASCII you basically have a signal (first bit of byte = 1) for "and also include the next byte after this". A number under 256 (2^8) fits in one byte, under 65536 (2^16) in two, under 16777216 (2^24) in three, under 4294967296 (2^32) in four.
>however, PHP's function for the filter parsing language (regex) doesn't treat messages as multi-byte by default, despite UTF-8 being used almost everywhere if you aren't anglo. So it would break the 日 (3 bytes) into three weird bits and treat each of those weird bits as letters when looking for a phrase.
>if the stars aligned and it 'found' a filtered phrase, it replaced it. Even if it was the 3rd byte of 日 and the 3 bytes of 本. It would have just ripped out the last third of 日 and replaced it with 'uygh'.
>when that string of characters (your post) gets processed later, after running the filter, the broken character causes an error because by chance it's an impossible letter, so the site refuses to post it.
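
For anyone who wants to poke at this, here is a rough Python sketch of the same idea (the site itself runs PHP, so this is an illustration, not the actual filter code). The pattern `\x81` and the function names are made up for the demo; the real filter pattern isn't given in this thread. It only assumes that some byte-wise filter happened to match the last byte of the full-width 'ａ' and swapped in 'uygh'.

```python
import re


def utf8_as_int(ch: str) -> int:
    """Read the UTF-8 bytes of one character as a single big-endian number."""
    return int.from_bytes(ch.encode("utf-8"), "big")


# Byte length, hex bytes and packed decimal value for the characters in the post.
for ch in ["A", "☭", "日", "ａ", "💩"]:
    raw = ch.encode("utf-8")
    print(ch, len(raw), raw.hex(), utf8_as_int(ch))


def bytewise_filter(body: bytes) -> bytes:
    """Hypothetical filter working on raw bytes, unaware of UTF-8.

    The pattern is a contrived stand-in that matches the single byte 0x81.
    """
    return re.sub(rb"\x81", b"uygh", body)


def charwise_filter(body: str) -> str:
    """Same hypothetical pattern, applied to whole characters instead of bytes."""
    return re.sub("\x81", "uygh", body)


post = "ａ"  # full-width 'a', UTF-8 bytes EF BD 81
broken = bytewise_filter(post.encode("utf-8"))
print(broken)  # the third byte has been replaced, leaving EF BD + 'uygh'

try:
    broken.decode("utf-8")
except UnicodeDecodeError as err:
    # EF BD announces a 3-byte character, but 'u' is not a valid third byte.
    print("invalid UTF-8:", err)

# Matching on characters leaves the post alone, since 'ａ' is one unit there.
print(charwise_filter(post) == post)
```

In PHP the rough equivalent of the character-aware version is adding the `u` modifier to the `preg_*` pattern so it is treated as UTF-8; whether that is the exact fix that was deployed isn't stated here.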


/meta/ - Ruthless criticism of all that exists (in leftychan.net)
