API response encoding

Need help with your application? Ask here.
User avatar
Zer0xFF
Posts: 253
Joined: Fri Apr 20, 2018 9:18 am

Re: API response encoding

Post by Zer0xFF » Fri Aug 10, 2018 10:16 pm

@freelancer,

thanks for this information, based on that I came up with this sql query to update overview entries.

Code: Select all


UPDATE games SET overview = replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(overview, unhex('c280'), unhex('e282ac')), unhex('c282'), unhex('e2809a')), unhex('c283'), unhex('c692')), unhex('c284'), unhex('e2809e')), unhex('c285'), unhex('e280a6')), unhex('c286'), unhex('e280a0')), unhex('c287'), unhex('e280a1')), unhex('c288'), unhex('cb86')), unhex('c289'), unhex('e280b0')), unhex('c28a'), unhex('c5a0')), unhex('c28b'), unhex('e280b9')), unhex('c28c'), unhex('c592')), unhex('c28e'), unhex('c5bd')), unhex('c291'), unhex('e28098')), unhex('c292'), unhex('e28099')), unhex('c293'), unhex('e2809c')), unhex('c294'), unhex('e2809d')), unhex('c295'), unhex('e280a2')), unhex('c296'), unhex('e28093')), unhex('c297'), unhex('e28094')), unhex('c298'), unhex('cb9c')), unhex('c299'), unhex('e284a2')), unhex('c29a'), unhex('c5a1')), unhex('c29b'), unhex('e280ba')), unhex('c29c'), unhex('c593')), unhex('c29e'), unhex('c5be')), unhex('c29f'), unhex('c5b8')) WHERE overview <> replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(overview, unhex('c280'), unhex('e282ac')), unhex('c282'), unhex('e2809a')), unhex('c283'), unhex('c692')), unhex('c284'), unhex('e2809e')), unhex('c285'), unhex('e280a6')), unhex('c286'), unhex('e280a0')), unhex('c287'), unhex('e280a1')), unhex('c288'), unhex('cb86')), unhex('c289'), unhex('e280b0')), unhex('c28a'), unhex('c5a0')), unhex('c28b'), unhex('e280b9')), unhex('c28c'), unhex('c592')), unhex('c28e'), unhex('c5bd')), unhex('c291'), unhex('e28098')), unhex('c292'), unhex('e28099')), unhex('c293'), unhex('e2809c')), unhex('c294'), unhex('e2809d')), unhex('c295'), unhex('e280a2')), unhex('c296'), unhex('e28093')), unhex('c297'), unhex('e28094')), unhex('c298'), unhex('cb9c')), unhex('c299'), unhex('e284a2')), unhex('c29a'), unhex('c5a1')), unhex('c29b'), unhex('e280ba')), unhex('c29c'), unhex('c593')), unhex('c29e'), unhex('c5be')), unhex('c29f'), unhex('c5b8'));

UPDATE games SET game_title = replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(game_title, unhex('c280'), unhex('e282ac')), unhex('c282'), unhex('e2809a')), unhex('c283'), unhex('c692')), unhex('c284'), unhex('e2809e')), unhex('c285'), unhex('e280a6')), unhex('c286'), unhex('e280a0')), unhex('c287'), unhex('e280a1')), unhex('c288'), unhex('cb86')), unhex('c289'), unhex('e280b0')), unhex('c28a'), unhex('c5a0')), unhex('c28b'), unhex('e280b9')), unhex('c28c'), unhex('c592')), unhex('c28e'), unhex('c5bd')), unhex('c291'), unhex('e28098')), unhex('c292'), unhex('e28099')), unhex('c293'), unhex('e2809c')), unhex('c294'), unhex('e2809d')), unhex('c295'), unhex('e280a2')), unhex('c296'), unhex('e28093')), unhex('c297'), unhex('e28094')), unhex('c298'), unhex('cb9c')), unhex('c299'), unhex('e284a2')), unhex('c29a'), unhex('c5a1')), unhex('c29b'), unhex('e280ba')), unhex('c29c'), unhex('c593')), unhex('c29e'), unhex('c5be')), unhex('c29f'), unhex('c5b8')) WHERE game_title <> replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(game_title, unhex('c280'), unhex('e282ac')), unhex('c282'), unhex('e2809a')), unhex('c283'), unhex('c692')), unhex('c284'), unhex('e2809e')), unhex('c285'), unhex('e280a6')), unhex('c286'), unhex('e280a0')), unhex('c287'), unhex('e280a1')), unhex('c288'), unhex('cb86')), unhex('c289'), unhex('e280b0')), unhex('c28a'), unhex('c5a0')), unhex('c28b'), unhex('e280b9')), unhex('c28c'), unhex('c592')), unhex('c28e'), unhex('c5bd')), unhex('c291'), unhex('e28098')), unhex('c292'), unhex('e28099')), unhex('c293'), unhex('e2809c')), unhex('c294'), unhex('e2809d')), unhex('c295'), unhex('e280a2')), unhex('c296'), unhex('e28093')), unhex('c297'), unhex('e28094')), unhex('c298'), unhex('cb9c')), unhex('c299'), unhex('e284a2')), unhex('c29a'), unhex('c5a1')), unhex('c29b'), unhex('e280ba')), unhex('c29c'), unhex('c593')), unhex('c29e'), unhex('c5be')), unhex('c29f'), unhex('c5b8'));

(might change the condition and use the value produced by replace, so that it will only apply to those that will have a visible change (not that it would make much of a difference))


this should limit the changes to those with latin1 encoding that doesn't match the utf8 output result (as latin1 will be converted back to utf8), then it will replace all the latin1 characters with their utf8 counterpart... noting that the condition will also be trigger by Japanese/Chinese characters, but those should not be effected as they fall outside of the control range.

I'll apply this code around this same time tomorrow after I ran a full backup to have a fall back incase anything goes wrong.
if you've any comments or converns before hand please do let me know.

also you can use this url https://thegamesdb.net/gamev2.php?id=165 to see the effect of the change to overview (as it's done before returning the results)

P.S I'd also switch database encoding from utf8_general_ci to utf8mb4_unicode_ci as I'm reading this is the appropriate utf8 encoding with bettering wider language support... would probably need to run this through games titles as well

Regards
Zer0xFF
Regards
Zer0xFF

User avatar
freelancer
Posts: 17
Joined: Fri Jun 29, 2018 4:53 pm
Location: Sweden

Re: API response encoding

Post by freelancer » Sat Aug 11, 2018 12:26 am

Looks good to me! I've had a decent amount of gin though, will take a closer look tomorrow and let you know if anything looks wrong.

And yeah, you probably wanna run that on every text field to be sure (overview, title, alternates, ...)

Changing the collation to utf8mb4_unicode_ci sounds like a good idea aswell.

I'm feeling very good about this. New site looks good, new API is good, and this'll fix the only issue I've found so far =) Keep up the good work!

User avatar
freelancer
Posts: 17
Joined: Fri Jun 29, 2018 4:53 pm
Location: Sweden

Re: API response encoding

Post by freelancer » Sat Aug 11, 2018 9:21 pm

Checked out some more games today on the "old" vs "new" pages and it still looks like this'll fix the issue.

Let me know when you implement it and then I can do the same on my end and re-enable update.

User avatar
Zer0xFF
Posts: 253
Joined: Fri Apr 20, 2018 9:18 am

Re: API response encoding

Post by Zer0xFF » Sat Aug 11, 2018 10:05 pm

@freelancer Done, Thanks for the report and help, note that I will push complete list of effected games, so other can re-poll those if needed, though, an sql fix based on what I posted is recommend.

To note: 3774 overviews were effected, and 52 games titles.
no alts, publishers or dev.
Regards
Zer0xFF

Post Reply