#Unicode - kbin.social

ButterflyOfFire, 7 months ago (edited 7 months ago) to Morocco French

Seen on El Jazeera: the letter GA.

Wiki : https://en.m.wikipedia.org/wiki/Ng_(Arabic_letter)

U+06AD

Now, why did the writer use the final [isolated] form of ga in the middle of the word?

I mean, why not: تحڭره ?
Possibly it was a copy & paste, because the letter is not accessible on a standard keyboard layout.

#unicode #morocco
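
One way to test the copy-and-paste guess above is to look for Arabic presentation-form code points, which hard-code a positional shape instead of letting the renderer join the letter. This is a minimal Python sketch using the standard `unicodedata` module. The helper name `presentation_forms` is hypothetical, and whether the El Jazeera text actually contained such a code point is not known from the post.

```python
import unicodedata

# Decomposition tags that mark a fixed positional presentation form.
POSITIONAL_TAGS = {"<isolated>", "<final>", "<initial>", "<medial>"}

def presentation_forms(text):
    """Return (char, tag, base letter) for each presentation-form character."""
    found = []
    for ch in text:
        parts = unicodedata.decomposition(ch).split()
        if parts and parts[0] in POSITIONAL_TAGS:
            base = "".join(chr(int(cp, 16)) for cp in parts[1:])
            found.append((ch, parts[0], base))
    return found

# The word as typed in the post, with the base letter U+06AD: nothing is flagged.
print(presentation_forms("\u062A\u062D\u06AD\u0631\u0647"))

# Assumed scenario: the same word with U+FBD3 (isolated form of U+06AD) pasted in.
print(presentation_forms("\u062A\u062D\uFBD3\u0631\u0647"))
```

If the second call reports a hit on real text, normalizing with `unicodedata.normalize("NFKC", text)` maps the presentation form back to the base letter so that normal joining applies again.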


gringene, 7 months ago to random

ۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗSo... does anyone know what long unicode accents do to mastodon text?

ۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗۗ
Just putting this here so that I can refer to it later.

#unicode


paulox, 7 months ago to python

Python 3.12.0 was released last week 🐍

Ubuntu 23.10 "Mantic Minotaur" was released today ♉

The best way to use Python 3.12 on Ubuntu is simply to upgrade to 23.10 🚀

#Python #Python312 #Ubuntu #Ubuntu2310 #Update #ManticMinotaur #Unicode #Unicode15


bentolor, 7 months ago to programming German

Excellent writeup on #Unicode and its various encodings, like #UTF8 and others, in a beautiful and precise style.

A must-read for any software developer in 2023! #programming #i18n

https://tonsky.me/blog/unicode/


SteveFaulkner, 7 months ago to accessibility

👁️ short note on emoji text alternative variations

"Unicode symbols do not have inbuilt text alternatives. They are exposed in the browser accessibility tree as a text symbol"

#emoji #screenreaders #a11y #unicode #webDev

https://html5accessibility.com/stuff/2022/01/17/short-note-on-emoji-text-alternative-variations/


markush, 7 months ago to random

This page has THE best explanation of how #Unicode and #UTF8 #encoding work: https://tonsky.me/blog/unicode/


blausand, 7 months ago

TIL about something called grapheme clusters.
Must have been a good read, since I feel more incompetent now and don't trust #Unicode as I did before.
https://tonsky.me/blog/unicode/


hook, 8 months ago to random

This was a surprisingly fun and interesting read:

The Absolute Minimum Every Software Developer Must Know About #Unicode in 2023 (Still No Excuses!)
https://tonsky.me/blog/unicode/


janriemer, 8 months ago to random

Did you know that #Unicode has a "boost symbol"?

It's this:
⮔

https://unicode-explorer.com/c/2B94

You know what to do with this toot, right!? 😏

⮔

#FunFact #Boost


AccordionBruce, 8 months ago to vinyl

Finally! Proposal for a #PhonographEmoji:

https://docs.google.com/document/d/1DaNNYA1qnSCVZpwrgLx_HAlzq0Sn8UlINbmUeLkUwVw/edit?usp=sharing

You're seeing it before almost anybody else in the world. So be kind, if you like

As I said, I'm hoping to work with a few folks from sound #Archives so it's not just me, now that I've done some of the heavy lifting

There it is, if that link works. Should be read-only now. I don't really know how Google Docs works

#NonAccordionContent #Emoji #Vinyl #Records


AccordionBruce, 8 months ago

@jgamble
OMG! I should add that to the proposal

Probably #PhonographEmoji + 🐶 would be closest

#Unicode is working on a scheme to have emoji be reversible so you can make sure all the police-cars are chasing the right surfer instead of the other way 'round or whatever

So if you had a sitting dog, that might take care of the issue

But I don't know what other dog: 🐕 Dog2; 🐕‍🦺 Service Dog; 🦮 Guide Dog; or 🌭 Hot Dog; would fit as well as 🐶 for ol' Nipper

#WallaceAndGromit #emoji
#DogsOfMastodon


SnoopJ, 8 months ago to random

Getting around to reading the 'new' "Absolute minimum" blog post about dev knowledge about #Unicode, and I assume parts of it are going to rub me the wrong way


SnoopJ, 8 months ago

As an aside:

To me, one of the most important "absolute minimum" bits of #Unicode knowledge for devs is an understanding that "Unicode" is not a monolith, and that the specifications which fall under this name allow for a lot of nuance. Understanding that Unicode is [UCD + a bunch of rules + …] goes a long way.

There is rarely a single Right Way™ to do things, and this is entirely on purpose because it turns out that text is complicated.


globalc, 8 months ago to random

Some quite interesting pieces about #unicode:
https://tonsky.me/blog/unicode/


ploum, 8 months ago (edited 8 months ago) to random

Wow, my small brain eventually managed to understand UTF-8 thanks to this great article:

https://tonsky.me/blog/unicode/

Yes, it was hard for me. Even @bortzmeyer was not enough to allow me to grasp the thing.

I like UTF-8 even more now.
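
For readers who want to see the mechanics the linked article explains, here is a minimal hand-written UTF-8 encoder in Python. It is only a sketch for illustration: the function name `encode_utf8` is hypothetical, and real code should simply call `str.encode("utf-8")`.

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 bytes, by hand."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x80:       # 1 byte:  0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:      # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:    # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])

# Compare the hand-written encoder with the built-in one.
for ch in ["A", "\u00E9", "\u20AC", "\U0001F40D"]:
    manual = encode_utf8(ord(ch))
    print(f"U+{ord(ch):04X}", manual.hex(" "), manual == ch.encode("utf-8"))
```

The leading bits of the first byte announce how many bytes follow, and every continuation byte starts with `10`, which is why a decoder can resynchronize from any position in a byte stream.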


kravietz, 8 months ago

@ploum

You might also find my 2018 OWASP presentation on the same subject of #Unicode interesting:

https://www.youtube.com/watch?v=u34-Cc64xFk

@bortzmeyer


janriemer, 8 months ago to ai

Aaaaaannnd we have another example of #AI creating bullshit code. 💩

This time it tries to create a "simple" #Rust function that checks if a string is an acronym:

https://www.youtube.com/watch?v=Fvy2nXcw3zc&t=224s (YT, because timestamp)

The AI-generated code does not care about #unicode at all, so it panics when you give it a Unicode character whose char boundary does not fall at byte index 1.

1/2

#LLM #LLMs #ArtificialIntelligence #SALAMI #RustLang
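
The failure described above comes from treating a byte offset as a character offset. The Rust code from the video is not reproduced here. What follows is only a Python analogue, assuming UTF-8 input, in which cutting after byte 1 raises an exception rather than panicking. Both function names are hypothetical.

```python
def first_char_by_byte_index(data: bytes) -> str:
    """Naive version: assumes the first character is exactly one byte long."""
    return data[:1].decode("utf-8")

def first_char(data: bytes) -> str:
    """Safer version: decode the whole input, then take the first code point."""
    return data.decode("utf-8")[:1]

ascii_input = "NASA".encode("utf-8")
accented_input = "\u00C9NA".encode("utf-8")  # the first letter takes two bytes

print(first_char_by_byte_index(ascii_input))  # works: ASCII is one byte per character
print(first_char(accented_input))             # works for any valid UTF-8

try:
    first_char_by_byte_index(accented_input)
except UnicodeDecodeError as err:
    print("byte index 1 is not a character boundary:", err.reason)
```

Even the safer version only returns the first code point, which is not necessarily the first user-perceived character when combining marks or emoji sequences are involved.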


EvanHahn, 8 months ago to programming

"The Absolute Minimum Every Software Developer Must Know About Unicode in 2023" https://tonsky.me/blog/unicode/

#programming #Unicode #ASCII #code


shaft, 8 months ago to random French

I've just discovered that #Unicode has glyphs for writing the month as a single character when writing the date in Mandarin or Japanese. However, there is no equivalent for the days. So today is ㋈30日, 3 characters 😔
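
The month glyph above can be inspected with Python's standard `unicodedata` module. This is a small sketch, not a statement about what the author checked. The second lookup uses a character name taken from the Unicode character database and hints that day symbols may exist after all, so the claim about days is worth verifying against your own Unicode version.

```python
import unicodedata

month = "\u32C8"  # the single-character month used in the post
print(f"U+{ord(month):04X}", unicodedata.name(month))
print(unicodedata.normalize("NFKC", month))  # compatibility form: digit + month ideograph

# lookup() raises KeyError if the running Unicode database has no such name.
day = unicodedata.lookup("IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY THIRTY")
print(f"U+{ord(day):04X}", unicodedata.normalize("NFKC", day))
```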


shaft, 8 months ago to random French

I've just added the glyph U+1F16D (Circled CC, i.e. the Creative Commons symbol), which has been in #Unicode since version 13.0, to the "Symboles" font on my blog (which already provides the little Mastodon logo and the syndication feed one)

Too lazy to update my footer, though ^^

https://r12a.github.io/uniview/?char=1f16d


bortzmeyer, 8 months ago to random French

Seen in the Twitter profile of a guy who, judging by his tweets, is rather conspiracy-minded:

🍷+ 🐖 = 🇫🇷

I'm going to try his menu tonight; I'll let you know whether I feel more French afterwards.


bortzmeyer, 8 months ago

So, I redid his calculation, and 1F377 + 1F416 = 3E78D, which is not an allocated #Unicode character.
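
The arithmetic can be reproduced in a few lines of Python. This sketch uses the standard `unicodedata` module; a general category of `Cn` means the code point is unassigned in the Unicode version bundled with the running interpreter.

```python
import unicodedata

wine, pig = "\U0001F377", "\U0001F416"  # the two emoji from the profile
total = ord(wine) + ord(pig)

print(f"{ord(wine):X} + {ord(pig):X} = {total:X}")
print(unicodedata.category(chr(total)))          # general category of the sum
print(unicodedata.name(chr(total), "<no name>")) # fallback when no name exists
```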


fell, 8 months ago to fediverse

Wasn't there a de-facto standard #emoji set for the #Fediverse, or did I dream that?

You know, with all the #FOSS classic icons like #tux, #archlinux, #postmarketos, #blender, #kde and so on?

#MastoAdmin #FediAdmin #AskFedi


fell, 8 months ago

@HauntedOwlbear Sorry, I didn't mean the #Unicode #emoji. I mean the custom emoji, stuff like the beloved #Linux penguin.


gairdeachas, 8 months ago to Greek

Today I had to once again tell someone that the Unicode GREEK QUESTION MARK gets NFC normalized into SEMICOLON and is thus not distinguishable after normalization.

Now I expect they will argue that we shouldn't be doing NFC normalization, at which point I will refer them to the design doc in which we discussed why we made the decisions we did (yay past-me for writing those up).

And I will never get those brain cells back for something less esoteric. Ever.

#unicode #greek
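
The behaviour described above is easy to confirm with Python's standard `unicodedata` module. This is a minimal sketch, independent of whatever system the post refers to.

```python
import unicodedata

greek_question_mark = "\u037E"  # GREEK QUESTION MARK
semicolon = "\u003B"            # SEMICOLON

normalized = unicodedata.normalize("NFC", greek_question_mark)

print(unicodedata.name(greek_question_mark))
print(greek_question_mark == semicolon)  # distinct code points before normalization
print(normalized == semicolon)           # identical after NFC
```

Because the mapping is a canonical decomposition, NFD gives the same result, so no Unicode normalization form preserves the distinction.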


brokenix, 8 months ago to ascii

Please do not use the #ASCII grave accent (0x60) as a left quotation mark together with the ASCII apostrophe (0x27) as the corresponding right quotation mark (as in `quote'). Your text will otherwise appear rather strange with most modern fonts (e.g., on #Windows and Mac systems). Only old X Window System fonts and some old video terminals show ASCII 0x60/0x27 as left and right quotation marks, while most modern systems follow the ISO and Unicode standards instead. If you can use only ASCII’s typewriter characters, then use the apostrophe character (0x27) as both the left and right quotation mark (as in 'quote'). If you can use #Unicode characters, nice directional quotation marks are available in the form of characters U+2018, U+2019, U+201C, and U+201D (as in ‘quote’ or “quote”).
If you work in an environment where the UTF-8 encoding is already used everywhere (e.g., Plan9 and most modern GNU/Linux installations), you could even decide to use proper directional quotation marks, as in ‘quote’ or “quote”.

Check your source code directories with

    grep \` *

to find out where modifications are necessary. Then use (with proper care!) something like

    perl -pi.bak -e "s/\`/'/g;" file1 file2 ...

to make the necessary substitutions automatically, or make the edits manually instead.

The use of 0x60 (grave accent) as a special control character in the Unix shell (to denote command substitution as in `command` or better $(command)), in #Perl, in #Lisp, or in #TeX/troff (to denote a proper left single quotation mark) does not have to be changed and remains unaffected.
https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html
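
As an alternative to a blanket substitution, the legacy pair can be matched as a unit and replaced with the directional quotes U+2018 and U+2019. This Python sketch is not from the linked page. The name `modernize_quotes` is hypothetical, and the pattern assumes the quoted text sits on one line and contains no quote characters of its own, so the output still needs review by hand.

```python
import re

# Matches `text' on a single line: a grave accent, some text, an apostrophe.
LEGACY_QUOTE = re.compile(r"`([^`'\n]+)'")

def modernize_quotes(line: str) -> str:
    """Replace `text' with U+2018 text U+2019, leaving everything else alone."""
    return LEGACY_QUOTE.sub("\u2018\\1\u2019", line)

print(modernize_quotes("as in `quote'"))
print(modernize_quotes("today=`date`"))  # shell command substitution is untouched
```

Matching the pair as a unit leaves balanced backticks, such as shell command substitution, unchanged, whereas a plain global replace would rewrite them too.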


shaft, 8 months ago to random French

You can reproduce the ⊃∪∩⊂ typography of what's-his-name¹ with the Mathematical Operators block, included in #Unicode in version 1.1 🤔

¹ Name omitted on purpose


brokenix, 8 months ago to random

unicode-bidi | Codrops

"#Unicode specifies an algorithm that is used by user agents to determine the direction of text in a bi-directional content. The algorithm determines directional flow of content based on properties of the characters and content, as well as explicit controls for language “embeddings” and directional overrides"
https://tympanus.net/codrops/css_reference/unicode-bidi/#:~:text=Unicode%20specifies%20an,and%20directional%20overrides
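
The "properties of the characters" mentioned in the quote are the bidirectional classes stored in the Unicode character database. This short Python sketch prints them for a few sample characters using the standard `unicodedata` module. The sample set is an arbitrary choice for illustration, and the code does not implement the bidirectional algorithm itself.

```python
import unicodedata

samples = {
    "Latin letter a": "a",
    "Hebrew alef": "\u05D0",
    "Arabic alef": "\u0627",
    "digit 1": "1",
    "RIGHT-TO-LEFT EMBEDDING": "\u202B",
    "RIGHT-TO-LEFT OVERRIDE": "\u202E",
}

# Each character carries a bidi class that the algorithm uses as its input.
for label, ch in samples.items():
    print(f"{label:24} U+{ord(ch):04X} {unicodedata.bidirectional(ch)}")
```

The last two entries are the explicit embedding and override controls the quote refers to, which the CSS `unicode-bidi` property exposes without requiring these invisible characters in the text.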


Annalee, 8 months ago to random

I realized today that we don't have a cicada emoji, which seems like an oversight. They're the avatars of endless screaming.


quinnanya, 8 months ago

@Annalee Given the #Unicode preference for multi-function #emoji and combinatory possibilities of what's already there, I wonder if one might get a cicada faster by proposing a 🦗😱 combo with a zero-width joiner. 🤔
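
A zero-width-joiner sequence is just ordinary code points with U+200D between them. This Python sketch builds an existing sequence (service dog) and the combination suggested above. The cricket-plus-scream sequence is hypothetical and is not a recommended emoji sequence, so fonts will most likely render it as two separate emoji. The helper name `zwj_sequence` is also made up for illustration.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER

def zwj_sequence(*parts: str) -> str:
    """Join emoji code points with ZWJ to form a single sequence."""
    return ZWJ.join(parts)

service_dog = zwj_sequence("\U0001F415", "\U0001F9BA")  # dog + safety vest
proposed = zwj_sequence("\U0001F997", "\U0001F631")     # cricket + screaming face (hypothetical)

for seq in (service_dog, proposed):
    code_points = [f"U+{ord(c):04X}" for c in seq]
    print(seq, code_points, len(seq.encode("utf-8")), "UTF-8 bytes")
```

Both strings are three code points and eleven UTF-8 bytes long, yet a supporting font shows the first as one glyph. That gap between code points and what the reader sees is the grapheme-cluster issue the Unicode article linked elsewhere in this feed describes.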
