wmd, to random
@wmd@chaos.social

How can unicode ever be considered complete without a :guillotine: emoji...

cstross, to random
@cstross@wandering.shop

WHY IS THERE NO EMOJI FOR GIBBON?

I WANT TO BE ABLE TO SIGNAL "tangerine shitgibbon" UNAMBIGUOUSLY!

SAYING "🍊💩🐒" IS OPEN TO MISINTERPRETATION!!!

(Although I'm okay with 🍊💩🐒🚔⛓️‍💥)

JdeBP,

@cstross

I can get you to elsewhere within the apes. (-:

U+130FC is a baboon sitting on a basket, which does rather resemble taking a shit.

𓃼

And U+1313D is excrement.

𓄽

U+130E2 is a lying canine, which I mention purely on the off chance that you might have some use for it.

The Egyptian Hieroglyphs section of #Unicode is sometimes quite useful in the modern world, and much underappreciated.

typographica, to languagelearning
@typographica@typo.social

SILICON Digitally Disadvantaged Languages Fellowship Program: Call for Proposals.

SILICON invites applications from keyboard designers, app designers, type designers, Large Language Model designers, language ethnographers, OCR/ML experts, and digital typographic experts for the inaugural SILICON Fellowship Program. Awards of up to $7,000 will be granted.

https://docs.google.com/forms/d/e/1FAIpQLSdx4mUMwhHNu1Lgxr0bKlcYcTNY3WdQWQQyVKgCzXe9SAPYLA/viewform

elilla, to random
@elilla@transmom.love

everybody's hyped for the new 1FAE9 FACE WITH BAGS UNDER EYES, which I mean, :big_mood:, but for you appreciators of text-presentation glyphs, might I draw your attention to the new range "Symbols for Legacy Computing Supplement" (1CC00–1CEBF) that includes several gaming sprites from retrocomputer codesets including Pac-Man, a full set of Space Invaders, 1CC96 FLAPPING BIRD and so on—as well as a full set of box characters for emulators?

https://www.unicode.org/charts/PDF/Unicode-16.0/U160-1CC00.pdf

Another section showing some old terminal drawing elements ("white lower left pointer", "two rings aligned horizontally", "inverse black diamond" etc.) and more game sprites (such as tanks and racing cars and fish in various positions).

exegete, to hebrew
@exegete@autonomous.zone

I just stumbled onto something horrifying: neo-Nazi symbolism seemingly hidden away in Unicode. The first Unicode codepoint, corresponding to א, is U+05D0. The integer corresponding to the hex? 1488. You can't convince me that was a mere coincidence.

Who planned this???

krans,
@krans@mastodon.me.uk

@exegete The first codepoint is U+05BE HEBREW PUNCTUATION MAQAF.

It should be possible to check the archive of WG4 minutes and papers to look for corroborating evidence for whether there is a conspiracy or a coincidence. Members of the Unicode standards body hang out on Mastodon and may be interested in investigating further.
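The arithmetic in this exchange is easy to check. A minimal Python sketch, using only the standard `unicodedata` module, converts the hex value to an integer and looks up the names of the two code points mentioned above (the variable names are illustrative):

```python
import unicodedata

# U+05D0 as an integer: 0x05D0 = 5*256 + 13*16 + 0
alef_codepoint = int("05D0", 16)
print(alef_codepoint)

# Official character names of the two code points mentioned in the thread.
alef_name = unicodedata.name(chr(0x05D0))
maqaf_name = unicodedata.name(chr(0x05BE))
print(alef_name)
print(maqaf_name)
```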

spacemagick, to Futurology
@spacemagick@mastodon.social
Edent, to webdev
@Edent@mastodon.social

🆕 blog! “Accents and eBooks”

By and large, the English language doesn't use diacritical marks. Even our loanwords are stripped of them; we drink in a cafe rather than the more pretentious café. This has a consequence for HTML and, by extension, eBooks. As a quick primer, modern computing gives us two main ways of displaying a letter with an […]

👀 Read more: https://shkspr.mobi/blog/2024/05/accents-and-ebooks/

blog, to webdev
@blog@shkspr.mobi

Accents and eBooks
https://shkspr.mobi/blog/2024/05/accents-and-ebooks/

By and large, the English language doesn't use diacritical marks. Even our loanwords are stripped of them; we drink in a cafe rather than the more pretentious café. This has a consequence for HTML and, by extension, eBooks.

As a quick primer, modern computing gives us two main ways of displaying a letter with an accent. The first is simple - encode every single accented letter as a separate "pre-composed" character. So è (U+00E8), é (U+00E9), ê (U+00EA), and ë (U+00EB) are all stored as different codepoints.
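As a rough illustration of the pre-composed approach, this Python sketch lists the code point and official name of each of those four letters. The characters are written as escapes so the result does not depend on how this page was encoded; the variable names are just for the example.

```python
import unicodedata

# The four pre-composed letters, written as escapes on purpose.
precomposed = "\u00e8\u00e9\u00ea\u00eb"

codepoints = [f"U+{ord(ch):04X}" for ch in precomposed]
names = [unicodedata.name(ch) for ch in precomposed]

# Each accented letter has its own code point and its own name.
for cp, name in zip(codepoints, names):
    print(cp, name)
```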

But this seems a little inefficient and can make it hard to search through text for an exact lexical match.

So there is a second way to add accents. You take the base character - e (U+0065) - and then apply a separate "combining" accent character to it. For example the combining accent ◌́ (U+0301). That means you can add an accent to áńý ĺét́t́éŕ!́
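To make the difference concrete, here is a small Python sketch that builds "café" both ways and compares them. It uses the standard `unicodedata.normalize` function; the example word and variable names are my own choice, not anything from the eBook.

```python
import unicodedata

precomposed = "caf\u00e9"   # é stored as a single code point
combining = "cafe\u0301"    # plain e followed by the combining acute U+0301

# The two strings look the same on screen but differ underneath.
same_before = precomposed == combining
lengths = (len(precomposed), len(combining))

# NFC composes base + accent where a pre-composed form exists;
# NFD splits pre-composed letters back into base + accent.
same_nfc = unicodedata.normalize("NFC", combining) == precomposed
same_nfd = unicodedata.normalize("NFD", precomposed) == combining

print(same_before, lengths, same_nfc, same_nfd)
```

Normalizing both sides to the same form before comparing is one common way to make searches behave, though which form suits a given toolchain is a judgment call.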

Note, the accent ◌́ (U+0301) is separate from the character ´ (U+00B4). In fact, most accents have a pre-composed, combining, and separate form. This, understandably, causes much confusion!
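One way to tell the three forms apart is to ask the Unicode database about them. This Python sketch, with an illustrative `forms` dictionary, prints the name, general category, and combining class for the combining, separate (spacing), and pre-composed acute:

```python
import unicodedata

forms = {
    "combining": "\u0301",    # attaches to the previous character
    "spacing": "\u00b4",      # stands on its own
    "precomposed": "\u00e9",  # letter and accent in one code point
}

# For each form: (name, general category, canonical combining class).
report = {
    label: (unicodedata.name(ch), unicodedata.category(ch), unicodedata.combining(ch))
    for label, ch in forms.items()
}

for label, info in report.items():
    print(label, info)
```

Only the combining form has the category `Mn` (nonspacing mark) and a non-zero combining class, which is why only that form attaches itself to the preceding letter.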

Here's a good example. I was reading the excellent Fallen Idols, when I noticed this typesetting bug.

The phrase reads "Swords of Qadisiyyah", but the macron that belongs over the letter "a" has been rendered as a separate dash.

It's always hard to transliterate languages. The Victory Arch in Iraq is known as قوس النصر, and usually written in English as the "Swords of Qādisīyah".

Examining the HTML code in the eBook, it was obvious that the publishers had used a macron ¯ (U+00AF) rather than the combining version ◌̄ (U+0304).
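A repair for this kind of bug might look like the following Python sketch. It assumes the spacing macron was typed directly after the letter it was meant to sit on; the `broken` string is my hypothetical reconstruction, not the actual markup from the eBook, and `fix_spacing_macrons` is a made-up helper name.

```python
import re
import unicodedata

SPACING_MACRON = "\u00af"     # standalone macron
COMBINING_MACRON = "\u0304"   # combining macron

def fix_spacing_macrons(text: str) -> str:
    """Swap a spacing macron that directly follows a letter for the
    combining macron, then compose the result to NFC."""
    # [^\W\d_] matches any Unicode letter (word chars minus digits and "_").
    repaired = re.sub(
        rf"([^\W\d_]){SPACING_MACRON}",
        rf"\1{COMBINING_MACRON}",
        text,
    )
    return unicodedata.normalize("NFC", repaired)

# Hypothetical broken text: macron placed after each affected vowel.
broken = "Swords of Qa\u00afdisi\u00afyah"
fixed = fix_spacing_macrons(broken)
print(fixed)
```

A macron that stands alone, with no letter before it, is left untouched, since it may be intentional.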

I've reported it to the publisher. I've no idea if they'll fix it in a subsequent re-issue.

