youronlyone, to linguistics
@youronlyone@c.im avatar

It's easier to use Hangeul and Kana to write pronunciations of Filipino words, than to use Filipino diacritical marks.

  1. The last time we were taught about Filipino diacritical marks was in Grade 4 or 5 (early 90s). I don't know why, but after that diacritical marks were totally forgotten.

  2. Tracking it down, IIRC, it was in the late 90s / early 00s that they were officially removed by the KWF.

  3. Sometime in 2010, the KWF brought diacritical marks back, though in limited form.

  4. In 2014 (or was it 2016?) the KWF introduced a new diacritical mark, the Filipino schwa. It didn't exist before. There are only like 4 Philippine languages with a schwa vowel. They added it in Filipino so words from those Philippine languages can be integrated into the Filipino language.

Here's my problem: no matter how many times I read the KWF document on Filipino diacritical marks, I can't get my head around it. 🤪 Either I understood the marks differently, or I remembered them incorrectly. 🤷🏽‍♂️ Or! I've been pronouncing a lot of words wrongly! 🤦🏽‍♂️

However, when I use Hangeul and Kana, I don't have to worry about diacritical marks. Both scripts have stable pronunciations, not like Latin characters where we have to use diacritical marks.

The only catch: the reader has to be able to read Hangeul or Kana, which most can't. 🤔 So, back to trying to get a grasp of Filipino diacritical marks. 🤯


Am I right that the Filipino diacritical marks represent the sound?

Examples:

  • e = neutral = abrupt soft stop?
  • è = high to low = abrupt hard stop? (paiwa?)
  • é = low to high = malumay? (malumanay?)
  • ê = low to high to low = ??
  • ë = the new Filipino schwa (no idea, since I don't speak the few Philippine languages where a Filipino schwa is needed).

Any experts out there?

(In the revived diacritical marks, we no longer use ē. IIRC, it used to represent a long vowel sound.)

#Wika #Language #Filipino #Tagalog #Latin #Hangeul #Kana #LearningFilipino #MatutoMagFilipino @pilipinas @philippines @pinoy

silsinn9821,

@youronlyone So, in other words, they made it so that even old computers that can do #ISO-8859-1 / #Windows-1252 but not #Unicode can still type such words with diacritics via Alt+NumKey combinations. But ē only exists in Unicode (maybe it first showed up in #Windows-1257, but that was only used for Baltic languages), so it can't be typed via Alt+NumKey codes (& not everyone knows that CharMap is a thing).
@pilipinas @philippines @pinoy
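
The code-page claim in the reply above can be checked directly. This is a minimal sketch using Python's built-in codecs; the helper names `encodable` and `support_table` are made up for illustration, and the check says nothing about which Alt codes a given keyboard layout actually accepts.

```python
# Which accented "e" letters have a byte value in the legacy single-byte code pages?
LETTERS = ["è", "é", "ê", "ë", "ē"]
CODECS = ["latin-1", "cp1252", "cp1257"]  # ISO-8859-1, Windows-1252, Windows-1257


def encodable(ch: str, codec: str) -> bool:
    """Return True if `ch` can be encoded in the given code page."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def support_table() -> dict:
    """Map each letter to a {codec: bool} dictionary."""
    return {ch: {codec: encodable(ch, codec) for codec in CODECS} for ch in LETTERS}


if __name__ == "__main__":
    for ch, row in support_table().items():
        print(f"U+{ord(ch):04X} {ch}", row)
```

In this check, è, é, ê and ë encode in both ISO-8859-1 and Windows-1252, while ē (U+0113) encodes only in Windows-1257.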

lpwaterhouse, to random
@lpwaterhouse@ioc.exchange avatar

I am currently designing a small toy-language and was considering making all strings proper objects and all source files utf-8. Lo and behold, Unicode has recently published some guidance: http://www.unicode.org/reports/tr55/ I am, however, rather deeply concerned about the general strong preference for over , e.g. as recommended for identifiers. I get wanting to allow people to use their own language and script wherever possible, and therefore recommending switching from e.g. requiring type names to start with an upper-case character to blocking an initial lower-case character, thereby allowing the use of unicameral (without upper and lower case) scripts. But I have this deep gut-feeling that while the TR certainly solves some existing classes, it also opens up a huge amount of new ones with this general attitude. I haven't yet gone through the TR with a fine-toothed comb to allay that fear, but I'd appreciate input from anyone that has thoughts on the matter.
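
The two identifier rules contrasted in the post above can be sketched as follows. This is only an illustration, not the TR's reference algorithm: it borrows Python's own `str.isidentifier()` as a stand-in for a Unicode identifier check, and both function names are hypothetical.

```python
def type_name_requires_uppercase(name: str) -> bool:
    """Old-style rule: a type name must start with an upper-case letter."""
    return name.isidentifier() and name[0].isupper()


def type_name_blocks_lowercase(name: str) -> bool:
    """Relaxed rule: a type name must NOT start with a lower-case letter."""
    return name.isidentifier() and not name[0].islower()


if __name__ == "__main__":
    # "你好" is written in a unicameral script: its characters are neither upper nor lower case.
    for candidate in ["Point", "point", "你好", "Точка", "точка"]:
        print(candidate,
              type_name_requires_uppercase(candidate),
              type_name_blocks_lowercase(candidate))
```

Under the first rule `你好` is rejected as a type name; under the second it is accepted, while `point` and `точка` are still rejected. Note that the relaxed rule also lets through anything else that is not lower case, such as a leading underscore, which is one small example of the wider surface the post worries about.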

Edent, to random
@Edent@mastodon.social avatar

Of course, we all know what the Fab Four enjoyed a bit of…

U+1FAB4 😉

worldsendless, to emacs
@worldsendless@qoto.org avatar

Today I am reminded that the difference between lazy "a la" and correct "à la" is called a "grave accent," not the pinyin 4th tone. We are doing French-English, not Chinese-Latin characters! In Unicode that's "LATIN SMALL LETTER A WITH GRAVE"
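
The formal character name can be looked up instead of recalled from memory. A minimal sketch with Python's standard `unicodedata` module; `describe` is a made-up helper name.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and formal Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    print(describe("à"))       # U+00E0 LATIN SMALL LETTER A WITH GRAVE
    print(describe("\u0300"))  # U+0300 COMBINING GRAVE ACCENT
    print(describe("`"))       # U+0060 GRAVE ACCENT
```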

TheDJ, to Wikipedia
@TheDJ@mastodon.social avatar

Wikimedia Foundation Joins as an Associate Member of the Unicode Consortium

https://blog.unicode.org/2024/03/wikimedia-foundation-joins-as-associate.html

ib, to ancientneareast

is going to get some additional characters.
@antiquidons @ancientneareast @archaeodons
blog.unicode.org/2024/02/unico…
Unicode 16.0 Alpha Review Opens for Feedback

villares, to python Portuguese
@villares@ciberlandia.pt avatar
Taffer, (edited ) to Discord
@Taffer@mastodon.gamedev.place avatar

How do you figure out what font(s) Discord desktop is using? Mine currently displays Japanese and Chinese text as Unicode "oh noes, I can't find a glyph" rectangles. There aren't any font settings in the app other than size. 🤷

Taffer,
@Taffer@mastodon.gamedev.place avatar

Looks like OpenSUSE Tumbleweed didn't install the Noto CJK fonts. Fixed it!

MegaMichelle, to random
@MegaMichelle@a2mi.social avatar

What the hell is the deal with the "soon" emoji 🔜 . How did that get into the standard? It's got text in English fer chrissake. Who needed this so badly that this symbol was in sufficient circulation that they had to standardize it?

hywan, to rust
@hywan@fosstodon.org avatar

USV, https://github.com/sixarm/usv.

Unicode Separated Values (USV) data markup for units, records, groups, files, streaming, and more.

A better CSV, TSV, or ASV. It uses existing UTF-8 symbols. An RFC for the IETF has been submitted, https://www.ietf.org/rfc/rfc4180.txt.

It comes with Rust crates. Even converters, like usv-to-csv and csv-to-usv (https://github.com/SixArm/csv-to-usv-rust-crate and https://github.com/SixArm/usv-to-csv-rust-crate).

Pretty neat and clever!
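
To give a feel for the idea, here is a rough sketch of splitting USV-style text in Python. It assumes, based on my reading of the project's README, that units end with ␟ (U+241F) and records end with ␞ (U+241E). It ignores escaping and the group and file separators, and `parse_units_and_records` is a hypothetical name, not part of the SixArm crates.

```python
UNIT_SEP = "\u241F"    # ␟ SYMBOL FOR UNIT SEPARATOR
RECORD_SEP = "\u241E"  # ␞ SYMBOL FOR RECORD SEPARATOR


def parse_units_and_records(text: str) -> list:
    """Split text into records, and each record into units (simplified)."""
    records = []
    for raw in text.split(RECORD_SEP):
        raw = raw.strip("\r\n")  # tolerate one record per line
        if not raw:
            continue  # nothing after the final record separator
        units = raw.split(UNIT_SEP)
        if units[-1] == "":
            units.pop()  # each unit is terminated, so drop the trailing empty piece
        records.append(units)
    return records


if __name__ == "__main__":
    sample = "name\u241Fscript\u241F\u241E\nHangeul\u241FKorean\u241F\u241E\n"
    print(parse_units_and_records(sample))
```

Because the separators are visible characters that rarely occur in ordinary data, commas, tabs and quotes inside a unit need no special handling in this sketch.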

codepoints, to random

Character Model for the World Wide Web: String Matching

https://www.w3.org/TR/charmod-norm/

by @addison

This document is a trove of knowledge about string processing and what can go wrong with regard to features like normalization.
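
One of the classic pitfalls that normalization addresses is two strings that look identical but differ in code points. A small sketch using Python's standard `unicodedata` module; the helper name `same_text` is invented here, and the choice of normalization form is left to the caller.

```python
import unicodedata


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after applying the same Unicode normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    precomposed = "\u00E9"   # é as a single code point
    decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT
    print(precomposed == decomposed)            # False: different code points
    print(same_text(precomposed, decomposed))   # True under NFC
    # Compatibility forms fold more: the "fi" ligature only matches under NFKC.
    print(same_text("\uFB01", "fi"), same_text("\uFB01", "fi", form="NFKC"))
```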

ailnoth, to random

TIL that any Unicode string is a valid SSID as long as you keep it within 32 bytes. And yes, this includes exotic control characters. And I now have SSIDs that are just random . I challenge everyone to build a wifi fuzzer that finds a Unicode SSID of at most 32 bytes that can crash a device remotely when it scans for stations.
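
The 32-byte limit counts encoded bytes, not visible characters. A quick sketch, assuming the name is sent as UTF-8; `fits_in_ssid` is a hypothetical helper.

```python
MAX_SSID_BYTES = 32


def fits_in_ssid(name: str) -> bool:
    """Return True if the UTF-8 encoding of `name` is at most 32 bytes long."""
    return len(name.encode("utf-8")) <= MAX_SSID_BYTES


if __name__ == "__main__":
    for candidate in ["a" * 32, "a" * 33, "é" * 16, "💩" * 8, "💩" * 9]:
        size = len(candidate.encode("utf-8"))
        print(f"{len(candidate):2d} chars, {size:2d} bytes, fits: {fits_in_ssid(candidate)}")
```

So an emoji-only name runs out of room after eight 4-byte characters, while an ASCII name can use all 32.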

idontlikenames, to art

⸻ unicode character l⸻o⸻n⸻g⸻b⸻o⸻i

kjg, to programming
@kjg@hachyderm.io avatar

I’ve never seen this transcoding error before. Do you think it was supposed to be an emoji?

sebsauvage, to random French
@sebsauvage@framapiaf.org avatar


Un collègue a posté ça. C'est joli :

ᓚᘏᗢ

Edent, to fediverse
@Edent@mastodon.social avatar

🆕 blog! “Internationalise The Fediverse”

We live in the future now. It is OK to use Unicode everywhere. It seems bizarre to me that modern Internet services sometimes "forget" that there's a world outside the Anglosphere. Some people have the temerity to speak foreign languages! And some of those languages have accents on their letters!! Even worse, some …

👀 Read more: https://shkspr.mobi/blog/2024/02/internationalise-the-fediverse/

chris, to plex
@chris@mstdn.games avatar

MusicBrainz Picard is really good, open source and cross-platform. If your music folder is a chaotic mess of unsorted, untagged files, give it a try, it works like magic.

https://picard.musicbrainz.org/?l=en

#plex #musicbrainz

pootriarch,
@pootriarch@eldritch.cafe avatar

for last.fm users dragging their feet like me, because all those 'smart' quotes blow up your last.fm stats: there's an option to dumbify them. i wish i had known this a year earlier
#MusicBrainz #LastFM #picard #unicode

Goffi, (edited ) to random French
@Goffi@mastodon.social avatar

After #GUI, I've now pushed implementation of a #TUI output in #Libervia #CLI frontend, which shows A/V call video streams directly into your terminal! It's using #Kitty or #iTerm2 image protocols, or #Unicode half-blocks (thanks to #termimage)

I'm not aware of any other CLI tools doing something similar (#XMPP or not). It's not as useful as GUI, but it's quite fun :)

Attached are 2 demo videos of call between Libervia and #Conversations, on #Konsole.

#terminal #shell
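
For readers curious how the half-block fallback mentioned in the post above can work: each terminal cell shows two vertically stacked pixels. The sketch below is a monochrome simplification and not how termimage or Libervia actually implement it (real renderers also set foreground and background colours per cell); `render_half_blocks` is a made-up name.

```python
# Map a (top pixel, bottom pixel) pair to one character cell.
HALF_BLOCKS = {
    (0, 0): " ",
    (1, 0): "\u2580",  # ▀ UPPER HALF BLOCK
    (0, 1): "\u2584",  # ▄ LOWER HALF BLOCK
    (1, 1): "\u2588",  # █ FULL BLOCK
}


def render_half_blocks(pixels: list) -> str:
    """Render rows of 0/1 pixels, packing two pixel rows into each text line."""
    lines = []
    for y in range(0, len(pixels), 2):
        top = pixels[y]
        # Pad with an empty row when the image has an odd number of rows.
        bottom = pixels[y + 1] if y + 1 < len(pixels) else [0] * len(top)
        lines.append("".join(HALF_BLOCKS[(t, b)] for t, b in zip(top, bottom)))
    return "\n".join(lines)


if __name__ == "__main__":
    checker = [[(x + y) % 2 for x in range(8)] for y in range(4)]
    print(render_half_blocks(checker))
```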

alan, to kpop
@alan@subdued.social avatar

Somehow I ended up having to explain the finger heart gesture to my coworkers on our standup this morning. Yes, I'm that cool. I've got my finger on the pulse of the youth. Also, who knew that finger heart is already in Unicode?!

🫰

Also, happy to those who celebrate. :)

https://en.wikipedia.org/wiki/Finger_heart

Edent, to fediverse
@Edent@mastodon.social avatar

In theory you should be able to follow this test user:

@你好

But I can't find any Fediverse software which actually supports non-ASCII usernames.

If you are able to see the user, its description, and its avatar - please send me a screenshot 🙂
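
Fediverse servers generally resolve a handle through a WebFinger query, where a non-ASCII username has to be percent-encoded as UTF-8. A sketch of what such a lookup URL could look like; the domain `example.com` and the helper name `webfinger_url` are placeholders, and whether a given server accepts the result is exactly the open question in the post above.

```python
from urllib.parse import quote


def webfinger_url(username: str, domain: str) -> str:
    """Build a WebFinger lookup URL for acct:username@domain (illustrative only)."""
    resource = f"acct:{username}@{domain}"
    # safe="" percent-encodes ":" and "@" as well as the non-ASCII bytes.
    return f"https://{domain}/.well-known/webfinger?resource={quote(resource, safe='')}"


if __name__ == "__main__":
    print(webfinger_url("你好", "example.com"))
```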

gbraad, to random
@gbraad@mastodon.social avatar

30EDD 30EDE

ovid, to random
@ovid@fosstodon.org avatar

Years ago I mentioned "the little-known tapeworm operator": "\x{1F4A9}\x{0327}"

There is no such thing, but the comment confused people. I was just making an obscure (and crass) joke. \x{1F4A9} is the "pile of poo" emoji and \x{0327} is a combining cedilla, that little wormlike accent you'll see in some foreign words: français.

perl -C -E 'say "\x{1F4A9}\x{0327}"'

💩̧

I know others often don't "get" my humor, but these jokes are for me. It's a bonus if others get them.
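
The same two-code-point string can be taken apart in Python. A small sketch using only the standard library; `code_points` is an invented helper name.

```python
import unicodedata

JOKE = "\U0001F4A9\u0327"  # the emoji followed by COMBINING CEDILLA


def code_points(text: str) -> list:
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


if __name__ == "__main__":
    print(code_points(JOKE))  # two code points, rendered as one visible glyph
    # A nonzero combining class marks U+0327 as a mark that attaches to the previous character.
    print(unicodedata.combining("\u0327") != 0)
    # The ç in "français" decomposes into c plus the same combining mark.
    print(code_points(unicodedata.normalize("NFD", "ç")))
```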

ai6yr, to random
@ai6yr@m.ai6yr.org avatar

Bah humbug. U+1FAAB is a battery, U+1FAA8 is a rock.
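
Code points written in U+XXXX notation can be turned back into characters with a couple of lines. A minimal sketch; `from_u_notation` is a made-up name, and the name lookup falls back to a placeholder because older Python builds ship older Unicode data that may not know these characters.

```python
import unicodedata


def from_u_notation(text: str) -> str:
    """Convert a string such as 'U+1FAAB' into the character it denotes."""
    if not text.upper().startswith("U+"):
        raise ValueError(f"not in U+XXXX notation: {text!r}")
    return chr(int(text[2:], 16))


if __name__ == "__main__":
    for notation in ["U+1FAAB", "U+1FAA8"]:
        ch = from_u_notation(notation)
        # Fall back to a placeholder if this Python's Unicode database lacks the name.
        print(notation, ch, unicodedata.name(ch, "<name not in this Unicode database>"))
```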

SteveFaulkner, to accessibility
@SteveFaulkner@mastodon.social avatar