<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
|
|
<html><head><title>UTF-8 Sampler</title>
|
|
|
<META http-equiv="Content-Type" content="text/html; charset=utf-8">
|
|
|
</head><body bgcolor="#ffffff" text="#000000">
|
|
|
<h1><tt>UTF-8 SAMPLER</tt></h1>
|
|
|
|
|
|
<big><big> ¥ · £ · € · $ · ¢ · ₡ · ₢ · ₣ · ₤ · ₥ · ₦ · ₧ · ₨ · ₩ · ₪ · ₫ · ₭ · ₮ · ₯</big></big>
|
|
|
|
|
|
<p>
|
|
|
<blockquote>
|
|
|
Frank da Cruz<br>
|
|
|
<a href="index.html">The Kermit Project - Columbia University</a><br>
|
|
|
New York City<br>
|
|
|
<a href="mailto:fdc@columbia.edu">fdc@columbia.edu</a>
|
|
|
|
|
|
<p>
|
|
|
<i>Last update:</i>
|
|
|
Wed Apr 12 16:54:07 2006
|
|
|
</blockquote>
|
|
|
<p>
|
|
|
<hr>
|
|
|
[ <a href="http://www.columbia.edu/~fdc/pace/">PEACE</a> ]
|
|
|
[ <a href="#poetry">Poetry</a> ]
|
|
|
[ <a href="#glass">I Can Eat Glass</a> ]
|
|
|
[ <a href="#quickbrownfox">The Quick Brown Fox</a> ]
|
|
|
[ <a href="#html">HTML Features</a> ]
|
|
|
[ <a href="#credits">Credits, Tools, Commentary</a> ]
|
|
|
<p>
|
|
|
|
|
|
<big><big>U</big>TF-8</big> is an ASCII-preserving encoding method for
|
|
|
<a href="unicode.html">Unicode</a> (ISO 10646), the Universal Character Set
|
|
|
(UCS). The UCS encodes most of the world's writing systems in a single
|
|
|
character set, allowing you to mix languages and scripts within a document
|
|
|
without needing any tricks for switching character sets. This web page is
|
|
|
encoded directly in UTF-8.
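<p>
To make "ASCII-preserving" concrete, here is a minimal sketch in Python (not part of the original page; the helper name <tt>utf8_hex</tt> is made up for this illustration). It prints the UTF-8 bytes of a few characters that appear on this page and checks that plain ASCII text encodes to the same bytes in both ASCII and UTF-8.

```python
# Minimal sketch: inspect the UTF-8 bytes of a few characters used on this page.
# The helper name utf8_hex is hypothetical, chosen only for this illustration.

def utf8_hex(text: str) -> str:
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))

# ASCII letter, pound sign, euro sign, runic letter: 1, 2, 3, and 3 bytes.
for ch in ["A", "\u00a3", "\u20ac", "\u16a0"]:
    print(f"U+{ord(ch):04X} -> {utf8_hex(ch)}")

# "ASCII-preserving": pure ASCII text has identical bytes in ASCII and UTF-8.
title = "UTF-8 SAMPLER"
print(title.encode("utf-8") == title.encode("ascii"))
```

Characters outside ASCII take two or more bytes, which is why a page like this one only displays correctly if the browser is told the charset is UTF-8.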
|
|
|
|
|
|
<p>
|
|
|
|
|
|
As shown <a href="glass.html">HERE</a>,
|
|
|
Columbia University's <a href="k95.html">Kermit 95</a> terminal emulation
|
|
|
software can display UTF-8 plain text in Windows 95, 98, ME, NT, XP, or 2000
|
|
|
when using a monospace Unicode font like <a
|
|
|
href="http://www.monotype.com">Andale Mono WT J</a> or <a
|
|
|
href="http://www.evertype.com/emono/">Everson Mono Terminal</a>, or the more
sparsely populated Courier New, Lucida Console, or Andale Mono. <a
|
|
|
href="ckermit.html">C-Kermit</a> can handle it too,
|
|
|
<a href="http://www.cl.cam.ac.uk/~mgk25/unicode.html">if you have a Unicode
|
|
|
display</a>. As many languages as are representable in your font can be seen
|
|
|
on the screen at the same time.
|
|
|
|
|
|
<p>
|
|
|
|
|
|
This, however, is a Web page. Some Web browsers can handle UTF-8, some can't.
|
|
|
And those that can might not have a sufficiently populated font to work with
|
|
|
(some browsers might pick glyphs dynamically from multiple fonts; Netscape 6
|
|
|
seems to do this).
|
|
|
<a href="http://www.alanwood.net/unicode/fonts.html">CLICK HERE</a>
|
|
|
for a survey of Unicode fonts for Windows.
|
|
|
|
|
|
<p>
|
|
|
|
|
|
The subtitle above shows currency symbols of many lands. If they don't
|
|
|
appear as blobs, we're off to a good start!
|
|
|
|
|
|
<hr>
|
|
|
<h3><a name="poetry">Poetry</a></h3>
|
|
|
|
|
|
From the Anglo-Saxon <a href="http://www.ragweedforge.com/poems.html"><cite>Rune Poem</cite></a> (Rune version):
|
|
|
<p><blockquote>
|
|
|
ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ<br>
|
|
|
ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ<br>
|
|
|
ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬<br>
|
|
|
</blockquote>
|
|
|
<p>
|
|
|
|
|
|
From Laȝamon's <i><a href="http://mesl.itd.umich.edu/b/brut/">Brut</a></i>
|
|
|
(<i>The Chronicles of England</i>, Middle English, West Midlands):
|
|
|
<p>
|
|
|
<blockquote>
|
|
|
An preost wes on leoden, Laȝamon was ihoten<br>
|
|
|
He wes Leovenaðes sone -- liðe him be Drihten.<br>
|
|
|
He wonede at Ernleȝe at æðelen are chirechen,<br>
|
|
|
Uppen Sevarne staþe, sel þar him þuhte,<br>
|
|
|
Onfest Radestone, þer he bock radde.
|
|
|
</blockquote>
|
|
|
<p>
|
|
|
|
|
|
(The third letter in the author's name is Yogh, missing from many fonts;
|
|
|
<a href="st-erkenwald.html">CLICK HERE</a> for another Middle English sample
|
|
|
with some explanation of letters and encoding).
|
|
|
|
|
|
<p>
|
|
|
|
|
|
From the <cite>Tagelied</cite> of
|
|
|
|
|
|
<a href="http://gutenberg.spiegel.de/autoren/eschenba.htm">
|
|
|
<b>Wolfram von Eschenbach</b></a> (Middle High German):
|
|
|
<p><blockquote>
|
|
|
Sîne klâwen durh die wolken sint geslagen,<br>
|
|
|
er stîget ûf mit grôzer kraft,<br>
|
|
|
ich sih in grâwen tägelîch als er wil tagen,<br>
|
|
|
den tac, der im geselleschaft<br>
|
|
|
erwenden wil, dem werden man,<br>
|
|
|
den ich mit sorgen în verliez.<br>
|
|
|
ich bringe in hinnen, ob ich kan.<br>
|
|
|
sîn vil manegiu tugent michz leisten hiez.<br>
|
|
|
</blockquote><p>
|
|
|
|
|
|
Some lines of
|
|
|
<a href="http://users.hol.gr/~artemis/odysseas_elytis.htm">
|
|
|
<b>Odysseus Elytis</b></a> (Greek):
|
|
|
|
|
|
<blockquote>
|
|
|
Τη γλώσσα μου έδωσαν ελληνική<br>
|
|
|
το σπίτι φτωχικό στις αμμουδιές του Ομήρου.<br>
|
|
|
Μονάχη έγνοια η γλώσσα μου στις αμμουδιές του Ομήρου.<br>
|
|
|
<p>
|
|
|
από το Άξιον Εστί<br>
|
|
|
του Οδυσσέα Ελύτη
|
|
|
</blockquote>
|
|
|
|
|
|
<p>
|
|
|
|
|
|
The first stanza of
|
|
|
<a href="http://www.ocf.berkeley.edu/%7Eleong/Russkaya%20Literatura/Aleksandr%20Sergeevich%20Pushkin.htm"><b>Pushkin</b></a>'s <cite>Bronze Horseman</cite> (Russian):<br>
|
|
|
<p><blockquote>
|
|
|
На берегу пустынных волн<br>
|
|
|
Стоял он, дум великих полн,<br>
|
|
|
И вдаль глядел. Пред ним широко<br>
|
|
|
Река неслася; бедный чёлн<br>
|
|
|
По ней стремился одиноко.<br>
|
|
|
По мшистым, топким берегам<br>
|
|
|
Чернели избы здесь и там,<br>
|
|
|
Приют убогого чухонца;<br>
|
|
|
И лес, неведомый лучам<br>
|
|
|
В тумане спрятанного солнца,<br>
|
|
|
Кругом шумел.<br>
|
|
|
</blockquote><p>
|
|
|
|
|
|
<a href="http://www.compling.hu-berlin.de/~johannes/mxedruli/"><b>Šota Rustaveli</b></a>'s Veṗxis Ṭq̇aosani,
|
|
|
<cite>The Knight in the Tiger's Skin</cite> (Georgian):<p>
|
|
|
<blockquote>
|
|
|
ვეპხის ტყაოსანი
|
|
|
შოთა რუსთაველი
|
|
|
<p>
|
|
|
ღმერთსი შემვედრე, ნუთუ კვლა დამხსნას სოფლისა შრომასა,
|
|
|
ცეცხლს, წყალსა და მიწასა, ჰაერთა თანა მრომასა;
|
|
|
მომცნეს ფრთენი და აღვფრინდე, მივჰხვდე მას ჩემსა ნდომასა,
|
|
|
დღისით და ღამით ვჰხედვიდე მზისა ელვათა კრთომაასა.
|
|
|
</blockquote>
|
|
|
<p>
|
|
|
|
|
|
Tamil poetry of Cupiramaniya Paarathiyar,
|
|
|
|
|
|
சுப்ரமணிய பாரதியார் (1882-1921):
|
|
|
|
|
|
<p>
|
|
|
<blockquote>
|
|
|
|
|
|
யாமறிந்த மொழிகளிலே தமிழ்மொழி போல் இனிதாவது எங்கும் காணோம், <br>
|
|
|
பாமரராய் விலங்குகளாய், உலகனைத்தும் இகழ்ச்சிசொலப் பான்மை கெட்டு, <br>
|
|
|
நாமமது தமிழரெனக் கொண்டு இங்கு வாழ்ந்திடுதல் நன்றோ? சொல்லீர்!<br>
|
|
|
தேமதுரத் தமிழோசை உலகமெலாம் பரவும்வகை செய்தல் வேண்டும்.
|
|
|
|
|
|
<p>
|
|
|
|
|
|
</blockquote>
|
|
|
|
|
|
<hr>
|
|
|
<h3><a name="glass">I Can Eat Glass</a></h3>
|
|
|
|
|
|
And from the sublime to the ridiculous, here is a
|
|
|
<a href="#notes">certain phrase¹</a> in an assortment of languages:
|
|
|
|
|
|
<p>
|
|
|
<ol>
|
|
|
<li><b>Sanskrit</b>: काचं शक्नोम्यत्तुम् । नोपहिनस्ति माम् ॥
|
|
|
|
|
|
<li><b>Sanskrit</b> <i>(standard transcription):</i> kācaṃ śaknomyattum; nopahinasti mām.
|
|
|
<li><b>Classical Greek</b>: ὕαλον ϕαγεῖν δύναμαι· τοῦτο οὔ με βλάπτει.
|
|
|
<li><b>Greek</b>: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
|
|
|
<br><b>Etruscan</b>: (NEEDED)
|
|
|
<li><b>Latin</b>: Vitrum edere possum; mihi non nocet.
|
|
|
<li><b>Old French</b>: Je puis mangier del voirre. Ne me nuit.
|
|
|
<li><b>French</b>: Je peux manger du verre, ça ne me fait pas de mal.
|
|
|
<li><b>Provençal / Occitan</b>: Pòdi manjar de veire, me nafrariá pas.
|
|
|
<li><b>Québécois</b>: J'peux manger d'la vitre, ça m'fa pas mal.
|
|
|
<li><b>Walloon</b>: Dji pou magnî do vêre, çoula m' freut nén må.
|
|
|
<br><b>Champenois</b>: (NEEDED)
|
|
|
<br><b>Lorrain</b>: (NEEDED)
|
|
|
<li><b>Picard</b>: Ch'peux mingi du verre, cha m'foé mie n'ma.
|
|
|
<br><b>Corsican</b>: (NEEDED)
|
|
|
<br><b>Jèrriais</b>: (NEEDED)
|
|
|
<li><b>Kreyòl Ayisyen</b>: Mwen kap manje vè, li pa blese'm.
|
|
|
<li><b>Basque</b>: Kristala jan dezaket, ez dit minik ematen.
|
|
|
<li><b>Catalan / Català</b>: Puc menjar vidre, que no em fa mal.
|
|
|
<li><b>Spanish</b>: Puedo comer vidrio, no me hace daño.
|
|
|
<li><b>Aragones</b>: Puedo minchar beire, no me'n fa mal.
|
|
|
<li><b>Galician</b>: Eu podo xantar cristais e non cortarme.
|
|
|
<li><b>Portuguese</b>: Posso comer vidro, não me faz mal.
|
|
|
<li><b>Brazilian Portuguese</b> (<a href="#notes">7</a>):
|
|
|
Posso comer vidro, não me machuca.
|
|
|
<li><b>Caboverdiano</b>: M' podê cumê vidru, ca ta maguâ-m'.
|
|
|
<li><b>Papiamentu</b>: Ami por kome glas anto e no ta hasimi daño.
|
|
|
<li><b>Italian</b>: Posso mangiare il vetro e non mi fa male.
|
|
|
<li><b>Milanese</b>: Sôn bôn de magnà el véder, el me fa minga mal.
|
|
|
<li><b>Roman</b>: Me posso magna' er vetro, e nun me fa male.
|
|
|
<li><b>Napoletano</b>: M' pozz magna' o'vetr, e nun m' fa mal.
|
|
|
<li><b>Sicilian</b>: Puotsu mangiari u vitru, nun mi fa mali.
|
|
|
<li><b>Venetian</b>: Mi posso magnare el vetro, no'l me fa mae.
|
|
|
<li><b>Zeneise</b> <i>(Genovese):</i> Pòsso mangiâ o veddro e o no me fà mâ.
|
|
|
<br><b>Rheto-Romance / Romansch</b>: (NEEDED)
|
|
|
<br><b>Romany / Tsigane</b>: (NEEDED)
|
|
|
<li><b>Romanian</b>: Pot să mănânc sticlă și ea nu mă rănește.
|
|
|
<li><b>Esperanto</b>: Mi povas manĝi vitron, ĝi ne damaĝas min.
|
|
|
<br><b>Pictish</b>: (NEEDED)
|
|
|
<br><b>Breton</b>: (NEEDED)
|
|
|
<li><b>Cornish</b>: Mý a yl dybry gwéder hag éf ny wra ow ankenya.
|
|
|
<li><b>Welsh</b>: Dw i'n gallu bwyta gwydr, 'dyw e ddim yn gwneud dolur i mi.
|
|
|
<li><b>Manx Gaelic</b>: Foddym gee glonney agh cha jean eh gortaghey mee.
|
|
|
<li><b>Old Irish</b> <i>(Ogham):</i> ᚛᚛ᚉᚑᚅᚔᚉᚉᚔᚋ ᚔᚈᚔ ᚍᚂᚐᚅᚑ ᚅᚔᚋᚌᚓᚅᚐ᚜
|
|
|
<li><b>Old Irish</b> <i>(Latin):</i> Con·iccim ithi nglano. Ním·géna.
|
|
|
|
|
|
<li><b>Irish</b>: Is féidir liom gloinne a ithe. Ní dhéanann sí dochar ar bith dom.
|
|
|
|
|
|
<li><b>Scottish Gaelic</b>: S urrainn dhomh gloinne ithe; cha ghoirtich i mi.
|
|
|
<li><b>Anglo-Saxon</b> <i>(Runes):</i>
|
|
|
ᛁᚳ᛫ᛗᚨᚷ᛫ᚷᛚᚨᛋ᛫ᛖᚩᛏᚪᚾ᛫ᚩᚾᛞ᛫ᚻᛁᛏ᛫ᚾᛖ᛫ᚻᛖᚪᚱᛗᛁᚪᚧ᛫ᛗᛖ᛬
|
|
|
<li><b>Anglo-Saxon</b> <i>(Latin):</i> Ic mæg glæs eotan ond hit ne hearmiað me.
|
|
|
<li><b>Middle English</b>: Ich canne glas eten and hit hirtiþ me nouȝt.
|
|
|
<li><b>English</b>: I can eat glass and it doesn't hurt me.
|
|
|
<li><b>English</b> <i>(IPA):</i> [aɪ kæn iːt glɑːs ænd ɪt dɐz nɒt hɜːt miː] (Received Pronunciation)
|
|
|
<li><b>English</b> <i>(Braille):</i> ⠊⠀⠉⠁⠝⠀⠑⠁⠞⠀⠛⠇⠁⠎⠎⠀⠁⠝⠙⠀⠊⠞⠀⠙⠕⠑⠎⠝⠞⠀⠓⠥⠗⠞⠀⠍⠑
|
|
|
<li><b>Lalland Scots / Doric</b>: Ah can eat gless, it disnae hurt us.
|
|
|
<br><b>Glaswegian</b>: (NEEDED)
|
|
|
<li><b>Gothic</b> (<a href="#notes">4</a>):
|
|
|
𐌼𐌰𐌲
|
|
|
𐌲𐌻𐌴𐍃
|
|
|
𐌹̈𐍄𐌰𐌽,
|
|
|
𐌽𐌹
|
|
|
𐌼𐌹𐍃
|
|
|
𐍅𐌿
|
|
|
𐌽𐌳𐌰𐌽
|
|
|
𐌱𐍂𐌹𐌲𐌲𐌹𐌸.
|
|
|
<li><b>Old Norse</b> <i>(Runes):</i> ᛖᚴ ᚷᛖᛏ ᛖᛏᛁ
|
|
|
ᚧ ᚷᛚᛖᚱ ᛘᚾ
|
|
|
ᚦᛖᛋᛋ ᚨᚧ ᚡᛖ
|
|
|
ᚱᚧᚨ ᛋᚨᚱ
|
|
|
|
|
|
<li><b>Old Norse</b> <i>(Latin):</i> Ek get etið gler án þess að verða sár.
|
|
|
|
|
|
<li><b>Norsk / Norwegian (Nynorsk):</b> Eg kan eta glas utan å skada meg.
|
|
|
<li><b>Norsk / Norwegian (Bokmål):</b> Jeg kan spise glass uten å skade meg.
|
|
|
<br><b>Føroyskt / Faroese</b>: (NEEDED)
|
|
|
<li><b>Íslenska / Icelandic</b>: Ég get etið gler án þess að meiða mig.
|
|
|
<li><b>Svenska / Swedish</b>: Jag kan äta glas utan att skada mig.
|
|
|
<li><b>Dansk / Danish</b>: Jeg kan spise glas, det gør ikke ondt på mig.
|
|
|
<li><b>Soenderjysk</b>: Æ ka æe glass uhen at det go mæ naue.
|
|
|
<li><b>Frysk / Frisian</b>: Ik kin glês ite, it docht me net sear.
|
|
|
<!-- <li><b>Nederlands / Dutch</b>: Ik kan glas eten, het doet mij geen pijn. -->
|
|
|
<!-- <li><b>Nederlands / Dutch</b>: Ik kan glas eten zonder dat het
|
|
|
mij
|
|
|
schaadt. -->
|
|
|
<!-- <li><tt>Dutch: Ik kan glas eten, maar dat doet mij geen kwaad.</tt> -->
|
|
|
<li><b>Nederlands / Dutch</b>: Ik kan glas eten, het doet
|
|
|
mij
|
|
|
geen kwaad.
|
|
|
|
|
|
|
|
|
<LI><B>Kirchröadsj/Bôchesserplat</B>: Iech ken glaas èèse, mer 't deet miech
|
|
|
jing pieng.</LI>
|
|
|
|
|
|
<li><b>Afrikaans</b>: Ek kan glas eet, maar dit doen my nie skade nie.
|
|
|
<li><b>Lëtzebuergescht / Luxemburgish</b>: Ech kan Glas iessen, daat deet mir nët wei.
|
|
|
<li><b>Deutsch / German</b>: Ich kann Glas essen, ohne mir weh zu tun.
|
|
|
<li><b>Ruhrdeutsch</b>: Ich kann Glas verkasematuckeln, ohne dattet mich wat jucken tut.
|
|
|
<li><b>Langenfelder Platt</b>:
|
|
|
Isch kann Jlaas kimmeln, uuhne datt mich datt weh dääd.
|
|
|
<li><b>Lausitzer Mundart</b> ("Lusatian"): Ich koann Gloos assn und doas
|
|
|
dudd merr ni wii.
|
|
|
<li><b>Odenwälderisch</b>: Iech konn glaasch voschbachteln ohne dass es mir ebbs daun doun dud.
|
|
|
<li><b>Sächsisch / Saxon</b>: 'sch kann Glos essn, ohne dass'sch mer wehtue.
|
|
|
<li><b>Pfälzisch</b>: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud.
|
|
|
<li><b>Schwäbisch / Swabian</b>: I kå Glas frässa, ond des macht mr nix!
|
|
|
<li><b>Bayrisch / Bavarian</b>: I koh Glos esa, und es duard ma ned wei.
|
|
|
<li><b>Allemannisch</b>: I kaun Gloos essen, es tuat ma ned weh.
|
|
|
<li><b>Schwyzerdütsch</b>: Ich chan Glaas ässe, das tuet mir nöd weeh.
|
|
|
<li><b>Hungarian</b>: Meg tudom enni az üveget, nem lesz tőle bajom.
|
|
|
<li><b>Suomi / Finnish</b>: Voin syödä lasia, se ei vahingoita minua.
|
|
|
<li><b>Sami (Northern)</b>: Sáhtán borrat lása, dat ii leat bávččas.
|
|
|
<li><b>Erzian</b>: Мон ярсан
|
|
|
суликадо, ды
|
|
|
зыян
|
|
|
эйстэнзэ а
|
|
|
ули.
|
|
|
<br><b>Karelian</b>: (NEEDED)
|
|
|
<br><b>Vepsian</b>: (NEEDED)
|
|
|
<br><b>Votian</b>: (NEEDED)
|
|
|
<br><b>Livonian</b>: (NEEDED)
|
|
|
<li><b>Estonian</b>: Ma võin klaasi süüa, see ei tee mulle midagi.
|
|
|
<li><b>Latvian</b>: Es varu ēst stiklu, tas man nekaitē.
|
|
|
<li><b>Lithuanian</b>: Aš galiu valgyti stiklą ir jis manęs nežeidžia
|
|
|
<br><b>Old Prussian</b>: (NEEDED)
|
|
|
<br><b>Sorbian</b> (Wendish): (NEEDED)
|
|
|
<li><b>Czech</b>: Mohu jíst sklo, neublíží mi.
|
|
|
<li><b>Slovak</b>: Môžem jesť sklo. Nezraní ma.
|
|
|
<li><b>Polska / Polish</b>: Mogę jeść szkło i mi nie szkodzi.
|
|
|
<li><b>Slovenian:</b> Lahko jem steklo, ne da bi mi škodovalo.
|
|
|
<li><b>Croatian</b>: Ja mogu jesti staklo i ne boli me.
|
|
|
<li><b>Serbian</b> <i>(Latin):</i> Mogu jesti staklo a da mi ne škodi.
|
|
|
<li><b>Serbian</b> <i>(Cyrillic):</i> Могу јести стакло
|
|
|
а
|
|
|
да ми
|
|
|
не
|
|
|
шкоди.
|
|
|
<li><b>Macedonian:</b> Можам да јадам стакло, а не ме штета.
|
|
|
<li><b>Russian</b>: Я могу есть стекло, оно мне не вредит.
|
|
|
<li><b>Belarusian</b> <i>(Cyrillic):</i> Я магу есці шкло, яно мне не шкодзіць.
|
|
|
<li><b>Belarusian</b> <i>(Lacinka):</i> Ja mahu jeści škło, jano mne ne škodzić.
|
|
|
<li><b>Ukrainian</b>: Я можу їсти шкло, й воно мені не пошкодить.
|
|
|
<!-- <li><b>Bulgarian</b>: Мога да ям стъкло и не ме боли. -->
|
|
|
<li><b>Bulgarian</b>: Мога да ям стъкло, то не ми вреди.
|
|
|
|
|
|
<li><b>Georgian</b>: მინას ვჭამ და არა მტკივა.
|
|
|
<li><b>Armenian</b>: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։
|
|
|
<li><b>Albanian</b>: Unë mund të ha qelq dhe nuk më gjen gjë.
|
|
|
<li><b>Turkish</b>: Cam yiyebilirim, bana zararı dokunmaz.
|
|
|
<li><b>Turkish</b> <i>(Ottoman):</i> جام ييه بلورم بڭا ضررى طوقونمز
|
|
|
<li><b>Bangla / Bengali</b>:
|
|
|
আমি কাঁচ খেতে পারি, তাতে আমার কোনো ক্ষতি হয় না।
|
|
|
<li><b>Marathi</b>: मी काच खाऊ शकतो, मला ते दुखत नाही.
|
|
|
<li><b>Hindi</b>: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.
|
|
|
<li><b>Tamil</b>: நான் கண்ணாடி சாப்பிடுவேன், அதனால் எனக்கு ஒரு கேடும் வராது.
|
|
|
|
|
|
<li><b>Urdu</b><a href="#notes">(2)</a>: <span dir="RTL" lang=UR>
|
|
|
میں کانچ کھا سکتا ہوں اور مجھے تکلیف نہیں ہوتی ۔</span>
|
|
|
<li><b>Pashto</b><a href="#notes">(2)</a>: زه شيشه خوړلې شم، هغه ما نه خوږوي
|
|
|
<li><b>Farsi / Persian</b>: .من می توانم بدونِ احساس درد شيشه بخورم
|
|
|
<li><b>Arabic</b><a href="#notes">(2)</a>: <span dir="RTL" lang=AR>أنا قادر على أكل الزجاج و هذا لا يؤلمني.</span>
|
|
|
<br><B>Aramaic</B>: (NEEDED)
|
|
|
<li><B>Hebrew</B><a href="#notes">(2)</a>: <SPAN dir=rtl lang=HE>אני יכול לאכול זכוכית וזה לא מזיק לי.</SPAN>
|
|
|
<li><B>Yiddish</B><a href="#notes">(2)</a>: <SPAN dir=rtl lang=JI>איך קען עסן גלאָז און עס טוט מיר נישט װײ.</SPAN>
|
|
|
<br><b>Judeo-Arabic</b>: (NEEDED)
|
|
|
<br><b>Ladino</b>: (NEEDED)
|
|
|
<br><b>Gǝʼǝz</b>: (NEEDED)
|
|
|
<br><b>Amharic</b>: (NEEDED)
|
|
|
<li><b>Twi</b>: Metumi awe tumpan, ɜnyɜ me hwee.
|
|
|
<li><b>Hausa</b> (<i>Latin</i>): Inā iya taunar gilāshi kuma in gamā lāfiyā.
|
|
|
<li><b>Hausa</b> (<i>Ajami</i>) <a href="#notes">(2)</a>: <SPAN dir=rtl lang=HA>
|
|
|
إِنا إِىَ تَونَر غِلَاشِ كُمَ إِن غَمَا لَافِىَا</SPAN>
|
|
|
<li><b>Yoruba</b><a href="#notes">(3)</a>: Mo lè je̩ dígí, kò ní pa mí lára.
|
|
|
<li><b>(Ki)Swahili</b>: Naweza kula bilauri na sikunyui.
|
|
|
|
|
|
<li><b>Malay</b>: Saya boleh makan kaca dan ia tidak mencederakan saya.
|
|
|
<li><b>Tagalog</b>: Kaya kong kumain nang bubog at hindi ako masaktan.
|
|
|
<li><b>Chamorro</b>: Siña yo' chumocho krestat, ti ha na'lalamen yo'.
|
|
|
<li><b>Javanese</b>: Aku isa mangan beling tanpa lara.
|
|
|
<li><b>Burmese</b>:
|
|
|
က္ယ္ဝန္တော္၊က္ယ္ဝန္မ မ္ယက္စားနုိင္သည္။ ၎က္ရောင့္
|
|
|
ထိခုိက္မ္ဟု မရ္ဟိပာ။
|
|
|
(8)
|
|
|
|
|
|
<li><B>Vietnamese (quốc ngữ)</B>: Tôi có thể ăn thủy tinh mà không hại gì.
|
|
|
<li><B>Vietnamese (nôm)</B> (<a href="#notes">4</a>): 些 𣎏 世 咹 水 晶 𦓡 空 𣎏 害 咦
|
|
|
<br><b>Khmer</b>: (NEEDED)
|
|
|
<br><b>Lao</b>: (NEEDED)
|
|
|
<li><b>Thai</b>: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ
|
|
|
<li><b>Mongolian</b> <i>(Cyrillic):</i> Би шил идэй чадна, надад хортой биш
|
|
|
<li><b>Mongolian</b> <i>(Classic) (<a href="#notes">5</a>):</i>
|
|
|
ᠪᠢ ᠰᠢᠯᠢ ᠢᠳᠡᠶᠦ ᠴᠢᠳᠠᠨᠠ ᠂ ᠨᠠᠳᠤᠷ ᠬᠣᠤᠷᠠᠳᠠᠢ ᠪᠢᠰᠢ
|
|
|
<br><b>Dzongkha</b>: (NEEDED)
|
|
|
<br><b>Nepali</b>: (NEEDED)
|
|
|
<li><b>Tibetan</b>: ཤེལ་སྒོ་ཟ་ནས་ང་ན་གི་མ་རེད།
|
|
|
<li><b>Chinese</b>: <span lang=zh>我能吞下玻璃而不伤身体。</span>
|
|
|
<li><b>Chinese</b> (Traditional): 我能吞下玻璃而不傷身體。
|
|
|
|
|
|
<li><b>Taiwanese</b><a href="#notes">(6)</a>: Góa ē-tàng chia̍h po-lê, mā bē tio̍h-siong.
|
|
|
<li><b>Japanese</b>: <span lang=ja>私はガラスを食べられます。それは私を傷つけません。</span>
|
|
|
<li><b>Korean</b>: <span lang=ko>나는 유리를 먹을 수 있어요. 그래도 아프지 않아요</span>
|
|
|
<li><b>Bislama</b>: Mi save kakae glas, hemi no save katem mi.<br>
|
|
|
<li><b>Hawaiian</b>: Hiki iaʻu ke ʻai i ke aniani; ʻaʻole nō lā au e ʻeha.<br>
|
|
|
<li><b>Marquesan</b>: E koʻana e kai i te karahi, mea ʻā, ʻaʻe hauhau.
|
|
|
<li><b>Chinook Jargon:</b> Naika məkmək kakshət labutay, pi weyk ukuk munk-sik nay.
|
|
|
<li><b>Navajo</b>: Tsésǫʼ yishą́ągo bííníshghah dóó doo shił neezgai da.
|
|
|
<br><b>Cherokee</b> <i>(and Cree, Ojibwa, Inuktitut, and other Native American languages):</i> (NEEDED)
|
|
|
<br><b>Garifuna</b>: (NEEDED)
|
|
|
<br><b>Gullah</b>: (NEEDED)
|
|
|
<li><b>Lojban</b>: mi kakne le nu citka le blaci .iku'i le se go'i na xrani mi
|
|
|
<li><b>Nórdicg</b>: Ljœr ye caudran créneþ ý jor cẃran.
|
|
|
</ol>
|
|
|
<p>
|
|
|
|
|
|
<i>(Additions, corrections, completions,</i>
|
|
|
<a href="mailto:kermit@columbia.edu"><i>gratefully accepted</i></a><i>.)</i>
|
|
|
|
|
|
<p>
|
|
|
For testing purposes, some of these are repeated in a <b>monospace font</b> . . .
|
|
|
<p>
|
|
|
<ol>
|
|
|
<li><tt>Euro Symbol: €.</tt>
|
|
|
<li><tt>Greek: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.</tt>
|
|
|
<li><tt>Íslenska / Icelandic: Ég get etið gler án þess að meiða mig.</tt>
|
|
|
|
|
|
<li><tt>Polish: Mogę jeść szkło, i mi nie szkodzi.</tt>
|
|
|
<li><tt>Romanian: Pot să mănânc sticlă și ea nu mă rănește.</tt>
|
|
|
<li><tt>Ukrainian: Я можу їсти шкло, й воно мені не пошкодить.</tt>
|
|
|
<li><tt>Armenian: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։</tt>
|
|
|
<li><tt>Georgian: მინას ვჭამ და არა მტკივა.</tt>
|
|
|
<li><tt>Hindi: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.</tt>
|
|
|
<li><tt>Hebrew<a href="#notes">(2)</a>: <SPAN dir=rtl lang=HE>אני יכול לאכול זכוכית וזה לא מזיק לי.</SPAN></tt>
|
|
|
<li><tt>Yiddish<a href="#notes">(2)</a>: <SPAN dir=rtl lang=JI>איך קען עסן גלאָז און עס טוט מיר נישט װײ.</SPAN></tt>
|
|
|
<li><tt>Arabic<a href="#notes">(2)</a>: <span dir="RTL" lang=AR>أنا قادر على أكل الزجاج و هذا لا يؤلمني.</span></tt>
|
|
|
<li><tt>Japanese: <span lang=ja>私はガラスを食べられます。それは私を傷つけません。</span></tt>
|
|
|
<li><tt>Thai: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ</tt>
|
|
|
</ol>
|
|
|
<p>
|
|
|
|
|
|
<b><a name="notes">Notes:</a></b>
|
|
|
|
|
|
<p>
|
|
|
<ol>
|
|
|
|
|
|
<li>The "I can eat glass" phrase and initial translations (about 30 of them)
|
|
|
were borrowed from Ethan Mollick's <a
|
|
|
href="http://hcs.harvard.edu/~igp/glass.html">I Can Eat Glass</a> page
|
|
|
(which disappeared in or about June 2004) and converted to UTF-8. Since
|
|
|
Ethan's original page is gone, I should mention that his purpose was to offer
|
|
|
travelers a phrase they could use in any country that would command a
|
|
|
certain kind of respect, or at least get attention. See <a
|
|
|
href="#credits">Credits</a> for the many additional contributions since
|
|
|
then. When submitting new entries, note that the word "hurt" (if you have a choice)
|
|
|
is used in the sense of "cause harm", "do damage", or "bother", rather than
|
|
|
"inflict pain" or "make sad". In this vein Otto Stolz comments (as do
|
|
|
others further down; personally I think it's better for the purpose of this
|
|
|
page to have extra entries and/or to show a greater repertoire of characters
|
|
|
than it is to enforce a strict interpretation of the word "hurt"!):
|
|
|
|
|
|
<p>
|
|
|
<object>
|
|
|
<blockquote>
|
|
|
<small>
|
|
|
|
|
|
This is the meaning I have translated to the Swabian dialect.
|
|
|
|
|
|
However, I just have noticed that most of the German variants
|
|
|
translate the "inflict pain" meaning. The German example should rather
|
|
|
read:
|
|
|
|
|
|
<p>
|
|
|
<blockquote>
|
|
|
"Ich kann Glas essen ohne mir zu schaden."
|
|
|
</blockquote>
|
|
|
<p>
|
|
|
|
|
|
(The comma fell victim to the 1996 orthographic reform,
|
|
|
cf. <a href="http://www.ids-mannheim.de/reform/e3-1.html#P76"><tt>http://www.ids-mannheim.de/reform/e3-1.html#P76</tt></a>.)
|
|
|
|
|
|
<p>
|
|
|
|
|
|
You may wish to contact the contributors of the following translations
|
|
|
to correct them:
|
|
|
|
|
|
<p>
|
|
|
<ul>
|
|
|
|
|
|
<li> Lëtzebuergescht / Luxemburgish: Ech kan Glas iessen, daat deet mir nët wei.
|
|
|
<li> Lausitzer Mundart ("Lusatian"): Ich koann Gloos assn und doas dudd merr ni wii.
|
|
|
<li> Sächsisch / Saxon: 'sch kann Glos essn, ohne dass'sch mer wehtue.
|
|
|
<li> Bayrisch / Bavarian: I koh Glos esa, und es duard ma ned wei.
|
|
|
<li> Allemannisch: I kaun Gloos essen, es tuat ma ned weh.
|
|
|
<li> Schwyzerdütsch: Ich chan Glaas ässe, das tuet mir nöd weeh.
|
|
|
</ul>
|
|
|
<p>
|
|
|
|
|
|
In contrast, I deem the following translations *alright*:
|
|
|
|
|
|
<p>
|
|
|
<ul>
|
|
|
|
|
|
<li> Ruhrdeutsch: Ich kann Glas verkasematuckeln, ohne dattet mich wat jucken tut.
|
|
|
<li> Pfälzisch: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud.
|
|
|
<li> Schwäbisch / Swabian: I kå Glas frässa, ond des macht mr nix!
|
|
|
</ul>
|
|
|
<p>
|
|
|
|
|
|
(However, you could remove the commas, on account of
|
|
|
<a href="http://www.ids-mannheim.de/reform/e3-1.html#P76"><tt>http://www.ids-mannheim.de/reform/e3-1.html#P76</tt></a>
|
|
|
and
|
|
|
|
|
|
<a href="http://www.ids-mannheim.de/reform/e3-1.html#P72"><tt>http://www.ids-mannheim.de/reform/e3-1.html#P72</tt></a>, respectively.)
|
|
|
|
|
|
<p>
|
|
|
|
|
|
I guess, also these examples translate the <i>wrong</i> sense of "hurt",
|
|
|
though I do not know these languages well enough to assert them
|
|
|
definitely:
|
|
|
|
|
|
<p>
|
|
|
<ul>
|
|
|
|
|
|
<li> Nederlands / Dutch: Ik kan glas eten; het doet mij geen
|
|
|
pijn. <i>(This one has been changed)</i>
|
|
|
<li> Kirchröadsj/Bôchesserplat: Iech ken glaas èèse, mer 't deet miech jing pieng.
|
|
|
|
|
|
</ul>
|
|
|
<p>
|
|
|
|
|
|
In the Romanic languages, the variations on "fa male" (it) are probably
|
|
|
wrong, whilst the variations on "hace daño" (es) and "damaĝas" (Esperanto) are probably correct; "nocet" (la) is definitely right.
|
|
|
|
|
|
<p>
|
|
|
|
|
|
The northern Germanic variants of "skada" are probably right, as are
|
|
|
the Slavic variants of "škodi/шкоди" (se); however the Slavic variants
|
|
|
of " boli" (hv) are probably wrong, as "bolena" means "pain/ache", IIRC.
|
|
|
|
|
|
</small>
|
|
|
</blockquote>
|
|
|
</object>
|
|
|
<p>
|
|
|
|
|
|
The numbering of the samples is arbitrary, done only to keep track of how
|
|
|
many there are, and can change any time a new entry is added. The
|
|
|
arrangement is also arbitrary but with some attempt to group related
|
|
|
examples together. Note: All languages not listed are wanted, not just the
|
|
|
ones that say (NEEDED).
|
|
|
|
|
|
<li><a name="note1">Correct right-to-left display of these languages
|
|
|
depends on the capabilities of your browser.</a> The period should
|
|
|
appear on the left. In the monospace Yiddish example, the Yiddish digraphs
|
|
|
should occupy one character cell.
|
|
|
|
|
|
<li>Yoruba: The third word is Latin letter small 'j' followed by
|
|
|
small 'e' with U+0329, Combining Vertical Line Below. This displays
|
|
|
correctly only if your Unicode font includes the U+0329 glyph and your
|
|
|
browser supports combining diacritical marks. The Indic examples
|
|
|
also include combining sequences.
|
|
|
|
|
|
<li>Includes Unicode 3.1 (or later) characters beyond Plane 0.
|
|
|
|
|
|
<li>The Classic Mongolian example should be vertical, top-to-bottom and
|
|
|
left-to-right, but such display is almost impossible. Also, no font yet
exists that provides the proper ligatures and positional variants for the
|
|
|
characters of this script, which works somewhat like Arabic.
|
|
|
|
|
|
<li>Taiwanese is also known as Holo or Hoklo, and is related to Southern
|
|
|
Min dialects such as Amoy.
|
|
|
Contributed by Henry H. Tan-Tenn, who comments, "The above is
|
|
|
the romanized version, in a script current among Taiwanese Christians since
|
|
|
the mid-19th century. It was invented by British missionaries and saw use in
|
|
|
hundreds of published works, mostly of a religious nature. Most Taiwanese did
|
|
|
not know Chinese characters then, or at least not well enough to read. More
|
|
|
to the point, though, a written standard using Chinese characters has never
|
|
|
developed, so a significant minority of words are represented with different
|
|
|
candidate characters, depending on one's personal preference or etymological
|
|
|
theory. In this sentence, for example, "-tàng", "chia̍h",
|
|
|
"mā" and "bē" are problematic using Chinese characters.
|
|
|
"Góa" (I/me) and "po-lê" (glass) are as written in other Sinitic
|
|
|
languages (e.g. Mandarin, Hakka)."
|
|
|
|
|
|
<li>Wagner Amaral of Pinese &amp; Amaral Associados notes that
|
|
|
the Brazilian Portuguese sentence for
|
|
|
"I can eat glass" should be identical to the Portuguese one, as the word
|
|
|
"machuca" means "inflict pain", or rather "injuries". The words "faz
|
|
|
mal" would more correctly translate as "cause harm".
|
|
|
|
|
|
<li>Burmese: In English the first person pronoun "I" stands for both
|
|
|
genders, male and female. In Burmese (except in the central part of Burma)
|
|
|
the forms are kyundaw (<font
|
|
|
size="+1"
|
|
|
face="Padauk">က္ယ္ဝန္တော္</font>) for male and kyanma (<font
|
|
|
size="+1" face="Padauk">က္ယ္ဝန္မ</font>) for female.
|
|
|
The sample here uses a fully compliant Unicode Burmese font, Padauk, rendered
with the Graphite engine; sadly, it is the one and only such font that exists.
|
|
|
<a href="http://h1.ripway.com/bamarsar/">CLICK HERE</a> to test Burmese
|
|
|
characters.
|
|
|
|
|
|
</ol>
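<p>
Notes 3 and 4 above can be illustrated with a short sketch in Python (an illustration only, not part of the original page; the function name <tt>describe</tt> is hypothetical). It lists the code points behind the Yoruba letter with U+0329 and behind one Gothic letter, together with the number of UTF-8 bytes each takes.

```python
import unicodedata

def describe(text: str) -> list:
    """Return (code point, Unicode name, UTF-8 byte count) for each character."""
    rows = []
    for ch in text:
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((f"U+{ord(ch):04X}", name, len(ch.encode("utf-8"))))
    return rows

# Yoruba: small 'e' followed by U+0329. Two code points, one visible letter,
# so the font and the browser must both handle combining marks.
for row in describe("e\u0329"):
    print(row)

# Gothic letter from beyond Plane 0: one code point that needs four UTF-8 bytes.
for row in describe("\U0001033C"):
    print(row)
```

Software that assumes one code point per visible letter, or at most three UTF-8 bytes per character, will mishandle exactly these samples.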
|
|
|
|
|
|
<hr>
|
|
|
<h3><a name="quickbrownfox">The Quick Brown Fox</a></h3>
|
|
|
|
|
|
The "I can eat glass" sentences do not necessarily show off the orthography of
|
|
|
each language to best advantage. In many alphabetic written languages it is
|
|
|
possible to include all (or most) letters (or "special" characters) in
|
|
|
a single (often nonsense) <i>pangram</i>. These were traditionally used in
|
|
|
typewriter instruction; now they are useful for stress-testing computer fonts
|
|
|
and keyboard input methods. Here are a few examples (SEND MORE):
|
|
|
|
|
|
<p>
|
|
|
<ol>
|
|
|
|
|
|
<li><b>English:</b> The quick brown fox jumps over the lazy dog.
|
|
|
<li><b>Irish:</b> "An ḃfuil do ċroí ag bualaḋ ó ḟaitíos an ġrá a ṁeall lena ṗóg éada ó
|
|
|
ṡlí do leasa ṫú?"
|
|
|
"D'ḟuascail Íosa Úrṁac na hÓiġe Beannaiṫe pór Éava agus Áḋaiṁ."
|
|
|
<li><b>Dutch:</b> Pa's wijze lynx bezag vroom het fikse aquaduct.
|
|
|
<li><b>German: </b> Falsches Üben von Xylophonmusik quält jeden
|
|
|
größeren Zwerg. (1)
|
|
|
<li><b>German: </b> <span lang=da>Im finſteren Jagdſchloß am offenen Felsquellwaſſer patzte der affig-flatterhafte kauzig-höfliche Bäcker über ſeinem verſifften kniffligen C-Xylophon.</span> (2)
|
|
|
<li><b>Swedish:</b> Flygande bäckasiner söka strax hwila på mjuka tuvor.
|
|
|
<li><b>Icelandic:</b> Sævör grét áðan því úlpan var ónýt.
|
|
|
<li><b>Polish:</b> Pchnąć w tę łódź jeża lub ośm skrzyń fig.
|
|
|
<li><b>Czech:</b> Příliš
|
|
|
žluťoučký kůň úpěl
|
|
|
ďábelské kódy.
|
|
|
<li><b>Slovak:</b> Starý kôň na hŕbe
|
|
|
kníh žuje tíško povädnuté
|
|
|
ruže, na stĺpe sa ďateľ
|
|
|
učí kvákať novú ódu o
|
|
|
živote.
|
|
|
<li><b>Russian:</b> В чащах
|
|
|
юга жил-был
|
|
|
цитрус? Да,
|
|
|
но
|
|
|
фальшивый
|
|
|
экземпляр!
|
|
|
ёъ.
|
|
|
|
|
|
<li><b>Bulgarian:</b> Жълтата дюля беше щастлива, че пухът, който цъфна, замръзна като гьон.
|
|
|
|
|
|
<li><b>Sami (Northern):</b> Vuol Ruoŧa geđggiid leat máŋga luosa ja čuovžža.
|
|
|
<li><b>Hungarian:</b> Árvíztűrő tükörfúrógép.
|
|
|
<li><b>Spanish:</b> El pingüino Wenceslao hizo kilómetros bajo exhaustiva lluvia y frío, añoraba a su querido cachorro.
|
|
|
<li><b>Portuguese:</b> O próximo vôo à noite sobre o Atlântico, põe freqüentemente o único médico. (3)
|
|
|
<li><b>French:</b> Les naïfs ægithales hâtifs pondant à Noël où il gèle sont sûrs d'être
|
|
|
déçus et de voir leurs drôles d'œufs abîmés.
|
|
|
|
|
|
<li><b>Esperanto:</b> Eĥoŝanĝo
|
|
|
ĉiuĵaŭde.
|
|
|
|
|
|
<li><b>Hebrew:</b> <span dir="RTL" lang=HE>זה כיף סתם לשמוע איך תנצח קרפד עץ טוב בגן.</span>
|
|
|
|
|
|
<li><b>Japanese</b> (Hiragana):<blockquote>
|
|
|
いろはにほへど ちりぬるを<br>
|
|
|
わがよたれぞ つねならむ<br>
|
|
|
うゐのおくやま けふこえて<br>
|
|
|
あさきゆめみじ ゑひもせず
|
|
|
(4)
|
|
|
</blockquote>
|
|
|
|
|
|
</ol>
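<p>
Whether a sentence really is a pangram can be checked mechanically. The following is a minimal sketch in Python (not from the original page; <tt>missing_letters</tt> and the <tt>german</tt> alphabet string are assumptions made for this illustration). It reports which letters of a chosen alphabet a sentence lacks.

```python
import string
import unicodedata

def missing_letters(sentence: str, alphabet: str = string.ascii_lowercase) -> set:
    """Return the letters of alphabet that never occur in sentence."""
    # NFC joins a base letter and a combining mark into one code point
    # where a precomposed form exists, so "a" + U+0308 is counted as "ä".
    normalized = unicodedata.normalize("NFC", sentence).lower()
    return set(alphabet) - set(normalized)

# Assumed German letter inventory: a-z plus the umlauts and sharp s.
german = string.ascii_lowercase + "äöüß"

print(missing_letters("The quick brown fox jumps over the lazy dog."))
print(missing_letters(
    "Falsches Üben von Xylophonmusik quält jeden größeren Zwerg.", german))
print(sorted(missing_letters(
    "Franz jagt im komplett verwahrlosten Taxi quer durch Bayern", german)))
```

An empty set means every letter of the alphabet is present. The last sentence covers a to z but leaves the four special German letters uncovered.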
|
|
|
<p>
|
|
|
<a name="notes2"><b>Notes:</b></a>
|
|
|
<p>
|
|
|
<ol>
|
|
|
|
|
|
<li>Other phrases commonly used in Germany include: "Ein wackerer Bayer
|
|
|
vertilgt ja bequem zwo Pfund Kalbshaxe" and, more recently, "Franz jagt im
|
|
|
komplett verwahrlosten Taxi quer durch Bayern", but both lack umlauts and
|
|
|
Eszett. Previously, going for the shortest sentence that has all the
|
|
|
umlauts and special characters, I had
|
|
|
"Grüße aus Bärenhöfe
|
|
|
(und Óechtringen)!"
|
|
|
Acute accents are not used in native German words, so I was surprised to
|
|
|
discover "Óechtringen" in the Deutsche Bundespost
|
|
|
Postleitzahlenbuch:
|
|
|
<p>
|
|
|
<blockquote>
|
|
|
<a href="http://www.columbia.edu/~fdc/misc/oechtringen.jpg"><img
|
|
|
src="oechtringen-sm.jpg" alt="Click for full-size image (2.8MB)"></a>
|
|
|
</blockquote>
|
|
|
<p>
|
|
|
It's a small village in eastern Lower Saxony.
|
|
|
The "oe" in this case
|
|
|
turns out to be the Lower Saxon "lengthening e" (Dehnungs-e), which makes the
|
|
|
previous vowel long (used in a number of Lower Saxon place names such as Soest
|
|
|
and Itzehoe), not the "e" that indicates umlaut of the preceding vowel.
|
|
|
Many thanks to the Óechtringen-Namenschreibungsuntersuchungskomitee
|
|
|
(Alex Bochannek, Manfred Erren, Asmus Freytag, Christoph Päper, plus
|
|
|
Werner Lemberg who serves as
|
|
|
Óechtringen-Namenschreibungsuntersuchungskomiteerechtschreibungsprüfer)
|
|
|
|
|
|
for their relentless pursuit of the facts in this case. Conclusion: the
|
|
|
accent almost certainly does not belong on this (or any other native German)
|
|
|
word, but neither can it be dismissed as dirt on the page. To add to the
|
|
|
mystery, it has been reported that other copies of the same edition of the
|
|
|
PLZB do not show the accent! UPDATE (March 2006): David Krings was
|
|
|
intrigued enough by this report to contact the mayor of Ebstorf, of which
|
|
|
Oechtringen is a borough, who responded:
|
|
|
|
|
|
<p>
|
|
|
<blockquote style="font-family:sans-serif;font-size:80%">
|
|
|
Sehr geehrter Mr. Krings,<br>
|
|
|
wenn Oechtringen irgendwo mit einem Akzent auf dem O geschrieben wurde,
|
|
|
dann kann das nur ein Fehldruck sein. Die offizielle Schreibweise lautet
|
|
|
jedenfalls „Oechtringen“.<br>
|
|
|
Mit freundlichen Grüssen<br>
|
|
|
Der Samtgemeindebürgermeister<br>
|
|
|
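i.A. Lothar Jessel
<p>
<i>(In English: Dear Mr. Krings, if Oechtringen has been written anywhere
with an accent on the O, that can only be a misprint. The official spelling
is, in any case, "Oechtringen". With kind regards, the Mayor of the
Samtgemeinde, on behalf of, Lothar Jessel)</i>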
|
|
|
|
|
|
</blockquote>
|
|
|
|
|
|
|
|
|
<p>
|
|
|
<li>From Karl Pentzlin (Kochel am See, Bavaria, Germany):
|
|
|
"This German phrase is suited for display by a Fraktur (broken letter)
|
|
|
font. It contains: all common three-letter ligatures: ffi ffl fft and all
|
|
|
two-letter ligatures required by the Duden for Fraktur typesetting: ch ck ff
|
|
|
fi fl ft ll ſch ſi ſſ ſt tz (all in a
|
|
|
manner such that they are not part of a three-letter ligature), one example of f-l
|
|
|
where German typesetting rules prohibit ligating (marked by a ZWNJ), and all
|
|
|
German letters a...z, ä,ö,ü,ß, ſ [long s]
|
|
|
(all in a manner such that they are not part of a two-letter Fraktur
|
|
|
ligature)."
|
|
|
|
|
|
Otto Stolz notes that "'Schloß' is now spelled 'Schloss', in
|
|
|
contrast to 'größer' (example 4) which has kept its
|
|
|
'ß'. Fraktur has been banned from general use, in 1942, and long-s
|
|
|
(ſ) has ceased to be used with Antiqua (Roman) even earlier (the
|
|
|
latest Antiqua-ſ I have seen is from 1913, but then
|
|
|
I am no expert, so there may well be a later instance)." Later Otto confirms
|
|
|
the latter theory, "Now I've run across a book “Deutsche
|
|
|
Rechtschreibung” (edited by Lutz Mackensen) from 1954 (my reprint
|
|
|
is from 1956) that has kept the Antiqua-ſ in its dictionary part (but
|
|
|
neither in the preface nor in the appendix)."
|
|
|
|
|
|
<p>
|
|
|
|
|
|
<li>Diaeresis is not used in Iberian Portuguese.
|
|
|
|
|
|
<p>
|
|
|
|
|
|
<li>From Yurio Miyazawa: "This poetry contains all the sounds in the
|
|
|
Japanese language and used to be the first thing for children to learn in
|
|
|
their Japanese class. The Hiragana version is particularly neat because it
|
|
|
covers every character in the phonetic Hiragana character set." Yurio also
|
|
|
sent the Kanji version:
|
|
|
|
|
|
<p>
|
|
|
<blockquote>
|
|
|
色は匂へど 散りぬるを<br>
|
|
|
我が世誰ぞ 常ならむ<br>
|
|
|
有為の奥山 今日越えて<br>
|
|
|
浅き夢見じ 酔ひもせず
|
|
|
</blockquote>
|
|
|
|
|
|
</ol>
|
|
|
<p>
|
|
|
<b>Accented Cyrillic:</b>
|
|
|
<p>
|
|
|
|
|
|
<i>(This section contributed by Vladimir Marinov.)</i>
|
|
|
|
|
|
<p>
|
|
|
|
|
|
In Bulgarian it is desirable, customary, or in some cases required to
|
|
|
write accents over vowels. Unfortunately, no computer character sets
|
|
|
contain the full repertoire of accented Cyrillic letters. With Unicode,
|
|
|
however, it is possible to combine any Cyrillic letter with any combining
|
|
|
accent. The appearance of the result depends on the font and the rendering
|
|
|
engine. Here are two examples.
|
|
|
|
|
|
<p>
|
|
|
<ol>
|
|
|
|
|
|
<li>Той видя бялата коса́ по главата и́ и ко́са на рамото и́, и ре́че да и́
|
|
|
рече́: "Пара́та по́ па́ри от па́рата, не ща пари́!", но си поми́сли: "Хей,
|
|
|
помисли́ си! А́ и́ река, а́ е скочила в тази река, която щеше да тече́,
|
|
|
а не те́че."
|
|
|
|
|
|
<p>
|
|
|
|
|
|
<li>По пъ́тя пъту́ват кю́рди и югославя́ни.
|
|
|
|
|
|
</ol>
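<p>
A minimal sketch of how such a letter can be coded in HTML, assuming the
accent is U+0301 (COMBINING ACUTE ACCENT) placed immediately after its base
letter. The word chosen is taken from the first example and is only an
illustration, not a copy of this page's source:

<p>
<blockquote>
<pre>
&lt;!-- assumption: the accent is U+0301 COMBINING ACUTE ACCENT --&gt;

&lt;!-- literal Cyrillic letters, accent as a hexadecimal reference --&gt;
коса&amp;#x301;

&lt;!-- the same word written entirely as numeric character references --&gt;
&amp;#x43A;&amp;#x43E;&amp;#x441;&amp;#x430;&amp;#x301;
</pre>
</blockquote>

<p>
Both forms should display as коса́. Whether the accent sits neatly over the
vowel still depends on the font and the rendering engine.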
|
|
|
|
|
|
<hr>
|
|
|
<h3><a name="html">HTML Features</a></h3>
|
|
|
|
|
|
Here is the Russian alphabet (uppercase only) coded in three
|
|
|
different ways, which should look identical:
|
|
|
|
|
|
<p>
|
|
|
<ol>
|
|
|
<li>АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
|
|
|
<i>(Literal UTF-8)</i>
|
|
|
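<li>&#1040;&#1041;&#1042;&#1043;&#1044;&#1045;&#1046;&#1047;&#1048;&#1049;&#1050;&#1051;&#1052;&#1053;&#1054;&#1055;&#1056;&#1057;&#1058;&#1059;&#1060;&#1061;&#1062;&#1063;&#1064;&#1065;&#1066;&#1067;&#1068;&#1069;&#1070;&#1071;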
|
|
|
<i>(Decimal numeric character reference)</i>
|
|
|
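<li>&#x410;&#x411;&#x412;&#x413;&#x414;&#x415;&#x416;&#x417;&#x418;&#x419;&#x41A;&#x41B;&#x41C;&#x41D;&#x41E;&#x41F;&#x420;&#x421;&#x422;&#x423;&#x424;&#x425;&#x426;&#x427;&#x428;&#x429;&#x42A;&#x42B;&#x42C;&#x42D;&#x42E;&#x42F;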
|
|
|
<i>(Hexadecimal numeric character reference)</i>
|
|
|
</ol>
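<p>
As a small illustration of the three codings (a sketch of the markup, not a
copy of this page's source), here is how the first letter of the alphabet,
А (U+0410, CYRILLIC CAPITAL LETTER A), might be written in each style:

<p>
<blockquote>
<pre>
&lt;!-- literal UTF-8: stored in the file as the two bytes 0xD0 0x90 --&gt;
&lt;li&gt;А

&lt;!-- decimal numeric character reference: 1040 is 0x410 in decimal --&gt;
&lt;li&gt;&amp;#1040;

&lt;!-- hexadecimal numeric character reference --&gt;
&lt;li&gt;&amp;#x410;
</pre>
</blockquote>

<p>
The numeric forms use only ASCII characters, so they survive even if the
page is served or edited in another character set; the literal form is more
compact and readable in a UTF-8 editor.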
|
|
|
|
|
|
<p>
|
|
|
|
|
|
In another test, we use HTML language tags to distinguish Bulgarian, Russian,
|
|
|
and <a href="http://www.tiro.com/transfer/Serbian_Rendering.pdf">Serbian</a>,
|
|
|
which have different italic forms for lowercase
|
|
|
б, г, д, п, and/or т:
|
|
|
<p>
|
|
|
<blockquote>
|
|
|
<table>
|
|
|
<tr>
|
|
|
<td><b>Bulgarian</b>:
|
|
|
<td><span lang=BG>[ бгдпт</span> ]
|
|
|
<td><span lang=BG>[ <i>бгдпт</i></span> ]
|
|
|
<td><span lang=BG><i>Мога да ям стъкло и не ме боли.</i></span>
|
|
|
<tr>
|
|
|
<td><b>Russian</b>:
|
|
|
<td><span lang=RU>[ бгдпт</span> ]
|
|
|
<td><span lang=RU>[ <i>бгдпт</i></span> ]
|
|
|
<td><span lang=RU><i>Я могу есть стекло, это мне не вредит.</i></span>
|
|
|
<tr>
|
|
|
<td><b>Serbian</b>:
|
|
|
<td><span lang=SR>[ бгдпт</span> ]
|
|
|
<td><span lang=SR>[ <i>бгдпт</i></span> ]
|
|
|
<td><span lang=SR><i>Могу јести стакло а да ми не шкоди.</i></span>
|
|
|
</table>
|
|
|
</blockquote>
|
|
|
<p>
|
|
|
|
|
|
<hr>
|
|
|
<h3><a name="credits">Credits, Tools, and Commentary</a></h3>
|
|
|
|
|
|
<dl>
|
|
|
<dt><b>Credits:</b></dt>
|
|
|
<dd>
|
|
|
The "I can eat glass" phrase and the initial collection of translations:
|
|
|
<a href="http://hcs.harvard.edu/~igp/glass.html">Ethan Mollick</a>.
|
|
|
Transcription / conversion to UTF-8: Frank da Cruz.
|
|
|
<b>Albanian:</b> Sindi Keesan.
|
|
|
<b>Afrikaans:</b> Johan Fourie, Kevin Poalses.
|
|
|
<b>Anglo Saxon:</b> Frank da Cruz.
|
|
|
<b>Arabic:</b> Najib Tounsi.
|
|
|
<b>Armenian:</b> Vaçe Kundakçı.
|
|
|
<b>Belarusian:</b> Alexey Chernyak.
|
|
|
<b>Bengali:</b> Somnath Purkayastha, Deepayan Sarkar.
|
|
|
<b>Bislama:</b> Dan McGarry.
|
|
|
<b>Braille:</b> Frank da Cruz.
|
|
|
<b>Bulgarian:</b> Sindi Keesan, Guentcho Skordev, Vladimir Marinov.
|
|
|
<b>Burmese:</b> "cetanapa".
|
|
|
<b>Cabo Verde Creole:</b> Cláudio Alexandre Duarte.
|
|
|
<b>Catalán:</b> Jordi Bancells.
|
|
|
<b>Chinese:</b> Jack Soo, Wong Pui Lam.
|
|
|
<b>Chinook Jargon:</b> David Robertson.
|
|
|
<b>Cornish:</b> Chris Stephens.
|
|
|
<b>Croatian:</b> Marjan Baće.
|
|
|
<b>Czech:</b> Stanislav Pecha, Radovan Garabík.
|
|
|
<b>Dutch:</b> Peter Gotink, Pim Blokland, Rob Daniel, Rob de Wit.
|
|
|
<b>Erzian:</b> Jack Rueter.
|
|
|
<b>Esperanto:</b> Franko Luin, Radovan Garabík.
|
|
|
<b>Estonian:</b> Meelis Roos.
|
|
|
<b>Farsi/Persian:</b> Payam Elahi.
|
|
|
<b>Finnish:</b> Sampsa Toivanen.
|
|
|
<b>French:</b> Luc Carissimo, Anne Colin du Terrail, Sean M. Burke.
|
|
|
<b>Galician:</b> Laura Probaos.
|
|
|
<b>Georgian:</b> Giorgi Lebanidze.
|
|
|
<b>German:</b> Christoph Päper, Otto Stolz, Karl Pentzlin, David Krings,
|
|
|
Frank da Cruz.
|
|
|
<b>Gothic:</b> Aurélien Coudurier.
|
|
|
<b>Greek:</b> Ariel Glenn, Constantine Stathopoulos, Siva Nataraja.
|
|
|
<b>Hebrew:</b> Jonathan Rosenne, Tal Barnea.
|
|
|
<b>Hausa:</b> Malami Buba, Tom Gewecke.
|
|
|
<b>Hawaiian:</b> na Hauʻoli Motta, Anela de Rego, Kaliko Trapp.
|
|
|
<b>Hindi:</b> Shirish Kalele.
|
|
|
<b>Hungarian:</b> András Rácz, Mark Holczhammer.
|
|
|
<b>Icelandic:</b> Andrés Magnússon, Sveinn Baldursson.
|
|
|
<b>International Phonetic Alphabet (IPA):</b> Siva Nataraja / Vincent Ramos.
|
|
|
<b>Irish:</b> Michael Everson, Marion Gunn, James Kass, Curtis Clark.
|
|
|
<b>Italian:</b> Thomas De Bellis.
|
|
|
<b>Japanese:</b> Makoto Takahashi, Yurio Miyazawa.
|
|
|
<b>Kirchröadsj:</b> Roger Stoffers.
|
|
|
<b>Kreyòl:</b> Sean M. Burke.
|
|
|
<b>Korean:</b> Jungshik Shin.
|
|
|
<b>Langenfelder Platt:</b> David Krings.
|
|
|
<b>Lëtzebuergescht:</b> Stefaan Eeckels.
|
|
|
<b>Lithuanian:</b> Gediminas Grigas.
|
|
|
<b>Lojban:</b> Edward Cherlin.
|
|
|
<b>Lusatian:</b> Ronald Schaffhirt.
|
|
|
<b>Macedonian:</b> Sindi Keesan.
|
|
|
<b>Malay:</b> Zarina Mustapha.
|
|
|
<b>Manx:</b> Éanna Ó Brádaigh.
|
|
|
<b>Marathi:</b> Shirish Kalele.
|
|
|
<b>Marquesan:</b> Kaliko Trapp.
|
|
|
<b>Middle English:</b> Frank da Cruz.
|
|
|
<b>Milanese:</b> Marco Cimarosti.
|
|
|
<b>Mongolian:</b> Tom Gewecke.
|
|
|
<b>Napoletano:</b> Diego Quintano.
|
|
|
<b>Navajo:</b> Tom Gewecke.
|
|
|
<a href="http://www.langmaker.com/db/mdl_nordicg.htm"><b>Nórdicg</b></a>:
|
|
|
Yẃlyan Rott.
|
|
|
<b>Norwegian:</b> Herman Ranes.
|
|
|
<b>Odenwälderisch:</b> Alexander Heß.
|
|
|
<b>Old Irish:</b> Michael Everson.
|
|
|
<b>Old Norse:</b> Andrés Magnússon.
|
|
|
<b>Papiamentu:</b> Bianca and Denise Zanardi.
|
|
|
<b>Pashto:</b> N.R. Liwal.
|
|
|
<b>Pfälzisch:</b> Dr. Johannes Sander.
|
|
|
<b>Picard:</b> Philippe Mennecier.
|
|
|
<b>Polish:</b> Juliusz Chroboczek, Paweł Przeradowski.
|
|
|
<b>Portuguese:</b> "Cláudio" Alexandre Duarte, Bianca and Denise
|
|
|
Zanardi, Pedro Palhoto Matos, Wagner Amaral.
|
|
|
<b>Québécois:</b> Laurent Detillieux.
|
|
|
<b>Roman:</b> Pierpaolo Bernardi.
|
|
|
<b>Romanian:</b> Juliusz Chroboczek, Ionel Mugurel.
|
|
|
<b>Ruhrdeutsch:</b> "Timwi".
|
|
|
<b>Russian:</b> Alexey Chernyak, Serge Nesterovitch.
|
|
|
<b>Sami:</b> Anne Colin du Terrail, Luc Carissimo.
|
|
|
<b>Sanskrit:</b> Siva Nataraja / Vincent Ramos.
|
|
|
<b>Sächsisch:</b> André Müller.
|
|
|
<b>Schwäbisch:</b> Otto Stolz.
|
|
|
<b>Scots:</b> Jonathan Riddell.
|
|
|
<b>Serbian:</b> Sindi Keesan, Ranko Narancic, Boris Daljevic, Szilvia Csorba.
|
|
|
<b>Slovak:</b> G. Adam Stanislav, Radovan Garabík.
|
|
|
<b>Slovenian:</b> Albert Kolar.
|
|
|
<b>Spanish:</b> <a href="http://www.panix.com/~aleida">Aleida
|
|
|
Muñoz</a>, Laura Probaos.
|
|
|
<b>Swahili:</b> Ronald Schaffhirt.
|
|
|
<b>Swedish:</b> Christian Rose, Bengt Larsson.
|
|
|
<b>Taiwanese:</b> Henry H. Tan-Tenn.
|
|
|
<b>Tagalog:</b> Jim Soliven.
|
|
|
<b>Tamil:</b> Vasee Vaseeharan.
|
|
|
<b>Tibetan:</b> D. Germano, Tom Gewecke.
|
|
|
<b>Thai:</b> Alan Wood's wife.
|
|
|
<b>Turkish:</b> Vaçe Kundakçı, Tom Gewecke, Merlign Olnon.
|
|
|
<b>Ukrainian:</b> Michael Zajac.
|
|
|
<b>Urdu:</b> Mustafa Ali.
|
|
|
<a href="http://nomfoundation.org/"><b>Vietnamese</b></a>: Dixon Au,
|
|
|
[James] Đỗ Bá Phước
|
|
|
<font face="PMingLiU">杜 伯 福</font>.
|
|
|
<b>Walloon:</b> Pablo Saratxaga.
|
|
|
<b>Welsh:</b> Geiriadur Prifysgol Cymru (Andrew).
|
|
|
<b>Yiddish:</b> Mark David.
|
|
|
<b>Zeneise:</b> Angelo Pavese.
|
|
|
|
|
|
<p>
|
|
|
|
|
|
<dt><b>Tools Used to Create This Web Page:</b></dt>
|
|
|
|
|
|
<dd>The UTF-8-aware <a href="k95.html">Kermit 95</a> terminal emulator on
Windows, connected to a Unix host running the <a
|
|
|
href="http://www.gnu.org/directory/emacs.html">EMACS</a> text editor. Kermit
|
|
|
95 displays UTF-8 and also allows keyboard entry of arbitrary Unicode BMP
|
|
|
characters as 4 hex digits, as shown <a href="glass.html">HERE</a>. Hex codes
|
|
|
for Unicode values can be found in <a
|
|
|
href="http://www.unicode.org/unicode/uni2book/u2.html">The Unicode
|
|
|
Standard</a> (recommended) and the <a
|
|
|
href="http://www.unicode.org/charts/">online code charts</a>. When
|
|
|
submissions arrive by email encoded in some other character set (Latin-1,
|
|
|
Latin-2, KOI, various PC code pages, JEUC, etc.), I use the TRANSLATE command
|
|
|
of <a href="ckermit.html">C-Kermit</a> on the Unix host (<a
|
|
|
href="safe.html">where I read my mail</a>) to convert the character set to
|
|
|
UTF-8 (I could also use Kermit 95 for this; it has the same TRANSLATE
|
|
|
command). That's it -- no "Web authoring" tools, no locales, no "smart"
|
|
|
anything. It's just plain text, nothing more. By the way, there's nothing
|
|
|
special about EMACS -- any text editor will do, providing it allows entry of
|
|
|
arbitrary 8-bit bytes as text, including the 0x80-0x9F "C1" range. EMACS 21.1
|
|
|
actually supports UTF-8; earlier versions don't know about it and display the
|
|
|
octal codes; either way is OK for this purpose.
|
|
|
|
|
|
<p>
|
|
|
|
|
|
<dt><b>Commentary:</b>
|
|
|
<dd>Date: Wed, 27 Feb 2002 13:21:59 +0100<br>
|
|
|
From: "Bruno DEDOMINICIS" <tt>&lt;b.dedominicis@cite-sciences.fr&gt;</tt><br>
|
|
|
Subject: Je peux manger du verre, cela ne me fait pas mal.
|
|
|
|
|
|
<p>
|
|
|
|
|
|
I just found out your website and it makes me feel like proposing an
|
|
|
interpretation of the choice of this peculiar phrase.
|
|
|
|
|
|
<p>
|
|
|
|
|
|
Glass is transparent and can hurt as everyone knows. The relation between
|
|
|
people and civilisations is sometimes effusional and more often rude. The
|
|
|
concept of breaking frontiers through globalization, in a way, is also an
|
|
|
attempt to deny any difference. Isn't "transparency" the flag of modernity?
|
|
|
Nothing should be hidden any more, authority is obsolete, and the new powers
|
|
|
are supposed to reign through loving and smiling and no more through
|
|
|
coercion...
|
|
|
|
|
|
<p>
|
|
|
|
|
|
Eating glass without pain sounds like a very nice metaphor of this attempt.
|
|
|
That is, frontiers should become glass transparent first, and be denied by
|
|
|
incorporating them. On the reverse, it shows that through globalization,
|
|
|
frontiers undergo a process of displacement, that is, when they are not any
|
|
|
more speakable, they become repressed from the speech and are therefore
|
|
|
incorporated and might become painful symptoms, as for example what happens
|
|
|
when one tries to eat glass.
|
|
|
|
|
|
<p>
|
|
|
|
|
|
The frontiers that used to separate bodies one from another tend to divide
|
|
|
bodies from within and make them suffer.... The chosen phrase then appears
|
|
|
as a denial of the symptom that might result from the destitution of
|
|
|
traditional frontiers.
|
|
|
|
|
|
<p>
|
|
|
Best,<br>
|
|
|
Bruno De Dominicis, Paris, France
|
|
|
</dl>
|
|
|
|
|
|
<p>
|
|
|
<b>Other Unicode pages onsite:</b>
|
|
|
<ul>
|
|
|
<li><a href="http://www.columbia.edu/~fdc/pace/">Peace in All Languages</a>
|
|
|
<li><a href="postal.html">Frank's Compulsive Guide to Postal Addresses</a>
|
|
|
(especially the <a href="postal.html#index">Index</a>)
|
|
|
<li><a href="st-erkenwald.html">Representing Middle English on the Web with UTF-8</a>
|
|
|
<li><a href="biblio.html">The Kermit Bibliography</a> (in UTF-8)
|
|
|
<li><a href="accents.html">Interchange of Non-English Computer Text</a>
|
|
|
(UTF-8 math and box-drawing)
|
|
|
<li><a href="utf8-t1.html">Unicode Table</a> (in UTF-8)
|
|
|
</ul>
|
|
|
<p>
|
|
|
<b>Unicode samplers offsite:</b>
|
|
|
<ul>
|
|
|
<li>Michael Everson's
|
|
|
<a href="http://www.evertype.com/scriptbib.html">Bibliography of Typography
|
|
|
and Scripts</a>
|
|
|
<li><a href="http://home.att.net/~jameskass/scriptlinks.htm">Sample Unicode
|
|
|
Test Pages and Script Links</a>
|
|
|
<li><a href="http://crism.maden.org/dunno.html">I don't know, I only work here</a>
|
|
|
<li><a href="http://www.trigeminal.com/samples/provincial.html">Anyone
|
|
|
can be provincial!</a>
|
|
|
<li><a href="http://www.macchiato.com/unicode/Unicode_transcriptions.html">Transcriptions of "Unicode"</a>
|
|
|
<li><a href="http://www.i18nguy.com/unicode-example.html">Example
|
|
|
Unicode Usage for Business Applications</a>
|
|
|
<li><a href="http://www.cl.cam.ac.uk/~mgk25/unicode.html#apps">UTF-8 and
|
|
|
Unicode FAQ for Unix/Linux</a>
|
|
|
</ul>
|
|
|
<p>
|
|
|
<b>Unicode fonts:</b>
|
|
|
<ul>
|
|
|
<li><a href="http://www.alanwood.net/unicode/fonts.html">Unicode Fonts
|
|
|
for Windows Computers</a> (Alan Wood)
|
|
|
<li><a href="http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html">Unicode Fonts and
|
|
|
Tools for X11</a> (Markus Kuhn)
|
|
|
<li><a href="http://www.evertype.com/emono/">Everson Mono</a> (Michael
|
|
|
Everson)
|
|
|
<li><a href="http://www.monotype.com">Agfa Monotype</a>
|
|
|
</ul>
|
|
|
|
|
|
<p>
|
|
|
[ <a href="k95.html">Kermit 95</a> ]
|
|
|
[ <a href="glass.html">K95 Screen Shots</a> ]
|
|
|
[ <a href="ckermit.html">C-Kermit</a> ]
|
|
|
[ <a href="index.html">Kermit Home</a> ]
|
|
|
[ <a href="http://www.unicode.org/help/display_problems.html">Display Problems?</a> ]
|
|
|
[ <a href="http://www.unicode.org">The Unicode Consortium</a> ]
|
|
|
<hr>
|
|
|
<ADDRESS>
|
|
|
UTF-8 Sampler / <a href="index.html">The Kermit Project</a> /
|
|
|
<a href="http://www.columbia.edu">Columbia University</a> /
|
|
|
<a href="mailto:kermit@columbia.edu">kermit@columbia.edu</a>
|
|
|
</ADDRESS>
|
|
|
</body>
|
|
|
</html>
|