ascii – Page 11 – Tarik Billa

Character reading from file in Python

December 24, 2022 by Tarik

Ref: http://docs.python.org/howto/unicode

Reading Unicode from a file is therefore simple:

    import codecs
    with codecs.open('unicode.rst', encoding='utf-8') as f:
        for line in f:
            print(repr(line))

It's also possible to open files in update mode, allowing both reading and writing:

    with codecs.open('test', encoding='utf-8', mode='w+') as f:
        f.write(u'\u4500 blah blah blah\n')
        f.seek(0)
        print(repr(f.readline()[:1]))

EDIT: I'm assuming that your …
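On Python 3 the built-in open() accepts an encoding argument directly, so the same round trip can be sketched without codecs. This is only an illustration: the scratch path below is a hypothetical name created for the example, not something from the quoted howto.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

# "w+" opens the file for both writing and reading (truncating it first).
with open(path, mode="w+", encoding="utf-8") as f:
    f.write("\u4500 blah blah blah\n")
    f.seek(0)                      # rewind before reading back
    first_char = f.readline()[:1]  # first decoded character, not first byte

print(repr(first_char))
```

Because the file is opened in text mode with an explicit encoding, slicing the line works on characters rather than on raw bytes.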

How do I get a list of all the ASCII characters using Python?

December 17, 2022 by Tarik

The constants in the string module may be what you want.

All ASCII capital letters:

    >>> import string
    >>> string.ascii_uppercase
    'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

All printable ASCII characters:

    >>> string.printable
    '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'

For every single character defined in the ASCII standard, use chr:

    >>> ''.join(chr(i) for i in range(128))
    '\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f'

How can I remove non-ASCII characters but leave periods and spaces?

December 16, 2022 by Tarik

You can filter all characters from the string that are not printable using string.printable, like this:

    >>> s = "some\x00string. with\x15 funny characters"
    >>> import string
    >>> printable = set(string.printable)
    >>> filter(lambda x: x in printable, s)
    'somestring. with funny characters'

string.printable on my machine contains:

    0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ !"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c

EDIT: On Python 3, filter will …
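For Python 3, where filter() returns a lazy iterator rather than a string, one possible way to get a string back is to join the characters that survive the test. This is a sketch of that idea, not necessarily what the rest of the original answer goes on to suggest.

```python
import string

printable = set(string.printable)
s = "some\x00string. with\x15 funny characters"

# Keep only characters found in string.printable and join them back into a str.
cleaned = "".join(ch for ch in s if ch in printable)
print(cleaned)  # somestring. with funny characters
```

Periods and spaces are part of string.printable, so they pass through untouched.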

“unmappable character for encoding” warning in Java

December 12, 2022 by Tarik

Try with: javac -encoding ISO-8859-1 file_name.java

How to get ASCII value of string in C#

December 2, 2022 by Tarik

From MSDN:

    using System.Text;

    string value = "9quali52ty3";
    // Convert the string into a byte[].
    byte[] asciiBytes = Encoding.ASCII.GetBytes(value);

You now have an array of the ASCII values of the characters, one byte each. I got the following: 57 113 117 97 108 105 53 50 116 121 51
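The numbers above can be cross-checked outside C#. As a small Python sketch (an analogue for verification, not the MSDN method), encoding the same string as ASCII yields the same byte values:

```python
value = "9quali52ty3"

# bytes objects iterate as integers, so list() exposes the ASCII codes.
ascii_values = list(value.encode("ascii"))
print(ascii_values)  # [57, 113, 117, 97, 108, 105, 53, 50, 116, 121, 51]
```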

Is ASCII code in matter of fact 7 bit or 8 bit?

December 1, 2022 by Tarik

ASCII was indeed originally conceived as a 7-bit code. This was done well before 8-bit bytes became ubiquitous, and even into the 1990s you could find software that assumed it could use the 8th bit of each byte of text for its own purposes ("not 8-bit clean"). Nowadays people think of it as an 8-bit …
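A short Python sketch can make the 7-bit point concrete. The masking step is only one illustrative way that software which is not 8-bit clean might mangle text; the byte 0xE9 is an arbitrary example value.

```python
# Every ASCII code point lies in 0..127, so the highest one needs exactly 7 bits.
highest = 127
print(highest.bit_length())  # 7

# Illustration: code that claims the 8th bit for itself effectively masks it off.
# 0xE9 is 'é' in Latin-1; clearing the top bit leaves 0x69, which is ASCII 'i'.
byte = 0xE9
stripped = byte & 0x7F
print(hex(stripped), chr(stripped))  # 0x69 i
```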

How many characters can UTF-8 encode?

December 1, 2022 by Tarik

UTF-8 does not use one byte all the time; it uses 1 to 4 bytes per character. The first 128 characters (US-ASCII) need one byte. The next 1,920 characters need two bytes to encode. This covers the remainder of almost all Latin alphabets, and also Greek, Cyrillic, Coptic, Armenian, Hebrew, Arabic, Syriac and Tāna alphabets, as well as …
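The byte counts are easy to check in Python. In this sketch the four sample characters are arbitrary picks, one from each length class, and the 1,920 figure is derived from the two-byte range U+0080 to U+07FF.

```python
# One sample per UTF-8 length class: 'A', 'é', '€', and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
lengths = [len(ch.encode("utf-8")) for ch in samples]
print(lengths)  # [1, 2, 3, 4]

# Two-byte sequences cover U+0080 through U+07FF.
two_byte_count = 0x800 - 0x80
print(two_byte_count)  # 1920
```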

How to check if a String contains only ASCII?

December 1, 2022 by Tarik

From Guava 19.0 onward, you may use:

    boolean isAscii = CharMatcher.ascii().matchesAllOf(someString);

This uses the matchesAllOf(someString) method, which relies on the factory method ascii() rather than the now-deprecated ASCII singleton.

Here ASCII includes all ASCII characters, including the non-printable characters lower than 0x20 (space) such as tabs, line feed / return, but also BEL with code …
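The same "every character is below 128" test can be sketched in Python for comparison. This is an analogue of the idea, not Guava's implementation, and is_ascii is a hypothetical helper name; Python 3.7+ also offers the built-in str.isascii().

```python
def is_ascii(text):
    """Return True if every character has a code point below 128."""
    return all(ord(ch) < 128 for ch in text)

# Control characters such as tab and BEL (0x07) still count as ASCII.
print(is_ascii("tab\tand bell\x07"))  # True
print(is_ascii("caf\xe9"))            # False
```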

Why does Python print unicode characters when the default encoding is ASCII?

November 30, 2022 by Tarik

Thanks to bits and pieces from various replies, I think we can stitch together an explanation. When you print a unicode string, u'\xe9', Python implicitly tries to encode that string using the encoding scheme currently stored in sys.stdout.encoding. Python actually picks up this setting from the environment it was started from. If it can't …
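The implicit step can be made explicit with a small sketch, written here for Python 3. The fallback to UTF-8 is an assumption added for the example, since sys.stdout.encoding can be None when standard output has been replaced.

```python
import sys

text = "\xe9"  # the same character as u'\xe9'

# The encoding Python picked up for standard output; fall back to UTF-8
# (an assumption for this sketch) if none is reported.
stdout_encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
encoded = text.encode(stdout_encoding, errors="replace")
print(stdout_encoding, encoded)

# Encoding the same character as plain ASCII fails, because 0xE9 is above 127.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)  # True
```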

What is the idea behind ^= 32, that converts lowercase letters to upper and vice versa?

November 23, 2022 by Tarik

Let's take a look at the ASCII code table in binary.

    A 1000001    a 1100001
    B 1000010    b 1100010
    C 1000011    c 1100011
    …
    Z 1011010    z 1111010

And 32 is 0100000, which is the only difference between lowercase and uppercase letters. So toggling that bit toggles the case of a letter.
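The bit toggle can be tried directly in Python. This is a minimal sketch; toggle_case is a hypothetical helper name, and it assumes its argument is a single ASCII letter (other characters would simply have bit 5 flipped, which is not a case change).

```python
def toggle_case(ch):
    """Flip bit 5 (value 32) of an ASCII letter to switch its case."""
    return chr(ord(ch) ^ 32)

print(toggle_case("a"))  # A
print(toggle_case("Z"))  # z
print(format(ord("A"), "07b"), format(ord("a"), "07b"))  # 1000001 1100001
```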