Encode::JP



NAME

Encode::JP - Japanese Encodings

SYNOPSIS

    use Encode qw/encode decode/;
    $euc_jp = encode("euc-jp", $utf8);   # loads Encode::JP implicitly
    $utf8   = decode("euc-jp", $euc_jp); # ditto

ABSTRACT

This module implements Japanese charset encodings. Encodings supported are as follows.

  Canonical       Alias              Description
  -----------------------------------------------------------------------
  euc-jp          /\beuc.*jp$/i      EUC (Extended Unix Code)
                  /\bjp.*euc/i
                  /\bujis$/i
  shiftjis        /\bshift.*jis$/i   Shift JIS (aka MS Kanji)
                  /\bsjis$/i
  7bit-jis        /\bjis$/i          7bit JIS
  iso-2022-jp                        ISO-2022-JP                [RFC1468]
                                     = 7bit JIS with all Halfwidth Kana
                                       converted to Fullwidth
  iso-2022-jp-1                      ISO-2022-JP-1              [RFC2237]
                                     = ISO-2022-JP with JIS X 0212-1990
                                       support.  See below
  MacJapanese                        Shift JIS + Apple vendor mappings
  cp932           /\bwindows-31j$/i  Code Page 932
                                     = Shift JIS + MS/IBM vendor mappings
  jis0201-raw                        JIS0201, raw format
  jis0208-raw                        JIS0208, raw format
  jis0212-raw                        JIS0212, raw format
  -----------------------------------------------------------------------
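
The alias patterns in the table can be checked with find_encoding from Encode, which returns an encoding object whose name method reports the canonical name. The following is a minimal sketch; the input spellings and the %canonical hash are illustrative choices, not part of this module.

```perl
use strict;
use warnings;
use Encode qw/find_encoding/;

# A few spellings a caller might supply (illustrative, not exhaustive).
my @names = ('EUC-JP', 'ujis', 'Shift_JIS', 'sjis', 'jis', 'Windows-31J');

my %canonical;
for my $name (@names) {
    my $enc = find_encoding($name);    # undef when the name is unknown
    $canonical{$name} = $enc ? $enc->name : undef;
    printf "%-12s => %s\n", $name, $canonical{$name} // '(unknown)';
}
```

Resolving the name once and reusing the returned object avoids repeating the alias lookup on every encode or decode call.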

DESCRIPTION

To find out how to use this module in detail, see Encode.

Note on ISO-2022-JP(-1)?

ISO-2022-JP-1 (RFC2237) is a superset of ISO-2022-JP (RFC1468) that adds support for JIS X 0212-1990. Either name can therefore be used to decode a stream to utf8, but the two are not interchangeable when encoding.

  $utf8 = decode('iso-2022-jp-1', $stream);

and

  $utf8 = decode('iso-2022-jp',   $stream);

yield the same result, but

  $with_0212 = encode('iso-2022-jp-1', $utf8);

is different from

  $without_0212 = encode('iso-2022-jp', $utf8);

In the latter case, characters that map to JIS X 0212 are first converted to U+3013 (0xA2AE in EUC-JP; a substitute glyph also known as 'Tofu' or 'geta mark') and then fed to the encoding engine. U+FFFD is not used, in order to preserve text layout as much as possible.
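
The asymmetry can be observed with a short script. This is a sketch under one assumption: that U+4E02 is covered by JIS X 0212 but not by JIS X 0208 (an illustrative choice of character, not taken from this page). The hexdump helper is a hypothetical name introduced for display only. The exact substitute octets produced by 'iso-2022-jp' may vary between Encode versions, so the script only compares the two results rather than asserting specific bytes.

```perl
use strict;
use warnings;
use Encode qw/encode decode/;

# U+65E5 U+672C are JIS X 0208 characters; U+4E02 is assumed to be
# available only through JIS X 0212.
my $text = "\x{65E5}\x{672C}\x{4E02}";

my $with_0212    = encode('iso-2022-jp-1', $text);
my $without_0212 = encode('iso-2022-jp',   $text);

# Hypothetical helper: render octets as space-separated hex.
sub hexdump { join ' ', map { sprintf '%02x', ord } split //, $_[0] }

print "iso-2022-jp-1: ", hexdump($with_0212),    "\n";
print "iso-2022-jp:   ", hexdump($without_0212), "\n";

# A stream produced by the -1 encoder decodes under either name.
my $via_jp1 = decode('iso-2022-jp-1', $with_0212);
my $via_jp  = decode('iso-2022-jp',   $with_0212);
print "decoders agree: ", ($via_jp1 eq $via_jp ? "yes" : "no"), "\n";
```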

BUGS

The ASCII region (0x00-0x7f) is preserved for all encodings, even though this conflicts with mappings by the Unicode Consortium.
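
A quick way to see this behaviour is to decode plain ASCII octets under several of the encodings and compare the result with the input. The sketch below restricts itself to printable ASCII and to four of the encodings listed above; both choices are illustrative. The octet 0x5C is of particular interest because the JIS X 0201 Roman set assigns it to the yen sign rather than the backslash.

```perl
use strict;
use warnings;
use Encode qw/decode/;

# Printable ASCII octets, including 0x5C and 0x7E.
my $ascii = join '', map { chr } 0x20 .. 0x7e;

my %preserved;
for my $name (qw/euc-jp shiftjis cp932 iso-2022-jp/) {
    my $decoded = decode($name, $ascii);
    $preserved{$name} = ($decoded eq $ascii) ? 1 : 0;
    printf "%-12s ASCII preserved: %s\n", $name,
        $preserved{$name} ? 'yes' : 'no';
}

# Show the code point that the single octet 0x5C decodes to.
my $octet = "\x5c";
my $char  = decode('shiftjis', $octet);
printf "shiftjis 0x5C decodes to U+%04X\n", ord($char);
```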

SEE ALSO

Encode


