Ruby Array#pack and String#unpack: detailed usage

The following is a list of the template characters used by Array#pack and String#unpack. A template character can be followed by a number that gives a "length". Using '*' in place of the length means "all remaining elements".

The meaning of the length depends on the template character, but in general a repeated run such as

"iiii"

can be written as

"i4"

instead.
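
As a quick check of the repeat count, the following sketch (the sample values are arbitrary, chosen only for illustration) confirms that the three spellings pack identically:

```ruby
# "iiii", "i4" and "i*" all pack four native ints here.
values = [1, 2, 3, 4]

long_form  = values.pack("iiii")
counted    = values.pack("i4")
all_remain = values.pack("i*")

p long_form == counted     # both templates produce the same bytes
p counted == all_remain    # "*" consumes every remaining element
p counted.unpack("i*")     # round-trips to the original array
```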

In the descriptions below, short and long are 2-byte and 4-byte values respectively (the usual sizes of short and long on 32-bit machines), regardless of the actual system. If `s', `S', `l' or `L' is followed by `_' or `!' (as in "s!"), the size is that of the system's native short or long.

Please note: the size of `i' and `I' (int) always depends on the system, while the sizes of `n', `N', `v' and `V' are system-independent (and `!' cannot be added to them).
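
The size rules above can be checked directly. This is a small sketch; the native sizes printed in the second half are whatever the local C compiler uses, so no values are claimed for them:

```ruby
# Fixed-size templates: the same on every platform.
p [0].pack("s").bytesize   # 2
p [0].pack("l").bytesize   # 4
p [0].pack("n").bytesize   # 2
p [0].pack("N").bytesize   # 4

# Native-size templates: the sizes of the platform's short, int and long.
p [0].pack("s!").bytesize
p [0].pack("i").bytesize
p [0].pack("l!").bytesize
```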

Whitespace in the template string is ignored. ruby 1.7 feature: in addition, everything from a `#' to the next newline or to the end of the template string is treated as a comment.
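
Whitespace and comments make longer templates easier to read. A minimal sketch, assuming a hypothetical record layout (a 16-bit id followed by a 4-byte tag) invented for this example:

```ruby
# Hypothetical record layout, for illustration only.
template = <<~TEMPLATE
  n    # id, 16-bit big endian
  a4   # tag, 4 bytes, null-padded
TEMPLATE

packed = [258, "ok"].pack(template)
p packed == [258, "ok"].pack("na4")   # whitespace and comments are ignored
p packed.bytes
p packed.unpack(template)
```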

In the descriptions below, where a template character behaves differently for Array#pack and String#unpack, the two explanations are separated by "/", in the form "Array#pack behaviour / String#unpack behaviour".

a: ASCII string (pads with null characters / keeps trailing null characters and spaces)

      ["abc"].pack("a") => "a"
      ["abc"].pack("a*") => "abc"
      ["abc"].pack("a4") => "abc\0"

      "abc\0".unpack("a4") => ["abc\0"]
      "abc ".unpack("a4") => ["abc "]

A: ASCII string (pads with spaces / removes trailing null characters and spaces)

      ["abc"].pack("A") => "a"
      ["abc"].pack("A*") => "abc"
      ["abc"].pack("A4") => "abc "

      "abc ".unpack("A4") => ["abc"]
      "abc\0".unpack("A4") => ["abc"]

Z: null-terminated string (same as a / removes trailing null characters)

      ["abc"].pack("Z") => "a"
      ["abc"].pack("Z*") => "abc"    # later Ruby versions append a null here: "abc\0"
      ["abc"].pack("Z4") => "abc\0"

      "abc\0".unpack("Z4") => ["abc"]
      "abc ".unpack("Z4") => ["abc "]

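The differences between a, A and Z are easiest to see side by side. A small sketch with an arbitrary sample string:

```ruby
name = "abc"

p [name].pack("a5").bytes   # [97, 98, 99, 0, 0]    padded with nulls
p [name].pack("A5")         # "abc  "               padded with spaces
p [name].pack("Z5").bytes   # [97, 98, 99, 0, 0]    padded with nulls

padded = "abc \0 "
p padded.unpack("a*")       # keeps everything
p padded.unpack("A*")       # ["abc"]   trailing spaces and nulls removed
p padded.unpack("Z*")       # ["abc "]  stops at the first null
```
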
b: bit string (from the low-order bit to the high-order bit)

      "\001\002".unpack("b*") => ["1000000001000000"]
      "\001\002".unpack("b3") => ["100"]

      ["1000000001000000"].pack("b*") => "\001\002"

B: bit string (from the high-order bit to the low-order bit)

      "\001\002".unpack("B*") => ["0000000100000010"]
      "\001\002".unpack("B9") => ["000000010"]

      ["0000000100000010"].pack("B*") => "\001\002"

h: hexadecimal string (low nibble first)

      "\x01\xfe".unpack("h*") => ["10ef"]
      "\x01\xfe".unpack("h3") => ["10e"]

      ["10ef"].pack("h*") => "\001\376"

H: hexadecimal string (high nibble first)

      "\x01\xfe".unpack("H*") => ["01fe"]
      "\x01\xfe".unpack("H3") => ["01f"]

      ["01fe"].pack("H*") => "\001\376"

c: char (8-bit signed integer)

      "\001\376".unpack("c*") => [1, -2]

      [1, -2].pack("c*") => "\001\376"
      [1, 254].pack("c*") => "\001\376"

C: unsigned char (8-bit unsigned integer)

      "\001\376".unpack("C*") => [1, 254]

      [1, -2].pack("C*") => "\001\376"
      [1, 254].pack("C*") => "\001\376"

s: short (16-bit signed integer, endian-dependent) (s! is not necessarily 16 bits; it uses the system's size of short)

Little endian:

      "\001\002\376\375".unpack("s*") => [513, -514]

      [513, 65022].pack("s*") => "\001\002\376\375"
      [513, -514].pack("s*") => "\001\002\376\375"

Big endian:

      "\001\002\376\375".unpack("s*") => [258, -259]

      [258, 65277].pack("s*") => "\001\002\376\375"
      [258, -259].pack("s*") => "\001\002\376\375"

S: unsigned short (16-bit unsigned integer, endian-dependent) (S! is not necessarily 16 bits; it uses the system's size of short)

Little endian:

      "\001\002\376\375".unpack("S*") => [513, 65022]

      [513, 65022].pack("S*") => "\001\002\376\375"
      [513, -514].pack("S*") => "\001\002\376\375"

Big endian:

      "\001\002\376\375".unpack("S*") => [258, 65277]

      [258, 65277].pack("S*") => "\001\002\376\375"
      [258, -259].pack("S*") => "\001\002\376\375"

i: int (signed integer; endianness and size depend on the system's int)

Little endian, 32-bit int:

      "\001\002\003\004\377\376\375\374".unpack("i*")
      => [67305985, -50462977]

      [67305985, 4244504319].pack("i*") => RangeError
      [67305985, -50462977].pack("i*") => "\001\002\003\004\377\376\375\374"

Big endian, 32-bit int:

      "\001\002\003\004\377\376\375\374".unpack("i*") => [16909060, -66052]

      [16909060, 4294901244].pack("i*") => RangeError
      [16909060, -66052].pack("i*") => "\001\002\003\004\377\376\375\374"

I: unsigned int (unsigned integer; endianness and size depend on the system's int)

Little endian, 32-bit int:

      "\001\002\003\004\377\376\375\374".unpack("I*") => [67305985, 4244504319]

      [67305985, 4244504319].pack("I*") => "\001\002\003\004\377\376\375\374"
      [67305985, -50462977].pack("I*") => "\001\002\003\004\377\376\375\374"

Big endian, 32-bit int:

      "\001\002\003\004\377\376\375\374".unpack("I*") => [16909060, 4294901244]

      [16909060, 4294901244].pack("I*") => "\001\002\003\004\377\376\375\374"
      [16909060, -66052].pack("I*") => "\001\002\003\004\377\376\375\374"

l: long (32-bit signed integer, endian-dependent) (l! is not necessarily 32 bits; it uses the system's size of long)

Little endian, 32-bit long:

      "\001\002\003\004\377\376\375\374".unpack("l*") => [67305985, -50462977]

      [67305985, 4244504319].pack("l*") => RangeError
      [67305985, -50462977].pack("l*") => "\001\002\003\004\377\376\375\374"

L: unsigned long (32-bit unsigned integer, endian-dependent) (L! is not necessarily 32 bits; it uses the system's size of long)

Little endian, 32-bit long:

      "\001\002\003\004\377\376\375\374".unpack("L*") => [67305985, 4244504319]

      [67305985, 4244504319].pack("L*") => "\001\002\003\004\377\376\375\374"
      [67305985, -50462977].pack("L*") => "\001\002\003\004\377\376\375\374"

q: ruby 1.7 feature: long long (signed integer; endianness and size depend on the system's long long) (64 bits when C cannot handle long long)

Little endian, 64-bit long long:

      "\001\002\003\004\005\006\007\010\377\376\375\374\373\372\371\370".unpack("q*")
      => [578437695752307201, -506097522914230529]

      [578437695752307201, -506097522914230529].pack("q*")
      => "\001\002\003\004\005\006\a\010\377\376\375\374\373\372\371\370"
      [578437695752307201, 17940646550795321087].pack("q*")
      => "\001\002\003\004\005\006\a\010\377\376\375\374\373\372\371\370"

Q: ruby 1.7 feature: unsigned long long (unsigned integer; endianness and size depend on the system's long long) (64 bits when C cannot handle long long)

Little endian, 64-bit long long:

      "\001\002\003\004\005\006\007\010\377\376\375\374\373\372\371\370".unpack("Q*")
      => [578437695752307201, 17940646550795321087]

      [578437695752307201, 17940646550795321087].pack("Q*")
      => "\001\002\003\004\005\006\a\010\377\376\375\374\373\372\371\370"
      [578437695752307201, -506097522914230529].pack("Q*")
      => "\001\002\003\004\005\006\a\010\377\376\375\374\373\372\371\370"

m: base64-encoded string. A newline is added after every 60 octets of output (and at the end).
Base64 is an encoding that uses only 65 ASCII characters (the 64 characters [A-Za-z0-9+/] plus '=' for padding) and converts each group of three octets (8 bits * 3 = 24 bits) of binary data into four printable characters (6 bits * 4 = 24 bits). See RFC 2045 for details.

      [""].pack("m") => ""
      ["\0"].pack("m") => "AA==\n"
      ["\0\0"].pack("m") => "AAA=\n"
      ["\0\0\0"].pack("m") => "AAAA\n"
      ["\377"].pack("m") => "/w==\n"
      ["\377\377"].pack("m") => "//8=\n"
      ["\377\377\377"].pack("m") => "////\n"

      ["abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"].pack("m")
      => "YWJjZGVmZ2hpamtsbW5vcHFyc3R1dnd4eXpBQkNERUZHSElKS0xNTk9QUVJT\nVFVWV1hZWg==\n"
      ["abcdefghijklmnopqrstuvwxyz"].pack("m3")
      => "YWJj\nZGVm\nZ2hp\namts\nbW5v\ncHFy\nc3R1\ndnd4\neXo=\n"

      "".unpack("m") => [""]
      "AA==\n".unpack("m") => ["\000"]
      "AA==".unpack("m") => ["\000"]

      "YWJjZGVmZ2hpamtsbW5vcHFyc3R1dnd4eXpBQkNERUZHSElKS0xNTk9QUVJT\nVFVWV1hZWg==\n".unpack("m")
      => ["abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"]
      "YWJjZGVmZ2hpamtsbW5vcHFyc3R1dnd4eXpBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZWg==\n".unpack("m")
      => ["abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"]
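
The line-wrapping rule can be observed by encoding 46 bytes, one more than fits on a 60-character line. The `m0` form in this sketch is an assumption about the reader's Ruby: it is available in Ruby 1.9 and later, where a count of 0 suppresses the newlines, and it is not covered by the text above.

```ruby
encoded = ["a" * 46].pack("m")
p encoded.lines.map { |line| line.chomp.size }   # [60, 4]

# Assumption: Ruby 1.9 or later, where a count of 0 means "no newlines".
p ["a" * 46].pack("m0").include?("\n")           # false

p encoded.unpack("m").first == "a" * 46          # true
```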

M: quoted-printable encoded string

      ["a b c\td \ne"].pack("M") => "a b c\td =\n\ne=\n"

      "a b c\td =\n\ne=\n".unpack("M") => ["a b c\td \ne"]

n: unsigned short (16-bit unsigned integer) in network byte order (big endian)

      [0,1,-1,32767,-32768,65535].pack("n*")
      => "\000\000\000\001\377\377\177\377\200\000\377\377"

      "\000\000\000\001\377\377\177\377\200\000\377\377".unpack("n*")
      => [0, 1, 65535, 32767, 32768, 65535]

N: unsigned long (32-bit unsigned integer) in network byte order (big endian)

      [0,1,-1].pack("N*") => "\000\000\000\000\000\000\000\001\377\377\377\377"

      "\000\000\000\000\000\000\000\001\377\377\377\377".unpack("N*") => [0, 1, 4294967295]

v: unsigned short (16-bit unsigned integer) in "VAX" byte order (little endian)

      [0,1,-1,32767,-32768,65535].pack("v*")
      => "\000\000\001\000\377\377\377\177\000\200\377\377"

      "\000\000\001\000\377\377\377\177\000\200\377\377".unpack("v*")
      => [0, 1, 65535, 32767, 32768, 65535]

V: unsigned long (32-bit unsigned integer) in "VAX" byte order (little endian)

      [0,1,-1].pack("V*") => "\000\000\000\000\001\000\000\000\377\377\377\377"

      "\000\000\000\000\001\000\000\000\377\377\377\377".unpack("V*") => [0, 1, 4294967295]

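Because n, N, v and V have a fixed size and byte order, they suit binary headers that must look the same on every machine. A minimal sketch, assuming a hypothetical 6-byte header (16-bit type, then 32-bit length) invented for this example:

```ruby
# Hypothetical header layout: 16-bit type followed by 32-bit length.
header = [0x1234, 0xDEADBEEF].pack("nN")
p header.bytes.map { |b| format("%02x", b) }.join(" ")   # "12 34 de ad be ef"

type, length = header.unpack("nN")
p type == 0x1234 && length == 0xDEADBEEF                  # true

# The same values in little-endian ("VAX") order.
little = [0x1234, 0xDEADBEEF].pack("vV")
p little.bytes.map { |b| format("%02x", b) }.join(" ")   # "34 12 ef be ad de"
```
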
f: single-precision floating-point number (system-dependent)

      IA-32 (x86) (IEEE 754 single precision, little endian):
        [1.0].pack("f") => "\000\000\200?"
      sparc (IEEE 754 single precision, big endian):
        [1.0].pack("f") => "?\200\000\000"

d: double-precision floating-point number (system-dependent)

      IA-32 (IEEE 754 double precision, little endian):
        [1.0].pack("d") => "\000\000\000\000\000\000\360?"
      sparc (IEEE 754 double precision, big endian):
        [1.0].pack("d") => "?\360\000\000\000\000\000\000"

e: little-endian single-precision floating-point number (the byte order is fixed; the floating-point format depends on the system)

      IA-32:
        [1.0].pack("e") => "\000\000\200?"
      sparc:
        [1.0].pack("e") => "\000\000\200?"

E: little-endian double-precision floating-point number (the byte order is fixed; the floating-point format depends on the system)

      IA-32:
       [1.0].pack("E") => "\000\000\000\000\000\000\360?"
      sparc:
       [1.0].pack("E") => "\000\000\000\000\000\000\360?"

g: big-endian single-precision floating-point number (the byte order is fixed; the floating-point format depends on the system)

      IA-32:
       [1.0].pack("g") => "?\200\000\000"
      sparc:
       [1.0].pack("g") => "?\200\000\000"

G: big-endian double-precision floating-point number (the byte order is fixed; the floating-point format depends on the system)

      IA-32:
       [1.0].pack("G") => "?\360\000\000\000\000\000\000"
      sparc:
       [1.0].pack("G") => "?\360\000\000\000\000\000\000"
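
The fixed-order float templates can be compared byte by byte. This sketch assumes an IEEE 754 platform, as in the IA-32 and sparc examples above:

```ruby
p [1.0].pack("e").bytes   # [0, 0, 128, 63]               little-endian single
p [1.0].pack("g").bytes   # [63, 128, 0, 0]               big-endian single
p [1.0].pack("E").bytes   # [0, 0, 0, 0, 0, 0, 240, 63]   little-endian double
p [1.0].pack("G").bytes   # [63, 240, 0, 0, 0, 0, 0, 0]   big-endian double

# Single precision drops digits; double precision round-trips a Ruby Float.
p [0.1].pack("g").unpack("g").first == 0.1   # false
p [0.1].pack("G").unpack("G").first == 0.1   # true
```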

p: pointer to a null-terminated string

      [""].pack("p") => "\310\037\034\010"
      ["a", "b", "c"].pack("p3") => " =\030\010\340^\030\010\360^\030\010"
      [nil].pack("p") => "\000\000\000\000"

P: pointer to a structure (fixed-length string)

      [nil].pack("P") => "\000\000\000\000"
      ["abc"].pack("P3") => "x*\024\010"

      ["abc"].pack("P4") => ArgumentError: too short buffer for P(3 for 4)
      [""].pack("P") => ArgumentError: too short buffer for P(0 for 1)

u: uuencoded string

      [""].pack("u") => ""
      ["a"].pack("u") => "!80``\n"
      ["abc"].pack("u") => "#86)C\n"
      ["abcd"].pack("u") => "$86)C9```\n"
      ["a"*45].pack("u") => "M86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A\n"
      ["a"*46].pack("u") => "M86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A\n!80``\n"
      ["abcdefghi"].pack("u6") => "&86)C9&5F\n#9VAI\n"

U: UTF-8

      [0].pack("U") => "\000"
      [1].pack("U") => "\001"
      [0x7f].pack("U") => "\177"
      [0x80].pack("U") => "\302\200"
      [0x7fffffff].pack("U") => "\375\277\277\277\277\277"
      [0x80000000].pack("U") => ArgumentError
      [0,256,65536].pack("U3") => "\000\304\200\360\220\200\200"

      "\000\304\200\360\220\200\200".unpack("U3") => [0, 256, 65536]
      "\000\304\200\360\220\200\200".unpack("U") => [0]
      "\000\304\200\360\220\200\200".unpack("U*") => [0, 256, 65536]

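With U, each array element is a Unicode code point. A short sketch using an arbitrary non-ASCII character; `String#codepoints` is used only as a cross-check and assumes Ruby 1.9 or later:

```ruby
text = "Ruby\u3042"                       # U+3042 is HIRAGANA LETTER A
p text.unpack("U*")                       # [82, 117, 98, 121, 12354]
p text.unpack("U*") == text.codepoints    # true
p [0x3042].pack("U").bytes                # [227, 129, 130]
p [82, 117, 98, 121].pack("U*")           # "Ruby"
```
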
w: BER compressed integer

Each byte carries 7 bits of data, so a non-negative integer of any size is represented in the fewest possible bytes. Every byte except the last has its most significant bit set (that is, the high bit signals that the data continues in the next byte).
BER stands for Basic Encoding Rules (BER is not limited to integers; it is used for ASN.1 encoding).
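
A few small values show how the continuation bit works; the numbers are arbitrary and chosen to cross the 7-bit boundary:

```ruby
p [127].pack("w").bytes   # [127]       fits in 7 bits, high bit clear
p [128].pack("w").bytes   # [129, 0]    0x81 0x00: high bit set on the first byte
p [300].pack("w").bytes   # [130, 44]   0x82 0x2c

p [127, 128, 300].pack("w*").unpack("w*")   # [127, 128, 300]
```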

x: null byte / skip forward 1 byte

X: back up 1 byte

@: move to an absolute position
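
The three positioning characters behave as follows in both directions. A brief sketch with arbitrary sample data:

```ruby
# Packing
p [1, 2].pack("C x C").bytes        # [1, 0, 2]       x inserts a null byte
p ["abc", "d"].pack("a3 X a")       # "abd"           X backs up over "c"
p ["a", "b"].pack("a @3 a").bytes   # [97, 0, 0, 98]  @3 pads out to offset 3

# Unpacking
p "abcdef".unpack("a2 x a2")        # ["ab", "de"]    x skips "c"
p "abcdef".unpack("a2 X a2")        # ["ab", "bc"]    X steps back to "b"
p "abcdef".unpack("@4 a2")          # ["ef"]          @4 jumps to offset 4
```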

Use Cases

Here are some use cases for pack / unpack.

Some of these problems can be solved without pack, but they are given here as pack examples. Because pack templates tend to become cryptic, alternative solutions are also shown for those who would rather not use pack.

Converting an array of numbers (character codes) into a string

      p [82, 117, 98, 121].pack("cccc")
      => "Ruby"

      p [82, 117, 98, 121].pack("c4")
      => "Ruby"

      p [82, 117, 98, 121].pack("c*")
      => "Ruby"

      s = ""
      [82, 117, 98, 121].each {|c| s << c}
      p s
      => "Ruby"

      p [82, 117, 98, 121].collect {|c| sprintf "%c", c}.join
      => "Ruby"

      p [82, 117, 98, 121].inject("") {|s, c| s << c}
      => "Ruby"

*

Converting a string into an array of numbers (character codes)

      p "Ruby".unpack('C*')
      => [82, 117, 98, 121]

      a = []
      "Ruby".each_byte {|c| a << c}
      p a
      => [82, 117, 98, 121]

*

"x" can be used to insert null bytes

      p [82, 117, 98, 121].pack("ccxxcc")
      => "Ru\000\000by"

*

"x" can be used to skip characters when reading

      p "Ru\0\0by".unpack('ccxxcc')
      => [82, 117, 98, 121]

*

Converting a hex dump into an array of numbers

      p ["61 62 63 64 65 66".delete(' ')].pack('H*').unpack('C*')
      => [97, 98, 99, 100, 101, 102]

      p "61 62 63 64 65 66".split.collect {|c| c.hex}
      => [97, 98, 99, 100, 101, 102]

*

For the binary and hexadecimal templates, the length does not specify the number of bytes generated but the number of bits or nibbles

      p [0b01010010, 0b01110101, 0b01100010, 0b01111001].pack("C4")
      => "Ruby"
      p ["01010010011101010110001001111001"].pack("B32") # 8 bits * 4
      => "Ruby"

      p [0x52, 0x75, 0x62, 0x79].pack("C4")
      => "Ruby"
      p ["52756279"].pack("H8")  # 2 nybbles * 4
      => "Ruby"

*

For the template character "a", the length applies to a single string only

      p ["RUBY", "u", "b", "y"].pack("a4")
      => "RUBY"

      p ["RUBY", "u", "b", "y"].pack("aaaa")
      => "Ruby"

      p ["RUBY", "u", "b", "y"].pack("a*aaa")
      => "RUBYuby"

*

With the template character "a", the result is padded with null characters when the string is shorter than the length

      p ["Ruby"].pack("a8")
      => "Ruby\000\000\000\000"

*

Little endian and big endian

      p [1,2].pack("s2")
      => "\000\001\000\002" # output on a big-endian system
      => "\001\000\002\000" # output on a little-endian system

      p [1,2].pack("n2")
      => "\000\001\000\002" # big endian, system-independent

      p [1,2].pack("v2")
      => "\001\000\002\000" # little endian, system-independent

*

Network byte order signed long

      s = "\xff\xff\xff\xfe"
      n = s.unpack("N")[0]
      if n[31] == 1
        n = -((n ^ 0xffff_ffff) + 1)
      end
      p n
      => -2

*

Network byte order signed long (alternative)

      s = "\xff\xff\xff\xfe"
      p n = s.unpack("N").pack("l").unpack("l")[0]
      => -2
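
On newer Rubies there is a shorter route. This sketch assumes Ruby 1.9.3 or later, where the byte-order modifiers `>` (big endian) and `<` (little endian) can follow `s`, `l`, `q` and their unsigned forms; these modifiers are not part of the text above.

```ruby
# Assumption: Ruby 1.9.3+ for "l>", Ruby 2.0+ for String#b.
s = "\xff\xff\xff\xfe".b

p s.unpack("l>").first   # -2          signed, big endian
p s.unpack("N").first    # 4294967294  unsigned, big endian
p [-2].pack("l>") == s   # true
```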

*

IP Address

      require 'socket'
      p Socket.gethostbyname("localhost")[3].unpack("C4").join(".")
      => "127.0.0.1"

      p "127.0.0.1".split(".").collect {|c| c.to_i}.pack("C4")
      => "\177\000\000\001"

*

sockaddr_in structure

      require 'socket'
      p [Socket::AF_INET,
         Socket.getservbyname('echo'),
         127, 0, 0, 1].pack("s n C4 x8")
      => "\002\000\000\a\177\000\000\001\000\000\000\000\000\000\000\000"

ruby 1.7 feature: besides pack / unpack, you can also use the methods Socket.pack_sockaddr_in and Socket.unpack_sockaddr_in.
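
A minimal sketch of that pair of methods; port 7 is the echo service from the example above, written as a literal so that no service lookup is needed:

```ruby
require 'socket'

# Build a sockaddr_in for 127.0.0.1:7 and take it apart again.
sockaddr = Socket.pack_sockaddr_in(7, "127.0.0.1")
port, host = Socket.unpack_sockaddr_in(sockaddr)

p port   # 7
p host   # "127.0.0.1"
```

The packed string has the platform's own sockaddr_in layout, so it avoids hand-writing a template such as "s n C4 x8".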
*

Address of a '\0'-terminated string

The template characters "p" and "P" exist for interfacing with the C language layer (for example, ioctl).

      p ["foo"].pack("p")
      => "8\266\021\010"

The resulting string looks like garbage, but it is in fact the address of the string "foo\0" in binary form. You can convert it into a more familiar form like this:

      printf "%#010x\n", "8\266\021\010".unpack("L")[0]
      => 0x0811b638

The object that the address points to ("foo\0" in this example) is guaranteed not to be garbage-collected until the result of pack is itself garbage-collected.

unpack("p") and unpack("P") can only be applied to a result of pack.

      p ["foo"].pack("p").unpack("p")
      => ["foo"]
      p "8\266\021\010".unpack("p")
      => -:1:in `unpack': no associated pointer (ArgumentError)
              from -:1

ruby 1.7 feature: "p" and "P" treat nil specially, interpreting it as a NULL pointer. (The following are typical results on a 32-bit machine.)

      p [nil].pack("p")        #=> "\000\000\000\000"
      p "\0\0\0\0".unpack("p") #=> [nil]

*

Address of a structure

For example, the structure

      struct {
        int a;
        short b;
        long c;
      } v = {1, 2, 3};

is represented by the string

      v = [1,2,3].pack("i!s!l!")

(Depending on byte alignment, appropriate padding may be needed.)

The address pointing to this structure can then be obtained with

      p [v].pack("P")
      => "\300\265\021\010"

Category: Ruby  Date: 2009-04-16
Copyright (C) codeweblog.com, All Rights Reserved.
