[Gambas-user] perl unpack

Ron ron at ...1740...
Tue Sep 20 10:22:54 CEST 2011


Hi Caveat,

I'm using the routine only to decode a small string sent in a telnet socket app.

I have changed it a bit so it doesn't return byte[] but a string instead.

        FOR iIdy = 1 TO 3
          IF iPtr < iLengthUU
            ' Convert each block of 8 bits to its character and append it to the result string
            sResult &= Chr(ToInt(Mid$(sBinFour, 1 + ((iIdy - 1) * 8), 8), 2))
            INC iPtr
          ENDIF
        NEXT

It seems to work OK for my purpose!
Some background: I use a Perl script to report web visits. It parses
Apache logs and sends the results over a telnet socket to my main
project. The script comes from the MisterHouse project, and I wanted to
create a telnet socket interface for it so that other scripts can be
used later.

I will see if I can build in some of the checks you suggested.
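
As a starting point for those checks, here is a minimal sketch in Python (not Gambas, only because it is quick to run and test). The function name check_uu_line and the exact rules are assumptions, based on the three problems Caveat lists below: bytes out of range, wrong size information, and missing padding.

```python
def check_uu_line(line: str) -> int:
    """Validate one uuencoded line and return the number of data bytes it claims.

    Hypothetical helper: raises ValueError for out-of-range characters,
    an incomplete group of four, or a length character that claims more
    data than the line actually holds.
    """
    if not line:
        raise ValueError("empty line: no length character")

    # Every character, including the length character, must lie between
    # space (32) and backtick (96).
    for ch in line:
        if not 32 <= ord(ch) <= 96:
            raise ValueError(f"character {ch!r} is outside the uuencode range")

    # The length character encodes the decoded size; backtick counts as 0.
    length = (ord(line[0]) - 32) % 64

    body = line[1:]
    # Encoded data must come in complete groups of four characters.
    if len(body) % 4 != 0:
        raise ValueError("encoded data is not a multiple of 4 characters")

    # Each group of four characters yields three bytes.
    if (len(body) // 4) * 3 < length:
        raise ValueError("length character claims more data than is present")

    return length
```

Calling check_uu_line on the test string from this thread returns 5; a line with the final padding character stripped is rejected.
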

Regards,
Ron.

2011/9/20 Caveat <Gambas at ...1950...>:
> Hi Ron_2nd
>
> Thanks for the compliment but I'm not sure it's project ready... the
> idea was just to show the principles.
>
> I doubt this is either fast, efficient, or bug free!
>
> I've assumed that everybody is perfect and that I'll never get any
> invalid uuencoded data (bytes out of range, wrong size information,
> missing padding).  You may want to add some kind of error checking here
> and there.
>
> Encoding will need a check on the size of the byte[] you're trying to
> encode and will need to break the data up into smaller chunks for
> serious amounts of data.  Conventionally, each uuencoded line carries 45
> bytes of data, so its length character is 'M' (32 + 45 = ASCII 77),
> until the last line, which may of course be shorter.  If you don't need
> to encode anything, then you're fine here.
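
The chunking described above could look roughly like this. It is a Python sketch rather than Gambas, and the names uuencode_line and uuencode_lines are made up for illustration. Like Caveat's encodeUU, it writes a zero value as a space, whereas Perl's pack writes a backtick.

```python
def uuencode_line(chunk: bytes) -> str:
    """Encode up to 45 bytes as one uuencoded line (no trailing newline)."""
    if len(chunk) > 45:
        raise ValueError("a uuencoded line holds at most 45 bytes")
    out = [chr(32 + len(chunk))]  # length character, 'M' for a full line
    # Pad with zero bytes to a multiple of three.
    padded = chunk + b"\x00" * (-len(chunk) % 3)
    for i in range(0, len(padded), 3):
        # Pack three 8-bit bytes into one 24-bit number ...
        n = padded[i] * 65536 + padded[i + 1] * 256 + padded[i + 2]
        # ... and split it into four 6-bit values, each offset by 32.
        for shift in (18, 12, 6, 0):
            out.append(chr(32 + (n >> shift) % 64))
    return "".join(out)


def uuencode_lines(data: bytes) -> list[str]:
    """Split data into 45-byte chunks and encode each chunk as one line."""
    return [uuencode_line(data[i:i + 45]) for i in range(0, len(data), 45)]
```

With 100 bytes of input this gives two full lines starting with 'M' and a shorter third line.
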
>
> You can remove the declarations of the Integer idxThree and the Byte[]
> threeChars from encodeUU (they're hangovers from when the routine did
> base64 encoding).  Do you need base64 routines too? :-)
>
> The decoding could well be incorporated into a framework that works on a
> line by line basis, so you can decode a whole file if needed.  Then you
> can receive binary files over an ascii connection.  You might need some
> extra code to deal with the file header information (and something
> similar for encoding but then writing the header, if you're going to do
> whole files).
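
A line-by-line framework of that kind might be sketched as follows, in Python rather than Gambas. It assumes the usual uuencode file layout, a "begin <mode> <name>" header line and a closing "end" line. The per-line decoder is passed in as decode_line (a hypothetical parameter, which could be a routine like decodeUU), so this part only deals with the header and trailer.

```python
from typing import Callable


def decode_uu_text(text: str, decode_line: Callable[[str], bytes]) -> tuple[str, bytes]:
    """Return (file name, data) for the first begin/end block in text."""
    name = None
    data = bytearray()
    for line in text.splitlines():
        if name is None:
            # Ignore everything before the "begin <mode> <name>" header.
            parts = line.split(" ", 2)
            if len(parts) == 3 and parts[0] == "begin":
                name = parts[2]
            continue
        if line == "end":
            return name, bytes(data)
        if line:
            # Each remaining line is one unit for the per-line decoder.
            data += decode_line(line)
    raise ValueError("no complete begin/end block found")
```

Keeping the line decoder separate means the same loop works whichever decoding approach is used underneath.
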
>
> The approach of turning everything into strings of binary digits (that's
> the intToBase(number, BASE_BINARY) call) and bolting it all together as
> strings is fine from the perspective of seeing how it all works but for
> speed you might want to switch to a more mathematical approach using
> multiplication or bitwise operators.
>
> The basic principle would be something like (for decoding)...
>
> %<V5N9#H`
>
> You have 60 (<), 86 (V), 53 (5), 78 (N) in the first 4 bytes of encoded
> data (after the length byte, here it's % = Chr$(37) --> 37 -32 = 5)
>
> Take (60-32)x64x64x64 = 7340032
> Take (86-32)x64x64    =  221184
> Take (53-32)x64       =    1344
> Take (78-32)x1        =      46
>
> Total                 = 7562606
>
> This is the number that sequence of four six bit 'bytes' represents.
> (Note: Use 64 as your multiplier/divider for a 6-bit byte, 128 for a
> 7-bit byte and the familiar(?) 256 for an 8-bit byte).
>
> Now we need to break the 4 6-bit bytes into 3 8-bit bytes... so we kind
> of do the reverse, dividing first by 65536 (256x256), then 256, then
> 1...
>
> 7562606 / 65536   = 115 rem 25966   (s)
> 25966 / 256       = 101 rem 110     (e)
> 110 / 1           = 110             (n)
>
> So reading downwards, you see 115, 101, 110... which is, of course, s...
> e... n...
>
> You'll have another 4 bytes (9#H`) of 6-bit to decode, but you only need
> to get 2 8-bit bytes out of it...(you know that from the length byte,
> it's 5 and you already decoded 3 characters).
>
> 25x64x64x64 = 6553600
> 3x64x64     =   12288
> 40x64       =    2560
> 64x1        =      64
>
> So 6568512 / 65536 = 100 rem 14912    (d)
> 14912 / 256        = 58 rem 64        (:)
>
> But then we stop after "d:" as we have all 5 characters... so the last
> 6-bit byte has no impact on the final decoded string (we do nothing with
> the last rem 64, so it could be rem 56 or rem 35 or rem anything), it's
> just padding and can be ` like perl uses or space (most people use
> space)... but it can be any character.  Try it with the decode %<V5N9#H`
> == %<V5N9#HQ == %<V5N9#HA == %<V5N9#H# or with a space after the H...
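
The arithmetic above translates almost line for line into code. This is a Python sketch, not Gambas, and decode_uu_arithmetic is a made-up name. One deliberate difference from the worked example: each value is reduced modulo 64, so a backtick counts as 0 rather than 64, which keeps the 24-bit total in range when a backtick appears earlier in a group.

```python
def decode_uu_arithmetic(line: str) -> bytes:
    """Decode one uuencoded line with multiplication and division only."""
    length = (ord(line[0]) - 32) % 64  # '%' -> 37 - 32 = 5
    body = line[1:]
    out = bytearray()
    for i in range(0, len(body) - 3, 4):
        a, b, c, d = ((ord(ch) - 32) % 64 for ch in body[i:i + 4])
        # Four 6-bit values make one 24-bit number.
        total = a * 64 * 64 * 64 + b * 64 * 64 + c * 64 + d
        # Split that number into three 8-bit bytes.
        first, rest = divmod(total, 65536)
        second, third = divmod(rest, 256)
        out += bytes([first, second, third])
    # Drop the bytes that only exist because of padding.
    return bytes(out[:length])
```

All the padding variants listed above decode to the same five bytes, because the last byte of the second group is cut off by the length.
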
>
> As mentioned above, you could even do some clever bit manipulations with
> AND, OR, XOR etc. but it's too early in the morning for me to work that
> one out, perhaps I can leave that as an exercise for the reader...;-)
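
One possible answer to that exercise, again as a Python sketch with a made-up function name: shifting left by 6 replaces the multiplication by 64, and masking replaces the remainders.

```python
def decode_group(four_chars: str) -> bytes:
    """Turn four uuencoded characters into three bytes using bit operations."""
    n = 0
    for ch in four_chars:
        # Keep the low 6 bits of (code - 32) and shift them into place.
        n = (n << 6) | ((ord(ch) - 32) & 0x3F)
    # Pull the three 8-bit bytes back out of the 24-bit value.
    return bytes([(n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF])
```

For the test string, the group <V5N gives the bytes for "sen", and 9#H` gives "d:" followed by one padding byte that the length byte says to ignore.
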
>
> Kind regards,
> Caveat
>
> On Tue, 2011-09-20 at 07:42 +0200, Ron wrote:
>> Great work!
>>
>> I had a few pieces of VB code to start from, but they were all slightly
>> different, and so were the results.
>> This shows it's sometimes better to just start from the basic info and
>> work from there line by line...
>>
>> Thanks a lot!
>> Going to put this in my project...
>>
>> Regards,
>> Ron_2nd.
>>
>>
>> 2011/9/20 Caveat <Gambas at ...1950...>:
>> > Don't stress too much over the `, it's just a kind of non-standard
>> > padding character.  The % at the beginning of the string says we only
>> > have 5 characters to decode so we shouldn't worry...we SHOULD always
>> > have an exact multiple of 4 characters after the first length byte...
>> > but some of them may not matter...
>> >
>> > This should do it:
>> >
>> > ======================================================================
>> > Private Function decodeUU(codedStr As String) As Byte[]
>> >
>> >  Dim idx, idy, ptr As Integer
>> >  Dim result As Byte[]
>> >  Dim lengthUU, ascAChar, ascMin32 As Integer
>> >  Dim binFour As String = ""
>> >  ' First character's ascii code - 32 is the length
>> >  ascAChar = Asc(Left$(codedStr, 1))
>> >  ascMin32 = ascAChar - 32
>> >  lengthUU = ascMin32
>> >  Print "Expecting a length of: " & lengthUU
>> >  ' Set the size of the result array
>> >  result = New Byte[lengthUU]
>> >  ' Initialise pointer into the result array
>> >  ptr = 0
>> >  ' Step through the uuencoded string character by character starting at the 2nd character (1st is the length)
>> >  For idx = 2 To Len(codedStr)
>> >    ascAChar = Asc(Mid$(codedStr, idx, 1))
>> >    ' Only accept characters in the uuencode range: space (32) up to backtick (96)
>> >    If ascAChar > 31 And ascAChar < 97 Then
>> >      ' Subtract 32 from the ascii code of the character
>> >      ascMin32 = ascAChar - 32
>> >      ' Assemble a block of four 6-bit values (Right$ keeps the low 6 bits, so backtick, 64, becomes 000000)
>> >      binFour = binFour & Right$("000000" & intToBase(ascMin32, BASE_BINARY), 6)
>> >      ' Once we have 4 binary 6-bit 'characters' in our string
>> >      If Len(binFour) = 24 Then
>> >        ' Treat the 4 6-bit characters as 3 8-bit characters
>> >        For idy = 1 To 3
>> >          ' Make sure we don't go trying to convert more than the length says we have to
>> >          If ptr < result.Length
>> >            Print "Bin to convert: " & Mid$(binFour, 1 + ((idy - 1) * 8), 8)
>> >            ' Converts each block of 8 bits to its decimal value and assigns to the output byte array
>> >            result[ptr] = toInt(Mid$(binFour, 1 + ((idy - 1) * 8), 8), BASE_BINARY)
>> >            Inc ptr
>> >          End If
>> >        Next
>> >        ' Be sure to clear out binFour for the next unit of UUencoding
>> >        binFour = ""
>> >      End If
>> >    End If
>> >  Next
>> >  Return result
>> >
>> > End
>> > ======================================================================
>> >
>> > You probably need the routines to convert between bases too:
>> >
>> > ======================================================================
>> > Private Function convertBase(numberIn As String, fromBase As Integer, toBase As Integer) As String
>> >
>> >  Dim value As Integer
>> >  value = toInt(numberIn, fromBase)
>> >  Return intToBase(value, toBase)
>> >
>> > End
>> >
>> > Private Function intToBase(numberIn As Integer, base As Integer) As String
>> >
>> >  Dim remain, numToDivide As Integer
>> >  Dim result As String = ""
>> >
>> >  numToDivide = numberIn
>> >  Do While numToDivide / base > 0
>> >    remain = numToDivide Mod base
>> >    numToDivide = (Int)(numToDivide / base)
>> >    result = DIGITS[remain] & result
>> >  Loop
>> >
>> >  Return result
>> >
>> > End
>> >
>> > Private Function toInt(inputStr As String, base As Integer) As Integer
>> >
>> >  Dim idx, mult, result, value As Integer
>> >  mult = 1
>> >  For idx = Len(inputStr) To 1 Step -1
>> >    ' If we're in a base with digits bigger than 9
>> >    ' we need the Find to return 10 for A, 11 for B, 12 for C etc.
>> >    value = DIGITS.Find(UCase(Mid$(inputStr, idx, 1))) * mult
>> >    result = result + value
>> >    mult = mult * base
>> >  Next
>> >  Return result
>> >
>> > End
>> > ======================================================================
>> >
>> > And don't forget a few Consts for convenience:
>> >
>> > ======================================================================
>> > Private Const TEST_STR As String = "%<V5N9#H`"
>> > Private Const BASE_BINARY As Integer = 2
>> > Private Const BASE_OCTAL As Integer = 8
>> > Private Const BASE_DENARY As Integer = 10
>> > Private Const BASE_HEX As Integer = 16
>> > Private DIGITS As String[] = ["0", "1", "2", "3", "4", "5", "6", "7",
>> > "8", "9", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L",
>> > "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"]
>> > ======================================================================
>> >
>> > Oh and just for fun, here's an encode function too... you will notice
>> > that I encode "send:" CORRECTLY... LOL!
>> >
>> > ======================================================================
>> > Private Function encodeUU(source As Byte[]) As String
>> >
>> >  Dim idx, idy, idxThree As Integer
>> >  Dim result As String
>> >  Dim aByte As Byte
>> >  Dim aBinChar, binCharGroup As String
>> >  Dim threeChars As Byte[]
>> >  binCharGroup = ""
>> >  result = result & Chr$(source.Count + 32)
>> >  For idx = 0 To source.Max
>> >    aByte = source[idx]
>> >    ' Convert the byte to exactly 8 digits of binary
>> >    ' so for e.g. pad 1 to become 00000001
>> >    aBinChar = Right$("00000000" & intToBase(aByte, BASE_BINARY), 8)
>> >    Print "aByte: " & aByte & " abinChar: " & aBinChar
>> >    ' Add bytes together to make blocks of 3 8-bit characters
>> >    binCharGroup = binCharGroup & aBinChar
>> >    ' Pad if we're at the end of the string and don't have a full 3-char block
>> >    If idx = source.Max Then
>> >      binCharGroup = Left$(binCharGroup & "000000000000000000000000", 24)
>> >    Endif
>> >    If Len(binCharGroup) = 24 Then
>> >      Print binCharGroup
>> >      ' Now treat the 3 blocks of 8 bits like 4 blocks of 6 bits....
>> >      For idy = 1 To 4
>> >        Print "char: " & idy & " has value: " & (toInt(Mid$(binCharGroup, 1 + ((idy - 1) * 6), 6), BASE_BINARY) + 32)
>> >        ' Append the Chr$ of the value of the 6-bit byte + 32 to our result
>> >        result = result & Chr$(toInt(Mid$(binCharGroup, 1 + ((idy - 1) * 6), 6), BASE_BINARY) + 32)
>> >      Next
>> >      binCharGroup = ""
>> >    Endif
>> >  Next
>> >  Return result
>> >
>> > End
>> > ======================================================================
>> >
>> > Kind regards,
>> > Caveat
>> >
>> > On Mon, 2011-09-19 at 13:24 +0200, Ron wrote:
>> >> I'm trying to decode this with Gambas, with no luck. Does anyone have an idea?
>> >>
>> >> #!/usr/bin/perl
>> >> print pack('u', "send:");
>> >>
>> >> %<V5N9#H`
>> >>
>> >> So decoding %<V5N9#H` should result in 'send:'
>> >>
>> >> The pack 'u' function does uuencoding, but none of the VB-like code I
>> >> found reproduces the correct result; it either fails or struggles
>> >> with the `...
>> >>
>> >>
>> >> Thanks in advance!!
>> >>
>> >> Regards,
>> >> Ron_2nd.
>> >>
>> >> _______________________________________________
>> >> Gambas-user mailing list
>> >> Gambas-user at lists.sourceforge.net
>> >> https://lists.sourceforge.net/lists/listinfo/gambas-user