[Gambas-user] perl unpack

Caveat Gambas at ...1950...
Tue Sep 20 09:44:09 CEST 2011


Hi Ron_2nd

Thanks for the compliment but I'm not sure it's project ready... the
idea was just to show the principles.

I doubt this is either fast, efficient, or bug free!

I've assumed that everybody is perfect and that I'll never get any
invalid uuencoded data (bytes out of range, wrong size information,
missing padding).  You may want to add some kind of error checking here
and there.

Encoding will need a check on the size of the Byte[] you're trying to
encode, and for serious amounts of data it will need to break the data
up into smaller chunks.  Conventionally, each full line of uuencoded
data carries 45 bytes, so it starts with the length character 'M'
(32+45 = ascii 77 or 'M'); only the last line may be shorter.  If you
don't need to encode anything, then you're fine here.
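
A minimal sketch of that chunking, assuming the encodeUU function from
my earlier mail (quoted below) and assuming Byte[].Copy(Start, Length)
is available in your Gambas version.  The name encodeUULines is
hypothetical, it's not part of the posted code:

```gambas
' Hypothetical wrapper (an assumption, not part of the posted code):
' encodes a Byte[] of any size as a list of uuencoded lines
Private Function encodeUULines(source As Byte[]) As String[]

  Dim lines As New String[]
  Dim offset, size As Integer

  offset = 0
  While offset < source.Count
    ' Take at most 45 bytes, so every full line starts with "M" (32 + 45 = 77)
    size = Min(45, source.Count - offset)
    ' encodeUU writes the length byte and pads the last group itself
    lines.Add(encodeUU(source.Copy(offset, size)))
    offset = offset + size
  Wend

  Return lines

End
```

Each returned string is one data line; only the last one can start with
something other than 'M'.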

You can remove the declarations of the Integer idxThree and the Byte[]
threeChars from encodeUU (they're hangovers from when the routine did
base64 encoding).  Do you need base64 routines too? :-)

The decoding could well be incorporated into a framework that works on a
line by line basis, so you can decode a whole file if needed.  Then you
can receive binary files over an ascii connection.  You might need some
extra code to deal with the file header information (and something
similar for encoding but then writing the header, if you're going to do
whole files).
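
Something like this could serve as a starting point for the line by line
decoding.  It's only a sketch: decodeUUText is a hypothetical name, it
assumes the whole file is already in a string with "\n" line endings,
that the header looks like "begin <mode> <name>" and the trailer is
"end", and it relies on the decodeUU function quoted below (which still
prints its debug output):

```gambas
' Hypothetical wrapper (an assumption, not part of the posted code):
' decodes the text of a whole uuencoded file, one line at a time
Private Function decodeUUText(content As String) As Byte[]

  Dim result As New Byte[]
  Dim decoded As Byte[]
  Dim aLine, firstChar As String
  Dim idx As Integer

  For Each aLine In Split(content, "\n")
    firstChar = Left$(aLine, 1)
    ' Skip blank lines and the "begin <mode> <name>" header
    If aLine = "" Or Left$(aLine, 6) = "begin " Then Continue
    ' Stop at the trailer
    If aLine = "end" Then Break
    ' A length character of "`" or space means a line with zero bytes
    If firstChar = "`" Or firstChar = " " Then Continue
    ' Decode this line and append its bytes to the overall result
    decoded = decodeUU(aLine)
    For idx = 0 To decoded.Max
      result.Add(decoded[idx])
    Next
  Next

  Return result

End
```

Neither "begin" nor "end" can be mistaken for a data line, because their
first letters are above ascii 96 and so are not valid length characters.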

The approach of turning everything into strings of binary digits (that's
the intToBase(number, BASE_BINARY) call) and bolting it all together as
strings is fine from the perspective of seeing how it all works but for
speed you might want to switch to a more mathematical approach using
multiplication or bitwise operators.

The basic principle would be something like (for decoding)...

%<V5N9#H`

After the length byte (here it's % = Chr$(37) --> 37 - 32 = 5), the
first 4 bytes of encoded data are 60 (<), 86 (V), 53 (5) and 78 (N).

Take (60-32)x64x64x64 = 7340032
Take (86-32)x64x64    =  221184
Take (53-32)x64       =    1344
Take (78-32)x1        =      46

Total                 = 7562606

This is the number that sequence of four six bit 'bytes' represents.
(Note: Use 64 as your multiplier/divider for a 6-bit byte, 128 for a
7-bit byte and the familiar(?) 256 for an 8-bit byte).

Now we need to break the 4 6-bit bytes into 3 8-bit bytes... so we kind
of do the reverse, dividing first by 65536 (256x256), then 256, then
1...

7562606 / 65536   = 115 rem 25966   (s)
25966 / 256       = 101 rem 110     (e)
110 / 1           = 110             (n)

So reading downwards, you see 115, 101, 110... which is, of course, s...
e... n... 
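
The arithmetic above can be sketched as a small helper.  This is an
illustration rather than tested project code; decodeGroup and its
parameters are hypothetical names, and wanted must be 1, 2 or 3:

```gambas
' Hypothetical helper (not part of the posted code): turns one group of
' four uuencoded characters into up to three bytes using arithmetic only
Private Function decodeGroup(fourChars As String, wanted As Integer) As Byte[]

  Dim idx, total, divisor As Integer
  Dim result As New Byte[]

  ' Build the 24-bit number: each character contributes (ascii - 32),
  ' and multiplying by 64 moves the running total up one 6-bit place
  total = 0
  For idx = 1 To 4
    total = (total * 64) + (Asc(Mid$(fourChars, idx, 1)) - 32)
  Next

  ' Split the number into 8-bit bytes: divide by 65536, then 256, then 1
  ' (wanted must not exceed 3, or divisor would reach zero)
  divisor = 65536
  For idx = 1 To wanted
    result.Add((total \ divisor) Mod 256)
    divisor = divisor \ 256
  Next

  Return result

End
```

Calling decodeGroup("<V5N", 3) should give the bytes 115, 101 and 110
from the worked example.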

You'll have another 4 bytes (9#H`) of 6-bit to decode, but you only need
to get 2 8-bit bytes out of it...(you know that from the length byte,
it's 5 and you already decoded 3 characters).

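The characters 9, #, H and ` have ascii codes 57, 35, 72 and 96, so
after subtracting 32 we get 25, 3, 40 and 64.

25x64x64x64 = 6553600
3x64x64     =   12288
40x64       =    2560
64x1        =      64

Total       = 6568512

So 6568512 / 65536 = 100 rem 14912    (d)
14912 / 256        = 58 rem 64        (:)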

But then we stop after "d:" because we have all 5 characters.  The last
6-bit byte only feeds the third decoded byte of this group, which we
throw away (we do nothing with the last rem 64, so it could be rem 56 or
rem 35 or rem anything).  It's just padding: it can be ` like perl uses,
or space (most people use space), or any other character.  Try it with
the decode: %<V5N9#H` == %<V5N9#HQ == %<V5N9#HA == %<V5N9#H# or with a
space after the H...

As mentioned above, you could even do some clever bit manipulations with
AND, OR, XOR etc. but it's too early in the morning for me to work that
one out, perhaps I can leave that as an exercise for the reader...;-)
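
For what it's worth, a bitwise version might look something like the
sketch below.  It is untested and the name decodeGroupBits is
hypothetical; it assumes the Gambas Shl() and Shr() functions and the
bitwise And / Or operators on Integers.  Masking with 63 also means
that ` (96 - 32 = 64) is treated as 0, which is how it is normally meant
in uuencoded data:

```gambas
' Hypothetical helper (not part of the posted code): same job as the
' multiply/divide approach, but done with shifts and masks
Private Function decodeGroupBits(fourChars As String, wanted As Integer) As Byte[]

  Dim idx, total As Integer
  Dim result As New Byte[]

  total = 0
  For idx = 1 To 4
    ' Shift the running total up 6 bits and drop in the next 6-bit value;
    ' And 63 keeps only the low six bits
    total = Shl(total, 6) Or ((Asc(Mid$(fourChars, idx, 1)) - 32) And 63)
  Next

  ' Shift right by 16, 8 and 0 bits; And 255 keeps a single byte
  ' (wanted is expected to be 1, 2 or 3)
  For idx = 1 To wanted
    result.Add(Shr(total, 8 * (3 - idx)) And 255)
  Next

  Return result

End
```

Shifting left by 6 is the same as multiplying by 64, and shifting right
by 16 or 8 is the same as dividing by 65536 or 256.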

Kind regards,
Caveat

On Tue, 2011-09-20 at 07:42 +0200, Ron wrote:
> Great work!
> 
> I had a few pieces of VB code to start from, but they were all
> slightly different, and so were the results.
> This shows it's sometimes better to just start from the basic info and
> work from there line by line...
> 
> Thanks a lot!
> Going to put this in my project...
> 
> Regards,
> Ron_2nd.
> 
> 
> 2011/9/20 Caveat <Gambas at ...1950...>:
> > Don't stress too much over the `, it's just a kind of non-standard
> > padding character.  The % at the beginning of the string says we only
> > have 5 characters to decode so we shouldn't worry...we SHOULD always
> > have an exact multiple of 4 characters after the first length byte...
> > but some of them may not matter...
> >
> > This should do it:
> >
> > ======================================================================
> > Private Function decodeUU(codedStr As String) As Byte[]
> >
> >  Dim idx, idy, ptr As Integer
> >  Dim result As Byte[]
> >  Dim lengthUU, ascAChar, ascMin32 As Integer
> >  Dim binFour As String = ""
> >  ' First character's ascii code - 32 is the length
> >  ascAChar = Asc(Left$(codedStr, 1))
> >  ascMin32 = ascAChar - 32
> >  lengthUU = ascMin32
> >  Print "Expecting a length of: " & lengthUU
> >  ' Set the size of the result array
> >  result = New Byte[lengthUU]
> >  ' Initialise pointer into the result array
> >  ptr = 0
> >  ' Step through the uuencoded string character by character,
> >  ' starting at the 2nd character (1st is the length)
> >  For idx = 2 To Len(codedStr)
> >    ascAChar = Asc(Mid$(codedStr, idx, 1))
> >    ' Only include characters in the uuencode range (space to `),
> >    ' which skips line endings and other control characters
> >    If ascAChar > 31 And ascAChar < 97 Then
> >      ' Subtract 32 from the ascii code of the character
> >      ascMin32 = ascAChar - 32
> >      ' Assemble a block of four 6-bit values
> >      binFour = binFour & Right$("000000" & intToBase(ascMin32, BASE_BINARY), 6)
> >      ' Once we have 4 binary 6-bit 'characters' in our string
> >      If Len(binFour) = 24 Then
> >        ' Treat the 4 6-bit characters as 3 8-bit characters
> >        For idy = 1 To 3
> >          ' Make sure we don't go trying to convert more than the
> >          ' length says we have to
> >          If ptr < result.Length Then
> >            Print "Bin to convert: " & Mid$(binFour, 1 + ((idy - 1) * 8), 8)
> >            ' Converts each block of 8 bits to its decimal value and
> >            ' assigns to the output byte array
> >            result[ptr] = toInt(Mid$(binFour, 1 + ((idy - 1) * 8), 8), BASE_BINARY)
> >            Inc ptr
> >          End If
> >        Next
> >        ' Be sure to clear out binFour for the next unit of UUencoding
> >        binFour = ""
> >      End If
> >    End If
> >  Next
> >  Return result
> >
> > End
> > ======================================================================
> >
> > You probably need the routines to convert between bases too:
> >
> > ======================================================================
> > Private Function convertBase(numberIn As String, fromBase As Integer, toBase As Integer) As String
> >
> >  Dim value As Integer
> >  value = toInt(numberIn, fromBase)
> >  Return intToBase(value, toBase)
> >
> > End
> >
> > Private Function intToBase(numberIn As Integer, base As Integer) As String
> >
> >  Dim remain, numToDivide As Integer
> >  Dim result As String = ""
> >
> >  numToDivide = numberIn
> >  ' Peel off the lowest digit until nothing is left
> >  Do While numToDivide > 0
> >    remain = numToDivide Mod base
> >    numToDivide = numToDivide \ base
> >    result = DIGITS[remain] & result
> >  Loop
> >
> >  Return result
> >
> > End
> >
> > Private Function toInt(inputStr As String, base As Integer) As Integer
> >
> >  Dim idx, mult, result, value As Integer
> >  mult = 1
> >  For idx = Len(inputStr) To 1 Step -1
> >    ' If we're in a base with digits bigger than 9
> >    ' we need the Find to return 10 for A, 11 for B, 12 for C etc.
> >    value = DIGITS.Find(UCase(Mid$(inputStr, idx, 1))) * mult
> >    result = result + value
> >    mult = mult * base
> >  Next
> >  Return result
> >
> > End
> > ======================================================================
> >
> > And don't forget a few Consts for convenience:
> >
> > ======================================================================
> > Private Const TEST_STR As String = "%<V5N9#H`"
> > Private Const BASE_BINARY As Integer = 2
> > Private Const BASE_OCTAL As Integer = 8
> > Private Const BASE_DENARY As Integer = 10
> > Private Const BASE_HEX As Integer = 16
> > Private DIGITS As String[] = ["0", "1", "2", "3", "4", "5", "6", "7",
> > "8", "9", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L",
> > "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"]
> > ======================================================================
> >
> > Oh and just for fun, here's an encode function too... you will notice
> > that I encode "send:" CORRECTLY... LOL!
> >
> > ======================================================================
> > Private Function encodeUU(source As Byte[]) As String
> >
> >  Dim idx, idy, idxThree As Integer
> >  Dim result As String
> >  Dim aByte As Byte
> >  Dim aBinChar, binCharGroup As String
> >  Dim threeChars As Byte[]
> >  binCharGroup = ""
> >  result = result & Chr$(source.Count + 32)
> >  For idx = 0 To source.Max
> >    aByte = source[idx]
> >    ' Convert the byte to exactly 8 digits of binary
> >    ' so for e.g. pad 1 to become 00000001
> >    aBinChar = Right$("00000000" & intToBase(aByte, BASE_BINARY), 8)
> >    Print "aByte: " & aByte & " abinChar: " & aBinChar
> >    ' Add bytes together to make blocks of 3 8-bit characters
> >    binCharGroup = binCharGroup & aBinChar
> >    ' Pad if we're at the end of the string and don't have a full
> >    ' 3-char block
> >    If idx = source.Max Then
> >      binCharGroup = Left$(binCharGroup & "000000000000000000000000", 24)
> >    Endif
> >    If Len(binCharGroup) = 24 Then
> >      Print binCharGroup
> >      ' Now treat the 3 blocks of 8 bits like 4 blocks of 6 bits....
> >      For idy = 1 To 4
> >        Print "char: " & idy & " has value: " & (toInt(Mid$(binCharGroup, 1 + ((idy - 1) * 6), 6), BASE_BINARY) + 32)
> >        ' Append the Chr$ of the value of the 6-bit byte + 32 to our
> >        ' result
> >        result = result & Chr$(toInt(Mid$(binCharGroup, 1 + ((idy - 1) * 6), 6), BASE_BINARY) + 32)
> >      Next
> >      binCharGroup = ""
> >    Endif
> >  Next
> >  Return result
> >
> > End
> > ======================================================================
> >
> > Kind regards,
> > Caveat
> >
> > On Mon, 2011-09-19 at 13:24 +0200, Ron wrote:
> >> I'm trying to decode this with gambas, no luck, anyone has an idea?
> >>
> >> #!/usr/bin/perl
> >> print pack('u', "send:");
> >>
> >> %<V5N9#H`
> >>
> >> So decoding %<V5N9#H` should result in 'send:'
> >>
> >> The pack 'u' function does uuencoding, but none of the VB-like code
> >> I found reproduces the correct result; it struggles with the `...
> >>
> >>
> >> Thanks in advance!!
> >>
> >> Regards,
> >> Ron_2nd.
> >>
> >> _______________________________________________
> >> Gambas-user mailing list
> >> Gambas-user at lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/gambas-user
> >
> >
> >





