[Gambas-user] perl unpack
Ron
ron at ...1740...
Tue Sep 20 10:22:54 CEST 2011
Hi Caveat,
I'm using the routine only to decode a small string sent in a telnet socket app.
I have changed it a bit so it doesn't return byte[] but a string instead.
FOR iIdy = 1 TO 3
  IF iPtr < iLengthUU THEN
    ' Convert each block of 8 bits to its decimal value and
    ' append the corresponding character to the result string
    sResult &= Chr(ToInt(Mid$(sBinFour, 1 + ((iIdy - 1) * 8), 8), 2))
    INC iPtr
  ENDIF
NEXT
And it seems to work OK for my purpose!
Some info: I use a Perl script to report web visits (it parses Apache
logs). It sends the results over a telnet socket to my main project. The
script comes from the MisterHouse project, and I wanted to create a
telnet socket interface for it so that other scripts can be used later.
I will see if I can build in some of the checks you suggested.
Regards,
Ron.
2011/9/20 Caveat <Gambas at ...1950...>:
> Hi Ron_2nd
>
> Thanks for the compliment but I'm not sure it's project ready... the
> idea was just to show the principles.
>
> I doubt this is either fast, efficient, or bug free!
>
> I've assumed that everybody is perfect and that I'll never get any
> invalid uuencoded data (bytes out of range, wrong size information,
> missing padding). You may want to add some kind of error checking here
> and there.
>
> Encoding will need a check on the size of the byte[] you're trying to
> encode and will need to break the data up into smaller chunks for
> serious amounts of data. Conventionally, each uuencoded line carries 45
> bytes, so its length character is 'M' (32+45 = ascii 77), until the last
> line, which may of course be shorter. If you don't need to encode
> anything, then you're fine here.
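
A minimal sketch of that chunking, assuming Gambas 3 and the encodeUU
function quoted further down in this thread. EncodeUULines is a made-up
name, and the 45-byte chunk size simply follows the 'M' convention
mentioned above; it has not been checked against real uuencode output.

```gambas
' Sketch only: splits the source into chunks of at most 45 bytes and
' encodes each chunk with the encodeUU function from this thread.
' EncodeUULines is a hypothetical helper, not part of the original code.
Private Function EncodeUULines(aSource As Byte[]) As String[]

  Dim aLines As New String[]
  Dim iStart, iCount As Integer

  For iStart = 0 To aSource.Max Step 45
    ' The last chunk may hold fewer than 45 bytes
    iCount = Min(45, aSource.Count - iStart)
    aLines.Add(encodeUU(aSource.Copy(iStart, iCount)))
  Next

  Return aLines

End
```

Each returned string is one uuencoded line; a full 45-byte chunk should
start with 'M', and an empty source gives an empty list.
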
>
> You can remove the declarations of the Integer idxThree and the Byte[]
> threeChars from encodeUU (they're hangovers from when the routine did
> base64 encoding). Do you need base64 routines too? :-)
>
> The decoding could well be incorporated into a framework that works on a
> line by line basis, so you can decode a whole file if needed. Then you
> can receive binary files over an ascii connection. You might need some
> extra code to deal with the file header information (and something
> similar for encoding but then writing the header, if you're going to do
> whole files).
>
> The approach of turning everything into strings of binary digits (that's
> the intToBase(number, BASE_BINARY) call) and bolting it all together as
> strings is fine from the perspective of seeing how it all works but for
> speed you might want to switch to a more mathematical approach using
> multiplication or bitwise operators.
>
> The basic principle would be something like (for decoding)...
>
> %<V5N9#H`
>
> You have 60 (<), 86 (V), 53 (5), 78 (N) in the first 4 bytes of encoded
> data (after the length byte, here it's % = Chr$(37) --> 37 -32 = 5)
>
> Take (60-32)x64x64x64 = 7340032
> Take (86-32)x64x64 = 221184
> Take (53-32)x64 = 1344
> Take (78-32)x1 = 46
>
> Total = 7562606
>
> This is the number that sequence of four six bit 'bytes' represents.
> (Note: Use 64 as your multiplier/divider for a 6-bit byte, 128 for a
> 7-bit byte and the familiar(?) 256 for an 8-bit byte).
>
> Now we need to break the 4 6-bit bytes into 3 8-bit bytes... so we kind
> of do the reverse, dividing first by 65536 (256x256), then 256, then
> 1...
>
> 7562606 / 65536 = 115 rem 25966 (s)
> 25966 / 256 = 101 rem 110 (e)
> 110 / 1 = 110 (n)
>
> So reading downwards, you see 115, 101, 110... which is, of course, s...
> e... n...
>
> You'll have another 4 bytes (9#H`) of 6-bit to decode, but you only need
> to get 2 8-bit bytes out of it...(you know that from the length byte,
> it's 5 and you already decoded 3 characters).
>
> 25x64x64x64 = 6553600
> 3x64x64 = 12288
> 40x64 = 2560
> 64x1 = 64
>
> So 6568512 / 65536 = 100 rem 14912 (d)
> 14912 / 256 = 58 rem 64 (:)
>
> But then we stop after "d:" as we have all 5 characters... so the last
> 6-bit byte has no impact on the final decoded string (we do nothing with
> the last rem 64, so it could be rem 56 or rem 35 or rem anything), it's
> just padding and can be ` like perl uses or space (most people use
> space)... but it can be any character. Try it with the decode %<V5N9#H`
> == %<V5N9#HQ == %<V5N9#HA == %<V5N9#H# or with a space after the H...
>
> As mentioned above, you could even do some clever bit manipulations with
> AND, OR, XOR etc. but it's too early in the morning for me to work that
> one out, perhaps I can leave that as an exercise for the reader...;-)
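
A rough sketch of that exercise, assuming Gambas 3 (for Shl, Shr and the
+= operator). DecodeUUBits and its variable names are hypothetical and
not part of the code in this thread. It packs four 6-bit values into one
integer with shifts, then pulls out up to three bytes, and it masks each
value with 63 so that a ` padding character counts as zero.

```gambas
' Sketch only: decodes one uuencoded line using shifts and masks
' instead of strings of binary digits.
Private Function DecodeUUBits(sCoded As String) As String

  Dim iLength, iPos, iIdx, iGroup, iDone As Integer
  Dim sResult As String

  ' Length byte: ascii code minus 32, masked to six bits
  iLength = (Asc(sCoded, 1) - 32) And 63
  iPos = 2

  While iDone < iLength
    ' Pack four 6-bit values into one 24-bit integer
    iGroup = 0
    For iIdx = 0 To 3
      iGroup = Shl(iGroup, 6) Or ((Asc(sCoded, iPos + iIdx) - 32) And 63)
    Next
    iPos += 4
    ' Unpack up to three 8-bit bytes, high byte first,
    ' stopping once the length byte is satisfied
    For iIdx = 2 To 0 Step -1
      If iDone < iLength Then
        sResult &= Chr(Shr(iGroup, iIdx * 8) And 255)
        Inc iDone
      Endif
    Next
  Wend

  Return sResult

End
```

With the test string from this thread, DecodeUUBits("%<V5N9#H`") should
give "send:". There is no validation here either, so truncated or
out-of-range input produces garbage rather than an error.
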
>
> Kind regards,
> Caveat
>
> On Tue, 2011-09-20 at 07:42 +0200, Ron wrote:
>> Great work!
>>
>> I had a few pieces of VB code to start from, but they were all slightly
>> different, and so were the results.
>> This shows it's sometimes better to just start from the basic info and
>> work from there line by line...
>>
>> Thanks a lot!
>> Going to put this in my project...
>>
>> Regards,
>> Ron_2nd.
>>
>>
>> 2011/9/20 Caveat <Gambas at ...1950...>:
>> > Don't stress too much over the `, it's just a kind of non-standard
>> > padding character. The % at the beginning of the string says we only
>> > have 5 characters to decode so we shouldn't worry...we SHOULD always
>> > have an exact multiple of 4 characters after the first length byte...
>> > but some of them may not matter...
>> >
>> > This should do it:
>> >
>> > ======================================================================
>> > Private Function decodeUU(codedStr As String) As Byte[]
>> >
>> >   Dim idx, idy, ptr As Integer
>> >   Dim result As Byte[]
>> >   Dim lengthUU, ascAChar, ascMin32 As Integer
>> >   Dim binFour As String = ""
>> >
>> >   ' First character's ascii code - 32 is the length
>> >   ascAChar = Asc(Left$(codedStr, 1))
>> >   ascMin32 = ascAChar - 32
>> >   lengthUU = ascMin32
>> >   Print "Expecting a length of: " & lengthUU
>> >   ' Set the size of the result array
>> >   result = New Byte[lengthUU]
>> >   ' Initialise pointer into the result array
>> >   ptr = 0
>> >   ' Step through the uuencoded string character by character,
>> >   ' starting at the 2nd character (1st is the length)
>> >   For idx = 2 To Len(codedStr)
>> >     ascAChar = Asc(Mid$(codedStr, idx, 1))
>> >     ' Only include characters in the uuencode range (space to `),
>> >     ' which skips control characters such as CR and LF
>> >     If ascAChar > 31 And ascAChar < 97 Then
>> >       ' Subtract 32 from the ascii code of the character
>> >       ascMin32 = ascAChar - 32
>> >       ' Assemble a block of four 6-bit values
>> >       binFour = binFour & Right$("000000" & intToBase(ascMin32, BASE_BINARY), 6)
>> >       ' Once we have 4 binary 6-bit 'characters' in our string
>> >       If Len(binFour) = 24 Then
>> >         ' Treat the 4 6-bit characters as 3 8-bit characters
>> >         For idy = 1 To 3
>> >           ' Make sure we don't go trying to convert more than the
>> >           ' length says we have to
>> >           If ptr < result.Length Then
>> >             Print "Bin to convert: " & Mid$(binFour, 1 + ((idy - 1) * 8), 8)
>> >             ' Converts each block of 8 bits to its decimal value and
>> >             ' assigns to the output byte array
>> >             result[ptr] = toInt(Mid$(binFour, 1 + ((idy - 1) * 8), 8), BASE_BINARY)
>> >             Inc ptr
>> >           End If
>> >         Next
>> >         ' Be sure to clear out binFour for the next unit of UUencoding
>> >         binFour = ""
>> >       End If
>> >     End If
>> >   Next
>> >   Return result
>> >
>> > End
>> > ======================================================================
>> >
>> > You probably need the routines to convert between bases too:
>> >
>> > ======================================================================
>> > Private Function convertBase(numberIn As String, fromBase As Integer, toBase As Integer) As String
>> >
>> >   Dim value As Integer
>> >
>> >   value = toInt(numberIn, fromBase)
>> >   Return intToBase(value, toBase)
>> >
>> > End
>> >
>> > Private Function intToBase(numberIn As Integer, base As Integer) As String
>> >
>> >   Dim remain, numToDivide As Integer
>> >   Dim result As String = ""
>> >
>> >   numToDivide = numberIn
>> >   Do While numToDivide / base > 0
>> >     remain = numToDivide Mod base
>> >     numToDivide = (Int)(numToDivide / base)
>> >     result = DIGITS[remain] & result
>> >   Loop
>> >
>> >   Return result
>> >
>> > End
>> >
>> > Private Function toInt(inputStr As String, base As Integer) As Integer
>> >
>> >   Dim idx, mult, result, value As Integer
>> >
>> >   mult = 1
>> >   For idx = Len(inputStr) To 1 Step -1
>> >     ' If we're in a base with digits bigger than 9
>> >     ' we need the Find to return 10 for A, 11 for B, 12 for C etc.
>> >     value = DIGITS.Find(UCase(Mid$(inputStr, idx, 1))) * mult
>> >     result = result + value
>> >     mult = mult * base
>> >   Next
>> >   Return result
>> >
>> > End
>> > ======================================================================
>> >
>> > And don't forget a few Consts for convenience:
>> >
>> > ======================================================================
>> > Private Const TEST_STR As String = "%<V5N9#H`"
>> > Private Const BASE_BINARY As Integer = 2
>> > Private Const BASE_OCTAL As Integer = 8
>> > Private Const BASE_DENARY As Integer = 10
>> > Private Const BASE_HEX As Integer = 16
>> > Private DIGITS As String[] = ["0", "1", "2", "3", "4", "5", "6", "7",
>> > "8", "9", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L",
>> > "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"]
>> > ======================================================================
>> >
>> > Oh and just for fun, here's an encode function too... you will notice
>> > that I encode "send:" CORRECTLY... LOL!
>> >
>> > ======================================================================
>> > Private Function encodeUU(source As Byte[]) As String
>> >
>> >   Dim idx, idy, idxThree As Integer
>> >   Dim result As String
>> >   Dim aByte As Byte
>> >   Dim aBinChar, binCharGroup As String
>> >   Dim threeChars As Byte[]
>> >
>> >   binCharGroup = ""
>> >   result = result & Chr$(source.Count + 32)
>> >   For idx = 0 To source.Max
>> >     aByte = source[idx]
>> >     ' Convert the byte to exactly 8 digits of binary
>> >     ' so for e.g. pad 1 to become 00000001
>> >     aBinChar = Right$("00000000" & intToBase(aByte, BASE_BINARY), 8)
>> >     Print "aByte: " & aByte & " abinChar: " & aBinChar
>> >     ' Add bytes together to make blocks of 3 8-bit characters
>> >     binCharGroup = binCharGroup & aBinChar
>> >     ' Pad if we're at the end of the string and don't have a full
>> >     ' 3-char block
>> >     If idx = source.Max Then
>> >       binCharGroup = Left$(binCharGroup & "000000000000000000000000", 24)
>> >     Endif
>> >     If Len(binCharGroup) = 24 Then
>> >       Print binCharGroup
>> >       ' Now treat the 3 blocks of 8 bits like 4 blocks of 6 bits....
>> >       For idy = 1 To 4
>> >         Print "char: " & idy & " has value: " & (toInt(Mid$(binCharGroup, 1 + ((idy - 1) * 6), 6), BASE_BINARY) + 32)
>> >         ' Append the Chr$ of the value of the 6-bit byte + 32 to our result
>> >         result = result & Chr$(toInt(Mid$(binCharGroup, 1 + ((idy - 1) * 6), 6), BASE_BINARY) + 32)
>> >       Next
>> >       binCharGroup = ""
>> >     Endif
>> >   Next
>> >   Return result
>> >
>> > End
>> > ======================================================================
>> >
>> > Kind regards,
>> > Caveat
>> >
>> > On Mon, 2011-09-19 at 13:24 +0200, Ron wrote:
>> >> I'm trying to decode this with Gambas, with no luck. Does anyone have an idea?
>> >>
>> >> #!/usr/bin/perl
>> >> print pack('u', "send:");
>> >>
>> >> %<V5N9#H`
>> >>
>> >> So decoding %<V5N9#H` should result in 'send:'
>> >>
>> >> The pack 'u' function does uuencoding, but all the VB-like code either
>> >> doesn't reproduce the correct result or struggles with the `...
>> >>
>> >>
>> >> Thanks in advance!!
>> >>
>> >> Regards,
>> >> Ron_2nd.
>> >>
>> >> _______________________________________________
>> >> Gambas-user mailing list
>> >> Gambas-user at lists.sourceforge.net
>> >> https://lists.sourceforge.net/lists/listinfo/gambas-user