B4J Question Best way to get 15K integers from spreadsheet to B4J

Discussion in 'B4J Questions' started by SeaBee, Jul 23, 2019.

  1. Philip Chatzigeorgiadis

    Philip Chatzigeorgiadis Member Licensed User

    Salty and elegant!
  2. SeaBee

    SeaBee Member Licensed User

    That, on the face of it, appears verbose in the extreme, but, as you say, it rather depends on what the JIT does to it.

    It would also appear that all-positive numbers, and very long byte arrays, can be stored more efficiently. It makes me wonder if it would be worthwhile to add 100 to all the arguments, and then create several large arrays of byte arrays with the x and y reversed. It would probably make only a marginal saving, though, in return for more complex code, so it is probably not worth it. I guess it's the price of using a BASIC-to-Java translator/compiler. My problem is that I have written so much stuff in various flavours of BASIC that I can code much faster in it than in other, less verbose languages like C#.

    Anyway, I hate squiggly brackets!

    I shall now build a data formatting program which will use your method of extraction from Excel to build the array of byte arrays, then convert to ASCII, and then obfuscate into an external file using something less extreme than Hamlet - maybe the opening paragraph of Alan Turing's 'On Computable Numbers, with an Application to the Entscheidungsproblem' of 1936. :eek::D
  3. Erel

    Erel Administrator Staff Member Licensed User

    I haven't read all the posts; however, you should never put the data inside the code. Of all the possible approaches, this is the worst option.

    It is trivial to load a CSV file, a JSON file, a database file, a text file, a B4XSerialized file or any other way you like.
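
    As a rough illustration of the text-file route, here is a minimal sketch that reads one polynomial per line from a comma-separated file. The file name "coefficients.csv" and the Sub name LoadCoefficients are made up for the example, and it assumes the file has been added to the project's Files folder.

    ```b4x
    ' Sketch only: "coefficients.csv" and LoadCoefficients are hypothetical names.
    ' Returns a List in which each item is an Int array holding one row of the file.
    Sub LoadCoefficients As List
        Dim rows As List
        rows.Initialize
        ' File.ReadList returns one String per line of the file.
        For Each line As String In File.ReadList(File.DirAssets, "coefficients.csv")
            If line.Trim.Length = 0 Then Continue    ' skip blank lines
            Dim fields() As String = Regex.Split(",", line)
            Dim coeffs(fields.Length) As Int
            For i = 0 To fields.Length - 1
                coeffs(i) = fields(i).Trim           ' String is converted to Int on assignment
            Next
            rows.Add(coeffs)
        Next
        Return rows
    End Sub
    ```

    A non-numeric field would raise an error on the assignment, which is probably what you want for data that must never change.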
  4. emexes

    emexes Well-Known Member Licensed User

    That ups the limit to 596 polynomials x 6 coefficients ~= 3500 coefficients total, per method. Tested working with five sets of 596 polynomials ~= 17000 coefficients total. So it can be done, but might need some shoehorning.

    That might be enough of a reason to flip to the external file solution.

    The byte-coefficients-in-one-big-string is still a plausible option. If the string did ever grow > 64 kB (ie fourfold) then it could be split amongst methods and/or code files.

    It'd be useful to have a more detailed description than just "15k integers", like how large are the polynomials, does that size vary, how many are there, what are these subsets, etc. But presumably there must be limits to what you can reveal, so... no worries, we'll forge on in darkness :)
  5. SeaBee

    SeaBee Member Licensed User

    I came to that conclusion in my posts #15 and #22, purely for practical reasons - mostly size and performance related. These numbers are actually all constants, and I have over a thousand other constants already declared as such in the code. The difference is that the internal constants will be used thousands of times for each program run, whereas these will be used only a few times.

    As a matter of interest, as my degree is in mathematics, not computer science, why is it considered so bad to have data inside the code? The data cannot be changed under any circumstances unless the whole edifice is changed - which is why it has taken me so long to get this far.
  6. SeaBee

    SeaBee Member Licensed User

    Sorry - I didn't make myself clear. I meant adding 100 to each argument so all arguments are positive. This would of course mean that they would have to be Int values, as they would exceed the +127 limit.

    Now I have decided that I will have to use an external file, it becomes irrelevant.
  7. emexes

    emexes Well-Known Member Licensed User

    Understood. B4A isn't doing value range checking on the type here, and it all gets masked to fit in 8 bits. Although I didn't actually check the intermediate Java to confirm that the >127 values weren't causing additional code to be generated, as with the Double casts.

    Not that it matters. End result is that it did not much improve the initialization code-byte-per-data-byte ratio.
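
    For anyone following along, the masking can be sketched like this. It assumes, as described above, that an out-of-range literal is silently narrowed to a signed byte rather than rejected; the variable names are just for illustration.

    ```b4x
    ' Sketch only: shows a value above 127 wrapping in a signed Byte
    ' and being recovered with a bit mask.
    Dim b As Byte = 157                      ' narrowed to 157 - 256 = -99
    Log(b)
    Dim restored As Int = Bit.And(0xFF, b)   ' keep the low 8 bits: back to 157
    Log(restored)
    ```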
    Last edited: Jul 24, 2019
  8. SeaBee

    SeaBee Member Licensed User

    Yesterday I had to do real work i.e. for real money, so I didn't get much further forward. I formatted a small subset of the arguments in Excel and dumped them into a B4J program that ultimately created a series of character strings (UTF-8) in an external file, as I did for the old HP75C. When doing this, I realized I had another problem.

    HP did not use two's complement for bytes - they had a completely positive range from 0 to 255, with printable characters for each number. With Java, I just have the printable values 33 through 126.

    My next step is, therefore, to revisit my original polynomials and remove some of my optimizations, which will give me more polynomials but with smaller arguments. As the polynomials are additive, it will be a comparatively straightforward but time-consuming operation. Maths is what I do, anyway - or 'math' if you're an American! :D
  9. emexes

    emexes Well-Known Member Licensed User

    Map signed coeff to 35..253 then multiply by 128/127 to skip over 127
  10. SeaBee

    SeaBee Member Licensed User

    No need - there were only about a hundred polynomials where I had to remove the earlier optimization - I had optimized to get any particular group to less than 100 polynomials. By going back to the previous versions for the large-valued arguments I have reduced the range to -35 to +57, giving me a spread of 93 - just one to spare! I have a couple of hundred extra polynomials, and a lot more arguments, but hey - who's counting! :D
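
    The mapping from that range onto printable characters can be sketched as a fixed offset. The offset of 68 is an assumption chosen so that -35 lands on character code 33 and +57 on code 125; the actual offset used in the files is not stated, and the Sub names are illustrative.

    ```b4x
    ' Sketch only: the offset 68 is an assumption, not the value used in the real files.
    Sub CoefficientToChar(coeff As Int) As Char
        Return Chr(coeff + 68)      ' -35 -> code 33 ("!"), +57 -> code 125 ("}")
    End Sub

    Sub CharToCoefficient(c As Char) As Int
        Return Asc(c) - 68
    End Sub

    ' Encodes and decodes every value in the range and logs any mismatch.
    Sub CheckRoundTrip
        For coeff = -35 To 57
            If CharToCoefficient(CoefficientToChar(coeff)) <> coeff Then
                Log("Round trip failed for " & coeff)
            End If
        Next
    End Sub
    ```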
  11. emexes

    emexes Well-Known Member Licensed User

    I don't know about you, but these numeric discussions make me dizzy (with confusion, not excitement ;-)

    Post #30 brevity due to being written at red traffic light on way home.

    If you use UTF-8 then you can have character values up to 21 bits wide. Characters 0..127 are encoded as one byte ie 1:1, and characters 128..2047 are encoded using 2 bytes. There is already a String-to-Char-array function in B4X, and I'm pretty sure there will be an Asc() function to convert those Chars to Ints.
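
    A quick way to see the one-byte versus two-byte behaviour is to encode a couple of single-character strings and log the byte counts. This is only a sketch; the two code points are arbitrary examples.

    ```b4x
    ' Sketch only: compares the UTF-8 size of a character below 128 with one above it.
    Dim oneByte As String = Chr(65)          ' "A", code point 65
    Dim twoBytes As String = Chr(233)        ' "é", code point 233
    Log(oneByte.GetBytes("UTF8").Length)     ' 1
    Log(twoBytes.GetBytes("UTF8").Length)    ' 2
    ' Reading the character back gives the code point, not the raw bytes.
    Log(Asc(twoBytes.CharAt(0)))             ' 233
    ```

    So values up to 2047 would cost at most two bytes each on disk while still coming back as a single Char.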

    Depending on the risk in using new polynomials with smaller coefficients, it might be better to go back to using the original tried-and-trusted versions. Or maybe you're so close to the finish line that it's simpler to complete the track you're on.

    Are all the polynomials of the same degree, like: is it an array of 3000 polynomials x 5 coefficients each, or are there significant variations to the polynomial degrees?
  12. SeaBee

    SeaBee Member Licensed User

    It's all done! I now have 8 separate files containing all the arguments encoded as single (8 bit) UTF-8 characters, totalling a little over 16 kB in content, due to the additional polynomials and arguments from the de-optimization. The size on disk is less than 20 kB. RESULT!

    Incidentally, the reason I am doing this in B4J is that ultimately I want the app to run on an Android tablet as well as a laptop, so B4J is an obvious pathway. Had I used VB.NET, which I have been using since version 1, life would have been a lot easier. In fact, a lot of the maths involved was done in VB.NET, and I have already transferred some of the code to B4J and tested it.

    Anyway, thanks for all your help and encouragement - between us we have got the job done!
  13. emexes

    emexes Well-Known Member Licensed User

    Just a thought - call me Nervous Neddie if you like - but if it was me, I'd perhaps write a quick Sub that dumps the final coefficient arrays to the log, copy all lines, paste to Excel, data convert to columns, then compare them against the source just to make sure nothing got lost in the trip.

    I have found it to be a Law of Nature that stuff I check is Always Right, and stuff I don't check is Always Wrong. Go figure :-/

    But... time lost on the former is nothing compared to time lost on the latter.
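
    Something along these lines would do for the dump. It is a sketch that assumes the polynomials end up in a List whose items are Byte arrays; DumpCoefficients is a made-up name. Tab-separated lines paste straight into Excel columns.

    ```b4x
    ' Sketch only: assumes polys is a List in which each item is a Byte array.
    ' Logs one tab-separated line per polynomial.
    Sub DumpCoefficients(polys As List)
        For Each coeffs() As Byte In polys
            Dim sb As StringBuilder
            sb.Initialize
            For i = 0 To coeffs.Length - 1
                If i > 0 Then sb.Append(TAB)
                sb.Append(coeffs(i))
            Next
            Log(sb.ToString)
        Next
    End Sub
    ```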

  14. SeaBee

    SeaBee Member Licensed User

    Tomorrow I start to write the routine to suck the characters out of the files and put them into a binary array. I shall read the characters a row at a time, and each file can be logically split into two components to keep the Sub size down.

    I have actually done validity checks at each stage, so I am fairly confident the data is correct, but I shall match the input array to the output array. I also have some specific results calculated on a mainframe to 16 significant figures (using different algorithms). If my numbers match to 12, I shall have reached the required accuracy.
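
    One way to express the 12-figure comparison is a relative-error test, sketched below. The Sub name and the tolerance of 5 x 10^-13 are assumptions: that tolerance is one common reading of "agrees to 12 significant figures", not the only one.

    ```b4x
    ' Sketch only: treats two values as agreeing to 12 significant figures
    ' when their relative difference is below 5 x 10^-13 (an assumed tolerance).
    Sub AgreesTo12Figures(computed As Double, reference As Double) As Boolean
        If reference = 0 Then Return computed = 0
        Dim relativeError As Double = Abs(computed - reference) / Abs(reference)
        Return relativeError < 5 / Power(10, 13)
    End Sub
    ```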
  15. emexes

    emexes Well-Known Member Licensed User

    Sorry. Should have known I was preaching to the converted. Still, better twice than never.

    Only twelve?! What sort of a slipshod operation are you running?!?!
  16. emexes

    emexes Well-Known Member Licensed User

    Or, if the polynomials are of fixed length, perhaps just read the entire file into a single string, use .Replace() to delete all the spaces, Chr(13)s and Chr(10)s, then step through the string, e.g. if all polynomials have six terms:
    ' Strip spaces, carriage returns and line feeds so only coefficient characters remain
    BigString = BigString.Replace(" ", "").Replace(Chr(13), "").Replace(Chr(10), "")

    Dim ExpectedNumPolys As Int = BigString.Length / 6

    Dim NumPolys As Int = 0
    For I = 0 To BigString.Length - 6 Step 6
        ' Six characters per polynomial, taken in reverse order
        PolyWaffle(NumPolys) = Array As Byte( _
            CharToCoefficient(BigString.CharAt(I + 5)), _
            CharToCoefficient(BigString.CharAt(I + 4)), _
            CharToCoefficient(BigString.CharAt(I + 3)), _
            CharToCoefficient(BigString.CharAt(I + 2)), _
            CharToCoefficient(BigString.CharAt(I + 1)), _
            CharToCoefficient(BigString.CharAt(I + 0)))
        NumPolys = NumPolys + 1
    Next