B4J Tutorial jPOI to the rescue...my case today...

Mashiane

Expert
Licensed User
Phew

What a long day. This morning I receive a request to consolidate 1.1GB worth of data with 511 files, excel spreadsheet files from a particular tab that has about 120+ columns and I should use a similar template to write all the consolidated rows.

Thanks to b4j and jPoi, following on Erels way of doing things, all went well, well with challenges..

Challenges & Solutions

C1. I'm running a 64 bit windows machine and I have 32 bit java jdk installed. I experienced heap size errors. The excel file sizes descend from 10MB up until 4MB at most.

S1. Install 64 Bit JDK and configure paths and in Main set

B4X:
#VirtualMachineArgs: -Xms3G -Xmx3G
C2. jPoi reads everything on the sheet rows and columns provided it has data. I'm running a sub that builds column names from a to z... this creates about 702 columns. The maximum number of columns in excel are in the range of 16K. I know for a fact that the columns in my spreadsheets are 120-, so I need to tweak my code to read the columns to 702. There is a file that shows column size of 802 and this crashes my code.

B4X:
'define the column names as a list from a > zz
Sub PoiColumnNames() As List
    Dim lst As List
    lst.initialize
    Dim alphalist1 As List = CreateList(",","a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z")
    Dim alphalist2 As List = CreateList(",","a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z")
    Dim l1Tot As Int = alphalist1.Size - 1
    Dim l2Tot As Int = alphalist2.Size - 1
    Dim l1cnt As Int
    Dim l2cnt As Int
    'first alphabets
    For l1cnt = 0 To l1Tot
        lst.Add(alphalist1.Get(l1cnt))
    Next
    'rest of the alphabets
    For l1cnt = 0 To l1Tot
        For l2cnt = 0 To l2Tot
            Dim strKey As String = $"${alphalist1.Get(l1cnt)}${alphalist2.Get(l2cnt)}"$
            lst.add(strKey)
        Next
    Next
    Return lst
End Sub
S2. Limit the number of columns read to 701

B4X:
return a list of excel rows as maps, each records key is the column name
'this reads all rows and columns in the excel sheet
Sub PoiGetRows(ws As PoiSheet) As List
    'define the column names list
    Dim colNames As List = PoiColumnNames
    Dim lst As List
    lst.Initialize
    Dim rowEnd As Int = ws.LastRowNumber
    Dim rowStart As Int = ws.FirstRowNumber
    Dim rowCnt As Int
    Dim celCnt As Int
    Dim celTot As Int
    'loop through each row
    For rowCnt = rowStart To rowEnd
        Dim row As PoiRow = ws.GetRow(rowCnt)
        row.IsInitialized
        'how many columns are there
        celTot = row.Cells.Size - 1
        If celTot > 701 Then celTot = 701
        'each row should be saved as a map using column alphabets
        Dim rowMap As Map
        rowMap.Initialize
        For celCnt = 0 To celTot
            'get the cellname from the list
            Dim cellName As String = colNames.Get(celCnt)
            'define the poicell
            Dim cellP As PoiCell = row.GetCell(celCnt)
            cellP.IsInitialized
            'get the cell value
            Dim cellValue As String = PoiGetCell(cellP)
            rowMap.Put(cellName,cellValue)
        Next
        lst.add(rowMap)
    Next
    Return lst
End Sub
For each row, this sub creates a map from a specified row number that will have a=data, b=data, c=data, d=data etc, where data is the actual contents of the file.

C3. We are using a template that is already defined with R1 - R5000, this has some empty rows.

S3 Because jPoi reads all available specified content, some and most of the rows are blank. On first run the database had 1M records from all consolidated records from the excel sheets, so we needed to mark some columns as required so that of the information is not available in those columns, the rows are not needed. This resulted in an overall 85K records with 120 columns and a 20MB sqlite db.

C4. Write the data back to the original template.
S4. This I quickly did manually with export to excel and then used copy and paste. My excel kept on hanging but will explore writing directly to a new template file next time.

*hand clap* for jPoi.

Ta!

PS: I'm using an excel module I wrote with consolidated methods for jPoi.


 

Attachments

Mashiane

Expert
Licensed User
C5: After submitting the data Im told that they also need all the data if the formular columns.. I had excluded these columns as my assumption was they they are formulars and will be generated automatically. Wrong assumption.

formulars.png


S5: Thanks to a quick response from the forum that I was pointed to the right direction.

This covers it. Now re-runing my script to extract the 85K rows again.
 
Top