B4J Library XLUtils / jPOI 5 - Read and write MS Excel workbooks

Status
Not open for further replies.
As written here, I plan to make it easier to read and write Excel workbooks.
The solution is based on three components:
- Apache POI - https://poi.apache.org/
Large open source project that provides APIs for Microsoft documents. Note that the files are accessed directly, it doesn't depend on the Excel program.
- jPOI - This is a wrapper for Apache POI.
- XLUtils - A b4xlib (B4J only) that adds more features and utilities.

jPOI is not a new library, however a new version is included here based on POI 5.0.0. The updated library is not 100% backward compatible with the previous version. It fixes several mistakes in the previous wrapper.
It should be simple to update.

Installation:

- Download POI dependencies and put them in the additional libraries folder: www.b4x.com/b4j/files/poi5_dependencies.zip
- Download XLUtils. It includes the updated jPOI. Put them in the additional libraries as well.

A few concepts that you need to understand:

- 0 based / 1 based indices - jPOI works with 0 based rows and columns. Cell A1 = (0, 0). Working with the indices can be annoying as no one knows the index of column M (and if you do then you surely don't know the index of column AQ).
XLUtils includes a type named XLAddress. It makes it easy to switch between the 0 based indices and the addresses "names", as they appear in Excel.
You can create an address in three ways:
B4X:
'identical result:
address = xl.AddressZero(0, 0)
address = xl.AddressName("A1")
address = xl.AddressOne("A", 1)
'another example:
address = xl.AddressZero(3, 4)
address = xl.AddressName("D5")
address = xl.AddressOne("D", 5)

- XLReader - Tool to read data from a workbook. Can read the complete workbook, a single sheet or one or more ranges.
Ranges are strings such as: "sheet1!F2:M100" or named ranges. Named ranges are set in Excel (search for name manager in Excel). This is a very useful feature.

1618990811140.png



- Empty or missing cells - In some cases Excel omits blank cells. There are no PoiCells behind empty cells. XLReader will fill such cells, when needed, with empty strings. Later when you read the data you can use result.GetDefault to replace empty cells with other values.
- Dates - Dates are stored as ticks, a Long typed number.
- Formulas - Excel stores a cached value together with the formula itself. PoiCell.ValueCached will return the cached value for formula cells. XLReader will also return the cached value.


XLReader - XL.Reader

XLReader is useful when you want to read data from an Excel workbook. You can read a workbook, a sheet or one or more ranges in a single line of code:
B4X:
'three examples
Dim result As XLReaderResult = XL.Reader.ReadRange(File.DirAssets, "Countries of the world.xlsx", "Sheet1!B3:E1000") 'read specific range
Dim result As XLReaderResult = XL.Reader.ReadRange(File.DirAssets, "Countries of the world.xlsx", "Countries") 'named range
Dim result As XLReaderResult = XL.Reader.ReadSheetByName(File.DirAssets, "Countries of the world.xlsx", "Sheet1") 'complete sheet
Call result.LogResult(True) to log useful information about the result.

XLReader will fill missing cells when needed. The actual behavior depends on whether reading a complete sheet or a range.
This is less important unless you access the underlying read data directly.

The recommended way to extract data from XLReaderResult is with the Get or GetDefault methods.
XLReaderResult.Get expects a XLAddress object and it returns the cell value. If the cell is empty or missing, it will return the value of XLReaderResult.DefaultEmptyCellValue (empty string by default).
GetDefault will return the passed default value if the cell is empty or missing.
B4X:
Dim Number As Int = result.Get(xl.AddressName("A4"))
Dim Number As Int = result.GetDefault(xl.AddressName("A4"), 12)
XLReaderResult also includes a TopLeft and BottomRight addresses fields. They can be used to find the end of document.

An example of reading data from a workbook and showing some of the data with B4XTable is attached.

The code:
B4X:
Dim result As XLReaderResult = XL.Reader.ReadRange(File.DirAssets, "Countries of the world.xlsx", "Sheet1!B3:E1000")
B4XTable1.AddColumn(result.Get(XL.AddressName("B4")), B4XTable1.COLUMN_TYPE_TEXT)
B4XTable1.AddColumn(result.Get(XL.AddressName("D4")), B4XTable1.COLUMN_TYPE_NUMBERS)
B4XTable1.AddColumn(result.Get(XL.AddressName("E4")), B4XTable1.COLUMN_TYPE_NUMBERS)
Dim TableData As List
TableData.Initialize
For Row1Based = 6 To result.BottomRight.Row0Based + 1
    Dim Country As String = result.Get(XL.AddressOne("B", Row1Based))
    If Country = "" Then Exit 'Not really needed in this case, but it is good practice.
    Dim Population As Int = result.Get(XL.AddressOne("D", Row1Based))
    Dim Area As Int = result.Get(XL.AddressOne("E", Row1Based))
    TableData.Add(Array(Country, Population, Area))
Next
B4XTable1.SetData(TableData)

1618992650765.png


Don't forget to download POI dependencies (link is above).

Tips:

- Building in debug mode is significantly faster as the POI jars are quite large and they are merged in release mode.
- More tutorials and examples: https://www.b4x.com/android/forum/pages/results/?query=xlutils

Updates:

- V2.04 - Word - new SetMargins method. This method, as SetLandscapeOrientation, depends on poi-ooxml-full (XL_FULL). See https://www.b4x.com/android/forum/t...d-write-ms-excel-workbooks.129969/post-832993
- V2.03 - Word - Adds support for setting the page orientation to landscape: https://www.b4x.com/android/forum/t...d-write-ms-excel-workbooks.129969/post-832993
- V2.02 - Word - bookmarks, page breaks and urls: https://www.b4x.com/android/forum/threads/xlutils-word-bookmarks-urls-and-page-breaks.131826/
- V2.01 - Adds support for merging cells in Word tables: https://www.b4x.com/android/forum/threads/xlutils-creating-ms-word-documents.131732/post-831464
- V2.00 - Adds support for creating MS Word documents: https://www.b4x.com/android/forum/threads/xlutils-creating-ms-word-documents.131732/
- V1.14 - Support for reading hyperlinks: https://www.b4x.com/android/forum/t...d-write-ms-excel-workbooks.129969/post-828352
- V1.13 - Support for generating PDF documents with XL.PowerShellConvertToPdf: https://www.b4x.com/android/forum/threads/xlutils-generate-pdf-reports.131119/
- v1.12 - New methods and properties: XLSheetWriter.SetFitToPage, PrintSetup and PrintArea.
- v1.11 - Support for tables: https://www.b4x.com/android/forum/threads/xlutils-creating-tables.130567/
- v1.10 - Styling implementation was rewritten. XLStyle.BoldFont renamed to FontBold.
- v1.05 - Adds support for conditional formatting: https://www.b4x.com/android/forum/threads/xlutils-conditional-formatting.130454/
- v1.04 - XLStyle.LinkFont renamed to XLStyle.FontLink to be consistent with the other font related methods. New XLWorkbookWriter.RemoveSheetAt method.
- XLUtils v1.03 - Adds support for outlining and links: https://www.b4x.com/android/forum/threads/xlutils-hyperlinks-and-outlining-grouping.130413/
- XLUtils v1.02 + jPOI 5.01 are attached. Many changes.

Tutorial about writing workbooks: https://www.b4x.com/android/forum/threads/xlutils-writing-excel-workbooks.130356/
 

Attachments

  • Example.zip
    50.9 KB · Views: 445
  • ColorsTable.xlsx
    9.8 KB · Views: 272
  • jPOI.zip
    18.2 KB · Views: 428
  • XLUtils.b4xlib
    25 KB · Views: 206
Last edited:

Erel

Administrator
Staff member
Licensed User
Internal packager:
The b4xlib includes the required configuration to create a standalone package, however it will only work with the next version of B4J.
For now you can add this long string to the main module to make the packager work:
B4X:
#PackagerProperty: IncludedModules = jdk.charsets
#PackagerProperty: AdditionalModuleInfoString =provides org.apache.poi.extractor.ExtractorProvider with org.apache.poi.extractor.MainExtractorFactory, org.apache.poi.ooxml.extractor.POIXMLExtractorFactory; provides org.apache.poi.sl.draw.ImageRenderer with org.apache.poi.sl.draw.BitmapImageRenderer, org.apache.poi.xslf.draw.SVGImageRenderer; provides org.apache.poi.ss.usermodel.WorkbookProvider with org.apache.poi.hssf.usermodel.HSSFWorkbookFactory, org.apache.poi.xssf.usermodel.XSSFWorkbookFactory;opens org.apache.xmlbeans.metadata.system.sXMLTOOLS; opens org.apache.xmlbeans.metadata.system.sXMLSCHEMA;opens org.apache.xmlbeans.metadata.system.sXMLLANG;opens org.apache.xmlbeans.metadata.system.sXMLCONFIG;opens org.apache.poi.schemas.ooxml.system.ooxml;
 

MicroDrie

Active Member
Licensed User
When I try to run the example the IDE show missing poi-ooxml-5.0.0.jar. I downloaded the POI 5.0.0. jars and put them in the additional library's folder. The next run shows an error missing C:\Program Files (x86)\Anywhere Software\B4J\libraries\xmlbeans-4.0.0.jar.

What did I do wrong?
 

Erel

Administrator
Staff member
Licensed User
New version with support for writing workbooks will be released tomorrow:
B4X:
Dim Workbook As XLWorkbookWriter = xl.CreateWriterBlank
Dim sheet1 As XLSheetWriter = Workbook.CreateSheetWriterByName("Sheet1")
Dim TitleStyle As XLStyle = Workbook.CreateStyle
TitleStyle.ForegroundColor(xl.COLOR_GREY_80_PERCENT).BoldFontColor(12, xl.COLOR_WHITE).HorizontalAlignment("CENTER")
sheet1.PutString(xl.AddressName("A1"), "Product").SetStyles(sheet1.LastAccessed, Array(TitleStyle, Workbook.CreateStyle.BorderLeft("THIN"))) 'title style + border
sheet1.PutString(xl.AddressName("B1"), "Price").SetStyle(sheet1.LastAccessed, TitleStyle)
sheet1.PutString(xl.AddressName("C1"), "Quantity").SetStyle(sheet1.LastAccessed, TitleStyle)
sheet1.PutString(xl.AddressName("D1"), "Total").SetStyles(sheet1.LastAccessed, Array(TitleStyle, Workbook.CreateStyle.BorderRight("THIN")))

Dim RowStyles As List = Array(Workbook.CreateStyle.ForegroundColor(xl.COLOR_LIGHT_TURQUOISE), Workbook.CreateStyle.ForegroundColor(xl.COLOR_WHITE))
Dim ProductStyle As XLStyle = Workbook.CreateStyle.BorderLeft("THIN")
Dim PriceStyle As XLStyle = Workbook.CreateStyle.DataFormat("$#,##0.0") 'taken from Excel custom formats
Dim QuantityStyle As XLStyle = Workbook.CreateStyle.DataFormat("#,##0")
Dim TotalStyle As XLStyle = Workbook.CreateStyle.DataFormat("$#,##0").BorderRight("THIN")

For i = 1 To 25
    Dim Row1Based As Int = 1 + i 'starting from row 2.
    Dim RowStyle As XLStyle = RowStyles.Get(Row1Based Mod RowStyles.Size)
    sheet1.PutString(xl.AddressOne("A", Row1Based), "Product #" & i).SetStyles(sheet1.LastAccessed, Array(ProductStyle, RowStyle))
    sheet1.PutNumber(xl.AddressOne("B", Row1Based), Rnd(0, 100000) / 100).SetStyles(sheet1.LastAccessed, Array(PriceStyle, RowStyle))
    sheet1.PutNumber(xl.AddressOne("C", Row1Based), Rnd(0, 5000)).SetStyles(sheet1.LastAccessed, Array(QuantityStyle, RowStyle))
    sheet1.PutFormula(xl.AddressOne("D", Row1Based), $"${xl.AddressOneToString("B", Row1Based)} * ${xl.AddressOneToString("C", Row1Based)}"$)
    sheet1.SetStyles(sheet1.LastAccessed, Array(TotalStyle, RowStyle))
Next
'add bottom border
Dim LastRow1 As Int = sheet1.LastAccessed.Row0Based + 1
sheet1.SetStylesToRange(xl.CreateXLRange(xl.AddressOne("A",LastRow1), xl.AddressOne("D", LastRow1)), Array(Workbook.CreateStyle.BorderBottom("THIN")))

'add sum field:
sheet1.PutFormula(xl.AddressOne("D", LastRow1 + 2), $"SUM(D2:${xl.AddressOneToString("D", LastRow1)})"$)
Dim BoldStyle As XLStyle = Workbook.CreateStyle.BoldFont(Workbook.DefaultFontSize)
sheet1.SetStyles(sheet1.LastAccessed, Array(TotalStyle, BoldStyle))
sheet1.PutString(xl.AddressOne("C", LastRow1 + 2), "Total:").SetStyle(sheet1.LastAccessed, BoldStyle)
'We forgot to add a title:
'shift the rows down
sheet1.ShiftRows(0, LastRow1 - 1, 1)
'merge the cells
sheet1.AddMergedRegion(xl.CreateXLRange(xl.AddressName("A1"), xl.AddressName("D1")))
'set the title
sheet1.PutString(xl.AddressName("A1"), "Important Products List")
sheet1.SetStyle(sheet1.LastAccessed, Workbook.CreateStyle.Font(15).HorizontalAlignment("CENTER"))
'Set the columns widths to fit the content
'To make sure that the "total" column is measured correctly we need to first evaluate the formulas:
Workbook.EvaluateFormulas

For col0 = xl.AddressName("A").Col0Based To xl.AddressName("D").Col0Based
    sheet1.AutoSizeColumn(col0)
Next
'save the workbook
Workbook.SaveAs("C:\Users\H\Downloads\", "Products.xlsx")

Result:

1619621412663.png


There are several concepts that developers should understand. Overall it is quite simple.
 

Erel

Administrator
Staff member
Licensed User
V1.02 released with many changes and support for writing workbooks:

 

Erel

Administrator
Staff member
Licensed User
- v1.10 - Styling implementation was rewritten. XLStyle.BoldFont renamed to FontBold.

XLUtils goes a long way to reuse cell styles automatically. The previous implementation was based on POI. It relied on the ability to get the current style properties and set them back without affecting the style. This works in 95%, but there are cases where it has unwanted side effects (for example the fill color misbehaves when adding a table).

Starting from v1.10 the styles work a bit differently.
XLSheetWriter.SetStyle / SetStyles / SetStylesToRange - existing style properties are discarded. The "set" methods should be used by default. If you want to set multiple XLStyles then use SetStyles. The advantage of these methods is that only the explicitly listed properties are actually set.

XLSheetWriter.AddStyle / AddStyles / AddStylesToRange - existing style properties are kept and the new style properties are applied. This is similar to the previous approach.
 

Erel

Administrator
Staff member
Licensed User

Erel

Administrator
Staff member
Licensed User
V1.14 - adds support for reading hyperlinks with XLReader.
B4X:
XL.Reader.FetchHyperlinks = True
Dim result As XLReaderResult = XL.Reader.ReadRange(...) 'or one of the other read methods
Dim hyperlink As XLHyperlink = result.GetHyperlink(XL.AddressName("B22"))
If hyperlink.IsInitialized Then
    Log(hyperlink.Address)
    Log(hyperlink.Label)
    Log(hyperlink.LinkType)
End If
 

Erel

Administrator
Staff member
Licensed User
Status
Not open for further replies.
Top