B4J Question: Suggestions/Ideas to Retrieve Data from a SQL Server Express Database

walterf25

Hi all, I wanted to get everyone's suggestions or ideas about the most efficient way of retrieving all the data from a SQL Server Express instance running on an old PC. Unfortunately, this PC cannot be connected to the network because it is not supported by our I.T. department.

This PC is connected to an Arbin battery test system which runs 24/7 collecting information on different types of batteries. At the moment, the only way we have of retrieving the collected data is to connect an Ethernet cable directly from a laptop to the PC and use the Remote Desktop client; then we run a macro inside an Excel workbook provided by the manufacturer of the Arbin system, which runs different queries against the master database to retrieve the data pertaining to a specific test.

(Attached screenshots: Arbin2.JPG and Arbin.JPG)


The requirement now is the following:
  1. Retrieve the data stored in the main database, which resides on an old PC running Windows 7 that cannot be connected to the network.
  2. Upload the data to BigQuery tables.
And here is what I have started so far.

I have created a B4J server which runs on the old Windows 7 PC. This server opens a connection to the local SQL Server Express instance on that PC.
I have also created another B4J app which runs on my Dell laptop. This app sends HTTP requests to the B4J server; the server processes these requests and runs queries against the SQL Server Express master database. The queries select certain data from specific tables, and the results are sent back to the B4J app running on my laptop.
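
The laptop-side request is essentially just an HttpJob call; something like this (not my actual code, just a rough sketch assuming the jOkHttpUtils2 library; the IP address, port and the /export route are placeholders):

B4X:
Sub RequestTableExport(TableName As String)
    Dim job As HttpJob
    job.Initialize("", Me)
    'ask the B4J server on the old PC to export one table
    job.Download2("http://192.168.1.10:51042/export", Array As String("table", TableName))
    Wait For (job) JobDone(job As HttpJob)
    If job.Success Then
        Log("Server response: " & job.GetString)
    Else
        Log("Request failed: " & job.ErrorMessage)
    End If
    job.Release
End Sub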

The problem I am facing now is that some tables contain up to 7 GB of data, if not more. What I am doing on the B4J server side is writing each row of the results to a CSV file which gets saved to a shared folder, but as you can imagine, querying that much data results in an out-of-memory exception, not to mention that it can take a very long time to process the query. The Windows 7 PC only has 8 GB of RAM.

The out-of-memory exceptions happen while I iterate through the results. One thing I have thought about is iterating through the results in chunks: for example, iterate through the first 10k rows, write those rows to the CSV file, then iterate through the next 10k rows, and so on.
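
Something along these lines is what I have in mind for the chunked export (an untested sketch, assuming the jSQL library, SQL Server 2012 or later for OFFSET/FETCH, and placeholder table/column names; sql1 is the already-open connection):

B4X:
Sub ExportTableToCsv(sql1 As SQL, TableName As String, OrderColumn As String, Dir As String, FileName As String)
    Dim PageSize As Int = 10000
    Dim Offset As Int = 0
    Dim Writer As TextWriter
    Writer.Initialize(File.OpenOutput(Dir, FileName, False))
    Do While True
        Dim rs As ResultSet = sql1.ExecQuery2( _
            $"SELECT * FROM ${TableName} ORDER BY ${OrderColumn} OFFSET ? ROWS FETCH NEXT ? ROWS ONLY"$, _
            Array As Object(Offset, PageSize))
        Dim RowsInPage As Int = 0
        Do While rs.NextRow
            Dim sb As StringBuilder
            sb.Initialize
            For i = 0 To rs.ColumnCount - 1
                If i > 0 Then sb.Append(",")
                sb.Append(rs.GetString2(i))  'naive CSV - real values may need quoting / NULL handling
            Next
            Writer.WriteLine(sb.ToString)
            RowsInPage = RowsInPage + 1
        Loop
        rs.Close
        If RowsInPage < PageSize Then Exit  'last page reached
        Offset = Offset + PageSize
    Loop
    Writer.Close
End Sub

Only one page of rows is ever held in memory, so the 8 GB of RAM should no longer be the limit; the trade-off is that each page re-runs the ORDER BY, so an index on the ordering column matters.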

The idea is that once all the results from a query have been saved to CSV files, the B4J app running on my Dell laptop will read those files, upload them to a storage bucket, and then insert the data from those files into BigQuery tables with the same names.
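
For that upload step, something like this could work on the laptop (an untested sketch, assuming the jShell library and that the Google Cloud SDK tools gsutil and bq are installed and authenticated; the bucket name my-arbin-bucket and dataset arbin_data are placeholders):

B4X:
Sub UploadCsvToBigQuery(CsvDir As String, CsvFile As String, TableName As String)
    'copy the CSV to a Cloud Storage bucket
    'on Windows you may need the full path to gsutil.cmd / bq.cmd
    Dim gsutil As Shell
    gsutil.Initialize("gsutil", "gsutil", Array("cp", File.Combine(CsvDir, CsvFile), "gs://my-arbin-bucket/"))
    Dim r1 As ShellSyncResult = gsutil.RunSynchronous(600000)
    If Not(r1.Success) Then
        Log("gsutil failed: " & r1.StdErr)
        Return
    End If
    'load the uploaded file into a BigQuery table with the same name
    Dim bq As Shell
    bq.Initialize("bq", "bq", Array("load", "--source_format=CSV", "--autodetect", _
        "arbin_data." & TableName, "gs://my-arbin-bucket/" & CsvFile))
    Dim r2 As ShellSyncResult = bq.RunSynchronous(600000)
    If r2.Success Then
        Log("Loaded " & CsvFile & " into " & TableName)
    Else
        Log("bq load failed: " & r2.StdErr)
    End If
End Sub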

I hope this makes sense. What I am hoping for here are suggestions or other ideas about how else this can be accomplished. I have been working on this project on and off, but I would love to hear how you would tackle this issue. The main purpose is to have the B4J app on my laptop run at least once or twice a day and update the cloud, so that other engineers can access this data without having to manually go into the Arbin PC.

Thanks in advance, everyone; I can't wait to hear other people's opinions on this.

Walter
 

Magma

Hi there...

What about backing up the SQL Server Express database (at least for the first time) to a file, then copying that file to the target PC where you want the data and restoring it there?

The best utility for this job, tested and working perfectly, is SQLBackupAndFTP (free for up to 2 databases).
Do that for the first time, or manually if you prefer (there is a way to take an SQL dump), but I prefer this utility.

If the database is 8 GB, your work with this utility will be done in 5-10 minutes at most.

Then, day by day or hour by hour, continue with the approach you have already built (or you can set whatever period of time suits you; you know best).
 

DarkoT

Hi,
I would go in two directions. First, I totally agree with @Magma's idea about SQLBackupAndFTP, which I have used in my company for more than 4 years without any problem. It solves two things: you get a backup, and you have access to all the data in the database basically every x minutes. Or, if you can access the machine directly, you can run a backup script from the command prompt. If you want, here is how to make a backup easily with a script on SQL Server:

SQL:
-- Script for backup

DECLARE @BackupFile VARCHAR(100),
        @Folder VARCHAR(100),
        @FileName VARCHAR(100),
        @Db VARCHAR(40)

SELECT @Db = 'IPS_DAT',
       @Folder = 'C:\WORK\BACKUP\'

-- Backup file name will contain the day name
SELECT @BackupFile = 'BACKUP_' + @Db + '_' + Datename(dw,getdate()) + '.BAK'
SELECT @FileName = @Folder + @BackupFile


BACKUP DATABASE @Db TO  DISK = @FileName WITH NOFORMAT, INIT, NAME = @BackupFile, SKIP, REWIND, NOUNLOAD,  STATS = 10
GO

Store this script as, for example, backup.sql and create a *.bat or *.cmd file (like mybackup.cmd).

Code for running backup from Cmd:
Code:
@echo off
sqlcmd -S "servername" -U "username" -P "password" -i mybackup.sql

Another solution would be to write a web API that includes the jetty server and is connected to the DB; then you can access the DB directly over a web browser... If you need more info, I used @aeric's template for the web API...
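
A minimal handler for such a web API could look like this (just a sketch, assuming the jServer and jSQL libraries; the handler would be registered with srvr.AddHandler("/export", "ExportHandler", False), Main.sql1 is the open connection, and the table name must be validated in real code):

B4X:
'Handler class: ExportHandler
Sub Class_Globals

End Sub

Public Sub Initialize

End Sub

Sub Handle(req As ServletRequest, resp As ServletResponse)
    Dim TableName As String = req.GetParameter("table")
    resp.ContentType = "text/csv"
    Dim rs As ResultSet = Main.sql1.ExecQuery("SELECT * FROM " & TableName)
    Do While rs.NextRow
        Dim sb As StringBuilder
        sb.Initialize
        For i = 0 To rs.ColumnCount - 1
            If i > 0 Then sb.Append(",")
            sb.Append(rs.GetString2(i))
        Next
        resp.Write(sb.ToString & CRLF)  'rows are streamed to the client, not held in memory
    Loop
    rs.Close
End Sub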
 

emexes

The problem I am facing now is that some tables contain up to 7 GB of data, if not more

That's a lot of battery readings, even if it's in bytes, not table rows. Gig = 10^9 (or 2^30) = 1000 million, right?

@DarkoT's reply turned up while I was crafting my suggestion, and trumps all over it :rolleyes: but, just in case:
When we did long-term tests (1 week = 168 hours) at 10 Hz = 6 million records, we'd "compress" the results into groups of samples where the readings differed by at most some specified range. Perhaps you could try something like that, so that each row in your table had a start and end time, and "open"-high-low-"close" battery readings, instead of eg 5000 rows of near-identical data.

Or, if not for the table, at least for the transferred data.
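
Roughly like this, if it helps (an untested B4X sketch, assuming the readings come in as two parallel lists of timestamps and values, and Threshold is the allowed spread within one group):

B4X:
Sub CompressReadings(Times As List, Values As List, Threshold As Double) As List
    'Each output row: StartTime, EndTime, Open, High, Low, Close
    Dim Groups As List
    Groups.Initialize
    If Values.Size = 0 Then Return Groups
    Dim StartIndex As Int = 0
    Dim Hi As Double = Values.Get(0)
    Dim Lo As Double = Values.Get(0)
    For i = 1 To Values.Size - 1
        Dim v As Double = Values.Get(i)
        If Max(Hi, v) - Min(Lo, v) > Threshold Then
            'reading i does not fit - close the group at i-1 and start a new one
            Groups.Add(Array(Times.Get(StartIndex), Times.Get(i - 1), _
                Values.Get(StartIndex), Hi, Lo, Values.Get(i - 1)))
            StartIndex = i
            Hi = v
            Lo = v
        Else
            Hi = Max(Hi, v)
            Lo = Min(Lo, v)
        End If
    Next
    'close the final group
    Groups.Add(Array(Times.Get(StartIndex), Times.Get(Values.Size - 1), _
        Values.Get(StartIndex), Hi, Lo, Values.Get(Values.Size - 1)))
    Return Groups
End Sub

Thousands of near-identical rows collapse into one row this way, while the open/high/low/close values keep the information that actually matters.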

Oddly enough, I am testing some CR2032 cells at the moment, but with a multimeter, and resistor-held-on-by-wooden-clothespeg loads, with the cell voltages dropping at 2 mV per hour. Which I was originally horrified by, but in retrospect if 2.4 volts is the lower limit and they started at 3.04 volts loaded, then we're looking at 320 hours (tops) which is close to the "designed" expected test duration of two weeks (tops).
 

walterf25

When we did long-term tests (1 week = 168 hours) at 10 Hz = 6 million records, we'd "compress" the results into groups of samples where the readings differed by at most some specified range. Perhaps you could try something like that, so that each row in your table had a start and end time, and "open"-high-low-"close" battery readings, instead of eg 5000 rows of near-identical data.

Or, if not for the table, at least for the transferred data.
This is similar to what we are doing. Keep in mind that the data I am trying to retrieve is all the data since we started using this system, so we have various databases full of data from years ago; that's why each database is so big.

Walter
 

EnriqueGonzalez

If you can connect the two computers directly, then:

Create a B4J app that connects to the DB.
The same app connects to the DB on the other PC.
Transfer the data every 10,000 rows.
Create and close transactions every 100,000 rows, or use checkpoints (see the sketch below).
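
A rough sketch of that batched transfer (untested; assumes jSQL connections SrcSql and DstSql are already open to the two machines, and the DataPoint / TestTime / Voltage columns are placeholders for the real schema):

B4X:
Sub TransferTable(SrcSql As SQL, DstSql As SQL, TableName As String)
    Dim ChunkSize As Int = 10000
    Dim Pending As Int = 0
    Dim rs As ResultSet = SrcSql.ExecQuery($"SELECT DataPoint, TestTime, Voltage FROM ${TableName}"$)
    Do While rs.NextRow
        DstSql.AddNonQueryToBatch( _
            $"INSERT INTO ${TableName} (DataPoint, TestTime, Voltage) VALUES (?, ?, ?)"$, _
            Array(rs.GetLong("DataPoint"), rs.GetDouble("TestTime"), rs.GetDouble("Voltage")))
        Pending = Pending + 1
        If Pending >= ChunkSize Then
            'each batch of inserts is committed as a single transaction
            Wait For (DstSql.ExecNonQueryBatch("SQL")) SQL_NonQueryComplete (Success As Boolean)
            If Not(Success) Then Log("Batch insert failed")
            Pending = 0
        End If
    Loop
    rs.Close
    If Pending > 0 Then
        Wait For (DstSql.ExecNonQueryBatch("SQL")) SQL_NonQueryComplete (Success As Boolean)
    End If
End Sub

ExecNonQueryBatch wraps each chunk of inserts in one transaction; the SELECT on the source side could be paged the same way walterf25 described above if it is too large to stream in one go.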
 