Android Tutorial [B4X] CloudKVS - synchronized key / value store

CloudKVS solves a difficult and common problem. The user needs to work with online data, but because this is a mobile app we cannot assume that the device will always be connected to the remote server.


With CloudKVS the app always works with a local database. If the device can connect to the remote server then the local store will be synchronized with the online store.

The store is implemented as a key/value store. It is similar to a persistent Map collection.
The values are serialized with B4XSerializator.
The following types are supported as values:
Lists, Maps, Strings, primitives (numbers), user defined types and arrays (only arrays of bytes and arrays of objects are supported).
Combinations of these types are also supported (a list that holds maps, for example).
Custom types should be declared in the main module.

This is a cross platform solution. The clients can be implemented with B4A, B4i or B4J and the data can be shared between the different platforms.
Note that ClientKVS class source code is exactly the same on all three platforms.

Working with ClientKVS is almost as simple as working with a regular Map.

User field

To allow more flexibility items are grouped by a "user" field.
For example:
B4X:
ckvs.Put("User1", "Key1", 100)
Log(ckvs.Get("User1", "Key1")) '100
Log(ckvs.GetDefault("User2", "Key1", 0)) '0 because User2/Key1 is different than User1/Key1
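
As a rough mental model (written in Python rather than B4X, and not taken from the CloudKVS source), the store can be pictured as a map whose key is the pair (user, key). The class and method names below are hypothetical and exist only for illustration.

```python
# Conceptual model only: a dict keyed by (user, key) pairs.
# UserKeyStore is a made-up name, not part of CloudKVS.
class UserKeyStore:
    def __init__(self):
        self._items = {}

    def put(self, user, key, value):
        self._items[(user, key)] = value

    def get_default(self, user, key, default):
        # Lookup is an exact match on both parts, so it is case sensitive.
        return self._items.get((user, key), default)


store = UserKeyStore()
store.put("User1", "Key1", 100)
print(store.get_default("User1", "Key1", 0))  # 100
print(store.get_default("User2", "Key1", 0))  # 0, different user
print(store.get_default("user1", "Key1", 0))  # 0, different letter case
```

Under this picture, the same key name can hold unrelated values for different users without any conflict.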

The synchronization (from the remote store to the local store) is based on the user field.
The SetAutoRefresh method sets the user(s) that will be fetched from the remote store.

For example if we want to auto-synchronize the data of "User1":
B4X:
ckvs.SetAutoRefresh(Array("User1"), 5)
You can pass multiple users in the array. The second parameter is the interval measured in minutes.
This means that the client will check for new data every 5 minutes.
New data in this case means data that was uploaded from other clients.
The NewData event is raised when new data has been fetched from the remote server.
Note that auto-refresh is not relevant for local updates. Local updates are uploaded immediately (if possible).

Multiple clients can use the same 'user name'.
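
The refresh behaviour described in this section can be sketched as follows. This is a simplified Python model under stated assumptions, not the actual protocol: the change-id counter, the class names and the `changes_since` method are all hypothetical. It only illustrates the idea that a client asks for items of its subscribed users that changed since its last sync.

```python
class RemoteStore:
    """Hypothetical server: every write gets an increasing change id."""

    def __init__(self):
        self._next_id = 1
        self._rows = {}  # (user, key) -> (change_id, value)

    def put(self, user, key, value):
        self._rows[(user, key)] = (self._next_id, value)
        self._next_id += 1

    def changes_since(self, users, last_id):
        # Only rows of the requested users that are newer than last_id.
        return sorted(
            (cid, user, key, value)
            for (user, key), (cid, value) in self._rows.items()
            if user in users and cid > last_id
        )


class Client:
    def __init__(self, remote, users):
        self.remote = remote
        self.users = set(users)
        self.local = {}
        self.last_id = 0

    def refresh(self):
        """One auto-refresh tick; returns True when new data arrived."""
        changes = self.remote.changes_since(self.users, self.last_id)
        for cid, user, key, value in changes:
            self.local[(user, key)] = value
            self.last_id = max(self.last_id, cid)
        return bool(changes)


remote = RemoteStore()
client = Client(remote, ["User1"])
remote.put("User1", "Score", 10)   # uploaded by some other client
remote.put("User2", "Score", 99)   # not a subscribed user
print(client.refresh())  # True
print(client.refresh())  # False
print(client.local)      # {('User1', 'Score'): 10}
```

In this model a `True` result corresponds to the moment the NewData event would be raised.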

Defaults

GetDefaultAndPut:
B4X:
Dim Score As Int = ckvs.GetDefaultAndPut ("User1", "Score", 0)
If there is a User1/Score value in the local store then it will be returned. Otherwise it will return 0 and also put 0 in the store.
This is useful as it allows us later in the program to get the score with: ckvs.Get("User1", "Score").
Defaults put in the store are treated specially. If there is already a non-default value in the remote store then the default value will not overwrite the non-default value. The non-default value will be synchronized to the local store once there is a connection.
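
The rule for defaults can be pictured with a small Python model. It is only a sketch of the behaviour described above, not the server code; the `Remote` class, the `upload` method and the `is_default` flag are hypothetical names.

```python
class Remote:
    """Hypothetical remote store that remembers whether a value is a default."""

    def __init__(self):
        self.rows = {}  # (user, key) -> (value, is_default)

    def upload(self, user, key, value, is_default):
        existing = self.rows.get((user, key))
        # A default never replaces a value that was stored explicitly.
        if is_default and existing is not None and not existing[1]:
            return existing[0]
        self.rows[(user, key)] = (value, is_default)
        return value


remote = Remote()
# Another client already stored a real score.
remote.upload("User1", "Score", 250, is_default=False)
# This client starts offline with a default of 0 and uploads it later.
winner = remote.upload("User1", "Score", 0, is_default=True)
print(winner)  # 250
```

The returned value stands for what would be synchronized back to the local store once there is a connection.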

Notes & Tips

- CloudKVS is fault tolerant. The local store includes a 'queue' store which holds the changes that were not yet synchronized.
- For performance reasons it is better to use larger values (made of maps or lists) than to use many small items.
- In B4A it is recommended to initialize ClientKVS in the Starter service.
- The B4J server project can run as is. It accepts a single command line argument which is the port number. If you want to run it on a VPS: https://www.b4x.com/android/forum/threads/60378/#content
- In the examples the auto refresh interval is set to 0.1 (6 seconds). In most cases it is better to use larger intervals (1 minute+).
- The keys and user names are case sensitive.
- On older versions of Android there is a limit of 2 MB per field. You will see the following error if you try to put a larger value: java.lang.IllegalStateException: Couldn't read row 0, col 0 from CursorWindow
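
The first note, about the 'queue' store, can be illustrated with a short Python sketch. It is a conceptual model under assumptions, not the CloudKVS implementation: `OfflineFirstStore`, `send` and the `link` dictionary are hypothetical, and a real client would persist the queue in the local database rather than in memory.

```python
from collections import deque


class OfflineFirstStore:
    def __init__(self, send):
        self.local = {}
        self.queue = deque()  # changes not yet confirmed by the server
        self._send = send     # callable(user, key, value) -> True on success

    def put(self, user, key, value):
        # The local store is always updated first, then an upload is attempted.
        self.local[(user, key)] = value
        self.queue.append((user, key, value))
        self.flush()

    def flush(self):
        # Send queued changes in order; stop at the first failure.
        while self.queue:
            if not self._send(*self.queue[0]):
                return False
            self.queue.popleft()
        return True


link = {"online": False}  # stand-in for network availability
server = {}


def send(user, key, value):
    if not link["online"]:
        return False
    server[(user, key)] = value
    return True


store = OfflineFirstStore(send)
store.put("User1", "Score", 10)
store.put("User1", "Level", 2)
print(len(store.queue), server)  # 2 {}

link["online"] = True
store.flush()
print(len(store.queue), server)  # 0 {('User1', 'Score'): 10, ('User1', 'Level'): 2}
```

Reads always come from `local`, so the app keeps working while the queue waits for a connection.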


Projects

The three client projects and the server project are attached.
If you want to add this feature to an existing project then you need to add:
1. CloudKVS and CallSubUtils modules.
2. The two custom types Item and Task to the main module.
3. The following libraries are required: SQL, RandomAccessFile and HttpUtils2.

The server project depends on jBuilderUtils library. The library is attached.

Development Test Server

You can use this link for the ServerUrl during development:
https://www.b4x.com:51041/cloudkvs
Note that the messages are limited to 100k and more importantly the database is deleted every few days. The clients will stop updating after the database is deleted. You can delete the local database (or uninstall and install the app again) for the clients to work again.
Remember to use unique ids for the user value.
 

Attachments

  • B4J_ServerKVS.zip (2.8 KB)
  • B4i_ClientKVS.zip (8.7 KB)
  • jBuilderUtils.zip (2.3 KB)
  • B4J_ClientKVS.zip (6.4 KB)
  • B4A_ClientKVS.zip (11.6 KB)

Widget

Well-Known Member
Licensed User
Erel,

The CloudKVS looks like a really capable solution for simple cloud storage.
Although it can handle a lot of requests compared to most SQL solutions, have you given any thought to making CloudKVS scalable? This will add redundancy should the computer with the database fail, so the client request would then shift over to the next available CloudKVS server. It will also speed things up if one computer isn't fast enough. (The database speed may slow down depending on the size of the values being updated like binary data such as images.)

I only mention this because if an application does catch on and becomes popular, the developer doesn't want to see his work come crashing down because his backend server can't handle the volume or it crashed on him. He'd be pretty embarrassed and his company's reputation would be in tatters.

Would it be possible (or practical) to have several CloudKVS DB servers (running Sqlite for example) with a load balancer computer (CloudKVS LB) up front that is receiving the requests from the client, and redirecting the request to one of several CloudKVS DB servers on the same network?

Here is how I think it could work.

The load balancer computer would itself be a CloudKVS server (called CloudKVS LB where "LB"=Load Balancer) with a SQLite memory database that keeps the connection info of the user and a list of available CloudKVS DB servers that are operational (haven't crashed and are still responding), and redirects the request to the next available CloudKVS Db computer. Every 60 seconds (or even 5 minutes depending how current the data has to be between servers), each CloudKVS DB server would sync with the other CloudKVS DB servers on the network so they have the same data. Every 30 minutes it could even sync with another CloudKVS server that is at a remote location which could also be part of another bank of CloudKVS servers. When the client app running on the phone connects to the server, he would be given an updated list of available remote servers (IP Addresses) should the ISP hosting the CloudKVS go down the next time he tries to connect.

The neat part of all this, I think it can be done with just computers running CloudKVS. It would be like building a CloudKVS server farm from Lego blocks. Yes there will be some overhead every minute or so syncing between servers, but I think it will be small (due to larger transaction size) compared to the improved reliability and throughput for the client.

Will it work? Or do I have my head stuck in the clouds? :rolleyes:

TIA
 

Widget

Well-Known Member
Licensed User
The bottleneck is the database. It should be simple to switch to MySQL or a different SQL engine that performs better.

Synchronizing multiple CloudKVS instances is possible but will probably be challenging to implement correctly. I don't think that it is a good approach.

I actually used MySQL years ago on another project. It doesn't scale well either. Its MyISAM engine (the fastest) uses table locking. It is great at bulk loading data and building indexes, but it is not good for concurrency with more than around 25 users doing heavy updates. I've had MyISAM tables of up to 100 million rows and it worked reasonably well for a low number of connections. Its InnoDB engine uses record locking with transactions with better concurrency, but is much slower for < 25 connections because of its greater overhead per update. I don't believe InnoDB is free anymore for the community version. Since Oracle bought MySQL, they have tightened up its licensing policies. I will look for another solution that has better replication methods.
 

JakeBullet70

Well-Known Member
Licensed User
https://mariadb.org/

It is from the original developers of MySQL.

I only mention this because if an application does catch on and becomes popular, the developer doesn't want to see his work come crashing down because his backend server can't handle the volume or it crashed on him. He'd be pretty embarrassed and his company's reputation would be in tatters.

Its InnoDB engine uses record locking with transactions with better concurrency, but is much slower for < 25 connections because of its greater overhead per update.

I think that 25 users regarding the above statement (application does catch on and becomes popular) is not an issue. All in all just toss more RAM - move to SSD drives in your DB server.
 

Widget

Well-Known Member
Licensed User
InnoDb is a CPU hog and requires at least 6 cores. If you want very good performance from InnoDb then 36 cores is ideal. More RAM will of course help, and 32GB to 62GB is recommended for a db under heavy load. Unfortunately even with all this hardware thrown at InnoDb, you're not going to get ACID transactions. You still need to turn off flushing transactions at commit using "innodb_flush_log_at_trx_commit = 0" if you want any type of speed. Turning it on will reduce the speed of InnoDb by nearly 90%. Ouch!

MariaDB from Michael Widenius (MySQL author) is a good alternative to MyISAM and inexpensive. I haven't researched its TPS though. Might be worth a look. Thanks.
 

JakeBullet70

Well-Known Member
Licensed User
flushing transactions at commit using "innodb_flush_log_at_trx_commit = 0" if you want any type of speed. Turning it on will reduce the speed of InnoDb by nearly 90%.

That is a huge OUCH! I myself used MS SQL for many years at work (and still do) and only got into MySQL when I went to B4A.
Oh and @DonManfred has a MariaDB connector lib.
 

Widget

Well-Known Member
Licensed User
I don't want to get too far off topic, but what finally turned me off MySQL after using it for a few years was the licensing change when Oracle took over. When I asked a MySQL AB (aka Oracle) salesperson whether I could use the free community license on my own web server, he couldn't tell me for sure (or didn't want to tell me) but strongly suggested I pay for a MySQL license. At $700 (USD) per server, I decided to use only non-royalty based databases because I don't want any licensing hassles to come back and bite me. IMHO, there are alternatives with better transactional performance that are certainly cheaper, like PostgreSQL, MongoDB (NoSQL) and Firebird, to name a few. I believe commercial applications that use MariaDB for a database *may* still require a license depending on which libraries are included with your application. This may dissuade me from considering MariaDB (at least for commercial Windows apps that I write) and I will have to look into it further. https://mariadb.com/kb/en/mariadb/licensing-faq/

I am going to look into MongoDB for constructing a web service of my own or perhaps get it to work with CloudKVS. It has the ability to scale well with its built-in replication and is royalty free (Apache License v2.0). It should be a better solution than MySQL as a cloud database.
 

OliverA

Expert
Licensed User
How is it possible to refresh only the given key of a user?
As is, it is not. But it should not be necessary. You will only get keys that have been updated since the last sync. If that key is part of the update, you'll get it synced back down to the client.
 

incendio

Well-Known Member
Licensed User
What's the difference between this and jRDC? And how does its performance when sending/retrieving huge amounts of data compare to jRDC?
 

ocalle

Active Member
Licensed User
Excellent solution. A small question: can it be installed on Linux servers, such as Debian for example?
 