B4J Question XML/JSON Fields-Values Automatic Matching

Magma

Expert
Licensed User
Longtime User
Hi there,

This question (not strictly about B4J; it may be more about a tactic for managing something) came out of the Chit Chat thread https://www.b4x.com/android/forum/threads/because-you-can.167192/ while talking with my friend @aeric, who did a great job with his Web Server API (Pakai Framework).

Well..
In Greece we have about 30 different e-invoicing providers, each with a DIFFERENT JSON or XML format (MyDATA was the old/first public invoicing platform for everyone). The JSON documents have different structures and different fields, but sometimes, if you get the mapping right, you need less work to get the job done...

Sometimes you need loops to extract values from a POST, and sometimes you just need to match fields..

I am attaching a Word (docx) file in a zip with the formats of 3 different providers for the same type of invoice. Keep in mind that there are about 30 invoice types in Greece, there are also about 30 providers, and each may have a somewhat different workflow..

Having the same API for all of them would be great... but that is a dream...

My idea is to use some AI tech to magically MATCH them: feed the AI the XMLs, JSONs and examples,
and have it create templates that map between them...

Maybe with the power of Pakai (aeric's), the power of AI (there are many examples here of how to use it in B4X), and maybe the power of Erel's new features (using Python), I could create a web server wrapper or an offline wrapper to do my job better...

Automatic conversion: matching values and elements between XML and JSON...

What do you think? Keep in mind that some code would help... shall we try to create a project like this, "MAGIC-MATCH-XML/JSON"?
 

Attachments

  • timologia xml - json - parochos - ilyda - mydata.zip
    27.8 KB · Views: 114

aeric

Expert
Licensed User
Longtime User
Before that, what did AI suggest to you?
My answer may be the same as, similar to, or different from the AI's.

For me, the steps are the same.
Understand the requirements and think of a strategy to solve the problem.

Even though you have 30+ different formats, you can find what distinguishes each of them from the others.

Once you can differentiate them, they are easier to process.
You know the structure or tree.
You know which list or map to parse next at each level.

If you are using a server such as mine, you can make a POST with the input JSON data in the request body, and the server will process it according to the 30 different types.
At the end you get the output you want.
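As a rough illustration of "find what distinguishes them", here is a minimal Python sketch. The provider names and the top-level keys in `SIGNATURES` are made up for the example (they are not taken from any real provider), and `handle_post` only stands in for whatever a server handler would do with the request body.

```python
import json

# Hypothetical signatures: top-level keys assumed to be unique to each provider.
# Real providers will differ; these names are for illustration only.
SIGNATURES = {
    "provider_a": {"invoiceHeader", "invoiceLines"},
    "provider_b": {"doc", "items"},
}

def detect_provider(document: dict) -> str:
    """Return the first provider whose signature keys all appear at the top level."""
    keys = set(document)
    for name, signature in SIGNATURES.items():
        if signature <= keys:  # every signature key is present
            return name
    raise ValueError("unknown invoice format")

def handle_post(body: str) -> dict:
    """Stand-in for a server handler: parse the request body, then dispatch."""
    document = json.loads(body)
    provider = detect_provider(document)
    return {"provider": provider, "top_level_keys": sorted(document)}
```

Once the provider is known, the rest of the processing can branch on it, because the tree below the top level is then a known structure.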
 

aeric

A better option is to let the user explicitly call the API that matches the format.
Example:
When a user calls /api/invoice/type/1 with an invoice of type 1, it should be processed successfully.
If he submits an invoice of type 2 to the same API, an error is expected, such as "document type not matched".
Something like that...
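A minimal Python sketch of that check, assuming the document carries its type in a field called `invoiceType` (a hypothetical name) and that the type from the URL arrives as a string:

```python
class DocumentTypeMismatch(Exception):
    """Raised when the submitted document does not match the endpoint's type."""

def process_invoice(endpoint_type: str, document: dict) -> dict:
    # "invoiceType" is an assumed field name, used only for this example.
    actual = str(document.get("invoiceType"))
    if actual != endpoint_type:
        raise DocumentTypeMismatch("document type not matched")
    return {"status": "ok", "type": actual}
```

A handler for /api/invoice/type/1 would call `process_invoice("1", body)` and turn the exception into an error response.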
 

Magma

Well...

I think you didn't fully understand the "problem"...

I was thinking of a REST API server (perhaps Pakai) as middleware for 30 different REST API servers... and the invoice documents come in 30 different types...

The middleware would automatically translate between the different APIs. It would have 30 examples, one for each of the 30 invoice types, for each of the 30 REST APIs, and would try to transform/translate between the APIs... with matching and AI auto-matching as a start...

A wrapper between APIs, or middleware...
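To make the "template" idea concrete, here is a minimal Python sketch of a declarative mapping between two formats. The field paths in `A_TO_B` are invented for the example and do not describe any real provider; the point is only that a translation can be data (a table of target path to source path) rather than hand-written code per provider.

```python
def get_path(data, path):
    """Read a nested value using a dotted path such as 'invoiceHeader.aa'."""
    for key in path.split("."):
        data = data[key]
    return data

def set_path(data, path, value):
    """Write a nested value, creating intermediate dicts as needed."""
    keys = path.split(".")
    for key in keys[:-1]:
        data = data.setdefault(key, {})
    data[keys[-1]] = value

# Hypothetical template: target path in format B -> source path in format A.
A_TO_B = {
    "doc.number": "invoiceHeader.aa",
    "doc.issued": "invoiceHeader.issueDate",
    "seller.vat": "issuer.vatNumber",
}

def translate(source, template):
    """Build a new document by copying each mapped field."""
    target = {}
    for target_path, source_path in template.items():
        set_path(target, target_path, get_path(source, source_path))
    return target
```

Whether a template like this is written by hand or proposed by an AI and then corrected, the middleware itself only needs to apply it.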
 

aeric

Nowadays everything and everyone talks about AI, but the question is: do we really need AI for this task?

Think about it.
Sometimes we only need "Automation Integrated", and I believe that is what some software or middleware providers actually mean when they say they put AI into their products.

I can prove my claim above to you.
My answer is: develop the solution with just B4J.

You don't necessarily need AI, unless you want to market the product as being so powerful.
The fact is, the problem could be solved even before AI was a thing.
 

Magma

The problem is that the number "30" will not stop there... it will be 40, 50, 60... Maybe in some years it will drop to 10, and maybe our governors will think better of it and set limits and one API for all (that is the right solution)... but until then everything keeps changing, and maybe automation or AI is a solution...

Of course, everything AI can do, we can do ourselves... but sometimes the automation, the loop, is a job for robots :) ..
I am not saying I prefer it. Of course I can, or we can, train software from scratch to do that job... but every template needs a lot of work... why do you think it is preferable to do it all by hand?
 

aeric

For me, using AI is overkill for some "simple" tasks.
I may be wrong, but can you be 100% sure that you are not?
If you can say there are 30 types, there must be a way to tell them apart.
 

Magma

Well, it seems easy. I was not talking about the 30 types (that will change) but about the providers...

The types too... but the types differ not only in their elements/nodes; they may also have different validations and limits... yep, a different flow.

And yes, the final customization must be done by human eye and hand, but wouldn't a first pass of automation with AI be nice and helpful?

I am not talking about a continuous service... just about customizing templates and matching fields, elements, nodes and lists...
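To show what a "first pass, then human review" could look like, here is a minimal Python sketch. Note that it is not AI: it uses plain string similarity from the standard library (`difflib.SequenceMatcher`) as a simple stand-in for whatever matcher is eventually chosen. The threshold value is an arbitrary assumption.

```python
from difflib import SequenceMatcher

def suggest_matches(source_fields, target_fields, threshold=0.6):
    """Propose, for each source field, the most similar target field name.

    Fields whose best score is below the threshold map to None, so a
    human knows they must be matched by hand.
    """
    suggestions = {}
    for src in source_fields:
        best, best_score = None, 0.0
        for tgt in target_fields:
            score = SequenceMatcher(None, src.lower(), tgt.lower()).ratio()
            if score > best_score:
                best, best_score = tgt, score
        suggestions[src] = best if best_score >= threshold else None
    return suggestions
```

The output is only a draft template. Name similarity says nothing about validations or limits, so the final check stays with a person either way.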
 

aeric

If you want to stand firm on your statement that you must use AI to do it, I have nothing more to say.

But you could sit down and think:
Can I do it without AI?

I would say yes.
And it can even do a better or more efficient job than AI could.

Can you guarantee 100% that AI doesn't miss?

Let me ask you a question: say the government adds 10 new types today. Can your AI adapt to that instantly?
Without AI I would say it is much faster, as we (the humans) don't need to train on a new dataset which may not be available yet. We just adapt to the latest news in real time.
 

aeric

If you were talking about raw document recognition (such as a scanned PDF or image), you might need AI to do it.
Don't we already have OCR solutions for that?

But if we are talking about a format that has already been processed or submitted through an API client, it is expected to be well-formed, standard JSON or XML.

All we need to do is parse this input. We don't need to "recognize" it!
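A minimal Python sketch of "just parse it", using only the standard library. It flattens either a JSON or an XML document into a dictionary of path to value, which makes documents from different providers easy to compare side by side. The path notation is a choice made for this example, and XML attributes are ignored for brevity.

```python
import json
import xml.etree.ElementTree as ET

def flatten_json(value, prefix=""):
    """Flatten nested dicts/lists into {'a.b[0].c': value}."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten_json(child, f"{prefix}.{key}" if prefix else key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten_json(child, f"{prefix}[{index}]"))
    else:
        flat[prefix] = value
    return flat

def flatten_xml(element, prefix=""):
    """Flatten an element tree into {'root.child.leaf': text}.

    Simplification: repeated sibling tags overwrite each other, and
    namespaced tags keep their '{namespace}' prefix.
    """
    path = f"{prefix}.{element.tag}" if prefix else element.tag
    children = list(element)
    if not children:
        return {path: (element.text or "").strip()}
    flat = {}
    for child in children:
        flat.update(flatten_xml(child, path))
    return flat

def parse_invoice(text):
    """Pick the parser from the first non-blank character of the input."""
    text = text.strip()
    if text.startswith("<"):
        return flatten_xml(ET.fromstring(text))
    return flatten_json(json.loads(text))
```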
 

Magma

I am keeping your opinion in mind. It is valuable, and I don't disagree. The only thing I am not so sure about is the time... because time is always ticking (and never stops).

This is a real-life problem; actually it is a small war for companies.

Of course AI can make you fail... and its answers and options can always be wrong...

I will give the user the option, as the companies do too... because the more I think about it: is this really my problem, or the problem of the provider company that needs more accounts to survive?

Maybe a universal solution for matching those JSONs would be the right choice...

It needs a lot of thinking.
 

aeric

I don't know your experience with data processing.
I have worked with many different types of data.
I have even designed OCR recognition templates to detect fields in scanned PDFs using ABBYY Recognition Server.

I believe some tasks can simply be automated.

In the era of e-invoicing, we only deal with documents in JSON or XML.
I did my research on my local implementation.
Currently there may be 30 types of combinations, as you said.
That is nothing to get a headache over.

As a software engineer, you need to take errors into consideration.
What if the logic or engine you build fails, or the target server you are submitting to rejects your input?
You need to design a fallback to handle this situation.
You should not expect that the target (government server) will accept your generated output and return success 100% of the time.
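One possible shape for such a fallback, as a minimal Python sketch. The `send` callable and the `Rejected` exception are hypothetical placeholders for whatever client talks to the target server. The assumption here is that a rejection should go to a human, while a connection failure is worth retrying.

```python
class Rejected(Exception):
    """Hypothetical error: the target server refused the document."""

def submit_with_fallback(document, send, max_attempts=3):
    """Try to submit; retry connection failures, hand rejections to a human."""
    for attempt in range(1, max_attempts + 1):
        try:
            response = send(document)
            return {"status": "accepted", "attempts": attempt, "response": response}
        except Rejected as error:
            # Retrying the same payload will not help; keep the reason for review.
            return {"status": "rejected", "attempts": attempt, "reason": str(error)}
        except ConnectionError:
            continue  # transient failure: try again
    # Every attempt failed on the network side: park the document for later.
    return {"status": "queued", "attempts": max_attempts}
```

The caller can then route "rejected" results to manual correction and "queued" results to a later resubmission, instead of assuming success.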
 

Magma

The target is not a government server... but 30 different providers' servers with different flows and APIs :)
 

aeric

Alright. Do they make changes every day?
Do they make announcements?

What does AI do differently from a human?
(Joke: Humans are lazy, AI is not)

So you need an MCP or agentic AI to monitor unexpected changes in real time?

I believe you need to integrate with the providers. I mean communication. No?
 

Magma

"I believe you need to integrate with the providers. I mean communication. No?"
Of course.

"What does AI do differently from a human?"
I am not saying AI is better... but it can certainly do the first pass.

"Alright. Do they make changes every day?"
When something is new, there are always changes. (It has been about 4 years now in Greece, with changes every month. That is why big enterprises are not coming: because of those changes and, of course, the high taxes.)

Well... I am human, and I know that I am lazy :) But I think we are now losing time over which road to take. I am telling you that I will choose the road of development, with the AI road as an option for the end user, and yet we keep asking which is better, AI or human... Of course, as I have already said in other threads, I am not with those bots... even if they do 95% of back-office jobs...

We don't disagree... but I prefer to have an option for the end user, because, as I said, I am going to build the match/quiz app; I am not going to match all these providers myself...

So if the end user wants to go automatic, there will be an AI switch. Is that so bad?
 

epiCode

Active Member
Licensed User
Who needs this conversion?
How many conversions are needed with respect to time?
How often do they change format ?
 

Magma

Who needs this conversion?
Thousands of developers exist now, and more will come over the years.

How many conversions are needed with respect to time?
30~60*30*30 (conversions can be reversed, and can go from provider to provider).

How often do they change format?
They don't always change the format; more often they add new fields... every 2-3 months, plus new validations and new limits...
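To put rough numbers on that, here is a small Python calculation. It is only one possible reading of the figures above (30 to 60 providers, 30 invoice types, conversions in both directions). The "shared intermediate format" in the second function is an assumption added for comparison, not something proposed in this thread.

```python
def direct_converters(providers, invoice_types):
    """One converter per ordered provider pair (A -> B and B -> A count separately), per type."""
    return providers * (providers - 1) * invoice_types

def hub_converters(providers, invoice_types):
    """One converter into and one out of a shared intermediate format, per provider and type."""
    return 2 * providers * invoice_types

for n in (30, 60):
    print(n, direct_converters(n, 30), hub_converters(n, 30))
```

With 30 providers and 30 types, direct provider-to-provider conversion means 26,100 converters, against 1,800 when everything goes through one intermediate format, which is why the number of templates to maintain matters more than how each one is produced.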
 

aeric

I don't know whether it is appropriate to discuss this topic here in public.
It is so technical and specific.
I think it deserves a full consultation or project management.
I should have charged for consultation.

To help us understand the problem, maybe you had better clarify a few things. You are confusing me with many things.

Let us ask some 4W1H questions.
  1. Who are you?
  2. What is your role?
  3. What is your position in the flow?
  4. Who are the other parties in the flow?
  5. Who are the Providers?
  6. Who are the End Users?
  7. Who are the Developers you are talking about?
  8. What is the input you are dealing with?
  9. From whom do you get this input?
  10. What is the output from you?
  11. To whom do you need to send or transfer the output?
  12. How many types of input?
  13. How many types of output?
  14. Why are there so many types of input?
  15. Why are there so many types of output?
  16. When do you get this input?
  17. When do you send your output?
  18. What is your actual concern?
 

Magma

  1. Who are you? I am Magma... I think you know me...
  2. What is your role? To protect the PLANET, but for a start let's protect the developers of Greece :)
  3. What is your position in the flow? OK... my position is to develop a translation/matching app for JSONs, because I want to do it... free for all. Don't forget I am in the union and I want to help the developers of the union.
  4. Who are the other parties in the flow? OK... developers of any invoicing app <-> providers' servers (with different APIs and a different flow every time) <-> gov API (only for providers)
  5. Who are the Providers? BIG COMPANIES, or small companies trying their luck. The big companies also have their own invoicing apps :)
  6. Who are the End Users? The end users, for me, are developers (who in turn have their own end users: the customers using their apps)
  7. Who are the Developers you are talking about? Greek devs
  8. What is the input you are dealing with? ASK our GOV :)
  9. From whom do you get this input? ASK our GOV :)
  10. What is the output from you? ASK our GOV :)
  11. To whom do you need to send or transfer the output? ASK our GOV :)
  12. How many types of input? ASK our GOV :)
  13. How many types of output? ASK our GOV :)
  14. Why are there so many types of input? ASK our GOV :)
  15. Why are there so many types of output? ASK our GOV :)
  16. When do you get this input? ASK our GOV :)
  17. When do you send your output? ASK our GOV :)
  18. What is your actual concern? ---
If you mean whether it will make me rich... no, it will not. I don't think like that...

"I should have charged for consultation"
Me too :) many times...

I don't see the real question in there: is there already a tool that does this job?
 

aeric

"parties in the flow? ok... developers of any invoice app <-> providers servers / have differents api and flow every time <-> gov api (only for providers)"
If you can't draw the flow chart, can you list the parties from left to right, from start to end, for each step?

"ASK our GOV :)"
Since you didn't answer these questions and I cannot ask your government for the answers, I can only make my own assumptions.

I assume the flow is as follows:

Scenario 1
  1. End User (Buyer/Customer, usually the consumer of the products and services) makes a purchase with the Providers.
  2. Providers (Seller/Merchant, usually business owners providing products or services) generate the e-Invoice as JSON/XML and send it to the government server.
  3. Government e-Invoicing API Server (MYDATA?) validates the e-Invoice and sends the validated e-Invoice back to the Providers.
  4. Providers generate a validated e-Invoice for the End User.
  5. End User (who may use an ERP system, apps or custom software developed by Developers) receives it.

Scenario 2
The process continues from #5, where the End User can accept or reject the e-Invoice...
There are many other scenarios, but let us focus on Scenario 1 at this point.

I assume the Providers are the ones concerned with sales profit.
Hence they are also the party making requests to, or consuming, the APIs provided by the government API server (MYDATA?).
Meanwhile, the End User is concerned with tax submission or claiming a tax refund, where the e-Invoice is proof of purchase.

Here you want to slot in as an intermediate Middleware provider, so the flow changes to this:
  2. Providers (Seller/Merchant, usually business owners providing products or services) generate the e-Invoice as JSON/XML and send it to the Middleware provider.
  3. Middleware provider accepts different or non-standard types of e-invoices, then outputs the correct format required by the API of the government server.
  4. Government e-Invoicing API Server (MYDATA?)

So your actual concern is to make the Providers' lives easier by providing an intermediate service that makes API submission smoother.

Am I right so far?
 