We are living in a wold with a lot of changes/developments...
The following sentence is for any developer here:
I still support the idea for B4X Open Source Software even i have no time to breath (I know - you are in the same situation) - but i will give my little stone to help you because I know you will support me. I am still at the bottom, and I want ethical support and ofcourse a lot of money for everyday life... why the life is difficult ?
Now let's back to our Tutorial...
What you will need ?
- Need B4J, Python 3.12+, also need to install some PIP..
- Read some tutorials and how PyBridge - Python works with B4X... @Erel has everything we want...
- Need some knowledge of B4J Server Apps... not too much...
- And some time to understand how this will work...
What exactly doing this micro-app ?
Running a small light web server and the same time a Python Worker Service (one extra Server that is in background / changing ports for his life)
The first time running this worker you will see it running slow, loading LLM - is because the first time downloading the LLM, next times - no need more than 30secs~1min to load LLM into RAM.
So when user gonna type address from web-browser http://127.0.0.1:8080/convert?text=I want water - It will return a base64 text to browser
or just go at: http://127.0.0.1:8080/convert
But what is doing in back side?
Well as we ve already said a web server runs behind for the needings of show (browser)... also a websocket for the right use of Python script that is ready and waiting for conversions !
What pip installations need for LLM?
Do not forget that first time running my project - you ll have a delay because of LLM downloading...
But hey what is Text-Embedding ?
Text embedding is a technique in Artificial Intelligence that converts text (words, sentences, or entire documents) into a list of numbers, called vectors.
While humans read letters, computers only understand numbers. Embeddings bridge this gap by translating human language into a mathematical format that captures the meaning of the text, not just the spelling.
What exactly is it?
Imagine a giant multi-dimensional map where every piece of text has its own set of "coordinates."
Semantic Proximity: Words with similar meanings, like "king" and "queen," are placed close together on this map.
Contextual Capture: Modern embeddings can distinguish between "bank" (a river bank) and "bank" (a financial institution) based on the surrounding words.
Fixed Dimensions: Whether it is a single word or a whole paragraph, the output is usually a vector of a fixed length (e.g., 768 or 1536 numbers).
Why is it important?
Embeddings are the fundamental building blocks of modern Large Language Models (LLMs) and search engines.
1. Improved Search (Semantic Search)
Traditional search looks for exact keyword matches. Embedding-based search finds concepts. If you search for "feline care," the system can find articles about "cats" because their vectors are mathematically similar.
2. Efficiency in Machine Learning
Processing raw text is computationally expensive and difficult for algorithms. Vectors allow computers to perform fast mathematical operations (like calculating the distance between two points) to find similarities instantly.
3. Cross-Lingual Capabilities
Advanced models can map "apple" (English) and "μήλο" (Greek) to the same vector space. This allows a system to understand that they represent the same object, regardless of the language used.
4. Personalization
Streaming services and social media platforms use embeddings to represent your interests. If the vector of a new video is close to the vector of videos you’ve watched before, it gets recommended to you.
Common Use Cases
RAG (Retrieval-Augmented Generation): Providing AI models with specific company data by finding the most relevant documents via embeddings.
Clustering: Automatically grouping thousands of customer reviews into topics (e.g., "shipping issues," "product quality").
Anomaly Detection: Identifying "weird" text or potential fraud by spotting vectors that don't fit the usual patterns.
Hope you understand the whole thing...
and hope to understand why we must think more about B4X Open Source Software partnerships..
There is no way for one man/person to have all the knowledge - the knowledge must be shared if u wanna target to big projects....
If you found value in this project and believe in the bigger vision behind it, I invite you to support my work. Your contributions do more than just fund this tutorial—they give me the resources to scale my ideas, create better content, and take this entire journey to the next level. Let's build something great together.
So if someone want to spend big money for visions like mine jst tell me (PM) - to give my proffesional account, also giving a receipt for that.
The following sentence is for any developer here:
I still support the idea for B4X Open Source Software even i have no time to breath (I know - you are in the same situation) - but i will give my little stone to help you because I know you will support me. I am still at the bottom, and I want ethical support and ofcourse a lot of money for everyday life... why the life is difficult ?
Now let's back to our Tutorial...
What you will need ?
- Need B4J, Python 3.12+, also need to install some PIP..
- Read some tutorials and how PyBridge - Python works with B4X... @Erel has everything we want...
- Need some knowledge of B4J Server Apps... not too much...
- And some time to understand how this will work...
What exactly doing this micro-app ?
Running a small light web server and the same time a Python Worker Service (one extra Server that is in background / changing ports for his life)
The first time running this worker you will see it running slow, loading LLM - is because the first time downloading the LLM, next times - no need more than 30secs~1min to load LLM into RAM.
So when user gonna type address from web-browser http://127.0.0.1:8080/convert?text=I want water - It will return a base64 text to browser
or just go at: http://127.0.0.1:8080/convert
But what is doing in back side?
Well as we ve already said a web server runs behind for the needings of show (browser)... also a websocket for the right use of Python script that is ready and waiting for conversions !
What pip installations need for LLM?
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install sentence-transformers numpy
Do not forget that first time running my project - you ll have a delay because of LLM downloading...
But hey what is Text-Embedding ?
Text embedding is a technique in Artificial Intelligence that converts text (words, sentences, or entire documents) into a list of numbers, called vectors.
While humans read letters, computers only understand numbers. Embeddings bridge this gap by translating human language into a mathematical format that captures the meaning of the text, not just the spelling.
Imagine a giant multi-dimensional map where every piece of text has its own set of "coordinates."
Semantic Proximity: Words with similar meanings, like "king" and "queen," are placed close together on this map.
Contextual Capture: Modern embeddings can distinguish between "bank" (a river bank) and "bank" (a financial institution) based on the surrounding words.
Fixed Dimensions: Whether it is a single word or a whole paragraph, the output is usually a vector of a fixed length (e.g., 768 or 1536 numbers).
Embeddings are the fundamental building blocks of modern Large Language Models (LLMs) and search engines.
1. Improved Search (Semantic Search)
Traditional search looks for exact keyword matches. Embedding-based search finds concepts. If you search for "feline care," the system can find articles about "cats" because their vectors are mathematically similar.
2. Efficiency in Machine Learning
Processing raw text is computationally expensive and difficult for algorithms. Vectors allow computers to perform fast mathematical operations (like calculating the distance between two points) to find similarities instantly.
3. Cross-Lingual Capabilities
Advanced models can map "apple" (English) and "μήλο" (Greek) to the same vector space. This allows a system to understand that they represent the same object, regardless of the language used.
4. Personalization
Streaming services and social media platforms use embeddings to represent your interests. If the vector of a new video is close to the vector of videos you’ve watched before, it gets recommended to you.
RAG (Retrieval-Augmented Generation): Providing AI models with specific company data by finding the most relevant documents via embeddings.
Clustering: Automatically grouping thousands of customer reviews into topics (e.g., "shipping issues," "product quality").
Anomaly Detection: Identifying "weird" text or potential fraud by spotting vectors that don't fit the usual patterns.
Hope you understand the whole thing...
and hope to understand why we must think more about B4X Open Source Software partnerships..
There is no way for one man/person to have all the knowledge - the knowledge must be shared if u wanna target to big projects....
If you found value in this project and believe in the bigger vision behind it, I invite you to support my work. Your contributions do more than just fund this tutorial—they give me the resources to scale my ideas, create better content, and take this entire journey to the next level. Let's build something great together.
So if someone want to spend big money for visions like mine jst tell me (PM) - to give my proffesional account, also giving a receipt for that.
Attachments
Last edited: