Looking for a web site that tells me whether a word is a noun, verb, etc.

rleiman

Well-Known Member
Licensed User
Hi,

Is there a web site I can call that lets me send it a word and tells me whether that word is a noun, verb, etc.?

I'm hoping to find a site of that kind that returns XML, which I could then parse to extract the result.
 

rleiman

Well-Known Member
Licensed User
Maybe there is something out there like this service, which I use for geocoding, but for words instead:

http://where.yahooapis.com/geocode?q=522 cleermont dr. huntsville al
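A request like the one above contains spaces, so the query text generally needs to be URL-encoded before it is sent. The following is a minimal sketch in Python (not B4A) of that step only; `build_lookup_url` is a hypothetical helper name, and no request is actually made.

```python
from urllib.parse import urlencode


def build_lookup_url(base_url, text):
    """Return base_url with `text` attached as an encoded `q` parameter."""
    # urlencode escapes characters that are not valid in a query string
    # (spaces become "+").
    return base_url + "?" + urlencode({"q": text})


url = build_lookup_url(
    "http://where.yahooapis.com/geocode",
    "522 cleermont dr. huntsville al",
)
print(url)
```

The same helper would work for a word-lookup service, provided that service also takes its input as a `q` query parameter, which is an assumption here.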


I think you need to handle it manually, something like this.
Use Dictionary.com to check a word.

For example:
http://dictionary.reference.com/browse/hello
http://dictionary.reference.com/browse/computer

Read the HTML result, then find the tags that mark the part of speech.

From those you can tell whether the word is a noun or a verb.
 

rleiman

Well-Known Member
Licensed User
Since I only know how to do XML parsing, can you show me the code needed to call the web page, locate the tags in the returned HTML, and place the result in a variable?

Thanks.

 

NJDude

Expert
Licensed User
In this case you will have to do some "scraping": read the web page source and extract what you need, a sort of custom parsing if you will.

Take a look at the attached file; it's a bare-bones project that does what you need.

NOTE: Dictionary.com has an API, check HERE
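For readers without the attachment, the scraping idea described above can be sketched as follows. This is an illustration in Python rather than the attached B4A project, and the sample markup, including the `pos` class name, is an assumption made up for the example; it is not Dictionary.com's actual HTML.

```python
from html.parser import HTMLParser

# Hypothetical page fragment. The real page's tags and class names
# would have to be found by inspecting its source.
SAMPLE_HTML = """
<div class="entry">
  <h1>work</h1>
  <span class="pos">noun</span>
  <p>exertion or effort directed to produce something.</p>
  <span class="pos">verb</span>
  <p>to do work; labor.</p>
</div>
"""


class PartOfSpeechParser(HTMLParser):
    """Collects the text inside every <span class="pos"> element."""

    def __init__(self):
        super().__init__()
        self._in_pos = False
        self.parts_of_speech = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "span" and dict(attrs).get("class") == "pos":
            self._in_pos = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_pos = False

    def handle_data(self, data):
        # Keep only non-blank text found inside a matching span.
        if self._in_pos and data.strip():
            self.parts_of_speech.append(data.strip())


def extract_parts_of_speech(html):
    """Return the parts of speech found in `html`, in page order."""
    parser = PartOfSpeechParser()
    parser.feed(html)
    parser.close()
    return parser.parts_of_speech


print(extract_parts_of_speech(SAMPLE_HTML))
```

In a real app the HTML string would come from an HTTP request instead of a constant, and the tag test in `handle_starttag` would need updating whenever the site changes its markup.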
 


rleiman

Well-Known Member
Licensed User
Hi NJDude,

Thanks so much for that project. It does exactly what I'm looking for, and I will use it. I hope they don't change their web pages, because that would break the app. At least I can use scraping until my registration with Dictionary.com is accepted.

:cool:

 

nfordbscndrd

Well-Known Member
Licensed User
I don't have a web site like the one you want, but I have a database of about 160,000 words with their parts of speech. (Note that many words have more than one possible part of speech; e.g., "work" can be a verb or a noun.) The table includes all forms of each word, such as "work, works, worked, working".

An SQL version of the file with just the words is over 16 MB. That is too big to distribute with an app, so your app would have to download it from your web site to each user's device.

For your purposes, you would have to extract the links to parts of speech from an Access 2007 database into SQL, for use with the Words SQL file, which I've already extracted. Let me know if you want the files.
 

rleiman

Well-Known Member
Licensed User
Yes, please upload the database, as it could be useful to me and other B4A developers.

Is there a way to convert it to SQLite?

 

nfordbscndrd

Well-Known Member
Licensed User
I converted the Words table to SQLite already. Attached is the code for an app which downloads the wordsdb.db (15 MB) to your device. The app does spell checking and some other things, but the words db file does not contain the parts of speech ("POS").

Here is a direct link to the words.db file in zip format (6 MB).

Here is a link to the Access 2007 data files (AIC Cortex.accdb and AIC Words.accdb): aic.zip (32 MB). The POS entries in the LinkTypes table in the AIC Cortex database start with ID# 30010.

The page at www.aeyec.com explains the database structure (amidst a lot of other info).

In short, the Cortex table (which contains no text, only numeric links) links a word entry's ID# from the Words table to a POS entry's ID# in the LinkTypes table, but the latter table contains many other link types in addition to POS. For your purposes, you can just extract the POS LinkTypes from the table.
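The three-table layout described above can be sketched with an in-memory SQLite database. Only the table names and the starting POS ID (30010) come from the post. The column names (`Word`, `Name`, `WordID`, `LinkTypeID`), the sample rows, and which POS gets which ID are assumptions for illustration and may not match the real AIC files.

```python
import sqlite3

POS_FIRST_ID = 30010  # from the post: POS entries start at this ID


def build_demo_db():
    """Create a tiny in-memory database with the assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Words (ID INTEGER PRIMARY KEY, Word TEXT);
        CREATE TABLE LinkTypes (ID INTEGER PRIMARY KEY, Name TEXT);
        -- Cortex holds no text, only numeric links between the tables.
        CREATE TABLE Cortex (WordID INTEGER, LinkTypeID INTEGER);

        INSERT INTO Words VALUES (1, 'work'), (2, 'hello');
        INSERT INTO LinkTypes VALUES
            (100, 'synonym of'),       -- a non-POS link type (made up)
            (30010, 'noun'),           -- hypothetical ID assignments
            (30011, 'verb'),
            (30012, 'interjection');
        INSERT INTO Cortex VALUES (1, 30010), (1, 30011), (1, 100), (2, 30012);
        """
    )
    return conn


def parts_of_speech(conn, word):
    """Return the POS names linked to `word`, skipping non-POS link types."""
    rows = conn.execute(
        """
        SELECT lt.Name
        FROM Words AS w
        JOIN Cortex AS c ON c.WordID = w.ID
        JOIN LinkTypes AS lt ON lt.ID = c.LinkTypeID
        WHERE w.Word = ? AND lt.ID >= ?
        ORDER BY lt.ID
        """,
        (word, POS_FIRST_ID),
    ).fetchall()
    return [name for (name,) in rows]


conn = build_demo_db()
print(parts_of_speech(conn, "work"))
```

The filter `lt.ID >= ?` assumes every link type from 30010 upward is a part of speech. The post gives only the starting ID, so an upper bound may be needed against the real data.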
 


rleiman

Well-Known Member
Licensed User
Thanks so much for the database and links. :D

I converted the Words table to sqlite already. Attached is the code for an app which downloads the wordsdb.db (15 MB) to your device. The app does spell checking and some other things, but the words db file does not contain the parts of speech ("POS").

Here is a direct link to the words.db file in zip format (6 MB).

Here is a link to the Access 2007 data files (AIC Cortex.accdb and AIC Words.accdb): aic.zip (32 MB). The POS entries in the LinkTypes table in the AIC Cortex database start with ID# 30010.

The page at www.aeyec.com explains the database structure (amidst a lot of other info). I recently reworked the file and it currently does not have a table of contents. Just scroll down until you see the info.

In short, the Cortex table (which contains no text, only numeric links) links a word entry's ID# from the Words table to a POS entry's ID# in the LinkTypes table, but the latter table contains a lot of other types of LinkTypes in addition to POS. For your purposes, you can just extract the POS LinkTypes from the table.

I see that I haven't looked at this stuff since last March, so I hope I've told you everything right. I was going to add some more stuff to the words app, but got busy on other stuff.
 

catyinwong

Active Member
Licensed User
NJDude said:
In this case you will have to do some "scraping": read the web page source and extract what you need, a sort of custom parsing if you will.

Take a look at the attached file; it's a bare-bones project that does what you need.

NOTE: Dictionary.com has an API, check HERE

I downloaded the example, and the following error comes up when compiling:

B4X:
B4A Version: 8.30
Parsing code.    (0.00s)
Compiling code.    (0.10s)
Compiling layouts code.    (0.01s)
Organizing libraries.    (0.00s)
Generating R file.    (0.14s)
Compiling debugger engine code.    (1.58s)
Compiling generated Java code.    Error
B4A line: 48
HttpClient1.Execute(request,1)
javac 1.8.0_181
src\FordSoft\b4a\AICWords\main.java:426: error: cannot access ClientProtocolException
_httpclient1.Execute(processBA,_request,(int) (1));
                    ^
  class file for org.apache.http.client.ClientProtocolException not found
1 error
Does anyone know why?
 

DonManfred

Expert
Licensed User
1. You should ALWAYS(!!) create a new thread for any question you have. Do not post to existing threads. This thread is 7 years OLD...

2. The solution is to use OkHttpUtils2 instead of the deprecated HTTP library.
 