API Method "spam_check"

 

 

To use the SPAM_CHECK method, you need to buy a separate license: BlackLists API. Our prices are listed here: https://cleantalk.org/price-database-api

You can purchase it here on your billing page: https://cleantalk.org/my/bill/api

This method is intended for bulk checks of IP and e-mail addresses for spam activity. For other purposes, use the other methods listed further down this page.

 

The method's response reports whether the IP and e-mail records have existed in our database over a long period of time.

It may differ from the result of the blacklist check on our public page (https://cleantalk.org/blacklists/spam-ip), because the website shows the current spam status only.

 

What's the difference in data that methods SPAM_CHECK and CHECK_NEWUSER (or CHECK_MESSAGE) return?

They check spam activity for different time periods and serve different purposes as well:

 • SPAM_CHECK takes the last 6 months of spam history and responds accordingly. It is also needed to find out whether an e-mail address exists.

 • CHECK_NEWUSER (or CHECK_MESSAGE) takes the current status of a record, i.e. the last 2 weeks of spam history, and is needed to check any IP or e-mail on the fly. Many tests are performed on a record (relevance to the website text, language, nickname, JavaScript tests, etc.) to determine whether it is spam right now.

Note: our servers count each record that you submit for checking. If you send one request containing multiple records, your CleanTalk dashboard will show as many checks as there were records in the request.

 

Required GET Parameters:

  • method_name - must be 'spam_check'
  • auth_key - your API access key

 

Optional GET Parameters:

  • ip — IP address to check (IPv4 or IPv6 standard format)
  • email — e-mail address to check (The result is given for the last 6 months)
  • date — date to check for statistics in YYYY-MM-DD format (It can be applied only to IP addresses)
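  • email_<SHA256> - SHA256 hash of an e-mail address, passed as the value of email
  • ip4_<SHA256> - SHA256 hash of an IPv4 address, passed as the value of ip
  • ip6_<SHA256> - SHA256 hash of an IPv6 address, passed as the value of ip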
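
The hashed values can be prepared on the client side. Below is a minimal Python sketch, assuming the hash is the SHA256 hex digest of the plain address string encoded as UTF-8 (this page does not spell out the exact input format). The helper name hashed_record is hypothetical.

```python
import hashlib
import ipaddress


def hashed_record(value: str) -> str:
    """Return value as a prefixed SHA256 record (email_, ip4_ or ip6_).

    Assumption: the service hashes the plain address string as UTF-8.
    """
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    if "@" in value:
        return "email_" + digest
    # ip_address() raises ValueError for anything that is not a valid IP.
    version = ipaddress.ip_address(value).version
    return f"ip{version}_{digest}"


print(hashed_record("127.0.0.1"))
```

The result is meant to be passed as the value of the ip or email parameter, as in the hash examples further down.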

 

Example request with IP and e-mail:

https://api.cleantalk.org/?method_name=spam_check&auth_key=123456&email=stop_email@example.com&ip=127.0.0.1

 

Example request with date:

https://api.cleantalk.org/?method_name=spam_check&auth_key=123456&ip=127.0.0.1&date=2017-01-31

 

Example request with email hash:

https://api.cleantalk.org/?method_name=spam_check&auth_key=123456&email=email_08c2495014d7f072fbe0bc10a909fa9dca83c17f2452b93afbfef6fe7c663631

 

Example request with IP v4 hash:

https://api.cleantalk.org/?method_name=spam_check&auth_key=12345&ip=ip4_f46604ded89bbd0e8e478172a9a650f4825a763053ad2e3582c8286864ec4074
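
The request URLs above can also be assembled in code. The following Python sketch uses only the standard library; the function name build_spam_check_url is hypothetical, and the key 123456 is the placeholder used in the examples on this page.

```python
from urllib.parse import urlencode

API_URL = "https://api.cleantalk.org/"


def build_spam_check_url(auth_key, ip=None, email=None, date=None):
    """Build a spam_check GET URL from the parameters listed above."""
    params = {"method_name": "spam_check", "auth_key": auth_key}
    if ip is not None:
        params["ip"] = ip
    if email is not None:
        params["email"] = email
    if date is not None:
        params["date"] = date  # YYYY-MM-DD, applies to IP addresses only
    # safe="@" keeps the e-mail address readable, as in the examples above.
    return API_URL + "?" + urlencode(params, safe="@")


url = build_spam_check_url("123456", ip="127.0.0.1", date="2017-01-31")
print(url)
```

Other characters that are not URL-safe (for example the colons of an IPv6 address) are percent-encoded by urlencode.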

 

The API returns a JSON string, for example:

{"data":{"127.0.0.1":{"domains_count":0,"domains_list":null,"spam_rate":0,"submitted":"2021-08-10 04:32:53","updated":"2021-10-14 10:58:10","frequency":9,"in_antispam":0,"in_security":0,"appears":0,"network_type":"good_bots","country":null,"sha256":"12ca17b49af2289436f303e0166030a21e525d266e209267433801a8fd4071a0"},"stop_email@example.com":{"frequency":3,"submitted":"2021-01-01 02:25:06","updated":"2021-01-15 03:01:02","spam_rate":1,"appears":1,"disposable_email":1,"exists":null,"sha256":"6d42ca0235d72b01a2b086ad53b5cfac24b5a444847fad70250e042d7ca8bf59"}}}

 

or for email hash:

{"data":{"email_08c2495014d7f072fbe0bc10a909fa9dca83c17f2452b93afbfef6fe7c663631":{"frequency":4,"submitted":"2018-10-16 01:53:49","updated":"2021-04-15 08:49:53","spam_rate":1,"exists":1,"appears":0,"disposable_email":1}}}

 

or for IP v4 hash:

{"data":{"ip4_f46604ded89bbd0e8e478172a9a650f4825a763053ad2e3582c8286864ec4074":{"domains_count":0,"domains_list":null,"spam_rate":1,"submitted":"2018-07-29 07:09:45","updated":"2019-09-25 15:33:00","frequency":13627,"in_antispam":0,"in_security":0,"appears":0,"network_type":"hosting","country":"GB"}}}

 

 

Responses Explanation:

data - usually an array of the checked records in the following format: "record":{array of check results}. Sometimes 'data' contains the string 'In progress' instead. This means that a concurrent PHP process is working with exactly the same parameters: auth_key, method_name and records.

Important! The "In progress" response is significant and must be handled. If your API call contains a lot of elements, only part of them will be checked in one go and you will get the "In progress" response. The rest of the elements will be checked only by the next API call.

An array of check results may contain the following fields:
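
One way to handle "In progress" is to repeat the same call until 'data' is no longer that string. The Python sketch below is only an illustration: fetch is a caller-supplied function (hypothetical) that performs one spam_check request and returns the decoded JSON, and the retry count and delay are assumptions, not values recommended by the API.

```python
import time


def call_until_done(fetch, max_attempts=5, delay_seconds=2.0):
    """Repeat the same call while the API answers "In progress".

    fetch performs one spam_check request with fixed parameters and
    returns the decoded JSON response as a dict.
    """
    for attempt in range(max_attempts):
        response = fetch()
        if response.get("data") != "In progress":
            return response
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)  # wait before asking again
    raise TimeoutError("spam_check still in progress after retries")
```

Each repeated call counts toward the calls limit described in the Restrictions section.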

  • appears - marker that shows the record's status in the blacklists, 0|1 (whether the record is blacklisted right now),
  • sha256 - SHA256 hash of the address,
  • network_type - special network type, if any (new types may be added in the future):
      • 'hosting' - the IP address belongs to a hosting company's network
      • 'public' - the IP address belongs to an ISP, but this network or AS has high spam activity
      • 'paid_vpn' - the IP address belongs to a VPN service provider
      • 'tor' - the IP address belongs to Tor exit nodes (very spam-active addresses)
      • 'unknown' - the IP address has no special type for now
      • empty value - same as 'unknown'
  • spam_rate - rating of spam activity, i.e. the ratio of blocked requests to all requests. The value ranges from 0 to 1: "spam_rate": "1" means 100% of requests were spam (certain spam), "spam_rate": "0.75" means 75% of requests were blocked as spam,
  • submitted - date and time of the first spam activity,
  • updated - date and time of the last status update,
  • country - two-letter country code of the IP, in ISO 3166-1 alpha-2 format,
  • exists - whether the e-mail address exists (0 - does not exist, 1 - exists, null - empty status, the address is not in our database),
  • disposable_email - whether the e-mail address is disposable (0 - normal, 1 - disposable, null - empty status, the address is not in our database),
  • frequency - number of websites that reported spam activity of the record. It can range from 0 to 9999 (shows total activity since the record was first caught),
  • spam_frequency_24h - number of spam requests from the address that were blocked by Anti-Spam in the last 24 hours,
  • in_antispam - IP address found in the Anti-Spam blacklist (0 - not found, 1 - found),
  • in_antispam_previous - the previous Anti-Spam blacklist status. It shows whether the record was blacklisted before (0 - wasn't blacklisted, 1 - was blacklisted, NULL - no change),
  • in_antispam_updated - date when the Anti-Spam blacklist status changed (NULL - no change),
  • in_security - IP address found in the security blacklist (brute-force) (0 - not found, 1 - found),
  • domains_count - number of domains found on the IPv4 address,
  • domains_list - list of domains/sites hosted on the IPv4 address. The method shows the first 1000 domains.
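
As an illustration of reading these fields, the Python sketch below converts a few of them to native types. The sample is abridged from the IPv4 hash response shown earlier; the function name summarize and the choice of fields are hypothetical. Values are coerced because the examples on this page show both numbers and quoted strings.

```python
import json
from datetime import datetime

# Abridged from the IPv4 hash example response above.
raw = (
    '{"data":{"ip4_f46604ded89bbd0e8e478172a9a650f4825a763053ad2e3582c8286864ec4074":'
    '{"spam_rate":1,"updated":"2019-09-25 15:33:00","frequency":13627,'
    '"appears":0,"network_type":"hosting","country":"GB"}}}'
)


def summarize(record: dict) -> dict:
    """Pick a few fields and convert them to native Python types."""
    updated = record.get("updated")
    rate = record.get("spam_rate")
    return {
        # "appears" may arrive as 1 or "1", so compare as text.
        "blacklisted_now": str(record.get("appears")) == "1",
        "spam_rate": None if rate is None else float(rate),
        "updated": None if updated is None
        else datetime.strptime(updated, "%Y-%m-%d %H:%M:%S"),
        # An empty network_type is treated like 'unknown'.
        "network_type": record.get("network_type") or "unknown",
    }


for key, record in json.loads(raw)["data"].items():
    print(key, summarize(record))
```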

When the 'date' parameter is given, the response contains results for that date only. With a POST request you can check up to 1000 records at one time.

You can check an e-mail for existence and disposability only with a GET request, and only 1 address at a time.

If a record shows no activity for 14 days, it is removed from the blacklists, but its history is kept.

 

Multiple Records Check:

You can submit multiple records to check in 1 call. To do that, use the POST option:

  • data — string with records to check separated by ','.

 

Example:

wget -O- --post-data='data=stop_email@example.com,10.0.0.1,10.0.0.2' 'https://api.cleantalk.org/?method_name=spam_check&auth_key=123456'

 

Response:

{"data":{"stop_email@example.com":{"appears":1,"frequency":"999","updated":"2019-04-24 23:33:00"},"10.0.0.1":{"appears":0},"10.0.0.2":{"appears":0}}}
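
The same POST call can be prepared with the Python standard library. This is a sketch only: the function name build_bulk_request is hypothetical, the request is built but not sent, and the records are form-encoded (so '@' and ',' appear percent-encoded in the body).

```python
from urllib.parse import urlencode
from urllib.request import Request

API_URL = "https://api.cleantalk.org/"


def build_bulk_request(auth_key, records):
    """Build (but do not send) a POST request that checks several records."""
    query = urlencode({"method_name": "spam_check", "auth_key": auth_key})
    body = urlencode({"data": ",".join(records)}).encode("ascii")
    # Passing data= makes urllib issue a POST instead of a GET.
    return Request(API_URL + "?" + query, data=body)


request = build_bulk_request(
    "123456", ["stop_email@example.com", "10.0.0.1", "10.0.0.2"]
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request, timeout=180)
```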

 

Restrictions:

If you exceed the calls limit, the API returns an error notice. Example:

{"error_message":"Calls limit exceeded.","error_no":10}

The current calls limit is 100 per 60 seconds.

 

If you exceed the data elements limit in the spam_check method, the API returns an error notice. Example:

{"error_message":"Received 1001 records to check, maximum 1000 records check perl call.","error_no":8}

The current data elements limit is 1000.

The recommended timeout is no more than 180 seconds.
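
To stay within the data elements limit, a long list of records can be split into batches before sending. A minimal Python sketch (the helper name split_into_batches is hypothetical):

```python
MAX_RECORDS_PER_CALL = 1000  # data elements limit stated above


def split_into_batches(records, batch_size=MAX_RECORDS_PER_CALL):
    """Yield consecutive slices of records, each within the per-call limit."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# 2500 sample addresses in the 10.0.x.x range, for illustration only.
sample = [f"10.0.{i // 256}.{i % 256}" for i in range(2500)]
batches = list(split_into_batches(sample))
print([len(batch) for batch in batches])  # [1000, 1000, 500]
```

Pausing at least 0.6 seconds between calls keeps a sequential client within 100 calls per 60 seconds.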

 

Analysis and processing errors are returned individually for each record. Example:

{"data":{"10.0.0.3":{"error":"Database error"}}}

'Database error' indicates a data retrieval problem, such as a slow answer from the database server. In that case, simply repeat the request after a few minutes.

{"data":{"10.0.0.266":{"error":"Can't check this record: Wrong format"}}}

There might be new error descriptions in the future.
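
Because errors are reported per record, a client may want to separate failed records from checked ones. A Python sketch under that assumption (the function name split_results is hypothetical):

```python
def split_results(response):
    """Separate successfully checked records from per-record errors."""
    checked, failed = {}, {}
    for record, result in response["data"].items():
        if "error" in result:
            failed[record] = result["error"]
        else:
            checked[record] = result
    return checked, failed


response = {
    "data": {
        "10.0.0.1": {"appears": 0},
        "10.0.0.3": {"error": "Database error"},
        "10.0.0.266": {"error": "Can't check this record: Wrong format"},
    }
}
checked, failed = split_results(response)
# Only 'Database error' is worth repeating after a few minutes.
retry_later = [rec for rec, err in failed.items() if err == "Database error"]
print(checked, retry_later)
```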

 

Notice: if data are sent in the request, the request may take a little longer to process.

 

 

Features of gmail.com addresses
 

If a gmail.com address includes dots, it is checked as the address without dots (for gmail.com both are the same mailbox). The server's response then contains an additional field "email" with the dotless address:

 

https://api.cleantalk.org/?method_name=spam_check&auth_key=123456&email=1234.test.te@gmail.com

 

Response:

 

{"data":{"1234.test.te@gmail.com":{"appears":0,"sha256":"1cab88c5f6304f48ac75e8a175a0351a7d6bfd7fbd55d2f90eab96213dcdf639","disposable_email":0,"email":"1234testte@gmail.com"}}}
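
The server performs this normalization itself. If you need to match your own records against the returned "email" field, the same rule can be reproduced locally, as in this Python sketch (the function name normalize_gmail is hypothetical, and only the dot rule described above is applied):

```python
def normalize_gmail(address: str) -> str:
    """Remove dots from the local part of a gmail.com address."""
    local, _, domain = address.rpartition("@")
    if domain.lower() == "gmail.com":
        local = local.replace(".", "")
    return f"{local}@{domain}"


print(normalize_gmail("1234.test.te@gmail.com"))  # 1234testte@gmail.com
```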

 

 

Description of Several Examples of API Responses

 

Below you will find several examples and descriptions of the parameters for possible server response options, such as:

  • "appears" - marker that shows the record's status in the blacklists, 0|1;
  • "spam_rate" - rating of spam activity, i.e. the ratio of blocked requests to all requests. The value ranges from 0 to 1: "spam_rate": "1" means 100% of requests were spam (certain spam), "spam_rate": "0.75" means 75% of requests were blocked as spam;
  • "submitted" - date and time of the first spam activity;
  • "updated" - date and time of the last status update;
  • "in_antispam" - IP address found in the Anti-Spam blacklist (0 - not found, 1 - found);
  • "in_security" - IP address found in the security blacklist (brute-force) (0 - not found, 1 - found);
  • "frequency" - number of websites that reported spam activity of the record. It can range from 0 to 9999;
  • "network_type" - the category of use of the address space.

 

Example:

"ip":{"appears":1}

Explanation:

  • "appears":1 — IP is in blacklists;

 

Example:

"ip":{"appears":0}

Explanation:

  • "appears":0 — IP is not in blacklists;

 

Example:

"ip":{"appears":0, "frequency":"15"}

Explanation:

  • "appears":0 - IP is not in blacklists;
  • "frequency":"15" - 15 websites reported spam activity of this IP.

 

Example:

"ip":{ "appears":"0","spam_rate":"1","frequency":"1"}

Explanation:

  • "appears":"0" - IP is not in blacklists;
  • "spam_rate":"1" - there was 1 request and it was detected as spam;
  • "frequency":"1" - 1 website reported spam activity of this IP.

 

Example: 

"email":{ "appears":0,"spam_rate":"1","frequency":"1","exists":"1"}

Explanation:

  • "appears":0 - e-mail is not in blacklists;
  • "spam_rate":"1" - there was 1 request and it was detected as spam;
  • "frequency":"1" - 1 website reported spam activity of this e-mail;
  • "exists":"1" - a real e-mail address that exists on a mail server.

 

Method SPAM_CHECK should be used exclusively for mass checking of IPs and e-mails for spam activity. For other purposes you could use other methods:

  • check_message (requires Standard Anti-Spam Licence) — API spam filter to check posts and comments.
  • check_newuser (requires Standard Anti-Spam Licence) — Registration check.
  • send_feedback (requires Standard Anti-Spam Licence) — Send feedback to CleanTalk.
  • backlinks_check (requires BlackList API Licence) — Mass check for backlinks in spam comments.
  • Universal Anti-Spam plugin - this plugin can be installed on any custom websites or CMS.

 

 

Nginx Anti-Spam Module

 

CleanTalk offers the possibility to use the anti-spam service on the webserver level. We have developed our anti-spam module for Nginx:
https://github.com/CleanTalk/nginx-cleantalk-service

After you install the CleanTalk anti-spam module for Nginx on your web server, each POST request is checked for spam with the CleanTalk API, and spam requests are blocked. Statistics for the module are always available in your CleanTalk Dashboard.

 

Filtration condition

 

You can combine parameter values to evaluate the spam activity of the checked IP or Email address. 

Example: "appears" is the main parameter to pay attention to. The response "appears": "1" means that the checked address is in the blacklists at the moment. Note that the checked address may not be in the blacklists at this time, in which case "appears" equals "0". In that case, you can rely on the additional API parameters.

What can be considered as spam: 

  • If "appears": "1";
  • If "appears": "0" and "spam_rate" is higher than "0.7";
  • If "appears": "0", "spam_rate" is higher than "0.5", and the date of the last spam activity was more than 30 days ago;
  • If "appears": "0", "spam_rate": "1", "frequency": "5" or more, and "updated" less than 30 days ago;
  • If "appears": "0", "frequency": "200" or more, and "updated" less than 90 days ago.

You should evaluate the required parameters and their combinations. If you would like to evaluate spam activity only of those addresses that are currently in the blacklists, then it is enough to use the "appears" parameter. 
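
The conditions listed above could be combined roughly as follows. This Python sketch makes several assumptions: "the date of the last spam activity" is read from the "updated" field, timestamps are compared against the local clock without time zone handling, and the function name looks_like_spam is hypothetical. Adjust the thresholds to your own needs.

```python
from datetime import datetime, timedelta


def looks_like_spam(record, now=None):
    """Apply the example filtration conditions to one check result."""
    now = now or datetime.now()
    if str(record.get("appears")) == "1":
        return True
    # Values may arrive as numbers, quoted strings or null.
    rate = float(record.get("spam_rate") or 0)
    frequency = int(record.get("frequency") or 0)
    if rate > 0.7:
        return True
    updated = record.get("updated")
    if not updated:
        return False  # the remaining rules need a date
    age = now - datetime.strptime(updated, "%Y-%m-%d %H:%M:%S")
    if rate > 0.5 and age > timedelta(days=30):
        return True
    # Already covered by the 0.7 rule; kept to mirror the list above.
    if rate == 1 and frequency >= 5 and age < timedelta(days=30):
        return True
    return frequency >= 200 and age < timedelta(days=90)


print(looks_like_spam({"appears": 0, "spam_rate": "0.75"}))  # True
```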

 

 
