Content alerts

Defacement

What is it

A content change of a website that has been classified as "defacement".

How is it detected

nimbusec typically detects defacements from the outside, through its Cloud Scan. The fetched data is then analysed in our data centers using defacement signatures ("Basic Analysis"), multiple statistical analyses ("Semantic Defacement Analysis"), or both.

Basic Analysis

This is a fast approach that detects defacements by signature. First, the content of a website is extracted as plain text, without tags and attributes. This content is then matched against a growing set of rules that represent the content of commonly seen defacements. The rules also include bad words in multiple variations, e.g. hacked by | h4cked | h4ck3d!!! | ...

To sum it up:

  • Content is matched on rules, representing typical defacements
  • Bad words are looked up in the content
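
As a rough illustration of the two points above, here is a minimal sketch in Python. It is not nimbusec's implementation: the `SIGNATURES` rules, the `extract_text` helper and the `matches_signature` function are hypothetical names invented for this example, and the real rule set is far larger.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes only; tags, attributes, scripts and styles are dropped."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def extract_text(html):
    """Return the plain text of a page with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


# Hypothetical example rules covering "hacked by | h4cked | h4ck3d!!!".
SIGNATURES = [
    re.compile(r"\bh[a4]ck[e3]d\s+by\b", re.IGNORECASE),
    re.compile(r"\bh[a4]ck[e3]d\b", re.IGNORECASE),
]


def matches_signature(html):
    """True if any rule matches the extracted plain text."""
    text = extract_text(html)
    return any(rule.search(text) for rule in SIGNATURES)
```

Because matching runs on the extracted text only, a phrase that appears solely inside an attribute value does not trigger a rule in this sketch.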

Semantic Defacement Analysis

This is the advanced defacement analysis. It is split into several modules, each of which rates a given website. When the sum of all module ratings exceeds a given threshold, we can assume that the change is a defacement.

Modules:

  • Structure
  • Language
  • Topic
  • Spam
  • Time
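
The decision rule can be sketched as follows. The rating scale and the `THRESHOLD` value are assumptions made for this example; the actual values are not documented here.

```python
MODULES = ("structure", "language", "topic", "spam", "time")

# Assumed value for illustration only.
THRESHOLD = 2.5


def is_defacement(ratings, threshold=THRESHOLD):
    """Sum the per-module ratings and compare the total with the threshold.

    `ratings` maps module names to numeric ratings; missing modules count as 0.
    """
    total = sum(ratings.get(name, 0.0) for name in MODULES)
    return total > threshold
```

With these assumed numbers, `is_defacement({"structure": 1.0, "language": 1.0, "topic": 0.8})` is true, while a single high rating on its own stays below the threshold.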

The structure module takes a close look at the structure of a website and splits it into HTML tags, attributes, classes and content. It compares the data of the current scan with that of previous scans. This lets us extract the parts that were added to or removed from a website and relate them to its history. If the structure changed dramatically, and in a way that is not typical of the website's known updates, the rating of this module will be higher.
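
A simplified sketch of such a comparison between two scans is shown below. It only counts elements by tag and class and does not model what is "typical" for a site's update history; `structure_of` and `structure_change` are hypothetical names for this example.

```python
from collections import Counter
from html.parser import HTMLParser


class StructureCollector(HTMLParser):
    """Counts elements by (tag, sorted class list)."""

    def __init__(self):
        super().__init__()
        self.elements = Counter()

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        key = (tag, " ".join(sorted(classes.split())))
        self.elements[key] += 1


def structure_of(html):
    collector = StructureCollector()
    collector.feed(html)
    collector.close()
    return collector.elements


def structure_change(previous_html, current_html):
    """Return (added, removed, ratio) between two scans of the same page.

    `ratio` is the share of changed elements, between 0.0 and 1.0.
    """
    prev = structure_of(previous_html)
    curr = structure_of(current_html)
    added = curr - prev      # elements only in the current scan
    removed = prev - curr    # elements only in the previous scan
    total = sum((prev | curr).values())
    changed = sum(added.values()) + sum(removed.values())
    ratio = changed / total if total else 0.0
    return added, removed, ratio
```

A real rating would additionally weigh this ratio against the changes seen in earlier, legitimate updates of the same site.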

The language module detects the language(s) used on each page and on the website as a whole. If the language of an added part differs from the rest of the page, the rating will increase. If the whole website changes its language, the rating will be critical for this module. If a single page changes its language but the new language matches one of the known languages of the whole website, the rating will increase only slightly.
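
The rating logic described here could look roughly like the sketch below. Language detection itself is out of scope: the function takes already detected language codes as input, and the numeric ratings are assumptions chosen only to show the ordering of the cases.

```python
def language_rating(added_lang, page_lang, site_langs, whole_site_changed=False):
    """Rate a language change; higher means more suspicious.

    All numeric values are assumed for illustration.
    """
    if whole_site_changed:
        return 1.0  # critical: the entire website switched language
    if added_lang == page_lang:
        return 0.0  # added part matches the rest of the page
    if added_lang in site_langs:
        return 0.2  # differs from the page, but known elsewhere on the site
    return 0.7      # language never seen on this website before
```

For a site known to use German and English, an English block added to a German page would rate low, while a block in a language the site has never used would rate high.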

The topic module uses machine learning algorithms to determine the topic(s) of a website and of every single page. If an article is added to a page, the topic of this article is matched against the topic the whole page is known to have had in the past. The rating will, as always, increase if there is a mismatch.

The spam module works similarly to the spam filter in a mail client. We train the filter with lots of real-world defacement examples.
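
Many mail spam filters are built on naive Bayes classification, so the sketch below uses that technique as a stand-in. This is an assumption: the article does not say which algorithm the spam module uses, and the class name, labels and training texts are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"defaced": Counter(), "clean": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def _log_score(self, tokens, label, vocab_size):
        counts = self.word_counts[label]
        total = sum(counts.values())
        # Prior: share of training documents with this label.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for tok in tokens:
            score += math.log((counts[tok] + 1) / (total + vocab_size))
        return score

    def classify(self, text):
        """Return the more likely label; both labels must have been trained."""
        tokens = tokenize(text)
        vocab = set(self.word_counts["defaced"]) | set(self.word_counts["clean"])
        scores = {
            label: self._log_score(tokens, label, len(vocab))
            for label in self.word_counts
        }
        return max(scores, key=scores.get)
```

Usage, with invented training data:

```python
nb = NaiveBayes()
nb.train("hacked by crew your security is zero", "defaced")
nb.train("owned by team greetz to all", "defaced")
nb.train("welcome to our shop new products", "clean")
nb.train("contact us for opening hours", "clean")
label = nb.classify("hacked by some crew greetz")
```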

The last module in the chain is the time module. It compares the update times of a website. This module works better with short scan intervals and worse with longer ones (e.g. a scan once a week is not frequent enough for it).

Alert levels

  • RED: Identification of known defacement based on defacement signature
  • YELLOW: Identification of defacement based on strong changes in DOM, content language, topic model, etc.

Red Alert

A red alert means that a defacement was identified based on signatures. The reliability is comparable to the detection of anti-virus products, so these alerts should be taken very seriously. An automated reaction can be considered.

Yellow Alert

Our goal is zero false positives. Because of the advanced methods used here, we cannot guarantee 100% detection of a defacement, but we see the changes and can tell whether they are really suspicious.

These alerts have to be taken seriously as well, but we recommend looking at the page and verifying the issue yourself.

The defacement result detail gives you a lot of information and is designed to show you at first sight what went wrong.

Screenshots of the landing page are also rendered, so you can see what our crawlers saw at the time of the scan.

The content result gives you the following information:

  • Occurred on: the time of the detection
  • Reason: Description of the modules which voted for this issue
  • Region: the region where the scan was performed from
  • Viewport: a mobile or desktop browser client setting can be used
  • Path: URL to the page/resource where the change was seen
  • Change: for this issue, usually "Defacement detected"
  • Name: name of the defacement, if we have one (e.g. a hacker group), so you can identify the defacement better

Try to verify the result by browsing to the path shown in the result. Request the website both directly and through a search engine, as you may get different results. Do this only from a secure environment, as defaced websites are also likely to spread malware!

In case the website was defaced and you don't have an incident response chain of your own, we have prepared a short guide:

  1. It is always good to create a backup of the webspace before changing anything.
  2. Redirect to a maintenance page, or in the worst case to a blank page, so that site visitors are not put in danger.
  3. Investigate to find the weak spot (an outdated CMS or other application, a plugin, ...).
  4. Fix the vulnerabilities and remove the traces of the defacement.

The detected change might have been part of a major, but intentional content change. Contact the website's administrator for information.

When to mark as False Positive

If no problem shows and you are certain that no defacement took place, mark the alert as "false positive". This will create a rule that should prevent such alerts in the future. However, all content is different and therefore hard to compare and analyse. If a specific alert keeps showing up even though you marked it as a false positive, please contact us and we will work out a solution.

If you need in-depth information about why this alert was generated, please contact nimbusec support.

Last updated on 22nd May 2018