Q&A’s: Data Masking

Common Data Masking Questions:

Do you copy data out of production? Static data masking is a simple, easy, and effective way to protect it and prevent a breach.

1. Why mask? Because we can’t protect the data outside of production: Imagine copying customer data for testing. How could you protect it after copying it? Without data masking, you will expose all names, addresses, phone numbers, emails, financial information, and more. Static masking replaces these values with good fakes so you can test without jeopardizing your confidential information or that of the people who entrusted it to you.

2. Reverse the masking? Impossible. That’s the point: Unlike encryption, static masking is a one-way transformation. The masked data resembles the original but doesn’t reveal it. This irreversible process ensures your sensitive information isn’t exposed even if the masked data falls into the wrong hands.

3. Data integrity? Is a must. Otherwise, the application won’t function properly in the test environment, or the test will be ineffective. The masking process must preserve data validity, consistency, and referential integrity. It’s like an elaborate disguise: everything looks different but has to work the same way.

4. A single algorithm? Of course not. There are many ways to mask each type of data. Choosing the strategy that fits your requirements will ensure you achieve your security goals while getting the most out of your data.

For example, value manipulation will preserve some aspects of the original data but can potentially offer weaker security. Data generation will provide perfect security but may impair test quality. Data profiling and custom profiles are two other strategies that balance security and test quality.

5. Should I worry about performance? Yes and no: data masking performance isn’t an issue unless it’s so slow that masking is impossible. Let’s explain more:

There’s a common preconception that static data masking is inherently slow and resource-intensive, but it’s not a big deal since we only have to do it once after copying the data. Some would say just once.

The Truth

It doesn’t matter if a masking process takes 30 seconds, 5 minutes, or half an hour. It’s not something that runs too often, and it never runs on production systems, slowing down business-critical processes.

However, it’s not entirely true that it doesn’t matter since masking becomes impractical if it takes days or weeks to run. Nor is it true that masking runs only once, as it must run every time you refresh your test data. As masking becomes faster and easier, you can update your test data more frequently, getting more out of your data.

Performance Culprits

Slow masking is usually due to one of these reasons:

  • Product selection: different solutions offer different performance capabilities. Common reasons include code quality, database APIs, transaction size, etc. For example, chatty protocols combined with high latency can result in very slow masking.
  • Database performance: like any database-driven product, masking performance also depends on the performance of the underlying database. Most databases aren’t normally tuned for masking.
  • Triggers: can be one of the most challenging problems as these small pieces of code execute whenever data changes. When updating millions of rows, a trigger will run millions of times, causing the masking process to run forever. However, triggers are often essential for data validity and integrity, and you shouldn’t automatically disable them.

Taming the “performance beast”

Addressing these issues will allow data masking to become an integral component of a dynamic data lifecycle rather than a slow, unusable burden everyone wants to avoid.

Here are some ideas to consider:

Product selection is always challenging. Like all IT purchases, with data masking, you should also test several solutions in your environment using your network, database, and data volume. While trials can be time-consuming, they are the only way to ensure the solutions work well in your environment. Be careful not to rely on brand recognition, market analysts, or friendly advice, as they can backfire and result in a failed project.

Database performance can be improved with a little know-how. Data masking is a very write-intensive process that requires different database tuning since most applications are read-intensive. To temporarily improve database performance during masking, you can, for example, remove indexes and constraints, stop archiving, suspend replication, etc. Pre- and post-masking actions can help automate these actions during masking. Work with your DBAs to maximize your database write speed.

Finally, trigger performance issues can be challenging and require time and know-how. First, identify the triggers that run when you mask your data. Determine which are relevant to the data you are masking and disable the rest. Second, convert the necessary triggers into a vertical update procedure and use that procedure during masking instead of the triggers. It works because a single update of all the rows is much faster than millions of small updates. Core Audit can help speed up this process by identifying the SQLs that need to be rewritten as vertical updates.

Masking is essential

When looking at the details of performance problems, data integrity, etc., it’s easy to lose sight of the big picture: why is data masking so important?

  • Reduces risk: eliminating sensitive data outside of production will dramatically reduce your exposure and risk profile.
  • Simplifies compliance: Masking is essential to reduce the scope of various compliance and data privacy regulations. Systems that contain masked data aren’t typically subject to compliance.
  • Improves development: Masked data drives development and test environments, improving product quality, shortening development cycles, and accelerating project timelines.

Final Thoughts

Data masking is a critical component in the data lifecycle, enabling us to use our data to drive and improve many aspects of our business. From product development to testing, data analytics, and more, securely using our data outside of production lets us multiply the value we derive from it.

Masking is simple and essential but not trivial. Many projects fail for a variety of reasons, such as using an inappropriate solution, failing to define the right masking policies, performance issues, and more.

Through our experience working with customers, we found customers always succeed when they have the right solution and a support team committed to their success. Missing either component greatly decreases the chances of success, and lacking both guarantees failure.

Contact us today at info@bluecoreresearch.com to learn more about how we can help you mask and secure your data.

Database Visibility: Poll Results

Database Visibility:
What Everyone Is Missing

Recent polls of cybersecurity professionals show most respondents (82%) have partial or no visibility into their databases and need it. Few said they have good visibility (7%) or don’t need it (11%).

The surveys were conducted in various LinkedIn groups in English and Spanish, asking: “Do you have visibility into what’s happening within your database?”. Almost all English respondents claimed to have partial visibility (40%) or have no visibility and need it (53%).

In the Spanish survey, more people said they have good visibility or don’t require it, thereby lowering the demand for visibility from 93% to the 82% average.

The challenge of database visibility

Databases store a lot of sensitive corporate and third-party data, such as financial data, personal information, and much more. This data must not be compromised, leaked, or manipulated, but most of us have no visibility into who accesses it, when, or how.

We know we need visibility. Without it, we cannot design effective controls. It is a generally accepted first step in security: don’t fly blind. You can’t control what you can’t see. So how come database visibility is something most organizations lack?

Databases are challenging because of the massive activity volume and strict performance requirements. How do you get visibility into billions of SQLs? How do you even collect the information without impacting performance, let alone process and consume it? How do you gain control and know when someone is doing something malicious when there is so much noise? Even when limiting the scope to confidential table accesses, there is too much activity to comprehend.

But without visibility, how could you set up controls, reports, or alerts? What do you look out for? How will you know if a data breach occurred? And when one does, how did it happen?

How to obtain and improve visibility

As mentioned above, database visibility is challenging due to the high activity and performance requirements. It’s unlikely you’ll get far on your own without the right solution and underlying technologies.

Simple tools rely on built-in database functionality. These often have a high-performance impact while providing almost no visibility. Even expensive tools often use older technologies that can’t deliver the desirable visibility and value.

In recent years, Core Audit introduced significant advances and innovations in capture, processing, storage, and analysis technologies. The result offers impressive and unparalleled visibility. Talk to us to learn more and see how we can help you protect your data.

Final Thoughts

These days, we need to protect more data and more systems while facing greater risks than ever before. These challenges are significant, and we need the visibility and control provided by the most recent advancements.

These latest polls show that most cybersecurity professionals are aware of the importance of visibility. However, most organizations still lack visibility into their databases.

Contact us today at marketing@bluecoreresearch.com to learn more about how we can help you improve visibility and protect your sensitive information.

Anomaly Analysis

Anomaly Analysis

How can you control activity in busy systems like databases? How will you know there’s a malicious SQL inside billions of those? Read to learn more.

Anomaly analysis uses behavioral analysis, helping you save time while expanding your control to vast activity volumes.

These capabilities are made possible by the unique security repository technology in Core Audit. The anomaly analysis engine dynamically creates behavioral profiles based on the activity captured by the security repository, allowing you to compare the present with the past. Contrasting current activity with historical behaviors quickly highlights differences indicative of security problems.

That lets you break the preconception that you can’t control large activity volumes like the database or application. Anomaly analysis empowers you to detect the needle in the haystack to find a single offensive SQL within billions.

How does it work?

Like all IT systems, database activity is repetitive. While values change, patterns persist. While the anomaly analysis engine is capable of more, most analyses center on five aspects:

  1. New activity. Something seen today but not in the past. Like a new user, a new program, a new SQL, and more.
  2. High activity volume. Activity that exists both now and in the past but happens now more.
  3. Time of day – Activity that occurs at a different time now than in the past.
  4. Combined dimensions. A change in a single dimension can be a new user. A change in multiple dimensions is a new user + IP combination. Even if the user and IP are known, that user may have never used that IP.
  5. Filters. Focus different anomalies on different areas of interest. Like sensitive tables, the application, particular users, and more. Various subsets of the activity pose distinct risks, but they are also likely to exhibit different behaviors and, therefore, benefit from other types of anomalies.

There are many ways to select the type of anomalies to use. It could be behaviors you expect, patterns you observed in proactive forensics, the Core Audit wizards, the Core Audit Control Guide, or just trial and error. But one of the crucial features of the anomaly engine is that you can immediately test an anomaly and find the results. That makes choosing and tuning the anomalies a relatively simple process.

Benefits

We initially created the anomaly engine to help answer a common customer request. Looking at traditional declarative auditing, some customers said, “I don’t want to tell Core Audit what to report on. I want Core Audit to tell me what to look at.”

As the technology evolved, it helped secure subsets of the activity previously thought impossible to control. For example, to protect the application with its billions of SQLs.

Anomalies have also proven effective in detecting SQL injection and injection attempts. SQL injection, inevitably, causes the application to do something it isn’t supposed to do, thereby creating a new SQL construct that’s easy to identify.

Today, anomalies are a powerful tool to reduce reporting volume, securing huge activity volumes with a relatively low false positive ratio.

Final Thoughts

Anomaly analysis is vital to modern security, allowing you to do more with less. Control higher volumes from more sources with fewer people, less time, and a lower skill set.

Instead of pouring over endless reports, anomalies do much of the heavy lifting, quickly pointing you to potential problems.

While they’re not a magic solution that solves everything, anomalies can and should be a key element in any security strategy.

Talk to us to learn more and experience the Core Audit difference.

Proactive Forensics

Proactive Forensics

Visibility is the first step in any security effort. You can’t secure what you can’t see. Learn more about the value of visibility and how to achieve it.

One of the popular myths about security is that you can get it out of the box. Just install something, and voila! You’re magically secured. But that never works.

Regardless of what you’re trying to secure, your first step should always be understanding the activity. You should know how the system is used, by whom, when, etc. Gaining this kind of visibility into a live production system is fundamental to figuring out how to secure it.

“You can’t secure what you can’t see.”

However, gaining visibility into IT systems is not simple. It also gets increasingly challenging when the systems process large activity volumes. Databases can process thousands of SQLs every second. All this activity is nearly impossible to understand without the right tools.

Core Audit is a comprehensive security solution that includes, among its many capabilities, the ability to perform proactive forensic investigations. These will give you insight into what’s happening in your database. That’s the first step we recommend when setting up your security.

Once you know what’s happening in your systems, you can design effective reports, configure alerts that make sense, define practical anomaly analysis, and more.

But proactive forensics is not just about setting up your security. It also allows you to identify gaps in your security measures. As activity evolves, we must adjust the controls to fit the new activity patterns, and it’s impossible to do that without visibility. Otherwise, your controls will gradually become outdated and eventually obsolete.

Proactive forensics also lets you identify poor security practices. People sharing accounts, connecting from insecure locations, dumping tables out of the database, and more. While not a breach, these popular bad practices increase your exposure and make a data breach more likely.

There are many motives for regular activity review, but they all share the same underlying reason. We should include people in the security process. No matter the reports, alerts, or automation we create, regular human review can easily flag behaviors and security needs that a machine never will.

Talk to us to learn more and try Core Audit to see the difference.


Fitted Security

Fitted Security

Avoiding the pitfalls of the security trends to design a security strategy that fits your environment and optimizes your posture given the available resources.

Many organizations design their cybersecurity strategy and decide what solutions to purchase based on industry trends and best practices. The outcome is often imbalanced and inappropriate to the organization’s risk profile and security needs.

Best-practice implementations are usually one-size-fits-all and not tailored to the specific environment. Being predictable, there are usually tools and guides on the internet that can defeat them. A generic approach of this nature also fails to leverage advantages readily available when examining the specifics of the environment.

Many solutions come with out-of-the-box security. They contain built-in policies and signatures designed for a low false-positive rate in any environment. These have difficulty identifying attack variations and can never identify other vectors, resulting in a weaker defense.

This paper is about how to do security differently.

Perimeter Vs. Data Centric

Most security solutions fall into one of two categories: Perimeter security or Data-centric security.

Perimeter security aims to prevent attackers from getting into the network and accessing internal systems. A compromised desktop is not a data breach, but once attackers have this access, they can take the next step and try to gain access to the application or the database. A data breach would only occur if the attackers can navigate from that compromised desktop to the data.

Perimeter security has two strategic limitations:

  • Internal threats – when a threat is already inside the network and behind the perimeter, it cannot be secured by these types of defenses. The same applies to threats outside the network with VPN access (.e.g. partners, consultants, and more).
  • Statistical defense – perimeter protections almost always aim to reduce the number of penetrations but not to prevent them altogether. Regardless of how good our spam filters are, we continue to get spam emails. No matter how good our personnel training is, people still click on those phishing emails.

Data-centric is a methodology that revolves around the data. It starts with where the data is stored – the database and expands outwards to the applications that process that data.

Data-centric security includes database security, application security, IAM, and more. A strong data-centric posture can prevent a data breach regardless of how many attackers penetrated the perimeter. For example, cloud-based applications rely mainly on data-centric measures.

Perimeter security is a parallel defense where attackers only need to breach one of the measures to get in. Attackers that fail to penetrate the firewall can try the mail system, social engineering, etc. Perimeter security is as strong as its weakest link.

Data-centric security is a layered security resulting in a serial defense where attackers must penetrate all the layers. A successful application breach that fails to execute an attack against the database is a failed attack. Data-centric security is the last line of defense and aims to be as air-tight as possible.

Perimeter and Data-centric defenses are not mutually exclusive, and well-balanced strategies will include both.

Landscape Considerations

When building a security strategy, we must consider the threats we perceive as relevant. For example, are we worried about external threats, internal threats, or both? If internal threats are a significant concern, perimeter security will not be effective against those, and we should beef up the data-centric components.

We must also consider the effectiveness of different measures in the current landscape. For example, as more and more people work remotely, network security and endpoint security are less effective. Once people work from home, they are outside the corporate firewall and use their home computers. With minimal control over all these remote mini-offices with VPN connections, our perimeter defense becomes weaker than expected.

BYOD (Bring-Your-Own-Device) poses a similar challenge where many uncontrolled and insecure phones and tablets roam the corporate network inside our perimeter.

As the modern landscape changes and the effectiveness of perimeter controls diminishes, there’s a growing shift to internal defenses. Part of this shift includes a redefinition of the perimeter line – protecting the data center network is more important than the corporate network.

Asset Considerations

There are several considerations when choosing which systems to focus our defenses and how to allocate resources:

  • Risk – some systems pose a higher risk than others. For example, a firewall is necessary because, without it, we will have countless attackers inside our network. We need to defend the application because all the people using it or with network access to it form a large surface area. We must also protect our database because it stores all the data, and a breach would be catastrophic.
  • Effectiveness – some systems are easier to secure with better results, while others can be expensive to protect or have limited security benefits. For example, firewalls are an effective measure and are cheap and easy to deploy – that’s an easy choice. Application or database security (depending on the solution used) could be difficult or expensive to deploy but with good results. IAM can be costly to deploy but with limited value.
  • Attack paths – some systems are on a critical path between the attackers and the data. Others are on paths that attackers could bypass. For example, an attacker must communicate with the database to extract the data. An attack is unlikely to succeed without it. However, whether an application is on such a path depends on the environment. Sometimes, the main path to the data is through a single application. In other cases, multiple applications or direct database connections offer alternative paths.

Balancing the risk, effectiveness, and attack paths is a central consideration in cybersecurity strategy, and there are multiple balancing methods. However, it’s important to remember there are many variables. For example, the effectiveness of the security depends on the solution and technology you’re evaluating since different solutions will differ in cost and effectiveness.

IDS and IPS

Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) are fundamentally different methodologies in security.

IPS aims to prevent attacks and breaches. These are classic security systems like firewalls, users & passwords, etc. IPS are critical components in any security strategy.

In contrast, IDS aims to detect attacks and alert or report on them. These include measures like auditing, SIEM, etc. IDS are also a critical component in any security strategy as they inform security personnel, allow them to initiate a response, and provide forensic information.

It’s important to understand the core differences between IPS and IDS:

  • Tolerance for false positives.
    False positives in IPS mean the system prevents users from performing legitimate activities. Therefore, IPS cannot tolerate false positives.
    False positives in IDS mean that reports or alerts require investigation by security personnel while users continue doing their jobs. While we don’t want too many false positives, most IDS are designed and calibrated to have a certain level of false positives.
  • Response time.
    IPS must determine whether activities are valid before letting them through. A slow IPS means IT systems have longer response times, and users complain. An IPS is, therefore, always designed to use real-time algorithms capable of making split-second judgment calls.
    IDS, on the other hand, can take their time to report or alert. IDS usually use more complex algorithms and can analyze large volumes of data to identify intrusions. They can refer to historical information or wait for future events to occur.
  • Circumvention.
    When an IPS system prevents an attack, the attacker is inevitably aware of the failed attempt and can try again. Therefore, attackers constantly challenge IPS systems until they find a way to penetrate them.
    IDS inform security personnel of the attack allowing them to respond in various ways. Responses can include taking systems offline, diverting attacks to honeypots, tracing the attack back to its source, and more. In all cases, attackers don’t get a second chance to try again against an identical defense. IDS systems are, therefore, much more difficult to circumvent.

Consequently, while IPS could prevent an intrusion, IDS is more likely to detect it and less likely to be circumvented. It’s always recommended, when possible, to deploy both types of systems. Deploy an IPS to block most attacks and deploy an IDS to detect the ones that got through.

Tailored Security

Every environment is different. Differences include the application architecture, technology stack, the number of users, administrator access, and more. Other parameters include the user roles, the programs in use, the network segments, the types of data, the activity profiles, etc.

Leveraging such parameters while implementing preventive and detective measures yields more effective results. Such tailored security cannot be out of the box as it requires evaluation, consideration, and planning. However, it’s vital for achieving high effectiveness in attack detection and a low false positive rate.

Customizing security this way takes time and effort, and there’s no magic bullet. But the results are far more likely to withstand an attack and prevent a data breach.

DCSA

Data-Centric Security Assessment is a service offered by Blue Core Research to help security departments quantify the risk to each of their systems and evaluate the effectiveness of different controls and strategies.

DCSA is an interview-based evaluation that uses your opinion about your systems. It will combine multiple parameters about each system in your environment and yield your risk level.

DCSA aims to help you choose the solutions and strategies with the most impact on your security to minimize the risk.

Final thoughts

The quality and strength of your security strategy depend on the time and effort that went into it. You will have a good security posture if you have a good balance between your perimeter and your data-centric, you created layered security, you combined both IPS and IDS on every system with a proper response strategy, you tailored your solutions to each environment, and you spread your resources according to the risk, effectiveness, and attack paths attackers are likely to follow.

That sounds like a lot, but it’s not that difficult. However, it means not following trends and buying solutions just because they are in fashion. It means putting efforts into implementations and not just looking to follow best practices. That requires effort and careful thought, but it’s the best way to lower the risk of a data breach.


SQL Injection Attack Detection

SQL Injection Attack Detection

This is a true story of a SQL injection attack on our website. Learn about the attack and why the Core Audit anomaly analysis database defense is the most effective way to combat this type of threat.

Introduction

We got an alert two days before New Year’s. It was shortly after midnight on December 30, 2021. It was a daily anomaly alert relating to the database backend of an old website, but it was clearly an attack.

As it later turned out, this was the first of several such attack attempts. There were three more in January of 2022, two more in February, and one more in August, October, and November. All in all, nine similar attacks over the course of a year.

Background

We use WordPress (an open-source content management system) on many of our websites and always with a MySQL backend (the most popular setup). The website in question was old, and due to compatibility problems, we haven’t updated some of the software components in a while. So when I first saw this alert, it seemed highly plausible that there was a vulnerability that led to a successful attack.

As it later turned out, this was a false positive. The reason for this false positive was that in addition to an old version of WordPress and plugins, this website also used an old version of Core Audit. But more on this later.

None of our websites contain sensitive information, but as this paper will show, protecting the database of any application with Core Audit is a highly effective means of detecting attacks and protecting the application.

The Anomaly

The anomaly alert had 168 lines, and the first line was this:

SELECT * FROM wp_users WHERE user_login = '') UNION ALL SELECT NULL-- HupP'

While each line was different, every line started with one of these:

SELECT * FROM wp_users WHERE user_login =''

SELECT * FROM wp_users INNER JOIN wp_usermeta ON user_id = ID WHERE meta_key = '' AND user_login = ''

What made it clear that this was an attack were the many end variations of the SQLs that looked like these:

;SELECT SLEEP(9)#' LIMIT 9

) AND SLEEP(9) AND (\\''=\\''

WAITFOR DELAY \\'' AND \\''=\\'' LIMIT 9

;SELECT SLEEP(9)#'

);SELECT PG_SLEEP(9)--'

;SELECT PG_SLEEP(9)--' LIMIT 9

);WAITFOR DELAY \\''--'

;WAITFOR DELAY \\''--' LIMIT 9

) AND 9=(SELECT 9 FROM PG_SLEEP(9))

UNION ALL SELECT NULL-- HupP'

) UNION ALL SELECT NULL-- HupP' LIMIT 9

UNION ALL SELECT NULL,NULL,NULL-- iIWs'

UNION ALL SELECT NULL,NULL,NULL,NULL-- ynbe'

Bear in mind that the empty strings ('') and the 9’s are not part of the original SQL. They relate to how the Core Audit security repository operates. This repository automatically collects all the SQLs in the database, so to reduce storage space and eliminate anomalies from embedded literals, it automatically strips all the numbers and strings.

The attempts listed above were scanning for a SQL injection vulnerability. They were trying to detect whether SQL injection was possible and discover details that would facilitate an attack:

  • The various sleep/delay statements would only work on particular databases. So a delayed response tells the attacker both that the injection was successful and the type of database used.
  • The various UNION and NULL permutations were trying to determine the number of fields in the query and whether they could append data from other tables.

In addition to many more variations of SLEEP and UNION, there were other colorful expressions like:

) AND (SELECT 9 FROM(SELECT COUNT(),CONCAT (0x7178707671,(SELECT (ELT(9=9,9))),0x71717 67171,FLOOR(RAND(9)9))x FROM INFORMATION_SCHEMA.CHARAC

) AND 9=CAST((CHR(9)||CHR(9)||CHR(9)||CHR(9)||CHR(9))||(SELECT (CASE WHEN (9=9) THEN 9 ELSE 9 END))::text||(CHR(9)||CHR(9)||CHR(9)||CHR(9)||C

) AND 9=CAST((CHR(9)||CHR(9)||CHR(9)||CHR(9)||CHR(9))||(SELECT (CASE WHEN (9=9) THEN 9 ELSE 9 END))::text||(CHR(9)||CHR(9)||CHR(9)||CHR(9)||CHR(9)) AS NUMERIC) AND (\\''=\\'')

;SELECT DBMS_PIPE.RECEIVE_MESSAGE(CHR(9)||CHR(9)||CHR(9)||CHR(9),9) FROM DUAL--'

AND 9=CONVERT(INT,(SELECT CHAR(9)+CHAR(9)+CHAR(9)+CHAR(9)+CHAR(9)+(SELECT (CASE WHEN (9=9) THEN CHAR(9) ELSE CHAR(9) END))+CHAR(9)+CHAR(9)+C

) AND (SELECT 9 FROM(SELECT COUNT(),CONCA T(0x7178707671,(SELECT (ELT(9=9,9))),0x717 1767171,FLOOR(RAND(9)9))x FROM INFORMATION_SCHEMA.CHARACTER_SETS GROUP BY x)a) AND (\\''=\\'')

) AND 9=CONVERT(INT,(SELECT CHAR(9)+CHAR(9)+CHAR(9)+CHAR(9)+CHAR(9)+(SELECT (CASE WHEN (9=9) THEN CHAR(9) ELSE CHAR(9) END))+CHAR(9)+CHAR(9)+CHAR(9)+CHAR(9)+CHAR(9))) AND (\\''=\\''

These SQLs are clearly unusual and seem to attempt to bypass a SQL injection protection system like a WAF.

Forensics

We experienced an attack. The alert left no doubt about that. The two remaining questions were:

  • Was the attack successful?
  • Which part of the application was targeted?

Database Forensics

The first question was easy to answer. I started by locating the queries in the reduced SQL forensic view. The reduced SQL repository has a 5-minute resolution, and all these SQLs executed within the 5-minute window of 8:20 am to 8:25 am.

Once I located the queries, I also saw the good news – they were all successful (no errors), and the number of rows retrieved was zero for all.

Why are successful executions good news? Because this attack was trying to test whether the application was vulnerable to SQL injection. In this scan, most of the queries were supposed to fail, with the few successful ones indicating the method to exploit a vulnerability. For example, the SQLs with PG_SLEEP() could never be successful on our MySQL database since this function only exists in PostgreSQL databases. Therefore, successful executions mean the injection attempt failed to modify the SQL construct.

Additionally, these SQLs didn’t retrieve data, so there was no leak. While most of these SQLs were not attempting to extract anything, it’s comforting to know nothing was retrieved.

In other words – this attack failed to break through the literal boundary. We’ll get back to this subject later and also answer the more interesting question of why we got the anomaly alert in the first place.

Application Forensics

Now that we have more information, it’s easy to search the Apache logs for more details on the attack and what it targeted.

The attack was between 8:20 am and 8:25 am and accessed the wp_users table. Looking at the Apache log, we can see this attack started at exactly 8:20 am:

[29/Dec/2021:08:20:00] "GET /wp-login.php?log=1&pwd=-9696%20UNION%20ALL%20SELECT%2024%2C24--%20ptzf"

This attack was a GET request to the wp-login.php script. That is the WordPress login page. The attack delivered its injection attempt through the password field that, in this first attempt, included:

-9696 UNION ALL SELECT 24,24-- ptzf

The last attack attempt was using this POST request at 8:21:51 am:

[29/Dec/2021:08:21:51] "POST /wp-login.php?Phws%3D5963%20AND%201%3D1%20UNION%20ALL%20SELECT%201%2CNULL%2C%27%3Cscript%3Ealert%28%22XSS%22%29%3C%2Fscript%3E%27%2Ctable_name%20FROM%20information_schema.tables%20WHERE%202%3E1--%2F%2A%2A%2F%3B%20EXEC%20xp_cmdshell%28%27cat%20..%2F..%2F..%2Fetc%2Fpasswd%27%29%23"

The Apache logs don’t record the POST parameters, but we can see this payload on the request:

Phws=5963 AND 1=1 UNION ALL SELECT 1,NULL,'<script>alert("XSS")',table_name FROM information_schema.tables WHERE 2>1--/**/; EXEC xp_cmdshell('cat ../../../etc/passwd')#

This payload is a little funny as it contains a SQL injection with a cross-site scripting scan (alert(“XSS”)) and an attempt to have SQL Server execute a shell command (EXEC xp_cmdshell) with a Unix/Linux command printing the content of a Unix/Linux password file (cat …/passwd).

That is a mix of attack fragments that could never work together. And there are several other things wrong with this last request and payload.

That indicates the person running the attack had little understanding of what they were doing. Combined with the fact that the whole scan lasted just under 2 minutes, it suggests this was a script or, more likely, several scripts they downloaded off the internet and executed against various websites.

False Positive

So why did the attack fail, and why did we get an anomaly alert anyway?

A SQL injection attack attempts to modify the SQL construct by breaking through the literal boundaries. In other words, when a SQL contains a literal like this:

SELECT * FROM wp_users WHERE user_login = 'JOHN'

A SQL injection attack attempts to send a string other than “JOHN” so the SQL construct will change. For example, the string “X' or 'Y'='Y” will result in this SQL:

SELECT * FROM wp_users WHERE user_login ='X' or 'Y'='Y'

By changing the SQL construct and adding or 'Y'='Y', the database will run something not intended by the developer who wrote the code. Putting a tag (') in the input broke through the literal boundary and allowed the attacker to alter the SQL construct. Escaping literals before embedding them into a SQL prevents this vulnerability.

In the SQL standard, you can escape tags (') by using double tags (''). When escaped, the above example will yield:

SELECT * FROM wp_users WHERE user_login = 'X'' or ''Y''=''Y'

In this case, the database will compare the user_login field to the entire string, and the word OR is just part of the user name, not part of the SQL construct.

WordPress must have escaped the input correctly in our attack, and the SQL construct was not modified. That’s why the SQLs executed without error, and the attack failed.

But why did we receive an anomaly alert if the attack didn’t break the literal boundary?

Unlike other databases, in MySQL, there are two ways to escape tags in strings. The first is with double tags ('') according to the SQL standard. But there’s a second method of preceding the tag with a backslash (\'). WordPress uses the second method.

That’s where the old version of Core Audit comes into play. That old version was released shortly after the release of MySQL support, and the literal stripping in the Reduced SQL repository did not support the backslash escape method for MySQL databases. As a result, the old version thought the tags were not escaped and didn’t strip the literals correctly.

Once we discovered the unintended consequence of falsely detecting these SQL injection attacks, we purposely kept that old Core Audit version on that website. We wanted to see how many more failed attacks we’ll experience. As stated earlier, this old website experienced nine attempts over the following year. Once we upgraded the Core Audit version, we stopped receiving these false positive anomaly alerts.

Application attack surface area

You might wonder why someone attempted a SQL injection attack that doesn’t work against WordPress. The attackers, like anyone else, had access to WordPress. So why didn’t they know this attack would fail?

That is an interesting question that highlights the complexities of supply chain attacks in some 3rd party applications, like WordPress.

The first thought that comes to mind is that maybe some versions of WordPress are susceptible to this attack. However, SQL injection is a well-known threat, and the WordPress development team is strict about escaping inputs before embedding literals in SQLs. While using bind variables would be safer, web developers have a penchant for embedding literals.

However, it’s worthwhile noting that multiple versions and patches significantly increase the attack surface area. The reason is that most organizations upgrade and patch applications from time to time and can never be sure which vulnerabilities they eliminated or introduced and at what point.

But since it’s unlikely that any WordPress version was susceptible to a SQL injection on the login screen, that brings up another interesting feature of WordPress – Plugins.

A big part of the power and flexibility of WordPress are the tens of thousands of plugins that can extend it. These plugins are code that can attach to various places in WordPress and modify its behavior in almost any way imaginable.

Plugins offer power and flexibility to the users, but they also pose a security risk. WordPress plugins introduce additional vendors, developers, coding standards, changes to the database data model, new execution paths in the application, and, of course, new vulnerabilities.

The risk in plugins is not only vulnerabilities in 3rd party software but also the risk of supply chain attacks. Such attacks could be initiated by the plugin authors or by a hacker who altered their source code.

It’s unlikely that WordPress was ever susceptible to this attack, but it’s highly plausible that some plugin was. The attacker was probably targeting a plugin that was not installed in our WordPress.

Final thoughts

SQL injections are notoriously hard to detect and, even more so, to prevent. However, Core Audit anomaly analysis was able to easily alert on those attacks.

Not only was anomaly analysis able to detect the attack, but it did it without any attack signatures or support for PHP or WordPress. And, actually, without even looking for a SQL injection attack.

And that is the reason anomaly analysis is so effective against SQL injection – it’s not looking specifically for that. It’s searching for SQLs that are new to the application and are, therefore, suspicious. SQL injection can masquerade in many ways but, by definition, is not part of the SQL vocabulary of the application. Anomaly analysis will, therefore, always flag it as suspicious.

WordPress Attack Detection

WordPress Attack Detection

WordPress is a common application for managing websites. But this story is about how we detected an attack on a generic application.

On Sunday morning, we got an anomaly alert. It was March 19, 2023. This story is about what happened.

Background

The Blue Core Research website uses WordPress (a free and open-source content management system). WordPress usually uses MySQL as a backend database, and our installation is no different.

While our WordPress doesn’t contain sensitive data, we still protect its database with Core Audit. We do it both to protect our server and prevent a breach from internet attack, and to use our own software.

As you’ll see in this true story, protecting the database also protects the application. WordPress happens to be the application we protected, but this defense is effective for any application.

The Anomaly

Our Core Audit installation has a relatively simple configuration with daily anomaly detection alerts. We get a few false positives every few days, but on Sunday morning of March 19, 2023, we got an alert of a new error:

Access denied for user ''@'' (using password: YES)

It immediately raised a red flag in my mind. That’s because anomalies only alert of something different, like a change in the application activity profile. But a change that causes an access-denied error seems suspicious.

The Investigation

Core Audit anomalies rely on the Core Audit Security Repository. And this particular anomaly is based on the Reduced SQL portion of that repository. The Reduced SQL repository contains every SQL construct executed in the database at a 5-minute resolution.

Once I got the alert, I logged into Core Audit and looked at the relevant forensic view. I quickly found what I was alerted about:

Date:     Sat 2023/03/18
Time:     18:55 - 19:00
Count:    1 per 5 minutes
Rows:     0
Command:  ERROR TEXT
Activity: Access denied for user ''@'' (using password: YES)

That is an internal MySQL error resulting from a failed login. The user and host in this error are between single quotes (‘) and are, therefore, stripped from the Reduced SQL repository. Striping literals is part of how the Reduced SQL repository operates.

My next stop was to look for the failed session that caused it. Switching to the session forensic view, I found this session:

Start:    2023/03/18 18:55:59
End:      2023/03/18 18:55:59
Type:     No Login
Username: username_here
Machine:  localhost

That seems unusual since we don’t have a user in the database called username_here. That is, obviously, an invalid username. My next stop was to look at the web server logs to see who or what triggered this unusual login.

Here is an excerpt from the Apache log:

[18:55:36] GET /.wp-config.php_copy
[18:55:38] GET /.wp-config.php.rar
[18:55:39] GET /.wp-config.php.7z
[18:55:41] GET /.wp-config.php.tmp
[18:55:42] GET /.wp-config.php_tmp
[18:55:44] GET /.wp-config.php.old
[18:55:46] GET /.wp-config.php.0
[18:55:47] GET /.wp-config.php.1
[18:55:49] GET /.wp-config.php.2
[18:55:50] GET /.wp-config.php.zip
[18:55:52] GET /.wp-config.php.gz
[18:55:53] GET /.wp-config.php~
[18:55:55] GET /wp-config.php.templ
[18:55:56] GET /wp-config.php1
[18:55:58] GET /wp-config.php2
[18:55:59] GET /wp-config-sample.php
[18:56:00] GET /wp-config-backup.txt
[18:56:02] GET /wp-config.php.orig
[18:56:03] GET /wp-config.php.original
[18:56:05] GET /wp-license.php?file=..%2F..
[18:56:06] GET /wp-config.save
[18:56:07] GET /wp-config.txt
[18:56:09] GET /wp-config.dist
[18:56:10] GET /.wp-config.php.swo
[18:56:12] GET /wp-config%20copy.php
[18:56:12] GET /wp-config_good
[18:56:14] GET /wp-config-backup
[18:56:15] GET /wp-config-backup.php
[18:56:16] GET /wp-config-backup1.txt
[18:56:18] GET /wp-config-good
[18:56:19] GET /wp-config-sample.php.bak
[18:56:21] GET /wp-config-sample.php~
[18:56:22] GET /wp-config.backup

The culprit is clearly wp-config-sample.php, and looking inside this PHP file, we can find these lines:

define( 'DB_USER', 'username_here' );
define( 'DB_PASSWORD', 'password_here' );
…
require_once ABSPATH . 'wp-settings.php';

We now understand what happened: someone scanned the website for vulnerabilities and called wp-config-sample.php. That PHP script contains an invalid user and password that triggered a failed connection to the database. Core Audit detected this unusual behavior and raised an alert.

WordPress?

While many won’t consider WordPress a good example of application security, we think it’s valuable because of several important factors:

3rd Party & Supply Chain

First and foremost – WordPress is a 100% third-party application. We didn’t code it, have not reviewed the source code, and have almost no knowledge of the code, the data model, or any other aspect of this application.

Being open-source software, WordPress is exposed to both vulnerabilities hackers may discover in the source code and to supply chain attacks.

To make matters worse, WordPress websites almost always use plugins. Much of the power of the WordPress platform is in its extensibility by the tens of thousands of plugins that are out there. Plugin use significantly expand the surface area of 3rd party software and supply chain attacks. Plugins mean a lot of additional unknown code from multiple vendors or developers, different coding standards, enhancements to the data model with more tables, and more.

In other words – WordPress is what you might consider and extreme example of a 3rd party application and protecting it is more challenging than most other applications.

Dynamic SQL

WordPress takes the concept of dynamic SQL to an extreme. Not only do all the SQLs embed literals, but SQLs are often dynamically generated with where clauses created on the fly based on user input.

The challenge is not only dynamic SQL attacks but also the large SQL vocabulary where it’s impossible to predict all the SQL combinations WordPress may generate.

Again, plugins worsen the problem due to an even larger and unknown SQL vocabulary and more potential vulnerabilities in the code that generates those dynamic SQLs.

Popular & Internet facing

WordPress is very popular and is an internet-facing application. Hackers are, therefore, highly incentivized to find vulnerabilities. Exploiting such vulnerabilities can also be easy since the internet is full of WordPress websites with little or no additional security.

Bottom line – we provided effective defense to a large, dynamic, and extensible application like WordPress without relying on known vulnerabilities or the specifics of the application.

Resolution

Once we knew what the attack was, the next challenge was finding a way to protect the application from this and similar attacks.

In WordPress, the file the was attacked (wp-config-sample.php) is part of the WordPress installation package. The WordPress installation process copies that file into wp-config.php and defines some parameters like the database user and password in the copy.

Therefore, wp-config-sample.php is expected to exist in any WordPress installation. And if the WordPress development team has done its job right, there is probably no means of breaching WordPress by calling it.

The attacker was clearly targeting some vulnerability in the code. And even if that vulnerability doesn’t exist in our installation today, as we all know, bugs exist, and zero-day attacks can always occur. Having that file sitting around seems like asking for trouble. That file serves no purpose in WordPress other than as a template for wp-config.php, so shouldn’t we delete it?

That takes us to another WordPress behavior – the WordPress update. When WordPress updates, it overwrites all the files that are part of the WordPress installation package. Therefore, deleting wp-config-sample.php will be undone as soon as WordPress automatically updates.

Another option is to leave the file in place but change file permissions so that Apache can’t access it. That would prevent that PHP script from executing, but may also cause errors when WordPress attempts to update and overwrite it.

The best solution might be to deny access to it using .htaccess:

<Files ~ "wp-config-sample.php">
deny from all
</Files>

Going Further

As previously explained, wp-config-sample.php is used to create wp-config.php during installation. That means that the copy (wp-config.php) could also be vulnerable to the same attack or other similar zero-day attacks.

More troubling is that if a WordPress update fixes a vulnerability in wp-config-sample.php, that fix will not update in the wp-config.php copy since that copy is never modified.

wp-config.php is meant to be included by other WordPress files like index.php. After defining the database connection parameters, the script calls wp-settings.php to set up the WordPress environment, including a database connection. That’s how we got a connection to the database during the attack.

But wp-config.php is never meant to be called directly. To protect it against direct execution attacks like we saw in this attack, we can add a line at the top of wp-config.php to stop the processing on direct execution:

if (get_included_files()[0] == '/path_to_site/wp-config.php') exit();

Final thoughts

Despite the challenges presented by protecting an application like WordPress, Core Audit anomaly analysis provides effective security. It has a reasonably low false positives level and, in this case, has alerted of an attack against the application.

It’s important to note that Core Audit doesn’t have special support for PHP or WordPress. It doesn’t use signatures and this attack could have been a zero-day vulnerability. Also, our configuration wasn’t looking for any particular vulnerability in the database or the application. We only had a general-purpose anomaly detection that alerted us as soon as the application behaved differently. That was enough to identify the attack.

As you’ve seen, monitoring changes in the database activity profile of the application lets us detect many changes in the application behavior. While some changes occur naturally, others indicate a security problem or an attack.

SQL Injection

SQL Injection

SQL Injection is one of the most well-known attack vectors and it poses a significant security challenge. Learn more about how SQL injection works, and the different approaches to solving it.

Introduction

SQL Injection is one of the most well-known attack vectors and it poses a significant security challenge. The only way to understand the benefits and deficiencies of each solution is to understand the problem and the approach each solution uses to solve it.

The Problem

The best way to understand the problem is by looking at examples. To demonstrate how SQL injection works, here’s an example of a simple application.

Application Example

Account
0
First Name
John
Last Name
Doe
Last 4 of social
5555
Sumbit
SELECT card FROM cards WHERE
  account = 0 AND
  first = ’John’ AND
  last = ’Doe’ AND
  ssn = 5555

Let’s assume our application is a simple web page that allows customer support representatives to see the credit card numbers of customers who call in.

For security reasons, the representative must enter the customer’s account number, first name, last name, and the last 4 digits of the customer’s social security number.

When all the information matches the customer’s data, the credit card numbers used by the customer will be shown.

Seems like a pretty secure design as the customer support representative has to get the information from a real customer to see their credit cards. How could a customer support representative use this form to get information about any other customer?

The problem is not in the conceptual design but in the actual implementation of the code.

To make this page work, the application running inside the web server needs to run a SQL against the database. That SQL might look like the above example.

However, this SQL seems like the correct SQL to run and there’s no apparent way to get credit card information without entering the customer’s information.

So what’s the problem with the implementation?

The Attack

Account
0
First Name
John
Last Name
Doe
Last 4 of social
0 OR 1=1
Sumbit
SELECT card FROM cards WHERE
  account = 0 AND
  first = ’John’ AND
  last = ’Doe’ AND
  ssn = 0 OR 1=1

When the form is filled this way and the SQL is executed, the database compares the conditions against every row in the table.

Let’s assume the first row is for customer 1001 named JOE SMITH with SSN 5555:

  • Is the customer number 0? No, it’s 1001
  • Is the first name JOHN? No, it’s JOE
  • Is the last name DOE? No, it’s SMITH
  • Is the SSN 0? No, it’s 5555
  • Is 1=1? Yes, 1 is always equal to 1.

In SQL, the order of operations will evaluate the AND before the OR, resulting in something like this:

(FALSE and FALSE and FALSE and FALSE)
or
(TRUE)

The ultimate result of this comparison is TRUE! That means that the credit card number of JOE SMITH will be displayed.

Since this condition will be true for all the rows in the table, every credit card in the table will be shown.

Because we filled the form in a way the programmer did not expect and did not protect against, we were able to trick the application into showing us every credit card in the database.

The String Variation

The database will consider ssn = ‘5555’ as a valid comparison and automatically convert the string ‘5555’ into the number 5555. So maybe if we embed the value as a string things would look different. Let’s fix the application to run this SQL:

SELECT card FROM cards WHERE
  account = 0 AND
  first = ’John’
  AND last = ’Doe’
  AND ssn = '5555'

In this case, the database will return an error if we fill the form with “0 OR 1=1” because it cannot convert this string into a number. Have we prevented SQL injection?

Account
0
First Name
John
Last Name
Doe
Last 4 of social
0' OR 'a'='a
Sumbit
SELECT card FROM cards WHERE
  account = 0 AND
  first = ’John’ AND
  last = ’Doe’ AND
  ssn = '0' OR 'a'='a'

With this slight modification, the attack will still extract all the credit cards because ‘a’ is always equal to ‘a’.

Truncating the SQL

Account
0 OR 1=1 --
First Name
John
Last Name
Doe
Last 4 of social
5555
Sumbit
SELECT card FROM cards WHERE
  account = $account2 AND
  first = ’John’ AND
  last = ’Doe’ AND
  ssn = 5555

Another complication an attacker might face is if there’s another AND condition after the injected code. In the above example, we injected into the account number that has several AND conditions after it.

The double dash (- -) in SQL, comments out the rest of the line, so any additional conditions can be easily disposed of.

In MySQL, # is another way to comment out the rest of the line with identical effect to – –

Another way of adding a comment to a SQL is by enclosing it in /* */. If a comment starts, most databases will consider it to be a comment until the end of the SQL (with any number of lines). So adding /* instead of – – will also comment out the rest of the SQL.

Another consideration in MySQL is that /*! */ is not a comment as the ! symbol disables the comment. This can be a useful means of bypassing security measures that ignore comments.

Parameter Fragmentation (HPF)

Account
0 OR /*
First Name
*/ 1 /*
Last Name
*/ = /*
Last 4 of social
*/ 1
Sumbit
SELECT card FROM cards WHERE
  account = $account2 AND
  first = ’$first2’ AND
  last = ’$last2’ AND
  ssn = $ssn2

Another option is to comment out entire sections of the SQL and split the attack across multiple parameters.

In SQL, any text between /* and */ is a comment and is ignored.

The nice thing about this attack is that not only did it disable entire portions of the SQL, but it also split the attack across parameters making it more difficult to detect by WAF technology. More on that later.

Splitting the SQL

Account
0
First Name
John
Last Name
Doe
Last 4 of social
1 UNION SELECT SSN FROM USERS
Sumbit
SELECT card FROM cards WHERE
  account = 0 AND
  first = ’John’ AND
  last = ’Doe’ AND
  ssn = 1 UNION SELECT SSN FROM USERS

What if the vulnerable form doesn’t access the table or columns we want? Let’s assume we want to read the SSN column from the USERS table because that table contains the full social security number and not only the last 4 digits.

How can a form that reads the cards table show us information from a completely different table?

The above SQL uses the UNION operator to combine the output from two different SQLs into one result set:

In this case, the first SELECT returns no data, but the second will dump the entire SSN column from the USERS table. The union operator will combine both into a single result set and the application will display all the rows in the SSN column as if they were credit cards.

SQL Batch

Account
0
First Name
John
Last Name
Doe
Last 4 of social
0; CREATE LOGIN hacker WITH PASSWORD = 'password'
Sumbit
SELECT card FROM cards WHERE
  account = 0 AND
  first = ’John’ AND
  last = ’Doe’ AND
  ssn = 0; CREATE LOGIN hacker WITH PASSWORD = 'password'

In Microsoft SQL Server and Sybase, SQLs are sent to the database in batches. A batch is a sequence of SQLs sent to the database in a single request, and all requests are batches.

You can use a semi-column to separate SQLs in the batch:

SELECT c1 FROM t1; SELECT c2 FROM t2

However, the semi-column is optional and this batch will yield the same result:

SELECT c1 FROM t1 SELECT c2 FROM t2

In these databases, SQL injection can take on an entirely new level of malevolence. The injected fragment can run an entirely different SQL allowing it to change data, alter the database, and more.

This example assumes the application is connected to SQL Server with a privileged account. Note that the semi-column is optional but used for clarity

When executed, no credit cards will be returned but a new user called ‘hacker’ will be created in the database with the password ‘password’.

This method can be used to modify data, permissions, and much more. Anything that the application account has privileges to do can be done via this type of attack. The attacker can even execute the shutdown command and shut down the database.

If complete control over the database isn’t enough, SQL Server also has the xp_cmdshell procedure that allows running operating system commands and xp_regread that can read the registry. This grants the attacker complete access to the machine.

While applications should never use a privileged account in any database, this rule becomes extremely important in SQL Server databases. Application using SQL Server databases must never use a privileged account. Unfortunately, many applications in SQL Server use the highly privileged SA account.

Variations

So far we’ve mostly discussed using OR 1=1. That is the classic example that is always used and it might create the impression that if a SQL does not contain this expression there is no SQL injection.

In reality, there are endless variations on conditions that are always true and this is just a little taste of those.

We’ll start with trivial examples that use simple operators:

ExpressionExplanation
17 != 2117 is always different than 21
99 < 10099 is always smaller than 100
9-1 > 39-1 is always greater than 3
‘ABC’ = ‘ABC’The string ‘ABC’ is always identical to ‘ABC’
‘hello’ != ‘world’The string ‘hello’ is always different than the string ‘world’

We can also use some database functions to create slightly more complex expressions:

ExpressionExplanation
‘XYZ’ like ‘%Y%’XYZ contains a Y
‘B’ in (‘A’, ‘B’, ‘C’)B is in the list
7 in (select 7)7 is in the result set
translate(‘AB’,’AB’,’XY’) = ‘XY’replacing A with X and B with Y in AB results in XY

However, all these examples use literals (constants). If we had a tool that was smart enough to calculate every expression the database can perform, it would be able to determine that these expressions are always true.

This means that in theory, it could be possible to detect that the above examples are SQL injections. But that just means we need to take another tiny step forward.

The expressions below use column values and are, therefore, no tool can calculate the expressions to determine they are always true. It is impossible to distinguish these expressions from proper application logic.

ExpressionExplanation
last != ‘XYZ’last name is not XYZ
ssn > 0SSN is greater than 0
id not in (0, 1, 2)ID is not 0, 1, or 2
first != lastfirst and last names are different

While these expressions will probably always be true, it’s impossible to determine that without understanding the data in each column. There will be additional discussion on this subject in the section on static analysis.

Non-Trivial Input

All the examples so far entered data into fields in the form. This is the most trivial way to interact with the application, but far from being the only one.

The above HTML form will usually send the query to the application web server using a GET or POST request method. A GET request might look like:

…/cards?id=0&first=JOHN&last=DOE&ssn=5555

You’ve probably seen strings like that in the URL of your web browser. The same GET request can also be sent using AJAX in a way that is not visible in the URL but can be seen just as easily in the developer console (try hitting F12 in your web browser).

POST requests are usually formatted the same as GET requests but are not sent in the URL. They can also be seen easily in the developer console of any web browser.

When attackers hack, they usually send GET and POST requests directly. This allows them to manipulate the fields in ways that the application form might not allow.

These non-trivial inputs are slightly more difficult to identify and modify, and as a result slightly more difficult to exploit. However, this lack of accessibility also means that developers are unlikely to properly protect or test them.

A few examples of information that cannot be edited in a form but can be changed in a GET or POST request (among other ways):

  • Hidden input fields – these are input fields that are not visible to the user. Applications use these to keep context information about the state of the application or the user’s flow through it. While the user cannot see these fields on the screen, it is not difficult to modify their content.
    For example, a session ID could be stored in a hidden field and the application could use it in a query to the database.
  • Select fields – These are single-choice combo boxes or multiple-choice selections. While the user can only select from the options displayed, an attacker can send any value.
    For example, if you can only choose between two options that send a value or 0 or 1, an attacker can send the value of 0 or 1=1.
  • Links – We often click on links, buttons, trees, images, and other components in web applications. These links often have values embedded in them, and those values can be easily manipulated by an attacker.
    For example, the “Next Page” link we often see would have embedded in it the first item number of the next page (e.g. start=10). That first item number could be modified to include a SQL injection.
  • Cookies – Cookies are another way of storing state information in the web browser and sending it to the web server. For example, this is the standard place applications use to keep session IDs.

These types of inputs are never tested by QA or by regular users and most automated vulnerability detection systems will also fail to test these. This means that flaws in these types of inputs can lie dormant for years without even being noticed or addressed.

As a result, these non-trivial inputs tend to be far more likely to be vulnerable and provide fertile ground for targeted attacks by experienced hackers.

Parameter Pollution (HPP)

One type of attack that can only be accomplished by manipulating the GET/POST requests is Parameter Pollution. For example, what happens if the above application receives two SSN fields in the request instead of one? In other words, the GET request would look like:

?id=0&first=JOHN&last=DOE&ssn=5555&ssn=7777

The answer depends on the technology stack used by the application and the methods used to extract the parameters. Some technologies will return a comma-separated list of the duplicate parameter. So the above request will return:

5555,7777

This creates an opportunity for an attack similar to Parameter Fragmentation (HPF) but without requiring multiple parameters or knowing their order in the SQL. For example, sending these 4 SSNs (broken into lines for clarity):

…/cards?id=0&first=JOHN&last=DOE
&ssn=0 OR /*
&ssn=*/ 1 /*
&ssn=*/ = /*
&ssn=*/ 1

Will result in this SQL:

SELECT card FROM cards WHERE
   account = 0 AND
   first = ’JOHN’ AND
   last = ’DOE’ AND
   ssn = 0 OR /*,*/ 1 /*,*/ = /*,*/ 1

That is a fragmented SQL injection attack using a single parameter.

No Keywords

Account
account/*
First Name
John
Last Name
Doe
Last 4 of social
1
Sumbit
SELECT card FROM cards WHERE
  account = $account2 AND
  first = ’John’ AND
  last = ’Doe’ AND
  ssn = 1

So far all the attacks used SQL keywords like OR and UNION. Even though they are simple words in English that could appear in normal data they make the input look suspicious.

But with a little investigation and effort, it’s possible to create a SQL injection that has no keywords or even spaces.

In this example, we just check whether a column is equal to itself and comment out the rest of the SQL. Since the account is always equal to itself, this SQL will return all the cards in the table.

Truncating without Comments

Account
1 OR 1=1 OR 1=1
First Name
John
Last Name
Doe
Last 4 of social
1
Sumbit
SELECT card FROM cards WHERE
  account = $account2 AND
  first = ’John’ AND
  last = ’Doe’ AND
  ssn = 1

Another repeating motif in these examples has been the usage of comments. We many times used comments to remove unwanted portions of the SQL. But there’s another way.

We highlighted the OR in blue. In the order of precedence, ORs are last so anything separated by an OR is evaluated separately. The red portion always evaluates to true, and so the entire WHERE clause is always true.

Misleading Numerical Inputs

Account
cast(1 as varchar(10)) OR 1=1
First Name
John
Last Name
Doe
Last 4 of social
1
Sumbit
SELECT card FROM cards WHERE
  account = cast(1 as varchar(10)) OR 1=1 AND
  first = ’John’ AND
  last = ’Doe’ AND
  ssn = 1

One approach for defeating some defense systems is to create confusion about the nature of the parameter.

The Account parameter doesn’t look like numerical input and automatic defenses that aim to validate it as a string will consider this to be a safe string to execute as it has no tags (‘) that terminate the string.

cast() is a standard database operator that converts the number to a string. The database will consider this to be valid input and convert the string back to a number to compare it to the account.

Login Screens

Login screens are also vulnerable to SQL injection attacks. There are two ways a user can be verified by the application, and either can be breached.

Query Validates Password

Username
admin'--
Password
x
Login
SELECT username FROM users WHERE
  $w

One option is for the query to validate the password. If the password matches the row is returned, otherwise, nothing is returned.

The SQL will contain the hash or a hash function in the where clause. When filling out the login screen like in the above example, the password-validating portion of the SQL is disabled.

This double dash (- -) will comment out the rest of the SQL and the row for admin will be returned suggesting that the password has been authenticated.

Query Retrieves Password

Username
' union select '5f4dcc3b5aa765d61d8327deb882cf99
Password
password
Login
SELECT password FROM users WHERE
  username = '' union
select '5f4dcc3b5aa765d61d8327deb882cf99'

Another option for validating passwords is to retrieve the hash from the database and validate it in the code.

This SQL will return the MD5 of the word ‘password’. When the application compares this md5 to the password we provided they will match. We simply return the password hash we want instead of whatever is stored in the database.

Introduction to Solutions

It is important to understand that SQL injection is, fundamentally, an exploitation of a security flaw in the code. Just like there are no guaranteed ways to ensure the code has no bugs, it is difficult to ensure the code has no SQL injection vulnerabilities. However, just like with any other bug, the best solution is to fix the flaw in the code. The alternative is to try and prevent, detect, or make it more difficult to exploit the vulnerabilities, but all these approaches have limitations.

The next sections will review different approaches and discuss the benefits and limitations of each. But before diving into the details, let’s perform a high-level review to get the lay of the land.

Solutions in the Code

As mentioned above, the best solution is to have high-quality code and proper coding techniques that avoid the possibility of SQL injection.

The best way to avoid a SQL injection attack is to use bind variables. This is the perfect solution and it completely prevents SQL injections. Slightly less robust methods include escaping input and input validation. It is also recommended to reduce client error reporting.

More on all these techniques, their value, limitations, and how to implement them later.

Database Solutions

SQL injection is ultimately a database attack that takes advantage of the flexible nature of SQL. Therefore, the most obvious place to look for a solution is in the database.

Several technologies can help detect and/or prevent SQL injection in the database. The problem with most is that they are not effective. Unlike most other database solutions for detecting SQL injection, the unique technology in Core Audit is extremely effective.

More on these various database technologies and their differences later.

Web Application Firewall (WAF)

The difficulty in identifying SQL injection at the database level created a new product category — Web Application Firewalls (WAF).

The idea is that it would be easier to detect the SQL injection attack when examining the application input rather than when looking at the SQL that was sent to the database.

It’s a nice theory, but it only works on simple or specific examples. When looking to prevent all the different types of SQL injection attacks described in this paper the problem becomes insurmountable. More on that later.

In-Application Firewall

A new technology developed by Blue Core Research aims to take WAF technology to the next level and incorporate it into the application.

The Core Audit Java Agent provides detailed auditing data about everything that happens in the application. It can also apply firewall-type blocking rules around the entire application or sensitive areas in the code.

An In-Application Firewall can benefit from context and information that are only available inside the application to protect the right methods correctly.

Detect

This wide range of approaches don’t only differ in their technology and effectiveness, but also in the value they deliver. However, there is also an inverse correlation between value and effectiveness.

Detective approaches aim to alert you of a possible SQL injection. The downside is that they don’t block it and you’ll need to take action against the attacker as well as fix the code.

The perceived value seems to be lower but the benefit is that the detection can be tuned to be sensitive and provide a fairly effective mechanism. A powerful detective engine such as the one found in Core Audit is likely to notify you of any SQL injection attempt.

Prevent

Preventive approaches are divided into two primary categories: blocking and curing. Depending on the type of solution both options might be available, like in the Core Audit Java Agent.

The benefit of blocking is that SQL injection attempts that can be recognized by the blocking engine will be prevented.

While there are significant differences between various products, none can claim to automatically pass all legitimate activity and block all attacks. More on that later.

In addition to their limited effectiveness, preventive solutions run the risk of blocking legitimate activity. The more versatile the activity and the more strict the blocking, the more likely it is that some valid activity will be blocked.

The Core Audit Java Agent, for example, offers calibration between lenient and strict methods so that you can apply the appropriate level of filtering to the relevant activity.

Sanitize

Finally, some solutions aim to cure the problem without blocking it. With these solutions, malicious injections are removed from the input and allowed to run safely. Core Audit also has this capability.

While this is a very nice advertisement, in many solutions it comes at a heavy price of introducing additional vulnerabilities. The problem is that an attacker can craft an attack that will leverage the sanitizing process as part of the attack.

This means that In addition to the limited effectiveness of SQL injection detection, using sanitizing can reduce the effectiveness of the solution by offering the attacker additional tools to bypass the security. More on that later.

Escaping the Parameter Cage

When looking at the various examples it becomes clear that SQL injection is about changing the SQL contract by escaping the parameter cage.

If the parameter is a string, then a SQL injection will need to terminate the string before it can modify the SQL. That usually means having a tag (‘) in the input.

Numerical parameters are terminated by whitespace (space, tab, newline, etc.) or other characters that are not part of the literal (for example, a /).

Detecting an escape from the parameter cage is one method for recognizing SQL injection in application firewalls but it relies on correctly identifying the parameter type.

Another common method is the identification of suspicious keywords or special characters. This method also has significant limitations.

Solutions in the Code & Best Practices

As mentioned above, the best solution is to have good code. Proper coding techniques can reduce and even eliminate the risk of SQL injection. This section will discuss these in detail.

Bind Variables

All SQL injection attacks involve escaping the previously mentioned parameter cage. Fortunately, all databases come with a full-proof API for preventing that. It’s called bind variables.

Bind variables are a way of sending the parameters separately from the SQL rather than embedding the constants into it. For example, the previous SQL used in the examples would look like this with bind variables:

SELECT card FROM cards WHERE
  account = :1 AND
  first = :2 AND
  last = :3 AND
  ssn = :4

When this SQL is sent to the database, the programmer needs to also send the values for the bind variables (:1, :2, :3, :4).

Writing SQLs this way avoids the database misinterpreting part of the value for being part of the SQL construct. The values can contain any character and the database will not be confused.

Using bind variables also has a performance side benefit. When the same SQL executes multiple times with different bind variables it only has to be parsed once. That is, actually, the original purpose for bind variables.

Unfortunately, rewriting the entire application to use bind variables might be expensive, difficult, or entirely impractical.

Escaping Input

The most common approach for fixing the application code is to escape all input fields. For example, if an input field contains a tag (‘), replace it with two tags (‘ ‘). The escaped quote can be embedded in a SQL without causing an injection attack.

Escaping the input John O’Brian in our example will yield this SQL:

SELECT card FROM cards WHERE
  account = 0 AND
  first = ’John’ AND
  last = ’O''Brian’ AND
  ssn = 5555

Escaping input is only half the solution because it only solves the problem for string input. Numerical input requires a different approach mentioned below.

While this is not the best method for preventing SQL injection, it is the most commonly used one. This is due to the relatively small impact on the code and, consequently, cheaper implementation costs.

Numeric Input

Fixing numerical input is, on the one hand, a very simple solution. However, this solution has to be individually implemented for every numerical input making it potentially time-consuming, difficult, and costly to implement.

Every programming language contains a function for converting strings to numbers and means of embedding those numbers into a string. By converting the input into a number before embedding it into the SQL string we ensure that it is an actual number.

For example, unsafe code in PHP might look like this:

$sql .= " id= ".$_GET["ssn"];

While the safe version looks like this:

$sql .= " id= ".intval($_GET["ssn"]);

$_GET[“ssn”] reads the GET parameter ssn and wrapping it in intval() will ensure that it’s a number.

In Java, the unsafe looks like this:

sql += " id = "+ssn;

While the safe version is this:

sql += " id = "+Integer.parseInt(ssn)

Similar examples can be made for any programming language from Python to C. The idea is always the same — convert to a number and then back into a string.

It should be noted that there are slight differences in the way this technique works in different languages and those depend on how the conversion function works.

For example, the intval() function in PHP would convert part of the string if possible, so the string “7 OR 1=1” would be converted to 7. If the input could not be parsed, the function would return the value 0.

In Java, however, the Integer.parseInt() function would attempt to convert the entire string to a number and any obstacle would cause it to throw an exception.

Regardless of the differences, all these implementations are safe and none would allow a SQL injection to pass through the numerical parameter.

Input Validation

Input validation is the concept of ensuring the input is acceptable and fits the operational parameters of the function. It is a much wider concept than stopping SQL injection aimed at preventing many types of bugs including other types of attacks.

Because of its wide nature, it’s hard to consider code that performs input validation as resilient to SQL injection. Input validation could include tests that ensure strings only have escaped tags (‘) and numbers contain valid numbers. However, these could just as easily not be included in the input validation tests.

For example, consider this little piece of PHP code:

if ($_GET['ssn'] > 0 &&
$_GET['ssn'] < 9999) return;

$sql .= " id= ".$_GET["ssn"];

This code checks that the ssn is between 0 and 9999 and will reject any input that is not within range. While this seems like a safe piece of code, sending it “7 OR 1=1” will bypass the test and result in a SQL injection.

The reason is that to perform the comparison, PHP must convert the string input to a number. It is as if each of the $_GET[‘ssn’] expressions in the if() statement is wrapped by the intval() function. Therefore, “7 OR 1=1” will be converted to the number 7 which falls within the valid range.

Therefore, if input validation is used to help prevent SQL injection, it is important to specifically require and test that inputs are valid from a SQL injection standpoint. This would usually entail escaping the input as described above.

Error Messages & Blind Injection

Application developers often display to the user the errors they got from the database. While this is useless information to the user, it is valuable information to the developers that need to debug the problem.

However, this same information is also valuable to a hacker that wants to perform a SQL injection attack. Database error messages can contain fragments of the SQL and other indications that can help a hacker tune the attack to work.

Not having these database error messages requires a hacker to do what’s known as Blind Injection. This involves more complex and difficult techniques that attempt to obtain information without seeing the database error messages.

One such technique is to use if() conditions that call a sleep function (e.g. waitfor) if they are true. In other words, know whether something is true or not based on how long it takes the query to complete.

The bottom line is that Blind Injection is more difficult and there’s no good reason to display error messages to the users. When a developer needs to debug a problem, they can always enable error messages for a particular scenario while performing their investigation.

Errors & Vulnerability Detection

Most applications have many pages with even more input fields and most hackers are far too lazy to try them all. So hackers use scanning tools that go through every input on every page and try to enter the type of input that might cause vulnerable code to return an error.

If certain types of inputs result in the application showing an error, there’s good reason to further investigate the vulnerability of that page.

It is important to remember that this discussion about Errors and Vulnerabilities assumes that the hackers have to “experiment” on the live application. Hackers that have another type of access to the application software will use it to detect vulnerabilities and learn how to exploit them.

Other types of access include:

  • Off-the-shelf software that can be downloaded and installed
  • Access to non-production systems (Dev, QA, pre-prod, etc.)
  • Access to the source code or free software

This leads to several interesting conclusions:

  • Don’t show errors to the users. Users will complain when they don’t get data. Showing users cryptic error numbers or messages does not improve their experience.
    Errors are useful for developers that need to debug the problem and for hackers that want to find places in the application that are vulnerable. Not showing errors at all is one way of hiding where the code is vulnerable.
  • Monitor for database errors. When operating properly, applications should not generate errors. However, when hackers try to locate vulnerable code and exploit it, you are likely to see database errors.
    When concerned about SQL injection, you should especially look out for errors that involve invalid SQL. This is one of the many things Core Audit can help you achieve.
  • Run the scanning tool yourself. Vulnerability scanning tools used by hackers are freely available on the internet. Running such tools against your application will show you at least some of the places where your code is vulnerable.

Database Solutions

SQL injection is ultimately a database attack that takes advantage of the flexible nature of SQL. This section will discuss various technologies that can help detect and/or prevent SQL injection in the database.

Static SQL Analysis

SQL analysis is the oldest technology that attempted to detect SQL injection. The severe limitations of this technology gave rise to WAF technology.

Static-analysis aims to analyze SQL to identify suspicious expressions in the where clause. The classical example of OR 1=1 will always be detected by these tools, but what about more complex ones?

The above section about Variations is a good place to start exploring the strength of a static SQL analyzer.

No static analyzer has support for such a wide array of operators, functions, and expressions. But even if one did, it would still be limited to expressions that use literals.

Expressions that take advantage of column names (such as first != last) or SQL Batch are outside the theoretical limits of static analyzers and they would be unable to detect those.

A shortlist of expressions that could be identified by static analyzers includes:

  • Operators — a static SQL analyzer needs to have a complete list of all database operators including +, -, *, /, =, !=, <, <=, >, >=, <>, IN, NOT IN, ANY, SOME, ALL, BETWEEN, LIKE, etc.
  • Functions — a static SQL analyzer needs to have a complete list of all database functions including string manipulation (Concat, Substr, Lower, Upper, Trim, Replace, Translate, etc), numerical functions (Ceil, Round, Floor, Trunc, Mod, etc), character functions, date functions, conversion functions, and more.
  • Expressions — a static SQL analyzer needs to evaluate expressions the same way the database would, including if conditions, subqueries, and more.

AI & SQL Intention

More advanced solutions than static analyzers use machine learning and AI to learn the difference between a “good” and “bad” SQL. These rely on complex language analysis that aims to understand the intention of a SQL.

It’s difficult to determine the effectiveness of this technology because it relies on every customer teaching it what is good and bad in their particular environment.

As a result, any SQL that is misclassified can simply be attributed to the fact that the customer hasn’t taught it yet. Similarly, any SQL injection that is detected doesn’t mean that the next one will be detected as well. It’s just part of the unpredictable nature of AI.

However, it is difficult to imagine that even the best AI taught by the most diligent customer will be able to detect OR first != last as a SQL injection since it can be a legitimate part of the application logic.

Anomaly Analysis

Blue Core Research developed a unique technology for detecting SQL injection. Unlike static analyzers and AI, we don’t attempt to understand the SQL. As we’ve demonstrated above, the simple truth is that it’s impossible to do this well.

Our solution to SQL injection detection is to leverage our Full Capture technology and our repository technology to create a list of all the SQL constructs that run in the database.

Since applications repeat the same SQLs over and over, over time we can learn every SQL the application uses. All that’s left is to identify SQL constructs that we haven’t seen before and flag those as possible SQL injection attempts.

For example, if we’ve never seen the application run a particular SQL with OR first != last in the where clause, we’ll alert you of a possible SQL injection attempt when we see it.

Anomaly Analysis is also the only technology that will always be able to successfully detect the injection of an entire SQL using a batch. This type of injection is especially difficult to detect because there’s nothing suspicious about the injected SQL itself. The only thing suspicious is that this SQL is something the application never does. Anomaly Analysis will easily detect that.

The Core Audit Anomaly Analysis engine can do a lot more than identifying SQL injections, but that’s just one more benefit of this technology. Anomaly Analysis is the most effective technology for detecting SQL injection.

Web Application Firewall (WAF)

As mentioned above, Web Application Firewall is a technology that was born to deal with the threat of SQL injection. It started when the only other technology at the time was static SQL analysis which was quickly proven to be ineffective.

Some vendors go as far as claiming they can fix the SQL injection problem without fixing the application code. These claims are very appealing to customers, but they are entirely false. These vendors market their solutions as WAF which includes the word “Firewall” in the title and gives customers a false sense of security.

The idea behind WAF was that while it is impossible to detect SQL injection through static analysis of the SQL sent to the database, it should be possible to detect the injection when doing static analysis on the input sent to the application. That’s a nice theory, but the examination we’ll perform below will show it is incorrect.

WAF Concept

The concept behind WAF is to examine the user input before it’s incorporated into the SQL. The idea is that a SQL fragment will look suspicious in the user input but impossible to detect once it blends into the complete SQL.

For example, having OR first!=last in the where clause of a SQL could be a normal part of the application logic and impossible to distinguish from it. However, that phrase would be highly suspicious in regular user input.

Many hackers are lazy and use SQL Injection scanning tools. These tools go through every parameter on every page searching for potential vulnerabilities. Accordingly, many WAF implementations turned from something that protects against SQL injection to something that detects the type of unusual input these tools generate. This isn’t protection against SQL injection nor is it effective since there are many tools (including custom ones).

However, this brings up an interesting question – Why purchase a WAF that checks user input against known patterns in scanning tools when you can find all those vulnerabilities yourself? All you need to do is run a SQL injection scanning tool. These tools are available on the internet for free.

There are several fundamental problems with this concept:

  • SQL Injection doesn’t have to look like a SQL fragment — A SQL Injection could be as simple as “id –” (see the section on Keywords). It doesn’t have to have a keyword like OR, UNION, etc. It doesn’t have to have special symbols like = or !=. It could be something that will trick even a human inspection of the input.
  • Lack of Grammar — Unlike SQLs, user input doesn’t follow any predictable grammar. There’s no way to know what the input means or why the user entered it. This makes any type of static analysis extremely limited and entirely based on heuristics, signatures, and search patterns.
  • Different parameters have different meanings — User input varies a lot across different parameters in the application and across all applications. Sometimes it’s a page number, sometimes a search request, and sometimes someone’s name. Therefore, it’s difficult to define rules that are always true for any input.
  • Safety is code dependent — While some parameters may be vulnerable to SQL injections, others might be properly handled and safe. The input O’BRIAN might be suspicious because it has a tag (‘) and, therefore, is trying to search for code vulnerabilities. At the same time, it might be safe if the code handles this particular input properly.

WAF Methods & Challenges

All WAF solutions face the same basic challenges. The difference between the tools is in how they balance those challenges and how good is the implementation.

Parameter Type

As mentioned above, SQL injection is all about escaping the parameter cage. However, the parameter cage depends on the parameter type. To escape the cage of a textual parameter you need a tag (‘). To escape the case of a numerical parameter, you need a literal delimiter character such as space, slash, etc. Numerical parameters can also be replaced by expressions that change these rules.

Detecting the parameter type might not be as simple as it seems considering problems such as Misleading Numerical Inputs mentioned above.

If we prevented any character that is not a letter or a digit, we could completely prevent SQL injection. However, that means that, for example, no one can enter a space anywhere in the application input. That’s not a problem for numerical input, but it is a problem for textual one.

By guessing the parameter type, a WAF can apply different filters to different parameter types and allow a wider range of input increasing the risk of SQL injection.

Special Characters

The parameter type problem expands into various special characters that can be used in SQL injection as well as other types of injections (e.g. file name injections like LFI, RFI, etc).

There are many special characters and when relying solely on filtering special characters we will quickly prevent many types of valid application input.

Examples of special characters are

  • Comments can be a double dash (- -), a /*, and in MySQL also a hashtag (#).
  • Parenthesis can be a function call or a subquery.
  • Strings are terminated by a tag (‘), but in MySQL, strings can also be enclosed by double quotes (“)
  • Numeric literals end with whitespace (space, CR, LF, Tab), a slash (/), a dash (-), and many other characters.
  • A semi-column (;) can terminate a SQL in a batch

Keywords

The other strategy is filtering keywords that appear in SQL like OR and UNION. However, in batch injection a semi-column (;) is optional and a new SQL can start at any point. That means that SELECT, INSERT, UPDATE, DELETE, CREATE, ALTER, DROP, and a few other keywords also need to be filtered.

But even without the influx of keywords, “UNION BANK OF CALIFORNIA” contains the word UNION. And it’s not difficult to think of phrases that contain the word OR.

Mixed Strategy

Ultimately, a good WAF solution needs to balance various strategies of looking at special characters, keywords, parameter types, and more.

Unfortunately, even with all the strategies combined, it is difficult to strike the right balance that will allow most of the legitimate inputs and prevent most of the malicious ones.

Automatic or Configurable

One of the challenges WAF solutions face originates in customer expectations. Customers want a one-click solution. Push one button and get protection from SQL injection. Since customers want, salespeople promise to deliver.

Unfortunately, as we’ve explained so far, it’s simply impossible. A solution that will work on the entire application and pass more legitimate input will also pass at least some types of SQL injection attacks. It only takes is one successful type of attack to compromise all the data in the database.

Instead of looking for the “One button” solution, customers that are willing to customize different strategies for different types of input can enjoy superior protection with less false positives.

The more configuration the WAF solution has and the more granular the control, the more likely it is that it can be tailored for your particular application and offer better protection.

Implementation

Just like any software solution, there are significant differences between good and bad WAF implementations. Here are a few examples that might seem funny but are entirely real:

  • WAF might filter the word “union” in lower case, but fail to filter “UNION” in upper case or “UnIoN” in mixed case.
  • WAF might filter the word “OR” but failed to see that %4F%52 is the same thing in hex.

There are many other potential flaws and weaknesses in WAFs and they ultimately become yet another piece of software whose bugs can be exploited.

Unfortunately, there is no simple and reliable way to measure the quality of a WAF solution and no unbiased publication. Testing it yourself with the many examples on this page can give you an idea of the quality of a solution.

The WAF Sanitizer

Some WAF solutions block malicious activity, others clean it, and some offer both options. The sanitizer is the portion of the WAF that cleans the malicious activity. Like any piece of software, this too can have weaknesses.

Sanitizers have a particular type of weakness that happens a lot – by removing one malicious attack they expose another malicious attack.

If the WAF solution has the option to block potentially malicious activity and avoid the sanitizer, it is safer to do so.

For example, these inputs:

1 O/**/R 1/**/=/**/1
un/**/ion SEselectLECT * F'RO'M

Could be sanitized by removing comments, symbols, and keywords to produce:

1 OR 1=1
union SELECT * FROM

In-Application Firewall

Blue Core Research developed a new type of technology that can supplement or replace a WAF. This technology runs inside the application instead of outside and offers control over the protection applied to different parts of the code without changes to the source.

The Core Audit Java Agent uses native JVM capabilities to add security at runtime to the regular JAR files you’re currently using.

The Core Audit Java Agent can add several types of security, but since this paper is about SQL Injection, we will only go into detail about the Basic Blocking capabilities that aim at addressing this threat.

The types of security the Core Audit Java Agent can add are:

  • Auditing (L2-3) — Full Capture of the application sessions and the activity they perform. Core Audit native application auditing levels 2-3 include report generation, alerting, forensics, analysis of behavioral profiles, and more.
  • Basic Blocking — Apply WAF-style firewall rules at various points in the application. More details below.
  • Blocking Policies (L4) — Apply custom rules about who can do what in the application. For example, restrict which IPs or users can run what activity at what times.
  • Reactive Blocking — Configure policies that automatically block IPs or users with too many failed logins or errors across multiple application servers.

Filter & Parameter Control

The Core Audit Agent comes with multiple filters that can be applied as needed to different parameters. You can apply tight SQL injection filters to some parameters and more lenient to others.

There are numerical filters for numerical data, String filters for textual data, general Safe filters that auto-detect the data type, Keyword filters, and more. Having multiple filters that can be applied in different ways gives you the flexibility to tailor the In-Application firewall to your application.

Depth Control

Filters can be applied to any method in the application. They can be applied to URL parameters that are received from the user or to a method that creates a SQL from them.

You can apply lenient filters to search parameters you believe to be safe, and strict filters when those parameters are added to the SQL to ensure they are as safe as you thought.

Areas in the application that have fragile code can be better protected, while well-tested portions of the code can be granted more latitude in their interaction with the user.

Since filters can be applied to any method within the application, they are not limited to filtering web-related activity. Filters can be applied to data read from files, from various network sources, from databases, internally generated activity, and more.

Context & Action Control

The In-Application firewall is part of the Core Audit generic application auditing technology. As such, all information reported about potential SQL injections is linked to the full user session context.

That means that every blocked activity will be related to the IP address, user, program, and every other piece of information reported by the application. You will be able to know what happened before and after even if those actions did not trigger an alert.

The In-Application firewall can block actions by throwing exceptions. These exceptions can be customized to fit the type of exception handled by each area of the code.

Additionally, the In-Application firewall has a high-quality sanitizer that can eliminate the identified threat without introducing another.

In-Application Firewall or WAF

One of the benefits of WAF is that it filters the activity before it reaches the application. The In-Application firewall cannot make this claim.

However, since WAF always runs outside the application, it’s missing the context in which the activity runs, and the ability to apply different filters at different depths in the application stack.

The In-Application firewall offers granular control over the types of filters so they can meet the specific needs of each portion of the application. There’s no need to compromise on a filter that is “generally okay” and doesn’t give the appropriate level of protection.

The In-Application Firewall runs within the application so there’s no need to deploy additional machines and filter the network traffic. It all runs on the same machine and as part of the application.

The In-Application Firewall comes with full application auditing delivered by Core Audit Levels 2-3 and can be upgraded to apply context-based blocking policies or reactive policies with Core Audit Level 4. All these capabilities are only possible with the Core Audit technology stack.

Summary

SQL injection is a difficult attack to combat, and rewriting the application using bind variables is the only true solution to the problem.

If converting the application to entirely use only bind variables is not a viable course of action, Core Audit technologies in the application and the database are the most effective in detecting and preventing possible SQL injection attacks.

Each independent approach has limitations, so the best solution is a mixed approach of improving the code, deploying effective detection in the database, and preventive measures in the application.

Beyond SQL injection, Core Audit technologies can help secure databases and applications to ensure your sensitive data is not compromised by external or internal actors.