Structure and Development of Web Applications

In this part, we will look into dynamic content in the browser, the use of databases, and briefly visit the underlying HTTP protocol.

Web Applications Continued

A few of the participants have asked how to make changes reload automatically in the web application. There are a few ways to do this, one of which is to use the Spring Boot developer tools. They can be enabled by adding the following dependency to the pom.xml file.

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-devtools</artifactId>
</dependency>

When the developer tools are in use, changes in the code trigger an automatic restart of the web application. How well this works depends on the development environment. In NetBeans, one may additionally need to take the Spring Boot Maven plugin into use. This can be done by adding the following snippet to the pom.xml file.

<build>
    <plugins>
        <plugin>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-maven-plugin</artifactId>
            <configuration>
                <fork>true</fork>
            </configuration>
        </plugin>
    </plugins>
</build>

Now, the application can be started from the command line using the command mvn spring-boot:run (run in the folder that contains the pom.xml file). As the program is developed in NetBeans, changes will trigger reloads.

When the above plugins are in use, it is also possible to use LiveReload to trigger a page refresh in the browser. The Chrome plugin for LiveReload can be found at https://chrome.google.com/webstore/detail/livereload/jnihajbhpnppcggbcgedagnkighmdlei.

Dynamic Content in the Browser

The pages that are shown to the user are defined using the HTML language. A single HTML document consists of nested and sequential elements that define the structure and the content of the page.

Each element may contain attributes that may have one or more values. For example, an input element typically has the attribute name, and the value of that attribute is then used to identify the data from that input element as it is being sent to the server. Other common attribute names include id that is used to define a unique identifier for the element and class that is used to define a classification for that element.

Javascript

While HTML is the language for defining the structure and content of a web page, Javascript is a language for adding dynamic behavior to the page. Javascript is a programming language, and like almost any programming language, it is executed one command at a time, top to bottom, left to right.

Javascript file names typically end with .js, and the files are included in an HTML page using the script element. The script element has an attribute src, which defines the location of the source code file.

When adding Javascript code to a Spring Boot project, it is typically added to the folder src/main/resources/public/javascript/. All the files in the folder src/main/resources/public are made publicly available and downloadable through the server. Given that a Javascript file -- say code.js -- is in the folder javascript, the script element is used as follows: <script th:src="@{/javascript/code.js}"></script>.

Assume that the file code.js has the following content, i.e. a function that displays an alert pop-up with the text "Hello there!".

function sayHello() {
    alert("Hello there!");
}

The Javascript file can be included in an HTML page as follows. Note that we load the Javascript file at the end of the document. This allows the browser to parse and show the page content before the script is downloaded and executed, so that loading the script does not block the rendering of the page.

<!DOCTYPE html>
<html>
    <head>
        <title>Title (shown in the browser bar)</title>
    </head>
    <body>
        <header>
            <h1>Title on the page.</h1>
        </header>

        <article>
            <p>Text content (within a paragraph, p). By pressing
            the button below, a javascript function "sayHello" is called.</p>
            <input type="button" value="Boom!" onclick="sayHello();" />
        </article>

        <!-- Ask the browser to load the Javascript -->
        <script th:src="@{/javascript/code.js}"></script>

    </body>
</html>

Modifying page content with Javascript

Each element in a web page can be accessed and modified using Javascript. Specific elements can be located using the querySelector method. It allows identifying elements based on the value of the id attribute (prefixed with a hash, e.g. #content) as well as the value of the class attribute (prefixed with a dot). The method querySelector returns only the first matching element; if multiple elements with the same class need to be retrieved, querySelectorAll is used.

<!DOCTYPE html>
<html>
    <head>
        <title>Title (shown in the browser bar)</title>
    </head>
    <body>

        <article>
            <input type="text" id="content" value="0" />
            <input type="button" value="Add!" onclick="increment();" />
        </article>

        <!-- Ask the browser to load the Javascript -->
        <script th:src="@{/javascript/code.js}"></script>

    </body>
</html>

In the above HTML document, the input field can be identified with the id value "content". Using Javascript, the value of the field could be changed as follows.

function increment() {
    document.querySelector("#content").value = "new value";
}

The above function would change the field value to "new value", which is not exactly what the function name promises. We can also change the function so that it increments the previous value by one.

function increment() {
    var value = Number(document.querySelector("#content").value) + 1;
    document.querySelector("#content").value = value;
}

Setting a value as a part of text

In the previous example, the value attribute of a text field was altered. However, some elements do not have an attribute called value. Instead, for some elements, the content is inside the element. The content inside an element can be changed, for example, using the innerHTML property of the element.

As an example, one could create a simple validator that verifies that an input field is not empty.

<!DOCTYPE html>
<html>
    <head>
        <title>Title (shown in the browser bar)</title>
    </head>
    <body>

        <article>
            <p>Type in your username</p>
            <p id="error"></p>
            <input type="text" id="content" />
            <input type="button" value="Add!" onclick="validate();" />
        </article>

        <!-- Ask the browser to load the Javascript -->
        <script th:src="@{/javascript/code.js}"></script>

    </body>
</html>

The validate function in code.js could then be as follows.

function validate() {
    var content = document.querySelector("#content").value;
    if(!content) {
        document.querySelector("#error").innerHTML = "No content to process";
        return;
    }

    // do something else
}

Adding elements to a page

New elements can be created using the createElement function. In the example below, a new p element is created, and text content is added to it. Finally, the paragraph is added to an element with the id "messages".

var paragraph = document.createElement("p");
var textContent = document.createTextNode("content");

paragraph.appendChild(textContent);

document.querySelector("#messages").appendChild(paragraph);

JSON data format and retrieving data from a server

Objects and data in Javascript are typically represented using the Javascript Object Notation (JSON) format. The format essentially follows a key: value structure, where the key-value pairs are separated by commas. The definition of an object starts and ends with a curly brace. Strict JSON, such as the data exchanged with a server, additionally requires the keys to be written as double-quoted strings. For example, a person could be represented in Javascript as follows.

var person = {name: "Jack Bauer", age: 24};

Assuming that the page would have an element with an id "user", the person details could be added to the page dynamically. Variable values can be retrieved using dot notation.

var person = {name: "Jack Bauer", age: 24};
document.querySelector("#user").innerHTML = person.name;

Often, we wish to retrieve data from the server. This can be done using the XMLHttpRequest object as follows.

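var xmlHttp = new XMLHttpRequest();
// called every time the state of the request changes
xmlHttp.onreadystatechange = function() {
    // readyState 4: the response has been fully received; status 200: OK
    if (xmlHttp.readyState == 4 && xmlHttp.status == 200) {
        var response = JSON.parse(xmlHttp.responseText);
        document.querySelector("#content").innerHTML = response.value.joke;
    }
};
// prepare an asynchronous (third argument true) GET request and send it
xmlHttp.open("GET", "http://api.icndb.com/jokes/random/", true);
xmlHttp.send(null);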

In the example above, a query is made to the address "http://api.icndb.com/jokes/random/". When a response is received, it is processed and content from the response is shown to a user in an element with the id "content".

Returning JSON Data from the Web Application

The controller classes can return JSON data as well. If a method that returns an object is annotated using the @ResponseBody annotation, then the object will be returned as a JSON object by default. Let us assume that we have a class Book, which is as follows:


public class Book {
    private String name;

    public String getName() {
        return this.name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

Now a controller that returns a Book object and has been annotated with the @ResponseBody annotation would return a JSON object as a response to the request made to that address.


@RequestMapping("/books")
@ResponseBody
public Book getBook() {
    Book book = new Book();
    book.setName("The Book of Eli.");
    return book;
}

The assignment template has some functionality for adding tasks. Your task is to alter the loadTasks function so that the existing tasks are loaded when the page is shown to the user. Do this using Javascript -- note that the server will return a list of objects.

If you want an additional challenge, add the functionality to remove tasks as well.

Note that the application has no automated tests. Once you are able to list the tasks, return your solution to TMC.

The Data has Value -- Let's store it!

The web applications that we built in the previous part of this course handled and stored data within the application. This means that when the web server or the application is restarted, the data is lost.

Almost every web application needs to store and retrieve data. The data can be stored into the file system as files by the application, or the responsibility of storing the data can be given to a separate application such as a database management system. Database management systems have been built with the explicit purpose of storing and retrieving data in a robust and efficient manner, and they can reside either on the same server as the web application or somewhere else on the internet.

Although a database management system and a database are two different things, we will use the term 'database' to cover both. Similarly, although there are many types of database systems, we will mostly focus on relational databases.

Java Database Connectivity API

We will use the H2 Database Engine for getting started with databases. H2 Database Engine can be added to a Maven project by adding the following dependency to the pom.xml file.

<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <version>1.4.193</version>
</dependency>

The dependency above provides the H2-specific support, including the JDBC driver, for interacting with the database.

A program that uses a database needs to (1) create a database connection, (2) execute a query against the database, (3) do something with the query results, and (4) close the connection. When using Java and JDBC, a program that performs these steps could be as follows. We assume that there exists a database table called "Book" with the columns "id" and "name".

// Open connection
Connection connection = DriverManager.getConnection("jdbc:h2:file:./database", "sa", "");

// Execute query and retrieve the query results
ResultSet resultSet = connection.createStatement().executeQuery("SELECT * FROM Book");

// Do something with the results -- here, we print the books
while (resultSet.next()) {
    String id = resultSet.getString("id");
    String name = resultSet.getString("name");

    System.out.println(id + "\t" + name);
}

// Close the connection
resultSet.close();
connection.close();

Perhaps the most important part here is the class ResultSet, which provides access to the query results. The method next moves to the next row in the result table and returns false when there are no more rows, and the method getString("column name") retrieves the value of the column "column name" for that row as a String.

The data in a database is typically organized so that it represents the problem domain and follows a specific structure. This structure, i.e. schema, defines the database table names, the columns in each table, and the datatypes for each column. In addition to the schema, a database contains data.

H2 Database Engine provides support for loading schemas and data using the RunScript class. In the example below, the SQL statements in the files database-schema.sql and database-import.sql are executed against the database after the database connection has been made.

// Open connection to database
Connection connection = DriverManager.getConnection("jdbc:h2:file:./database", "sa", "");

try {
    // If database has not yet been created, create it
    RunScript.execute(connection, new FileReader("database-schema.sql"));
    RunScript.execute(connection, new FileReader("database-import.sql"));
} catch (Throwable t) {
    System.out.println(t.getMessage());
}
// ...

You have a database with the following schema at your disposal.


CREATE TABLE Agent (
    id varchar(9) PRIMARY KEY,
    name varchar(200)
);

Write a program that outputs all the agents and their identifier codes from the database. The output format should be as follows:

agent_id agent_name
agent_id agent_name
...

Once completed, return the assignment to the TMC server.

The same database schema from the previous assignment is at your disposal here. Implement the functionality for adding an agent to the database. The application should function as follows; in the example, the text after each question (What id? and What name?) is input typed by the user:

Agents in database:
Secret	Clank
Gecko	Gex
Robocod	James Pond
Fox	Sasha Nein

Add one:
What id? Riddle
What name? Voldemort

Agents in database:
Secret	Clank
Gecko	Gex
Robocod	James Pond
Fox	Sasha Nein
Riddle	Voldemort

Now, when the application is started again, agent Voldemort is in the database and the user is asked for the details of a new agent.

Agents in database:
Secret	Clank
Gecko	Gex
Robocod	James Pond
Fox	Sasha Nein
Riddle	Voldemort

Add one:
What id? Feather
What name? Major Tickle

Agents in database:
Secret	Clank
Gecko	Gex
Robocod	James Pond
Fox	Sasha Nein
Riddle	Voldemort
Feather	Major Tickle

Objects and Databases

When working with classes, objects and databases, the need for transforming database query results into objects arises (see Object-relational mapping, ORM). As this is a very typical task, an abundance of tools has been created for it.

ORM tools offer -- among other things -- the functionality needed to transform existing classes into database schemas and to build queries using objects. This has created a situation where large parts of the typical database interaction are no longer implemented by the developer, but by a framework. The developer effectively only needs to implement the database interaction in those parts where the existing approaches are not sufficient.

One standard for ORM in Java is the Java Persistence API (JPA), which has been implemented by a set of frameworks. When using JPA and an implementation of the standard such as Hibernate, developing basic database queries becomes quite straightforward.

Classes and Entities

The JPA standard states that each class that represents a database table should be defined as an entity; this can be done with the annotation @Entity. Moreover, each class that represents a table should have an identifier that can be used to identify a specific instance of that class. Such an identifier is typically an object variable which is annotated using the @Id annotation. Finally, the class should implement the Serializable interface.

For example, the following class that represents a person would be transformed into a database table on the fly, and instances of it could be stored to that database table.

// package

import java.io.Serializable;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

@Entity
public class Person implements Serializable {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;
    private String name;

    // getters and setters
}

If the programmer wishes to, the column names and the database table name can be defined explicitly using the annotations @Column and @Table.

// package

import java.io.Serializable;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name = "Person")
public class Person implements Serializable {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name = "id")
    private Long id;
    @Column(name = "name")
    private String name;

    // getters and setters
}

The above configuration defines a database table called "Person" that has the columns "id" and "name". The column types are inferred from the variable types (but can be also defined through the @Column annotation).

The above examples follow the JPA specification. The Spring project called Spring Data JPA provides a superclass AbstractPersistable that can be inherited. It provides functionality that makes the previous definitions a bit more straightforward.

// package

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Table;
import org.springframework.data.jpa.domain.AbstractPersistable;

@Entity
@Table(name = "Person")
public class Person extends AbstractPersistable<Long> {

    @Column(name = "name")
    private String name;

    // getters and setters
}

Now, creating the queries that alter the data in the table "Person" is rather straightforward. We need to define an interface that extends the interface JpaRepository. This provides us with all the basic functionality needed for altering the database contents.

// package

import org.springframework.data.jpa.repository.JpaRepository;

public interface PersonRepository extends JpaRepository<Person, Long> {
}

Note that we only created an interface, but not the actual implementation. The Spring framework takes care of the rest for us, given that we tell it to autowire -- i.e. include -- the implementation of the interface to our application. This is done using an annotation called @Autowired.

The database functionality can be used in a controller as follows:

// package and imports

@Controller
public class PersonController {

    @Autowired
    private PersonRepository personRepository;

    // when a request is made to the address "/persons"
    @RequestMapping("/persons")
    public String listAll(Model model) {

        // find all persons from the database and add them to the model
        model.addAttribute("persons", personRepository.findAll());

        // then create a view from a file called "persons.html" and
        // send it as a response to the request
        return "persons";
    }

    // etc ...
}

The assignment template contains an application that always returns the message "Hello Web!" to the user. Change the implementation so that the message content is randomly selected from the messages in the database. That is, if the database has messages "Hello" and "World", the message "Hello" should be shown now and then, as should the message "World".

Once finished, return your solution to TMC.

Database Transactions

Transactions are used to ensure that either all the database operations in a group are executed, or that none of them are. Database management systems offer support for implementing transactions, but, as we often work outside the database, additional steps are needed.

Spring provides transaction support on both class- and method-level through the @Transactional annotation. If a method has been annotated using the @Transactional annotation, then all the database functionality within that method will be performed within a single transaction. If the annotation is on the class level (i.e. before the class definition), then all the methods in that class are transactional.

Perhaps the most classic transaction example is shown below. If the execution of the method fails (e.g. an exception is thrown) after money has been withdrawn from one account and the money has not yet been added to another, then the original withdrawal will also be rolled back. Without the annotation @Transactional, the money would disappear.

@Transactional
public void bankTransfer(Long fromAccount, Long toAccount, Integer amount) {
    Account from = accountRepository.findOne(fromAccount);
    Account to = accountRepository.findOne(toAccount);

    from.setBalance(from.getBalance() - amount);
    to.setBalance(to.getBalance() + amount);
}

The annotation @Transactional also indicates that the entities are managed within the method. That is, the entities that have been loaded from the database are tracked, and the changes that are made to them are written to the database at the end of the method.

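If the method were not annotated with the @Transactional annotation, the loaded accounts would not be tracked, and each account would have to be saved explicitly for the changes to be written to the database. Note that in that case the two saves are also no longer guaranteed to succeed or fail together.

public void bankTransfer(Long fromAccount, Long toAccount, Integer amount) {
    Account from = accountRepository.findOne(fromAccount);
    Account to = accountRepository.findOne(toAccount);

    from.setBalance(from.getBalance() - amount);
    to.setBalance(to.getBalance() + amount);

    // without @Transactional, the changes must be saved explicitly
    accountRepository.save(from);
    accountRepository.save(to);
}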
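
The all-or-nothing behaviour of a transaction can be illustrated without a database. The following is a minimal sketch in plain Java, not a description of how Spring or a database implements transactions: the class TransferSketch, its balances map and the snapshot-based rollback are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration; it is not part of Spring or JPA.
class TransferSketch {

    // account id -> balance; stands in for a database table
    final Map<Long, Integer> balances = new HashMap<>();

    void transfer(Long from, Long to, int amount) {
        // remember the state before the "transaction" starts
        Map<Long, Integer> snapshot = new HashMap<>(balances);
        try {
            balances.put(from, balances.get(from) - amount);
            if (!balances.containsKey(to)) {
                throw new IllegalArgumentException("Unknown account " + to);
            }
            balances.put(to, balances.get(to) + amount);
        } catch (RuntimeException e) {
            // roll back: restore the state from before the transfer
            balances.clear();
            balances.putAll(snapshot);
            throw e;
        }
    }

    public static void main(String[] args) {
        TransferSketch bank = new TransferSketch();
        bank.balances.put(1L, 100);
        bank.balances.put(2L, 50);

        bank.transfer(1L, 2L, 30);
        System.out.println(bank.balances);

        try {
            // fails after the withdrawal, so the withdrawal is undone
            bank.transfer(1L, 99L, 30);
        } catch (IllegalArgumentException e) {
            System.out.println("Transfer failed: " + e.getMessage());
        }
        System.out.println(bank.balances);
    }
}
```

In the sketch, the second transfer fails after the money has already been withdrawn, and the rollback restores the balance of the first account.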

The assignment template has a simple application for managing accounts and transfers. There are, however, a few small things to be fixed in the transfer functionality. Think about the fixes that are needed, perform them, and return the assignment to the TMC server.

Note that the assignment has no tests; this means that you get to define what types of changes are needed.

Handling object relationships

When working with databases, information in one table can refer to information in another table. A customer -- for example -- can have multiple orders, and each order points to a specific customer. In Java, the references for such a case would be written as follows.

// package and imports

public class Customer {
    // variables

    private List<Order> orders = new ArrayList<>();

    // getters and setters
}

// package and imports

public class Order {
    // variables

    private Customer customer;

    // getters and setters
}

When working with JPA and databases, the programmer needs to define the relationships with annotations. The relationship annotations are @OneToOne, @OneToMany, @ManyToOne and @ManyToMany. The above classes would be transformed into the following entities.

// package and imports

@Entity
public class Customer extends AbstractPersistable<Long> {
    // variables

    // the field customer in Order points here
    @OneToMany(mappedBy = "customer")
    private List<Order> orders = new ArrayList<>();

    // getters and setters
}

// package and imports

@Entity
public class Order extends AbstractPersistable<Long> {
    // variables

    @ManyToOne
    private Customer customer;

    // getters and setters
}
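
In a two-way relationship such as the one above, both references need to be kept in sync by the application code. With mappedBy = "customer", the field customer in Order is the side that JPA writes to the database, so adding an order only to the list in Customer is not enough. The following plain-Java sketch leaves out the JPA annotations; the class RelationshipSketch and the helper method addOrder are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper class; the nested classes mirror Customer and Order above.
class RelationshipSketch {

    static class Customer {
        private final List<Order> orders = new ArrayList<>();

        List<Order> getOrders() {
            return orders;
        }

        // hypothetical helper: updates both sides of the relationship
        void addOrder(Order order) {
            orders.add(order);
            order.setCustomer(this);
        }
    }

    static class Order {
        private Customer customer;

        Customer getCustomer() {
            return customer;
        }

        void setCustomer(Customer customer) {
            this.customer = customer;
        }
    }

    public static void main(String[] args) {
        Customer customer = new Customer();
        Order order = new Order();
        customer.addOrder(order);

        // both directions of the relationship are now set
        System.out.println(order.getCustomer() == customer);
        System.out.println(customer.getOrders().contains(order));
    }
}
```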

The assignment template has the entities for managing accounts and clients, but they are missing a connection. Modify the application so that a client may have multiple accounts, but each account belongs to only a single client. Adding an account must also add the account to a client.

Look for hints and tips in the existing classes and templates. When the application works as intended, return it to TMC.

The HTTP protocol

Almost everything that we've done so far has relied on HTTP, the protocol that browsers and servers use for communication. HTTP/1.1 defines eight request methods, of which GET and POST are the most widely used. Each request method has a set of restrictions and suggestions on the content of the message and on how the message should be processed by the server. For example, the Java Servlet API (version 2.5) includes the following suggestion for handling normal GET requests, i.e. the ones that users use for retrieving data.

The GET method should be safe, that is, without any side effects for which users are held responsible. For example, most form queries have no side effects. If a client request is intended to change stored data, the request should use some other HTTP method.

Data is retrieved using the GET method

The GET method is used for retrieving data. When you type an address into the address bar of the browser and press enter, the browser performs a GET request. No additional parameters are required. In the protocol, assuming that we are using HTTP/1.1, the name of the host needs to be sent as well -- this is needed so that a single server can handle multiple domains.

GET /page.html HTTP/1.1
Host: f-secure.com
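
The request above can also be assembled programmatically. The following is a minimal Java sketch that only builds the text of the request; the class GetRequestSketch and the method buildGet are hypothetical names. In HTTP/1.1, each line ends with a carriage return and a line feed, and an empty line marks the end of the headers.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class GetRequestSketch {

    // Builds the text of a minimal HTTP/1.1 GET request.
    static String buildGet(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "\r\n"; // the empty line ends the request headers
    }

    public static void main(String[] args) {
        String request = buildGet("f-secure.com", "/page.html");
        System.out.print(request);

        // these are the bytes that would be written to a connection to the server
        byte[] bytes = request.getBytes(StandardCharsets.US_ASCII);
        System.out.println("Request size in bytes: " + bytes.length);
    }
}
```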

We have used the @RequestMapping annotation for handling GET requests that are sent to the server. Actually, the same annotation can be used for any of the HTTP request methods; the annotation can be configured using attributes. For example, the following annotation would capture GET requests to the path "/salmiakki": @RequestMapping(value = "/salmiakki", method = RequestMethod.GET).

Data is sent using the POST method

The practical difference between POST and GET requests is that a GET request carries information only in the path and the request parameters, whereas a POST request can also carry content in its body. The type of the data in the body is declared in the Content-Type request header, and the body can contain images, videos, music etc.

POST /page.html HTTP/1.1
Host: f-secure.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 10

...data...
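
As a rough illustration of how the headers relate to the body, the following Java sketch encodes one form field and computes the Content-Length header from the body. The class PostRequestSketch, its methods and the form field name are assumptions made for this example; URLEncoder is part of the Java standard library (the Charset variant of encode requires Java 10 or later).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class PostRequestSketch {

    // Encodes one form field as name=value (application/x-www-form-urlencoded).
    static String formField(String name, String value) {
        return URLEncoder.encode(name, StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Builds the text of a POST request; Content-Length is the body size in bytes.
    static String buildPost(String host, String path, String body) {
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "Content-Length: " + length + "\r\n"
                + "\r\n"
                + body;
    }

    public static void main(String[] args) {
        String body = formField("name", "Jack Bauer");
        System.out.println(buildPost("f-secure.com", "/page.html", body));
    }
}
```

Note that the space in the value is encoded as a plus sign, which is how the form encoding represents spaces.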

HTTP is stateless

HTTP is a stateless protocol, which means that each request that is sent to a server is processed individually, and from the point of view of the server, the requests are not linked with each other. This design decision was made to make it straightforward to retrieve content from multiple servers and to increase the performance of the HTTP protocol (Basic HTTP as defined in 1992). The decision was initially solid as most of the web traffic was related to transmitting static content.

Identifying users is, however, often necessary. For example, web shops and other services that identify users require means to maintain knowledge of the user. A classic -- but rather poor -- way to maintain knowledge of the user was to use parameters in the GET request that could be used to identify the user. This is not recommended, however, as these parameters can easily be modified or tampered with.

Currently, many web applications include user-specific functionality that expects that the users can be identified. Here, cookies are handy; they are an HTTP state management mechanism that is specified separately from the core protocol (currently in RFC 6265). When the server adds a cookie to the response that it sends to the user, the browser will send the cookie back to the server with subsequent requests to that server. This way, sessions across multiple requests can be maintained.
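
The cookie-based session handling described above can be sketched in plain Java as follows. This is a simplified illustration, not how a servlet container implements sessions: the class SessionSketch, the cookie name SESSIONID and the in-memory map are assumptions made for this example. The session identifier is generated with SecureRandom so that it cannot be guessed in practice.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration; the cookie name SESSIONID is an assumption.
class SessionSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    // session id -> username; a real server would store richer session data
    private final Map<String, String> sessions = new HashMap<>();

    // Creates a session and returns the value of the Set-Cookie response header.
    String login(String username) {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        String id = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        sessions.put(id, username);
        return "SESSIONID=" + id + "; HttpOnly";
    }

    // Finds the user based on the Cookie request header that the browser sends back.
    String userFor(String cookieHeader) {
        for (String pair : cookieHeader.split(";")) {
            String[] parts = pair.trim().split("=", 2);
            if (parts.length == 2 && parts[0].equals("SESSIONID")) {
                return sessions.get(parts[1]);
            }
        }
        return null; // no known session: the user is not identified
    }

    public static void main(String[] args) {
        SessionSketch server = new SessionSketch();
        String setCookie = server.login("jack");

        // the browser stores the name=value part and sends it with later requests
        String cookie = setCookie.split(";")[0];
        System.out.println(server.userFor(cookie));
    }
}
```

The HttpOnly attribute asks the browser not to expose the cookie to Javascript running on the page.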

The assignment template has very basic web shop functionality. Study the code and add a vulnerability to it that makes it possible to conduct a session hijacking attack on the web site. Limit your approach to predictable session tokens and client-side attacks such as inserting malicious Javascript code. The template already contains a mistake that will make your job a bit easier.

Once finished, submit the assignment to TMC. There are no tests for the assignment, so you can modify the application as much as you want.

During this part of the securing software course, we briefly visited frontend and backend functionality and looked into how databases are used when developing web applications. During the next week we will look deeper into the typical security issues in web applications.
