Web Servers and Web Applications
Web servers are essentially programs that listen for incoming connections (typically on port 80) and follow the HTTP protocol. At first, web servers were mainly used to serve static content such as HTML pages and funny GIF images, but nowadays they serve dynamic content that is often built separately for each user.
Web applications run on web servers. The web server listens for incoming connections and forwards the requests to the web application. Developers who work with web applications very rarely implement functionality within the web server itself.
Web applications include both client- and server-side functionality. Client-side functionality is executed in the browser of a user, i.e. you, whilst the server-side functionality is executed on the web server.
Whenever the user performs an action, such as typing in a URL and pressing enter or clicking a link in the browser, the user's computer sends a request to the server. Once the server receives the request, it processes it and builds a corresponding response. The response may be, for example, HTML code, JSON data, or an image that the browser should display to the user.
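To make the request and response concrete, here is a minimal sketch of what travels over the connection. It is not how a real web server is written; the function name build_response and the example bytes are made up for illustration, and only the request line is parsed.

```python
def build_response(raw_request: bytes) -> bytes:
    """Parse the request line of a raw HTTP request and build a plain-text response."""
    # The first line of an HTTP request looks like: GET / HTTP/1.1
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
    method, path, _version = request_line.split(" ", 2)

    if method == "GET" and path == "/":
        status, body = "200 OK", b"Hello World!"
    else:
        status, body = "404 Not Found", b"Not found"

    # A response is a status line, headers, an empty line, and then the body.
    head = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/plain; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


# Roughly what a browser sends when the user opens http://localhost:8000/
example_request = b"GET / HTTP/1.1\r\nHost: localhost:8000\r\n\r\n"
example_response = build_response(example_request)
```

In practice the web server and the framework take care of this parsing and formatting, so application code rarely touches the raw bytes.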
When constructing the functionality that is shown in the browser, development is typically focused on three separate but intertwined fronts. The structure of the view is constructed using HTML, the layout and theme using CSS, and possible dynamic functionality with JavaScript.
On the other hand, when constructing the functionality that is executed on the backend (i.e. the server), the developer typically focuses on what is needed to retrieve and construct the response that is sent back to the user. This often involves connecting to other software, such as a database server, and retrieving data from there. That server may run either locally or on a separate machine. Naturally, client- and server-side development can be interlinked, and software developers typically work on tasks that relate to both sides. That is, these days it is rather rare for a developer to focus solely on, e.g., maintaining a database.
When the user accesses an online page using a browser, a request is sent to the server. The server creates the content of the response and sends it back to the user. If the content has links to other resources such as images or scripts, each of these resources is retrieved separately by the browser (the HTTP/2 Server Push model, where the server sends resources before they are requested, is an exception).
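The following sketch lists the extra resources a browser would have to fetch after receiving a page. It uses Python's standard html.parser module; the class name ResourceCollector, the helper linked_resources, and the example page are hypothetical, and a real browser looks at many more tags and attributes.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collect the URLs of resources that a page refers to."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Images and scripts name their resource in src, stylesheets in href.
        if tag in ("img", "script") and attributes.get("src"):
            self.resources.append(attributes["src"])
        elif tag == "link" and attributes.get("href"):
            self.resources.append(attributes["href"])


def linked_resources(html: str) -> list:
    collector = ResourceCollector()
    collector.feed(html)
    collector.close()
    return collector.resources


page = """
<html>
  <head>
    <link rel="stylesheet" href="/static/style.css">
    <script src="/static/app.js"></script>
  </head>
  <body><img src="/static/cat.gif"></body>
</html>
"""
```

For this page, linked_resources(page) yields three URLs, so showing the page takes four requests in total: one for the HTML and one for each linked resource.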
Building a Simple Web Application
The main functionality of a web application is to create a response to each request. Developers do not typically implement the web server functionality and the HTTP protocol specifics themselves, but use a framework that abstracts away many of these tasks. Here, we look at one such framework, called Django. For Java(TM) enthusiasts, a de facto web framework is Spring.
Let us look at the Hello Web exercise in TMC (the actual exercise assignment is given below). The exercise contains a bare-bones web server that can be started with
python3 manage.py runserver
and should then be available at http://localhost:8000/ if the port is free.
There are a lot of files in the project. Most of them are automatically generated skeleton files that provide functionality that does not concern us for the moment.
We are interested in two files.
The first file is src/pages/views.py,
from django.http import HttpResponse

def homePageView(request):
    return HttpResponse('Hello World!')

and the second file is src/pages/urls.py
from django.urls import path
from .views import homePageView

urlpatterns = [
    path('', homePageView, name='home')
]
Before we continue, a few words about path conventions. The src part of the path is needed for TMC to work, and would normally not exist in a typical Django project. The name pages is the name of our app. Apps are Django's way of grouping services provided by the server. The name pages could have been different, and a single web server can host several apps at the same time.
The file urls.py tells the server that if the client (browser) requests the root web page, the output of homePageView should be provided as the response. The optional name parameter is sometimes handy if you need to refer to the path from your code.
The file views.py contains the actual definition of homePageView.
Requests and Responses
Web applications typically respond to requests to multiple paths, where each path has specific functionality. In the example below, there are three separate paths, each of them returning a different string as a response to the user.
# urls.py
from django.urls import path
from .views import pathView, trailView, routeView

urlpatterns = [
    path('path/', pathView, name='path'),
    path('trail/', trailView, name='trail'),
    path('route/', routeView, name='route')
]

# views.py
from django.http import HttpResponse

def pathView(request):
    return HttpResponse('Path')

def trailView(request):
    return HttpResponse('Trail')

def routeView(request):
    return HttpResponse('Route')
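Conceptually, the list in urls.py is a lookup table from paths to functions. The sketch below imitates that lookup with plain Python; it is not Django's implementation, and the names URL_PATTERNS and dispatch, as well as the dictionary used as a stand-in request, are invented for the illustration.

```python
def path_view(request):
    return "Path"


def trail_view(request):
    return "Trail"


def route_view(request):
    return "Route"


# Each entry pairs a path with the function that builds the response for it.
URL_PATTERNS = [
    ("path/", path_view),
    ("trail/", trail_view),
    ("route/", route_view),
]


def dispatch(request_path: str) -> tuple:
    """Return (status code, body) for the first pattern that matches the path."""
    # The patterns are written without the leading slash, so strip it first.
    route = request_path.lstrip("/")
    for pattern, view in URL_PATTERNS:
        if route == pattern:
            return 200, view({"path": request_path})
    return 404, "Not Found"
```

Calling dispatch("/trail/") selects trail_view, whilst a path that is not in the table produces a 404 response.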
Each request may contain information that is being sent to the web application. In principle, there are two ways to send it: (1) by adding parameters to the address, or (2) by adding parameters to the request body. We will look into the request body later in the course.
There are two ways of passing arguments in an address. The first way is to add parameters directly to the path. For example, the path http://localhost:8000/greet/ada/ may be parsed such that ada is a parameter of the view that handles the request. The parsing of this parameter is configured in urls.py.
# urls.py
from django.urls import path
from .views import greetView

urlpatterns = [
    path('greet/<str:user>/', greetView, name='greet')
]

# views.py
from django.http import HttpResponse

def greetView(request, user):
    return HttpResponse('Hi ' + user)
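One way to picture what happens to 'greet/<str:user>/' is to translate it into a regular expression with a named group. The sketch below does that with the standard re module. It only handles the str converter, assumes the rest of the route contains no regular expression metacharacters, and the helper names route_to_regex and extract_arguments are hypothetical.

```python
import re


def route_to_regex(route: str):
    """Turn a route such as 'greet/<str:user>/' into a compiled regular expression."""
    # <str:name> matches any non-empty text that does not contain a slash.
    pattern = re.sub(
        r"<str:(\w+)>",
        lambda match: f"(?P<{match.group(1)}>[^/]+)",
        route,
    )
    return re.compile("^" + pattern + "$")


def extract_arguments(route: str, request_path: str):
    """Return the captured parameters as a dict, or None if the path does not match."""
    match = route_to_regex(route).match(request_path.lstrip("/"))
    return match.groupdict() if match else None
```

With this sketch, extract_arguments('greet/<str:user>/', '/greet/ada/') captures ada under the name user, which corresponds to the extra argument that greetView receives.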
The other, and more conventional, way is to use GET parameters, which are appended to the address after a question mark. In this case, the address looks like http://localhost:8000/greet/?user=ada. The corresponding code would then be
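# urls.py
from django.urls import path
from .views import greetView

urlpatterns = [
    path('greet/', greetView, name='greet')
]

# views.py
from django.http import HttpResponse

def greetView(request):
    # request.GET.get returns None if the parameter is missing, and
    # 'Hi ' + None would raise a TypeError, so fall back to a placeholder.
    user = request.GET.get('user', 'stranger')
    return HttpResponse('Hi ' + user)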
We will mostly use this approach throughout the course.
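The parsing that sits behind request.GET can be sketched with the standard urllib.parse module. The function name greeting_for and the fallback name 'stranger' are assumptions made for this example; the choice to take the last value of a repeated parameter mirrors how Django's get behaves.

```python
from urllib.parse import parse_qs, urlsplit


def greeting_for(url: str) -> str:
    """Build a greeting from the user parameter in the query string of the URL."""
    query = urlsplit(url).query      # e.g. "user=ada"
    params = parse_qs(query)         # e.g. {"user": ["ada"]}
    # A parameter may appear several times, so every value is a list.
    # Use the last value, or a placeholder if the parameter is missing.
    user = params.get("user", ["stranger"])[-1]
    return "Hi " + user
```

Note that percent-encoded characters are decoded during parsing, so ?user=Ada%20Lovelace yields the name with a space in it.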
Views to the Users
The applications that we have worked on so far have received a request to a specific path and responded with a string. Applications that send HTML content to users use exactly the same end-to-end functionality, but the HTML content is typically created using templates. Templates include embedded commands that determine the content that is added to them. Here, we use Django's own template language. For Java(TM) aficionados, Thymeleaf is the correct choice.
The templates for the pages app should be placed in the src/pages/templates/pages folder.
In the example below, we have created an application that listens to the root path /. When the user makes a request to the application, an HTML page that has been created based on a template is returned to the user. The template that is used for creating the page is determined by the string that is passed to loader.get_template, here "pages/index.html". This leads the framework to look for a template called index.html in src/pages/templates/pages/. If the template is found (make sure the name is correct!), the Django template engine renders it and the result is returned to the user.
# urls.py
from django.urls import path
from .views import homePageView

urlpatterns = [
    path('', homePageView, name='home')
]

# views.py
from django.http import HttpResponse
from django.template import loader

def homePageView(request):
    # Look up the template by its path relative to a templates folder.
    template = loader.get_template('pages/index.html')
    return HttpResponse(template.render())
<!-- index.html -->
<html>
<head>
<title>Hi</title>
</head>
<body>
Hello from the template side.
</body>
</html>
Adding data to the view
The purpose of using templates is to be able to generate content dynamically. First, we will need to pass data to the template. The easiest way to do this is to pass a context parameter using the render helper function.
# views.py
from django.shortcuts import render

def homePageView(request):
    return render(request, 'pages/index.html', {'msg': 'Hi!', 'from': 'Ada'})
The context variable is a dictionary, and it can also contain nested dictionaries and lists.
The values in the context can then be rendered using the {{}} syntax in the template
<!-- index.html -->
<html>
<head>
<title>Hi</title>
</head>
<body>
{{msg}} from {{from}}
</body>
</html>
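What the template engine does with {{msg}} and {{from}} can be approximated in a few lines. This is only a rough sketch, not Django's engine: the function name render_template is made up, only simple variable names are supported, and values are HTML-escaped with the standard html module to imitate the escaping that Django applies to variables by default.

```python
import html
import re

# Matches {{name}} as well as {{ name }}.
VARIABLE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, context: dict) -> str:
    """Replace every {{name}} in the template with the escaped value from the context."""

    def substitute(match):
        # Unknown names are rendered as an empty string.
        value = context.get(match.group(1), "")
        return html.escape(str(value))

    return VARIABLE.sub(substitute, template)
```

For example, rendering "{{msg}} from {{from}}" with the context used in homePageView gives the text Hi! from Ada. The escaping matters for security: a value such as <script> ends up in the page as harmless text instead of markup.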
Forming up: Content from Forms
Web applications may contain forms that are used to send content to the application. Forms are defined in HTML using the form element. The form element contains the path to which the content will be sent, the type of the request, and the data. For now, the type of the request will be POST. We will discuss POST and GET later.
The data is defined using fields such as the input field (<input type="text"...), and the content is sent to the server using a button (<input type="submit"...). The form below is submitted to the root path of the application, and the field that the user can type content into has the name "content".
<form action="/" method="POST">
{% csrf_token %}
<input type="text" name="content"/>
<input type="submit"/>
</form>
Note that the code above contains a Django-specific tag, {% csrf_token %}. The tag generates a hidden input with a random value that the server can later verify. Whenever the form is submitted, the hidden input must match the value that the server expects. This is a security countermeasure against CSRF attacks, which we will discuss in greater detail later.
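The idea behind the hidden token can be sketched as follows. This is a simplified illustration and not Django's actual mechanism, which differs in its details: the functions issue_token and handle_form_post and the plain dictionary used as a session are hypothetical. The field name csrfmiddlewaretoken is the name Django gives to the hidden input.

```python
import hmac
import secrets
from urllib.parse import parse_qs


def issue_token(session: dict) -> str:
    """Create a random token, remember it, and return it for the hidden input."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def handle_form_post(session: dict, body: str) -> tuple:
    """Check the token in a URL-encoded form body before accepting the content."""
    fields = parse_qs(body)
    submitted = fields.get("csrfmiddlewaretoken", [""])[0]
    expected = session.get("csrf_token", "")
    # compare_digest compares in constant time to avoid leaking information.
    if not expected or not hmac.compare_digest(submitted.encode(), expected.encode()):
        return 403, "CSRF check failed"
    content = fields.get("content", [""])[0]
    return 200, "Received: " + content
```

A page on another site can make the browser submit the form, but it cannot read the token, so its request ends up in the 403 branch.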
Handling Lists
One may also pass lists to the template.
# views.py
from django.shortcuts import render

def homePageView(request):
    return render(request, 'pages/index.html', {'msg': 'Hi!', 'senders': ['Ada', 'Alice', 'Bob']})
The list can be iterated over using the {% for %} syntax in the template
<!-- index.html -->
<html>
<head>
<title>Hi</title>
</head>
<body>
{{msg}} from
{% for user in senders %}
{{user}}
{% endfor %}
</body>
</html>
In the previous exercise the server had to keep track of the current list. This is done using sessions, a functionality provided by Django. Essentially, Django lets you maintain state for a user. The state is a dictionary-like object that can be accessed via request.session. Since a web server typically has multiple users at the same time, the sessions of different users need to be told apart. This is done via a cookie: the server gives an id to the browser, the browser stores the id in a cookie, and the browser sends the id to the server with every request. Sessions are stored in a database by default, meaning that if you restart the server, the session will still be intact.
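The cookie-based lookup described above can be sketched with the standard library. This is not Django's session framework: the names SESSION_STORE, load_session, and add_item_view are invented, and the store is an in-memory dictionary, so unlike Django's default database storage it would be lost on a restart. The cookie name sessionid matches Django's default.

```python
import secrets
from http.cookies import SimpleCookie

SESSION_STORE = {}         # session id -> that user's state
COOKIE_NAME = "sessionid"


def load_session(cookie_header: str) -> tuple:
    """Find the session named in the Cookie header, or start a new one."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(COOKIE_NAME)
    session_id = morsel.value if morsel else None
    if session_id not in SESSION_STORE:
        # Unknown or missing id: hand out a fresh, hard-to-guess one.
        session_id = secrets.token_urlsafe(16)
        SESSION_STORE[session_id] = {}
    return session_id, SESSION_STORE[session_id]


def add_item_view(cookie_header: str, item: str) -> tuple:
    """Append an item to the user's list and return (cookie to set, current list)."""
    session_id, session = load_session(cookie_header)
    session.setdefault("items", []).append(item)
    return f"{COOKIE_NAME}={session_id}", list(session["items"])
```

A browser that sends back the cookie it was given keeps adding to the same list, whilst a browser without the cookie starts with an empty list of its own.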
During this part of the securing software course, we have taken the first steps into understanding web applications. The next part will look into using databases and the underlying HTTP protocol.