This is the tenth article in the series where I describe my experience of writing a Python web application using the Flask microframework.
The purpose of this guide is to develop a fairly functional microblog application, which I decided to call
microblog
in the absence of originality.
Recap
In the previous article, we improved our database queries so that they return posts one page at a time.
Today we will continue to work with our database, but with a different purpose. All applications that store content should provide the ability to search.
For many types of websites, you can simply let Google, Bing, etc. index everything and provide the search results. This works well for sites that consist mostly of static pages, such as a forum. In our small application, the basic unit of content is a short user post, not a whole page, so we want a more dynamic kind of search result. For example, if we search for the word “dog”, we want to see all user posts that include this word. Obviously, the search result page does not exist until someone searches, so search engines will not be able to index it.
Introduction to Full-Text Search Systems
Unfortunately, support for full-text search in relational databases is not standardized. Each database implements full-text search in its own way, and SQLAlchemy does not have a suitable abstraction for this case.
We are now using SQLite for our database, so we could just create a full-text index using the capabilities provided by SQLite, bypassing SQLAlchemy. But this is a bad idea, because if one day we decide to switch to another database, we will have to rewrite our full-text search for another database.
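To make the trade-off concrete, here is a minimal sketch of what the SQLite-only route could look like, using the standard library `sqlite3` module and SQLite's FTS5 extension. The table name `post_fts` and the sample posts are made up for illustration, and the sketch assumes the SQLite build bundled with your Python was compiled with FTS5 (it falls back gracefully if not).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

try:
    # FTS5 virtual tables are SQLite-specific syntax; other databases
    # use entirely different statements for full-text indexes.
    conn.execute("CREATE VIRTUAL TABLE post_fts USING fts5(body)")
    fts_available = True
except sqlite3.OperationalError:
    # This SQLite build was compiled without the FTS5 extension.
    fts_available = False

matches = []
if fts_available:
    conn.executemany(
        "INSERT INTO post_fts (body) VALUES (?)",
        [("my dog is barking",), ("cats are quiet",), ("walking the dog",)],
    )
    rows = conn.execute(
        "SELECT body FROM post_fts WHERE post_fts MATCH ? ORDER BY rowid",
        ("dog",),
    )
    matches = [row[0] for row in rows]

print(matches)
```

Every statement here (the virtual table, the `MATCH` operator) is specific to SQLite, which is exactly the portability problem described above.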
Instead, we are going to leave our database for working with ordinary data, and create a specialized database for search.
There are several open source full-text search systems. Only one, as far as I know, has a Flask extension: Whoosh, an engine that is also written in Python. The advantage of using pure Python is that it can be installed and run wherever Python is available. The disadvantage is search performance, which does not compare with engines written in C or C++. In my opinion, the ideal solution would be a Flask extension that can connect to different search systems and hide the details from us, the way Flask-SQLAlchemy frees us from the nuances of various databases, but there is nothing like that in the full-text search area. Django developers have a very good extension that supports various full-text search systems, called django-haystack. Maybe one day someone will create a similar extension for Flask.
For now, we will implement our search using Whoosh. The extension we are going to use is Flask-WhooshAlchemy, which integrates a Whoosh database with Flask-SQLAlchemy models.
If you do not yet have Flask-WhooshAlchemy in your virtual environment, it's time to install it. Windows users should do this:
flask\Scripts\pip install Flask-WhooshAlchemy
All others can do this:
flask/bin/pip install Flask-WhooshAlchemy
Configuration
The configuration of Flask-WhooshAlchemy is very simple. We just have to tell the extension the name of our full-text search database (file
config.py
):
WHOOSH_BASE = os.path.join(basedir, 'search.db')
Model changes
Since Flask-WhooshAlchemy integrates with Flask-SQLAlchemy, we need to specify which data should be indexed in which models (file
app/models.py
):
from app import app
import flask.ext.whooshalchemy as whooshalchemy

class Post(db.Model):
    __searchable__ = ['body']

    id = db.Column(db.Integer, primary_key = True)
    body = db.Column(db.String(140))
    timestamp = db.Column(db.DateTime)
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))

    def __repr__(self):
        return '<Post %r>' % (self.body)

whooshalchemy.whoosh_index(app, Post)
The model now has a new field
__searchable__
, which is a list of all the fields of the model that should be included in the index. In our case, we only need to index the body field of our posts.
We also initialize the full-text index for this model by calling the
whoosh_index
function.
Since we did not change the format of our database, we do not need to do a new migration.
Unfortunately, all the posts that were in the database before adding the full-text search engine will not be indexed. To make sure that the database and the search engine are synchronized, we must remove all posts from the database and start over. First, run the Python interpreter. For Windows users:
flask\Scripts\python
For everyone else:
flask/bin/python
At the Python prompt, the following statements delete all posts:
>>> from app.models import Post
>>> from app import db
>>> for post in Post.query.all():
...     db.session.delete(post)
...
>>> db.session.commit()
Search
Now we are ready to search. Let's first add some posts to the database. We have two ways to do this. We can start the application and add posts via a web browser as a regular user, or we can do it through an interpreter.
Through the interpreter, we can do this as follows:
>>> from app.models import User, Post
>>> from app import db
>>> import datetime
>>> u = User.query.get(1)
>>> p = Post(body='my first post', timestamp=datetime.datetime.utcnow(), author=u)
>>> db.session.add(p)
>>> p = Post(body='my second post', timestamp=datetime.datetime.utcnow(), author=u)
>>> db.session.add(p)
>>> p = Post(body='my third and last post', timestamp=datetime.datetime.utcnow(), author=u)
>>> db.session.add(p)
>>> db.session.commit()
The Flask-WhooshAlchemy extension is very cool because it connects to Flask-SQLAlchemy automatically. We do not need to maintain the full-text search index ourselves; everything is done transparently for us.
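To give a feel for what "transparently" means, here is a conceptual sketch in plain Python of an index that updates itself whenever a session commits. It is not Flask-WhooshAlchemy's actual code: `ToySession`, `ToyIndex` and the listener mechanism are hypothetical stand-ins, and the stand-in `Post` class only borrows the `__searchable__` idea from our model.

```python
import re
from collections import defaultdict


class Post:
    """Stand-in for the SQLAlchemy model; only 'body' is marked searchable."""
    __searchable__ = ["body"]

    def __init__(self, id, body):
        self.id = id
        self.body = body


class ToyIndex:
    """Maps each lowercase word to the ids of the posts that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, obj):
        # Index only the fields the model lists in __searchable__.
        for field in obj.__searchable__:
            for word in re.findall(r"\w+", getattr(obj, field).lower()):
                self.postings[word].add(obj.id)


class ToySession:
    """Hypothetical session that notifies listeners after each commit."""

    def __init__(self):
        self.pending = []
        self.listeners = []

    def add(self, obj):
        self.pending.append(obj)

    def commit(self):
        committed, self.pending = self.pending, []
        for listener in self.listeners:
            for obj in committed:
                listener(obj)


index = ToyIndex()
session = ToySession()
session.listeners.append(index.add)  # the "extension" hooks itself in once

session.add(Post(1, "my first post"))
session.add(Post(2, "my second post"))
session.commit()  # the index is updated as a side effect of the commit

print(sorted(index.postings["post"]))
print(sorted(index.postings["second"]))
```

The application code only ever calls `add` and `commit`; the index update happens in the hook, which is the kind of arrangement that lets us ignore the index in the rest of this article.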
Now we have several posts indexed for full-text search and we can try to search:
>>> Post.query.whoosh_search('post').all()
[<Post u'my second post'>, <Post u'my first post'>, <Post u'my third and last post'>]
>>> Post.query.whoosh_search('second').all()
[<Post u'my second post'>]
>>> Post.query.whoosh_search('second OR last').all()
[<Post u'my second post'>, <Post u'my third and last post'>]
As you can see in the examples, queries do not have to be limited to single words. In fact, Whoosh supports
an excellent search query language.
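As a rough mental model of how a query such as 'second OR last' is evaluated, the toy function below treats words separated by spaces as "all must match" and alternatives separated by OR as "any may match". This is only a sketch over an in-memory dictionary; `toy_search` is a made-up helper, it assumes that multi-word queries require every word by default, and Whoosh's real parser supports far more than this.

```python
import re

# Same sample posts as in the interpreter session, keyed by a made-up id.
posts = {
    1: "my first post",
    2: "my second post",
    3: "my third and last post",
}


def matching_ids(word):
    """Return the ids of all posts whose body contains the given word."""
    word = word.lower()
    return {
        post_id
        for post_id, body in posts.items()
        if word in re.findall(r"\w+", body.lower())
    }


def toy_search(query):
    """Evaluate 'a b' (every word required) and 'a OR b' (any alternative)."""
    result = set()
    for alternative in query.split(" OR "):
        words = alternative.split()
        if not words:
            continue
        ids = set(posts)
        for word in words:
            ids &= matching_ids(word)  # intersection: all words must match
        result |= ids                  # union: any alternative may match
    return sorted(result)


print(toy_search("post"))
print(toy_search("second"))
print(toy_search("second OR last"))
print(toy_search("third post"))
```

Unlike the real engine, this sketch returns ids in numeric order and does no relevance ranking, which is why the order differs from the whoosh_search output above.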
Full-text search integration in our application
To make the search available to users of our application, we need to make a few small changes.
Configuration
In the configuration, we must specify how many search results should be returned (
config.py
):
MAX_SEARCH_RESULTS = 50
Search form
We are going to add a search form to the navigation bar at the top of the page. The location at the top is very good, since the search will be available from all pages.
First we need to add a search form class (
app/forms.py
):
class SearchForm(Form):
    search = TextField('search', validators = [Required()])
Then we need to create a search form object and make it available to all templates. We put it in the navigation bar, which is common to all pages. A simple way to achieve this is to create a form in the
before_request
handler, and insert it into the global variable
g
(file
app/views.py
):
from forms import SearchForm

@app.before_request
def before_request():
    g.user = current_user
    if g.user.is_authenticated():
        g.user.last_seen = datetime.utcnow()
        db.session.add(g.user)
        db.session.commit()
        g.search_form = SearchForm()
Then we will add the form to our template (
app/templates/base.html
):
<div>Microblog:
    <a href="{{ url_for('index') }}">Home</a>
    {% if g.user.is_authenticated() %}
    | <a href="{{ url_for('user', nickname = g.user.nickname) }}">Your Profile</a>
    | <form style="display: inline;" action="{{url_for('search')}}" method="post" name="search">{{g.search_form.hidden_tag()}}{{g.search_form.search(size=20)}}<input type="submit" value="Search"></form>
    | <a href="{{ url_for('logout') }}">Logout</a>
    {% endif %}
</div>
Note that we display the search form only when the user is logged in. In the same way, the
before_request
handler will create the form only when the user is logged in, since our application does not show any content to unauthorized guests.
Search view function
The
action
attribute of our form was set above to send all search requests to the
search
view function. This is where we will execute our full-text queries (
app/views.py
):
@app.route('/search', methods = ['POST'])
@login_required
def search():
    if not g.search_form.validate_on_submit():
        return redirect(url_for('index'))
    return redirect(url_for('search_results', query = g.search_form.search.data))
This function does not do much: it collects the search query from the form and redirects to another page that takes the query as an argument. We do not search directly in this function so that the user's browser does not warn about re-submitting the form when the user refreshes the page. The warning is avoided by answering the POST request with a redirect: when the page is refreshed, the browser reloads the page it was redirected to rather than re-sending the form.
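The redirect-after-POST idea is independent of Flask, so it can be sketched with nothing but the standard library. The tiny WSGI application below is a hypothetical stand-in for our two views (it has no login check and no real search), and the `call` helper simply invokes it in-process, the way a browser would issue the two requests.

```python
import io
from urllib.parse import parse_qs, quote, unquote
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Toy WSGI app: POST /search redirects, GET /search_results/<query> renders."""
    method = environ["REQUEST_METHOD"]
    path = environ["PATH_INFO"]
    if method == "POST" and path == "/search":
        size = int(environ.get("CONTENT_LENGTH") or 0)
        form = parse_qs(environ["wsgi.input"].read(size).decode("utf-8"))
        query = form.get("search", [""])[0]
        # Answer the POST with a redirect instead of a page.
        target = "/search_results/" + quote(query, safe="") if query else "/"
        start_response("302 Found", [("Location", target)])
        return [b""]
    if method == "GET" and path.startswith("/search_results/"):
        query = path[len("/search_results/"):]
        start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
        return [("Search results for " + query).encode("utf-8")]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def call(method, url_path, body=b""):
    """Invoke the app in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ.update({
        "REQUEST_METHOD": method,
        "PATH_INFO": unquote(url_path),  # servers hand the decoded path to the app
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    })
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    chunks = app(environ, start_response)
    return captured["status"], captured["headers"], b"".join(chunks)


# Step 1: the form is submitted; the server answers with a redirect only.
status, headers, _ = call("POST", "/search", b"search=second+post")
print(status, headers["Location"])

# Step 2: the browser follows the redirect with a plain GET.
# Refreshing the page repeats only this request, never the POST.
status2, _, page = call("GET", headers["Location"])
print(status2, page.decode("utf-8"))
```

Because the page the user ends up on was fetched with GET, a refresh is harmless, and the results URL can also be bookmarked or shared.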
Results page
After the query string is submitted by the form, the POST handler passes it through redirection to the
search_results
handler (
app/views.py
):
from config import MAX_SEARCH_RESULTS

@app.route('/search_results/<query>')
@login_required
def search_results(query):
    results = Post.query.whoosh_search(query, MAX_SEARCH_RESULTS).all()
    return render_template('search_results.html',
        query = query,
        results = results)
The
search_results
function sends the query to Whoosh, passing along a limit on the number of results in order to protect against a potentially large result set.
The search is completed by the search_results template (
app/templates/search_results.html
):
{% extends "base.html" %}

{% block content %}
<h1>Search results for "{{query}}":</h1>
{% for post in results %}
    {% include 'post.html' %}
{% endfor %}
{% endblock %}
And here we can again reuse our
post.html
.
Final words
We have now completed another very important, albeit often overlooked, feature that a decent web application should have.
Below I post the updated version of the microblog application with all the changes made in this article.
Download
microblog-0.10.zip.
As always, there is no database, you have to create it yourself. If you follow this series of articles, you know how to do it. If not, go back to the database article to find out.
I hope you enjoyed this tutorial.
Miguel